2008年7月31日木曜日

Cloud Computing: The Nine Features of an Ideal PaaS Cloud

PaaSベースのCloud Computingに必要な要件を整理した記事。
多くのCloud Computing環境が登場する現在、オープンインタフェースの必要性を謳うケースが多くなってきており、本記事もその一例。 
複数の共通化すべきインタフェースを整理している事と、Cloud Computingとして前提となるScalabilityやLoad Balancingなどの機能を自動化すべき、とういポイントをよく抑えている。

Cloud Computing: The Nine Features of an Ideal PaaS Cloud

Clouds Should Be Open, Not Proprietary


What sort of cloud computer(s) should we be building or expecting from vendors? Are there issues of lock-in that should concern customers of either SaaS clouds or PaaS clouds? I've been thinking about this problem as the CEO of a PaaS cloud computing company for some time. Clouds should be open. They shouldn't be proprietary. More broadly, I believe no vendor currently does everything that's required to serve customers well.

What's required for such a cloud? I think an ideal PaaS cloud would have the following nine features:

1. Virtualization Layer Network Stability

Cloud computers must operate on some sort of virtualization technology for many of the following features to even be feasible. But as general purpose computing moves from dedicated hardware to on-demand computing, one key feature of the dedicated model for web applications is a stable, static IP address. If the virtualization layer borks (and this happens), when the cloud has recovered the cloud instances of compute, the developer should be able to rely on the web application just working without having to re-jigger network settings.

2. API for Creation, Deletion, Cloning of Instances

Developers should be able to interact with the cloud computer, to do business with it, without having to get on the phone with a sales person, or submit a help ticket. In other words, the customer should be able to truly get on-demand computing when they demand, whenever they demand. Joyent only began to offer this recently through Aptana and their Aptana Studio product. However, the API is only available to Aptana at this point. The API needs to be publicly available to everyone. Provide a credit card (that works and is yours) and you should get compute, storage, and RAM on-demand. The challenges for cloud computing companies is to figure the just-in-time economics that allow us to provide on-demand infrastructure without having lots of infrastructure sitting around waiting to be used. I think this means that cloud computing companies will, just like banks, begin more and more to "loan" each other infrastructure to handle our own peaks and valleys, But in order for this to happen we'd need the next requirement.

3. Application Layer Interoperability

Cloud computers need to support a core set of application frameworks in a consistent way. I propose that cloud computers should support PHP, Ruby, Python, Java and the most common frameworks, libraries, gems/plugins, and application/web servers for each of these languages. Essentially, a developer should be able to move between Joyent, the Amazon Web Services, Google, Mosso, Slicehost, GoGrid, etc. by simply pointing the "deploy gun" at the cloud (having used the API mentioned above to spin up instances) and go. Change DNS, done. But, no cloud computing company is innovating by providing better application layer solutions. We ought to support the most popular languages and move on. However, for a developer to truly have cloud portability, we need to support another requirement.

4. State Layer Interoperability

This is the most difficult problem to solve when scaling a web application, and, consequently, the area in which cloud computing companies are innovating while sacrificing interoperability. It's not simply a question of deciding that we should all support MySQL or Postgres because we will find that the needed requirement ("Automatic Scaling") is practically impossible to achieve with these tools. Amazon is innovating with SimpleDB, Google has BigTable as solutions for the problem, but developers can't leave either cloud because neither SimpleDB nor BigTable are available anywhere else. What is needed, and I'm looking ahead to the next requirement when I say this, is an XMPP-based state-layer that can flush out to some SQL-y store. Think open-source Tibco. The financial markets fixed these problems years ago. This datastore needs to speak SQL, be built using open-source and free software, and be easy for developers to adopt. The value cloud computing companies provide to developers is running the state layer for them, without requiring developers to use some proprietary state layer that may or may not provide scalability upon success and represents lock-in.

5. Application Services (e.g. email infrastructure, payments infrastructure)

A cloud computer should provide scaled application services consumable by developers in developing and delivering their own applications. There are two types of application services. The first group is delivered using open protocols/formats. Examples would be IMAP/SMTP, LDAP/vCARD, iCAL/ICS, XMPP, OpenID, OPML. All clouds should offer these open protocols/formats so that developers can move between clouds without having to rewrite their application. The second group is delivered as web services, are often proprietary to the cloud (therefore a means of differentiation), and include services such as payments, inventory.

6. Automatic Scale (deploy and forget about it)

All things being equal, a competent developer should be able to deploy to a cloud and grow to five billion page views a month without having to think about "scale". Just write the code, the cloud computer does the rest.

Is this achievable? Today, no. No cloud computer automatically scales applications. Part of the problem lies in the state layer. Part of the problem lies in what it means to scale. What is the measure of scale? Responsiveness? Scaling the state layer (e.g. the database) is a black art. Scaling the application layer or the static assets layer relies, in part on load balancing and storage.

7. Hardware Load Balancing

The cloud computer should provide the means to achieve five billion page views a month. I picked that number because it is big. If you're writing an application, and you want to be able to achieve tremendous scale, the answer shouldn't be to move off the cloud onto your own "private" cloud of dedicated servers. Of course, if the cloud computer is open as we've described, you can build your own cloud. It's also true you can generate your own electricity from coal, if you want to bother. But why bother? Software load balancers will get you nowhere close to the throughput required to achieve 5 billion page views per month. The state of the art is hardware load balancers.

8. Storage as a Service

Storage should be available to developers as a service. Where this is done today, it is done using a proprietary API and represents lock-in. The storage service should allow customers to consume endless amounts of storage and pay for only what is used. Objects on the storage service should be accessed by developers as objects rather than as nodes in a hierarchical tree. This way developers don't have to understand the hierarchy.

WebDAV could be an open protocol version of the storage service, but fails to provide the abstraction of treating objects as objects rather than nodes in a hierarchical tree. At present, I don't believe there is a reasonable solution to the problem that isn't also proprietary. We need to develop one that is open and free.

9. "Root", If Required

The cloud computer vendor can't think of everything a developer or application might need or want to do. So the cloud needs to be hackable and extensible by the developer and that means an administrative account of some sort that allows the developer to shape and mold the cloud to their specific needs. By definition, cloud computers must be built on top of some sort of virtualization technology, so the developer never has "root" to the cloud, only "root" to the developer's part of the cloud.