Monday, November 30, 2009

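Hadoop is not the only next-generation database technology. Many others have appeared, and here is a comparison of two leading examples, HBase and Cassandra =>

Both are non-SQL database engines offered as open source, and both are cloud-oriented technologies whose strength is processing large data sets at high speed.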

HBase, which is built on Hadoop, has a structure very close to Google's BigTable, whereas Cassandra combines features of BigTable and of a database system called Dynamo in a hybrid design.

Because database systems on the Internet are distributed by nature, CAP (Consistency, Availability, Partition tolerance) is often adopted as the yardstick for evaluating them, and the article below bases its comparison on it.

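In conclusion, the article judges Cassandra to have the edge for applications that prioritize fast writes and have little need for consistency, and HBase to have the edge for read-oriented workloads.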

HBase vs. Cassandra: NoSQL Battle!

Bradford

Distributed, scalable databases are desperately needed these days. From building massive data warehouses at a social media startup, to protein folding analysis at a biotech company, "Big Data" is becoming more important every day. While Hadoop has emerged as the de facto standard for handling big data problems, there are still quite a few distributed databases out there and each has its own strengths.

Two databases have garnered the most attention: HBase and Cassandra. The split between these equally ambitious projects can be categorized into Features (things missing that could be added at any time), and Architecture (fundamental differences that can't be coded away). HBase is a near-clone of Google's BigTable, whereas Cassandra purports to be a "BigTable/Dynamo hybrid".

In my opinion, while Cassandra's "writes-never-fail" emphasis has its advantages, HBase is the more robust database for a majority of use-cases. Cassandra relies mostly on Key-Value pairs for storage, with a table-like structure added to make more robust data structures possible. And it's a fact that far more people are using HBase than Cassandra at this moment, despite both being similarly recent.

Let's explore the differences between the two in more detail…

CAP and You

This article at Streamy explains the CAP theorem (Consistency, Availability, Partition tolerance) and how the BigTable-derived HBase and the Dynamo-derived Cassandra differ.

Before we go any further, let's break it down as simply as possible:

Consistency: "Is the data I'm looking at now the same if I look at it somewhere else?"
Availability: "What happens if my database goes down?"
Partitioning: "What if my data is on different networks?"

CAP posits that a distributed system cannot fully provide all three at once, so every design compromises somewhere: HBase values strong consistency and high availability, while Cassandra values availability and partition tolerance. Replication is one way of dealing with some of the design tradeoffs. HBase does not have replication yet, but that's about to change, and Cassandra's replication comes with some caveats and penalties.

Let's go over some comparisons between these two datastores:

Feature Comparisons

Processing

HBase is part of the Hadoop ecosystem, so many useful distributed processing frameworks support it: Pig, Cascading, Hive, etc. This makes it easy to do complex data analytics without resorting to hand-coding. Efficiently running MapReduce on Cassandra, on the other hand, is difficult because all of its keys are in one big "space", so the MapReduce framework doesn't know how to split and divide the data natively. There needs to be some hackery in place to handle all of that.

In fact, here's some code from a Cassandra/Hadoop Integration patch:

+ /*
+  FIXME This is basically a huge kludge because we needed access to
+ cassandra internals, and needed access to hadoop internals and so we
+ have to boot cassandra when we run hadoop. This is all pretty
+ fucking awful.
+  P.S. it does not boot the thrift interface.
+ */

This gives me The Fear.

Bottom line? Cassandra may be useful for storage, but not for any data processing. HBase is much handier for that.
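The splitting problem described above can be illustrated with a small sketch. It is not the HBase or Cassandra API; the region start keys, the function names, and the use of an MD5 hash for placement are all assumptions made for illustration. The point is only that sorted key ranges hand a MapReduce framework ready-made input splits, while hashed placement does not.

```python
from bisect import bisect_right
import hashlib

# Hypothetical region start keys for one table; the first region starts at "".
REGION_START_KEYS = ["", "g", "p"]

def region_splits(start_keys):
    """Return one (start, end) key range per region; end=None means open-ended."""
    keys = sorted(start_keys)
    return list(zip(keys, keys[1:] + [None]))

def split_for_key(row_key, start_keys):
    """Index of the range split that owns row_key."""
    return bisect_right(sorted(start_keys), row_key) - 1

def hashed_bucket(row_key, buckets):
    """Placement by hash (assumed scheme): neighbouring keys land in unrelated buckets."""
    digest = hashlib.md5(row_key.encode("utf-8")).hexdigest()
    return int(digest, 16) % buckets

if __name__ == "__main__":
    for start, end in region_splits(REGION_START_KEYS):
        print("map task scans range", repr(start), "to", repr(end))
    print("hadoop ->", split_for_key("hadoop", REGION_START_KEYS))
```

With range splits, each map task can scan one contiguous slice of the table. With hashed placement there is no contiguous slice to hand out, which is roughly why extra machinery is needed.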

Installation & Ease of Use

Cassandra is only a Ruby gem install away. That's pretty impressive. You still have to do quite a bit of manual configuration, however. HBase is a .tar (or packaged by Cloudera) that you need to install and set up on your own. HBase has thorough documentation, though, making the process a little more straightforward than it could've been.

HBase ships with a very nice Ruby shell that makes it easy to create and modify databases, set and retrieve data, and so on. We use it constantly to test our code. Cassandra does not have a shell at all — just a basic API. HBase also has a nice web-based UI that you can use to view cluster status, determine which nodes store various data, and do some other basic operations. Cassandra lacks this web UI as well as a shell, making it harder to operate. (ed: Apparently, there is now a shell and pretty basic UI — I just couldn't find 'em).

Overall Cassandra wins on installation, but lags on usability.

Architecture

The fundamental divergence of ideas and architecture behind Cassandra and HBase drives much of the controversy over which is better.

Off the bat, Cassandra claims that "writes never fail", whereas in HBase, if a region server is down, writes will be blocked for affected data until the data is redistributed. This rarely happens in practice, of course, but will happen in a large enough cluster. In addition, HBase has a single point-of-failure (the Hadoop NameNode), but that will be less of an issue as Hadoop evolves. HBase does have row locking, however, which Cassandra does not.

Apps usually rely on data being accurate and unchanged from the time of access, so the idea of eventual consistency can be a problem. Cassandra, however, has an internal method of resolving up-to-dateness issues with vector clocks, a complex but workable solution where basically the latest timestamp wins. The HBase/BigTable model puts the onus of resolving any consistency conflicts on the application, as everything is stored versioned by timestamp.
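To make the contrast concrete, here is a minimal sketch of the two approaches to conflicting writes. It is a toy model, not code from either project; `last_write_wins` and `VersionedCell` are hypothetical names. The first function stands in for the cluster choosing the latest timestamp, the class for a store that keeps every timestamped version and leaves the choice to the application.

```python
def last_write_wins(writes):
    """Cluster-side resolution: keep only the value with the highest timestamp."""
    return max(writes, key=lambda w: w[0])[1]

class VersionedCell:
    """Keeps every (timestamp, value) version so the application can choose."""

    def __init__(self):
        self._versions = []

    def put(self, timestamp, value):
        self._versions.append((timestamp, value))

    def versions(self):
        # Newest first, similar in spirit to reading multiple versions of a cell.
        return sorted(self._versions, reverse=True)

if __name__ == "__main__":
    writes = [(100, "alice@old"), (105, "alice@new"), (103, "alice@mid")]
    print(last_write_wins(writes))

    cell = VersionedCell()
    for ts, value in writes:
        cell.put(ts, value)
    # The application applies its own rule, e.g. newest value that passes a check.
    print([v for _, v in cell.versions()])
```

In the second case nothing is discarded, so an application that disagrees with "newest wins" still has the older versions to work with.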

Another architectural quibble is that Cassandra only supports one table per install. That means you can't denormalize and duplicate your data to make it more usable in analytical scenarios. (edit: this was corrected in the latest release) Cassandra is really more of a Key Value store than a Data Warehouse. Furthermore, schema changes require a cluster restart(!). Here's what the Cassandra JIRA says to do for a schema change:

Kill Cassandra
Start it again and wait for log replay to finish
Kill Cassandra AGAIN
Make your edits (now there is no data in the commitlog)
Manually remove the sstable files (-Data.db, -Index.db, and -Filter.db) for the CFs you removed, and rename files for CFs you renamed
Start Cassandra and your edits should take effect

With the lack of timestamp versioning, eventual consistency, no regions (making things like MapReduce difficult), and only one table per install, it's difficult to claim that Cassandra implements the BigTable model.

Replication

Cassandra is optimized for small datacenters (hundreds of nodes) connected by very fast fiber. It's part of Dynamo's legacy from Amazon. HBase, being based on research originally published by Google, is happy to handle replication to thousands of planet-strewn nodes across the 'slow', unpredictable Internet.

A major difference between the two projects is their approach to replication and multiple datacenters. Cassandra uses a P2P sharing model, whereas HBase (the upcoming version) employs more of a data+logs backup method, aka 'log shipping'. Each has a certain elegance. Rather than explain this in words, here come the drawings:

This first diagram is a model of the Cassandra replication scheme.

The value is written to the "Coordinator" node
A duplicate value is written to another node in the same cluster
A third and fourth value are written from the Coordinator to another cluster across the high-speed fiber
A fifth and sixth value are written from the Coordinator to a third cluster across the fiber
Any conflicts are resolved in the cluster by examining timestamps and determining the "best" value.
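The steps above can be mimicked with a toy simulation. This is a sketch under stated assumptions, not Cassandra's implementation: the `Replica` class, the replica names, and the `unreachable` parameter are hypothetical. It fans one write out to six replicas across three clusters and resolves reads by timestamp.

```python
# Hypothetical layout: two replicas in each of three clusters; dc1-a is the coordinator.
REPLICA_NAMES = ["dc1-a", "dc1-b", "dc2-a", "dc2-b", "dc3-a", "dc3-b"]

class Replica:
    def __init__(self, name):
        self.name = name
        self.cell = None  # (timestamp, value) or None

    def write(self, timestamp, value):
        # Keep the incoming write only if it is newer than what is stored.
        if self.cell is None or timestamp > self.cell[0]:
            self.cell = (timestamp, value)

def coordinator_write(replicas, timestamp, value, unreachable=()):
    """Fan a write out to every reachable replica; return how many accepted it."""
    acks = 0
    for replica in replicas:
        if replica.name in unreachable:
            continue
        replica.write(timestamp, value)
        acks += 1
    return acks

def resolve(replicas):
    """Pick the value with the latest timestamp among replicas that hold one."""
    cells = [r.cell for r in replicas if r.cell is not None]
    return max(cells)[1] if cells else None

if __name__ == "__main__":
    replicas = [Replica(name) for name in REPLICA_NAMES]
    coordinator_write(replicas, 1, "v1")
    coordinator_write(replicas, 2, "v2", unreachable={"dc3-a", "dc3-b"})
    print(resolve(replicas), [r.cell for r in replicas])
```

After the second write, the third cluster still holds the old value, and nothing in this model says when it will catch up.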

The major problem with this scheme is that there is no real-world auditability. The nodes are eventually consistent — if a datacenter ("DC") fails, it's impossible to tell when the required number of replicas will be up-to-date. This can be extremely painful in a live situation — when one of your DCs goes down, you often want to know *exactly* when to expect data consistency so that recovery operations can go ahead smoothly.

It's important to note that Cassandra relies on high-speed fiber between datacenters. If your writes are taking 1 or 2 ms, that's fine. But when a DC goes out and you have to revert to a secondary one in China instead of 20 miles away, the incredible latency will lead to write timeouts and highly inconsistent data.

Let's take a look at the HBase replication model (note: this is coming in the .21 release):

What's going on here:

The data is written to the HBase write-ahead-log in RAM, then it is flushed to disk
The file on disk is automatically replicated due to the Hadoop Filesystem's nature
The data enters a "Replication Log", where it is piped to another Data Center.

With HBase/Hadoop's deliberate sequence of events, consistency within a datacenter is high. There is usually only one version of a piece of data around the same time period. If there is more than one, then HBase's timestamps allow your code to figure out which version is the "correct" one, instead of it being chosen by the cluster. Due to the nature of the Replication Log, one can always tell the state of the data consistency at any time, a valuable tool to have when another data center goes down. In addition, using this structure makes it easy to recover from high-latency scenarios that can occur with inter-continental data transfer.
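The auditability argument is easier to see in code. The following is a minimal sketch of log shipping, assuming an append-only log with sequence numbers; `ReplicationLog` and its methods are hypothetical names and do not reflect the actual HBase design. Because the sender knows how many entries the remote site has received, the replication lag is always a known number.

```python
class ReplicationLog:
    """Append-only log of edits; position in the log acts as a sequence number."""

    def __init__(self):
        self.entries = []
        self.shipped_upto = 0  # number of entries already applied at the remote site

    def append(self, row, value):
        self.entries.append((len(self.entries) + 1, row, value))

    def ship(self, remote, max_entries):
        """Apply the next batch of entries to the remote store, in order."""
        batch = self.entries[self.shipped_upto:self.shipped_upto + max_entries]
        for _seq, row, value in batch:
            remote[row] = value
        self.shipped_upto += len(batch)
        return len(batch)

    def lag(self):
        """How many edits the remote site is behind, known at any time."""
        return len(self.entries) - self.shipped_upto

if __name__ == "__main__":
    log, remote = ReplicationLog(), {}
    for row, value in [("r1", "a"), ("r2", "b"), ("r3", "c")]:
        log.append(row, value)
    log.ship(remote, max_entries=2)   # a slow link only moves two entries
    print("remote is behind by", log.lag(), "edit(s)")
```

A slow intercontinental link only makes the lag number larger; it does not change which edits will arrive or in what order.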

Knowing Which To Choose

The business context of Amazon and Google explains the emphasis on different functionality between Cassandra and HBase.

Cassandra expects High Speed Network Links between data centers. This is an artifact of Amazon's Dynamo: Amazon datacenters were historically located very close to each other (dozens of miles apart) with very fast fiber optic cables between them. Google, however, had transcontinental datacenters which were connected only by the standard Internet, which means they needed a more reliable replication mechanism than the P2P eventual consistency.

If you need highly available writes with only eventual consistency, then Cassandra is a viable candidate for now. However, many apps are not happy with eventual consistency, and Cassandra is still lacking many features. Furthermore, even if writes do not fail, there is still cluster downtime associated with even minor schema changes. HBase is more focused on reads, but can handle very high read and write throughput. It's much more Data Warehouse ready, in addition to serving millions of requests per second. The HBase integration with MapReduce makes it valuable, and versatile.

Sunday, November 29, 2009

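An online game with 11.5 million users: how Blizzard supports World of Warcraft =>

This gaming environment, for which AT&T provides the backbone infrastructure, is supported by 10 data centers scattered around the world. The following specifications have been made public.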

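20,000 servers and 1.3 petabytes of storage are in use
Of these, 13,250 server blades, 75,000 CPU cores, and 112.5 terabytes of RAM are allocated to World of Warcraft, supporting 11.5 million users around the world
The system and its worldwide network are supported by a staff of 68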

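Infrastructure that spans a global network, serves a very large number of users, and is kept running with advanced management technology is a very interesting world, even if the application is specific to the game industry.

Whether or not this counts as cloud, it is certainly drawing interest from people in the cloud field.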

WoW's Back End: 10 Data Centers, 75,000 Cores

November 25th, 2009 : Rich Miller

It takes a lot of resources to host the world's largest online games. One of the largest players in this niche is Blizzard, which operates World of Warcraft and the Battle.net gaming service for its Starcraft and Diablo titles. World of Warcraft (WoW) is played by more than 11.5 million users across three continents, requiring both scale and geographic scope.

Blizzard hosts its gaming infrastructure with AT&T, which provides data center space, network monitoring and management. AT&T, which has been supporting Blizzard for nine years, doesn't provide a lot of details on Blizzard's infrastructure. But Blizzard's Allen Brack and Frank Pearce provided some details at the recent Game Developers Conference in Austin. Here are some data points:

Blizzard Online Network Services run in 10 data centers around the world, including facilities in Washington, California, Texas, Massachusetts, France, Germany, Sweden, South Korea, China, and Taiwan.
Blizzard uses 20,000 systems and 1.3 petabytes of storage to power its gaming operations.
WoW's infrastructure includes 13,250 server blades, 75,000 CPU cores, and 112.5 terabytes of blade RAM.
The Blizzard network is managed by a staff of 68 people.
The company's gaming infrastructure is monitored from a global network operating center (GNOC), which, like many NOCs, features televisions tuned to the weather stations to track potential uptime threats across its data center footprint.
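A few ratios fall out of the figures listed above. This is simple arithmetic on the published numbers, assuming 1 TB = 1,000 GB; the variable names are made up for the example, and the ratios are averages that say nothing about how any individual blade is configured.

```python
# Figures quoted in the data points above.
blades = 13_250
cores = 75_000
ram_tb = 112.5
systems = 20_000
staff = 68

cores_per_blade = cores / blades            # average CPU cores per WoW blade
ram_gb_per_blade = ram_tb * 1000 / blades   # average RAM per blade, in GB
ram_gb_per_core = ram_tb * 1000 / cores     # average RAM per core, in GB
systems_per_staff = systems / staff         # systems per network staff member

if __name__ == "__main__":
    print(f"{cores_per_blade:.2f} cores per blade")
    print(f"{ram_gb_per_blade:.2f} GB RAM per blade")
    print(f"{ram_gb_per_core:.2f} GB RAM per core")
    print(f"{systems_per_staff:.0f} systems per staff member")
```

The cores-per-blade average is not a whole number, which would be consistent with a mix of blade types, although the source does not say so.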

The AT&T Gaming Core Team was formed in 2004 to host gaming operations using AT&T's IP network. The team consists of engineers and hosting specialists who provide round-the-clock support to companies offering MMO games.

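Saturday, November 28, 2009

Cloud and the law, examined from the e-discovery perspective: one difficulty is that case law is unusually scarce =>

E-discovery is a term that comes up often in IT-related legal matters.

It refers to the technology for extracting accurate and complete information from a large volume of data, in a timely manner and within a short time, when confidential corporate information is demanded as evidence in court proceedings and similar situations.

The article points out that in infrastructure that uses cloud computing, the cloud provider will increasingly bear e-discovery responsibilities, and it discusses the need to prepare for that.

The article is written for law firms, so the content is somewhat difficult, but its key point is that because there are few past court cases involving the cloud, difficult judgments are required.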

Posted on November 27, 2009 by Tanya Forsheit

Legal Implications of Cloud Computing -- Part Four (E-Discovery and Digital Evidence)

Back by popular demand, this is Part Four in our ongoing series, Legal Implications of Cloud Computing. This installment will focus on digital evidence and e-discovery, and follows up on Part One (the Basics), Part Two (Privacy), and Part Three (Relationships). After all, what better topic than the cloud to tackle on the day after Thanksgiving, recovering from tryptophan and wine? As with many other areas previously discussed in this series, the cloud does not necessarily change the legal analysis, it just highlights the need to think through and anticipate the many areas of legal concern that could/are likely to arise when using the cloud. As a litigator, when I think about the challenges posed by the cloud, the one that seems most intuitive is e-discovery/digital evidence. It is always difficult to fully appreciate and digest the scope and volume of information that may be called for in litigation or in an investigation. The presence of corporate data in the cloud multiplies those considerations.

Some, but by no means all, of the digital evidence issues that should be considered in negotiating cloud arrangements and contracts (whether you are putting data in the cloud or designing and marketing a cloud offering), are as follows:

preservation/retention/disposal;
control/access/collection;
metadata;
admissibility; and, cutting across all of the foregoing
cost.

As I will discuss below, like other forms of electronically stored information (ESI), one of the best ways for addressing data in the cloud in the discovery and evidentiary context is to plan ahead and discuss treatment of cloud data (a) in records retention policies well in advance of litigation; and (b) at the Rule 26 conference once litigation has commenced. And, if you read to the end, I will comment on the paucity of case law referencing the cloud (and describe the few references that have appeared in federal and state case law to date).

1. Preservation/Retention/Disposal

Organizations often have records retention policies and procedures in place to promote accessibility of information, protect sensitive information, and reduce the costs associated with storage of data that no longer serves any business or legal purpose. Those policies and procedures often call for the routine elimination of electronic information when it has outlived its business purpose and is no longer required to be retained for any legal reason. Numerous statutes and regulations, federal and state, including but not limited to tax, securities, SOX, and employment regulations, mandate that different categories of documents be maintained for certain periods of time. Making matters more complicated, numerous additional regulations require that information that is no longer needed for a business or legal purpose be destroyed such that it cannot be read or reconstructed (see, e.g., the FACTA data disposal rule).

Organizational records retention policies and procedures also address the need to suspend routine disposal and recycling of information in the event of a litigation hold requiring the ongoing preservation of certain categories of data that may be relevant to current or future litigation. These litigation holds are put in place pursuant to an organization's duty (not created by, but conveniently restated in, Zubulake IV, Zubulake v. UBS Warburg LLC, 220 F.R.D. 212 (S.D.N.Y. 2003)) to preserve relevant evidence if they are sued or reasonably anticipate litigation or an investigation. "The obligation to preserve evidence arises when the party has notice that the evidence is relevant to litigation or when a party should have known that the evidence may be relevant to future litigation." Zubulake IV, 220 F.R.D. at 216.

Needless to say, data preservation, retention, and disposal obligations extend to data in the cloud. Data in the cloud is just one more category of discoverable ESI. One of the unique attributes of the cloud is the ability to quickly and inexpensively replicate data for backup and disaster recovery purposes. Cloud users may not even realize how many copies of their data exist in a cloud environment (or where, but we discussed that in Part Two).

Cloud users should incorporate such cloud data into records retention policies, data maps, litigation holds, and disposal procedures. Further, in the event of a litigation hold, a cloud user may need to take special steps to ensure that data in the cloud, which may be continuously replicated and/or overwritten, is preserved in a forensically sound manner. If data is already subject to a litigation hold, potential users of the cloud should evaluate whether such data should be placed in the cloud in the first instance.
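The interaction between routine disposal and a litigation hold can be sketched as a simple check. This is an illustration only, not legal advice or any product's behaviour; the record fields, the category names, and the `eligible_for_disposal` function are hypothetical. The idea is that an expired retention period is necessary but not sufficient: a hold on the record's category suspends disposal.

```python
from datetime import date, timedelta

def eligible_for_disposal(record, today, holds):
    """True only if the retention period has expired and no hold covers the category."""
    if record["category"] in holds:
        return False  # a litigation hold suspends routine disposal
    expires = record["created"] + timedelta(days=record["retention_days"])
    return today >= expires

if __name__ == "__main__":
    today = date(2009, 11, 27)
    old_email = {"category": "email", "created": date(2002, 1, 1), "retention_days": 2555}
    print(eligible_for_disposal(old_email, today, holds=set()))      # retention expired
    print(eligible_for_disposal(old_email, today, holds={"email"}))  # hold in place
```

For cloud data, the same check would need to be applied to every replicated copy, which is exactly where the difficulty described above comes in.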

2. Control/Access/Collection

Under Rule 34 of the Federal Rules of Civil Procedure, a party may serve on any other party a request within the scope of Rule 26(b): (1) to produce and permit the requesting party or its representative to inspect, copy, test, or sample the following items in the responding party's possession, custody, or control. Who has control of data in the cloud? Well, the data owner. Ordinarily, that will be the organization that is putting data in the cloud, not the cloud provider. However, both users and providers of cloud services should carefully review and negotiate the terms of service level agreements to specify who technically owns the data in the cloud.

Service level agreements should also address how the cloud user and cloud provider will cooperate in responding to party or non-party discovery requests. The agreement should address the following questions, among others: In the event of a Rule 34 request to the cloud user, how will the cloud user access the data in the cloud? Rule 34(b)(2)(A) provides 30 days to respond in writing to a document request. How quickly will the cloud user be able to access the data in order to review it for discovery purposes? In the event of a subpoena to a non-party cloud provider, how will the cloud provider respond? Will the cloud provider notify the cloud user, and how quickly? Will the cloud provider seek a protective order to prevent and/or limit the disclosure of the cloud user's data? Is the cloud provider even legally required to turn over the data under the Stored Communications Act or other statutes?

This blog post does not address itself to the even more complex considerations that arise if the EU Data Protection Directive applies to the cloud data that is the subject of the document request (e.g., if the data involves EU residents and is being transferred between the EU and the US, and who knows what other jurisdictions, while swirling around in the cloud). The mere processing of such information could very well violate the Directive and member country laws. That is the subject of past and future posts.

3. Metadata

Of course, litigants may also discover metadata. The default rule, in the absence of a stipulation or court order, is that a party must produce ESI in a form or forms in which it is ordinarily maintained or in a reasonably usable form or forms. Rule 34(b)(2)(E)(ii). Almost inevitably, ESI in the form in which it is ordinarily maintained will contain metadata.

Cloud users responding to Rule 34 requests need to determine in what form they will produce ESI in the cloud. They also need to consider, in advance, the potential need for special protections and objections with respect to that cloud metadata -- it may be too late to consider such objections once the cloud data review is underway. Further, cloud providers (and users alike) need to consider the possibility that certain metadata will only reside with the cloud provider and how that affects the parties' discovery obligations (especially if the cloud provider might be considered the data owner for purposes of that metadata).

4. Admissibility

The flipside of the explosion of case law and commentary addressing e-discovery over the past several years, particularly since the amendments to the Federal Rules in late 2006, is the stunning lack of case law addressing admissibility of ESI. One of my favorite decisions, for that very reason, is United States Magistrate Judge Paul W. Grimm's treatment of these issues in Lorraine v. Markel Am. Ins. Co., 241 F.R.D. 534 (D. Md. 2007). Lorraine was an unlikely candidate to spawn a 100-page opinion on authentication of electronic evidence -- it involved a yacht struck by lightning. However, Judge Grimm, clearly disappointed by the parties' failure to authenticate even basic e-mails (they were simply attached to the parties' motions as exhibits), took the opportunity to provide much needed guidance.

I am unaware of any case law specifically addressing admissibility of ESI in the cloud. (More on that lack of case law regarding the cloud generally below.) In the interim, Judge Grimm's guidelines, going back to basics, are well worth a read. Like any other litigant purporting to introduce ESI as evidence, a litigant introducing cloud data must be able to demonstrate that the ESI is relevant and authentic, that it is not precluded by the hearsay rule (or fits within one of its exceptions) or the best evidence rule, and that its probative value is not substantially outweighed by the danger of unfair prejudice. As noted by the court in Lorraine,

Whether ESI is admissible into evidence is determined by a collection of evidence rules that present themselves like a series of hurdles to be cleared by the proponent of the evidence. Failure to clear any of these evidentiary hurdles means that the evidence will not be admissible. Whenever ESI is offered as evidence, either at trial or in summary judgment, the following evidence rules must be considered: (1) is the ESI relevant as determined by Rule 401 (does it have any tendency to make some fact that is of consequence to the litigation more or less probable than it otherwise would be); (2) if relevant under 401, is it authentic as required by Rule 901(a) (can the proponent show that the ESI is what it purports to be); (3) if the ESI is offered for its substantive truth, is it hearsay as defined by Rule 801, and if so, is it covered by an applicable exception (Rules 803, 804 and 807); (4) is the form of the ESI that is being offered as evidence an original or duplicate under the original writing rule, or if not, is there admissible secondary evidence to prove the content of the ESI (Rules 1001-1008); and (5) is the probative value of the ESI substantially outweighed by the danger of unfair prejudice or one of the other factors identified by Rule 403, such that it should be excluded despite its relevance.

Litigants may find a number of these evidentiary hurdles particularly challenging when it comes to cloud data, especially authenticity and hearsay. The proponent of even an email, blog post, IM, tweet, or other communication that resides only in the cloud may need to secure declarations, deposition testimony, or even live testimony of the author(s), the recipient(s), the data custodian, and/or the cloud provider itself. The same analysis must be considered for each and every such communication.

5. Cost

The costs associated with any e-discovery can be substantial. In the absence of well-drafted agreements between cloud users and providers, the presence of data in the cloud can only exacerbate those e-discovery costs. The parties to a cloud services agreement must determine which party will cover the costs associated with preserving, accessing, collecting, reviewing, and establishing admissibility of data in the cloud. Parties considering use of the cloud for certain kinds of data should evaluate whether the cost savings associated with using the cloud for that particular purpose outweigh the costs associated with processing data for discovery purposes if and when that becomes necessary.

Some Final Thoughts -- Current Lack of Case Law on the "Cloud"

I sometimes get questions about existing case law regarding the cloud. There is very little case law that actually uses the terminology.

Up until late July of this year, a search of Westlaw for "cloud computing" in all federal and state cases produced only one hit, Rearden LLC v. Rearden Commerce, Inc., 597 F. Supp.2d 1006 (N.D. Cal. Jan. 27, 2009). That case did not actually involve the substance of cloud computing. It was a trademark infringement matter. As one of the arguments in support of their position that defendant's "Personal Assistant" software directly competed with plaintiffs' incubation and/or movie production services, plaintiffs maintained that both parties used "Cloud Computing" (the court's opinion used the term in quotes and initial caps). The court, referring to a party declaration, described "Cloud Computing" as "a term used to describe a software-as-a-service (SAAS) platform for the online delivery of products and services." (Compare the court's description to the NIST definition of cloud computing discussed in Part One.) It rejected plaintiffs' argument that defendant's primary business was "Cloud Computing," finding that "Cloud Computing" was merely the platform, not the end product: "plaintiffs erroneously conflate[d] a platform by which defendant launches its end service to consumers (i.e., software) with the end product itself (i.e., a web-based marketplace). Indeed, plaintiffs state that it is the technology developed on the SAAS platform that will likely compete with other SAAS/ Cloud Computing companies. Plaintiffs do not discuss the product itself, but merely the underlying platform used to create it." Rearden LLC, 597 F.Supp.2d at 1021.

There are two more recent decisions that now come up in the same Westlaw search for "cloud computing": an unpublished procedural ruling in International Business Machines Corp. v. Johnson, 2009 WL 2356430 (S.D.N.Y. July 30, 2009), and an Oregon state court opinion in a criminal matter, State v. Bellar, 231 Or.App. 80, 217 P.3d 1094 (Sept. 30, 2009).

Johnson only mentions cloud computing in passing. The court rejected IBM's second attempt to obtain a preliminary injunction that would stop a former Vice President of Corporate Development from working in any role at his new employer, Dell, that would involve mergers and acquisitions, "as well as any role that would require him to advise Dell on its strategies related to such matters as enterprise services, servers, storage, so-called 'Cloud' computing and business analytics." The court rejected the second preliminary injunction request on procedural grounds.

The most recent opinion mentioning cloud computing, Bellar, involved an appeal regarding a motion to suppress in a prosecution for 40 counts of encouraging child sexual abuse in the second degree. The dissent discussed the defendant's privacy rights with respect to information in the cloud:

Nor are a person's privacy rights in electronically stored personal information lost because that data is retained in a medium owned by another. Again, in a practical sense, our social norms are evolving away from the storage of personal data on computer hard drives to retention of that information in the "cloud," on servers owned by internet service providers. That information can then be generated and accessed by hand-carried personal computing devices. I suspect that most citizens would regard that data as no less confidential or private because it was stored on a server owned by someone else.

In 2010, we will undoubtedly start to see judges using cloud terminology and analyzing the consequences of the rapid spread of different kinds of data (trade secrets, privileged information, PII) in the cloud, both in pretrial discovery, at trial, and with respect to the merits of cases involving such information. In the meantime, as always, technology races ahead of the law.

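Interview with Amazon CTO Werner Vogels: he says understanding of the cloud has grown and its value is being recognized =>

This is from an interview given in Europe.

He explains that awareness of cloud computing has risen sharply over the past six months and that understanding of its value has grown. Examples:

Global companies are using the cloud to integrate their ICT operations across countries
Companies are using cloud computing for short-term projects such as in-house research and development.
Disaster-recovery systems on a worldwide scale become possible (reportedly handled with Availability Zones)

There is no disputing that AWS has played a major role in spreading cloud computing this widely in the market, and that deserves great credit. At the same time, it is necessary to understand that AWS differs greatly from Microsoft, IBM, and others in that it is devoted to providing cloud-only ICT solutions.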

Interview: Amazon CTO Werner Vogels on why CIOs love clouds

Cliff Saran

Friday 27 November 2009 09:28

Chief information officers are more comfortable with the idea of cloud computing than they were six months ago.

Companies are happy to use the public cloud for software projects, research and development and to run web marketing campaigns.

Amazon has been one of the pioneers of cloud computing with Amazon Web Services (AWS), which provides users with a relatively low-cost way to access Amazon's vast IT infrastructure on a pay-per-use basis. Werner Vogels, vice-president worldwide architecture and chief technology officer at Amazon, has been on a European tour this week to find out from CIOs what they need from the public cloud.

He says, "CIOs are now much better informed about the different cloud computing offerings available. People are using AWS not only for software development, testing and prototyping new applications, but also to support collaboration using applications like Microsoft SharePoint hosted on AWS."

AWS is one of the earliest cloud services and has become popular with software developers looking to build, test and prototype applications without having to invest heavily in software development tools and server infrastructure. "Moving software development into the cloud is a good way for users to understand how cloud computing can be used in a production environment," says Vogels.

He says CIOs are also interested in using the cloud to facilitate global collaboration. If a collaboration platform such as Microsoft SharePoint is hosted in the internet cloud, it can be accessed from anywhere, which makes it far easier for geographically dispersed teams to work together than if internal IT was wholly responsible for connecting the users into a shared workspace.

"Eli Lilly is doing collaborative drug research using external researchers who collaborate over AWS," says Vogels.

This means that IT does not have to spend months procuring servers to support collaborative research projects. Vogels says that on AWS, Eli Lilly is able to set up servers to support the research projects in a matter of minutes.

Vogels says the cloud has other benefits for collaborative projects. "You can also tear down the collaborative environment very quickly. It is easy to restart and there is no need for up-front IT investment."

Rival drug firm Pfizer is using AWS as a computational grid to enable it to run programs which analyse the human genome to identify potential new drugs.

The UK media sector has been a big user of AWS. The Guardian, Telegraph and Channel 4 used AWS to cover the MPs' expenses scandal story. Vogels says AWS was used to provide the news sites with the scalability to support demand if millions of people tried to access the stories.

Similarly, he says, "Marketing campaigns can attract millions of customers. In fact, anything that uses social media is going to succeed." But internal IT may not be able to cope with the huge peak in website traffic, which Vogels says is where AWS can step in to provide the extra server capacity to meet demand.

In terms of applications, he says CIOs are putting Windows, open systems and Linux software on AWS. Commercial software can also be licensed to run on AWS, although each software firm sets its own licensing policies.

Oracle users can move their Fusion, eBusiness Suite or Fusion Middleware licences onto AWS, while IBM charges a small fee on top of the AWS service for access to DB2 and WebSphere middleware. Microsoft SQL Server and Windows Server licences can also be transferred to the cloud, but users will still require a client access licence to provide end-users with access to the Microsoft server software.

Vogels believes the cloud offers businesses a big opportunity to create their own web services accessible over the internet. Software companies can provide web services that in-house applications and commercial products can use to add extra functionality. He says companies that are not in the IT sector may see an opportunity to use the cloud to commercialise some of their internal IT systems as external web services. This is already happening with the telecommunications companies, but Vogels believes there is nothing stopping other types of business from offering cloud-based web services.

On security, Vogels says Amazon offers the concept of a virtual private cloud, where users allocate a set of their own IP addresses to a closed off space on AWS. With a virtual private cloud, IT management tools such as BMC Patrol, which IT admin staff use to manage their datacentres and software deployments, also work across the Amazon cloud. In effect, AWS becomes an extension to the company's datacentre.

"We offer the concept of regions, which allows CIOs to specify the geographic location of the servers they use on AWS to support regulations in different countries," says Vogels.

Furthermore, Vogels says Amazon provides "availability zones", which give CIOs the option to provide multiple datacentre sites via the Amazon cloud to support failover and disaster recovery.
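To make the relationship between regions and availability zones a little more concrete, here is a minimal Python sketch. It only illustrates the idea Vogels describes (pick a standby site in the same geographic region, but in a different datacentre). The `Zone` class, the `pick_failover` helper and the zone names are all hypothetical and are not part of any AWS API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Zone:
    region: str           # geographic location, chosen for regulatory reasons
    name: str             # an isolated datacentre site within that region
    healthy: bool = True  # hypothetical health flag


def pick_failover(primary, zones):
    """Return a healthy zone in the primary's region that is a different site."""
    for zone in zones:
        if zone.region == primary.region and zone.name != primary.name and zone.healthy:
            return zone
    return None  # no second site available in this region


# Hypothetical inventory: two sites in one region, one site in another.
zones = [
    Zone("eu", "eu-site-a"),
    Zone("eu", "eu-site-b"),
    Zone("us", "us-site-a"),
]

standby = pick_failover(zones[0], zones)
print(standby.name)  # eu-site-b
```

Keeping the standby inside the same region is a design choice that mirrors the regulatory point above: the data never leaves the chosen geography, even during failover.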

"CIOs spend a lot of money on traditional disaster recovery. With AWS we offer flexibility, allowing them to allocate servers up-front, which can be used to support disaster recovery. When the disaster recovery service is not invoked the servers are reallocated, so they can be used to run applications or software development."

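Thursday, November 26, 2009

Cloud identity management: the need for it has long been talked about, and Microsoft has at last announced Windows Identity Foundation at PDC =>

User identity information (passwords and the like) sits at the core of any security solution. The need to manage it centrally across many SaaS applications, or across applications running in cloud environments, is growing as the choice of SaaS applications and cloud environments widens.

Besides Microsoft, vendors such as Quest and Ping Identity are also active in this area.

OpenID exists as well, but I suspect what is wanted is something with stronger, more enterprise-oriented security.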

Microsoft adds identity to cloud

Releases Windows Identity Foundation, formerly the Geneva project

Security Identity Management Alert By Dave Kearns, Network World, 11/25/2009

Everyone eyeing Azure, Microsoft's candidate for cloud-based computing, can at least agree on one thing: Redmond is late to the party that's dominated by Salesforce.com, Google, Amazon and a host of others. How can they hope to differentiate themselves?

Microsoft's JG Chirapurath, director of marketing for the Identity and Security Division, knows exactly how, and he told me about it last week. Identity is the key differentiator.

Last spring ("Identity management is key to the proper operation of cloud computing,") I noted that some people were finally beginning to realize that identity had a part to play in cloud-based computing, but very little has been done. Until Microsoft's announcements last week at their Professional Developers Conference (PDC), that is.

Along with lots of info about Azure, Microsoft also rolled out what's now called the Windows Identity Foundation (formerly the Geneva project). This is the glue that's needed for third-party developers to work with Windows CardSpace (and other information card technologies) to secure -- among other things -- cloud-based services and applications.

The release of the identity framework puts Microsoft ahead of all of the other cloud-based solution providers (many of whom are still struggling to attempt to adapt OpenID, with its security problems, to their cloud scenarios).

In a related announcement, Quest Software noted the launch of its first set of software-as-a-service Windows management solutions. Called "Quest OnDemand" the services will be hosted on Windows Azure, securely managing IT environments by leveraging the Windows Identity Foundation (WIF) and Active Directory Federation Services (ADFS) 2.0. Quest's first modules are available in beta. They are:

* Quest Recovery Manager OnDemand for Active Directory -- provides backup and object-level recovery of Active Directory data. It is designed to enable flexible, scheduled backups without manual intervention, facilitating quick and scalable recovery of Active Directory data.

* Quest InTrust OnDemand -- securely collects, stores, reports and alerts on event data from Windows systems, helping organizations comply with external regulations, internal policies and security best practices.

Both products are expected to be generally available in Q1 2010 on a subscription basis without requiring on-premises deployment and maintenance.

Microsoft intends to be the winner in the cloud-based computing game, and the Windows Identity Foundation is their trump card.

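Comparing the storage services of Rackspace, Google and Microsoft: as expected, a price collapse is under way in storage SaaS. The verdict goes to =>

Rackspace, for now, but I get the sense that it is becoming hard to compete any further on simplicity and low price alone.

As a business model, I think the point is that a provider will not survive unless it offers cheap (or free) storage while making its money from value-added services built on that data.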

Analysis: Rackspace Beats Google, Microsoft At Hosted Storage

By Edward F. Moltzen, ChannelWeb

9:32 EST, Nov. 25, 2009

Rackspace has very quietly outflanked giants Microsoft (NSDQ:MSFT) and Google (NSDQ:GOOG) in online storage and backup, giving the hosting and cloud computing upstart bragging rights as the market races toward the new IT model in 2010 and beyond.

Unveiled last week, Rackspace's Cloud Drive provides a quick, easy-to-deploy solution for "cloud-based" storage and file backup. Here are the basics: At a price of $4 per month, per user, Rackspace will offer a company or workgroup 10 GB of file storage per user. Using the interface of the Jungle Disk Workgroup Activity Manager (Jungle Disk is a Rackspace subsidiary), files can be managed and backups can be scheduled and tailored to a specific need. From a desktop or server, the files are copied onto Rackspace's storage infrastructure where they can be managed or retrieved.

Some takeaways from a look at Cloud Drive from the CRN Test Center:

It's simple to set up and deploy, taking about 15 minutes from signing up for the service and scheduling regular backups of specific folders on a PC.

The syncing and sharing features work for the most part. One overnight, scheduled backup failed with this message:

"Unknown SSL protocol error in connection to g4.gateway.jungledisk.com:443 [g4.gateway.jungledisk.com] Error Location: JungleHTTP.cpp:825 JungleHTTP::MakeRequest via JXRTransport.cpp:634 JDGatewayConnection::ExecuteAsyncThread"

A manual backup of the same folder was successful without rebooting the desktop, the server or the network. (It may have been other, overnight scheduled system activity that caused a conflict.) On two other nights, scheduled backups took place with no problems.

The Jungle Disk Workgroup Activity Monitor is a desktop application; it did require a system reboot to start. However, over the past week Rackspace launched a beta version of a Web interface to this service for file access. It appears to work fine.

Unlike other offerings, Rackspace provides security in the form of AES-256 encryption -- a smart move that may make the service more attractive to those with compliance concerns.

Overall, the experience using the Rackspace cloud-based storage and backup wasn't perfect, but it was good. Importantly, it has been delivered to market in a way that is far more useful to smaller enterprises than Google's online storage, which is delivered through its Picasa photo-album app and Gmail, or Microsoft's lackluster Sky Drive. The ability to schedule backups and perform system restorations inside the Rackspace offering -- in addition to the company's functionality that allows employees to sync and share files across the network -- puts it at the head of the line, ahead of its two larger rivals in the cloud.

It should be noted that Google recently increased its storage offering to 20 GB of online capacity for $5 per year, shared between Picasa and Gmail, with as much as a full terabyte of storage available to those willing to pay $256 per year. But Google lacks the syncing, sharing and administrative functions of Rackspace.

Also know that Microsoft has made some improvements to Sky Drive since the product first launched, including the ability to share files with people accepted into a personal network. And, while Sky Drive offers a maximum of 25 GB of storage for free, it also lacks the automation tools provided by Rackspace. And Sky Drive still has a gawky and awkward design and Web-based layout.

Here's a quick Tale of the Tape:

Capacity and Price: Rackspace offers 10 GB of capacity that can be shared between colleagues for $4 per month per user; Google offers 8 to 10 GB for free or 20 GB for $5 for one year; Microsoft offers 25 GB for free on Sky Drive.

Automation: Rackspace's Jungle Disk offers scheduled backup to its cloud service; Google and Microsoft provide no automation tools.

Collaboration: Rackspace allows for syncing and collaboration among colleagues via its desktop console; Google storage is limited to sharing provided in Gmail or Picasa; Sky Drive allows for link and file sharing between users on a personal network.

Security: Google and Microsoft provide password-protected access to accounts; Rackspace provides password protection and AES-256 encryption.

Availability: One Rackspace scheduled backup failed over the course of several days; Gmail has experienced several high-profile outages this year; Sky Drive has been largely available with varying amounts of latency.
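The prices quoted in this article are easier to compare once they are put on the same footing. The short Python sketch below converts each quoted plan into an annual cost per GB. It uses only the figures given in the article; the plan labels and the `annual_cost_per_gb` helper are my own, and treating 1 TB as 1024 GB is an assumption.

```python
# (capacity in GB, price in US dollars per year), taken from the article.
plans = {
    "Rackspace Cloud Drive, 10 GB": (10, 4.00 * 12),  # $4 per user per month
    "Google paid tier, 20 GB": (20, 5.00),
    "Google paid tier, 1 TB": (1024, 256.00),         # assumes 1 TB = 1024 GB
    "Microsoft Sky Drive, 25 GB": (25, 0.00),         # free
}


def annual_cost_per_gb(capacity_gb, annual_price_usd):
    """Annual price divided by capacity, in dollars per GB per year."""
    return annual_price_usd / capacity_gb


for name, (capacity_gb, annual_price_usd) in plans.items():
    cost = annual_cost_per_gb(capacity_gb, annual_price_usd)
    print(f"{name}: ${cost:.2f} per GB per year")
```

On raw capacity alone Rackspace is by far the most expensive of the three, which is consistent with the article's argument that its edge lies in automation, collaboration and encryption rather than in price per GB.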

While not free, like Google's storage or Sky Drive, Rackspace's service provides nice automation and administration and management tools. Packaged with Rackspace's hosted e-mail offering, it puts the company right in the game with the industry giants and should be on the map for small businesses considering moving partially or slowly to a cloud-based IT model. It's not perfect, but marks a solid start for Rackspace as the industry stands at the threshold of the cloud era.

The bottom line: Free, online storage for the masses has been available for almost 15 years, since Yahoo (NSDQ:YHOO) introduced its now-defunct "Briefcase" service. Google and Microsoft continue to tweak and improve what they have in the market, but Rackspace has gone the extra distance to make its Cloud Drive offering not only user-friendly but business-friendly as well. Improvements to Cloud Drive should be expected over time, but for now it is outflanking two of its biggest rivals.

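Wednesday, November 25, 2009

Predictions for the cloud market in 2010: frankly, nothing here is all that surprising. With Thanksgiving coming up, the quality of articles seems to have dipped a little.

If I had to pick one point, it is that cloud consulting will take off.

So once the cloud becomes mainstream, it turns into a systems integration market, I suppose.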

2010 Cloud Computing Predictions: The Year of Realism

Contributed article by Ian Knox, Senior Director of Product Management at Skytap

The buzz around cloud computing reached fever pitch this year, culminating in Gartner placing cloud computing at the peak of its hype cycle in July. We saw controversy around the Open Cloud Manifesto, the federal government getting into cloud computing with apps.gov, and almost every IT vendor trying to put a cloud spin on their marketing message (whether their offerings are cloud-like or not!).

However, hype aside, there is a widespread recognition that the operational and economic model of cloud computing will transform IT over the next few years. As we close 2009, which represented a year of uncertainty, reduced budgets and cautious recovery for most in the IT industry, I believe we are at the tipping point for widespread adoption of cloud services in 2010. With that background, here are Skytap's five cloud computing predictions for the upcoming year:

1. Hype Replaced by Pragmatic Adoption

Many enterprises see the potential of the cloud computing model, but have been trying to understand the proven use cases before taking the plunge. Fortunately, we've seen many early adopters lead the way this year and there is an emerging consensus around the top scenarios which take advantage of the scalable, multi-tenant and on-demand nature of cloud computing. These scenarios include: (1) Development and Test, (2) IT Prototyping and Proof-of-Concepts (POCs), (3) Scalable Web Hosting, (4) Email, (5) Collaboration, (6) Grid Computing/Scientific Calculations and (7) Virtual Training. In 2010 we'll see these scenarios become well-defined blueprints for enterprise adoption of cloud services.

2. Moving Beyond 'VMs On Demand' to Cloud Solutions

We now see a slew of companies that are offering 'Virtual Machines on Demand' for a few cents an hour. However, this still requires organizations to do much of the work to make these 'Infrastructure-as-a-Service' offerings available to their employees. Over the next year, we'll see vendors who offer cloud services as complete solutions win over basic infrastructure offerings. For instance, solutions that help integrate internal and external clouds, provide enterprise single sign-on and security, offer billing and chargeback mechanisms, enable business processes and workflow, and automate complex tasks will prove more valuable to enterprise customers than hosted virtual machines.

3. Emergence of Best Practices

If 2009 was the year of defining key cloud computing scenarios, 2010 will be the year of best practices. As more and more IT practitioners gain experience with cloud services, this knowledge will disseminate in the industry and best practices around security, networking, 'hybrid clouds', application architecture and IT policies will become widespread. This will also include best practices around negotiating service level agreements (SLAs) (both internal and external) and contracts.

4. Cloud Consolidation and Brokerage

As more and more enterprises adopt cloud solutions, we'll see vendors differentiate on support and services, enterprise integration and pricing models, performance and SLAs. In tandem with vendors differentiating their offerings, we'll start to see some consolidation in the industry as major vendors look to build out a portfolio of offerings that can be leveraged through their existing channels and sales organizations. Finally, it's likely that some early cloud service brokers will gain traction to shield enterprises from negotiating with multiple vendors around SLAs, security credentials and pricing.

5. The Rise of Cloud Consulting

Given the high interest in cloud computing and organizations looking for impartial advice in the face of a confusing vendor landscape, we'll see consulting practices built around the design and implementation of the major cloud scenarios. Initially, this will be driven by boutique consulting shops, but by the end of 2010 we'll see many of the major consulting players offer consulting practices and advice for adopting a cloud-based approach to IT.

As we move into "The Year of Realism," keep your eyes on these trends as they will no doubt shape the next phase of cloud computing.

About the Author

Ian Knox is Senior Director of Product Management at Skytap where he is responsible for all aspects of the company's product management, market positioning, demand generation and go-to-market strategy.

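According to Springboard Research, China's SaaS market will grow 56% by the end of 2010, and over the next three years =>

its growth will outstrip that of traditional IT services.

The reason is the low upfront cost of adopting SaaS.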

Chinese SaaS set for growth (19/11/2009)

A new report has predicted growth for the Chinese Software-as-a-Service (SaaS) sector.

Springboard Research has suggested an increase in revenues of 56 per cent for the industry by the end of 2010.

This would take money generated in the country from SaaS to $171 million (£102 million).
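As a quick sanity check on these figures, the Python sketch below works backwards from the projection. It assumes the 56 per cent increase is measured against 2009 revenue, which the report summary implies but does not state outright; the variable names are my own.

```python
projected_2010_usd_m = 171.0  # projected SaaS revenue, millions of US dollars
projected_2010_gbp_m = 102.0  # the same figure as quoted in pounds
growth_rate = 0.56            # 56 per cent increase

# Revenue before the increase, assuming growth is measured from this base.
implied_base_usd_m = projected_2010_usd_m / (1 + growth_rate)

# Exchange rate implied by the two quoted figures.
implied_usd_per_gbp = projected_2010_usd_m / projected_2010_gbp_m

print(f"Implied base revenue: ${implied_base_usd_m:.1f} million")
print(f"Implied exchange rate: {implied_usd_per_gbp:.2f} USD per GBP")
```

Under that assumption the starting point works out to roughly $110 million, so the market being described is still small in absolute terms despite the high growth rate.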

According to the study, growth in the sector will continue over the next three years and will outstrip that of traditional IT services.

Devin Wang, business analyst for emerging software at Springboard Research, said: "The appeal and growth of SaaS in China are based on the advantages of SaaS applications compared to traditional software such as lower upfront costs, easier maintenance and quick roll-outs."

He added that his firm envisages "aggressive demand" for SaaS in China over the coming months.

Earlier this week, Wael Mohamed, an industry expert, commented at a Sydney conference that SaaS solutions are "essential" for many companies, CRN.com reported.

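Monday, November 23, 2009

AT&T's Synaptic Compute as a Service: the stated strategy is to complement Amazon Web Services rather than to compete with it =>

That said, in practice it is a strategy aimed squarely at Amazon Web Services.

The difference is that AT&T stresses that it removes reliability, the main source of unease about Amazon Web Services, and proposes that "development" be done on AWS while "production" runs on AT&T.

I honestly think this is a very smart strategy.

AT&T also supports private cloud operation, and deserves credit for having studied enterprise needs closely.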

AT&T Cloud Adds Compute As a Service

November 16th, 2009 : Rich Miller

AT&T continues to build out its suite of cloud computing services to offer features similar to Amazon Web Services. Today it is announcing Synaptic Compute As a Service, which offers processing power that can be used for "cloudbursting" of in-house apps or as a testing and development platform. The service can run as a public cloud, or as a private cloud on AT&T's infrastructure, connected to a customer data center by AT&T's network.

The new offering expands AT&T's cloud portfolio, which also includes AT&T Synaptic hosting – essentially a managed hosting offering using cloud technologies – and Synaptic Storage As a Service. With the storage and compute services, AT&T is hoping to leverage its familiar brand and network to win over enterprises seeking a comfort level with cloud computing.

"This will enable customers to create a provider-based private cloud accessed either via the public Internet or private connections, which many companies will already have with AT&T," said Steve Caniano, Vice President, Hosting and Cloud Services for AT&T. "We're looking to address the concerns customers have related to cloud computing. The model that folks like Amazon have introduced is of interest to a lot of customers. We're offering the same kind of value proposition to enterprises, but without the issues that scare them a little bit."

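Saturday, November 21, 2009

Three new MS Azure codenames: Sydney, Dallas, AppFabric =>

The original Azure concept included application components such as Dynamics, SharePoint, .Net and Live, but this time those have been removed, leaving three components.

Project Sydney:

Infrastructure that connects on-premises applications with Azure cloud applications. In addition to IPSec and IPV6, it uses the identity management component known as Geneva.

Dallas:

A data-as-a-service model: a mechanism for keeping data in the cloud and accessing it freely.

AppFabric:

Brings together the "Dublin" app server and "Velocity" caching technology, along with .Net Services.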

Three new codenames and how they fit into Microsoft's cloud vision

Posted by Mary Jo Foley @ 2:09 pm

Any Microsoft Professional Developers Conference (PDC) wouldn't be complete without a few new codenames. On November 17, Microsoft introduced three new ones that all are related to Microsoft's evolving cloud-computing vision and infrastructure.

During the Day One set of keynotes, Microsoft officials attempted to explain further how the company's three-screens-and-a-cloud vision will take shape in product and service form.

Last year, when it rolled out its first Windows Azure Community Technology Preview, Microsoft showed a "layer cake" type diagram which showed all of the various Azure layers and components as a comprehensive whole. (See last year's layer cake at right.)

This year, there was no diagram. The new message is that Microsoft's cloud is comprised of Windows Azure (the Red Dog operating system), SQL Azure and a new AppFabric development platform. That's it. Gone are the Live Services, .Net Services, SharePoint Services, and Dynamics CRM Services that were all part of the original platform.

Did Microsoft decide its original vision was too ambitious? It seems more the case that it has decided some of the original pieces didn't belong as part of the core Azure platform, such as Live Services, which are now part of Windows/Windows Live. In other cases, Microsoft has repackaged other elements of its original platform in different ways (example: the slimmed-down .Net Services is now part of AppFabric).

In the midst of all this movement, Microsoft introduced the three new cloud-related codenames today. How do they fit into Microsoft's newly flattened cloud cake?

* Project Sydney: Technology that enables customers to connect securely their on-premises and cloud servers. Some of the underlying technologies that are enabling it include IPSec, IPV6 and Microsoft's Geneva federated-identity capability. It could be used for a variety of applications, such as allowing developers to fail over cloud apps to on-premises servers or to run an app that is structured to run on both on-premises and cloud servers, for example. Sydney is slated to go to beta early next year and go final in 2010.

* Dallas: Microsoft's "data-as-a-service" offering. Dallas is a new service built on top of SQL Azure that will provide users with access to free and paid collections of public and commercial data sets that they can use in developing applications. The datasets are available via Microsoft's PinPoint partner/ISV site. Dallas is hosted on Azure already and is available as of today as an invitation-only CTP. No word on when Microsoft is hoping to release the final version of the service.

* AppFabric: AppFabric is a collection of existing Azure developer components, including the "Dublin" app server, "Velocity" caching technology, and .Net Services (the service bus and access control services). The on-premises Windows Server AppFabric version of the product is available for download today, with final availability slated for 2010. Community Technology Previews (CTPs) of the Windows Azure AppFabric version are slated to be available during 2010. No word on when the final Azure-based version will be out.

Microsoft made available last week a November release of its own Windows Azure SDK and related tools. The new releases include an update to Windows Azure Tools for Microsoft Visual Studio, which extends VS 2008 and VS2010 Beta 2 so they can create, configure, build, debug and run Web apps and services on Windows Azure.

Roger Jennings, a cloud computing expert and author of the Oakleaf Systems blog, said that the November release of the Windows Azure SDK includes "something Azure devs have been asking for and needed to compete with AWS EC2 (Amazon Web Services' Elastic Compute Cloud): Variable-size virtual machines (VMs)." Using that feature, Azure developers may now specify the size of the virtual machine to which they wish to deploy a role instance, based on the role's resource requirements. The size of the VM determines the number of CPU cores, the memory capacity, and the local file system size allocated to a running instance, Jennings noted.
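The idea of matching a role's resource requirements to a VM size can be sketched in a few lines of Python. This is only an illustration of the selection logic, not the Azure SDK: the `VmSize` type, the `smallest_fit` helper, the size names and all of the numbers are placeholders I made up, not Microsoft's published specifications.

```python
from typing import NamedTuple, Optional


class VmSize(NamedTuple):
    name: str
    cpu_cores: int
    memory_gb: float
    local_disk_gb: int


# Placeholder catalogue, ordered from smallest to largest (illustrative figures).
SIZES = [
    VmSize("small", 1, 2, 250),
    VmSize("medium", 2, 4, 500),
    VmSize("large", 4, 8, 1000),
]


def smallest_fit(cores: int, memory_gb: float, disk_gb: int) -> Optional[VmSize]:
    """Return the smallest size that meets every requirement, or None."""
    for size in SIZES:
        if (size.cpu_cores >= cores
                and size.memory_gb >= memory_gb
                and size.local_disk_gb >= disk_gb):
            return size
    return None


# A hypothetical role that needs 2 cores, 3 GB of memory and 100 GB of disk.
choice = smallest_fit(cores=2, memory_gb=3, disk_gb=100)
print(choice.name)  # medium
```

Because all three resources scale together with the size, a role that is heavy on just one of them (say, memory) ends up paying for cores and disk it may not use.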

In a similar vein, Amazon quietly released on November 11 version 1.0 of its Amazon Web Services (AWS) software development kit for .Net. The SDK allows developers to "get started in minutes with a single, downloadable package complete with Visual Studio project templates, the AWS .Net library, C# code samples and documentation," according to a note Amazon forwarded me over the weekend.

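Google's newly announced Chrome OS is entirely cloud-only =>

which makes it quite different from a conventional OS.

A network connection is assumed; offline use relies on Google Gears
Although it is open source, the certification process looks to be quite strict
Solid-state disks only; hard disks apparently will not be supported (most PCs on sale today will not qualify)
Device support is unclear
Security apparently builds in some way on capabilities defined by the Data Liberation Front

All in all, a fairly novel approach.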

Crazy Google Kids at it Again with Chrome OS

Google kicked off the launch of its Chromium OS project today with a presentation on Chrome OS. The first thing you'll notice is that the name of Google's consumer product will be Chrome OS, while the open source project is named Chromium OS. My guess: Google will bless the usage of the Chrome OS name by granting trademark rights to those who comply with Google's standards. Google didn't say that, but that's what I would do.

The next thing I noticed is that Chrome OS will be completely "cloud-based". As in, no local data. As in, all web apps all the time. As in, it's only useful to the extent that there's an internet connection. This will likely prove to be a Google Rorschach test. Those already predisposed to disliking anything Google does will find this horrifying. Those who think Google is the bee's knees will conclude that it's not completely evil and, indeed, is the next logical evolution of desktops-in-the-cloud technology.

This is at first glance a radical change from what we've previously called an operating system. At second glance, it really does make a lot of sense. I and several others have been trying to advocate the usage of open source software as a platform and delivery system for automated services over a network. Most companies in the commercial open source space have been, for reasons beyond my comprehension, slow to completely embrace this strategy. Now Google is taking this concept to the extreme. The question is, is it too extreme?

The benefits are enormous - without all the overhead that most operating systems have to deal with, Chromium engineers are free to optimize to their heart's desire without breaking existing code. Users can tap into a wide array of web-based applications, whether on the internet or an intranet. Google claims that they will store all of your user data (cue rioting by privacy and standards advocates). This wasn't mentioned in the video, but Chris DiBona pointed out that this is why Google sponsors the Data Liberation Front, to assure users that their data is always accessible and can be moved out of (and into) Google products. Chrome OS also features some additional security measures that make it more difficult for someone to pwn a machine and then use it to access the data stored on Google's servers. There's a very clever trust mechanism that I'll link to if I can find a good description.

Of course, that doesn't - and shouldn't - mollify those of us worried about data privacy. Then again, most of us already entrust Google with much of our data. You can either not use Google products, or you can "trust but verify." Either way, the concern is understandable, and Google faces increasing opposition to data mongering with every data scraping product they make.

The downsides to Chromium OS are both obvious and non-obvious. The above concerns of privacy and lock-in are obvious and are not new to followers of Google. Another obvious downside is that any Chrome OS machine is pretty much DOA when it comes to enterprise computing. The inability to run any legacy application is a non-starter for most organizations. Then again, Google isn't targeting that market, so perhaps it's not an issue - yet. And let's not forget that if you travel somewhere without a net connection, you're out of luck, unless you've set up Google Gears to create local copies. There was no word on whether the plan is to make that the default approach.

Some perhaps less obvious downsides involve its certification policy. At one point in the presentation, there was some mention of Chrome OS only running on machines with solid state drives, which led me to wonder whether this was going to be a self-contained box, a la Apple. Soon afterwards, however, there was some discussion about working with hardware partners and certifying machines to run Chrome OS. Not mentioned was how tightly they would police the Chrome OS trademark. Yes, Chromium OS is an open source project, but we've seen in the past that enforcing trademarks too strictly can incite community rebellions. See, e.g., Red Hat, Mozilla, Java, TWiki, et al. Personally, I can see the case for using a certification process and enforcing it with trademarks. Assuming this is the direction they go, one hopes they're smart about it. Communicated very clearly, however, was that Chrome OS will not run on most netbooks sold today, thus the need for a rigorous certification process.

There are also open questions as to how well Chrome OS will work with devices and peripherals. It remains to be seen how enthusiastic device manufacturers will be about supporting yet another platform. Based on today's presentation, it looks like Google will go with the approach of working with a select few hardware vendors. Given that these netbooks will be targeted at consumers, it remains to be seen how well they will be able to dictate the distribution channels for both the base netbooks as well as peripherals. One can easily envision someone buying a cheap consumer device and being disappointed that it won't work with their shiny new Chrome OS-based netbook.

For more on Chromium OS, see Chromium.org as well as today's blog announcement.

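Wednesday, November 18, 2009

NTT America to launch a cloud business: it is adopting OpSource's infrastructure and rolling out in the North American market

It will be interesting to see who the target customers are: SMBs or enterprises? Going after both would be difficult, I think.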

NTT America Unveils Cloud Computing Service

By Gary Kim

Contributing Editor

NTT America, a wholly owned U.S. subsidiary of NTT Communications Corp., announced that the company is working with OpSource to deliver a new service it calls “OpSource Cloud,” a new cloud computing solution.

OpSource Cloud is designed to allow IT departments to manage their security as they would within their internal IT infrastructure, while offering end users Web browser access to enterprise applications.

OpSource Cloud offers online sign-up and pay by the hour usage for enterprise quality computing capability when needed. OpSource Cloud is now in public beta and available for on-line purchase by the hour from www.opsourcecloud.net.

OpSource’s customers include Adobe, which uses OpSource to deliver Acrobat.com, and SAP/Business Objects, which uses OpSource for its “Crystal Reports.” Other smaller companies rely on OpSource as well, including Taleo, a recruitment and performance management SaaS vendor, and Xactly, which specializes in on-demand sales performance software, OpSource says.

Gary Kim is a contributing editor for TMCnet.

Appirio publishes an ecosystem map of the cloud vendor market: it is hard to put into words, so do take a look for yourself.

Appirio Launches First Interactive Cloud Computing Ecosystem Map

Plots Key Offerings to Define Market and Accelerate Enterprise Adoption of the Cloud

SAN MATEO, CA--(Marketwire - November 16, 2009) - Appirio, a cloud solution provider, today launched the first interactive cloud ecosystem map. The map builds on Appirio's experience moving hundreds of enterprises to the cloud, and seeks to provide better transparency into the evolving cloud ecosystem. It aims to help enterprises accelerate their adoption of the cloud by creating a standard taxonomy and definitions for a market that has quickly become crowded and confusing.

"The cloud ecosystem is evolving so quickly that it's difficult for most enterprises to keep up," said Ryan Nichols, head of cloudsourcing and cloud strategy for Appirio. "We created the ecosystem map to track this evolution ourselves, and have decided to publish it to help others assess the 'lay of the land.' With broader community involvement, we can create a living, breathing map where anyone can access, drill down and interact with dynamic information. This will bring some much-needed clarity to the cloud market."

The cloud ecosystem map breaks out 70 different layers of technology across applications, platforms and infrastructure. It distinguishes between offerings that are on-premise versus hosted single tenant, versus multi-tenant and illustrates which elements are available directly versus bundled versus provided by a partner. The offerings of leading vendors are highlighted across the stack, as are point solutions from emerging vendors. Links are provided within the map to learn more about each offering.

The map was created in collaboration with Troy Angrignon, a well-respected technologist, cloud expert and co-chair of the 13th Under the Radar conference. Originally developed as an internal training tool to benchmark and validate cloud offerings, the cloud ecosystem map is now available to help companies navigate through the layers of technology and evaluate relevant vendors.

"A sure sign of the acceleration of cloud computing is the mind-boggling array of new offerings that are coming to market," said Vinnie Mirchandani, founder of Deal Architect and former Gartner analyst. "As an advisor to large enterprises trying to make sense of these offerings, it is great to see Appirio contribute this framework to the conversation."

Appirio's cloud ecosystem map is accessible to the public -- please visit http://www.appirio.com/ecosystem and contribute to the map's evolution over the coming weeks and months.

About Appirio

Appirio (www.appirio.com), a cloud solution provider, offers both products and professional services that help enterprises accelerate their adoption of the cloud. With over 2500 customers, Appirio has a proven track record of implementing mission-critical solutions and developing innovative products on cloud platforms such as salesforce.com, Google Apps, and Amazon Web Services. From offices in the U.S. and Japan, Appirio serves a wide range of companies including Avago, Hamilton Beach, Japan Post Network, Ltd, Pfizer and Qualcomm. Appirio was founded in 2006, is the fastest growing partner of salesforce.com and Google, and is backed by Sequoia Capital and GGV Capital.

Contact:
Julie Tangen
Kulesa Faul for Appirio
831-425-1083
Julie@kulesafaul.com

Amazon Web Services announces a .NET development kit for Windows. The timing is clearly chosen with Azure in mind =>

It supports EC2, S3, and SimpleDB, so the basic feature set can be said to be in place. The workflow is to download the components, develop a .NET application on a local machine, and upload it to AWS.

Amazon Web Services releases Windows cloud computing development kit

Updated: 2009-11-16

Amazon Web Services, Amazon.com's cloud computing services division, announced a new .Net software development kit to help Windows software developers make programs for Amazon's EC2 platform.

Users of the software development kit will be able to make solutions for Amazon infrastructure services, including Amazon Elastic Compute Cloud, Amazon SimpleDB and Amazon Simple Storage Service. The kit will come as a single, downloadable package, and will include templates for Visual Studio projects, C# code samples, the Amazon Web Services .NET library and documentation.

"The AWS SDK for .NET makes it even easier for Windows developers to build .NET applications that tap into the cost-effective, scalable, and reliable AWS cloud," said Amazon Web Services in a statement.

Amazon Web Services first launched EC2 in 2006. The platform is designed to make web-scale computing easier for developers.

Analysts say this new development kit, its first to aim specifically at Windows developers, may be an attempt to compete with the release of Microsoft's new Azure cloud computing platform on Monday. As a Microsoft platform, Azure will also aim at Windows developers and users.

Microsoft announces Azure: service starts early next year, and the surrounding applications are the part to watch =>

Various ISVs will be announcing Azure support from here on, so their moves need to be watched closely.

Microsoft to launch Azure cloud service early next year

Tue Nov 17, 2009 6:51pm GMT

LOS ANGELES (Reuters) - Microsoft Corp said on Tuesday it will launch its long-awaited Windows Azure cloud computing system on Jan 1, as it looks to take advantage of the growing interest in internet-based software and services.

Azure, which provides an online platform for software developers to create their own programs, and space for customers to store data, was rolled out for experimentation a year ago.

The service will go fully live at the beginning of next year, Microsoft's chief software architect Ray Ozzie told the company's annual software developers conference on Tuesday.

The first month of the service will be free, and billing will start in February, said Ozzie.

Microsoft is expected to be a big player in the cloud computing market -- broadly the trend toward running software in remote data centers and accessing it over the Internet -- but has lagged behind pioneering rivals such as Amazon.com Inc, which already sells cloud-based storage, and Google Inc, which offers a range of free, online software.

Tuesday, November 17, 2009

IBM announces Blue Insight, moving its business intelligence business to the cloud =>

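Other cloud and SaaS providers also see BI as the next application after CRM, and they are likely to follow. BI is an application that consumes large amounts of highly secure storage, and that needs to be taken into account.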

IBM launches private business analytics cloud; Eyes 'easily consumable' BI for the masses

Posted by Larry Dignan @ 2:02 am

IBM on Monday will unveil Blue Insight, a massive business analytics cloud that will hold more than a petabyte of data. This internal cloud computing environment will be the basis for future external services.

Internally, IBM’s effort is dubbed Blue Insight, a business analytics cloud that will give 200,000 employees access to key corporate data around the world. Blue Insight will suck in data from 100 different data stores and warehouses. The data will then be dished out to salespeople and developers.

According to IBM, Blue Insight is a showcase of the “eat your own dogfood” mantra. The system is built using Cognos, IBM’s business intelligence software, and hardware systems such as System Z, the company’s mainframe.

Going forward, IBM said it will add structured and unstructured data to Blue Insight. Some of this data will include revenue forecasts and sales quotas, product breakdowns, queries from real-time data and inventory levels and defects.

Increasingly, companies like IBM and HP are revamping their internal operations and then using those learnings to sell to customers. In IBM’s case, the architecture behind Blue Insight will be used to form the Smart Analytics Cloud for customers.

The Smart Analytics Cloud aims to provide “easily consumable business intelligence services, systems and software.” The bundle will include business intelligence services, Cognos and mainframes.

IBM added that it plans on focusing on the easily consumable part. To make business intelligence easier to digest, IBM said it will use Web 2.0-ish dashboards. In a backgrounder, IBM writes:

A key focus area of the Smart Analytics Cloud is rapid service deployment and end user acceptance. With agile Web 2.0 toolkits, user registration applications are easily created. Corporate processes are automated using IBM freeware and guidance documentation.

AT&T launches a cloud service: the offering is close to Amazon EC2 =>

When telecom carriers enter the market, I expect it to expand substantially and price declines to accelerate.

Amazon Web Services will very likely need to come up with countermeasures.

AT&T plots compute cloud similar to Amazon Web Services

Posted by Larry Dignan @ 3:57 am

AT&T on Monday rolled out a cloud computing system that mimics the approach of Amazon Web Services.

Dubbed the AT&T Synaptic Computer Service, the telecom giant is offering on-demand computing via self-service. AT&T said its service will allow corporate customers to scale up computer requirements quickly. As corporate customers increasingly ponder Amazon Web Services, hosting providers and major enterprise IT players are racing to offer similar services. Simply put, many traditional enterprise IT players are likely to emulate Amazon’s on-demand computing cloud model.

The offering, outlined in a statement, is built on software from VMware and Sun Microsystems. Notably, AT&T will use the Sun Open Cloud Platform, Sun Cloud APIs and architecture.

The service will launch in the fourth quarter in the U.S. AT&T will offer international services in the future. AT&T’s feature list includes a portal to order computing power, multiple billing options, storage as a service, no feeds and a service level agreement for the platform.

The service comes in three server sizes:

Small (1 CPU and 4 GB of memory)
Medium (2 CPUs and 8 GB of memory)
Large (4 CPUs and 16 GB of memory)

And the storage options:

100 GB of storage provided with each server image (on the same virtual infrastructure in the same IDC)
Two supplementary storage options for an additional charge:
Purchase up to 2 TB additional disk storage per virtual server
Connect to AT&T Synaptic Storage as a Service
24x7x365 monitoring of the virtualized infrastructure
Service level agreement of 99.9% for availability of the infrastructure
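
To put the 99.9% availability figure in perspective, here is a rough sketch of how much downtime such an SLA would still permit. The function name and the choice of measurement periods (a 365-day year, a 30-day month) are my own assumptions for illustration; the article does not say how AT&T measures availability.

```python
# Rough downtime budget implied by an availability percentage.
# The measurement periods below are assumptions, not AT&T's terms.

HOURS_PER_YEAR = 365 * 24   # 8760
HOURS_PER_MONTH = 30 * 24   # 720


def allowed_downtime_hours(availability_percent: float, period_hours: float) -> float:
    """Hours of downtime that still satisfy the given availability."""
    return period_hours * (1 - availability_percent / 100)


if __name__ == "__main__":
    per_year = allowed_downtime_hours(99.9, HOURS_PER_YEAR)
    per_month = allowed_downtime_hours(99.9, HOURS_PER_MONTH)
    print(f"99.9% over a year: {per_year:.2f} hours")
    print(f"99.9% over a 30-day month: {per_month * 60:.1f} minutes")
```

Under these assumptions, 99.9% leaves room for roughly 8.76 hours of downtime per year, or about 43 minutes in a 30-day month.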

AT&T has been adding to its cloud computing lineup for the last year. The company didn’t reveal pricing for its compute as a service offering in its statement or Web site.

AT&T’s storage as a service offering costs 10 cents per GB of data transferred. If you have two copies of data in one location it’s 25 cents per GB. Two copies in one location and backup in another will run you 35 cents per GB.
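
As a quick comparison of those three rates, the sketch below multiplies each quoted per-GB price by a data volume. The tier names and the 500 GB figure are hypothetical, and the article does not state the billing period or whether transfer and storage are charged separately, so this is simple per-GB arithmetic only.

```python
# Per-GB rates in US cents, as quoted in the article.
# Tier names are made up here for illustration.
RATE_CENTS_PER_GB = {
    "transfer": 10,
    "two_copies_one_site": 25,
    "two_copies_plus_remote_backup": 35,
}


def cost_usd(gigabytes: int, tier: str) -> float:
    """Cost in US dollars for the given volume at the quoted per-GB rate."""
    return gigabytes * RATE_CENTS_PER_GB[tier] / 100


if __name__ == "__main__":
    volume_gb = 500  # hypothetical data volume
    for tier in RATE_CENTS_PER_GB:
        print(f"{tier}: ${cost_usd(volume_gb, tier):.2f} for {volume_gb} GB")
```

At 500 GB the quoted rates work out to $50, $125, and $175 respectively, so the remote backup option costs 3.5 times the base rate.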