Thursday, January 14, 2010

cloudmapreduce arrives on Google Code: MapReduce implemented on AWS. Its strengths =>

The following three points:

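  • Very fast compared with Hadoop and other non-SQL systems
  • Because it is implemented on the cloud, it is very strong in both scalability and reliability
  • The source code is very simple at only 3,000 lines (1/4 of Hadoop)
I imagine Amazon will build its future promotion around this.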


 

Cloud MapReduce was initially developed at Accenture Technology Labs. It is a MapReduce implementation on top of the Amazon Cloud OS.

By exploiting a cloud OS's scalability, Cloud MapReduce achieves three primary advantages over other MapReduce implementations built on a traditional OS:

  • It is faster than other implementations (e.g., 60 times faster than Hadoop in one case; the speedup depends on the application and data).
  • It is more scalable and more failure resistant because it has no single point of bottleneck.
  • It is dramatically simpler with only 3,000 lines of code (e.g., two orders of magnitude simpler than Hadoop).

See details in Cloud MapReduce Technical Report.
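
For readers new to the programming model, here is a minimal word-count sketch in Python. It is not Cloud MapReduce's code or API: the function names (`map_words`, `reduce_counts`, `run_job`) are hypothetical, and the in-memory queues only stand in for a cloud queue service, which is an assumption made for illustration. The point is that map output is handed to reducers through independent partition queues, with no central coordinator in between.

```python
from collections import defaultdict
from queue import Queue


def map_words(line):
    """Map step: emit a (word, 1) pair for every word in the line."""
    return [(word.lower(), 1) for word in line.split()]


def reduce_counts(word, counts):
    """Reduce step: sum all counts collected for one word."""
    return word, sum(counts)


def run_job(lines, num_partitions=2):
    """Run a tiny word-count job through per-partition queues."""
    # Hypothetical stand-in: one in-memory queue per reduce partition,
    # where a cloud deployment would use a hosted queue service.
    partitions = [Queue() for _ in range(num_partitions)]

    # Map phase: route each pair to a partition chosen from the key,
    # so every occurrence of a word lands in the same queue.
    for line in lines:
        for word, count in map_words(line):
            index = sum(word.encode("utf-8")) % num_partitions
            partitions[index].put((word, count))

    # Reduce phase: each partition is drained independently of the others.
    results = {}
    for partition in partitions:
        grouped = defaultdict(list)
        while not partition.empty():
            word, count = partition.get()
            grouped[word].append(count)
        for word, counts in grouped.items():
            key, total = reduce_counts(word, counts)
            results[key] = total
    return results


if __name__ == "__main__":
    print(run_job(["the cloud is the computer", "the queue is the shuffle"]))
```

Because each partition queue can be drained by a separate worker, adding workers is the only change needed to scale this shape of job; the sketch runs them sequentially for simplicity.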

See the Tutorial to learn how to use Cloud MapReduce and how easy it is to launch a job. See the how-to guide to write your first Cloud MapReduce application. If you want to contribute to the project or port Cloud MapReduce to other clouds (e.g., Windows Azure or an internal cloud), see the code architecture page to learn how it is designed.

Also see Command line options for details on how to specify a job run, Pre-built AMI for how to use the pre-built AMI to make running a job easier, and the performance tuning page for tips on optimizing performance.

Cloud MapReduce is at an early stage. We welcome and appreciate your contributions. We have exciting plans to support multiple clouds and to make the system more robust and easier to use. Please join the mailing list to see where you can contribute and to post any questions (about either development coordination or usage) to the developers.

http://code.google.com/p/cloudmapreduce/