Friday, October 30, 2009

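Cloud analytics, and how it differs from existing analytics

The article below explains the difference. Some benefits are clearly visible, in particular the ability to take advantage of cloud-native large-scale data management such as Hadoop.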

Cloud Expo: Article

Cloud Analytics Checklist

What are enterprise users looking for from a cloud analytics solution?


Cloud Data Analytics on Ulitzer

In the previous article we looked at how realtime cloud analytics looks set to disrupt the $25B SQL/OLAP sector of the IT industry. What are users looking for from a next-generation post-SQL/OLAP enterprise analytics solution? Let's look at the requirements:

  • Realtime + Historical Data. In addition to analyzing (historical) data held in databases (Oracle, SQL Server, DB2, MySQL) or datastores (Hadoop, Amazon Elastic MapReduce), a next-gen analytics solution needs to be able to analyze, filter and transform live data streams in realtime, with low latency, and to be able to "push" just the right data, at the right time, to users throughout the enterprise. With SQL/OLAP or Hadoop/MapReduce, users "pull" historical data via queries or programs to find what they need. For many analytics scenarios today, however, what's needed to handle information overload is a continuous "realtime push" model where "the data finds the user".
  • External + Internal Data. In the past it was simple: an enterprise had only to deploy a few large specialized systems (ERP, CRM, Supply Chain, Web Analytics) to handle the internal data flowing through the organization. Today, in order to operate at peak efficiency, a large enterprise needs a detailed, integrated, realtime awareness of all kinds of data sources that could impact the business, for example information on: customers, partners, employees, competitors, marketing, advertising, pricing, web, news, markets, locations, government data, communications, email, collaboration, social, IT, datacenters, networks, sensors.
  • Unstructured + Structured Data. SQL/OLAP analytics was built on the idea that data would be held in relational databases, and that the data would be highly structured. Today, this no longer applies. Much of the data most valuable to an enterprise today is either semi-structured or unstructured.
  • Easy-To-Use. SQL/OLAP has proved to be too complex for most enterprise users who need access to analytics for their work. Excel, with its simple charting, visualization, sharing and collaboration features, provides a much more attractive interface for most users. Other products and services such as QlikView and GoodData also provide ease of use, but none of them (Excel included) offers the kind of realtime analytics, scalability and parallel processing required in analytics today. Despite SQL/OLAP's complexity and lack of mainstream adoption within the enterprise, a few companies have made it even more complex by adding features to support realtime stream processing. None of these StreamSQL solutions seems to have achieved widespread adoption to date.
  • Cloud-Based, Pay-Per-Use. Every company looking to compete in the next-generation analytics market will need at least a public cloud offering, and most will also have virtual private cloud and private cloud offerings. Since enterprise data will often be held on more than one cloud, it will be increasingly important to have an "intercloud" capability, where analytics apps can be run across multiple (public and/or private) clouds, e.g. across Amazon AWS and Windows Azure.
  • Elastic Scalability, Parallel Processing, MapReduce. With exponentially growing data volumes, it will be essential to offer the elastic scalability and parallel processing required to handle anything from one-off personal data analysis tasks up to the most demanding large-scale analytics apps required by the world's leading organizations in business, web, finance and government.
  • Seamless Integration With Standard Tools (Excel). With 40 million analytics power users using Excel, this is a must for any analytics solution looking to achieve significant market adoption.
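
To make the "pull" versus "push" distinction in the first requirement concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the names `pull_query` and `PushStream`, the event fields, and the sample data are all hypothetical and do not come from Cloudscale or any product named above.

```python
from typing import Any, Callable, Dict, List, Tuple

Event = Dict[str, Any]                 # hypothetical event record
Predicate = Callable[[Event], bool]    # "is this event interesting to me?"
Deliver = Callable[[Event], None]      # how a matching event reaches the user


def pull_query(history: List[Event], predicate: Predicate) -> List[Event]:
    """Pull model: the user queries data that has already been stored."""
    return [event for event in history if predicate(event)]


class PushStream:
    """Push model: users register interest once; matching events are
    delivered to them as each event arrives ("the data finds the user")."""

    def __init__(self) -> None:
        self._subscribers: List[Tuple[Predicate, Deliver]] = []

    def subscribe(self, predicate: Predicate, deliver: Deliver) -> None:
        self._subscribers.append((predicate, deliver))

    def publish(self, event: Event) -> None:
        # Filter each incoming event against every registered interest.
        for predicate, deliver in self._subscribers:
            if predicate(event):
                deliver(event)


# Hypothetical sample data: readings from two kinds of sources.
events: List[Event] = [
    {"source": "sensor", "value": 71},
    {"source": "web", "value": 3},
    {"source": "sensor", "value": 98},
]


def is_hot(event: Event) -> bool:
    return event["source"] == "sensor" and event["value"] > 90


# Pull: data is stored first, then the user asks for it.
pulled = pull_query(list(events), is_hot)

# Push: the user subscribes up front, then events stream in one by one.
inbox: List[Event] = []
stream = PushStream()
stream.subscribe(is_hot, inbox.append)
for event in events:
    stream.publish(event)
```

Both paths surface the same event here. The difference is timing: in the pull path nothing reaches the user until a query is run, while in the push path the matching event lands in `inbox` at the moment it is published.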

At Cloudscale, we've compiled a Cloud Analytics Checklist, showing how various analytics products/services measure up against this set of requirements. If you're thinking about cloud analytics and would like a copy of the Checklist, send a request with your email address via the Cloudscale website (no signup required) or by email to checklist@cloudscale.com, with the word Checklist in the Subject line.