Friday, December 5, 2008

Building EMC Atmos

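EMC has announced Atmos as part of its cloud computing strategy.

Atmos performs large-scale data management on cloud computing infrastructure operated by EMC. It is built on COS (Cloud Optimised Storage), a concept and architecture distinct from SAN and NAS.

Data managed by COS is stored as objects, which are governed by policies. Objects are stored at petabyte scale, with none of the constructs a SCSI interface requires, such as LUNs or RAID. COS manages only the objects themselves and their metadata. Objects are accessed via SOAP or REST.

The physical location of an object is entirely irrelevant; objects are managed through a unified management console interface.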
 
 

EMC Atmos is EMC's first Cloud Optimised Storage offering, designed for policy-based information storage, information distribution and information retrieval at a global scale. GA code shipped at the end of June, and customers and partners have been deploying Atmos repositories in their own environments since the second half of 2008.

While some competitors were flapping their gums and asking whatever crazy questions came into their heads, EMC was shipping a product whose team didn't miss a single milestone and met their ship date. Now that the marketing machine has spun up and the EMC Sales sledgehammer is about to drive those competitors into the ground, I'll be following their backtracking with some enthusiasm.

So what is EMC Atmos? What Atmos isn't is a clustered file system or a warmed-over NAS offering, clustered or otherwise. Atmos(phere) was designed by the Cloud Infrastructure and Services Division (CISD) from the ground up with a number of distinct characteristics.

  • Information inside the Atmos repository is stored as objects. Policies can be created to act on those objects, and this is a key differentiator, as it allows Atmos to apply different functionality and different service levels to different types of users and their data. That means managing information, which is what we should be doing, as opposed to wrangling blocks and file systems as we tend to do.

  • There is no concept of GBs or TBs in EMC Atmos; those units of storage capacity are too small. Atmos is designed for multi-petabyte deployments. There are no LUNs. There is no RAID. There are only objects and metadata.

  • There is a unified namespace. Atmos operates not on individual information silos but as a single repository, regardless of how many petabytes containing how many billions of objects are in use, spread across whatever number of locations and available to who knows how many users.

  • There is a single management console, regardless of how many locations the object repository is distributed across. This global-scale approach means that Atmos had to be an autonomic system, automatically reacting to environmental and workload changes, as well as failures, to ensure global availability.

What those traits should highlight for you is that Atmos isn't a SAN offering, isn't a NAS offering, and neither is it a CAS offering. It's a COS offering, cloud optimised storage, with web services such as SOAP and REST for access.
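To make the "only objects and metadata" idea concrete, here is a minimal Python sketch of an object repository with one flat namespace. It is a toy in-memory model and not the Atmos API: the names ObjectRepository, StoredObject, create, read, metadata and find are all invented for illustration, and a real deployment would expose operations like these over web services such as SOAP or REST rather than as local method calls.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    """An object is just payload bytes plus descriptive metadata."""
    data: bytes
    metadata: dict = field(default_factory=dict)


class ObjectRepository:
    """Toy single-namespace object store (hypothetical, not the Atmos API)."""

    def __init__(self):
        # One flat mapping of object ID -> object; no LUNs, volumes or paths.
        self._objects = {}

    def create(self, data, **metadata):
        """Store a new object and return its opaque ID."""
        object_id = uuid.uuid4().hex
        self._objects[object_id] = StoredObject(data, dict(metadata))
        return object_id

    def read(self, object_id):
        """Return the payload bytes of an object."""
        return self._objects[object_id].data

    def metadata(self, object_id):
        """Return a copy of the object's metadata."""
        return dict(self._objects[object_id].metadata)

    def find(self, **criteria):
        """Return IDs of objects whose metadata matches every criterion."""
        return [
            oid
            for oid, obj in self._objects.items()
            if all(obj.metadata.get(k) == v for k, v in criteria.items())
        ]
```

The caller only ever holds an object ID and some metadata; where the bytes physically live is the repository's business, which is the point the list above is making.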

By now there's a lot of info on Atmos on the various blogs and up on EMC.com, but this entry is about "Building EMC Atmos", and for that information I went to one of the Atmos architects, Dr. Patrick Eaton.

Patrick Eaton received his PhD from Berkeley and was one of the primary members of Professor John Kubiatowicz's OceanStore project. As I learned from speaking to him, he's been thinking about stuff like this for a number of years, and if he wasn't building globally distributed storage systems he'd be indulging his passion for music, working in the field of digital sound for a company like Yamaha or Korg.

With a tinge of regret he tells me that these days he's more of a consumer than a creator of music but as he's been busy building something new from the ground up that's understandable.

In person he's taller and younger than I had expected; he smiles easily and comes across as an open personality. Clearly not one of these academic types who had their sense of humour surgically removed before they submitted their thesis.

As I was to learn, Atmos started with five people out at the EMC Cambridge facility, working on its floor-to-ceiling whiteboards, looking to solve a problem.

"Fundamentally this was a distributed systems problem. How do you take a loose collection of services distributed across a wide area and make them operate as you want them to operate?"

Fortunately for me, this isn't a question I have to answer, or I'd need more than the floor-to-ceiling whiteboards, but he pauses for a split second before moving on.

"EMC is really good at selling high-end storage to really high-end people. If you can drop tens of dollars per GB on a storage system, man, does EMC have offerings for you. But data growth is continuing to explode, and not everybody has data which justifies that level of expenditure or has the financial resources to justify spending that much money on storage. So EMC was coming across a customer segment for whom they didn't have an offering, and the goal for Atmos was to provide a low-cost bulk storage system for these emerging markets, like Web 2.0 companies or other industries with lots of user-generated content.

Yes you can put that stuff on regular SAN or NAS systems and that's what customers have been doing as the only other option was to start writing and maintaining their own storage software and build their own storage hardware. That's far from ideal as the value of these companies is in their applications and the services those applications provide. 

What we needed to do was provide a terabyte at something like ten or more times cheaper than existing SAN or NAS storage systems can offer. That is the problem Atmos was designed to solve, and a key part of the product vision comes from the policy-driven features of Atmos. Yes, you're targeting the bulk storage market, the TME and Web 2.0 spaces with those mountains of user-generated content, but people want to use that storage in very different ways. Some people want to have one data centre, some want two, others want many more. Some need to support different types of workloads and various object sizes, or to control where they locate specific objects and how they get them close to their customer, regardless of where on the planet the customer is located in relation to where the data was first stored.

At the core of the Atmos design is how we allow customers to define policies as to how data actually hits disk. There are no administrators saying "Joe's photos should be on this particular piece of spinning rust". Rather, they write policies to describe how Joe is a subscription customer, so his files require a certain number of copies associated with them for backup and should have a certain rolling retention policy in case he cancels his account. Thus they should be in this data centre here and not in one thousands of miles away.

But if Joe packs up the family and the dog and moves across country, his data may be replicated to the data centre now closest to him, depending on the policies applied to his files.

Information management is something EMC talks about a lot, so providing a storage solution designed with policy-based information management at its core is a big thing we wanted to do with Atmos. You're not just storing information, you're replicating it to where it's needed and putting it as close to the user as possible. You're compressing it, de-duplicating it or deleting it depending on what policies are applied to it, and if it hasn't been accessed in a while you can even spin down the drives that inactive objects are stored on to save power."
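The Joe example can be sketched in a few lines of Python. This is only an illustration of the idea of policy-driven placement under my own assumptions, not how Atmos expresses or evaluates policies: the Policy fields, the policy names, the replica counts, the retention values and the data centre names are all made up, and "closest" is reduced to a home_site label on the object's metadata rather than any real measure of distance.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    """Hypothetical policy: a metadata match plus the handling it asks for."""
    name: str
    match: dict          # metadata key/values an object must carry
    copies: int          # number of replicas to keep
    retention_days: int  # rolling retention after an account is cancelled


# Checked in order, first match wins. All values are invented examples.
POLICIES = [
    Policy("subscriber", {"tier": "subscription"}, copies=2, retention_days=90),
    Policy("default", {}, copies=1, retention_days=0),
]

# Hypothetical data centre names.
DATA_CENTRES = ["boston", "san-francisco", "london"]


def select_policy(metadata, policies=POLICIES):
    """Return the first policy whose match criteria the metadata satisfies."""
    for policy in policies:
        if all(metadata.get(k) == v for k, v in policy.match.items()):
            return policy
    raise LookupError("no policy matched")


def place(metadata, sites=DATA_CENTRES):
    """Return the policy name and the data centres that should hold copies."""
    policy = select_policy(metadata)
    home = metadata.get("home_site")
    if home not in sites:
        home = sites[0]  # fall back to the first site if none is known
    # First copy goes to the user's home site, the rest fill from other sites.
    ordered = [home] + [s for s in sites if s != home]
    return policy.name, ordered[:policy.copies]
```

Nobody names a disk or an array here. Changing Joe's home_site from "london" to "boston" changes where place() puts his copies, which is the "Joe moves across country" case in miniature.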

Multi-tenancy, could we talk about that a bit more? Could I offer storage as a service to different users or organisations?

"Yes you could. Multi-tenancy means that Atmos can support many different tenants with logical isolation. Each tenant can have their own private namespace under the Atmos namespace but tenants are not aware of other tenants or the objects belonging to those tenants.

You could be providing services to users out on the Internet and hosting application test and dev as well as providing services to your internal business units, but none of those tenants would know about each other."
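As a rough picture of what logical isolation means, here is a small Python sketch in which every tenant gets a private view onto one shared store. It assumes isolation is done by prefixing keys with a tenant ID, which is my simplification for illustration and not a description of how Atmos implements tenancy; Repository, TenantNamespace and the tenant names are hypothetical.

```python
class TenantNamespace:
    """A tenant's private view: it only sees keys under its own prefix."""

    def __init__(self, store, tenant_id):
        self._store = store
        self._prefix = tenant_id + "/"

    def put(self, path, data):
        """Store data under this tenant's namespace."""
        self._store[self._prefix + path] = data

    def get(self, path):
        """Fetch data from this tenant's namespace (KeyError if absent)."""
        return self._store[self._prefix + path]

    def list(self):
        """List this tenant's paths, with the tenant prefix stripped."""
        return sorted(
            key[len(self._prefix):]
            for key in self._store
            if key.startswith(self._prefix)
        )


class Repository:
    """Toy shared repository handing out isolated per-tenant namespaces."""

    def __init__(self):
        self._store = {}  # single underlying store shared by all tenants

    def tenant(self, tenant_id):
        # Reject "/" so one tenant's prefix can never fall inside another's.
        if not tenant_id or "/" in tenant_id:
            raise ValueError("tenant_id must be non-empty and contain no '/'")
        return TenantNamespace(self._store, tenant_id)
```

A tenant serving Internet users and a tenant used for test and dev can both write into the same repository, yet each one's list() shows only its own objects and neither has any way to name the other's.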

We were talking about this being a low-cost solution. What's low cost at the scale we are talking about here? Sure, there's capacity cost, but it's not just that...

"Well, not only does the initial cost of delivering the product to the doorstep have to be low, but it also has to be something that the customer can maintain very easily. We're talking about the petabyte range when we're talking about deploying this, so one of the key design elements was how to provide a customer-installable, configurable and maintainable implementation.

Going back to the traditional EMC model of "We'll make sure it works but you're going to pay for it", where parts show up at your door with a service engineer attached, well, that shoots the entire low-cost target out of the water if you have to do that more than a few times a year. That's why a lot of the installation, configuration and maintenance can be done by the customer themselves.

Low cost, low touch, incredible scale and density. Billions of objects globally distributed with policy based information management. Petabytes of storage which could be in the same room or distributed around the world but with a single point of management. Those were some of the design goals." 

Okay, so you've built and shipped Atmos. We were talking about having this pre-announcement chat back when you were just about to head off on holiday this past summer, right after the code went GA. So what have you learned from building a product as opposed to working on a project?

"I learned a lot about managing cross-continent teams. Maybe 50% of our developers and 80% of our QA are split between Beijing and Shanghai, China. That's a 12-hour difference, which can be challenging since there's no overlap during the day, and there are cultural communication differences to factor in.

When the group was smaller I was exposed more to customer interactions and it was always interesting to get feedback and find out how they plan on using Atmos as opposed to how you think they'll use it. Now it's up and running in their environments I get a different kind of feedback as I'm watching how they're actually using the product in production.

I was also blessed to join this group when there were five of us. I've been able to grow with the group and assume some responsibility and some leadership, which has stretched me, and it's a stretching that a lot of freshly minted PhDs don't get so early on in their careers. It was pretty natural when there were five people here and maybe ten over there that I could take well-defined pieces of the system and then lead them through implementation. Now that we've grown to over a hundred people, you can't take the people who've been there the longest and have them doing that.

I've been really blessed that way and really fortunate to have been able to join an organisation in its infancy and be able to grow with the organisation. The opportunity here has really been amazing."

You moved from California to Massachusetts to join EMC and build Atmos from the ground up. How did the move to the east coast turn out for you?

"We love it here. My wife and I are from the Midwest, which does have winters, so the seasons have made a welcome return. California has beautiful weather, but it can start to feel like Groundhog Day, while here the seasons are refreshing. The city is nice, and I tell my manager all the time that we need to recruit more in California, as there's not a whole lot of places you can draw from in the US and with a straight face tell them that Boston has more affordable houses and better commutes.

Californians you can say that to and it's true."