How many servers does it take to power Amazon’s massive cloud infrastructure? It's a fair question, considering that Amazon is one of the biggest cloud providers out there. Amazon, for obvious reasons, has not been forthcoming with the information, so Huang Liu, a research manager at Accenture Technology Labs, set out to find the answer. According to his calculations, which he writes about on his blog, the Amazon Elastic Compute Cloud (EC2) is home to nearly half a million servers.
Liu’s findings are based on a combination of internal and external IP addresses, which he uses to come up with an estimate of the number of server racks in each region. He then extrapolates: if each rack has four 10U chassis, and each chassis holds 16 blades, that gives you a total of 64 blade servers per rack.
In table form, Liu shows the number of servers contained in each of Amazon’s seven regions, for a grand total of 454,400. It’s worth noting that the US East hub, Amazon’s first, has the lion’s share with 321,920. Based on this, Liu infers that “it is hard to compete with Amazon on scale in the US, but in other regions, the entry barrier is lower. For example, Sao Paulo has only 25 racks of servers.”
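The arithmetic behind these totals is easy to check. The short Python sketch below is illustrative only: the constant and function names are ours, not Liu's, and the 4-chassis, 16-blade rack layout is his stated assumption rather than a confirmed Amazon configuration. It converts between rack counts and server counts using the figures quoted above.

```python
# Rack layout as assumed in Liu's estimate (not confirmed by Amazon).
CHASSIS_PER_RACK = 4
BLADES_PER_CHASSIS = 16
SERVERS_PER_RACK = CHASSIS_PER_RACK * BLADES_PER_CHASSIS  # 64


def racks_to_servers(racks: int) -> int:
    """Estimated servers for a given number of racks."""
    return racks * SERVERS_PER_RACK


def servers_to_racks(servers: int) -> int:
    """Racks implied by a server estimate; rejects non-whole rack counts."""
    racks, remainder = divmod(servers, SERVERS_PER_RACK)
    if remainder:
        raise ValueError("server count is not a whole number of racks")
    return racks


TOTAL_SERVERS = 454_400    # all seven regions, per Liu's table
US_EAST_SERVERS = 321_920  # US East region, per Liu's table

if __name__ == "__main__":
    print(servers_to_racks(TOTAL_SERVERS))           # 7100
    print(servers_to_racks(US_EAST_SERVERS))         # 5030
    print(racks_to_servers(25))                      # 1600 (Sao Paulo)
    print(f"{US_EAST_SERVERS / TOTAL_SERVERS:.1%}")  # 70.8%
```

Under these assumptions, the published totals correspond to roughly 7,100 racks overall, about 5,030 of them in US East, which is around 71 percent of the estimated fleet.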
Liu has also charted the expansion of Amazon’s US-based infrastructure over the past six months, from August 23, 2011, to February 23, 2012, remarking on the impressive growth rate. According to his work, the US East region has been adding an average of 110 server racks per month. Liu points out that although the growth rate is linear, it has slowed down somewhat over the past couple of months.
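To put that growth rate in terms of servers, here is a back-of-the-envelope sketch in Python. It assumes the 64-servers-per-rack figure used in Liu's estimate and treats the 110-racks-per-month average as constant over the six-month window, which is a simplification given the slowdown he notes. The names are illustrative, not taken from his write-up.

```python
SERVERS_PER_RACK = 64   # Liu's assumption
RACKS_PER_MONTH = 110   # reported average for US East
MONTHS_OBSERVED = 6     # August 23, 2011 to February 23, 2012


def implied_growth(racks_per_month: int, months: int) -> dict:
    """Rack and server growth implied by a constant monthly rack rate."""
    racks_added = racks_per_month * months
    return {
        "servers_per_month": racks_per_month * SERVERS_PER_RACK,
        "racks_added": racks_added,
        "servers_added": racks_added * SERVERS_PER_RACK,
    }


if __name__ == "__main__":
    print(implied_growth(RACKS_PER_MONTH, MONTHS_OBSERVED))
    # {'servers_per_month': 7040, 'racks_added': 660, 'servers_added': 42240}
```

On these assumptions, US East would have added roughly 7,000 servers a month, or about 42,000 over the period. These are derived figures, not numbers Liu reports directly.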
How did he do it? Liu writes:
Figuring out EC2’s size is not trivial. Part of the reason is that EC2 provides you with virtual machines and it is difficult to know how many virtual machines are active on a physical host. Thus, even if we can determine how many virtual machines are there, we still cannot figure out the number of physical servers. Instead of focusing on how many servers are there, our methodology probes for the number of server racks out there.
There’s a lot more to it than that, and Mr. Liu lays out his methodology in detail, providing the following notes as a summary of the process:
- Enumerate all public IP addresses EC2 uses
- Translate a public IP address to its public DNS name (e.g., ec2-50-17-204-150.compute-1.amazonaws.com)
- Run a DNS query inside EC2 to get its internal IP address (e.g., 10.2.13.243).
- Derive the rack’s IP range from the internal IP address (e.g., 10.2.12.x/22).
- Count how many unique racks we have seen, then multiply it by the number of physical servers in a rack (I assume it is 64 servers/rack).
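The steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Liu's actual tooling: the function names are hypothetical, the `compute-1.amazonaws.com` suffix is taken from the example hostname and would differ by region, and the DNS lookup only returns an internal address when run from inside EC2. The resolver is therefore passed in as a parameter so the logic can be tried with canned data.

```python
import ipaddress
import socket

SERVERS_PER_RACK = 64  # Liu's assumption


def public_dns_name(public_ip: str, zone: str = "compute-1.amazonaws.com") -> str:
    """Build the EC2-style public DNS name for a public IP address."""
    return "ec2-" + public_ip.replace(".", "-") + "." + zone


def rack_range(internal_ip: str, prefix_len: int = 22) -> str:
    """Map an internal IP to the /22 block assumed to represent one rack."""
    network = ipaddress.ip_network(f"{internal_ip}/{prefix_len}", strict=False)
    return str(network)


def estimate_servers(public_ips, resolve=socket.gethostbyname) -> int:
    """Count unique rack ranges seen and scale by servers per rack.

    `resolve` maps a DNS name to an IP address. The default performs a real
    lookup, which yields internal addresses only when run inside EC2.
    """
    racks = set()
    for public_ip in public_ips:
        internal_ip = resolve(public_dns_name(public_ip))
        racks.add(rack_range(internal_ip))
    return len(racks) * SERVERS_PER_RACK


if __name__ == "__main__":
    print(public_dns_name("50.17.204.150"))
    # ec2-50-17-204-150.compute-1.amazonaws.com
    print(rack_range("10.2.13.243"))
    # 10.2.12.0/22
```

Using the example from the notes, the internal address 10.2.13.243 falls in the 10.2.12.0/22 block, so every address from 10.2.12.0 through 10.2.15.255 would be counted as the same rack.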
Mr. Liu is quick to point out that these figures are estimates, based on a number of educated assumptions, but they are the best figures we have so far, and they are helping to inform the larger cloud conversation. Besides, as Liu notes, “the methodology is fully documented.” He invites “inquisitive minds” to read over his findings and to point out flaws in his process. For its part, the community has done just that; the story has already been picked up by a number of news outlets in the last few days.