British multinational BP revealed it is building a new datacenter in Houston to house a 2-petaflop supercomputer. When installed in 2013, it will likely be the most powerful system deployed by a commercial entity, at least of the ones that have been publicly revealed. The upcoming petaflopper will support the company’s oil and gas exploration efforts and other research objectives.
According to the press release, BP’s existing datacenter in Houston has topped out in power and cooling capacity, so a new high performance computing facility was needed in order to support the company’s expanding HPC footprint. The new one will also be located in Houston and is scheduled to open sometime around the middle of next year.
It will house compute and storage systems devoted to processing BP’s voluminous set of seismic data collected around the world. It will also support “rock physics,” which will enable company scientists to produce images of rock structures deep underground – all of this to help BP locate and exploit new oil and gas resources.
The future center and supercomputer will put BP’s HPC infrastructure on par with that of national labs. At 110,000 square feet, the new facility will actually be larger than the recent 95,000 square-foot datacenter built for NCSA’s 11.5-petaflop Blue Waters supercomputer. To go along with the 2 petaflops of peak number-crunching capability, the future BP machine will also be outfitted with 536 terabytes of memory and 23.5 petabytes of external disk storage.
The upcoming BP super will apparently be getting all its FLOPs from CPUs, about 67,000 of them according to the official announcement. In an email interview with HPCwire, Keith Gray, BP’s HPC center manager, said they are not quite ready to make the jump to heterogeneous computing. “We continue to test accelerators,” wrote Gray, “but have not built a strong business case for our complete application base.”
“We must create a competitive environment to maximize the capabilities we will deliver,” he continued. “Our researchers want to test their ideas on real problems at scale. They want to increase the resolution and complexity. We need to be flexible and take advantage of what the market can deliver.”
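For a sense of scale, the announced totals (2 petaflops peak, 536 terabytes of memory, about 67,000 CPUs) can be divided out to rough per-CPU figures. The sketch below is only back-of-the-envelope arithmetic: it assumes decimal units (1 terabyte = 10^12 bytes) and takes the announcement's CPU count at face value, without guessing whether that means sockets or cores. The variable names are illustrative.

```python
# Figures quoted in BP's announcement (decimal units assumed).
PEAK_FLOPS = 2e15       # 2 petaflops peak
MEMORY_BYTES = 536e12   # 536 terabytes of memory
CPU_COUNT = 67_000      # "about 67,000" CPUs, taken at face value

# Simple per-CPU averages; real nodes may be configured unevenly.
flops_per_cpu = PEAK_FLOPS / CPU_COUNT
memory_per_cpu = MEMORY_BYTES / CPU_COUNT

print(f"{flops_per_cpu / 1e9:.1f} gigaflops per CPU")
print(f"{memory_per_cpu / 1e9:.1f} GB of memory per CPU")
```

Under those assumptions the machine averages roughly 30 gigaflops and 8 GB of memory per CPU.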
The existing HPC setup at BP provides an aggregate peak performance of more than 1.2 petaflops. It consists of multiple Intel Xeon-powered clusters, including 2,912 HP SL230 nodes (8-core 2.6 GHz Sandy Bridge CPUs), 1,920 Dell C6100 nodes (6-core 2.6 GHz Westmere CPUs), and 50 HP DL580 nodes (2.3 GHz Westmere EX CPUs). The core network in their current datacenter is Ethernet and is provided by Arista, while their storage systems come from various vendors, including Panasas, IBM, and DataDirect Networks.
The largest MPI applications used at BP can scale to more than 30,000 cores, so the new system will give them plenty of headroom for expansion. It will also allow multiple large jobs to be processed in parallel. “Projects that currently run overnight can now be run twice a day – letting us try more ideas,” explained Gray. “If a project takes six months, we might choose to defer it. If we can complete in three months, we may choose to proceed.”
BP says its processing needs have increased 10,000-fold since 1999. Seismic imaging that would have taken four years of computing time a decade ago can now be accomplished in an hour. The increase in processing power over this period has transformed oil and gas exploration, allowing major new finds at a time when many were predicting that most of the world’s reserves had been located.
With oil pushing $100 per barrel, there is plenty of incentive for oil and gas companies to invest in technologies that can uncover new reserves. For its part, BP has doubled HPC spending over the last few years and intends to keep that investment on an upward slope. The company is planning to test 15 new oil and gas sites over the next three years, and it expects that at least some of its 35 exploration wells will each yield an equivalent of a quarter billion barrels of oil.
BP claims that its 2-petaflop system will be the largest such machine employed for commercial purposes. That may or may not be the case, since not all commercial supercomputing deployments are made public, especially in the financial services realm and the oil and gas industry. These just happen to be the two industries that have the wherewithal and the monetary incentives to buy top-of-the-line supercomputers. But, for competitive reasons, not all of them want to reveal the technology they are using to drive revenue.
In the current TOP500 rankings, the fastest Linpack machine that was obtained without the help of government funds is a 461-teraflop cluster for a non-specified geosciences firm. It sits at number 44 on the November 2012 list. At 2 petaflops, BP’s system will be roughly four times as powerful, which would land it in the top 10 today.
While petaflop-plus computing is not commonplace yet, even in the government sector, BP’s plans are yet another indication that the petascale era is in full swing. And although there are only about 50 such machines in the world today, with the advent of teraflop accelerators and ever more powerful CPUs, such computing should become much more prevalent in the commercial arena and elsewhere over the next few years.
[The original version of this article erroneously referred to the Blue Waters supercomputer as employing Xeon Phi processors. As of today, Stampede is the only petascale Phi-powered system. The original text also mistakenly referred to oil at $100 per gallon, rather than $100 per barrel. We regret the errors – Editor]