Mont Blanc Forges Cluster from Smartphone Chips

By Timothy Prickett Morgan

November 22, 2013

The Mont Blanc project, an effort by a number of European supercomputing centers and vendors that seeks to create an energy-efficient supercomputer based on ARM processors and GPU coprocessors, has put together its third prototype. That is one more step on the path to an exascale system.

The third-generation machine, which is being shown off at the SC13 conference in Denver this week, is by far the most elegant one that the Mont Blanc project has created thus far. This prototype supercomputer actually bears the name of the project this time around, and was preceded by the Tibidabo and Petraforca clusters, which were based on a different collection of ARM processors and GPU accelerators.

Just because this design is elegant, don’t get the wrong idea, though. The Mont Blanc machine is still a prototype, cautions Alex Ramirez, leader of the Heterogeneous Architectures Research Group at the Barcelona Supercomputing Center (BSC), who heads up the Mont Blanc project.

“In order to make this a production product, we would have to go through at least one more generation,” he says.

It stands to reason that the Mont Blanc project is waiting for the day when 64-bit ARM chips with integrated interconnects and faster GPUs are available before going into production. But for now, software can be ported to these prototypes and things can be learned about where the performance bottlenecks are and what reliability issues there might be.

The exact size of the Mont Blanc prototype cluster has not been determined yet, but Ramirez says it will have two or three racks of ARM-powered nodes. “It will be big enough to make scalability and reliability claims, but we are trying to keep the cost down on a machine that is not a production system,” he says.

[Image: Mont Blanc blade carrier]

The server node in the Mont Blanc system is based on the Exynos 5 system-on-chip made by Samsung, which is a dual-core ARM Cortex-A15 with an ARM Mali-T604 GPU on the die. The ARM CPU portion of the system-on-chip has about twice the performance of the quad-core Cortex-A9 processor used on the Petraforca prototype that was put together earlier this year. (There were actually two versions, but the second one is more important.) That machine used Nvidia Tesla K20 GPU coprocessors to test out how a wimpy CPU and a brawny GPU might be married. Specifically, the ARM processors, which were Tegra 3 chips running at 1.3 GHz, were put into a Mini-ITX system board with one I/O slot that was linked to a PCI-Express switch that in turn had one GPU and one ConnectX-3 40 Gb/sec InfiniBand adapter card.

The dual-core Exynos 5 chip from Samsung is used in smartphones, runs at 1.7 GHz, and has a quad-core Mali-T604 GPU that supports OpenCL 1.1. It has a dual-channel DDR3 memory controller and a USB 3.0 to 1 Gb/sec Ethernet bridge. Each Mont Blanc node is a daughter card made by Samsung that has the CPU and GPU, 4 GB of memory (1.6 GHz DDR3), a microSD slot for flash storage, and a 1 Gb/sec Ethernet network interface. All of this is crammed onto a card measuring 3.3 by 3.2 inches that delivers 6.8 gigaflops of compute on the CPU and 25.5 gigaflops of compute on the GPU for something around 10 watts of power. That works out to around 3.2 gigaflops per watt at peak theoretical performance.
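For readers who want to check the per-node arithmetic, here is a minimal Python sketch. It uses only the peak figures quoted above; the constant and function names are illustrative, and the 10-watt figure is the article's rough estimate rather than a measured value.

```python
# Peak theoretical figures quoted in the article for one Mont Blanc node.
CPU_GFLOPS = 6.8    # dual-core Cortex-A15
GPU_GFLOPS = 25.5   # quad-core Mali-T604
NODE_WATTS = 10.0   # "something around 10 watts" (an estimate, not a measurement)


def node_peak_gflops(cpu_gflops=CPU_GFLOPS, gpu_gflops=GPU_GFLOPS):
    """Combined CPU plus GPU peak for a single node, in gigaflops."""
    return cpu_gflops + gpu_gflops


def gflops_per_watt(gflops, watts):
    """Energy efficiency as gigaflops delivered per watt consumed."""
    return gflops / watts


node_gflops = node_peak_gflops()
node_efficiency = gflops_per_watt(node_gflops, NODE_WATTS)
print(f"{node_gflops:.1f} gigaflops per node, "
      f"{node_efficiency:.1f} gigaflops per watt")
```

The GPU supplies roughly four-fifths of the node's peak, which is why the efficiency figure depends so heavily on how well applications can use OpenCL on the Mali part.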

The Mont Blanc system is using the Bull B505 blade server carrier and the related blade server chassis and racks to house multiple ARM server nodes. In this case, the blade carrier is fitted with a custom backplane that has a Broadcom Ethernet crossbar switch on it that links fifteen of these ARM compute nodes together. Every blade in the carrier has an Ethernet bridge chip, made by ASIX Electronics, that converts the USB port into Ethernet and then lets it hook into that Broadcom switch in the carrier.

Here is how you stack up the Mont Blanc rack:

[Image: Mont Blanc system rack layout]

In this particular setup, says Ramirez, the location had some power density and heat density restrictions, so it was limited to four Bull blade server chassis. But the system is designed to support up to six chassis if the datacenter has enough power and cooling.

Each blade has fifteen nodes and is a cluster in its own right. The blade delivers on the order of 485 gigaflops of compute and will burn about 200 watts. (Ramirez is estimating; he has not yet been able to run the wall power test because the machines came out of the factory only a few days prior to SC13.) That works out to 2.4 gigaflops per watt or so after the overhead of the network is added in.

The 7U blade chassis can hold nine carrier blades, for a total of 135 compute nodes. That works out to 4.3 teraflops in the aggregate per chassis at around 2 kilowatts of power, or 2.2 gigaflops per watt. With two 36-port 10 Gb/sec Ethernet switches to link the chassis together and 40 Gb/sec uplinks to hook into other racks, a four-chassis rack would deliver 17.2 teraflops of computing in an 8.2 kilowatt power envelope, or about 2.1 gigaflops per watt. With six blade chassis, you can get 25.8 teraflops into a rack. A six-chassis rack holds 810 chips in total, by the way, with 1,620 ARM cores and 3,240 Mali GPU cores.
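The rack-level roll-up can be sketched the same way. The Python below is a rough illustration that takes the article's rounded 4.3 teraflops per chassis as its starting point; the names are hypothetical, and the power figures remain estimates, since wall power had not been measured.

```python
# Packaging figures quoted in the article.
NODES_PER_BLADE = 15
BLADES_PER_CHASSIS = 9
ARM_CORES_PER_NODE = 2    # dual-core Cortex-A15
GPU_CORES_PER_NODE = 4    # quad-core Mali-T604
CHASSIS_TFLOPS = 4.3      # the article's rounded per-chassis peak


def rack_summary(chassis_count):
    """Node, core, and peak-teraflops totals for a rack with the given chassis count."""
    nodes = chassis_count * BLADES_PER_CHASSIS * NODES_PER_BLADE
    return {
        "nodes": nodes,
        "arm_cores": nodes * ARM_CORES_PER_NODE,
        "gpu_cores": nodes * GPU_CORES_PER_NODE,
        "peak_tflops": round(chassis_count * CHASSIS_TFLOPS, 1),
    }


def rack_gflops_per_watt(peak_tflops, kilowatts):
    """Teraflops per kilowatt is numerically the same as gigaflops per watt."""
    return peak_tflops / kilowatts


four_chassis = rack_summary(4)
six_chassis = rack_summary(6)
four_chassis_efficiency = rack_gflops_per_watt(four_chassis["peak_tflops"], 8.2)
print(four_chassis, round(four_chassis_efficiency, 1))
print(six_chassis)
```

Calling `rack_summary(6)` reproduces the 810 nodes, 1,620 ARM cores, and 3,240 GPU cores cited for a fully populated rack, while the four-chassis case matches the 17.2 teraflops quoted for the power-constrained installation.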

This Mont Blanc effort will get very interesting next year, when many different ARMv8 processors, sporting 64-bit memory addressing and integrated interconnects, become available from a variety of vendors, including AppliedMicro, Calxeda, AMD, and maybe others like Samsung. Many of the components that had to be woven together in this third prototype will be unnecessary, and the thermal efficiency of the cluster will presumably rise dramatically once these features are integrated on the chips. These future ARM chips will also come with server features, such as ECC memory protection and standard I/O interfaces like PCI-Express.

“There will be enough providers that at least one of them will have exactly the kind of part you want at any given time,” says Ramirez, a bit like a kid in a candy store.

The Mont Blanc project was established in October 2011 and is a five-year effort coordinated by the Barcelona Supercomputing Center in Spain. British chip maker ARM Holdings, French server maker Bull, French chip maker STMicroelectronics, and British compiler tool maker Allinea are vendor participants in the Mont Blanc consortium. The University of Bristol in England, the University of Stuttgart in Germany, and the CINECA consortium of universities in Italy are academic members of the group, and the CEA, BADW-LRZ, Juelich, and BSC supercomputer centers are also members. So are a number of other institutions that promote HPC in Europe, including Inria, GENCI, and CNRS.

Mont Blanc was originally a three-year project with a relatively modest budget of €14.5 million, and it has secured an additional €8.1 million in funding from the European Commission to extend it two more years. The funds are being used not just to create an exascale design, but also to create a parallel programming environment that will run on hybrid ARM-GPU machines and checkpointing software to run on the clusters.
