TACC Steps Up to the MIC

By Michael Feldman

April 21, 2011

As Intel prepares to roll out its Many Integrated Core (MIC) technology for commercial production in 2012, it has managed to entice a major US supercomputing center to start porting some of its science codes to the new architecture. The Texas Advanced Computing Center (TACC) announced it has teamed up with the chipmaker and begun porting a handful of research applications to the pre-production “Knights Ferry” MIC processor. Later this year, TACC will build a cluster of such chips for further development, with the intent to deploy a system based on the commercial “Knights Corner” MIC processor when Intel starts production.

MIC represents Intel’s entry into the HPC processor accelerator sweepstakes, as the company attempts to perform an end-run around GPU computing. Mainly thanks to NVIDIA, over the last few years GPU computing, aka GPGPU, has become a mainstream HPC solution across workstations, clusters and supercomputers. Developers on those platforms rely on specialized programming environments, like CUDA and OpenCL, to write their software.

As suggested by its name, MIC is essentially an x86 processor, with more cores (but simpler ones) than a standard x86 CPU, an extra-wide SIMD unit for heavy-duty vector math, and four-way simultaneous multithreading (SMT). As such, it’s meant to speed up codes that can exploit much higher levels of parallelization than can be had on standard x86 parts.

Knights Ferry is Intel’s development implementation spun out of the chipmaker’s abandoned Larrabee processor effort for visual computing. The chip sports 32 IA cores and runs at 1.2 GHz. Since each core supports four-way SMT (as opposed to the two-way Hyper-Threading on Xeons), each chip can manage up to 128 threads in parallel. Memory-wise, Knights Ferry has 8 MB of cache and 1 to 2 GB of GPU-flavored GDDR5 DRAM. Like its current GPGPU competition, Knights Ferry is meant to be hooked up to a PCIe bus, acting as a co-processor to a standard x86 CPU.
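
The thread count follows directly from the core count and the number of hardware threads per core. A minimal Python sketch of that arithmetic follows; the function name `hardware_threads` is hypothetical, and the six-core Xeon used for comparison is an assumption rather than a specific part named here:

```python
def hardware_threads(cores: int, threads_per_core: int) -> int:
    """Return the number of hardware threads a chip can run at once."""
    if cores < 1 or threads_per_core < 1:
        raise ValueError("cores and threads_per_core must be positive")
    return cores * threads_per_core


# Knights Ferry figures from the article: 32 cores, four threads per core.
knights_ferry_threads = hardware_threads(cores=32, threads_per_core=4)

# Assumed comparison point: a six-core Xeon with two threads per core.
six_core_xeon_threads = hardware_threads(cores=6, threads_per_core=2)

print(knights_ferry_threads)  # 128
print(six_core_xeon_threads)  # 12
```

Under those assumptions, a single Knights Ferry chip exposes roughly ten times as many hardware threads as the Xeon.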

Knights Corner will be Intel’s first commercial version of MIC, will have upwards of 50 cores per chip, and will be implemented on the company’s 22nm process technology. Although no official date has been announced for the commercial launch, according to a presentation by Intel research engineer Pradeep Dubey at the recent 2011 Open Fabrics International Workshop in Monterey, Knights Corner is slated for release sometime in the second half of 2012.

At this point, TACC is using the MIC software development kit (SDK), employing a Knights Ferry chip attached to a single machine. According to TACC’s deputy director Dan Stanzione, they are planning to build a “relatively small” cluster of Knights Ferry-equipped nodes to test codes in a distributed computing environment before the end of the year.

On Thursday, I spoke with Stanzione, who was very upbeat about the new architecture, noting that the x86 compatibility is a big deal for TeraGrid researchers. In aggregate, those researchers have a massive investment in their science codes, which number in the hundreds.

“This is a way to get a dramatically better power per operation without having to throw out everything we know about software,” he said, adding, “I’m really excited about this as a path forward. I think it has the potential to be a real game-changer.”

One nice feature of MIC programming is that it inherently supports OpenMP, a popular parallel computing model for shared memory environments. And since Intel’s HPC tool chain (Parallel Studio and Cluster Studio) has been extended to the MIC architecture, programmers can even stay in the same development environment for both their Xeon and MIC work, which, of course, Intel would like very much.

The result is that OpenMP code written for four-core or six-core x86 CPUs, like some of the ones TACC has started porting, should move rather easily to a 32-core MIC co-processor. “Getting the codes to run the first time is pretty simple,” Stanzione said, adding that when they move to the MIC cluster, they’ll have to figure out how to layer an MPI distributed memory model on top of that.

According to him, they’ve already ported a bunch of benchmark codes and have started with the applications. One is a bio-modeling app, which attempts to detect epistatic interactions (how genes modify each other to express a phenotype) across a corn genome. The code was thousands of lines long, but because it was parallelized via OpenMP, it moved to MIC with minimal restructuring.

Although TACC has committed resources to the MIC effort, Stanzione said they are evaluating hardware and software accelerator approaches across the spectrum, most notably using CUDA and OpenCL on GPUs. (TACC’s Longhorn supercomputer is currently the center’s largest GPU platform, sporting 512 NVIDIA Tesla processors.) Although it’s too early to compare performance across specific applications, it’s already apparent that porting is much simpler with Intel’s offering.

“Moving a code to MIC might involve sitting down and adding a couple of lines of directives that takes a few minutes,” explained Stanzione. “Moving a code to a GPU is a project.”

Although measuring performance is still a work in progress, the early results on scaling appear to be encouraging. According to Stanzione, doubling the number of MIC cores has roughly doubled the performance on some of the initial codes. They expect to be able to say a lot more about performance when they get the Knights Corner commercial parts.
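
One common way to quantify that kind of scaling is to compare run times at two core counts. The sketch below computes the speedup and the parallel efficiency relative to ideal linear scaling. The function names and the timings are hypothetical illustrations, not TACC measurements:

```python
def speedup(time_base: float, time_scaled: float) -> float:
    """Speedup observed when moving from the base run to the scaled run."""
    if time_base <= 0 or time_scaled <= 0:
        raise ValueError("run times must be positive")
    return time_base / time_scaled


def scaling_efficiency(cores_base: int, time_base: float,
                       cores_scaled: int, time_scaled: float) -> float:
    """Observed speedup divided by the ideal (linear) speedup.

    A value of 1.0 means performance grew in proportion to the core count.
    """
    if cores_base < 1 or cores_scaled < 1:
        raise ValueError("core counts must be positive")
    ideal = cores_scaled / cores_base
    return speedup(time_base, time_scaled) / ideal


# Hypothetical timings in seconds, invented for illustration only.
t_16_cores = 100.0
t_32_cores = 52.0

print(round(speedup(t_16_cores, t_32_cores), 2))                     # 1.92
print(round(scaling_efficiency(16, t_16_cores, 32, t_32_cores), 2))  # 0.96
```

An efficiency close to 1.0 corresponds to the "roughly doubled" behavior Stanzione describes; values well below 1.0 would suggest that added cores are being lost to serial sections, memory contention or other overheads.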

From Intel’s point of view, getting TACC to sign on to MIC development is a big boost for its manycore effort. Assuming the porting goes as planned, the chipmaker will be able to point to a nice set of proof points based on real-world HPC applications. According to John Hengeveld, Intel’s director of technical compute marketing for its datacenter group, they’ll be able to incorporate TACC’s experience into the upcoming delivery of Knights Corner parts and software. “Having a partner that is helping us work on issues of scalability and optimization is really quite valuable,” he explained.

Although TACC is the first big HPC organization with a committed roadmap for MIC development, it won’t be the last. Intel currently has about 100 MIC developers scattered around, and according to Hengeveld, the company will be announcing some bigger collaborations in the months ahead. And as we get closer to MIC’s commercial release, the news surrounding the new architecture should start to pick up. “We’ll be talking a lot more about this at ISC,” promised Hengeveld.
