In Justin Rattner’s ISC keynote address on Thursday, the Intel CTO painted a picture in which future supercomputing technologies are merged into everyday computing. Titled “Multicore/Manycore Platforms Bring Supercomputing to the Masses,” Rattner’s keynote outlined how Intel’s terascale research program is bringing together all the hardware and software pieces that will become the platform for a new set of killer applications.
The applications are what Intel calls RMS, for recognition, mining and synthesis. An example of recognition software is code that locates images of the Eiffel Tower in a video stream. An example of an integrated RMS application is one that detects a tumor, models its growth, and then prescribes a therapy. Applications would span virtually all fields, though: health, entertainment, finance, engineering, scientific research, and so on. The common thread, said Rattner, is that they involve “multimodal recognition and synthesis over large and complex datasets” — artificial intelligence by any other name.
Conveniently, the platform that will support RMS apps needs to incorporate a bunch of different technologies currently being researched and developed at Intel. These include manycore processors, 3D stacked memory, silicon photonics, and parallel programming tools. The stacked memory and silicon photonics are still at the R end of the R&D process, but Intel’s manycore processors and parallel programming tools have been in development for a while and should see the light of day within the next year or two.
Larrabee, for example, is Intel’s first manycore architecture and is scheduled for production in 2010. It combines the IA instruction set with an extended vector capability, essentially merging the GPU and CPU within a processing core. Besides the IA vector cores (presumably at least 16), the chip incorporates a coherent cache, a high-performance interprocessor network, fixed-function logic, and memory controllers.
Intel thinks the IA compatibility will be a big advantage for Larrabee, allowing the processor to leverage the enormous amount of existing x86 software and developer talent. “Programming Larrabee is like programming any IA multicore, except you have an extraordinary amount of floating point capability available to you,” explained Rattner. “So it’s an extremely familiar programming environment.”
Although Rattner didn’t talk about how the chip is being positioned in the market, the company has made it clear that it is general-purpose enough to satisfy both high-end graphics users and the HPC crowd. If successful, Larrabee would do an end run around NVIDIA, which appears committed to GPU-only architectures, as well as AMD, which is merging discrete GPUs and CPUs on-chip. As with Larrabee’s progenitor, the 80-core Polaris prototype, there is every reason to believe that the first rendition of the new processor will hit a teraflop. In fact, to compete in the high-end graphics market in the 2010 timeframe, it will have to do even better than that.
Of course, another important element of the terascale environment is the programming language. For this piece, Intel has conjured up Ct, a C/C++ extension that, according to Rattner, allows “an ordinary programmer to write serial-like code in a core-independent fashion using familiar syntax.” The main data structure is the nested vector, which can represent both sparse and dense matrices. The application binary is the same for all processors — or at least Intel ones — with the decision on how to map the parallelism to the underlying hardware made by the just-in-time (JIT) compiler and the auto-scaling runtime.
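To make the nested-vector idea concrete, here is a minimal sketch in plain Python. It is not Ct syntax and not Intel's implementation; the layout (one inner vector per matrix row) and every name in it (`dense_row_dot`, `sparse_row_dot`, `matvec`, `workers`) are assumptions made for illustration. It shows per-row code written in a serial style, with the number of workers chosen at run time rather than fixed in the source, loosely mirroring the core-independent model Rattner described.

```python
import os
from concurrent.futures import ThreadPoolExecutor

# Hypothetical nested-vector layout: one inner vector per matrix row.
# A dense row stores every value; a sparse row stores only
# (column, value) pairs for its nonzero entries.
dense = [[1.0, 0.0, 2.0],
         [0.0, 3.0, 0.0]]
sparse = [[(0, 1.0), (2, 2.0)],
          [(1, 3.0)]]


def dense_row_dot(row, x):
    """Dot product of one dense row with the vector x."""
    return sum(a * b for a, b in zip(row, x))


def sparse_row_dot(row, x):
    """Dot product of one sparse row, given as (column, value) pairs, with x."""
    return sum(value * x[col] for col, value in row)


def matvec(rows, x, row_dot, workers=None):
    """Matrix-vector product over a nested vector.

    The per-row function is ordinary serial code. The worker count is
    picked at run time (assumption: one worker per reported core), so
    the caller never mentions how many cores the machine has.
    """
    workers = workers or os.cpu_count() or 1
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves row order in its results.
        return list(pool.map(lambda row: row_dot(row, x), rows))


x = [1.0, 2.0, 3.0]
dense_result = matvec(dense, x, dense_row_dot)
sparse_result = matvec(sparse, x, sparse_row_dot)
```

Both calls compute the same product from two different nested layouts. In standard CPython the thread pool will not speed up this arithmetic, so the sketch illustrates only the programming model, not the performance a JIT compiler and auto-scaling runtime would aim for.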
Rattner says Intel has Ct running on multicore CPUs today and already has a version targeted for the Larrabee processor. He demoed an application that recognizes moving automobiles in traffic, written in Ct and running on a multicore x86 machine. Rattner admitted that Ct may not be the be-all and end-all of RMS programming tools, but he thought it was a good start and an important piece of technology.
It’s all a bit of a risk for Intel, though. The company is late to the high-end GPU game, and evolving a CPU into a graphics/vector architecture is unconventional. But right now Intel has the wind at its back, so it’s probably a good time for the company to be adventurous.
“We understand that multicore is already tough and that manycore would be even tougher,” said Rattner. “We went into this with our eyes open.”