December 02, 2009
On Wednesday, Intel shifted its Tera-scale Computing Research Program into second gear by demonstrating a 48-core x86 processor. The company intends to use the new chip as a research platform to light a fire under manycore computing.
According to Intel, the new chip boasts 1.3 billion transistors and is built on 45nm CMOS technology. Its distinction is that it packs the largest number of Intel Architecture (IA) cores ever assembled on a single microprocessor. As such, it represents the sequel to Intel's 2007 "Polaris" 80-core prototype, which was based on simple floating point units. While Polaris was said to reach 2 teraflops, the company is not talking about performance figures for the 48-core version.
It's worth mentioning that Intel is not blazing a completely new trail here. Tilera already offers 32- and 64-core general-purpose processors (albeit non-x86) and previewed a 100-core version in October. Those chips are aimed at digital multimedia applications, networking gear, wireless infrastructure, and cloud computing.
Intel's 48-core offering is not intended for commercial use at all. Rather, it will be used to help software researchers figure out how real applications can scale from dozens to thousands of cores. It can also serve as a testbed for experimenting with new parallel computing models and applications. Intel plans to distribute at least 100 of the experimental chips to commercial and academic researchers. Since the new chip incorporates "fully functional" IA (32-bit) cores, existing software should port with relative ease.
Intel is labeling the device a "Single-chip Cloud Computer" (SCC), presumably to emphasize the processor's resemblance to a shrink-wrapped datacenter. It's probably more accurate to simply call it a cluster-on-a-chip, considering it is essentially a bunch of cores hooked together by an on-chip network. Specifically, the processor contains 24 dual-core tiles, arranged in a two-dimensional 6-by-4 layout. Main memory is accessed via four on-chip DDR3 memory controllers. Each tile comes with its own router that connects the tiles to the network fabric.
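The article doesn't spell out how cores are numbered across the mesh, but the 6-by-4 arrangement of dual-core tiles implies a simple mapping from a core's ID to its tile and router coordinates. The short C sketch below illustrates one plausible row-major numbering; the scheme is an assumption for illustration, not Intel's documented layout.

```c
/* Illustrative mapping of a core ID to its tile and mesh position on a
 * 6x4 grid of dual-core tiles (48 cores total). The numbering convention
 * is assumed for illustration, not taken from Intel documentation. */
#include <stdio.h>

#define CORES_PER_TILE 2
#define MESH_WIDTH     6   /* tiles per row of the mesh */
#define MESH_HEIGHT    4   /* rows of tiles             */

int main(void)
{
    for (int core = 0; core < CORES_PER_TILE * MESH_WIDTH * MESH_HEIGHT; core++) {
        int tile = core / CORES_PER_TILE;   /* 0..23               */
        int x    = tile % MESH_WIDTH;       /* router column, 0..5 */
        int y    = tile / MESH_WIDTH;       /* router row, 0..3    */
        printf("core %2d -> tile %2d at router (%d,%d)\n", core, tile, x, y);
    }
    return 0;
}
```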
Probably the most important feature of the network is its hardware support for message passing, which should provide a very high-performance environment for many cluster applications. Alternatively, a software-managed shared-memory model may be employed by codes that communicate via global data.
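Intel hasn't detailed the programming interface here, but the model maps naturally onto the message-passing style already familiar from MPI clusters. As a rough illustration only, the following minimal MPI exchange shows the kind of code that should translate well to hardware-assisted on-chip messaging; nothing in it is specific to the SCC.

```c
/* Minimal MPI-style message exchange: rank 0 sends a value to rank 1.
 * This is a generic cluster-programming sketch, not SCC-specific code. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, value = 0;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        value = 42;
        MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);      /* to rank 1   */
    } else if (rank == 1) {
        MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);                              /* from rank 0 */
        printf("rank 1 received %d\n", value);
    }

    MPI_Finalize();
    return 0;
}
```

Run with at least two ranks (for example, mpirun -np 2 ./a.out) to see the exchange complete.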
Each core contains its own L2 cache. But unlike most modern CPU designs, the SCC doesn't offer hardware cache coherence. Instead, it offloads that task to the software, which has to coordinate reads and writes between all the caches. In this case, Intel opted to trade programming ease for hardware simplicity.
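To make that tradeoff concrete, here is a hedged sketch of what software-managed coherence looks like for a shared buffer: the writer explicitly flushes its cached copy back to memory, and the reader explicitly invalidates before reading. The cache_flush and cache_invalidate calls are hypothetical no-op stubs standing in for whatever primitives the platform really exposes, and the handshake that tells the consumer the data is ready (a flag or a message) is omitted for brevity.

```c
/* Sketch of software-managed coherence for a shared buffer. cache_flush()
 * and cache_invalidate() are hypothetical placeholders (no-op stubs here)
 * for whatever flush/invalidate primitives the platform actually provides. */
#include <stddef.h>

static void cache_flush(void *addr, size_t len)      { (void)addr; (void)len; }
static void cache_invalidate(void *addr, size_t len) { (void)addr; (void)len; }

/* Producer core: write shared data, then push it out of the local cache. */
void publish(int *shared_buf, size_t n)
{
    for (size_t i = 0; i < n; i++)
        shared_buf[i] = (int)i;
    cache_flush(shared_buf, n * sizeof(int));        /* make the writes visible */
}

/* Consumer core: discard any stale cached copy before reading. */
long consume(int *shared_buf, size_t n)
{
    long sum = 0;
    cache_invalidate(shared_buf, n * sizeof(int));
    for (size_t i = 0; i < n; i++)
        sum += shared_buf[i];
    return sum;
}
```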
Fine-grained power management allows the processor to scale from a low of 25 watts up to a maximum of 125 watts. Each tile can run at a different frequency, while each row of four tiles can be run at a different voltage. There are also voltage and frequency controls for the network and the memory controllers. Like cache coherence, all power management is controlled via software.
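The actual control interface isn't described, but a software-driven scheme might look something like the following sketch, which parks idle tiles at a low clock and drops the voltage of any four-tile group that is entirely idle. All function names, units, and values here are invented placeholders, not Intel's API.

```c
/* Hypothetical sketch of software-controlled power scaling. The control
 * calls below are stand-in stubs; the real interface is not documented
 * in the article. */
#include <stdbool.h>

#define TILES_PER_GROUP 4   /* voltage is set per group of four tiles */
#define NUM_GROUPS      6   /* 24 tiles in total                      */

/* Placeholder stubs standing in for the real, undocumented control calls. */
static void set_tile_frequency_mhz(int tile, int mhz)       { (void)tile; (void)mhz; }
static void set_group_voltage_mv(int group, int millivolts) { (void)group; (void)millivolts; }
static bool tile_is_idle(int tile)                          { return tile % 2 == 0; }

void throttle_idle_tiles(void)
{
    for (int group = 0; group < NUM_GROUPS; group++) {
        bool group_idle = true;
        for (int t = 0; t < TILES_PER_GROUP; t++) {
            int tile = group * TILES_PER_GROUP + t;
            if (tile_is_idle(tile))
                set_tile_frequency_mhz(tile, 100);   /* park idle tiles at a low clock */
            else
                group_idle = false;
        }
        if (group_idle)
            set_group_voltage_mv(group, 700);        /* drop voltage for an idle group */
    }
}
```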
The potential application area is actually much larger than the examples cited above. Given that CPUs with dozens of cores will eventually be a mainstay across most IT market segments, Intel anticipates that new parallel computing applications will emerge as manycore makes its way from servers into the desktop and mobile arenas. At least that's what the company is hoping.
Details about the new microprocessor will be formally presented at the International Solid-State Circuits Conference on February 8 in San Francisco.