January 11, 2008
With all major chipmakers committed to the multicore path, it seems only a matter of time before manycore (processors with more than eight cores) becomes the standard architecture across all computing sectors. The 128-core NVIDIA GPUs, Cisco's 188-core Metro network processor, and the 64-core Tilera TILE64 processor are three early examples of this trend. The 80-core prototype demonstrated by Intel is an indication that even the most mainstream segments of the computer industry are looking to enter the manycore realm.
While most discussions of manycore tend to focus on software development challenges or memory bandwidth limitations, an even more fundamental issue is the economic model that will drive these products into the marketplace. This is the topic that researchers Joseph Sloan and Rakesh Kumar at the University of Illinois at Urbana-Champaign addressed recently in a paper titled "Hardware/System Support for Four Economic Models for Many Core Computing" (http://passat.crhc.uiuc.edu/rakeshk/techrep_economic.pdf).
In the current model, customers buy systems containing processors that satisfy the average or worst-case computation needs of their applications. This means that when application requirements change, the user either has to live with the pain of a performance mismatch or go through the expense of purchasing new systems (or new chips) to realign system performance with the applications. Sloan and Kumar argue that as the number of cores increases, matching performance to application needs becomes increasingly difficult, and the associated cost of buying unused computing power becomes ever more prohibitive.
The chip vendors are affected as well. As the number of cores increases, chipmakers must decide how many processor configurations to offer for a given market segment. If one can fit 100 cores on a die, how many different variations can be rationalized? Certainly not 100. Intel will have to deal with a smaller version of this problem in its upcoming 45nm Nehalem microarchitecture. So far, the company has described only 2-, 4- and 8-core processor designs for Nehalem. But with the combination of different cache sizes, memory controller architectures and clock speeds, the new processor family will probably end up being the largest Intel has ever supported. When tens or hundreds of cores are the norm, practical considerations will limit the number of unique designs to a very small subset of possible core layouts.
In their paper, Sloan and Kumar propose four related economic models (five actually) for manycore computing. The overall approach is that the customer will usually need fewer cores than are physically present on the chip, but at times may want to use more of them. The authors suggest that chips be designed so that users pay only for the computing power they need, rather than for the peak computing power that is physically present. This can be accomplished with small pieces of logic incorporated into the processor that enable the vendor to disable or enable individual cores. (Presumably, disabled cores would draw little, if any, power.) Enabling or disabling cores involves contacting the vendor, who authenticates the chip and sends activation codes that are used to unlock or lock the specified cores. The user ends up paying only for the desired computing power.
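As a rough illustration of how such an activation exchange might work, here is a minimal Python sketch; the chip IDs, the HMAC-based activation code, and the function names are assumptions for the example, not details specified in the paper.

```python
# Illustrative sketch only: a vendor-mediated core activation exchange.
# Chip IDs, the HMAC-based code, and all function names are assumptions;
# the paper does not specify a particular mechanism.

import hmac
import hashlib

VENDOR_SECRET = b"vendor-held key"   # shared secret known to the chip and vendor

def vendor_issue_code(chip_id: str, cores_to_enable: int) -> str:
    """Vendor side: authenticate the chip and sign an activation code."""
    message = f"{chip_id}:{cores_to_enable}".encode()
    return hmac.new(VENDOR_SECRET, message, hashlib.sha256).hexdigest()

def chip_apply_code(chip_id: str, cores_to_enable: int, code: str) -> bool:
    """On-chip logic: verify the code before unlocking the requested cores."""
    expected = hmac.new(VENDOR_SECRET,
                        f"{chip_id}:{cores_to_enable}".encode(),
                        hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, code)

# Customer pays for 16 of, say, 64 physical cores; the rest stay disabled.
code = vendor_issue_code("CHIP-0001", 16)
assert chip_apply_code("CHIP-0001", 16, code)
```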
Of the models proposed, the most restrictive approach, the IntelligentBaseline model, forces the user to make a one-time decision about the number of cores needed. In this model, the vendor enables the user-selected subset of cores on the chip before shipping. Each of the other four models -- UpgradesOnly, Limited Up/Downgrade, CoresOnRent and PayPerUse -- offers a way to change the available processing power of the chip dynamically after purchase.
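To see why a dynamic model might matter to a customer, the toy comparison below contrasts buying enough cores for the peak workload with a hypothetical pay-per-use arrangement; the prices and usage pattern are invented for illustration and do not come from the paper.

```python
# Hypothetical numbers only: compare paying up front for the peak core count
# with paying each month for the cores actually used (PayPerUse-style).

PRICE_PER_CORE_UPFRONT = 100.0    # assumed one-time price per enabled core
PRICE_PER_CORE_MONTH = 5.0        # assumed monthly rate per active core

monthly_core_demand = [8, 8, 16, 32, 8, 8, 8, 64, 8, 8, 8, 8]  # example year

peak_cost = max(monthly_core_demand) * PRICE_PER_CORE_UPFRONT
pay_per_use_cost = sum(cores * PRICE_PER_CORE_MONTH
                       for cores in monthly_core_demand)

print(f"Buy at peak (64 cores): ${peak_cost:,.2f}")
print(f"Pay per use for a year: ${pay_per_use_cost:,.2f}")
```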
The underlying assumption in all of this is that the cost of manufacturing the processor does not rise linearly with the number of cores on the die, which allows the chip vendor to sell underutilized processors at a profit. According to Kumar, this is indeed the case. His view is that the factors that determine the cost of manufacturing often have nothing to do with the number of cores on a die.
"Going from a one-core chip to a manycore chip may often represent increased costs -- due to higher design/verification overhead," explains Kumar. "But, multiplying the number of cores on a manycore chip will increase costs only marginally [since] the same design can be stamped multiple times to multiply the number of cores on a die. In fact, one of the main reasons for going to many cores is the high degree of IP reuse, i.e., the computational power can be multiplied without much increased cost."
Kumar admits that chip costs are dependent upon die area, and if the number of cores increased that area, costs would increase linearly as well. But his contention is that the die area is usually fixed because of yield considerations, so the cost does not change much.
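Kumar's point can be restated with the textbook per-die cost approximation; the sketch below is a generic illustration with made-up numbers rather than a model from the paper: if die area and yield are held fixed, the per-die cost is the same whether that area holds 16 cores or 64.

```python
# Textbook-style approximation (not from the paper): per-die cost depends on
# wafer cost, dies per wafer, and yield. If the die area -- and therefore
# dies per wafer and yield -- is fixed, partitioning it into more cores
# leaves the manufacturing cost essentially unchanged.

def cost_per_good_die(wafer_cost: float, dies_per_wafer: int,
                      yield_rate: float) -> float:
    return wafer_cost / (dies_per_wafer * yield_rate)

# Same wafer, same die area, same yield: the 16-core and 64-core variants
# cost the same to manufacture.
print(cost_per_good_die(wafer_cost=5000.0, dies_per_wafer=100, yield_rate=0.8))
```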
Another issue is the strong coupling of the memory system with the peak performance of the processor. Sloan and Kumar suggest that the memory architecture should be composable to support system balance.
"Designing a composable memory hierarchy may not be a big technical challenge," contends Kumar. "It is just that a strong need was not there in the desktop and mobile domains. Composable memory hierarchies have often been designed in server systems. For example, Capacity on Demand for IBM System i offer clients the ability to non-disruptively activate (no IPL required) processors and memory. Same for Unisys as well as Sun systems too. You can simply have a middleware or microcode that allows/disallows access to certain regions of memory. Alternatively, some the techniques that we developed for supporting and enforcing the proposed models can also be used for memory hierarchies. Composability can also be attained by physically modifying the memory controller or disk controller to decouple memory regions."
However, the authors admit that in some cases composability may be difficult to achieve because system architectures may require memory hierarchies that are closely coupled with the core count. They also point out a number of other areas of concern, including compatibility with software licensing models (already an area of contention for multicore processors) and privacy/security issues related to vendors having access to customers' hardware.
"I think that there is no clear answer as to what are the new economic models that we need or whether we need new economic models at all," says Kumar. "But now may be the time when a discussion needs to start among academics, industry people, and everyone else who has a stake in it. At least an awareness of the issues is needed."