April 05, 2010
April 5 -- Researchers at North Carolina State University have developed a new approach to software development that will allow common computer programs to run up to 20 percent faster and possibly incorporate new security measures.
The researchers have found a way to run different parts of some programs -- including, for the first time, such widely used programs as word processors and Web browsers -- at the same time, which makes the programs operate more efficiently.
To understand how they did it, you have to know a little bit about computers. The brain of a computer chip is its central processing unit, or "core." Computing technology has advanced to the point where it is now common to have between four and eight cores on each chip. But for a program to use these cores, it has to be broken down into separate "threads" -- so that each core can execute a different part of the program simultaneously. The process of breaking down a program into threads is called parallelization, and it allows computers to run programs very quickly.
However, some programs are difficult to parallelize, including word processors and Web browsers. These programs operate much like a flow chart -- with certain program elements dependent on the outcome of others. These programs can only utilize one core at a time, minimizing the benefit of multicore chips.
But NC State researchers have developed a technique that allows hard-to-parallelize applications to run in parallel, by using nontraditional approaches to break programs into threads.
Every computer program consists of multiple steps. The program will perform a computation, then perform a memory-management function -- which either prepares memory storage to contain data or frees up memory storage that is currently in use. It repeats these steps over and over again, in a cycle. And, for difficult-to-parallelize programs, both of these steps have traditionally been performed on a single core.
"We've removed the memory-management step from the process, running it as a separate thread," says Dr. Yan Solihin, an associate professor of electrical and computer engineering at NC State, director of this research project, and co-author of a paper describing the research. Under this approach, the computation thread and the memory-management thread execute simultaneously, allowing the computer program to operate more efficiently.
"By running the memory-management functions on a separate thread, these hard-to-parallelize programs can operate approximately 20 percent faster," Solihin says. "This also opens the door to development of new memory-management functions that could identify anomalies in program behavior, or perform additional security checks. Previously, these functions would have been unduly time-consuming, slowing down the speed of the overall program."
Using the new technique, when a memory-management function needs to be performed, "the computational thread notifies the memory-management thread -- effectively telling it to allocate data storage and to notify the computational thread of where the storage space is located," says Devesh Tiwari, a Ph.D. student at NC State and lead author of the paper. "By the same token, when the computational thread no longer needs certain data, it informs the memory-management thread that the relevant storage space can be freed."
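The hand-off Tiwari describes can be sketched in a few lines of Python. This is only a toy illustration of the request-and-reply pattern, not the researchers' MMT design: the class name, the method names, the integer handles and the use of a queue are all assumptions made for the example, and Python threads would not reproduce the speedups reported for native programs.

```python
import queue
import threading


class MemoryManagerThread:
    """Toy memory manager that services requests on its own thread.

    Hypothetical illustration only; not the MMT implementation.
    """

    def __init__(self):
        self._requests = queue.Queue()   # messages from the computation thread
        self._blocks = {}                # handle -> allocated storage
        self._next_handle = 0
        self._worker = threading.Thread(target=self._serve, daemon=True)
        self._worker.start()

    def _serve(self):
        # Runs on the memory-management thread: handle requests in FIFO order.
        while True:
            op, arg, reply = self._requests.get()
            if op == "alloc":
                handle = self._next_handle
                self._next_handle += 1
                self._blocks[handle] = bytearray(arg)
                reply.put(handle)        # tell the caller where the storage is
            elif op == "free":
                self._blocks.pop(arg, None)
            elif op == "stop":
                reply.put(len(self._blocks))
                return

    def alloc(self, size):
        # Ask for storage and wait to learn which handle was assigned.
        reply = queue.Queue(maxsize=1)
        self._requests.put(("alloc", size, reply))
        return reply.get()

    def free(self, handle):
        # Fire-and-forget: the computation thread does not wait for the free.
        self._requests.put(("free", handle, None))

    def stop(self):
        # Shut the worker down and report how many blocks are still allocated.
        reply = queue.Queue(maxsize=1)
        self._requests.put(("stop", None, reply))
        live = reply.get()
        self._worker.join()
        return live


def run_demo():
    mm = MemoryManagerThread()
    first = mm.alloc(64)
    mm.alloc(128)
    mm.free(first)       # returns immediately; the work happens on the other thread
    return mm.stop()     # number of blocks never freed
```

In this sketch the computation thread only blocks when it needs an answer (the location of new storage); a free request is handed over and the caller moves on, which is where the overlap between the two threads comes from.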
The paper, "MMT: Exploiting Fine-Grained Parallelism in Dynamic Memory Management," will be presented April 21 at the IEEE International Parallel and Distributed Processing Symposium in Atlanta. The research was funded by the National Science Foundation. The paper is co-authored by Tiwari, Solihin, NC State Ph.D. student Sanghoon Lee and Dr. James Tuck, an assistant professor of electrical and computer engineering at NC State.
NC State's Department of Electrical and Computer Engineering is part of the university's College of Engineering.
"MMT: Exploiting Fine-Grained Parallelism in Dynamic Memory Management"
Authors: Devesh Tiwari, Sanghoon Lee, James Tuck, Yan Solihin, North Carolina State University
Presented: April 21, 2010, at the IEEE International Parallel and Distributed Processing Symposium, Atlanta.
Abstract: In this paper, we propose a new approach for accelerating dynamic memory management on multicore architecture, by offloading dynamic management functions to a separate thread that we refer to as memory management thread (MMT). We show that an efficient MMT design can give significant performance improvement by extracting parallelism while being agnostic to the underlying memory management library algorithms and data structures. We also show how parallelism provided by MMT can be beneficial for high overhead memory management tasks, for example, security checks related to memory management. We evaluate MMT on heap allocation-intensive benchmarks running on an Intel Core 2 Quad platform for two widely used memory allocators: Doug Lea's and PHKmalloc allocators. On average, MMT achieves a speedup ratio of 1.19 times for both allocators, while both the application and memory management libraries are unmodified and are oblivious to the parallelization scheme. For PHKmalloc with security checks turned on, MMT reduces the security check overheads from 21 percent to just 1 percent on average.
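To put the 1.19 speedup figure in perspective, the short Python sketch below uses a deliberately simplified model that is an assumption of this example and does not come from the paper: memory management takes a fixed fraction of single-threaded run time, and once offloaded it overlaps perfectly with computation at no synchronization cost. The function names are hypothetical.

```python
def overlapped_speedup(mm_fraction):
    """Speedup if a fraction of run time is moved to a perfectly overlapped thread.

    Assumed idealized model: no communication or synchronization cost.
    """
    if not 0.0 <= mm_fraction < 1.0:
        raise ValueError("mm_fraction must be in [0, 1)")
    serial_time = 1.0
    # With perfect overlap, run time is set by the longer of the two threads.
    parallel_time = max(1.0 - mm_fraction, mm_fraction)
    return serial_time / parallel_time


def fraction_for_speedup(speedup):
    """Invert the model; valid while the offloaded fraction is at most one half."""
    return 1.0 - 1.0 / speedup


# Under this model, a 1.19x speedup corresponds to overlapping roughly
# 16 percent of the original single-threaded run time.
implied_fraction = fraction_for_speedup(1.19)
```

Real systems pay for communication between the threads, so the share of run time actually spent on memory management in the benchmarks may differ from what this idealized model implies.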
Source: North Carolina State University