October 21, 2010
PITTSBURGH, Oct. 21 -- Computer scientists at Carnegie Mellon University have devised an innovative and elegantly concise algorithm that can efficiently solve systems of linear equations that are critical to such important computer applications as image processing, logistics and scheduling problems, and recommendation systems.
The theoretical breakthrough by Professor Gary Miller, Systems Scientist Ioannis Koutis and Ph.D. student Richard Peng, all of Carnegie Mellon's Computer Science Department, has enormous practical potential. Linear systems are widely used to model real-world systems, such as transportation, energy, telecommunications and manufacturing that often may include millions, if not billions, of equations and variables.
Solving these linear systems can be time consuming on even the fastest computers and is an enduring computational problem that mathematicians have sweated for 2,000 years. The Carnegie Mellon team's new algorithm employs powerful new tools from graph theory, randomized algorithms and linear algebra that make stunning increases in speed possible.
The algorithm, which applies to an important class of problems known as symmetric diagonally dominant (SDD) systems, is so efficient that it may soon be possible for a desktop workstation to solve systems with a billion variables in just a few seconds.
The work will be presented at the annual IEEE Symposium on Foundations of Computer Science (FOCS 2010), Oct. 23-36 in Las Vegas.
A myriad of new applications have emerged in recent years for SDD systems. Recommendation systems, such as the one used by Netflix to suggest movies to customers, use SDD systems to compare the preferences of an individual to those of millions of other customers. In image processing, SDD systems are used to segment images into component pieces, such as earth, sky and objects like buildings, trees and people. "Denoising" images to bring out lettering and other details that otherwise might appear as a blur also make use of SDD systems.
A large class of logistics, scheduling and optimization problems can be formulated as maximum-flow problems, or "max flow," which calculate the maximum amount of materials, data packets or vehicles that can move through a network, be it a supply chain, a telecommunications network or a highway system. The current theoretically best max flow algorithm uses, at its core, an SDD solver.
SDD systems also are widely used in engineering, such as for computing heat flow in materials or the vibrational modes of objects with complex shapes, in machine learning, and in computer graphics and simulations.
"In our work at Microsoft on digital imaging, we use a variety of fast techniques for solving problems such as denoising, image blending and segmentation," said Richard Szeliski, leader of the Interactive Visual Media Group at Microsoft Research. "The fast SDD solvers developed by Koutis, Miller and Peng represent a real breakthrough in this domain, and I expect them to have a major impact on the work that we do."
Finding methods to quickly and accurately solve simultaneous equations is an age-old mathematical problem. A classic algorithm for solving linear systems, dubbed Gaussian elimination in modern times, was first published by Chinese mathematicians 2,000 years ago.
"The fact that you can couch the world in linear algebra is super powerful," Miller said. "But once you do that, you have to solve these linear systems and that's often not easy."
A number of SDD solvers have been developed, but they tend not to work across the broad class of SDD problems and are prone to failures. The randomized algorithm developed by Miller, Koutis and Peng, however, applies across the spectrum of SDD systems.
The team's approach to solving SDD systems is to first solve a simplified system that can be done rapidly and serve as a "preconditioner" to guide iterative steps to an ultimate solution. To construct the preconditioner, the team uses new ideas from spectral graph theory, such as spanning tree and random sampling.
The result is a significant decrease in computer run times. The Gaussian elimination algorithm runs in time proportional to s3, where s is the size of the SDD system as measured by the number of terms in the system, even when s is not much bigger the number of variables. The new algorithm, by comparison, has a run time of s[log(s)]2. That means, if s = 1 million, that the new algorithm run time would be about a billion times faster than Gaussian elimination.
Other algorithms are better than Gaussian elimination, such as one developed in 2006 by Daniel Spielman of Yale University and Miller's former student, Shang-Hua Teng of the University of Southern California, which runs in s[log(s)]25. But none promise the same speed as the one developed by the Carnegie Mellon team.
"The new linear system solver of Koutis, Miller and Peng is wonderful both for its speed and its simplicity," said Spielman, a professor of applied mathematics and computer science at Yale. "There is no other algorithm that runs at even close to this speed. In fact, it's impossible to design an algorithm that will be too much faster."
The group's research paper, "Approaching Optimality for Solving SDD Linear Systems," can be downloaded at http://www.cs.cmu.edu/~glmiller/Publications/Papers/KoutisApproaching-2010.pdf.
About Carnegie Mellon University
Carnegie Mellon (www.cmu.edu) is a private, internationally ranked research university with programs in areas ranging from science, technology and business, to public policy, the humanities and the arts. More than 11,000 students in the university's seven schools and colleges benefit from a small student-to-faculty ratio and an education characterized by its focus on creating and implementing solutions for real problems, interdisciplinary collaboration and innovation. A global university, Carnegie Mellon's main campus in the United States is in Pittsburgh, Pa. It has campuses in California's Silicon Valley and Qatar, and programs in Asia, Australia, Europe and Mexico. The university is in the midst of a $1 billion fundraising campaign, titled "Inspire Innovation: The Campaign for Carnegie Mellon University," which aims to build its endowment, support faculty, students and innovative research, and enhance the physical campus with equipment and facility improvements.
-----
Source: Carnegie Mellon University
Large-scale, worldwide scientific initiatives rely on some cloud-based system to both coordinate efforts and manage computational efforts at peak times that cannot be contained within the combined in-house HPC resources. Last week at Google I/O, Brookhaven National Lab’s Sergey Panitkin discussed the role of the Google Compute Engine in providing computational support to ATLAS, a detector of high-energy particles at the Large Hadron Collider (LHC).
Read more...
The Xeon Phi coprocessor might be the new kid on the high performance block, but out of all first-rate kickers of the Intel tires, the Texas Advanced Computing Center (TACC) got the first real jab with its new top ten Stampede system.We talk with the center's Karl Schultz about the challenges of programming for Phi--but more specifically, the optimization...
Read more...
Although Horst Simon was named Deputy Director of Lawrence Berkeley National Laboratory, he maintains his strong ties to the scientific computing community as an editor of the TOP500 list and as an invited speaker at conferences.
Read more...
May 16, 2013 |
When it comes to cloud, long distances mean unacceptably high latencies. Researchers from the University of Bonn in Germany examined those latency issues of doing CFD modeling in the cloud by utilizing a common CFD and its utilization in HPC instance types including both CPU and GPU cores of Amazon EC2.
Read more...
May 15, 2013 |
Supercomputers at the Department of Energy’s National Energy Research Scientific Computing Center (NERSC) have worked on important computational problems such as collapse of the atomic state, the optimization of chemical catalysts, and now modeling popping bubbles.
Read more...
May 10, 2013 |
Program provides cash awards up to $10,000 for the best open-source end-user applications deployed on 100G network.
Read more...
May 09, 2013 |
The Japanese government has revealed its plans to best its previous K Computer efforts with what they hope will be the first exascale system...
Read more...
05/10/2013 | Cleversafe, Cray, DDN, NetApp, & Panasas | From Wall Street to Hollywood, drug discovery to homeland security, companies and organizations of all sizes and stripes are coming face to face with the challenges – and opportunities – afforded by Big Data. Before anyone can utilize these extraordinary data repositories, however, they must first harness and manage their data stores, and do so utilizing technologies that underscore affordability, security, and scalability.
04/15/2013 | Bull | “50% of HPC users say their largest jobs scale to 120 cores or less.” How about yours? Are your codes ready to take advantage of today’s and tomorrow’s ultra-parallel HPC systems? Download this White Paper by Analysts Intersect360 Research to see what Bull and Intel’s Center for Excellence in Parallel Programming can do for your codes.
In this demonstration of SGI DMF ZeroWatt disk solution, Dr. Eng Lim Goh, SGI CTO, discusses a function of SGI DMF software to reduce costs and power consumption in an exascale (Big Data) storage datacenter.
The Cray CS300-AC cluster supercomputer offers energy efficient, air-cooled design based on modular, industry-standard platforms featuring the latest processor and network technologies and a wide range of datacenter cooling requirements.