February 25, 2010
Nine terabytes of data validated in less than 20 minutes; greater predictability with more accuracy
SEATTLE and ZURICH, Feb. 25 -- IBM Research today unveiled a breakthrough method based on a mathematical algorithm that reduces the computational complexity, cost, and energy usage of analyzing the quality of massive amounts of data by two orders of magnitude. The new method will help enterprises extract and use data more quickly and efficiently to develop more accurate and predictive models.
In a record-breaking experiment, IBM researchers used the fourth most powerful supercomputer in the world -- a Blue Gene/P system at the Forschungszentrum Julich in Germany -- to validate nine terabytes of data (nine million million bytes, a number with 12 zeros) in less than 20 minutes, without compromising accuracy. Ordinarily, using the same system, this would take more than a day. Additionally, the process used about one percent of the energy that would typically be required.*
http://www.flickr.com/photos/ibm_research_zurich/sets/72157623370264033/
The breakthrough will be presented today at the Society for Industrial and Applied Mathematics conference in Seattle.
"In a world with already one billion transistors per human and growing daily, data is exploding at an unprecedented pace," said Dr. Alessandro Curioni, manager of the Computational Sciences team at IBM Research – Zurich. "Analyzing these vast volumes of continuously accumulating data is a huge computational challenge in numerous applications of science, engineering and business. This breakthrough greatly extends the ability to analyze the quality of large volumes of data at rapid speeds."
One of the most computation-intensive yet critical tasks in analytics is measuring the quality of the data, which shows how reliable the data used by the model, and the data generated by it, really is. In areas ranging from traffic management to financial and water management, this method could pave the way for more powerful, complex and accurate models with greater predictability.
The amount of digital data is increasing at enormous rates, due in part to the increasingly ubiquitous presence of sensors, actuators, RFID tags and GPS tracking devices. These miniature computers measure everything from the degree of pollution of ocean water to traffic patterns to food supply chains.
With all of this data come new challenges, as organizations now struggle not only to extract the relevant information from it but also to make sure it is accurate. IBM researchers are pursuing leading-edge research and actively engaging in client projects to extend the ability of analytics to predict outcomes and improve the speed and quality of business decisions.
"Determining how typical or how statistically relevant the data is, helps us to measure the quality of the overall analysis and reveals flaws in the model or hidden relations in the data," explains Dr. Costas Bekas of IBM Research – Zurich. "Efficient analysis of huge data sets requires the development of a new generation of mathematical techniques that target at both reducing computational complexity and at the same time allow for their efficient deployment on modern massively parallel resources."
The new method demonstrated by the IBM scientists reduces computational complexity and scales very well, up to the full size of the JuGene supercomputer at the Forschungszentrum Julich, with its 72 racks of IBM's Blue Gene/P system, 294,912 processors and a peak performance of one petaflop.
"In the next years supercomputing will provide us with unique insights and will help to create added value with new technologies," says Prof. Dr. Thomas Lippert, director of the Julich Supercomputing Centre. "A cornerstone for the future will be innovative tools and algorithms helping us to analyze the huge amount of data provided by simulations on the most powerful computers."
IBM intends to make this capability available to clients.
* The JuGene supercomputer at Forschungszentrum Julich requires about 52,800 kWh for one day of operation on the full machine; the IBM demonstration required an estimated 700 kWh.
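The footnote's figures can be checked with a few lines of arithmetic. The sketch below is illustrative only and is not part of IBM's announcement. It assumes that "more than a day" means exactly 24 hours and that "less than 20 minutes" means exactly 20 minutes, so the time factor is a rough lower bound; all variable names are made up for this example.

```python
# Figures quoted in the release and its footnote.
FULL_DAY_KWH = 52_800    # full JuGene machine, one day of operation
DEMO_KWH = 700           # estimated energy used by the IBM demonstration

# Assumptions for illustration: the release only gives bounds.
DAY_MINUTES = 24 * 60    # "more than a day" taken as exactly one day
DEMO_MINUTES = 20        # "less than 20 minutes" taken as exactly 20

# Share of a full day's energy that the demonstration consumed.
energy_fraction = DEMO_KWH / FULL_DAY_KWH

# Reduction factors for energy and for wall-clock time.
energy_factor = FULL_DAY_KWH / DEMO_KWH
time_factor = DAY_MINUTES / DEMO_MINUTES

print(f"energy used: {energy_fraction:.1%} of a full day")
print(f"energy reduction: about {energy_factor:.0f}x")
print(f"time reduction: at least {time_factor:.0f}x")
```

Under these assumptions both factors come out somewhat below 100, which is in line with the release's rounded claims of "one percent" of the energy and a reduction of about two orders of magnitude.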
About Forschungszentrum Julich
Forschungszentrum Julich pursues cutting-edge interdisciplinary research on solving the grand challenges facing society in the fields of health, energy and the environment, and also information technologies. In combination with its two key competencies -- physics and supercomputing -- work at Julich focuses on long-term, fundamental and multidisciplinary contributions to science and technology as well as on specific technological applications. With a staff of about 4,400, Julich -- a member of the Helmholtz Association -- is one of the largest research centres in Europe. The Julich Supercomputing Centre regularly hosts world-leading supercomputers and supports a user community of over 200 science and research groups by developing algorithms, models, tools, and methods in a variety of fields of computational science and engineering. http://www.fz-juelich.de/jsc/.
About IBM and Analytics
For more information, visit http://www.ibm.com/press/us/en/presskit/27163.wss.
-----
Source: IBM