March 30, 2012
PITTSBURGH, PA, March 29 -- Times are changing for HPC (high-performance computing) research, as non-traditional fields of study have begun taking advantage of powerful HPC tools. This was part of the plan when the National Science Foundation’s XSEDE (Extreme Science and Engineering Discovery Environment) program launched in July 2011. In recent months, the program took big steps toward this objective, as a number of non-traditional projects — their common denominator being the need to process and analyze large amounts of data — were awarded peer-reviewed allocations of time on XSEDE resources.
“We’re happy to see these proposals succeed,” said Sergiu Sanielevici of the Pittsburgh Supercomputing Center (PSC), who leads XSEDE’s program in Novel and Innovative Projects (NIP). “As a brand new initiative in XSEDE, the NIP team proactively stimulates the development of strong projects that focus on interesting kinds of research that differ from the more typical simulation and modeling applications that have dominated HPC research in previous decades.”
The projects Sanielevici refers to involve:
• Creating searchable access to hand-written census data going back to the 1940s (Kenton McHenry, University of Illinois at Urbana-Champaign),
• Analyzing huge quantities of finance-trading data, the volume of which has rapidly increased beyond the computational power of prior approaches to trade-data research (Mao Ye, University of Illinois at Urbana-Champaign),
• Assembling DNA segments from fungi in soil to identify new enzymes that can cost-effectively convert plant material to biofuel (Mostafa Elshahed, Oklahoma State University),
• Simulating the World Wide Web to discern which of many proposed protocols for securing the Internet works best against various kinds of attacks (Sharon Goldberg, Boston University), and
• Applying sophisticated “machine learning” algorithms to discern meaning from huge amounts of online text data (Noah Smith, Carnegie Mellon University).
All of these non-traditional HPC projects sought and received large allocations on XSEDE’s Blacklight resource at PSC, an SGI® Altix® UV 1000 system partitioned into two connected 16-terabyte shared-memory machines, the two largest shared-memory systems in the world.
Shared-memory resources such as Blacklight present a large advantage for many data-intensive applications, says Sanielevici, because of “efficient fine-grained random access” — all of the system’s memory can be directly accessed from all of its processors, as opposed to distributed memory (in which each processor’s memory is directly accessed only by that processor). Because all processors share a single view of data, a shared-memory system is, relatively speaking, easy to program and use.
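To make the distinction concrete, here is a minimal C/OpenMP sketch (illustrative only, not code from any of the projects described here) in which every thread reads and writes one large array held in a single address space. The array size and values are arbitrary placeholders.

```c
#include <stdio.h>
#include <stdlib.h>
#include <omp.h>

#define N 100000000L  /* placeholder size; real data-intensive workloads are far larger */

int main(void) {
    /* One array in a single address space, visible to every thread. */
    double *data = malloc(N * sizeof(double));
    if (data == NULL) return 1;

    /* "Fine-grained random access": any thread may touch any element directly. */
    #pragma omp parallel for
    for (long i = 0; i < N; i++)
        data[i] = 0.5 * (double)i;

    /* A reduction over the whole array, again with no explicit communication. */
    double sum = 0.0;
    #pragma omp parallel for reduction(+:sum)
    for (long i = 0; i < N; i++)
        sum += data[i];

    printf("sum = %.1f (threads available: %d)\n", sum, omp_get_max_threads());
    free(data);
    return 0;
}
```

The same computation written for a distributed-memory cluster would partition the array across processes and exchange messages (with MPI, for example) whenever one process needed data owned by another, which is a large part of why shared memory is considered easier to program.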
Elshahed’s project also received allocations on XSEDE’s Forge system at the National Center for Supercomputing Applications, and Ye’s project also received an allocation on the new Gordon system at the San Diego Supercomputer Center. McHenry’s and Ye’s projects were recommended to receive extended collaborative support from XSEDE staff experts, subject to the formulation of project plans.
Blacklight has proven especially useful in genomics assembly projects such as Elshahed’s, as highlighted in a recent article in the bioinformatics publication GenomeWeb [http://psc.edu/publicinfo/pdf/GenomeWeb_BioInformatics_021312.pdf], which states that “genomics researchers might be hard pressed to do better . . . .”
Brian Couger of Oklahoma State, a Ph.D. candidate working with Elshahed, recently completed a “metagenomics” assembly of 3 billion DNA segments on Blacklight. He believes it is the largest metagenomics assembly accomplished to date, and says it could not have been done on systems other than Blacklight.
Mao Ye’s work with Blacklight has examined how lack of transparency in certain kinds of finance trading can skew the market. Because of the quantity of data involved, the problem is very difficult to analyze. He notes that it took several months for the U.S. Securities and Exchange Commission to analyze just two hours of trade data and that Blacklight has greatly improved the ability to produce timely analysis.
With Blacklight, Noah Smith’s research group at CMU has been able to “train” large-scale semantic models of natural language using millions to billions of words. Blacklight’s parallelism and shared memory allow Smith and his collaborators to apply much more powerful and complex algorithms than have previously been used on such large datasets.
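As a hypothetical illustration of why one large address space helps with this kind of text processing (this is not Smith's actual code or model), the sketch below has many threads scan a token stream and update a single shared co-occurrence table. The vocabulary size and token stream are toy placeholders.

```c
#include <stdio.h>

#define VOCAB 1000       /* toy vocabulary size */
#define NTOK  1000000    /* toy length of the token stream */

static long cooc[VOCAB][VOCAB];  /* one shared co-occurrence table, visible to all threads */

int main(void) {
    static int tokens[NTOK];

    /* Stand-in for a real tokenized corpus. */
    for (long i = 0; i < NTOK; i++)
        tokens[i] = (int)((i * 31) % VOCAB);

    /* Every thread can update any cell of the shared table directly;
       the atomic keeps concurrent increments from colliding. */
    #pragma omp parallel for
    for (long i = 0; i < NTOK - 1; i++) {
        int a = tokens[i];
        int b = tokens[i + 1];
        #pragma omp atomic
        cooc[a][b]++;
    }

    printf("cooc[0][31] = %ld\n", cooc[0][31]);
    return 0;
}
```

On a distributed-memory machine the table would have to be sharded across nodes and the partial counts merged with explicit communication, which becomes costly once the model no longer fits on a single node.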
XSEDE, the Extreme Science and Engineering Discovery Environment, is the most advanced, powerful, and robust collection of integrated digital resources and services in the world. It is a single virtual system that scientists and researchers can use to interactively share computing resources, data, and expertise. XSEDE integrates the resources and services, makes them easier to use, and helps more people use them. The five-year, $121 million project is supported by the National Science Foundation, and it replaces and expands on the NSF TeraGrid project.