The Leading Source for Global News and Information Covering the Ecosystem of High Productivity Computing
October 06, 2006
Tailored allocation management on NCSA's Tungsten cluster proves an important part of many researchers' workflows.
Once the allocations have been made and the highest-quality projects have been given set amounts of time, there are two straightforward ways of scheduling users on a supercomputer. One is egalitarian. A queuing system applies a set of rules -- based on the amount of time a particular job is going to take, how many processors are going to be used, and the like -- and puts people in line to wait their turn. The other is totalitarian. The decks are cleared for a big user, and he or she runs on a massive number of processors, perhaps the whole machine, for a long time.
Neither approach is ideal, and neither addresses more nuanced or immediate needs.
Take the case of the MILC collaboration, which studies quantum chromodynamics. In 2004, they received an allocation of four million CPU hours on NCSA's Tungsten cluster. By any reckoning, even for a collaboration that comprises researchers at nine institutions, that's a massive allocation of time.
Using those resources sensibly and efficiently requires human decisions and policies that are well tuned to the various ways researchers use the center's systems.
"Sitting down, going to talk to the users, and figuring out what they want. It's the only way to do this" when you have a broad variety of user needs, according to John Towns, who leads NCSA's Persistent Infrastructure Directorate. "'This doesn't work for me' is the last thing you want to hear."
A powerful machine is still important, and Tungsten is certainly that. It has a peak capability of more than 15 trillion calculations per second, making it the largest computer supported by the National Science Foundation and available for open scientific research.
A popular machine is also important, and Tungsten is that, too. In September 2005, about 162 million normalized units of computing time were allocated on Tungsten -- about 20 percent of the total parceled out by NSF across the nation. User requests for Tungsten were almost double that amount, far exceeding the time available. This made Tungsten the most requested and the most allocated system in NSF's arsenal in September.
"If you allocate this large and popular a resource in the traditional ways, somebody always suffers. People with large runs wait a long time in the queues or don't get to run at all because the queuing system is set up to handle a large number of smaller jobs. Or the smaller jobs get brushed aside in order to dedicate the machine to large jobs. It's a tough balance to strike," Towns says.
"Tungsten is a resource that satisfies specific needs of the user community. It's a critical part of their research workflow," NCSA Director Thom Dunning says. "That means we have to tailor allocations to suit them. We planned for this sort of approach when we installed Tungsten, and the popularity and productivity among users really showed us that it was the right way to go."