The Week in Review
10 words and a link
Oracle-Sun passes US anti-trust, Europe still pondering
Platform crams in HP MPI and developers alongside Scali MPI
IEEE Hot Chips 21 preview
Iowa State turns on 28 TFLOPS Sun system
RISC alive and kicking at Sun and Big Blue
NASA upgrades iDataPlex for climate simulation
Amazon’s new virtual private cloud starts to address security concerns
Intel and AMD gaining traction in storage
NVIDIA’s CUDA Superhero Challenge
New 80M (AUD) HPC project expected boost in SKA bid
SGI rings the bell
NOAA outsources to Oak Ridge via Recovery Act funds
The Knoxville News Sentinel ran a story this week on NOAA’s recent decision to outsource some of its computational work to the government’s HPC powerhouse, the DoE, and specifically to the computing facilities at Oak Ridge National Lab:
The National Oceanic and Atmospheric Administration will provide $215 million to Oak Ridge National Laboratory over the next five years to support climate research, further bolstering ORNL’s role as a U.S. hub for broad-based work on global climate change.
Thomas Zacharia, ORNL’s deputy lab director for science and technology, said Oak Ridge is becoming a go-to place where agencies engaged in climate studies can work together to leverage their assets and get the most out of their resources.
NOAA will be using ORNL’s computers, but Oak Ridge will also be growing its computational staff to better support the expanded mission. And a bonus: if you are following the ARRA’s impact on HPC, you just found another item for your bingo card:
The new agreement between NOAA and the Department of Energy already has provided $73.5 million in American Recovery and Reinvestment Act money to ORNL, with similar amounts to follow over the next four years, he said. The lab expects to hire 25 to 50 additional climate researchers as part of the expanded effort, he said.
I actually think this kind of shift makes sense. It is extremely difficult in the federal government to get the funds and approvals (which in some cases extend all the way to a literal act of Congress) to maintain large-scale compute infrastructure. If the infrastructure itself isn’t part of an agency’s core mission, then paying someone else to run it for you may make good sense. This is a logical first step on the path toward hosted computing, since it at least keeps the data and apps within the federal government.
Platform teams with NVIDIA for GPU management goodness
Platform has announced a new agreement with NVIDIA that will make Platform’s cluster provisioning and management tools GPU aware. From the release:
Platform Computing, the leader in cluster, grid and cloud computing software, announced that it is providing new GPU kits for its Platform Cluster Manager and Platform HPC Workgroup products to support NVIDIA’s CUDA-enabled GPUs, including NVIDIA’s market leading Tesla GPUs for high performance computing.
…Platform’s Cluster Manager and HPC Workgroup Manager simplify building, managing and scheduling GPU based clusters. Deployed with a single-click install, Platform Computing’s GPU kits allow users to quickly provision the clusters they need using NVIDIA’s solutions.
It’s actually not at all clear from the press release what this is really about. The Register’s coverage of the announcement is a little more revealing:
The move will allow Platform Computing’s Load Sharing Facility (LSF), the backbone of its open and closed source products, to dispatch applications to nVidia’s Tesla GPU co-processors much as it dispatches work to regular CPUs inside HPC clusters.
The LSF tool provisions and monitors workloads running on clusters, and can do so down to the CPU core level on server nodes and down to the GPU level on the co-processors. This is thanks to the integration of the CUDA development kit with the Platform cluster manager products.
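To make the idea of dispatching work "down to the GPU level" a bit more concrete, here is a toy sketch of a scheduler that treats GPUs as just another countable resource next to CPU cores. This is not LSF's algorithm or API; the `Node`, `Job`, and `dispatch` names, the node names, and the first-fit policy are all assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """A cluster node with a count of free CPU cores and free GPUs (hypothetical model)."""
    name: str
    free_cores: int
    free_gpus: int = 0


@dataclass
class Job:
    """A job requesting some number of CPU cores and, optionally, GPUs."""
    name: str
    cores: int = 1
    gpus: int = 0


def dispatch(jobs, nodes):
    """First-fit placement: map each job name to a node name, or None if nothing fits."""
    placement = {}
    for job in jobs:
        placement[job.name] = None
        for node in nodes:
            # A node qualifies only if it can satisfy both the core and GPU request.
            if node.free_cores >= job.cores and node.free_gpus >= job.gpus:
                node.free_cores -= job.cores
                node.free_gpus -= job.gpus
                placement[job.name] = node.name
                break
    return placement


# Illustrative cluster: one CPU-only node and one node with two GPUs.
nodes = [Node("cpu01", free_cores=8), Node("gpu01", free_cores=8, free_gpus=2)]
jobs = [
    Job("mpi_solver", cores=8),
    Job("cuda_kernel", cores=1, gpus=1),
    Job("cuda_big", cores=2, gpus=2),
]
placement = dispatch(jobs, nodes)
print(placement)
```

In this sketch the CPU-only job lands on the CPU node, the first GPU job lands on the GPU node, and the second GPU job stays unplaced because only one GPU remains free. A production scheduler would queue that job and layer priorities, fairness, and monitoring on top.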
EnterpriseStorageForum.com posted an introduction to clustered file systems last week that you may want to peruse if storage is a source of fear and confusion for you. I’m in a twelve-step program to get over my own storage fears.
Many options exist for setting up clustered and highly available data storage, but figuring out what each option does will take a bit of research. Your choice of storage architecture as well as file system is critical, as most have severe limitations that require careful design workarounds.
In this article we will cover a few common physical storage configurations, as well as clustered and distributed file system options. Hopefully, this is a good starting point to begin looking into the technology that will work best for your high availability storage needs.