Since 1986 - Covering the Fastest Computers in the World and the People Who Run Them

October 11, 2013

TACC Spurs Data-Intensive Science with Corral

Tiffany Trader

TACC, the Texas Advanced Computing Center, knows all about big data. As a leading center of computational excellence in the United States, TACC relies on advanced computing technologies to enable discoveries that advance science and society. Of course, all the data that is generated requires a repository – that’s where Corral comes in. The large-scale data repository was deployed in 2009 to support the storing and sharing of research data at the University of Texas.

A recent article on TACC’s website highlights an important milestone for Corral. The DataDirect Networks storage system recently crossed the one petabyte mark in total data stored, and it now hosts over 100 unique data collections. The diverse assortment of datasets ranges from measurements of Earth’s gravity field to whale songs to mass spectrometry data, according to the piece by science writer Aaron Dubrow.

Usage of the system continues to climb. For the last six months, usage has increased 10 percent per month.

“We’ve seen ever-increasing growth in the number and diversity of collections on Corral over the past several years,” said Chris Jordan, manager of the data management and collections group at TACC. “This shows how important a resource dedicated to data collections is to modern research practices, both for the researchers who are creating data and the worldwide community of researchers who use public data collections to further their own research.”

Corral is not the only storage system at TACC, but it is unique for hosting large collections that are actively serving the community. TACC’s 100-petabyte Ranch tape archive serves as a long-term repository for archived work. The site’s newest petascale supercomputer, Stampede, includes more than 15 petabytes of dedicated storage, and a scalable global file system adds another 20 petabytes. Both are used for short-term data retention to support ongoing simulations and analyses.

Corral, which has a current raw capacity of six petabytes, was designed and optimized to support complex large-scale collections and a collaborative research environment. With a high-speed connection to TACC’s other advanced computing systems, scientists can easily share data and results.

According to Niall Gaffney, TACC’s Director of Data Intensive Computing, “Corral is leading the way in the preservation and dissemination of data for researchers who are discovering that global, on-demand access to large quantities of data leads to previously unachievable results.”
