While there’s been a lot of activity around the coming crop of “exascale-relevant” supercomputers, the HPC landscape is also shifting to become more data-aware. Perhaps no system reflects this transition better than Wrangler, the I/O-optimized open science system from Dell and EMC that debuted earlier this month at the Texas Advanced Computing Center (TACC).
In a presentation from the 2015 HPC User Forum, TACC Executive Director Dan Stanzione shares the drivers behind Wrangler’s unique architecture and describes how it will dovetail with TACC’s other resources, including the much larger Stampede supercomputer, to accommodate new users and new use cases.
“Wrangler is a rethink of how we build clusters for these ‘big data’ applications, which of course isn’t one problem but a huge family of problems that are somewhat related,” Stanzione relates. “As we all know, HPC and big data have a lot in common, and we can solve a lot of problems, but we can’t do every one of them, so we thought about what were the gaps in our hardware capability.”
Stanzione says the really interesting part of Wrangler is its flash storage technology. Wrangler’s rack-scale flash tier, built on technology from DSSD (acquired by EMC last year), is on track to deliver 1 TB/s of bandwidth and 250 million IOPS, roughly six times the I/O throughput of TACC’s flagship Stampede supercomputer. This is not an array of SSDs, but rather a single large pool of NAND flash dies, nearly 100,000 of them.
Funding for Wrangler was provided through a grant from the National Science Foundation (NSF) and the machine is currently in early operations mode with a half-dozen projects using it. A few more are added each week and Stanzione expects it to enter full production status soon. Other partners include the Indiana University Pervasive Technology Institute and the University of Chicago.
Check out the presentation (under 25 minutes) to learn about the users and use cases Wrangler is enabling, and the performance gains the DSSD technology provides over both spinning disk and SSD.