Researchers are finding innovative uses for Gordon, the 285-teraflop supercomputer housed at the San Diego Supercomputer Center (SDSC) that has a unique Flash-based storage system. Since the system went online, researchers have put its incredibly fast I/O to use on a wide variety of workloads, ranging from chemistry to political science.
When it went online on the campus of the University of California, San Diego at the beginning of 2012, the 16,160-core Gordon was the world’s first large-scale deployment of Flash-based solid state disks (SSDs). With 300 TB of Intel 710 Series SSDs hooked up to Xeon E5 processors and a QDR InfiniBand interconnect, the GreenBlade system from Cray represented a test bed of sorts for a new breed of supercomputers.
Since then, researchers have found all kinds of uses for Gordon. According to this story, the HPC system has been used on more than 300 projects through the first half of 2013. This includes traditional HPC workloads in the scientific and engineering fields. But more interestingly, Gordon has been employed in some non-traditional applications for a supercomputer, including political science, mathematical anthropology, finance, and the cinematic arts.
For example, Gordon has worked to categorize huge media archives as part of the Large Scale Video Analytics (LSVA) project. The project, which is a collaboration among cinema scholars, digital humanists, and computational scientists at various organizations, utilizes a modified content management system and several algorithms to help researchers find, index, and tag videos.
“Contemporary culture is awash in moving images,” the project’s lead investigator, Virginia Kuhn, tells Scientific Computing. “There is more video uploaded to YouTube in a day than a single person can ever view in a lifetime. As such, one must ask what the implications are when it comes to the issues of identity, memory, history, or politics.”
Another of Gordon’s projects involved marrying its fast I/O with the Hadoop file system and the MapReduce engine. Yoav Freund, a computer science and engineering professor at UCSD, had his graduate students come up with problems to run on Gordon, such as analyzing data from temperature and airflow sensors on the university’s campus or predicting medical costs associated with vehicle accidents.
Having Hadoop loaded onto Gordon’s 64 I/O nodes gave the noted data scientist an insight into how HPC workloads may run in the future. “Given the large data movement requirements and the need for very rapid turnaround, it would not have been feasible for the students to work through a standard batch queue,” Freund told the publication. “Having access to Flash storage greatly reduced the time for random data access.”
By the way, Gordon is still ranked on the TOP500 list, but it dropped from number 88 on the November 2012 list to number 102 on the list that was published Monday. It debuted on the list in November 2011 at number 48.