July 10, 2014

SDSC Launches Workflows for Data Science Center

Tiffany Trader

The San Diego Supercomputer Center (SDSC) at the University of California, San Diego, has created a new “center of excellence” focused on helping researchers manage research data more effectively. Called the Workflows for Data Science Center of Excellence, or WorDS Center for short, the new center aims to provide a common resource for developing and validating scientific workflows related to data ingestion, preparation, integration, analysis, visualization, and dissemination.

As explained on the project’s website, the new center builds on more than a decade of experience creating workflows for computational science, data science, and engineering as they intersect with distributed computing, big data analysis, and reproducible science.

“The WorDS Center’s purpose is to allow scientists to focus on their specific areas of research rather than having to solve workflow issues, or the computational challenges that arise as data analysis progresses from task to task,” said Ilkay Altintas, SDSC’s deputy coordinator for research, director of SDSC’s Scientific Workflow Automation Technologies Laboratory, and director of the new WorDS Center. “The amount of potentially valuable information buried in what is commonly known as ‘Big Data’ is of interest to numerous data science applications, and big data workflows have been an active area of research ever since the introduction of scientific workflows in the early 2000s.”

Some of the services and expertise offered by the new Center of Excellence include:

  • Researchers and developers well-versed in data science and scientific computing technologies.
  • Workflow management technologies that resulted in the collaborative development of the Kepler Scientific Workflow System.
  • Development of data science workflow applications through a combination of tools, technologies, and best practices.
  • Hands-on consulting on workflow technologies for big data and cloud systems, e.g., MapReduce, Hadoop, YARN, and Cascading (see the sketch after this list).
  • Technology briefings and applied classes on end-to-end support for data science.
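To make the “task to task” idea concrete, here is a minimal sketch of the kind of pipeline the center describes, chaining ingestion, preparation, analysis, and dissemination stages in plain Python. The stage names and sample data are hypothetical illustrations, not code from Kepler, Hadoop, or any WorDS tool.

# Minimal sketch of a task-to-task data pipeline of the kind described above
# (ingestion -> preparation -> analysis -> dissemination). Stage names and
# sample data are hypothetical; this is not Kepler or Hadoop code.
import json
import statistics


def ingest():
    # Stand-in for pulling records from an instrument, archive, or stream.
    return ["3.1", "2.7", "bad", "4.4", "3.9"]


def prepare(raw):
    # Drop records that cannot be parsed and convert the rest to floats.
    clean = []
    for value in raw:
        try:
            clean.append(float(value))
        except ValueError:
            continue
    return clean


def analyze(values):
    # A deliberately simple analysis step: summary statistics.
    return {"n": len(values),
            "mean": statistics.mean(values),
            "stdev": statistics.stdev(values)}


def disseminate(result):
    # Publish the result; here, just serialize it to JSON on stdout.
    print(json.dumps(result, indent=2))


if __name__ == "__main__":
    # Each stage consumes the previous stage's output; a workflow system such
    # as those the WorDS Center supports handles this plumbing, scheduling,
    # and provenance tracking so scientists can focus on the analysis itself.
    disseminate(analyze(prepare(ingest())))

The point of the sketch is the structure, not the code: each stage has a single, well-defined input and output, which is what lets a workflow system schedule, distribute, and reproduce the run without the researcher managing those details by hand.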

“The age of data-enabled science is upon us,” states SDSC Director Michael Norman, “and it’s here to stay.” In light of this, effective and efficient workflows are more important than ever. In practically every domain, insight and discovery are hiding in reams of data. Now that the technology for analyzing and digesting such heterogeneous data exists, the systems for sharing and distributing it must also be developed.
