Biomolecule structure prediction has long been challenging, not least because the relevant software and workflows often require high-end HPC systems that many bioscience researchers cannot easily access. One bioscience gateway, ROSIE, has been established as part of XSEDE (Extreme Science and Engineering Discovery Environment) to expand access to the popular Rosetta suite of prediction software. So far 5,000 users have run more than 30,000 jobs, and ROSIE organizers hope recent additions will further expand use.
A fundamental issue here is that bioscience researchers often face twin hurdles: limited computational expertise and limited access to HPC. ROSIE – the Rosetta Online Server that Includes Everyone (quite the name) – lets researchers run their jobs through a straightforward interface, without necessarily knowing the work is being done on supercomputer resources such as TACC’s Stampede. The idea isn’t brand new. ROSIE is the latest outgrowth of the RosettaCommons.
An account of ROSIE’s expansion, “Rosetta Modeling Software and the ROSIE Science Gateway,” is posted on the TACC site.
Structure prediction is fundamental to much of bioscience research. Think of biomolecules as expert contortionists whose shape critically influences their function. For example, the 3D shape of a protein is critical to its function and is determined by the sequence of its constituent amino acids; however, predicting the shape from the amino acid sequence is (still) challenging and computationally intensive. The same can be said for many classes of biomolecules.
“One of the most widely used such [structure prediction] programs is Rosetta. Originally developed as a structure prediction tool more than 17 years ago in the laboratory of David Baker at the University of Washington, Rosetta has been adapted to solve a wide range of common computational macromolecular problems. It has enabled notable scientific advances in computational biology, including protein design, enzyme design, ligand docking, and structure predictions for biological macromolecules and macromolecular complexes,” according to the TACC article.
“The structure prediction problem is to take a sequence and ask, ‘What does it look like?’” said Jeffrey Gray, a professor of Chemical and Biomolecular Engineering at Johns Hopkins University and a collaborator on the project. “The design problem asks ‘What sequence would fold into this structure?’ That’s at the heart of Rosetta, but Rosetta does a lot of other things,” Gray said. Over the years, Rosetta evolved from a single tool, to a collection of tools, to a large collaboration called RosettaCommons, which includes more than 50 government laboratories, institutes, and research centers (only nonprofits).
Gray had used TACC resources as a graduate student in Texas in the late 1990s, so he knew about TACC and some of the other NSF supercomputing facilities. “We’ve been using Stampede and applied for it through XSEDE,” Gray said. “We have a Stampede allocation for my lab and we have a separate allocation for ROSIE.”
First described in PLOS One in May 2013, ROSIE continues to add new elements. In January 2017, a team of researchers, including Gray, reported in Nature Protocols on the latest additions to the gateway: antibody modeling and docking tools called RosettaAntibody and SnugDock that can run fully automated via the ROSIE web server or manually, with user control, on a personal computer or cluster.
Link to TACC article: https://www.tacc.utexas.edu/-/rosetta-modeling-software-and-the-rosie-science-gateway