Researchers at the Johns Hopkins Bloomberg School of Public Health have developed new software that vastly improves how quickly and cost-effectively scientists can analyze data from RNA sequencing projects. While there is clear value in extending RNA sequencing capabilities to examine genetic links to diseases and to assist in detection and drug development, the process is generally lengthy and expensive because of the vast amount of data involved.
The software, termed “Myrna,” was funded in part by Amazon Web Services (in addition to the Bloomberg School of Public Health and the National Institutes of Health) and, not surprisingly, makes use of compute resources from Amazon. To test Myrna, the researchers rented compute time and storage from AWS and reported solid performance and cost savings. According to the study’s authors, “Myrna calculated differential expression from 1.1 billion RNA sequences reads in less than two hours at a cost of about $66.”
Jeffrey T. Leek, lead researcher of the study, which was just published in the journal Genome Biology, noted, “Biological data in many experiments—from brain images to genomic sequences—can now be generated so quickly that it often takes many computers working simultaneously to perform statistical analyses.” With this rush of data, Leek says, the cloud opens more possibilities and gives researchers more open access, since they can focus on their research rather than on the hassles of data center operation.
As Leek stated, “the cloud computing approach we developed for Myrna is one way that statisticians can quickly build different models to find the relevant patterns in sequencing data and connect them to different diseases.” He suggests that while Myrna was designed to analyze next-generation sequencing reads, using the cloud in conjunction with statistical modeling could carry over to other fields that generate large amounts of data.