How Bioscientists Tackle Data Overload to Advance Medical Research

By Juan Caballero, Ph.D., Chief Science Officer, Databiology

June 18, 2019

The International Center for Scientific Debate explains that the ability to better collect, store, organize, integrate, analyze and share biomedical data provides opportunities to advance the detection, diagnosis, treatment and prevention of disease.1

Yet the greatest challenge bioscientists face is how to handle the flood of information coming from various sources and from every instrument that assesses patient health.

Databiology has created a biomedical information management and orchestration platform for the life sciences and healthcare sectors that helps researchers tap into many different data sources.2

Enabling faster, more effective medical research

When a research team creates a workspace on Databiology’s platform on the IBM Cloud, members can load any type of biomedical data and perform intensive processing tasks requiring high-performance compute power.

The platform functions as a central data and analysis management hub for conducting end-to-end biomedical research. By provisioning the technology required to manage large and complex data assets, Databiology enables clients to perform faster and more cost-effective research.

Additionally, the platform can take on any third-party application stack and orchestrate it to run in any number of connected compute environments. The platform includes an app store with more than 250 biomedical analytics and visualization applications. If researchers need an application that isn’t in the app store, they can rapidly add their own using Databiology’s CIAO application onboarding framework.3

Tailoring research with hybrid cloud capabilities

The Databiology platform can be deployed either on premises or on different clouds, and it can use IBM Aspera to quickly transport terabyte-sized biomedical data sets from disparate locations to the workspace.

Databiology has two offerings to fit the needs of different biomedical companies.

  • The Databiology for Enterprise platform is integrated with IBM Power Systems, IBM Spectrum LSF, and IBM Spectrum Scale to enhance workload, resource and data lifecycle management in the cloud, on- and off-premises, and in hybrid models. IBM Power Systems servers are built on a flexible, open platform, and the processor is designed for big data workloads. Power Systems servers combine computing power, memory bandwidth and I/O in ways that are easier to consume and manage, and they provide high resiliency, availability and security features. IBM Spectrum Scale provides world-class storage management with extreme scalability, flash-accelerated performance, and automatic policy-based storage tiering from flash through disk to tape. IBM Spectrum LSF provides a highly scalable and reliable resource-aware workload management platform that supports demanding, distributed and mission-critical high-performance computing (HPC) environments while offering an enhanced user and administrator experience.
  • Databiology Lab runs exclusively on IBM Cloud. The secure, high-performance cloud offers dynamic burst capabilities for intense compute requirements. Databiology Lab is designed for smaller teams or academic use, or for larger customers to try out the capability of the platform before they decide to go with the Enterprise platform.

 

Automatically securing the provenance trail and capturing scientific insights

The Databiology platform makes research more efficient by automatically capturing all the metadata about each scientific analysis. The platform maintains a sophisticated knowledge graph, so that an analysis can be reliably reproduced later, if needed, with the same software, the same data and the same environment. Users are also able to understand how different items of data are related to each other.

For pharma company customers, this provenance graph is hugely important. For example, if they’ve developed a product that went through regulatory approval and, years later, discover issues, they’ve got to be able to demonstrate exactly how they derived certain insights.

In academia, push-button reproducibility of the scientific process is becoming increasingly important because of how much poorly reproducible science is out there and how many papers ultimately cannot be verified independently.

Researchers know exactly how results were derived from the multitude of pieces of data and by which process. This drives data interoperability and reuse, which is something every enterprise is after today.
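
To make the idea of a provenance trail concrete, here is a minimal sketch in Python. It is purely illustrative and is not Databiology's implementation or API: the names ProvenanceGraph, Step, fingerprint, record and lineage are hypothetical, as are the tool names, versions and environment labels in the usage lines. The sketch assumes each piece of data is identified by a content hash and that every analysis run records its tool, version, environment, inputs and output.

```python
import hashlib
from dataclasses import dataclass, field


def fingerprint(content: bytes) -> str:
    """Identify a piece of data by the SHA-256 digest of its content."""
    return hashlib.sha256(content).hexdigest()


@dataclass(frozen=True)
class Step:
    """One analysis run: which tool, version and environment produced an output."""
    tool: str
    version: str
    environment: str
    inputs: tuple  # fingerprints of the data the step consumed
    output: str    # fingerprint of the data the step produced


@dataclass
class ProvenanceGraph:
    """Hypothetical in-memory provenance store (illustration only)."""
    # Maps the fingerprint of each derived item to the step that produced it.
    steps: dict = field(default_factory=dict)

    def record(self, tool, version, environment, inputs, output_content):
        """Register a run and return the fingerprint of its output."""
        out = fingerprint(output_content)
        self.steps[out] = Step(tool, version, environment, tuple(inputs), out)
        return out

    def lineage(self, item):
        """Return the steps that led to `item`, earliest first."""
        ordered = []
        seen = set()

        def visit(fp):
            step = self.steps.get(fp)
            if step is None or fp in seen:
                return  # raw input with no recorded step, or already handled
            seen.add(fp)
            for parent in step.inputs:
                visit(parent)
            ordered.append(step)

        visit(item)
        return ordered


if __name__ == "__main__":
    graph = ProvenanceGraph()
    raw = fingerprint(b"raw sequencing reads")
    aligned = graph.record("aligner", "1.0", "container-a", [raw], b"aligned reads")
    variants = graph.record("caller", "2.3", "container-a", [aligned], b"variant calls")
    for step in graph.lineage(variants):
        print(step.tool, step.version, step.environment)
```

Walking the lineage of a result returns every recorded step behind it, which is the kind of question a regulator or an independent reviewer would ask: which software, in which version and environment, turned which inputs into this output.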

Learn more about how IBM supports the healthcare and life sciences industry.


References:

  1. https://www.bdebate.org/en/forum/big-data-biomedicine-challenges-and-opportunities
  2. https://databiology.com/
  3. http://www.databiology.com/index.php/developers
HPCwire