The Covid-19 pandemic has profoundly changed the world. Remote work has become the norm, and we now look differently at personal health and at the way we work, live, play and do business. The use of AI for drug discovery has also accelerated in the post-Covid-19 era.
Today, drug discovery is an expensive proposition, with a cost of $2.6 billion over 10 years and a success rate of just 12%. AI promises to significantly improve on these figures, and innovative startups are attempting to change the landscape with AI/ML. At the forefront is Atomwise, with its AtomNet® platform, which has succeeded in finding small molecule hits for more undruggable targets than any other AI drug discovery platform.
In this blog, we will lay out the challenges of high cost, long cycle time and low success rate faced during the drug discovery process, and show how AI/ML startups have stepped up to solve these challenges using best-of-breed technology solutions from Atomwise, AWS, and WEKA.
Best-of-breed technology stack with the AtomNet® platform
Atomwise’s AtomNet® is built on best-in-class engineering architecture and tools, with WEKA and AWS as key technology partners. The AtomNet® platform enables the massive scale and speed needed to create a deep and broad pipeline of drugs to improve human health. The platform leverages Convolutional Neural Nets (CNNs), which apply deep learning in three dimensions to the molecular recognition problem. In many ways, it’s the same approach as deep learning for image recognition. Instead of learning low-level image features, the networks learn low-level features of 3D molecular interactions and associate them into higher-order concepts that explain and predict important labels like binding affinity to a particular protein. This AI-based approach is then used for drug discovery or for precision medicine, targeting diseases such as cancer and Covid-19.
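To make the image-recognition analogy concrete, here is a minimal sketch in plain Python of the two ideas involved: placing atoms on a 3D grid, then sliding a small 3D kernel over that grid the way a convolutional layer does. It is an illustration only and not AtomNet® code. The function names `voxelize` and `conv3d`, the toy atom coordinates, and the all-ones kernel are assumptions made for this example; a real network would learn its kernel weights from data.

```python
from itertools import product


def voxelize(atoms, size, resolution=1.0):
    """Count atoms per voxel on a size x size x size grid (toy occupancy map)."""
    grid = [[[0.0] * size for _ in range(size)] for _ in range(size)]
    for x, y, z in atoms:
        i, j, k = (int(c // resolution) for c in (x, y, z))
        # Atoms that fall outside the grid are ignored.
        if all(0 <= v < size for v in (i, j, k)):
            grid[i][j][k] += 1.0
    return grid


def conv3d(grid, kernel):
    """Valid (no padding) 3D cross-correlation, the core op of a 3D CNN layer."""
    n, m = len(grid), len(kernel)
    out_size = n - m + 1
    out = [[[0.0] * out_size for _ in range(out_size)] for _ in range(out_size)]
    for i, j, k in product(range(out_size), repeat=3):
        out[i][j][k] = sum(
            grid[i + a][j + b][k + c] * kernel[a][b][c]
            for a, b, c in product(range(m), repeat=3)
        )
    return out


# Hypothetical toy input: three atom positions (arbitrary units).
atoms = [(0.5, 0.5, 0.5), (1.5, 0.5, 0.5), (3.2, 3.1, 3.9)]
grid = voxelize(atoms, size=4)

# An all-ones 2x2x2 kernel simply counts atoms in each 2x2x2 neighbourhood.
kernel = [[[1.0] * 2 for _ in range(2)] for _ in range(2)]
features = conv3d(grid, kernel)
```

Each entry of `features` summarizes a small 3D neighbourhood, which plays the role that an edge or texture detector plays in a 2D image network. Stacking such layers is what lets a model combine local interaction patterns into higher-order concepts.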
The Data Challenge
Small molecule drug discovery is very data intensive. The process takes around 4,000 different protein structures and over 3 million molecular compounds, and runs over 15 million experiments. This equates to importing data from 15 million source databases and running ETL (Extract, Transform, Load) to generate around 30 million small files used for training Convolutional Neural Net (CNN) models. These models learn low-level features of 3D molecular interactions and associate them into higher-order concepts that explain and predict important labels like binding affinity to a particular protein, which can then be used to treat a disease.
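The ETL step described above can be sketched as follows, to show how experiment records turn into a very large number of very small files. This is a hedged illustration, not the Atomwise pipeline: the record fields (`protein_id`, `molecule_id`, `binding_affinity`), the JSON output format, and the one-file-per-experiment layout are all assumptions made for this example.

```python
import json
import tempfile
from pathlib import Path


def extract(rows):
    """Extract: yield raw experiment rows (stand-in for a source database query)."""
    yield from rows


def transform(row):
    """Transform: keep only the fields a training job needs (hypothetical schema)."""
    return {
        "protein": row["protein_id"],
        "molecule": row["molecule_id"],
        "label": float(row["binding_affinity"]),
    }


def load(records, out_dir):
    """Load: write one small JSON file per experiment and return the file count."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    count = 0
    for count, rec in enumerate(records, start=1):
        path = out_dir / f"{rec['protein']}_{rec['molecule']}.json"
        path.write_text(json.dumps(rec))
    return count


# Hypothetical toy rows; real inputs would come from the source databases.
rows = [
    {"protein_id": "P1", "molecule_id": "M1", "binding_affinity": "7.2"},
    {"protein_id": "P1", "molecule_id": "M2", "binding_affinity": "5.9"},
    {"protein_id": "P2", "molecule_id": "M1", "binding_affinity": "6.4"},
]

with tempfile.TemporaryDirectory() as tmp:
    n_files = load((transform(r) for r in extract(rows)), tmp)
    sizes = [p.stat().st_size for p in Path(tmp).iterdir()]
```

Each output file here is well under a kilobyte. At the scale quoted above, tens of millions of files this small make metadata operations and small-file read throughput, rather than raw capacity, the likely pressure point for the storage layer feeding the training jobs.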
Read the full blog to learn more about AI-based drug discovery with Atomwise and WEKA Data Platform.
Reminder: You can learn a lot from AWS HPC engineers by subscribing to the HPC Tech Short YouTube channel, and following the AWS HPC Blog channel.