Around the world, supercomputing centers have spun up and opened their doors for COVID-19 research in what may be the most unified supercomputing effort in history. Now, a new competition from the Joint European Disruptive Initiative (JEDI) is poised to raise the bar even higher, aiming to recruit up to a hundred teams to crunch billions of molecules in the hunt for a COVID-19 therapeutic – and offering millions of euros in prizes. In an interview with HPCwire, JEDI’s founder, André Loesekrug-Pietri, spoke about the structure and goals of the ambitious, supercomputing-powered challenge.
JEDI, a foundation that aims to be the “European DARPA” and a “moonshot factory,” typically looks to the future, focusing on longer-term projects that are years away and haven’t received funding or scientific attention commensurate with their social impact. But with the advent of COVID-19, JEDI found itself working distinctly in the present – and hunting for a way to add value in a crowded research field.
“A couple of weeks ago, we brought together all those people who are involved in the broader healthcare sector, trying to understand: okay, what could be our added value in this global crisis?” Loesekrug-Pietri said. The experts indicated that research was focusing too much on individual molecules – such as hydroxychloroquine – and there, JEDI saw an opening.
“Why don’t we use the capacity that HPC is giving us today? And why don’t we, on top of that, bring in people coming from ML and artificial intelligence to try to optimize these calculations?” Loesekrug-Pietri said. “And so we framed a challenge around: can we screen, to a level unprecedented before, … for an interaction either destructive or ameliorating [to] the coronavirus?”
The Billion Molecules Against COVID-19 Grand Challenge
It’s a catchy headline: a billion molecules. The name, however, might actually be underselling the ambition of the competition. “Every team needs to come up with a billion molecules,” Loesekrug-Pietri explained. In the first stage of the challenge, each of the teams (Loesekrug-Pietri expects that around 50 to 100 teams will have the capacity to compete) will be tasked with screening its billion molecules for their binding affinity to the coronavirus, using three different screening methods. The objective: to identify molecules with strong binding potential (within 100 nanomolar) that can advance to the second stage of the challenge.
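To give a sense of what a 100-nanomolar cutoff means in practice, here is a minimal Python sketch. It assumes the threshold refers to a predicted dissociation constant (Kd), which the challenge description does not spell out, and it uses the standard thermodynamic relation ΔG = RT ln(Kd) at room temperature. The molecule names and predicted values are hypothetical.

```python
import math

R_KCAL_PER_MOL_K = 1.987204e-3  # gas constant in kcal/(mol*K)


def kd_to_delta_g(kd_molar, temperature_k=298.15):
    """Binding free energy in kcal/mol from a dissociation constant in mol/L.

    Uses dG = R * T * ln(Kd / 1 M); a more negative value means tighter binding.
    """
    return R_KCAL_PER_MOL_K * temperature_k * math.log(kd_molar)


def passes_cutoff(predicted_kd_molar, cutoff_molar=100e-9):
    """True if the predicted Kd is at or below the cutoff (assumed: 100 nM)."""
    return predicted_kd_molar <= cutoff_molar


# Hypothetical predicted Kd values (mol/L) for three made-up molecules.
predictions = {"mol_A": 20e-9, "mol_B": 5e-6, "mol_C": 100e-9}

hits = sorted(name for name, kd in predictions.items() if passes_cutoff(kd))
cutoff_delta_g = kd_to_delta_g(100e-9)  # roughly -9.55 kcal/mol at 298.15 K
```

Under these assumptions, a molecule would need a predicted binding free energy of about -9.5 kcal/mol or better to make the cut.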
“The uniqueness here, also a little bit inspired by climate models, is not just to have everybody come up with their own solution, but requesting that all teams come up with three different methods to screen these molecules on their binding affinity,” Loesekrug-Pietri said. Between the first and second stages of the competition, JEDI will cross-correlate the results produced by the many teams and their many approaches, distilling them into a so-called “ultimate list.” “By cross-correlating these methodologies, you basically leverage out biases or errors,” Loesekrug-Pietri said, explaining that most researchers don’t cross-correlate their results internally – let alone with international teams using radically different methods.
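JEDI has not published the procedure it will use to cross-correlate results, but the general idea of consensus across independent methods can be sketched in a few lines of Python. The version below is only an illustration under stated assumptions: each method submits a ranked list of molecule IDs, a molecule is kept if it appears in the top `top_k` of at least `min_votes` methods, and survivors are ordered by vote count and then by mean rank. The function name, parameters, and method labels are all hypothetical.

```python
from collections import defaultdict


def consensus_list(rankings, top_k, min_votes):
    """Combine ranked lists from several screening methods.

    rankings: dict mapping a method name to a list of molecule IDs, best first
              (each list is assumed to contain no duplicates).
    Returns molecule IDs that appear in the top_k of at least min_votes
    methods, ordered by most votes, then lowest mean rank, then ID.
    """
    votes = defaultdict(int)
    rank_sum = defaultdict(int)
    for ranked in rankings.values():
        for position, mol in enumerate(ranked[:top_k]):
            votes[mol] += 1
            rank_sum[mol] += position
    kept = [mol for mol, count in votes.items() if count >= min_votes]
    return sorted(kept, key=lambda m: (-votes[m], rank_sum[m] / votes[m], m))


# Hypothetical output of three methods run on the same library.
rankings = {
    "docking": ["m1", "m2", "m3", "m4"],
    "ml_model": ["m2", "m1", "m5", "m3"],
    "free_energy": ["m2", "m6", "m1", "m7"],
}

ultimate = consensus_list(rankings, top_k=3, min_votes=2)
```

In this toy case only `m1` and `m2` are backed by more than one method, so a candidate favored by a single approach (and possibly by that approach's bias) drops out.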
The second stage, Loesekrug-Pietri said, is all about reducing viral load, with a target reduction of 99 percent. “We will ask the teams again to come up with very creative virology calculation methodologies using predictive algorithms to be able to pinpoint which of the compounds they want to test in terms of viral discharge,” he said. “We are then going to synthesize these ultimate compounds to go to stage two in order to really test. Because otherwise, you remain very theoretical, which is a really great step, but then you need to test it on real molecules.” The most promising candidate molecules in the ultimate list will be synthesized – if possible – and their potential to reduce viral load will be tested in the real world. “If you have affinity plus viral discharge,” Loesekrug-Pietri said, “then you are up to something really powerful.”
The third and final stage will focus on testing existing real-world therapeutics. “Stage three is basically one and two together, plus using that on existing FDA-approved drugs,” Loesekrug-Pietri said. JEDI, he explained, wanted to zero in on any and all drugs that researchers may have overlooked. “Here, we basically want to create serendipity and force people to also check on all the molecules where we already know the toxicity and where basically we can go directly into animal testing,” he said. After feedback from the scientific community, the third stage will also incorporate drug cocktails. “Look at how HIV went,” Loesekrug-Pietri said. “It took us 25 years to go from testing individual drugs, and today, the things that work are cocktails of up to ten different drugs that need to be taken in different phases.”
The supercomputing firepower
To enable the teams to conduct their research, JEDI has brought together a broad coalition of high-profile supercomputing and science organizations. HPC resources are being provided by GENCI, the French national high-performance computing organization; the Partnership for Advanced Computing in Europe (PRACE); and Deutsche Telekom (which Loesekrug-Pietri said is committing all of its CPU and GPU resources), among others.
JEDI is also working to distribute the resources evenly among participants. “What we’re currently building,” Loesekrug-Pietri explained, “is an interface where basically the participant can tap directly into these resources and request a certain number of hours – millions of core hours, probably – and it will distribute it by doing a bit of load balancing, if I can call it that.”
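The interface Loesekrug-Pietri describes is still being built and its allocation rules have not been made public. As a rough illustration of what "a bit of load balancing" could look like, the Python sketch below applies max-min fairness, a common sharing policy chosen here purely as an assumption: every request is granted in full if capacity allows, and otherwise small requests are satisfied first while the remaining capacity is split evenly among the larger ones. Team names and figures are hypothetical.

```python
def allocate_core_hours(requests, capacity):
    """Max-min fair split of a core-hour budget.

    requests: dict mapping a team name to the core hours it asks for.
    capacity: total core hours available.
    Returns a dict mapping each team to the core hours granted.
    """
    grants = {team: 0.0 for team in requests}
    remaining = float(capacity)
    pending = dict(requests)
    while pending and remaining > 0:
        share = remaining / len(pending)
        # Teams asking for no more than an equal share get their full request.
        satisfied = {t: r for t, r in pending.items() if r <= share}
        if not satisfied:
            # Everyone left wants more than an equal share: split evenly.
            for team in pending:
                grants[team] = share
            return grants
        for team, requested in satisfied.items():
            grants[team] = float(requested)
            remaining -= requested
            del pending[team]
    return grants


# Hypothetical requests, in millions of core hours.
requests = {"team_a": 2, "team_b": 8, "team_c": 10}

scarce = allocate_core_hours(requests, capacity=15)
plentiful = allocate_core_hours(requests, capacity=30)
```

With 15 million core hours, the small request is met in full and the two larger teams receive 6.5 million each. With 30 million, every team gets what it asked for and no balancing is needed.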
However, Loesekrug-Pietri isn’t even sure that load balancing will be necessary. “We have, probably, enough resources ourselves, but it’s very difficult to estimate – it will really be depending on the methods that people will use,” he said, adding that machine learning approaches can sometimes offer 30-fold speedups relative to brute force computing, complicating total demand estimates. In terms of capacity, Loesekrug-Pietri said that JEDI is aiming for “not unlimited, but close.” “We are in the tens, if not in the hundreds of millions of core hours,” he said.
Crowdsourced computing powerhouse Folding@home is working closely with JEDI, helping to provide targets for researchers to assail with candidate molecules. “The more targets we have on which teams will be able to run their billion compounds,” Loesekrug-Pietri said, “the more combinations of keys and locks. You can imagine that these become numbers which are just absolutely massive.” Folding@home’s John Chodera has joined the challenge’s scientific committee, which also includes leaders from a wide range of universities, research institutes and supercomputing centers.
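A back-of-the-envelope calculation shows how quickly those key-and-lock combinations grow. The Python sketch below simply multiplies the quantities mentioned in the challenge description. The number of targets is not given in the source, so the value used here is purely illustrative, and the count ignores any overlap between the libraries different teams submit.

```python
def screening_evaluations(num_targets, molecules_per_team,
                          num_teams=1, methods_per_team=1):
    """Count molecule-target evaluations, one per (team, method, molecule, target)."""
    return num_targets * molecules_per_team * num_teams * methods_per_team


ILLUSTRATIVE_TARGETS = 10  # assumption: the real number of targets is not stated

# One team, one method: a billion molecules against each target.
single_team = screening_evaluations(ILLUSTRATIVE_TARGETS, 10**9)

# Upper end of the expected field: 100 teams, three methods each.
full_field = screening_evaluations(ILLUSTRATIVE_TARGETS, 10**9,
                                   num_teams=100, methods_per_team=3)
```

Even with only ten targets, a single team would face ten billion pairings, and a full field of 100 teams running three methods each would perform three trillion evaluations.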
Looking ahead
The challenge launches on May 1st. Loesekrug-Pietri estimates that the first two stages will each take around four weeks, with a couple of weeks between them to allow for cross-correlation of the lists. Stage three, however, may coexist with the other stages, depending on how teams are progressing through the challenge. Either way, Loesekrug-Pietri said, “we are looking for results before the end of June.” The challenge, he added, was built as open science, and participants will deposit their results into public libraries to aid global efforts against COVID-19.
“We think that we can probably … be much faster in this very long traditional testing phase without cutting corners,” Loesekrug-Pietri said. “By cross-correlating, by using this massive screening, we are able actually to automate a lot of the steps that today are the reasons why these clinical tests are so long – because they’re all very sequential. We’re trying to do a lot of things running in parallel.”
For JEDI, of course, the goal is to achieve a COVID-19 moonshot. “We already have high hopes that this will be a massive breakthrough,” Loesekrug-Pietri said.