Startup Aims to Revolutionize Drug Discovery

By Nicole Hemsoth

May 18, 2007

The advancement of genomics and proteomics via high performance computing is drawing new companies into the drug discovery business. One such company, Gencom Inc., claims its goal is to revolutionize in-silico drug discovery by dramatically speeding up the identification of novel therapeutics. Without external funding, the company has developed a drug discovery platform based on Intel Itanium 2 hardware. At April's Intel Developer Forum, Gencom received an Honorable Mention in the Itanium Solutions Alliance's Humanitarian Impact awards for its use of Itanium to advance functional genomics.

Recently, we got the opportunity to ask Gencom's founder and chief technology architect, Michael J. Colonna, about the company and the nature of the technology it has developed.

HPCwire: Could you give us a brief overview of Gencom's history, your vision as the founder, and where the company stands today as far as products and services offered?

Colonna: Gencom was founded in 2002 with an initial vision that was focused on improving computational performance in the area of molecular modeling, with the ultimate goal of improving drug-to-market times. At the time, many of the simplest models were taking up to 30 days to process on some of the most powerful computers available. It was obvious that bigger and faster servers with ever more processors were not getting the job done.

We decided to take an entirely different approach towards the analysis of the problem, which started with a decomposition of the entire process, beginning with the models themselves. The decomposition of the models revealed significant deficiencies in terms of how the models were constructed. This served to establish our first goal, which was to develop platform-independent “pre-processing” optimization software that would restructure the models for optimal computational performance. We put together some code and tested a simple model consisting of a single gene-protein pair, and the results were very promising.

During the course of our analysis it became evident that even if we improved computational performance of the models, there were other deficiencies, both process and technical, that threatened to trivialize the results of our efforts. This led to a significant expansion of our scope which included a rather lofty goal to optimize the entire drug discovery process. We started the entire analysis over, beginning with a business modeling exercise that revealed numerous process and technical candidates for optimization and established the high-level requirements for our technical architecture.

Today, Gencom offers its clientele a full suite of complementary services designed exclusively to improve their ability to identify new drug candidates more rapidly, provide early identification of high-risk candidates and expedite the discovery-to-market process in order to bring more drugs to market faster, more efficiently and more safely. The services include fully validated on-demand high performance computing, in-silico forced degradation modeling and discovery-focused electronic data capture.

HPCwire: Can you discuss the GENeSYS technology: its unique attributes, the nature of the hardware and software, its performance attributes, and what kinds of users and applications it's targeted for?

Colonna: GENeSYS [pronounced Genesis] is our second generation computing architecture that began its life as Black Widow. Like Black Widow, GENeSYS consists of seven software components that work together to address each of the optimization requirements and the Unifying Platform which, among other things, brings together and normalizes various genomic and proteomic data from the public domain. GENeSYS includes all of the optimization components of Black Widow but is significantly improved in terms of its ability to narrow the focus of drug candidates through constraint-based simulations of metabolic networks and contains improvements in the ability to predict chemical and biological degradation.

The first challenge required us to rethink the computational requirements that would be necessary to achieve our goals. This led to the development of a technical architecture that would run the optimization software components which included the need to dynamically allocate system resources as required based on the complexity of the models. We called the approach “hyper-mesh” or G2 — grid computing on steroids — due to its ability to scale dynamically, create multiple virtual processors and encapsulate and isolate multiple processes to eliminate the risk of process “bleed-over.”

The architecture at this point was technology independent in that we had only defined the requirements without consideration for deployment. It was essentially an exercise that permitted us to dream big and scale back if necessary, based upon limitations of available hardware. Given our initial load projections, we knew that it would be a tall order for any processor to fill and we even considered various contingencies that would get us close to where we wanted to be. One of the many tracks of the analysis and design phase included an assessment of available hardware architectures that could conceivably deliver the type of performance we were hoping for. 

Enter EPIC, Itanium's Explicitly Parallel Instruction Computing architecture. As a purely intellectual exercise, we overlaid EPIC and competing architectures with models of biological systems in an attempt to identify analogs that would ultimately support the software architecture which was modeled on functional biology. Much to our surprise, this turned out to be a fairly good indicator as to the likelihood that the processor architecture would be suitable for our purposes.

The second iteration of analysis and design was focused on a proof of concept for the Itanium and EPIC. Based upon our rough calculations, we could conceivably reduce turnaround time to somewhere in the millisecond range as opposed to weeks. Being absolutely cynical and convinced that our calculations were flawed, we solicited objective validation from various independent subject matter experts and they all confirmed that “on paper”, our assumptions and calculations appeared to be accurate.

This was sufficient for us to commit to the development of an initial prototype. Black Widow Lite was cobbled together using baling wire and duct tape, figuratively speaking, and contained limited-utility versions of the seven software components written in C++, each strung loosely together with Perl. It was so fragile that we often joked that everyone had to hold their breath during testing for fear that it would completely unravel with the slightest movement. It was not very pretty, but it served to validate our initial assumptions and calculations.

The current iteration of GENeSYS is written in C#, and Perl has been replaced with F# due to its scripting capabilities, which are far more efficient and elegant in execution. Using our own measure of computational speed, we have been able to derive the equivalent of 700 teraflops in terms of overall throughput. While our measure of speed would not stand in terms of the requirements to make the Top 500 list, we are more concerned with measuring overall performance based upon computation of multi-dimensional biological organisms, which is a bit different than measuring purely clock speed, I/O, etc. Ultimately, we are not interested in making the list; rather, we are interested in identifying drug candidates.

Early in the inception stage, we leaned towards the use of Microsoft-based technologies due to our collective experiences using other technologies and vendors in the past, which were less than positive. As a startup with limited resources, we were confident that Microsoft would provide us with a level of support that other vendors simply would not. That added an additional challenge, as most of the development on Itanium was on Linux-based machines, so we knew that there would be very little history to draw upon in terms of lessons learned. In hindsight, we are confident that we made the right decision for numerous reasons.

In terms of applicability, the utility of GENeSYS is extremely broad and could be applied in many areas where large-scale computational modeling is required. While we had numerous discussions in terms of the breadth of applicability, we made a strategic decision to focus exclusively on drug discovery optimization and the unique requirements associated with FDA compliance to eliminate any further complexity. 

HPCwire: What is the rationale behind the use of an Itanium-based platform, as opposed to, say, an x86-based cluster or a capability-type supercomputer, such as an IBM Blue Gene or a Cray system?

Colonna: We had always envisioned a multi-threaded approach to software optimization, possibly on HT-equipped x86-based servers, but found that EPIC provided far more capability in terms of overall throughput than we could achieve with 10 times the number of x86 machines, and this was prior to the release of the dual-core Montecito. Blue Gene or any of the other supercomputers of that type would most likely have limited our creativity in terms of development and also deployment.

The days of supercomputers, I believe, are long gone and we are now in the age of “superclusters” and the Itanium is the “big gun” on the block in terms of large-scale computational modeling capabilities. Now, with the dual-core Itanium which significantly reduces power consumption while improving overall performance, the field of in-silico drug discovery could be changed forever. Our hope is that Big Pharma and biotech organizations will apply some of their brainpower and brawn towards the development of applications that could significantly improve the quality of life for all of us. That should be the goal of any technology.

HPCwire: Are there users today, beta or otherwise, that you can talk about?

Colonna: We partnered early on with five biotech companies (NDAs prevent disclosure of their names) that would act as pilots during the development process and help with the definition of requirements. That number has now increased to seven, and as part of our initial agreement and in return for their invaluable guidance, they have an exclusive right to use the technology for a prescribed period of time. GENeSYS is currently going through validation. The inclusion of the metabolic signaling pathway functionality has proven to be far more complex than with Black Widow. So far, we are absolutely thrilled with what has been accomplished and are awaiting approval of the first drug discovered on GENeSYS, with hopefully many more to come.
