If you’ve ever wondered how much DNA data there is on Earth, you’re not alone. But a trio of UK researchers did more than wonder. They set about calculating the total information content in the biosphere and published their results and methodology in a recent issue of PLOS Biology.
In this first-ever accounting of its kind, the total amount of DNA in the biosphere comes out to roughly 50 × 10^30 megabase pairs (that’s fifty trillion trillion trillion base pairs). This genetic material would weigh about 50 billion metric tons and fill one billion shipping containers.
Storing this information would require 10^21 computers with the mean storage capacity of the world’s four most powerful supercomputers (Tianhe-2, Titan, Sequoia, and K computer), say the paper’s authors, University of Edinburgh researchers Hanna Landenmark, Duncan Forgan and Charles Cockell.
The team used an alternative approach that they say takes an information view of biodiversity, in which the total amount of information in the biosphere is represented in terms of DNA.
“The biosphere can be visualized as a large, parallel supercomputer, with the information storage represented by the total amount of DNA and the processing power symbolized by transcription rates,” they write. “In analogy with the Internet, all organisms on Earth are individual containers of information connected through interactions and biogeochemical cycles in a large, global, bottom-up network.”
To reach their estimate, the researchers began with the five major subgroups of life: prokaryotes, plants, animals, unicellular eukaryotes (sometimes referred to as protists), and fungi. They then estimated the total biomass for each group and extrapolated DNA content based on the genome size and typical mass per cell.
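The extrapolation described above can be sketched in a few lines of Python. This is only an illustration of the general approach, not the authors’ actual calculation: the function name `dna_base_pairs`, its parameters, and the input values are hypothetical placeholders, and the sketch assumes one genome copy per cell.

```python
def dna_base_pairs(biomass_g, cell_mass_g, genome_size_bp):
    """Rough DNA content (in base pairs) for one group of organisms.

    Assumption of this sketch: each cell carries exactly one copy of
    a genome of the given size.
    """
    cell_count = biomass_g / cell_mass_g   # cells needed to make up the biomass
    return cell_count * genome_size_bp     # base pairs summed over all cells


# Placeholder inputs chosen only to keep the arithmetic easy to follow;
# they are NOT figures from the paper.
example = dna_base_pairs(biomass_g=1e15, cell_mass_g=1e-12, genome_size_bp=4e6)
print(f"{example:.1e} base pairs")
```

Repeating this for each of the five groups and summing the results would give a biosphere-wide total under these simplifying assumptions.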
The authors see this new metric as an important indicator of the genetic diversity of life. Focusing primarily on species diversity doesn’t go far enough, they say; it would be like tallying the information content of the Internet by counting the components that are attached to it.
The work leads to an interesting discussion of the computational power of the biosphere, which can be thought of as a living supercomputer. Instead of FLOPS (Floating-point Operations Per Second), DNA processing speed refers to how many bases are transcribed per second, measured in Nucleotide Operations Per Second (NOPS).
Using a middle-of-the-road estimate of 30 bases per second as the typical rate of DNA transcription, the potential computational power of the biosphere comes out to approximately 10^15 yottaNOPS (yotta = 10^24). This is 10^22 times more processing power than the reigning TOP500 champion, Tianhe-2, which has a processing power on the order of 10^5 teraflops (tera = 10^12). Consider also that the total combined performance of all 500 systems on the most recent TOP500 list (November 2014) is “just” 309 petaflops.
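For readers who want to check the orders of magnitude, here is a back-of-the-envelope sketch using only the rounded figures quoted in this article. It assumes, as a simplification, that every base in the biosphere is transcribed at the typical rate at the same time; the variable names are illustrative.

```python
MEGA = 10**6
TERA = 10**12
YOTTA = 10**24

# Rounded figures quoted in the article.
total_dna_megabase_pairs = 50 * 10**30
transcription_rate = 30                  # bases per second
tianhe2_flops = 10**5 * TERA             # "on the order of 10^5 teraflops"

# Assumption: all bases are transcribed simultaneously at the typical rate.
total_bases = total_dna_megabase_pairs * MEGA
biosphere_nops = total_bases * transcription_rate

biosphere_yottanops = biosphere_nops / YOTTA
ratio = biosphere_nops / tianhe2_flops

print(f"Biosphere: {biosphere_yottanops:.1e} yottaNOPS")
print(f"Ratio to Tianhe-2: {ratio:.1e}")
```

With these inputs the result lands at about 1.5 × 10^15 yottaNOPS and a ratio of about 1.5 × 10^22, in line with the orders of magnitude reported above.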
Note: I maintained the authors’ notation style for the sake of consistency and simplicity. The published LINPACK speed of Tianhe-2 is 33.86 petaflops, or 3.386 x 10^16 flops.