SCIENCE & ENGINEERING NEWS
Ithaca, NY — With the announcement of the completion of the genome sequence for Arabidopsis, the model system for plant research, there is much excitement about applying the knowledge base from this plant to crops and other economically important plants. In fact, scientists are well on their way to the next step in understanding crop genetics: mapping function to features.
Now, in a paper published in the December 15 issue of the journal Science, Dr. Todd Vision, a molecular biologist with the USDA-ARS Center for Agricultural Bioinformatics (CAB) at Cornell University, and colleagues Steven Tanksley (Plant Breeding, Cornell) and Daniel Brown (Whitehead Institute/MIT Center for Genome Research), describe a computational window that they have opened onto the genetic history of Arabidopsis. Vision is able to look back in time to the Mesozoic Era and see the ghosts of dramatic changes in the genetic map of this model plant. His method will help scientists connect the known function of a gene in modern Arabidopsis to one in another species, thus extending the work from a common weed to the crops we depend on for survival.
Arabidopsis thaliana is a small, fast-growing plant in the mustard family that has its genetic code stored in a conveniently compact package. Thus it became the model system for plant genome research. However, even this modest plant has a complex evolutionary history, and researchers have determined that Arabidopsis has doubled its genetic makeup repeatedly over time, with most of the activity occurring during the age of the dinosaurs. Vision has devised a method of data analysis that makes it possible to see the faded tracks of these dramatic events in a representation of the modern genetic map, and then to reconstruct the ancient images.
“If you can reverse history by reconstructing what the map looked like in the distant ancestors of Arabidopsis, then you are well on your way to figuring out where each gene should be in the map of all the important crop plants that descended from that same ancestor. This is important because we don’t have nearly as detailed a map for most crops as we do for Arabidopsis,” says Vision. Herein lies the key: based on the map of the modern Arabidopsis genome, researchers will be able to track the genes through time to their locations in modern crops. As biologists determine the function of known genes in Arabidopsis, they can use this system to find and test these genes for their functions in other species. Some will have retained the same functions; many others will have changed, but the information about their heritage will be very valuable.
Vision and his colleagues used cluster computing resources at Cornell, including a high-performance Dell/Intel/Windows cluster at the Cornell Theory Center (CTC) funded by the USDA, to conduct their research. The team’s method combines standard software in a novel way that allows the computer to see patterns in the genomic sequence. The core of their method is a graph-theoretic algorithm, implemented by Brown in MATLAB, that repeatedly identifies duplicated segments in the genome from thousands of automatically generated random samples. While the method is very computationally intensive, it allows researchers, for the first time, to assign statistical confidence to the patterns that they see. This tool is embedded in a powerful combination of bioinformatics tools including BLAST (Basic Local Alignment Search Tool), a program that searches gene and protein databases for similar sequences.
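For readers curious how random sampling can attach statistical confidence to a duplicated segment, the general idea can be sketched in a few lines of Python. This is an illustrative toy under stated assumptions, not the team's MATLAB implementation: the function names (largest_block, block_p_value), the max_gap parameter, and the choice of shuffling gene order as the random model are all hypothetical. Each matching gene pair (for example, a BLAST hit) is treated as a node in a graph, two nodes are linked when both genes sit close together on the map, and the size of the largest linked group is compared against what random gene orders produce.

```python
import random
from typing import List, Tuple

# A pair holds the map positions (gene-order indices) of two similar genes.
Pair = Tuple[int, int]


def largest_block(pairs: List[Pair], max_gap: int = 3) -> int:
    """Size of the largest group of pairs that cluster on the map.

    Two pairs are linked when both of their positions differ by at most
    max_gap; groups are the connected components of that graph.
    """
    parent = list(range(len(pairs)))

    def find(i: int) -> int:
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(pairs)):
        for j in range(i + 1, len(pairs)):
            (a1, b1), (a2, b2) = pairs[i], pairs[j]
            if abs(a1 - a2) <= max_gap and abs(b1 - b2) <= max_gap:
                parent[find(i)] = find(j)

    sizes = {}
    for i in range(len(pairs)):
        root = find(i)
        sizes[root] = sizes.get(root, 0) + 1
    return max(sizes.values(), default=0)


def block_p_value(pairs: List[Pair], genome_size: int, max_gap: int = 3,
                  n_samples: int = 1000, seed: int = 0) -> Tuple[int, float]:
    """Observed largest block and an empirical p-value for it.

    Assumed random model: gene order is shuffled, so matching pairs keep
    their partners but land at random map positions.
    """
    rng = random.Random(seed)
    observed = largest_block(pairs, max_gap)
    positions = list(range(genome_size))
    hits = 0
    for _ in range(n_samples):
        shuffled = positions[:]
        rng.shuffle(shuffled)
        permuted = [(shuffled[a], shuffled[b]) for a, b in pairs]
        if largest_block(permuted, max_gap) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return observed, (hits + 1) / (n_samples + 1)
```

As a usage example, eight pairs laid out in parallel, such as [(i, 100 + i) for i in range(8)] on a 200-gene map, form a single block of size 8, and shuffled gene orders almost never reproduce a block that large. The real analysis is far more demanding, which is why cluster computing was needed.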
The methods developed through this collaboration will likely be picked up and used by other researchers for finding related chromosome segments both within individual genomes and between related genomes. For example, Tanksley has been studying the relationships between the genomes of Arabidopsis and tomatoes.
“Vision’s work demonstrates that basic inquiry, in this case into genomic evolution, can yield useful tools for the broad research community,” says CTC executive director Linda Callahan.
The CAB is supported by the U.S. Department of Agriculture, Agricultural Research Service, in partnership with the College of Agriculture and Life Sciences and CTC. CTC is a high-performance computing and interdisciplinary research center located at Cornell University. CTC receives funding from Cornell University, New York State, a number of federal agencies, and Corporate Program members.