19 June 2009

New computational methodology could revolutionize evolutionary biology

by Kate Melville

Detailed, accurate evolutionary trees that reveal the relatedness of living things can now be determined much faster and for thousands of species with the help of a novel computing method developed by scientists at The University of Texas at Austin.

For hundreds of years, biologists have used evolutionary trees to explain the interconnectedness of plants, animals and other organisms. The science of figuring out these trees, known as systematics, has progressed significantly in the last two decades, largely due to advances in computation, genetics and molecular biology. But many of the relationships among the world's 1.5 million described species (the true number could be more than 10 million) are still unresolved, and surprises continue to turn up. Working out these relationships requires analyzing large amounts of molecular data, such as DNA and protein sequences.

To provide a quantum leap in this field, computer scientist Tandy Warnow, biologist Randy Linder and their graduate students have created an automated computing method, called SATé, that can analyze these molecular data from thousands of organisms, simultaneously figuring out how the sequences should be organized and computing their evolutionary relatedness in as little as 24 hours.

Previous methods that, like Warnow and Linder's, align sequences and estimate trees simultaneously have been limited to analyzing 20 species or fewer and have taken months to complete. "SATé could completely change the practice of making evolutionary trees and revolutionize our understanding of evolution," said Warnow. In addition, SATé can accurately analyze DNA sequences that are rapidly evolving. These sequences have previously been avoided due to concern that the resulting trees would be poor.

Before a tree, or phylogeny, can be determined, DNA and protein sequence data must be organized. This process is called alignment. Key to Warnow and Linder's program is its ability to quickly and accurately align these data. "Our process is novel because it rapidly and simultaneously aligns sequences and looks for the best phylogenies," says Linder. "The old way of doing this for a large number of sequences was basically to align the data once, but we can look at many arrangements to find better ones."
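For readers who want a concrete picture of what "alignment" means, the following is a minimal sketch for just two sequences, using the classic Needleman-Wunsch global alignment algorithm. It is not SATé and does not reflect how SATé works internally; the function name `align_pair` and the scoring values (match, mismatch, gap) are assumptions chosen for illustration only.

```python
def align_pair(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align two sequences (Needleman-Wunsch).

    Returns (score, aligned_a, aligned_b), where '-' marks an inserted gap.
    The scoring values are illustrative assumptions, not taken from SATé.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]

    # Aligning a prefix against an empty sequence costs one gap per character.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    def pair_score(i, j):
        return match if a[i - 1] == b[j - 1] else mismatch

    # Fill the table: each cell holds the best score for the two prefixes.
    for i in range(1, rows):
        for j in range(1, cols):
            score[i][j] = max(
                score[i - 1][j - 1] + pair_score(i, j),  # pair the two characters
                score[i - 1][j] + gap,                   # gap in b
                score[i][j - 1] + gap,                   # gap in a
            )

    # Trace back from the bottom-right corner to recover one best alignment.
    out_a, out_b = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair_score(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1

    return score[-1][-1], "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    total, top, bottom = align_pair("ACGT", "AGT")
    print(total)
    print(top)
    print(bottom)
```

Aligning two short sequences this way is quick, but the work grows rapidly when thousands of sequences must be aligned together, which is part of why large analyses have typically aligned the data only once.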

This is important because different alignments can lead to significantly different phylogenies, and scientists must find the phylogeny that best represents the evolutionary relationships among the species in question.

For their paper, published in Science, the researchers tested SATé using computer-generated data and real biological data. The biological data had previously been aligned manually by other experts. The new phylogenies closely match the existing ones, validating the method's potential and, in some cases, the evolutionary trees themselves.

"Instead of doing things by hand, evolutionary biologists can now trust our automated program," says Warnow. "It will enable the creation of much more accurate trees, especially for the Tree of Life, which deals with hundreds of thousands of gene sequences from the millions of species on Earth."

Source: University of Texas at Austin