Computing Reviews
Today's Issue Hot Topics Search Browse Recommended My Account Log In
Review Help
Search
Phylogenetic networks : concepts, algorithms and applications
Huson D., Rupp R., Scornavacca C., Cambridge University Press, New York, NY, 2011. 374 pp. Type: Book (978-0-521755-96-2)
Date Reviewed: Aug 31 2011

Metaphors are powerful. They guide our research agenda and even limit the questions we can ask. Until the 18th century, the “great chain of being” was the dominant metaphor. It said that all creation consisted of a chain of beings from the simple to the complex, and that we, mankind, were the culmination of God’s creation. Starting with Linnaeus, and most explicitly in Haeckel, this metaphor was discarded and replaced with the “tree of life” metaphor, in which there was still progress and change, but there was no longer a single crowning species.

Modern biology, starting with the work of such maverick scientists as Barbara McClintock and Lynn Margulis, and following more recent discoveries in molecular genetics, suggests that this tree metaphor is too limiting. Translocations and captures of genetic material suggest a more cross-cutting metaphor that links organisms in structures that are not simple hierarchies.

What does this have to do with computing? Computing theory is necessary to set up and analyze models of phylogenetic networks. Are there restrictions on the data that might allow a tree to reasonably describe the relationships between taxa? Even if a tree is reasonable, how does one efficiently find this tree? Obviously, algorithm design is necessary, both for trees and for other relationship networks. This might not be so easy, however. As one might expect, almost all problems of finding the “best” tree or network are nondeterministic polynomial-time (NP)-hard.

We also need design of heuristics. Of course, this brings with it the need to experimentally evaluate the heuristics on real data. To do so, we need software design in order to create programs that are capable of dealing with data in formats that biologists use. Further, these programs must be able to deal with masses of data from molecular sequencing. Finally, one would like the programs to be usable by biologists. Hence, all of computing, including theory, algorithms, heuristics, design, implementation, and software engineering, is necessary.

This book examines the theory, algorithms, and heuristics for phylogenetic networks. (A short chapter on software mentions several working systems and a number of proof-of-concept programs.)

The first four chapters present necessary background, and include graphs and trees, sequence alignment, phylogenetic trees, and an introduction to phylogenetic networks. At this point in the book, the difficulties become clear. Even if we can describe a set of data with a tree, which tree is “best”? For almost any definition of “best,” finding the best tree is NP-hard. Further, it is not clear that forcing the data into a tree is reasonable.

Subsequent chapters deal with techniques that have been developed over the past two decades to create trees and networks from data. Some of these techniques are quite new, and this book marks their first appearance outside of technical journals.

If one can distinguish two pieces of data on a certain factor, then this factor splits the dataset into classes, some of which we can, and some of which we cannot, distinguish based on this factor. Chapter 5 covers the theory of such splits.

Chapter 6, which is the heart of this book, develops the idea of clustering for phylogenetic networks.

Subsequent chapters take these ideas of splits, clusters, and distance, and use them as the basis for algorithms to construct networks based on some assumptions about the allowed form of the networks. For example, non-tree forms are allowed if the cross edges obey some rather strong restrictions.

Since, for the same set of species, data on different genes will likely produce different trees, the authors also cover the problem of constructing a consensus tree, and of constructing a network that includes the relationships found in these different trees. They also describe methods to create networks that show how species have evolved by insertions, deletions, and duplications.

Chapter 13 discusses and gives examples of various techniques to graphically present phylogenetic networks. Finally, chapter 14 briefly mentions software that is presently available, or that has at least been published.

The many worked examples and suggested exercises make this book a potential text for a graduate course, but the formal style would limit the audience to computer science students. (Biology students would probably find the mathematics too daunting.) In summary, this is a technical monograph that will be of great interest to those working in this field, but it is probably too technical for a more general audience.

Reviewer:  Paul Cull Review #: CR139416 (1203-0268)
Bookmark and Share
 
Biology And Genetics (J.3 ... )
 
 
Biology And Genetics (J.3 ... )
 
 
Clustering (H.3.3 ... )
 
 
Graph And Tree Search Strategies (I.2.8 ... )
 
 
Trees (G.2.2 ... )
 
 
Clustering (I.5.3 )
 
  more  
Would you recommend this review?
yes
no
Other reviews under "Biology And Genetics": Date
Discovering the secrets of DNA
Friedland P., Kedes L. Communications of the ACM 28(11): 1164-1186, 1985. Type: Article
May 1 1986
The formation of three-dimensional biological structures: computer uses and future needs
Levinthal C.  Computer culture: the scientific, intellectual, and social impact of the computer (, New York,1801984. Type: Proceedings
Sep 1 1986
Computer techniques in neuroanatomy
Capowski J., Plenum Press, New York, NY, 1989. Type: Book (9789780306432637)
Nov 1 1990
more...

E-Mail This Printer-Friendly
Send Your Comments
Contact Us
Reproduction in whole or in part without permission is prohibited.   Copyright 1999-2024 ThinkLoud®
Terms of Use
| Privacy Policy