Publication details
- Phylogenies with Statistical Methods: Problems & Solutions (Alexandros P. Stamatakis, Thomas Ludwig, Harald Meier), In Proceedings of 4th International Conference on Bioinformatics and Genome Regulation and Structure, pp. 229–233, BGRS-04, Novosibirsk, Russia, 2004
Abstract
The computation of ever larger and more accurate phylogenetic trees, with the ultimate goal of computing the “tree of life”, represents a major challenge in bioinformatics. Statistical methods for phylogenetic analysis, such as maximum likelihood or Bayesian inference, have been shown to be the most accurate methods for tree reconstruction. Unfortunately, the size of trees that can be computed in reasonable time is limited by the severe computational complexity of these statistical methods. However, the field has witnessed great algorithmic advances over the last 3 years, which enable inference of large phylogenetic trees containing 500 to 1,000 sequences on a single CPU within a couple of hours using maximum likelihood programs such as RAxML and PHYML. An additional order of magnitude in computable tree size can be obtained by parallelizing these new programs. In this paper we briefly present the MPI-based parallel implementation of RAxML (Randomized Axelerated Maximum Likelihood) as a solution for computing large phylogenies. Within this context, we describe how parallel RAxML has been used to compute what is, to the best of our knowledge, the first maximum likelihood-based phylogenetic tree containing 10,000 taxa on an inexpensive Linux PC cluster. In addition, based on our experience with RAxML, we address unresolved problems that arise when computing large phylogenies with maximum likelihood for real-world sequence data consisting of more than 1,000 organisms. Finally, we discuss potential
BibTeX
@inproceedings{PWSMPSSLM04,
  author     = {Alexandros P. Stamatakis and Thomas Ludwig and Harald Meier},
  title      = {{Phylogenies with Statistical Methods: Problems \& Solutions}},
  year       = {2004},
  booktitle  = {{Proceedings of 4th International Conference on Bioinformatics and Genome Regulation and Structure}},
  pages      = {229--233},
  conference = {BGRS-04},
  location   = {Novosibirsk, Russia},
  abstract   = {The computation of ever larger and more accurate phylogenetic trees, with the ultimate goal of computing the “tree of life”, represents a major challenge in bioinformatics. Statistical methods for phylogenetic analysis, such as maximum likelihood or Bayesian inference, have been shown to be the most accurate methods for tree reconstruction. Unfortunately, the size of trees that can be computed in reasonable time is limited by the severe computational complexity of these statistical methods. However, the field has witnessed great algorithmic advances over the last 3 years, which enable inference of large phylogenetic trees containing 500 to 1,000 sequences on a single CPU within a couple of hours using maximum likelihood programs such as RAxML and PHYML. An additional order of magnitude in computable tree size can be obtained by parallelizing these new programs. In this paper we briefly present the MPI-based parallel implementation of RAxML (Randomized Axelerated Maximum Likelihood) as a solution for computing large phylogenies. Within this context, we describe how parallel RAxML has been used to compute what is, to the best of our knowledge, the first maximum likelihood-based phylogenetic tree containing 10,000 taxa on an inexpensive Linux PC cluster. In addition, based on our experience with RAxML, we address unresolved problems that arise when computing large phylogenies with maximum likelihood for real-world sequence data consisting of more than 1,000 organisms. Finally, we discuss potential},
  url        = {http://wwwkramer.in.tum.de/exelixis/pubs/BGRS2004.pdf},
}