Identifying equally scoring trees in phylogenomics with incomplete data using Gentrius

Abstract

Phylogenetic trees are routinely built from huge yet incomplete multi-locus datasets, which often leads to multiple equally scoring trees under many common criteria. As typical tree inference software outputs only a single tree, identifying all trees with an identical score is a challenge for phylogenomics. Here, we introduce Gentrius, an efficient algorithm that tackles this problem. We show on simulated and biological datasets that Gentrius generates millions of trees within seconds. Depending on the distribution of missing data across species and loci and on the inferred phylogeny, the number of equally good trees varies tremendously. The strict consensus tree computed from them displays all the branches unaffected by the pattern of missing data. Thus, Gentrius provides an important systematic assessment of phylogenetic trees inferred from incomplete data.

One-Sentence Summary

Gentrius is an algorithm that generates a complete stand, i.e. all binary unrooted trees compatible with the same set of subtrees.
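
The strict consensus mentioned in the abstract can be illustrated with a minimal sketch. This is not Gentrius itself and not its implementation: it assumes the equally scoring trees are already available, encodes each tree as nested tuples of taxon names, and keeps only the bipartitions (branches) shared by every tree. All names here (`splits`, `strict_consensus_splits`, the toy taxa A to E) are hypothetical and chosen for illustration only.

```python
from typing import FrozenSet, List, Set, Tuple, Union

# Assumption: a tree is a taxon name (leaf) or a tuple of subtrees.
Tree = Union[str, Tuple["Tree", ...]]
Split = FrozenSet[str]


def leaves(tree: Tree) -> FrozenSet[str]:
    """Return the set of taxon names below a node."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def splits(tree: Tree) -> Set[Split]:
    """Collect the non-trivial bipartitions of an unrooted tree.

    Each bipartition is stored as the side that does NOT contain a fixed
    reference taxon, so the same branch always gets the same key.
    """
    taxa = leaves(tree)
    ref = min(taxa)
    found: Set[Split] = set()

    def visit(node: Tree) -> FrozenSet[str]:
        if isinstance(node, str):
            return frozenset([node])
        below = frozenset().union(*(visit(child) for child in node))
        side = taxa - below if ref in below else below
        # Skip trivial splits (single leaf) and the empty root "split".
        if 1 < len(side) < len(taxa) - 1:
            found.add(side)
        return below

    visit(tree)
    return found


def strict_consensus_splits(trees: List[Tree]) -> Set[Split]:
    """Branches of the strict consensus: splits present in every tree."""
    return set.intersection(*(splits(t) for t in trees))


# Toy example: three trees that agree on AB|CDE but disagree on how
# C, D and E are related to one another.
trees = [
    (("A", "B"), "C", ("D", "E")),
    (("A", "B"), "D", ("C", "E")),
    (("A", "B"), "E", ("C", "D")),
]
consensus = strict_consensus_splits(trees)
```

In this toy case only the branch separating A and B from C, D and E survives, which mirrors the idea that the strict consensus retains exactly the branches on which all equally scoring trees agree.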

Authors
  • Chernomor, Olga
  • Elgert, Christiane
  • von Haeseler, Arndt
Shortfacts
Category
Journal Paper
Divisions
Bioinformatics and Computational Biology
Journal or Publication Title
bioRxiv Cold Spring Harbor Laboratory
ISSN
0362-4331
Publisher
https://www.biorxiv.org/content/10.1101/108027v3
Date
20 January 2023