Polynomial-Time Algorithms for Phylogenetic Inference Problems Involving Duplication and Reticulation

Journal Article (2020)
Author(s)

L.J.J. Van Iersel (TU Delft - Discrete Mathematics and Optimization)

Remie Janssen (TU Delft - Discrete Mathematics and Optimization)

M.E.L. Jones (TU Delft - Discrete Mathematics and Optimization)

Yukihiro Murakami (TU Delft - Discrete Mathematics and Optimization)

Norbert Zeh (Dalhousie University)

Research Group
Discrete Mathematics and Optimization
Copyright
© 2020 L.J.J. van Iersel, R. Janssen, M.E.L. Jones, Yukihiro Murakami, Norbert Zeh
DOI related publication
https://doi.org/10.1109/TCBB.2019.2934957
More Info
expand_more
Publication Year
2020
Language
English
Copyright
© 2020 L.J.J. van Iersel, R. Janssen, M.E.L. Jones, Yukihiro Murakami, Norbert Zeh
Research Group
Discrete Mathematics and Optimization
Bibliographical Note
Green Open Access added to TU Delft Institutional Repository ‘You share, we take care!’ – Taverne project https://www.openaccess.nl/en/you-share-we-take-care Otherwise as indicated in the copyright section: the publisher is the copyright holder of this work and the author uses the Dutch legislation to make this work public.@en
Issue number
1
Volume number
17
Pages (from-to)
14-26
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

A common problem in phylogenetics is to try to infer a species phylogeny from gene trees. We consider different variants of this problem. The first variant, called Unrestricted Minimal Episodes Inference, aims at inferring a species tree based on a model with speciation and duplication where duplications are clustered in duplication episodes. The goal is to minimize the number of such episodes. The second variant, Parental Hybridization, aims at inferring a species network based on a model with speciation and reticulation. The goal is to minimize the number of reticulation events. It is a variant of the well-studied Hybridization Number problem with a more generous view on which gene trees are consistent with a given species network. We show that these seemingly different problems are in fact closely related and can, surprisingly, both be solved in polynomial time, using a structure we call 'beaded trees'. However, we also show that methods based on these problems have to be used with care because the optimal species phylogenies always have a restricted form. To mitigate this problem, we introduce a new variant of Unrestricted Minimal Episodes Inference that minimizes the duplication episode depth. We prove that this new variant of the problem can also be solved in polynomial time.

Files

08798653_1.pdf
(pdf | 0.729 Mb)
License info not available