Reconstructing Phylogenetic Networks Using Cherry Picking

A journey into the phylogenetics

Bachelor Thesis (2024)
Author(s)

B.T. Hoekstra (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

Leo van Iersel – Mentor (TU Delft - Discrete Mathematics and Optimization)

Esther Julien – Mentor (TU Delft - Discrete Mathematics and Optimization)

Cor Kraaikamp – Graduation committee member (TU Delft - Applied Probability)

Faculty
Electrical Engineering, Mathematics and Computer Science
More Info
Publication Year
2024
Language
English
Graduation Date
12-07-2024
Awarding Institution
Delft University of Technology
Programme
Electrical Engineering
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

In the study of evolutionary biology, there exists a method called the “cherry picking algorithm” that produces the instructions needed to create a network that shows how different species are related.
This report explores what happens when the algorithm starts with a wrong choice, a “suboptimal cherry”, in its first step, and how this affects the accuracy of the algorithm. Imagine you are trying to build a family tree for different species, but you start with a mistake. This research looks at how such initial mistakes can impact the accuracy of the entire family tree. We conducted this study using simulations of the algorithm that deliberately make an initial mistake and afterwards continue as the algorithm normally would. The study found that starting with a wrong step usually makes the performance of the algorithm worse. Specifically, it led to an average decrease in optimal performance of 34.8% for networks relating a smaller number of species, and 11.3% for networks relating a larger number of species. Interestingly, the larger the number of species related in the network produced by the algorithm, the less severe the impact of the initial mistake.
We concluded that making an initial mistake negatively affects the average performance of the algorithm, and that the extent of the effect varies with the number of species related in the network.
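
To make the experimental setup concrete, the following is a minimal Python sketch of the idea and not the implementation used in the thesis. It assumes rooted binary trees written as nested tuples of leaf names, treats a cherry as an ordered pair (x, y) of leaves with a common parent, and "picks" it by deleting x. The function names, the tie-breaking rule, the toy selection heuristic (prefer the cherry present in the most trees), and the use of sequence length as a stand-in for performance are all illustrative assumptions.

```python
def cherries(tree):
    """Return all ordered pairs (x, y) of leaves that share a parent."""
    if isinstance(tree, str):          # a single leaf has no cherries
        return set()
    left, right = tree
    if isinstance(left, str) and isinstance(right, str):
        return {(left, right), (right, left)}
    return cherries(left) | cherries(right)


def pick(tree, cherry):
    """Delete leaf x if (x, y) is a cherry of the tree; otherwise leave it unchanged."""
    x, y = cherry
    if isinstance(tree, str):
        return tree
    left, right = tree
    if {left, right} == {x, y}:        # the parent of x and y collapses into y
        return y
    return (pick(left, cherry), pick(right, cherry))


def reduce_trees(trees, first=None):
    """Reduce every tree to a single leaf and return the cherries picked.

    `first` (hypothetical parameter) forces the first cherry, mimicking a
    deliberate initial mistake; afterwards the toy heuristic takes over.
    """
    trees = list(trees)
    sequence = []
    while any(not isinstance(t, str) for t in trees):
        options = sorted(set().union(*(cherries(t) for t in trees)))
        if first is not None and not sequence:
            if first not in options:
                raise ValueError("forced cherry is not a cherry of any tree")
            choice = first
        else:
            # Assumed heuristic: the cherry that occurs in the most trees;
            # ties go to the alphabetically first pair.
            choice = max(options, key=lambda c: sum(c in cherries(t) for t in trees))
        sequence.append(choice)
        trees = [pick(t, choice) for t in trees]
    return sequence


# Two small example trees on the species a, b, c, d (made up for illustration).
T1 = (("a", "b"), ("c", "d"))
T2 = ((("a", "b"), "c"), "d")

normal = reduce_trees([T1, T2])
forced = reduce_trees([T1, T2], first=("c", "d"))
print(len(normal), len(forced))
```

In this toy instance the forced first cherry ("c", "d") occurs in only one of the two trees, and the resulting sequence is one step longer than the one produced without the forced choice. The thesis measures performance on simulated runs of the actual algorithm, so the numbers here are not comparable to the percentages reported in the abstract.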

Files

License info not available