A Graph-Neural-Network Approach for Reconstructing Temporal Networks

Master Thesis (2024)
Author(s)

T.J.M. Broeders (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

L.J.J. van Iersel – Mentor (TU Delft - Discrete Mathematics and Optimization)

Esther Julien – Graduation committee member (TU Delft - Discrete Mathematics and Optimization)

Faculty
Electrical Engineering, Mathematics and Computer Science
Copyright
© 2024 Theo Broeders
More Info
expand_more
Publication Year
2024
Language
English
Copyright
© 2024 Theo Broeders
Graduation Date
27-03-2024
Awarding Institution
Delft University of Technology
Programme
['Applied Mathematics']
Faculty
Electrical Engineering, Mathematics and Computer Science
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Reconstructing a minimum reticulation network from phylogenetic trees is used in evolutionary studies. In this thesis, we focus on finding temporal networks using cherry-picking sequences for binary trees with all taxa. Finding such a minimum reticulation temporal network is NP-hard.

We introduce an algorithm to find a minimum reticulation network with a running time of O(2^n poly(n,t)). In addition, this study explores potential enhancements to the algorithm through the utilisation of branch and bound.

Additionally, we introduce a similar algorithm to determine the existence of a temporal phylogenetic network. This algorithm is improved upon by integrating a new concept called cherry growing. This leads to a notable speed-up in performance.

Furthermore, we examine the application of Graph Neural Networks (GNN) in heuristics to find a cherry-picking sequence which can be used to construct a network. This is done by classifying leaves into good, which leads to optimal solutions, and bad leaves. To assess this, two types of data were employed: one simulating evolutionary models and the other employing a fully random approach. The best-performing GNN model has a 97.4% accuracy for evolution-based data and a 79.1% accuracy for random-based data.

The GNN models are implemented as predictors in two classes of heuristics. The first generates a cherry-picking sequence by repeatedly picking leaves. The second class of heuristics is based on a tree search heuristic. This tree-search-based heuristic outperforms the cherry-picking-based heuristic. Furthermore, the GNN heuristics outperform their random variant, even for problems substantially larger than the GNN was trained on.

We also examine the use of GNN in predicting the existence of a phylogenetic temporal network given a set of trees. The best-performing GNN found for this problem has an accuracy of 80.3%.

Files

Mep_Copy_-Final.pdf
(pdf | 1.89 Mb)
License info not available