MJ

M.E.L. Jones

info

Please Note

6 records found

Phylogenetic networks are a specific type of directed acyclic graph (DAG), used to depict evolutionary relationships among, for example, species or other groups of organisms. To solve computationally hard problems, treewidth has been used to parametrize algorithms in phylogenetics. In the hope of simplifying the algorithmic design process, Berry, Scornavacca and Weller recently proposed a new measure of tree-likeness that takes into account the directions of the arcs: scanwidth. They showed that the corresponding decision problem of this parameter - which can be seen as a variant of directed cutwidth, using a tree instead of a linear ordering - is NP-complete. This thesis aims to widen the structural knowledge of scanwidth and to find efficient ways of computing it on general DAGs, both by exact and heuristic algorithms.

With the help of reduction rules, we construct an explicit dynamic programming algorithm that computes scanwidth exactly, along with its corresponding tree extension, in O(k * n* m) time for rooted DAGs of scanwidth k. This slicewise polynomial algorithm proves that computing the scanwidth is in the complexity class XP. The algorithm also functions as an FPT algorithm for networks of level-l, with the complexity bounded by O(24l-1* l * n + n2). It performs well in practice, being able to compute the scanwidth of networks up to 30 reticulations and 100 leaves within 500 seconds.

On the heuristic side, an algorithm that repeatedly splits at a specific type of smallest cut is proposed. Enhanced with simulated annealing, this heuristic shows promising results, obtaining an average approximation ratio of 1.5 for large synthetic networks of 30 reticulations and 100 leaves. Applied to a real-world dataset of networks, the heuristic performs near-optimal. Although we prove that the scanwidth is always greater than or equal to the treewidth, experiments show that they are close to each other in practice. This further motivates the use of scanwidth over treewidth as a parameter in algorithms. ...
TreeContainment is a well-known problem within phylogenetics, which asks whether a binary phylogenetic tree is embedded in a binary phylogenetic network. For this problem, Jones, Weller and van Iersel (2022) have created an algorithm that uses dynamic programming on tree-decompositions to achieve a running time that is exponential in the tree-width parameter instead of in the number of reticulations. However, due to the implicit formulations of two crucial steps in this algorithm, this running time cannot be achieved in practice without finding ways to generate the required structures explicitly. This paper provides two new sub-algorithms that can do that. To further improve the performance of the algorithm, I introduce a number of criteria and other methods that can be used to reduce the number of structures generated. Additionally, I describe a way to manipulate the nice tree-decompositions to create a more favorable order of bags for the dynamic programming. These sub-algorithms and improvements are used by two new algorithms TWITCH and PITCH, whose implementations are compared to a brute force algorithm and a new branching cherry-picking algorithm named BOTCH. The latter has an FPT running time that is exponential in the number of vertices that have only reticulations as children. The comparisons show that the implementations of the dynamic programming algorithms TWITCH and PITCH are slower in practice than the brute force algorithm, despite their numerous improvements. Of the four new implementations, BOTCH has the best test results and it is shown to be fast in practice.

...
Bachelor thesis (2022) - V.J. Veenman, M.E.L. Jones
Due to human advancement in recent centuries, extinction rates of animals and plants around the earth have greatly risen. Conservation biology, the study of conservation of nature and earth’s biodiversity, aims to protect species from these increasing extinction rates.
Phylogenetic diversity is a measure for biodiversity that can help in selecting which species to prioritize in preserving diversity. This is necessary because there are bounds on how many of these species can be preserved, due to costs connected to this preservation. To aid in selecting the subset of species with maximum diversity, an ILP can be formulated to find this subset. In this thesis these ILP’s will be formulated for different phylogenetic networks and functions that give a diversity score to these subsets. ...
Phylogenetic networks are used to describe evolutionary histories and are a generalisation of evolutionary trees. They can contain so called reticulations, representing reticulate evolution, such as hybridization, lateral gene transfer and recombination. Methods are being developed to construct certain rooted phylogenetic networks from their subnetworks. A constructed network is encoded by their subnetworks if it is uniquely determined by that set. It has been shown that phylogenetic trees are encoded by their set of triplets, which are rooted trees on three species. However, triplets do not encode phylogenetic networks. Huber and Moulton introduced trinets, rooted networks on three species, which do encode level-1 phylogenetic networks, which are networks containing at most one reticulation in each biconnected component. Van Iersel and Moulton proved that level-2 phylogenetic networks are encoded by their set of trinets and Nipius proved that level-3 phylogenetic networks are encoded by their set of quarnets, which are rooted networks on four species. In this thesis we prove that for all k>1, level-k networks without symmetry in their biconnected components are encoded by their set of (k+1)-nets, which are rooted networks on k+1 leaves. This result provides some evidence for the conjecture that all level-k phylogenetic networks are encoded by their set of (k+1)-nets. Thereafter, we generalise encoding results for level-2 and level-3 networks, where the underlying structure, called generator, and its sides play an important role. A generator is a directed acyclic biconnected multigraph, containing only vertices with indegree 2 and outdegree at most 1, indegree 1 and outdegree 2, and indegree 0 and outdegree 2. The sides of a generator are the arcs and outdegree-0 vertices of a generator. We have not been able to prove that level-k networks with symmetry in the generators of their biconnected components are in general encoded by k+1-nets. For the networks with symmetry, we prove encoding results for networks with leaves on at most p sides of the underlying generators of their biconnected components. We further prove that level-4 networks are encoded by 6-nets. Although Nipius gave a counterexample showing that not all (level-3) phylogenetic networks are encoded by their set of trinets, it is useful to know which networks are encoded by their set of trinets or k-nets. In this thesis, we provide an algorithm which can serve as tool for proving that certain level-k networks are encoded by their set of k-nets. Our presented algorithm is a first step to generalise encoding results to level-k networks with k>3 by using trinets. Furthermore, we are a step closer to proving the conjecture that all level-k networks are encoded by (k+1)-nets, including networks with symmetry in the generators of their biconnected components. ...
In this report, the bounded Maximum Parsimony distance will be considered when
applying three different reduction rules. The distance is a measure on how dissimilar two trees are and is calculated based on the number of mutations that occur when looking at heritable traits. The first rule considered, is the chain reduction. For this rule, it is proven that the bounded MP distance is preserved after applying this rule. This is done by adapting the proof from Steven Kelk et al. [10]. For the second rule considered, the generalized subtree reduction, it is also proven that the bounded MP distance is preserved after applying this reduction. Again, this is done by adapting the proof in the paper by Steven Kelk et al. [10]. Then, at last, we looked at a new reduction rule for the TBR distance, introduced by Steven Kelk and Simone Linz [12], the (2,1,2)-reduction. In this report, it is shown with help of a counterexample that this rule does not necessarily reduce the distance with one like it is the case for the TBR distance. However, it can be concluded that the distance is either preserved or reduced with one. ...

Computing the Maximum Parsimony Scores of Phylogenetic Networks

Bachelor thesis (2017) - Niels Jonkman, Leo van Iersel, Mark Jones
Phylogenetic networks represent the evolutionary history of organisms and the relationships among them. In the field of phylogenetics research has been done to reconstruct these networks. Several methods for reconstructing those networks have been developed over the years. One of them is the softwired parsimony score and another, a fairly new method, is the parental parsimony method. In theory this is a better and more accurate method, but before this thesis, it was not tested yet. In this thesis, we compare the softwired and the parental parsimony method in practice by implementing the methods and calculating their parsimony scores. The results will show that the parental parsimony is a better and more accurate method in practice as well. ...