In a phylogenetic tree, present-day species are leaves and an edge from u to v indicates that u is an ancestor of v. Weights on these edges indicate the phylogenetic distance. The phylogenetic diversity (PD) of a set of species A is the total weight of edges that are on any path between the root of the phylogenetic tree and a species in A. Selecting a small set of species that maximizes phylogenetic diversity for a given phylogenetic tree is an essential task in preservation planning, where limited resources naturally prevent saving all species. An optimal solution can be found with a greedy algorithm [Steel, Systematic Biology, 2005; Pardi and Goldman, PLoS Genetics, 2005]. However, when a food web representing predator-prey relationships is given, finding a set of species that optimizes phylogenetic diversity subject to the condition that each saved species should be able to find food among the preserved species is NP-hard [Spillner et al., IEEE/ACM, 2008]. We present a generalization of this problem, where, inspired by biological considerations, the food web has weighted edges to represent the importance of predator-prey relationships. We show that this version is NP-hard even when both structures, the food web and the phylogenetic tree, are stars. To cope with this intractability, we proceed in two directions. Firstly, we study special cases where a species can only survive if a given fraction of its prey is preserved. Secondly, we analyze these problems through the lens of parameterized complexity. Our results include that finding a solution is fixed-parameter tractable with respect to the vertex cover number of the food web, assuming the phylogenetic tree is a star.
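The definition of phylogenetic diversity and the greedy selection mentioned above can be illustrated with a minimal sketch. The toy tree, its weights, and all function names below are hypothetical and chosen only for illustration. The greedy loop repeatedly adds the species with the largest PD gain, and a brute-force search is included so the two results can be compared on this small instance.

```python
from itertools import combinations

# Hypothetical toy tree: child -> (parent, edge weight). "root" has no entry.
PARENT = {
    "x": ("root", 1.0),
    "y": ("root", 2.0),
    "a": ("x", 3.0),
    "b": ("x", 1.0),
    "c": ("y", 2.0),
    "d": ("y", 2.5),
}
LEAVES = ["a", "b", "c", "d"]  # present-day species


def path_edges(leaf):
    """Edges on the root-to-leaf path, each identified by its child node."""
    edges = set()
    node = leaf
    while node in PARENT:
        edges.add(node)
        node = PARENT[node][0]
    return edges


def pd(species):
    """Total weight of edges lying on a path from the root to any chosen species."""
    covered = set()
    for s in species:
        covered |= path_edges(s)
    return sum(PARENT[e][1] for e in covered)


def greedy_max_pd(k):
    """Add, k times, the species that increases PD the most."""
    chosen = []
    for _ in range(k):
        candidates = [s for s in LEAVES if s not in chosen]
        best = max(candidates, key=lambda s: pd(chosen + [s]))
        chosen.append(best)
    return chosen


def brute_force_max_pd(k):
    """Best PD value over all k-subsets (exponential; for checking only)."""
    return max(pd(subset) for subset in combinations(LEAVES, k))


if __name__ == "__main__":
    for k in range(1, len(LEAVES) + 1):
        picked = greedy_max_pd(k)
        print(k, picked, pd(picked), brute_force_max_pd(k))
```

On this toy tree the greedy choice matches the brute-force optimum for every k. The food-web constraints discussed in the abstract are not modeled here.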
Journal article (2026)
Christian Komusiewicz, Jannik Schestag
In the Generalized Noah’s Ark Problem, one is given a phylogenetic tree on a set of species X and a set of conservation projects for each species. Each project comes with a cost and raises the survival probability of the corresponding species. The aim is to select a conservation project for each species such that the total cost of the selected projects does not exceed some given threshold and the expected phylogenetic diversity is as large as possible. We study the complexity of the Generalized Noah’s Ark Problem and some of its special cases with respect to several parameters related to the input structure, such as the number of different costs, the number of different survival probabilities, or the number of species, |X|.
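To make the objective concrete, here is a minimal sketch under stated assumptions. It uses the usual definition of expected phylogenetic diversity, in which an edge contributes its weight times the probability that at least one species below it survives, with survival treated as independent across species. The tree, project lists, and function names are hypothetical, and the exhaustive search is only a baseline for tiny instances, not one of the algorithms studied in the article.

```python
from itertools import product

# Hypothetical toy tree: child -> (parent, edge weight). "root" has no entry.
PARENT = {
    "x": ("root", 1.0),
    "a": ("x", 2.0),
    "b": ("x", 2.0),
    "c": ("root", 4.0),
}

# Hypothetical projects per species: list of (cost, survival probability).
# The (0, 0.0) entry stands for "spend nothing on this species".
PROJECTS = {
    "a": [(0, 0.0), (1, 0.5), (2, 1.0)],
    "b": [(0, 0.0), (1, 0.5)],
    "c": [(0, 0.0), (2, 1.0)],
}


def expected_pd(survival):
    """Sum over edges of weight * P(at least one species below the edge survives)."""
    extinct = {v: 1.0 for v in PARENT}  # P(all species below edge v go extinct)
    for leaf, p in survival.items():
        node = leaf
        while node in PARENT:
            extinct[node] *= 1.0 - p
            node = PARENT[node][0]
    return sum(w * (1.0 - extinct[v]) for v, (_, w) in PARENT.items())


def best_selection(budget):
    """Try every way of picking one project per species within the budget."""
    species = list(PROJECTS)
    best_value, best_choice = -1.0, None
    for choice in product(*(PROJECTS[s] for s in species)):
        if sum(cost for cost, _ in choice) > budget:
            continue
        value = expected_pd({s: p for s, (_, p) in zip(species, choice)})
        if value > best_value:
            best_value, best_choice = value, dict(zip(species, choice))
    return best_value, best_choice


if __name__ == "__main__":
    for budget in (2, 3, 4):
        print(budget, best_selection(budget))
```

The search enumerates every combination of projects, so its running time grows exponentially with |X|. That growth is why parameters such as the number of different costs or survival probabilities are of interest.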
Phylogenetic diversity plays an important role in biodiversity, conservation, and evolutionary studies by measuring the diversity of a set of taxa based on their phylogenetic relationships. In phylogenetic trees, a subset of k taxa with maximum phylogenetic diversity can be found by a simple and efficient greedy algorithm. However, this algorithmic tractability is lost when considering phylogenetic networks, which incorporate reticulate evolutionary events such as hybridization and horizontal gene transfer. To address this challenge, we introduce PaNDA (Phylogenetic Network Diversity Algorithms), the first software package and interactive graphical user interface for exploring, visualizing and maximizing diversity in phylogenetic networks. PaNDA includes a novel algorithm to find a subset of k taxa with maximum diversity, running in polynomial time for networks of bounded scanwidth, a measure of tree-likeness of a network that grows more slowly than the well-known level measure. This algorithm considers the variant of phylogenetic diversity on networks in which the branch lengths of all paths from the root to the selected taxa contribute towards their diversity. We demonstrate the scalability of this algorithm on simulated networks, successfully analyzing level-15 networks with up to 200 taxa in seconds. We also provide a proof-of-concept analysis using a phylogenetic network on Xiphophorus species, illustrating how the tool can support diversity studies based on real genomic data. The software is easily installable and freely available at https://github.com/nholtgrefe/panda. Additionally, we extend the definition of phylogenetic diversity to semi-directed phylogenetic networks, which are mixed graphs increasingly used in phylogenetic analysis to model uncertainty of the root location. We prove that finding a subset of k taxa with maximum diversity remains NP-hard on semi-directed networks, but we present a polynomial-time algorithm for networks with bounded level.
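The all-paths variant of diversity described above can be sketched on a small rooted network. Everything in the block is a hypothetical illustration: the network, its branch lengths, and the function names are invented here, and nothing below uses or reflects the PaNDA code base or its scanwidth-based algorithm. An edge counts towards the score if it lies on some path from the root to a selected taxon, and the best subset is found by exhaustive search.

```python
from itertools import combinations

# Hypothetical rooted network with one reticulation node "h" (two parents).
# (parent, child) -> branch length
EDGES = {
    ("root", "u"): 1.0,
    ("root", "v"): 1.0,
    ("u", "h"): 0.5,
    ("v", "h"): 0.5,
    ("u", "a"): 2.0,
    ("h", "b"): 2.0,
    ("v", "c"): 3.0,
}
TAXA = ["a", "b", "c"]


def all_paths_pd(selected):
    """Total length of edges lying on any root-to-taxon path for the selected taxa."""
    parents = {}
    for parent, child in EDGES:
        parents.setdefault(child, []).append(parent)

    # Walk upwards from the selected taxa to collect them and all their ancestors.
    reached, stack = set(), list(selected)
    while stack:
        node = stack.pop()
        if node in reached:
            continue
        reached.add(node)
        stack.extend(parents.get(node, []))

    # An edge is on such a path exactly when its child endpoint was reached.
    return sum(w for (_, child), w in EDGES.items() if child in reached)


def best_subset(k):
    """Exhaustive search over all k-subsets of taxa (exponential in general)."""
    return max(
        ((all_paths_pd(subset), subset) for subset in combinations(TAXA, k)),
        key=lambda pair: pair[0],
    )


if __name__ == "__main__":
    print(all_paths_pd(["b"]))
    print(best_subset(2))
```

Because taxon b sits below the reticulation node, selecting it alone already counts both branches leading into h.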
Network Phylogenetic Diversity (Network-PD) is a measure of the diversity of a set of species based on a rooted phylogenetic network (with branch lengths and inheritance probabilities on the reticulation edges) describing the evolution of those species. We consider the Max-Network-PD problem: given such a network, find k species with maximum Network-PD score. We show that this problem is fixed-parameter tractable (FPT) for binary networks, by describing an optimal algorithm running in O(2^r log(k) (n+r)) time, with n the total number of species in the network and r its reticulation number. Furthermore, we show that Max-Network-PD is NP-hard for level-1 networks, proving that, unless P = NP, the FPT approach cannot be extended by using the level as parameter instead of the reticulation number.