CHOP

Haplotype-aware path indexing in population graphs

Journal Article (2020)
Author(s)

Tom Mokveld (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Jasper Linthorst (TU Delft - Electrical Engineering, Mathematics and Computer Science, VU University Medical Centre)

Zaid Al-Ars (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Henne Holstege (VU University Medical Centre, TU Delft - Electrical Engineering, Mathematics and Computer Science)

Marcel Reinders (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Research Group
Pattern Recognition and Bioinformatics
DOI related publication
https://doi.org/10.1186/s13059-020-01963-y (final published version)
Publication Year
2020
Language
English
Journal title
Genome biology
Issue number
1
Volume number
21
Pages (from-to)
1-16
Collections
Institutional Repository
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

The practical use of graph-based reference genomes depends on the ability to align reads to them. Performing substring queries on paths through these graphs lies at the core of this task. The combination of increasing pattern length and encoded variation inevitably leads to a combinatorial explosion of the search space. Instead of relying on heuristic filtering or pruning steps to reduce this complexity, we propose CHOP, a method that constrains the search space by exploiting haplotype information, bounding the search space by the number of haplotypes so that a combinatorial explosion is prevented. We show that CHOP can be applied to large and complex datasets by applying it to a graph-based representation of the human genome encoding all 80 million variants reported by the 1000 Genomes Project.
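
The bounding argument in the abstract can be illustrated with a toy example. The sketch below is not CHOP's implementation. It assumes a hypothetical graph of three biallelic sites separated by a fixed flank, and assumes that only sequences spelled by observed haplotypes are kept. All names and data (`sites`, `flank`, `haplotypes`, `spell`, `kmers`) are made up for illustration.

```python
from itertools import product

# Hypothetical toy graph: three biallelic sites, each with two alleles,
# separated by a constant flanking sequence.
sites = [("A", "C"), ("G", "T"), ("A", "T")]
flank = "GG"


def spell(alleles):
    """Spell the sequence of a path that takes one allele per site."""
    return flank + flank.join(alleles) + flank


# Unconstrained search space: every combination of alleles is a path,
# so the number of paths doubles with each added site (2**3 here).
all_paths = {spell(choice) for choice in product(*sites)}

# Hypothetical haplotypes: the allele index (0 or 1) carried at each site.
haplotypes = [(0, 0, 0), (1, 1, 0), (1, 1, 0)]

# Haplotype-constrained search space: only sequences that some haplotype
# actually spells. Its size can never exceed len(haplotypes).
hap_paths = {
    spell(tuple(sites[i][a] for i, a in enumerate(hap))) for hap in haplotypes
}


def kmers(seqs, k):
    """Collect every substring of length k from a set of sequences."""
    return {s[i:i + k] for s in seqs for i in range(len(s) - k + 1)}


print(len(all_paths), len(hap_paths))
```

Adding a fourth site would double `len(all_paths)`, while `len(hap_paths)` stays at or below the number of haplotypes. The substrings available for querying shrink accordingly, since `kmers(hap_paths, k)` is always a subset of `kmers(all_paths, k)`.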