Peaks on Trial

Deep Learning for Allelic Peak Classification in Forensic DNA Electropherograms

Master Thesis (2026)
Author(s)

W.W. Büthker (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

Rolf Ypma – Mentor (Nederlands Forensisch Instituut (NFI))

Thomas Abeel – Mentor (TU Delft - Electrical Engineering, Mathematics and Computer Science)

T. Höllt – Graduation committee member (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Faculty
Electrical Engineering, Mathematics and Computer Science
Publication Year
2026
Language
English
Graduation Date
23-02-2026
Awarding Institution
Delft University of Technology
Programme
Computer Science
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Allele calling is a critical step in forensic DNA analysis, and better automation could increase throughput and consistency in casework. However, few studies systematically compare machine-learning architectures for allele calling, and limited high-quality training data constrain progress.
We designed and evaluated a peak-based allele-calling pipeline that makes peak-level classifications rather than profile-level segmentations. The pipeline comprises three models (the Peak Model, the Autoencoder Model, and the Combined Model), which we evaluated on datasets with ground-truth annotations and on datasets with imperfect analyst annotations, and compared against state-of-the-art baselines.
The Combined Model outperformed DNANet on NFI research data with ground-truth annotations (pixel F1 0.934 vs. 0.923; p = 0.001). Ablation experiments showed that each component contributed to performance. Autoencoder pretraining improved accuracy when training data were scarce (fewer than 1,000 DNA profiles). Error analysis further indicated that small peaks are harder to classify as allelic (true DNA) or artefactual than medium and high peaks.
Overall, peak-based classification improves allele-calling performance over current models and clarifies key failure regimes, bringing fully automated allele calling closer to forensic deployment.
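
For readers unfamiliar with the pixel F1 metric quoted above, the following is a minimal sketch of how such a score could be computed. It assumes that "pixel F1" means the F1 score over per-scan-point binary labels (1 = allelic, 0 = not allelic); this record does not give the thesis's exact definition, and the function name `pixel_f1` and the mask layout are hypothetical.

```python
import numpy as np


def pixel_f1(pred_mask, true_mask):
    """F1 over per-pixel binary labels (assumed: 1 = allelic, 0 = not allelic).

    Hypothetical helper for illustration; not the thesis's implementation.
    """
    pred = np.asarray(pred_mask, dtype=bool)
    true = np.asarray(true_mask, dtype=bool)

    tp = np.sum(pred & true)    # pixels correctly marked allelic
    fp = np.sum(pred & ~true)   # pixels wrongly marked allelic
    fn = np.sum(~pred & true)   # allelic pixels that were missed

    denom = 2 * tp + fp + fn
    # Convention chosen here: return 0.0 when there are no positives at all.
    return float(2 * tp / denom) if denom else 0.0


# Toy example: one true positive, one false positive, one false negative.
score = pixel_f1([1, 1, 0, 0], [1, 0, 1, 0])
```

Because true negatives do not enter the formula, the score is not inflated by the large stretches of baseline between peaks, which is one plausible reason to report F1 rather than plain accuracy for this kind of signal.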

Files

Peaks_on_Trial.pdf
(pdf | 6.18 MB)
License info not available