Peaks on Trial
Deep Learning for Allelic Peak Classification in Forensic DNA Electropherograms
W.W. Büthker (TU Delft - Electrical Engineering, Mathematics and Computer Science)
Rolf Ypma – Mentor (Nederlands Forensisch Instituut (NFI))
Thomas Abeel – Mentor (TU Delft - Electrical Engineering, Mathematics and Computer Science)
T. Höllt – Graduation committee member (TU Delft - Electrical Engineering, Mathematics and Computer Science)
Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.
Abstract
Allele calling is a critical step in forensic DNA analysis, and better automation could increase throughput and consistency in casework. However, few studies systematically compare machine-learning architectures for allele calling, and limited high-quality training data constrain progress.
We designed and evaluated a peak-based allele-calling pipeline that classifies individual peaks rather than segmenting whole profiles. The pipeline comprises three models (the Peak Model, the Autoencoder Model, and the Combined Model), which we evaluated on datasets with ground-truth annotations and on datasets with imperfect analyst annotations, and compared against state-of-the-art baselines.
The Combined Model outperformed DNANet on NFI research data with ground-truth annotations (pixel F1 0.934 vs. 0.923; p = 0.001). Ablation experiments showed that each component contributed to performance. Autoencoder pretraining improved accuracy when training data were scarce (fewer than 1,000 DNA profiles). Error analysis further indicated that small peaks are harder to classify as allelic (true DNA) or artefactual than medium and high peaks.
Overall, peak-based classification improves allele-calling performance over current models and clarifies key failure regimes, bringing fully automated allele calling closer to forensic deployment.
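The abstract reports results as a pixel F1 score without defining it. As a minimal sketch, assuming pixel F1 means the standard F1 score computed over per-position binary labels (1 = allelic, 0 = not allelic) along an electropherogram trace, it could be computed as follows. The function name `pixel_f1` and the toy label sequences are hypothetical and not taken from the thesis, whose exact definition may differ.

```python
def pixel_f1(y_true, y_pred):
    """F1 score over per-position binary labels (assumed definition)."""
    if len(y_true) != len(y_pred):
        raise ValueError("label sequences must have the same length")

    # Count true positives, false positives and false negatives per position.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)

    denom = 2 * tp + fp + fn
    # With no positives in either sequence, F1 is undefined; return 0.0 here.
    return 2 * tp / denom if denom else 0.0


# Toy example (hypothetical data): 10 positions of one trace.
truth = [0, 0, 1, 1, 1, 0, 0, 1, 1, 0]
pred = [0, 1, 1, 1, 0, 0, 0, 1, 1, 0]
score = pixel_f1(truth, pred)  # tp=4, fp=1, fn=1
```

Under this assumed definition, a peak-based classifier can be scored on the same footing as a segmentation model by projecting each peak's predicted label onto the positions that the peak spans.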