MUSICiAn: Genome-wide Identification of Genes Involved in DNA Repair via Control-Free Mutational Spectra Analysis

Journal Article (2026)
Author(s)

C.F. Seale (TU Delft - Pattern Recognition and Bioinformatics)

Marco Barazas (Leiden University Medical Center)

Robin van Schendel (Leiden University Medical Center)

Marcel Tijsterman (Leiden University Medical Center)

Joana P. Gonçalves (TU Delft - Pattern Recognition and Bioinformatics)

Research Group
Pattern Recognition and Bioinformatics
DOI related publication
https://doi.org/10.1093/nargab/lqaf202
Publication Year
2026
Language
English
Issue number
1
Volume number
8
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Understanding DNA double-strand break (DSB) repair is crucial for the development of targeted anticancer therapies, yet the roles of many genes remain unclear. Recent studies show that disruption of known DSB repair genes can alter the sequence-specific distribution of mutations arising after DSB repair, suggesting that genome-wide perturbation screens could be leveraged to identify new DSB repair genes whose disruption leads to distinct deviations from the expected wild-type distribution. Given the challenges of designing controls for a genome-wide screen, we exploit the high gene throughput to forgo traditional controls by reframing the analysis as an outlier detection problem, assuming that most genes have minimal influence on DSB repair outcomes. We propose MUSICiAn (Mutational Signature Catalogue Analysis), a compositional data analysis method that ranks the impact of gene perturbations on mutational spectra without controls by measuring each spectrum's deviation from the central tendency of the distribution of all spectra. We show that MUSICiAn effectively estimates pseudo-controls for the Repair-seq screen, which covers 476 genes and 60 nontargeting controls. We further apply MUSICiAn to MUSIC, the first genome-wide screen of 18 406 genes with mutational spectra readout, and report that it recovers known DSB repair genes, highlights the spliceosome as a lesser-appreciated player, and reveals candidates for further investigation.
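The control-free idea described in the abstract can be illustrated with a minimal sketch. This is not the published MUSICiAn implementation: it assumes a centered log-ratio (CLR) transform as the compositional step, a coordinate-wise median as the pseudo-control, and Euclidean distance in CLR space as the deviation score. The function names, the pseudocount, and the simulated data (including the gene label `hypothetical_repair_gene`) are all made up for illustration.

```python
import numpy as np


def clr(spectra, pseudocount=1e-6):
    """Centered log-ratio transform; each row is one gene's mutational spectrum."""
    x = np.asarray(spectra, dtype=float) + pseudocount  # avoid log(0)
    x = x / x.sum(axis=1, keepdims=True)                # re-close to proportions
    logx = np.log(x)
    return logx - logx.mean(axis=1, keepdims=True)


def rank_by_deviation(spectra, gene_names):
    """Rank genes by distance from the central tendency of all spectra.

    Assumption: most genes have little effect, so the coordinate-wise median
    of the CLR-transformed spectra stands in for a wild-type control.
    """
    z = clr(spectra)
    pseudo_control = np.median(z, axis=0)
    distance = np.linalg.norm(z - pseudo_control, axis=1)
    order = np.argsort(-distance)  # largest deviation first
    return [(gene_names[i], float(distance[i])) for i in order]


# Simulated data: 200 near-wild-type spectra over three outcome categories,
# plus one perturbation with a strongly shifted spectrum.
rng = np.random.default_rng(0)
background = rng.dirichlet([50, 30, 20], size=200)
shifted = np.array([[0.1, 0.2, 0.7]])
spectra = np.vstack([background, shifted])
names = [f"gene_{i}" for i in range(200)] + ["hypothetical_repair_gene"]

ranking = rank_by_deviation(spectra, names)
print(ranking[:3])
```

In this toy setting the shifted spectrum ranks first because it lies far from the bulk of the data. A real analysis would also need to handle replicate variability, read-depth effects, and correlated outcome categories, none of which this sketch addresses.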