Learning Signature Exposures from Gene Expression at Single-Cell Resolution
Regular vs. Multitask Learning of Individual Regression Models
A. Potolski Eilat (TU Delft - Electrical Engineering, Mathematics and Computer Science)
Joana P. Gonçalves – Mentor (TU Delft - Pattern Recognition and Bioinformatics)
S. Costa – Mentor (TU Delft - Pattern Recognition and Bioinformatics)
I. Stresec – Mentor (TU Delft - Pattern Recognition and Bioinformatics)
Catherine Oertel – Graduation committee member (TU Delft - Interactive Intelligence)
Abstract
Understanding the mutational processes active within cancer cells is essential to improving diagnosis and treatment strategies. This study investigates whether the activity levels of these processes, quantified as mutational signature exposures, can be predicted from single-cell gene expression data. Two regression-based learning paradigms are compared: regular independent modelling, where the model for each mutational signature selects its own regularisation parameter and its own set of genes, and multitask modelling, where the models for all signatures agree on a common set of genes and share a single regularisation parameter. We evaluate the predictive performance and interpretability of both paradigms using biologically informed metrics. Furthermore, we assess the models’ robustness on unseen data by simulating real-world distribution shifts through clustering-based data splits. Our results show that while both paradigms achieve reasonable predictive accuracy, independently trained models offer greater flexibility and interpretability by identifying signature-specific genes and regularisation strengths. These findings suggest that gene expression carries meaningful information about a cell’s mutational history and that signature-specific modelling may offer better biological insight into tumour heterogeneity.
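The difference between the two paradigms can be illustrated with a minimal sketch. The abstract does not name the estimators used, so this example assumes Lasso-type penalties and uses scikit-learn's `LassoCV` (one model per signature) and `MultiTaskLassoCV` (one joint model) as stand-ins. The data are synthetic, and the sizes, variable names, and coefficients are hypothetical choices made only for illustration.

```python
import numpy as np
from sklearn.linear_model import LassoCV, MultiTaskLassoCV

rng = np.random.default_rng(0)

# Synthetic stand-in data (NOT from the thesis): rows are cells,
# columns of X are genes, columns of Y are signature exposures.
n_cells, n_genes, n_signatures = 200, 30, 3
X = rng.normal(size=(n_cells, n_genes))

# Assumption for illustration: each signature depends on its own 3 genes.
true_coef = np.zeros((n_genes, n_signatures))
true_coef[0:3, 0] = 1.0
true_coef[3:6, 1] = 1.0
true_coef[6:9, 2] = 1.0
Y = X @ true_coef + 0.1 * rng.normal(size=(n_cells, n_signatures))

# Independent modelling: one Lasso per signature, each choosing its own
# regularisation strength (alpha_) and its own set of genes.
independent = [
    LassoCV(cv=5, random_state=0).fit(X, Y[:, k]) for k in range(n_signatures)
]
independent_alphas = [float(m.alpha_) for m in independent]
independent_genes = [set(np.flatnonzero(m.coef_).tolist()) for m in independent]

# Multitask modelling: a single joint model with one shared alpha_.
# Its group penalty keeps or drops each gene for all signatures together.
multitask = MultiTaskLassoCV(cv=5, random_state=0).fit(X, Y)
shared_alpha = float(multitask.alpha_)
# coef_ has shape (n_signatures, n_genes); a gene counts as selected
# if it has a non-zero coefficient for at least one signature.
shared_genes = set(
    np.flatnonzero(np.any(multitask.coef_ != 0, axis=0)).tolist()
)

for k in range(n_signatures):
    print(f"signature {k}: alpha={independent_alphas[k]:.4f}, "
          f"genes={sorted(independent_genes[k])}")
print(f"multitask: alpha={shared_alpha:.4f}, genes={sorted(shared_genes)}")
```

In this sketch the independent models can report a different gene set and regularisation strength per signature, whereas the multitask model reports one gene set and one strength for all signatures, which mirrors the flexibility trade-off described in the abstract.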