Attention-based deep learning for DNA repair outcome prediction
Learning how the cell repairs DNA breaks using local sequence context
J.H.D. de Boer (TU Delft - Electrical Engineering, Mathematics and Computer Science)
Joana P. Goncalves – Mentor (TU Delft - Pattern Recognition and Bioinformatics)
Marcel J.T. Reinders – Graduation committee member (TU Delft - Pattern Recognition and Bioinformatics)
C.F. Seale – Graduation committee member (TU Delft - Pattern Recognition and Bioinformatics)
O.E. Scharenborg – Graduation committee member (TU Delft - Multimedia Computing)
Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.
Abstract
Recent advances in quantifying the repair outcomes of CRISPR-Cas9-mediated double-stranded DNA breaks (DSBs) have made it possible to use machine learning to predict the frequencies of these outcomes. Local DNA sequence context influences the frequencies of the mutations that arise when DNA is repaired after being targeted by CRISPR (CRISPR outcomes). Contemporary models exploit this and can predict the frequencies of CRISPR outcomes at predetermined genomic loci. The predictions of such models are reasonably precise, but there may be room for improvement in how the DNA sequence context is leveraged. Some models use only a set of hand-crafted features, which limits the information available to the model. Other models use broader sequence context but disregard sequence order or predict only a limited set of outcome classes. In this work, we present an attention-based deep learning model that uses DNA sequence context to make fine-grained CRISPR outcome predictions. We also present a custom input embedding for representing DSB repair outcomes, and we expand on existing methods for analyzing attention-based models.