Modality fusion strategies in a transformer-based algorithm predicting enzyme-substrate interactions

Master Thesis (2024)
Author(s)

G.D. Trevnenski (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

Marcel J.T. Reinders – Mentor (TU Delft - Pattern Recognition and Bioinformatics)

Jana M. Weber – Mentor (TU Delft - Pattern Recognition and Bioinformatics)

L. Di Fruscia – Mentor (TU Delft - Pattern Recognition and Bioinformatics)

Faculty
Electrical Engineering, Mathematics and Computer Science
More Info
Publication Year
2024
Language
English
Graduation Date
12-12-2024
Awarding Institution
Delft University of Technology
Programme
Computer Science
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Accurately predicting enzyme-substrate interactions is critical for applications in drug discovery, biocatalysis, and protein engineering. This study builds on ProSmith, a machine learning framework that uses a multimodal transformer to predict protein-small molecule interactions, and introduces protein 3D structural data as an additional modality. To integrate this data, we explore additive and multiplicative modality fusion strategies that do not require retraining the original transformer from scratch. Our experiments show that incorporating structural data does not improve performance on random splits, but that it has the potential to surpass ProSmith on challenging data splits involving unseen small molecules. Notably, the model generalizes better to underrepresented substrates.
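
To make the two fusion families named in the abstract concrete, the following is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the thesis implementation: the function names, the linear projection, the `1 + e` gating form, and the toy numbers are all hypothetical. One assumed design choice is worth noting: if the projection starts at zero, both variants return the base embedding unchanged, which is one way a new modality could be attached to a pretrained model without disturbing it at the start of fine-tuning.

```python
from typing import List

Vector = List[float]


def project(weights: List[Vector], x: Vector) -> Vector:
    """Map a structure embedding to the base embedding size (hypothetical linear layer)."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def additive_fusion(base: Vector, extra: Vector) -> Vector:
    """Additive fusion: element-wise sum of the two embeddings."""
    return [b + e for b, e in zip(base, extra)]


def multiplicative_fusion(base: Vector, extra: Vector) -> Vector:
    """Multiplicative fusion: element-wise gating of the base embedding.

    Assumption: the gate is (1 + e), so a zero `extra` leaves `base` unchanged.
    """
    return [b * (1.0 + e) for b, e in zip(base, extra)]


# Toy values, chosen only for illustration.
base_embedding = [0.5, -1.0, 2.0]   # stands in for a pretrained token embedding
structure_embedding = [1.0, 2.0]    # stands in for a 3D-structure feature vector
weights = [[0.1, 0.0], [0.0, 0.2], [0.3, -0.1]]

projected = project(weights, structure_embedding)
fused_add = additive_fusion(base_embedding, projected)
fused_mul = multiplicative_fusion(base_embedding, projected)

# With an all-zero projection, both strategies reduce to the base embedding.
zero_weights = [[0.0, 0.0] for _ in base_embedding]
zero_projected = project(zero_weights, structure_embedding)
unchanged_add = additive_fusion(base_embedding, zero_projected)
unchanged_mul = multiplicative_fusion(base_embedding, zero_projected)
```

In this sketch the additive variant shifts the base embedding, while the multiplicative variant rescales it, so the structural signal can only modulate features the base model already activates.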

Files

License info not available