J.M. Weber | TU Delft Repository

Grounding Large Language Models in Reaction Knowledge Graphs for Synthesis Retrieval

Master thesis (2025) - O.A. Bunkova (author), Marcel JT Reinders (mentor), Jana M. Weber (mentor), L. Di Fruscia (mentor), S. Rupprecht (mentor), Christoph Lofi (graduation committee member)

Grounding Large Language Models (LLMs) in chemical knowledge graphs (KGs) offers a promising way to support synthesis planning, but reliably retrieving information from these complex structures remains a challenge. This work addresses that gap by constructing a biparti ...

Automated Discovery of Chemical Reaction Networks using Program Synthesis

Master thesis (2025) - R.A. Wijers (author), S. Dumančić (mentor), R.J. Gardos Reid (mentor), J. Weber (mentor), N. Yorke-Smith (graduation committee member), J.A. Baaijens (graduation committee member)

This thesis explores the automated construction of Chemical Reaction Networks (CRNs) from incomplete experimental data, a task traditionally dependent on expert knowledge and manual effort. CRNs model the interactions between chemical species through a network of reactions and ar ...

Improving Chemical Reaction Completion using Atom-Balance Constraints in Transformer Models

Master thesis (2025) - M.T.W. Noordsij (author), Jana M. Weber (mentor), Marcel Reinders (mentor), G. Vogel (mentor), J. Yang (graduation committee member)

Online databases contain extensive collections of (bio)chemical reactions serving as valuable resources for a variety of applications. However, these large datasets often suffer from incomplete reaction data, missing, for example, co-reactants and by-products. Machine learning can ...

Deep Reinforcement Learning for Inverse Synthetic Polymer Design

Master thesis (2024) - M.P.C. van der Werf (author), G. Vogel (mentor), Jana M. Weber (mentor), M. J.T. Reinders (graduation committee member), Megha Khosla (graduation committee member)

Synthetic polymers are crucial in diverse industries, but current AI-driven design methodologies primarily target linear homopolymers, with limited emphasis on developing customized approaches for copolymers. To address this gap, we introduce a generative model for goal-directed ...

Joint Embedding Predictive Architecture for Self-supervised Pretraining on Polymer Molecular Graphs

Master thesis (2024) - F. Piccoli (author), G. Vogel (mentor), J. Weber (mentor), MJT Reinders (graduation committee member), M. Khosla (graduation committee member)

Recent advancements in machine learning (ML) have shown promise in accelerating polymer discovery by aiding in tasks such as virtual screening via property prediction, and the design of new polymer materials with desired chemical properties. However, progress in polymer ML is ham ...

Influence of graph neural network architecture on explainability of protein-protein interaction networks

Bachelor thesis (2024) - H.T. Janczak (author), M. Khosla (mentor), Jana M. Weber (mentor), Thomas Abeel (graduation committee member)

AI explainers are tools capable of approximating how a neural network arrived at a given prediction by providing parts of the input data most relevant for the model’s choice. These tools have become a major point of research due to a need for human-verifiable predictions in m ...

Modified GNN-SubNet: leveraging local versus global Graph Neural Network explanations for disease subnetwork detection

Bachelor thesis (2024) - E. Milchi (author), M. Khosla (mentor), Jana M. Weber (mentor), Thomas Abeel (coach)

As graph neural networks (GNNs) become more frequently used in the biomedical field, there is a growing need to provide insight into how their predictions are made. An algorithm that does this is GNN-SubNet, developed with the aim of detecting disease subnetworks in protein-prote ...

Evaluating the Explainability of Graph Neural Networks for Disease Subnetwork Detection

Bachelor thesis (2024) - S. Rajesh (author), M. Khosla (mentor), Jana M. Weber (mentor), Thomas Abeel (graduation committee member)

Graph neural networks (GNNs), while effective at various tasks on complex graph-structured data, lack interpretability. Post-hoc explainability techniques developed for these GNNs in order to overcome their inherent uninterpretability have been applied to the additional task of d ...

Evaluating GNN Explainer Faithfulness in Molecular Property Prediction Using Comprehensiveness and Sufficiency

Bachelor thesis (2024) - H.V.M. Pajari (author), M. Khosla (mentor), Jana M. Weber (mentor), Thomas Abeel (graduation committee member)

Predicting properties of unknown molecules, such as toxicity or water solubility, with Graph Neural Networks has applications in drug research. Because of the ethical concerns associated with using artificial intelligence techniques in the medical field, explainable artificial int ...

Protein Structure and Sequence Co-Design through Graph Based Generative Diffusion Modeling

Master thesis (2024) - M.H. Bhuradia (author), J.M. Weber (mentor), Hadi Jamali-Rad (mentor), Amelia Villegas Morcillo (mentor), Marcel Reinders (graduation committee member), J.W. Böhmer (graduation committee member)

Proteins are fundamental biological macromolecules essential for cellular structure, enzymatic catalysis, and immune defense, making the generation of novel proteins crucial for advancements in medicine, biotechnology, and material sciences. This study explores protein design usi ...

Influence of molecular structures on graph neural network explainers' performance

Bachelor thesis (2024) - T.N. Stols (author), M. Khosla (mentor), Jana M. Weber (mentor), Thomas Abeel (coach)

This study evaluates how the explainer for a Graph Neural Network creates explanations for chemical property prediction tasks. Explanations are masks over input molecules that indicate the importance of atoms and bonds toward the model output. Although these explainers have bee ...

All-Atom Novel Protein Sequence Generation Using Discrete Diffusion

Master thesis (2024) - G.J. Admiraal (author), Amelia Villegas Morcillo (mentor), J.M. Weber (mentor), Marcel JT Reinders (mentor), J.W. Böhmer (graduation committee member)

Advancing protein design is crucial for breakthroughs in medicine and biotechnology, yet traditional approaches often fall short by focusing solely on representing protein sequences using the 20 canonical amino acids. This thesis explores discrete diffusion models for generating ...

Modality fusion strategies in a transformer-based algorithm predicting enzyme-substrate interactions

Master thesis (2024) - G.D. Trevnenski (author), Marcel Reinders (mentor), Jana M. Weber (mentor), L. Di Fruscia (mentor)

Accurately predicting enzyme-substrate interactions is critical for applications in drug discovery, biocatalysis and protein engineering. Building upon the ProSmith algorithm, a machine learning framework with a multimodal transformer for protein-small molecule interaction predic ...

Chemical reaction completion: a hybrid rule-based and language model-based approach

Master thesis (2023) - M.C. van Wijngaarden (author), Jana M. Weber (mentor), Marcel Reinders (graduation committee member), G. Vogel (coach)

Large chemical reaction databases often suffer from incompleteness, such as missing molecules or stoichiometric information. Concurrently, numerous computational models are being developed in predictive chemistry that rely on reaction databases and would hugely benefit from compl ...

Domain-focused dataset discovery for tabular datasets, using easily-available information about the domain

Master thesis (2022) - R.M.S. Mokiem (author), Christoph Lofi (mentor), Geert Jan Houben (graduation committee member), J.M. Weber (coach)

Dataset discovery techniques originally required datasets to have the same domain, which made them unsuitable for use on a larger scale. To avoid this requirement, newer techniques use additional information, aside from the datasets being processed, to better understand the dat ...