J.M. Weber

15 records found

Grounding Large Language Models (LLMs) in chemical knowledge graphs (KGs) offers a promising way to support synthesis planning, but reliably retrieving information from these complex structures remains a challenge. This work addresses that gap by constructing a biparti ...
This thesis explores the automated construction of Chemical Reaction Networks (CRNs) from incomplete experimental data, a task traditionally dependent on expert knowledge and manual effort. CRNs model the interactions between chemical species through a network of reactions and ar ...
Online databases contain extensive collections of (bio)chemical reactions that serve as valuable resources for a variety of applications. However, these large datasets often suffer from incomplete reaction data, missing, for example, co-reactants and by-products. Machine learning can ...
Synthetic polymers are crucial in diverse industries, but current AI-driven design methodologies primarily target linear homopolymers, with limited emphasis on developing customized approaches for copolymers. To address this gap, we introduce a generative model for goal-directed ...
Recent advancements in machine learning (ML) have shown promise in accelerating polymer discovery by aiding in tasks such as virtual screening via property prediction, and the design of new polymer materials with desired chemical properties. However, progress in polymer ML is ham ...
AI explainers are tools capable of approximating how a neural network arrived at a given prediction by providing parts of the input data most relevant for the model’s choice. These tools have become a major point of research due to a need for human-verifiable predictions in m ...
As graph neural networks (GNNs) become more frequently used in the biomedical field, there is a growing need to provide insight into how their predictions are made. An algorithm that does this is GNN-SubNet, developed with the aim of detecting disease subnetworks in protein-prote ...
Graph neural networks (GNNs), while effective at various tasks on complex graph-structured data, lack interpretability. Post-hoc explainability techniques developed for these GNNs in order to overcome their inherent uninterpretability have been applied to the additional task of d ...
Predicting properties such as toxicity or water solubility of unknown molecules with Graph Neural Networks has applications in drug research. Because of the ethical concerns associated with using artificial intelligence techniques in the medical field, explainable artificial int ...
Proteins are fundamental biological macromolecules essential for cellular structure, enzymatic catalysis, and immune defense, making the generation of novel proteins crucial for advancements in medicine, biotechnology, and material sciences. This study explores protein design usi ...
This study evaluates how the explainer for a Graph Neural Network creates explanations for chemical property prediction tasks. Explanations are masks over input molecules that indicate the importance of atoms and bonds toward the model output. Although these explainers have bee ...
Advancing protein design is crucial for breakthroughs in medicine and biotechnology, yet traditional approaches often fall short by focusing solely on representing protein sequences using the 20 canonical amino acids. This thesis explores discrete diffusion models for generating ...
Accurately predicting enzyme-substrate interactions is critical for applications in drug discovery, biocatalysis and protein engineering. Building upon the ProSmith algorithm, a machine learning framework with a multimodal transformer for protein-small molecule interaction predic ...
Large chemical reaction databases often suffer from incompleteness, such as missing molecules or stoichiometric information. Concurrently, numerous computational models are being developed in predictive chemistry that rely on reaction databases and would hugely benefit from compl ...
Dataset discovery techniques originally required datasets to share the same domain, which made them unsuitable for use on a larger scale. To avoid this requirement, newer techniques use additional information, aside from the datasets being processed, to better understand the dat ...