Application of Language Models to homogeneous catalysis

Master Thesis (2024)
Author(s)

J. de Korte (TU Delft - Applied Sciences)

Contributor(s)

A.V. Kalikadien – Mentor (TU Delft - ChemE/Inorganic Systems Engineering)

E.A. Pidko – Mentor (TU Delft - ChemE/Inorganic Systems Engineering)

Faculty
Applied Sciences
Research Group
ChemE/Inorganic Systems Engineering
Publication Year
2024
Language
English
Graduation Date
12-07-2024
Awarding Institution
Delft University of Technology
Programme
Chemical Engineering
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Asymmetric hydrogenation is a field of major interest for the pharmaceutical industry. Using these catalytic reactions instead of traditional stoichiometric reactions can reduce waste and energy consumption, and can open up routes to new intermediates, products, and synthesis pathways. Finding an optimal catalyst to produce a selected enantiomer remains difficult, however, requiring large investments of time and resources. Ligand performance can be determined experimentally using high-throughput experimentation (HTE) campaigns, supplemented with predictive methods that are either mechanism-based or mechanism-agnostic. Recent advances in mechanism-agnostic predictive methods include a wide range of studies using Machine Learning approaches, which rely mostly on molecular descriptors to represent the catalyst structure to the models. More recently, with the rise of natural language processing (NLP) models, string-based structural identifiers have been used to train Language Models to predict catalyst performance. In a recent study, a long short-term memory (LSTM) model was trained and used to predict the enantiomeric excess of a range of ligands for an asymmetric hydrogenation reaction. This work builds on the workflow of that study and validates the model performance reported in the corresponding paper. Furthermore, the model was applied and tuned to predict the enantiomeric excess of a range of ligands, using a dataset from the Inorganic Systems Engineering (ISE) research group.

Files

MEP_Report_Jan_de_Korte.pdf
(pdf | 2.62 MB)
License info not available