Machine learning and molecular descriptors enable rational solvent selection in asymmetric catalysis

Journal Article (2019)

Author(s)

Yehia Amar (University of Cambridge)

A.M. Schweidtmann (RWTH Aachen University)

Paul Deutsch (UCB Pharma)

Liwei Cao (Cambridge Centre for Advanced Research and Education in Singapore, University of Cambridge)

Alexei A. Lapkin (Cambridge Centre for Advanced Research and Education in Singapore, University of Cambridge)

Affiliation

External organisation

DOI related publication

https://doi.org/10.1039/c9sc01844a

To reference this document use:

https://resolver.tudelft.nl/uuid:5345fa02-fde7-48b3-8064-846f9c532d84

Publication Year

2019

Language

English

Issue number

27

Volume number

10

Pages (from-to)

6697-6706

Abstract

Rational solvent selection remains a significant challenge in process development. Here we describe a hybrid mechanistic-machine learning approach, geared towards automated process development workflows. A library of 459 solvents was used, for which 12 conventional molecular descriptors, two reaction-specific descriptors, and additional descriptors based on screening charge density were calculated. Gaussian process surrogate models were trained on experimental data from a Rh(CO)₂(acac)/Josiphos-catalysed asymmetric hydrogenation of a chiral α,β-unsaturated γ-lactam. With two simultaneous objectives (high conversion and high diastereomeric excess), the multi-objective algorithm, trained on the initial dataset of 25 solvents, identified solvents leading to better reaction outcomes. In addition to being a powerful design of experiments (DoE) methodology, the resulting Gaussian process surrogate model for conversion is, in statistical terms, predictive, with a cross-validation correlation coefficient of 0.84. After identifying promising solvents, the composition of solvent mixtures and the optimal reaction temperature were found using black-box Bayesian optimisation. We then demonstrated the application of a new genetic programming approach to select an appropriate machine learning model for a specific physical system, which should allow the transition of the overall process development workflow into future robotic laboratories.
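
The abstract's two simultaneous objectives (conversion and diastereomeric excess) mean that no single solvent has to be "best"; candidates are instead compared by Pareto dominance. The following is a minimal sketch of that idea only, not the authors' algorithm, which this record does not describe. The names `Outcome`, `dominates` and `pareto_front`, the solvent labels, and all numeric values are hypothetical and chosen purely for illustration.

```python
from typing import List, NamedTuple


class Outcome(NamedTuple):
    """One screened solvent with its two measured objectives (both maximised)."""
    solvent: str
    conversion: float  # percent
    de: float          # diastereomeric excess, percent


def dominates(a: Outcome, b: Outcome) -> bool:
    """True if a is at least as good as b in both objectives and strictly better in one."""
    at_least_as_good = a.conversion >= b.conversion and a.de >= b.de
    strictly_better = a.conversion > b.conversion or a.de > b.de
    return at_least_as_good and strictly_better


def pareto_front(outcomes: List[Outcome]) -> List[Outcome]:
    """Keep only outcomes that no other outcome dominates (input order preserved)."""
    return [o for o in outcomes
            if not any(dominates(other, o) for other in outcomes)]


# Hypothetical values for illustration only; these are NOT data from the paper.
results = [
    Outcome("solvent_A", 95.0, 60.0),
    Outcome("solvent_B", 80.0, 85.0),
    Outcome("solvent_C", 70.0, 70.0),  # worse than solvent_B in both objectives
    Outcome("solvent_D", 99.0, 40.0),
    Outcome("solvent_E", 60.0, 90.0),
]

front = pareto_front(results)
print([o.solvent for o in front])
```

In a workflow like the one the abstract outlines, such a filter would typically be applied to surrogate-model predictions over the whole solvent library rather than to a handful of measured points, so that the next experiments can be chosen from the predicted trade-off curve.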

Metadata only record. There are no files for this record.