Automated data-driven exploration of chemical space for catalysts

Abstract

Catalysts play an essential role in daily life. They are used in many industries to make processes more energetically favourable. Climate change is pushing humanity towards greater use of green energy, and catalysts play an important role in this transition. For example, in the hydrogenation reactions used for H2 storage, the catalyst mediates the storage of H2 on, and its release from, a storage medium such as CO2. The properties of the catalyst involved in this (de)hydrogenation reaction can affect the selectivity and yield of the reaction. Designing a catalyst that maximizes the property of interest (yield, for example) for a specific reaction is therefore essential for tuning catalyzed processes. Computational screening of large numbers of catalysts has attracted the attention of academia and industry due to constant developments in the field of computational chemistry. In these computational methods, predictive models together with DFT and/or DFTB methods can be used to correlate a set of reaction descriptors with catalyst properties. A model is more likely to find novel molecules with high activity when more (reliable) training data is used and when the search space of the model is confined to a local chemical space. This means that molecules newly added for screening should be structurally closely related to the molecule that was used to build the model. Unfortunately, large data sets are not readily available for transition-metal-containing complexes, although these complexes are widely applied in the field of homogeneous catalysis. This research introduces ChemSpaX, a Python-based workflow aimed at automating local chemical space exploration for any type of molecule. This workflow enables the user to place fragments on molecules based on 3D information, while staying close to the quality of the initial structure.
This enables data-driven property calculations and prediction models, which could eventually be extended towards the automated design of new catalysts. Various representative applications of ChemSpaX are presented in which data-driven xTB and DFT property calculations are performed. The correlations found between catalyst properties are reported, and ChemSpaX is shown to generate structures of reasonable quality for use in data-driven prediction models for high-throughput screening.