Data-Driven Virtual Screening of Conformational Ensembles of Transition-Metal Complexes

Journal Article (2025)
Author(s)

Sára Finta (Student TU Delft)

Adarsh V. Kalikadien (TU Delft - ChemE/Inorganic Systems Engineering)

Evgeny Pidko (TU Delft - ChemE/Inorganic Systems Engineering)

Research Group
ChemE/Inorganic Systems Engineering
DOI related publication
https://doi.org/10.1021/acs.jctc.5c00303
Publication Year
2025
Language
English
Issue number
10
Volume number
21
Pages (from-to)
5334-5345
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Transition-metal complexes serve as highly enantioselective homogeneous catalysts for various transformations, making them valuable in the pharmaceutical industry. Data-driven prediction models can accelerate high-throughput catalyst design but require computer-readable representations that account for conformational flexibility. This is typically achieved through high-level conformer searches, followed by DFT optimization of the transition-metal complexes. However, conformer selection remains reliant on human assumptions, with no cost-efficient and generalizable workflow available. To address this, we introduce an automated approach to correlate CREST(GFN2-xTB//GFN-FF)-generated conformer ensembles with their DFT-optimized counterparts for systematic conformer selection. We analyzed 24 precatalyst structures, performing CREST conformer searches, followed by full DFT optimization. Three filtering methods were evaluated: (i) geometric ligand descriptors, (ii) PCA-based selection, and (iii) DBSCAN clustering using RMSD and energy. The proposed methods were validated on Rh-based catalysts featuring bisphosphine ligands, which are widely employed in hydrogenation reactions. To assess general applicability, both the precatalyst and its corresponding acrylate-bound complex were analyzed. Our results confirm that CREST overestimates ligand flexibility, and energy-based filtering is ineffective. PCA-based selection failed to distinguish conformers by DFT energy, while RMSD-based filtering improved selection but lacked tunability. DBSCAN clustering provided the most effective approach, eliminating redundancies while preserving key configurations. This method remained robust across data sets and is computationally efficient without requiring molecular descriptor calculations. These findings highlight the limitations of energy-based filtering and the advantages of structure-based approaches for conformer selection. 
While DBSCAN clustering is a practical solution, its parameters remain system-dependent. For high-accuracy applications, refined energy calculations may be necessary; however, DBSCAN-based clustering offers a computationally accessible strategy for rapid catalyst representations involving conformational flexibility.
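
The clustering step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how RMSD and energy are combined, so the sketch assumes a precomputed pairwise RMSD matrix is clustered with DBSCAN and the lowest-energy conformer of each cluster is kept. The names `dbscan_precomputed` and `select_representatives`, the toy RMSD values, the relative energies, and the `eps` and `min_samples` settings are all hypothetical. The algorithm is written in plain Python to keep the example self-contained; in practice a library routine such as scikit-learn's `DBSCAN` with `metric="precomputed"` accepts the same kind of matrix.

```python
from collections import deque


def dbscan_precomputed(dist, eps, min_samples):
    """Label points with DBSCAN given a symmetric distance matrix.

    Returns one cluster id per point; -1 marks noise.
    """
    n = len(dist)
    # A neighborhood includes the point itself, as in the usual definition.
    neighbors = [[j for j in range(n) if dist[i][j] <= eps] for i in range(n)]
    labels = [None] * n  # None means "not visited yet"
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbors[i]) < min_samples:
            labels[i] = -1  # provisional noise; may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = deque(neighbors[i])
        while queue:
            j = queue.popleft()
            if labels[j] == -1:
                labels[j] = cluster  # noise reachable from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbors[j]) >= min_samples:
                queue.extend(neighbors[j])  # j is a core point: keep growing
    return labels


def select_representatives(labels, energies):
    """Keep the lowest-energy member of each cluster plus all noise points.

    Retaining noise points is an assumption of this sketch: they are
    structurally unique, so dropping them would discard distinct geometries.
    """
    best = {}
    keep = []
    for idx, (lab, energy) in enumerate(zip(labels, energies)):
        if lab == -1:
            keep.append(idx)
        elif lab not in best or energy < energies[best[lab]]:
            best[lab] = idx
    return sorted(keep + list(best.values()))


# Hypothetical pairwise RMSD matrix for six conformers (toy values).
rmsd = [
    [0.0, 0.2, 0.3, 2.0, 2.1, 3.0],
    [0.2, 0.0, 0.25, 2.2, 2.0, 3.1],
    [0.3, 0.25, 0.0, 2.1, 2.3, 3.2],
    [2.0, 2.2, 2.1, 0.0, 0.25, 2.9],
    [2.1, 2.0, 2.3, 0.25, 0.0, 3.0],
    [3.0, 3.1, 3.2, 2.9, 3.0, 0.0],
]
# Hypothetical relative energies, one per conformer.
energies = [1.2, 0.0, 0.8, 2.5, 2.1, 4.0]

labels = dbscan_precomputed(rmsd, eps=0.5, min_samples=2)
selected = select_representatives(labels, energies)
print(labels)    # cluster id per conformer
print(selected)  # indices of conformers kept for further calculations
```

On this toy input the six conformers collapse to three representatives: one per structural cluster plus the isolated conformer. The values of `eps` and `min_samples` control how aggressively redundant structures are merged, which is consistent with the abstract's remark that the DBSCAN parameters remain system-dependent.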