ConforFormer

representation for molecules through understanding of conformers

Journal Article (2026)
Author(s)

Mas Pieter Klein (Student TU Delft)

Irina Rudenko (Avride Inc., Austin)

Evgeny A. Pidko (TU Delft - Applied Sciences, TU Delft - Applied Sciences)

Ivan Bushmarinov (Perplexity AI, Belgrade)

Research Group
ChemE/Inorganic Systems Engineering
DOI related publication
https://doi.org/10.1039/d6dd00096g Final published version
More Info
expand_more
Publication Year
2026
Language
English
Research Group
ChemE/Inorganic Systems Engineering
Journal title
Digital Discovery
Downloads counter
5
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Molecular properties of chemical compounds are governed not by a single unique arrangement of atoms (2D molecular graph) but by ensembles of three-dimensional conformers, yet most molecular representations for machine learning approaches either ignore conformational diversity or use it implicitly to augment molecular graphs. Here we introduce ConforFormer, a geometry-first foundation model capable of learning conformation-robust molecular embeddings directly from the 3D atomic coordinates. By aligning representations across multiple conformers of the same molecules through a novel contrastive objective, ConforFormer produces compact, task-agnostic embeddings that can be generated once and directly applied to downstream tasks, including property prediction and structural similarity, without extensive fine-tuning. Across a range of quantum-chemical and bioactivity benchmarks, these frozen embeddings achieve competitive performance without task-specific fine-tuning, while offering improved stability on small datasets. Beyond property prediction, the learned embedding space allows to discriminate with high-precision molecular conformers and isomers, substantially outperforming classical fingerprint-based similarity measures. This implies that explicit exposure to conformational relationships induces representations that generalize beyond the conformer recognition task itself, capturing chemically meaningful structural constraints directly from 3D geometries. More broadly, our results suggest that incorporating conformation-awareness as a foundational learning task provides a fundamental route towards transferable, geometry-centered molecular representations particularly relevant for complex chemical systems, where conventional graph-based representations are ambiguous or ill-defined.