Predicting the Maximum Loading in Zeolites for Hydroisomerization Applications

A Machine Learning Approach

Journal Article (2026)
Author(s)

Eric Johnsson (TU Delft - Flow Physics and Technology)

Shrinjay Sharma (Eindhoven University of Technology)

Arvind Gangoli Rao (TU Delft - Flight Performance and Propulsion)

David Dubbeldam (Science Park 904)

Sofia Calero (Eindhoven University of Technology)

Thijs J.H. Vlugt (TU Delft - Engineering Thermodynamics)

Research Group
Flight Performance and Propulsion
DOI related publication
https://doi.org/10.1021/acs.jpcc.5c08611
Publication Year
2026
Language
English
Journal title
Journal of Physical Chemistry C
Issue number
11
Volume number
130
Pages (from-to)
4299-4314
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Hydroisomerization of alkane isomers is an important step in the manufacture of current kerosene and sustainable aviation fuels. Zeolites are used as acid catalysts in this process. It is therefore important to have predictions of the adsorption capacity or maximum loading of hydrocarbons in zeolites. Here, a cascade model using machine learning models is used to predict the maximum loading of alkane isomers in zeolites. The cascade is composed of a gradient-boosted tree classifier stage that predicts whether adsorption occurs and a regressor predicting the value of the maximum loading. The final data set consists of 45 different adsorbates (both linear and branched alkanes up to C16) and 97 different zeolite structures, resulting in 4365 data points. Descriptors include information on the geometry and topology of zeolite channels as well as the shape and size of the adsorbates. Extra composite descriptors are also present to provide the physical basis for predictions. Multiple regressors of different natures are considered: support vector regressors, gradient-boosted trees, extreme gradient-boosted trees, and the TabPFN pretrained model. TabPFN yields the highest generalization performance and the lowest error. An interpretability analysis using SHAP reveals that the most influential descriptors are physically meaningful, highlighting steric and volumetric constraints as the primary factors controlling the prediction of qmax. It is shown that despite both the classifier and the regressor being insensitive to random splits in data, the regressor is prone to overfitting at low fractions of data withheld for testing. The cascade model is compared to an Artificial Neural Network for training and resource efficiency. Despite training being longer for the neural network, the final model is lighter in both memory and storage. This work is built on our previous research in predicting the Henry coefficients of long-chain alkanes in zeolites. Using this previous model and the findings of this work, one could construct the adsorption isotherm for any alkane, thus enabling the analysis of adsorption behavior of alkane mixtures using IAST.
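
The two-stage cascade described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes scikit-learn, uses a gradient-boosted regressor (one of the candidate regressors named above) in place of TabPFN, treats "adsorption occurs" as qmax > 0, and runs on synthetic three-column data rather than the zeolite and adsorbate descriptors of the paper. The class name `CascadeModel` and the toy size rule are hypothetical.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier, GradientBoostingRegressor


class CascadeModel:
    """Classifier gate followed by a regressor trained on adsorbing samples only."""

    def __init__(self):
        self.classifier = GradientBoostingClassifier(random_state=0)
        self.regressor = GradientBoostingRegressor(random_state=0)

    def fit(self, X, q_max):
        # Assumption: a sample counts as "adsorbing" when its loading is non-zero.
        adsorbs = q_max > 0.0
        self.classifier.fit(X, adsorbs)
        # The regressor never sees the zero-loading samples.
        self.regressor.fit(X[adsorbs], q_max[adsorbs])
        return self

    def predict(self, X):
        adsorbs = self.classifier.predict(X).astype(bool)
        q_pred = np.zeros(X.shape[0])  # stage 1 says "no adsorption" -> 0
        if adsorbs.any():
            q_pred[adsorbs] = self.regressor.predict(X[adsorbs])
        return q_pred


# Synthetic stand-in data (NOT the data set of the paper).
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(400, 3))
# Toy rule: the molecule (column 0) must be smaller than the pore (column 1).
fits = X[:, 0] < X[:, 1]
q_max = np.where(fits, 2.0 * X[:, 2] + 0.5, 0.0)

model = CascadeModel().fit(X[:300], q_max[:300])
q_pred = model.predict(X[300:])
print("mean absolute error on held-out samples:",
      np.mean(np.abs(q_pred - q_max[300:])))
```

Gating the regressor with a classifier keeps the many exact zeros out of the regression target, so the second stage only has to learn the loading of structures that actually admit the molecule.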