Predicting the Maximum Loading in Zeolites for Hydroisomerization Applications
A Machine Learning Approach
Eric Johnsson (TU Delft - Flow Physics and Technology)
Shrinjay Sharma (Eindhoven University of Technology)
Arvind Gangoli Rao (TU Delft - Flight Performance and Propulsion)
David Dubbeldam (Science Park 904)
Sofia Calero (Eindhoven University of Technology)
Thijs J.H. Vlugt (TU Delft - Engineering Thermodynamics)
Abstract
Hydroisomerization of alkane isomers is an important step in the manufacture of current kerosene and sustainable aviation fuels. Zeolites are used as acid catalysts in this process. It is therefore important to have predictions of the adsorption capacity, or maximum loading, of hydrocarbons in zeolites. Here, a cascade of machine learning models is used to predict the maximum loading of alkane isomers in zeolites. The cascade is composed of a gradient-boosted tree classifier stage that predicts whether adsorption occurs and a regressor stage that predicts the value of the maximum loading. The final data set consists of 45 different adsorbates (both linear and branched alkanes up to C16) and 97 different zeolite structures, resulting in 4365 data points. Descriptors include information on the geometry and topology of the zeolite channels as well as the shape and size of the adsorbates. Additional composite descriptors are included to provide a physical basis for the predictions. Several types of regressors are considered: support vector regressors, gradient-boosted trees, extreme gradient-boosted trees, and the pretrained TabPFN model. TabPFN yields the highest generalization performance and the lowest error. An interpretability analysis using SHAP reveals that the most influential descriptors are physically meaningful, highlighting steric and volumetric constraints as the primary factors controlling the prediction of the maximum loading, qmax. Although both the classifier and the regressor are insensitive to random splits of the data, the regressor is prone to overfitting when only a small fraction of the data is withheld for testing. The cascade model is compared to an Artificial Neural Network in terms of training and resource efficiency. Although training takes longer for the neural network, the final model is lighter in both memory and storage. This work builds on our previous research on predicting the Henry coefficients of long-chain alkanes in zeolites.
Using this previous model and the findings of this work, one could construct the adsorption isotherm for any alkane, thus enabling the analysis of the adsorption behavior of alkane mixtures using Ideal Adsorbed Solution Theory (IAST).
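The two-stage cascade described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the class name `CascadeModel`, the four synthetic descriptors, and the toy rule for when adsorption occurs are all hypothetical, and scikit-learn's `GradientBoostingRegressor` stands in for the TabPFN regressor that the paper reports as the best performer. The sketch only shows the control flow: a classifier decides whether adsorption occurs, and a regressor trained on the adsorbing pairs alone supplies the value of qmax.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier, GradientBoostingRegressor
from sklearn.model_selection import train_test_split


class CascadeModel:
    """Hypothetical two-stage cascade: classify first, then regress."""

    def __init__(self, classifier, regressor):
        self.classifier = classifier
        self.regressor = regressor

    def fit(self, X, q_max):
        # Stage 1: learn whether a zeolite/adsorbate pair adsorbs at all.
        adsorbs = q_max > 0.0
        self.classifier.fit(X, adsorbs)
        # Stage 2: learn the loading using only the adsorbing pairs.
        self.regressor.fit(X[adsorbs], q_max[adsorbs])
        return self

    def predict(self, X):
        # Pairs classified as non-adsorbing keep a loading of exactly zero.
        adsorbs = self.classifier.predict(X).astype(bool)
        q_pred = np.zeros(X.shape[0])
        if adsorbs.any():
            q_pred[adsorbs] = self.regressor.predict(X[adsorbs])
        return q_pred


# Synthetic stand-in data (assumption): columns 0 and 1 mimic a pore size and
# a molecule size, column 2 mimics an accessible volume, column 3 is noise.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(600, 4))
fits_in_pore = X[:, 0] > X[:, 1]
q_max = np.where(fits_in_pore, 2.0 + 3.0 * X[:, 2], 0.0)

X_train, X_test, q_train, q_test = train_test_split(
    X, q_max, test_size=0.2, random_state=0
)

model = CascadeModel(
    GradientBoostingClassifier(random_state=0),
    GradientBoostingRegressor(random_state=0),
).fit(X_train, q_train)

q_pred = model.predict(X_test)
mae = float(np.mean(np.abs(q_pred - q_test)))
print(f"Mean absolute error on held-out synthetic data: {mae:.3f}")
```

Training the regressor only on adsorbing pairs keeps the many zero-loading points from dominating the regression target. Any regressor with `fit` and `predict` methods could be substituted in the second stage.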