Predicting the Maximum Loading in Zeolites for Hydroisomerization Applications
A Machine Learning Approach
E. Johnsson (TU Delft - Aerospace Engineering)
C.M. de Servi – Mentor (TU Delft - Flight Performance and Propulsion)
A. Gangoli Rao – Mentor (TU Delft - Flight Performance and Propulsion)
P. Proesmans – Graduation committee member (TU Delft - Operations & Environment)
T.J.H. Vlugt – Graduation committee member (TU Delft - Engineering Thermodynamics)
S. Sharma – Graduation committee member (TU Delft - Engineering Thermodynamics)
Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.
Abstract
Hydroisomerization of alkane isomers is an important step in the manufacture of current kerosene and sustainable aviation fuels. Because zeolites serve as the acid catalysts in this process, predictions of the maximum loading of hydrocarbons in zeolites are important. Here, a cascade of machine learning models is used to predict the maximum loading of alkane isomers in zeolites. The cascade is composed of a gradient-boosted tree classifier that predicts whether adsorption occurs, followed by a regressor that predicts the value of the maximum loading. The final dataset consists of 45 different molecules (both linear and branched alkanes up to C16) and 97 different zeolite structures, resulting in 4365 datapoints. Descriptors include information on the geometry and topology of the zeolite channels, as well as the shape and size of the molecules. Additional composite descriptors give the models a physical basis for their predictions. Multiple regressors of different nature are considered: Support Vector Regressors, Gradient-Boosted Trees, Extreme Gradient-Boosted Trees, and the pretrained TabPFN model. Of these, TabPFN yields the highest generalization performance and the lowest error. An interpretability analysis is conducted to assess whether the models' decisions abide by the governing physics of adsorption. It confirms that the top descriptor choices abide by the necessary physical constraints, and that secondary properties such as shape-based selectivity are also accounted for. It is shown that although both the classifier and the regressor are insensitive to random splits of the data, the regressor is prone to overfitting when only a small fraction of the data is withheld for testing. The cascade model is compared with an Artificial Neural Network in terms of training and deployability. Although the neural network requires more resources to train, it is lighter in both memory and storage than the cascade.
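The two-stage cascade described above can be illustrated with a minimal sketch. This is not the thesis implementation: the class names `Cascade`, `ThresholdClassifier` and `MeanRegressor`, the single "clearance" feature, and the toy data are all hypothetical stand-ins chosen so the example runs without external libraries. The only idea taken from the abstract is the structure: a classifier decides whether adsorption occurs, and a regressor, trained on the adsorbing cases only, supplies the loading value.

```python
class ThresholdClassifier:
    """Toy stand-in for the classifier stage (hypothetical, not from the thesis)."""

    def fit(self, X, labels):
        pos = [row[0] for row, lab in zip(X, labels) if lab == 1]
        neg = [row[0] for row, lab in zip(X, labels) if lab == 0]
        # Midpoint between the largest non-adsorbing and smallest adsorbing value.
        # Assumes both classes are present and separable on feature 0.
        self.cutoff_ = (max(neg) + min(pos)) / 2.0
        return self

    def predict(self, X):
        return [1 if row[0] > self.cutoff_ else 0 for row in X]


class MeanRegressor:
    """Toy stand-in for the regressor stage: predicts the training mean."""

    def fit(self, X, y):
        self.mean_ = sum(y) / len(y)
        return self

    def predict(self, X):
        return [self.mean_ for _ in X]


class Cascade:
    """Classifier gate followed by a regressor, as outlined in the abstract."""

    def __init__(self, classifier, regressor):
        self.classifier = classifier
        self.regressor = regressor

    def fit(self, X, y):
        # Stage 1: learn whether adsorption occurs (loading > 0).
        labels = [1 if value > 0 else 0 for value in y]
        self.classifier.fit(X, labels)
        # Stage 2: learn the loading value from adsorbing samples only.
        X_pos = [row for row, value in zip(X, y) if value > 0]
        y_pos = [value for value in y if value > 0]
        self.regressor.fit(X_pos, y_pos)
        return self

    def predict(self, X):
        adsorbs = self.classifier.predict(X)
        loadings = self.regressor.predict(X)
        # Samples rejected by the classifier get zero loading.
        return [q if a == 1 else 0.0 for a, q in zip(adsorbs, loadings)]


# Hypothetical feature: pore size minus molecule size (arbitrary units).
X_train = [[-1.0], [-0.5], [0.5], [1.0]]
y_train = [0.0, 0.0, 2.0, 4.0]  # made-up maximum loadings

model = Cascade(ThresholdClassifier(), MeanRegressor()).fit(X_train, y_train)
predictions = model.predict([[-2.0], [2.0]])
```

Any pair of objects exposing `fit` and `predict` could take the place of the toy stages. Gating the regressor in this way keeps the many zero-loading samples from distorting the regression target.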
This work builds on previous research on predicting the Henry coefficient at zero loading. Combining that earlier model with the findings of this work, one can draw the full adsorption isotherm for any alkane, enabling the analysis of the adsorption behaviour of alkane mixtures using Ideal Adsorbed Solution Theory (IAST).
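One way to see how a Henry coefficient and a maximum loading together fix an isotherm is the single-site Langmuir form. The abstract does not state which isotherm model the thesis uses, so the Langmuir choice here is an assumption, and the function name `langmuir_loading` and the numerical values are illustrative only. In this form the loading is q(p) = q_max b p / (1 + b p), and matching the zero-loading slope to the Henry coefficient K_H gives b = K_H / q_max.

```python
def langmuir_loading(pressure, henry_coefficient, max_loading):
    """Single-site Langmuir loading (assumed model, not confirmed by the source).

    The affinity b is chosen so that the low-pressure slope equals the
    Henry coefficient and the high-pressure limit equals the maximum loading.
    All quantities must be in mutually consistent units.
    """
    b = henry_coefficient / max_loading
    return max_loading * b * pressure / (1.0 + b * pressure)


# Illustrative numbers only: K_H = 2.0 and q_max = 4.0 in arbitrary units.
q_mid = langmuir_loading(1.0, 2.0, 4.0)
q_low = langmuir_loading(1e-9, 2.0, 4.0)   # Henry regime: q is close to K_H * p
q_high = langmuir_loading(1e12, 2.0, 4.0)  # saturation: q approaches q_max
```

Single-component isotherms of this kind are the inputs that an IAST calculation for a mixture would require.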