Identifying biological markers in the gut microbiome associated with celiac disease using machine learning

Bachelor Thesis (2023)
Author(s)

P. Persianov (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

Thomas Abeel – Mentor (TU Delft - Pattern Recognition and Bioinformatics)

E.A. van der Toorn – Mentor (TU Delft - Pattern Recognition and Bioinformatics)

David Calderon Franco – Mentor (TU Delft - BT/Environmental Biotechnology)

Thomas Hollt – Graduation committee member (TU Delft - Computer Graphics and Visualisation)

Faculty
Electrical Engineering, Mathematics and Computer Science
Copyright
© 2023 Petr Persianov
Publication Year
2023
Language
English
Graduation Date
29-06-2023
Awarding Institution
Delft University of Technology
Project
CSE3000 Research Project
Programme
Computer Science and Engineering
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Celiac disease is a genetic autoimmune disorder triggered by an adverse reaction to gluten and associated with alterations in the gut microbiome. This study explored the potential of machine learning models and feature selection methods for identifying celiac disease biomarkers in gut microbiome data. The performance of several machine learning models was evaluated, and the impact of different feature selection methods, including minimum redundancy maximum relevance (MRMR), ANOVA, and information gain, was examined. Without feature selection, the models performed comparably. The choice of feature selection method, however, affected the models differently: logistic regression and support vector machines were more sensitive to it than random forest and XGBoost. Notably, several of the identified bacterial species, such as Bacteroides eggerthii, Parabacteroides johnsonii, Faecalibacterium prausnitzii, and Ruminococcus_D bicirculans, have previously been associated with celiac disease, reinforcing their potential as biomarkers.
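
As an illustration of one of the feature selection methods named in the abstract, the following is a minimal sketch of ANOVA-based feature ranking in plain Python. It is not the thesis's actual pipeline: the taxon names, abundance values, and the helper functions `anova_f_score` and `select_top_k` are all hypothetical and exist only to show the idea. In practice a library routine (for example scikit-learn's `f_classif`) would likely be used instead.

```python
from statistics import mean


def anova_f_score(values, labels):
    """One-way ANOVA F-statistic for a single feature across class labels."""
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)
    grand_mean = mean(values)
    # Between-group sum of squares: distance of each class mean from the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups.values())
    # Within-group sum of squares: spread of samples around their own class mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups.values() for v in g)
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    if ss_within == 0:
        # Degenerate case: no variation inside the classes.
        return float("inf") if ss_between > 0 else 0.0
    return (ss_between / df_between) / (ss_within / df_within)


def select_top_k(samples, labels, feature_names, k):
    """Rank features by F-score (highest first) and keep the top k."""
    scores = {}
    for j, name in enumerate(feature_names):
        column = [row[j] for row in samples]
        scores[name] = anova_f_score(column, labels)
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k], scores


# Hypothetical relative abundances (rows = samples, columns = taxa).
# These numbers are made up for illustration and are not from the thesis.
feature_names = ["taxon_A", "taxon_B", "taxon_C"]
samples = [
    [0.30, 0.20, 0.05],
    [0.32, 0.25, 0.10],
    [0.28, 0.15, 0.07],
    [0.10, 0.22, 0.09],
    [0.12, 0.18, 0.12],
    [0.08, 0.20, 0.14],
]
labels = ["celiac", "celiac", "celiac", "control", "control", "control"]

top_features, scores = select_top_k(samples, labels, feature_names, k=2)
print(top_features)
```

A higher F-score means the feature's class means differ more relative to the variation within each class, so such features are kept as candidate biomarkers. The selected subset would then be passed to a classifier such as logistic regression or random forest, which is the stage at which the abstract reports differing sensitivity between models.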

Files

RP_Report_30.pdf
(pdf | 0.424 Mb)
License info not available