Finding Biomarkers for Schizophrenia
Can Machine Learning algorithms identify schizophrenia-related biomarkers within metagenomic data derived from the human gut microbiome?
T.M. Bastow (TU Delft - Electrical Engineering, Mathematics and Computer Science)
E.A. van der Toorn – Mentor (TU Delft - Pattern Recognition and Bioinformatics)
D. Calderon Franco – Mentor
Thomas Abeel – Mentor (TU Delft - Pattern Recognition and Bioinformatics)
T. Höllt – Graduation committee member (TU Delft - Computer Graphics and Visualisation)
Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.
Abstract
There is mounting evidence of a relationship between gut microbiome composition and the development of mental diseases, but the mechanisms remain unclear. Shotgun-sequenced data from 90 schizophrenic patients and 81 sex-, age-, weight-, and location-matched controls were used to train three machine learning models: Logistic Regression, Random Forests, and XGBoost. The 20 most relevant species in the decision making of each classifier were retained, and the overlap between models was recorded. A total of 19 species overlap between the models' top-20 lists, with 10 species shared by all three models. Bifidobacterium bifidum, Akkermansia muciniphila, Eubacterium siraeum, Alistipes finegoldii, Intestinibacter bartlettii, Bifidobacterium pseudocatenulatum, and Streptococcus thermophilus are of particular interest because existing literature reports them as enriched in schizophrenia samples. Phocaeicola vulgatus was found to play a significant role in the classifiers' decisions and is reported in the literature as enriched in healthy samples. One species, Ruthenibacterium lactatiformans, and one co-abundant gene group, Eubacterium sp. CAG:180, consistently ranked as the most important features across all three classifiers, despite not being reported in existing literature. This study could be expanded by using genus-level data. Further research is needed to validate the species mentioned above as potential biomarkers for schizophrenia.
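The overlap step described in the abstract can be illustrated with a minimal sketch. It is not the thesis code: the feature names (`sp_A` to `sp_D`) and importance scores are made-up placeholders, the helper names `top_k` and `overlap_summary` are hypothetical, and two choices are assumptions: features are ranked by absolute importance (so that signed Logistic Regression coefficients compare with tree-based importances), and "overlapping" is read as appearing in the top-k list of at least two models.

```python
from collections import Counter


def top_k(importances, k):
    """Return the k feature names with the largest absolute importance."""
    ranked = sorted(importances, key=lambda name: abs(importances[name]), reverse=True)
    return set(ranked[:k])


def overlap_summary(top_sets):
    """Return (features in >= 2 models' top-k sets, features in every model's set)."""
    counts = Counter(name for features in top_sets.values() for name in features)
    in_two_or_more = {name for name, c in counts.items() if c >= 2}
    in_all = {name for name, c in counts.items() if c == len(top_sets)}
    return in_two_or_more, in_all


# Placeholder scores only; real values would come from the fitted classifiers.
scores = {
    "logistic_regression": {"sp_A": -1.2, "sp_B": 0.9, "sp_C": 0.1, "sp_D": 0.4},
    "random_forest": {"sp_A": 0.30, "sp_B": 0.05, "sp_C": 0.25, "sp_D": 0.40},
    "xgboost": {"sp_A": 0.50, "sp_B": 0.10, "sp_C": 0.05, "sp_D": 0.35},
}

# k=2 keeps the toy example small; the abstract uses the top 20 per model.
top_sets = {model: top_k(imp, k=2) for model, imp in scores.items()}
shared, shared_all = overlap_summary(top_sets)
```

With the study's setting, `k` would be 20 and each dictionary would hold one score per species. How importances are extracted from each fitted model is outside this sketch.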