Capturing Clinical Heterogeneity in Rheumatoid Arthritis

Evaluating the LIVI Latent Space using Gene Expression Data

Bachelor Thesis (2026)
Author(s)

P.A. Lo (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

I.C. den Hond – Mentor (TU Delft - Electrical Engineering, Mathematics and Computer Science)

K. Biharie – Mentor (TU Delft - Electrical Engineering, Mathematics and Computer Science)

M.J.T. Reinders – Mentor (TU Delft - Electrical Engineering, Mathematics and Computer Science)

C. Lofi – Graduation committee member (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Faculty
Electrical Engineering, Mathematics and Computer Science
Publication Year
2026
Language
English
Graduation Date
24-06-2026
Awarding Institution
Delft University of Technology
Project
CSE3000 Research Project
Programme
Computer Science and Engineering
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Rheumatoid arthritis (RA) is a heterogeneous autoimmune disease: patients who share the same diagnosis respond differently to the same therapy. Zhang et al. stratified the RA synovium into six cell-type abundance phenotypes (CTAPs) by clustering the counted abundances of pre-annotated cell types. The LIVI model, built on a variational autoencoder and developed to map trans-eQTLs in non-diagnosed donors, instead learns donor structure directly from gene expression, separating donor-level variation from cell-state variation. The developers of the model left two questions open: whether LIVI can also capture disease status in a diagnosed cohort, and what the optimal number of donor-level embeddings is for a given dataset. We address these questions by applying LIVI, with four different numbers of donor embeddings, to a CITE-seq dataset of 314,011 cells from 70 RA and 9 osteoarthritis (OA) donors. We show that although LIVI is given no cell-type or diagnostic labels, its donor space recovers the underlying relationships between the six cell types that define the CTAPs: a lymphoid block (T, B, NK) versus a non-lymphoid block (myeloid, endothelial, fibroblast), a split that is consistent across all four dimensionalities. The CTAPs themselves do not form discretely separable groups in the donor space, but at lower dimensionalities, individual donor factors begin to distinguish them along the same axis. The genes behind these factors could be interpreted only to a limited extent. We hypothesize that this is due to LIVI's sparsity penalty, which was tuned for detecting trans-eQTLs in a much larger cohort and leaves ribosomal pathways dominating the loadings. We conclude that LIVI's donor space captures disease-state information, but on a broader scale than Zhang et al.'s discretely defined CTAPs. For this particular dataset, the signal becomes stronger at lower dimensionalities, but its interpretability remains limited.

Files

Final_paper_rp.pdf
(pdf | 1.55 Mb)
License info not available