Unsupervised Manifold Alignment with TopoGAN

Aligning multi-modal biological data without correspondence information available across modalities

Master Thesis (2021)
Author(s)

A. Singh (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

Ahmed Mahfouz – Mentor (Leiden University Medical Center)

Marcel Reinders – Graduation committee member (TU Delft - Pattern Recognition and Bioinformatics)

Christoph Lofi – Graduation committee member (TU Delft - Web Information Systems)

Tamim R. Abdelaal – Coach (TU Delft - Pattern Recognition and Bioinformatics)

Faculty
Electrical Engineering, Mathematics and Computer Science
Copyright
© 2021 Akash Singh
Publication Year
2021
Language
English
Graduation Date
26-08-2021
Awarding Institution
Delft University of Technology
Programme
Computer Science | Data Science and Technology
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Single-cell multi-modal omics promises to open new doors in bioinformatics by measuring different aspects of cells, thus offering multiple perspectives on the underlying biological phenomenon. Although simultaneous multi-modal measurement protocols exist, their inherent technical limitations mean that most data are still collected one modality at a time. Single-modality measurements, however, destroy the cell in question, so the same cell cannot be measured again in a second modality. The result is an abundance of multi-modal biological data with no sample or feature correspondence between data sets. This work proposes a novel approach that aligns multi-modal data sets in an unsupervised fashion, using an Autoencoder to obtain latent embeddings of the modalities and a Generative Adversarial Network to align these latent representations. Central to the approach is minimising the topological error between the original and latent representations of a data set, which enables not just the superposition but also the alignment of different modalities. Two recently published methods, UnionCom and MMD-MA, are used for comparison and benchmarking. The approach, termed TopoGAN, is shown to give consistently stable alignments, better quantitative performance in realistic unsupervised settings, and much better scaling in memory requirements than these state-of-the-art methods.
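
The abstract does not define the topological error, so the following is only an illustrative sketch of one possible formulation, not the thesis implementation. It assumes the error is measured on minimum-spanning-tree edges of the pairwise distance matrices, a formulation borrowed from the topological autoencoder literature. The function names (pairwise_distances, mst_edges, topological_error) are hypothetical and chosen for this example only.

```python
import numpy as np


def pairwise_distances(points):
    """Euclidean distance matrix for an (n_samples, n_features) array."""
    diff = points[:, None, :] - points[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def mst_edges(dist):
    """Edges (i, j) of a minimum spanning tree, found with Prim's algorithm."""
    n = dist.shape[0]
    in_tree = np.zeros(n, dtype=bool)
    in_tree[0] = True
    best_dist = dist[0].copy()          # cheapest known link into the tree
    best_from = np.zeros(n, dtype=int)  # tree node providing that link
    edges = []
    for _ in range(n - 1):
        # Ignore nodes already in the tree when picking the next one.
        candidates = np.where(in_tree, np.inf, best_dist)
        j = int(np.argmin(candidates))
        edges.append((int(best_from[j]), j))
        in_tree[j] = True
        closer = dist[j] < best_dist
        best_dist = np.where(closer, dist[j], best_dist)
        best_from = np.where(closer, j, best_from)
    return edges


def topological_error(original, latent):
    """Assumed error: squared distance mismatch on the MST edges of each space.

    Requires at least two samples, with rows of `original` and `latent`
    referring to the same samples.
    """
    d_x = pairwise_distances(original)
    d_z = pairwise_distances(latent)
    error = 0.0
    for edges in (mst_edges(d_x), mst_edges(d_z)):
        idx = np.array(edges)
        rows, cols = idx[:, 0], idx[:, 1]
        error += 0.5 * np.sum((d_x[rows, cols] - d_z[rows, cols]) ** 2)
    return float(error)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    data = rng.normal(size=(20, 5))
    # A pure translation preserves all distances, so the error vanishes.
    print(topological_error(data, data + 3.0))
```

Under this assumed definition, an embedding that preserves the distances along the spanning-tree edges of both spaces has zero error, while stretching or tearing the structure is penalised. In a training setting the same quantity would have to be computed in a differentiable framework so that it can be minimised alongside the Autoencoder's reconstruction loss.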
