Unsupervised Manifold Alignment with TopoGAN

Aligning multi-modal biological data without correspondence information across modalities

Abstract

Single-cell multi-modal omics promises to open new doors in bioinformatics by measuring different aspects of cells, thus offering multiple perspectives on the underlying biological phenomenon. Although protocols for simultaneous multi-modal measurement exist, their technical limitations mean that most experiments measure a single modality, and because single-modality measurements destroy the cell in question, the same cell cannot be profiled across modalities. As a result, a wealth of multi-modal biological data is available with no sample or feature correspondence across data sets. This work proposes a novel approach to align multi-modal data sets in an unsupervised fashion, using an autoencoder to obtain latent embeddings of each modality and a generative adversarial network (GAN) to align these latent representations. Central to the approach is minimising the topological error between the original and latent representations of a data set, which enables not just the superposition but also the alignment of different modalities. Two recently published methods, UnionCom and MMD-MA, are used for comparison and benchmarking. The approach, termed TopoGAN, is demonstrated to produce consistently stable alignments, achieve better quantitative performance in realistic unsupervised settings, and scale far better in terms of memory requirements than these state-of-the-art methods.
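
To make the described pipeline concrete, the sketch below shows the general shape of such a method in PyTorch: a per-modality autoencoder produces latent embeddings, a structure-preservation term keeps the latent geometry close to the input geometry, and a GAN aligns the two latent clouds. All class names, network sizes, and the separate mapping-network generator are illustrative assumptions, and the pairwise-distance term is only a stand-in for the paper's topology-based loss; this is a minimal sketch, not the authors' implementation.

```python
# Minimal, illustrative sketch of an autoencoder + GAN latent-alignment
# pipeline (assumed names and sizes; not the released TopoGAN code).
import torch
import torch.nn as nn
import torch.nn.functional as F


class Autoencoder(nn.Module):
    """Per-modality autoencoder producing a low-dimensional latent embedding."""

    def __init__(self, in_dim, latent_dim=8):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, 64), nn.ReLU(),
                                     nn.Linear(64, latent_dim))
        self.decoder = nn.Sequential(nn.Linear(latent_dim, 64), nn.ReLU(),
                                     nn.Linear(64, in_dim))

    def forward(self, x):
        z = self.encoder(x)
        return z, self.decoder(z)


class Discriminator(nn.Module):
    """Adversary that tries to tell which modality a latent point came from."""

    def __init__(self, latent_dim=8):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(latent_dim, 32), nn.ReLU(),
                                 nn.Linear(32, 1), nn.Sigmoid())

    def forward(self, z):
        return self.net(z)


def structure_loss(x, z):
    # Stand-in for the topological term: penalise discrepancies between the
    # normalised pairwise-distance structure of the input space and that of
    # the latent space (the published method uses a topology-aware loss).
    dx, dz = torch.cdist(x, x), torch.cdist(z, z)
    return ((dx / dx.max() - dz / dz.max()) ** 2).mean()


def train_step(x1, x2, ae1, ae2, gen, disc, opt_ae, opt_g, opt_d):
    """One illustrative update: reconstruct, preserve structure, align latents."""
    bce = nn.BCELoss()
    z1, r1 = ae1(x1)
    z2, r2 = ae2(x2)

    # Discriminator update: modality-1 latents are "real", mapped modality-2
    # latents are "fake".
    opt_d.zero_grad()
    fake = gen(z2.detach()).detach()
    d_loss = (bce(disc(z1.detach()), torch.ones(len(x1), 1)) +
              bce(disc(fake), torch.zeros(len(x2), 1)))
    d_loss.backward()
    opt_d.step()

    # Autoencoder + generator update: reconstruction, structure preservation,
    # and fooling the discriminator so the two latent clouds align.
    opt_ae.zero_grad()
    opt_g.zero_grad()
    ae_loss = (F.mse_loss(r1, x1) + F.mse_loss(r2, x2) +
               structure_loss(x1, z1) + structure_loss(x2, z2))
    g_loss = bce(disc(gen(z2)), torch.ones(len(x2), 1))
    (ae_loss + g_loss).backward()
    opt_ae.step()
    opt_g.step()
```

A full run would construct one Autoencoder per modality, a small mapping network as gen (for example nn.Linear(8, 8)), a Discriminator, and three Adam optimisers, then call train_step over mini-batches drawn independently from each modality, since no correspondence between samples is assumed.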