Visual analysis of mass cytometry data by hierarchical stochastic neighbour embedding reveals rare cell types

Journal Article (2017)
Author(s)

Vincent van Unen (Leiden University Medical Center)

Thomas Höllt (TU Delft - Electrical Engineering, Mathematics and Computer Science, Leiden University Medical Center)

Nicola Pezzotti (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Na Li (Leiden University Medical Center)

Marcel J.T. Reinders (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Elmar Eisemann (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Frits Koning (Leiden University Medical Center)

Anna Vilanova (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Boudewijn P.F. Lelieveldt (Leiden University Medical Center, TU Delft - Electrical Engineering, Mathematics and Computer Science)

Research Group
Computer Graphics and Visualisation
DOI related publication
https://doi.org/10.1038/s41467-017-01689-9 Final published version
Publication Year
2017
Language
English
Volume number
8
Article number
1740
Pages (from-to)
1-10
Collections
Institutional Repository
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Mass cytometry allows high-resolution dissection of the cellular composition of the immune system. However, the high dimensionality, large size, and non-linear structure of the data pose considerable challenges for data analysis. In particular, dimensionality reduction-based techniques like t-SNE offer single-cell resolution but are limited in the number of cells that can be analyzed. Here we introduce Hierarchical Stochastic Neighbor Embedding (HSNE) for the analysis of mass cytometry data sets. HSNE constructs a hierarchy of non-linear similarities that can be interactively explored with a stepwise increase in detail up to the single-cell level. We apply HSNE to a study on gastrointestinal disorders and three other available mass cytometry data sets. We find that HSNE efficiently replicates previous observations and identifies rare cell populations that were previously missed due to downsampling. Thus, HSNE removes the scalability limit of conventional t-SNE analysis, a feature that makes it highly suitable for the analysis of massive high-dimensional data sets.
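
The abstract's point about rare populations being missed due to downsampling can be illustrated with a minimal sketch. It is not taken from the article: the function names and the cell counts (1,000,000 cells, 100 rare cells, 10,000 cells kept) are hypothetical values chosen for illustration, and uniform random subsampling is an assumed stand-in for whatever downsampling a given t-SNE workflow applies.

```python
import random


def expected_rare_kept(n_total, n_rare, n_keep):
    """Expected number of rare cells that survive uniform downsampling."""
    return n_rare * n_keep / n_total


def simulate_rare_kept(n_total, n_rare, n_keep, seed=0):
    """Draw one uniform subsample without replacement and count surviving rare cells.

    Cells are indexed 0..n_total-1; indices below n_rare stand for the rare population.
    """
    rng = random.Random(seed)
    kept = rng.sample(range(n_total), n_keep)
    return sum(1 for i in kept if i < n_rare)


# Illustrative (hypothetical) numbers, not data from the article.
n_total, n_rare, n_keep = 1_000_000, 100, 10_000
print("expected rare cells kept:", expected_rare_kept(n_total, n_rare, n_keep))
print("rare cells kept in one draw:", simulate_rare_kept(n_total, n_rare, n_keep))
```

With these assumed numbers, only about one rare cell is expected to remain after downsampling, too few to form a visible cluster in an embedding. This is the kind of loss that analyzing the full data set avoids.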