Title: Predicting Cell Populations in Single Cell Mass Cytometry Data
Author:
  Abdelaal, T.R.M. (TU Delft Pattern Recognition and Bioinformatics; Leiden University Medical Center)
  van Unen, Vincent (Leiden University Medical Center)
  Höllt, T. (TU Delft Computer Graphics and Visualisation; Leiden University Medical Center)
  Koning, Frits (Leiden University Medical Center)
  Reinders, M.J.T. (TU Delft Pattern Recognition and Bioinformatics; Leiden University Medical Center)
  Mahfouz, A.M.E.T.A. (TU Delft Pattern Recognition and Bioinformatics; Leiden University Medical Center)
Date: 2019
Abstract: Mass cytometry by time-of-flight (CyTOF) is a valuable technology for high-dimensional analysis at the single cell level. Identification of different cell populations is an important task during the data analysis. Many clustering tools can perform this task, which is essential to identify “new” cell populations in explorative experiments. However, relying on clustering is laborious since it often involves manual annotation, which significantly limits the reproducibility of identifying cell-populations across different samples. The latter is particularly important in studies comparing different conditions, for example in cohort studies. Learning cell populations from an annotated set of cells solves these problems. However, currently available methods for automatic cell population identification are either complex, dependent on prior biological knowledge about the populations during the learning process, or can only identify canonical cell populations. We propose to use a linear discriminant analysis (LDA) classifier to automatically identify cell populations in CyTOF data. LDA outperforms two state-of-the-art algorithms on four benchmark datasets.
Compared to more complex classifiers, LDA has substantial advantages with respect to the interpretable performance, reproducibility, and scalability to larger datasets with deeper annotations. We apply LDA to a dataset of ~3.5 million cells representing 57 cell populations in the Human Mucosal Immune System. LDA has high performance on abundant cell populations as well as the majority of rare cell populations, and provides accurate estimates of cell population frequencies. Further incorporating a rejection option, based on the estimated posterior probabilities, allows LDA to identify previously unknown (new) cell populations that were not encountered during training. Altogether, reproducible prediction of cell population compositions using LDA opens up possibilities to analyze large cohort studies based on CyTOF data.
Subject: cell population prediction; machine learning; mass cytometry; single cell
To reference this document use: http://resolver.tudelft.nl/uuid:77d1f5e3-65db-4464-a5bb-53d7b4739678
DOI: https://doi.org/10.1002/cyto.a.23738
ISSN: 1552-4922
Source: Cytometry. Part A, 95 (7), 769-781
Part of collection: Institutional Repository
Document type: journal article
Rights: © 2019 T.R.M. Abdelaal, Vincent van Unen, T. Höllt, Frits Koning, M.J.T. Reinders, A.M.E.T.A. Mahfouz
Files: PDF Abdelaal_et_al_2019_Cytom ... Part_A.pdf 851.86 KB
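
The rejection option mentioned in the abstract can be illustrated with a minimal sketch. It assumes that per-cell posterior probabilities have already been estimated by a trained LDA classifier (for example via scikit-learn's `LinearDiscriminantAnalysis.predict_proba`). The function names, the `0.7` threshold, the `"unknown"` label, and the example populations and posteriors below are hypothetical and are not taken from the paper.

```python
from collections import Counter


def predict_with_rejection(posteriors, labels, threshold=0.7, unknown="unknown"):
    """Assign each cell its most probable population, or `unknown` when the
    highest posterior probability falls below `threshold` (assumed value)."""
    predictions = []
    for row in posteriors:
        # Index of the population with the highest posterior for this cell.
        best = max(range(len(row)), key=row.__getitem__)
        predictions.append(labels[best] if row[best] >= threshold else unknown)
    return predictions


def population_frequencies(predictions):
    """Fraction of cells assigned to each predicted population."""
    counts = Counter(predictions)
    total = len(predictions)
    return {label: n / total for label, n in counts.items()}


# Hypothetical posteriors for four cells over three populations.
labels = ["B cell", "CD4 T cell", "monocyte"]
posteriors = [
    [0.95, 0.03, 0.02],
    [0.10, 0.85, 0.05],
    [0.40, 0.35, 0.25],  # no population is confident enough: rejected
    [0.05, 0.05, 0.90],
]

predictions = predict_with_rejection(posteriors, labels, threshold=0.7)
frequencies = population_frequencies(predictions)
print(predictions)
print(frequencies)
```

In this sketch the third cell is labeled `unknown` because none of its posteriors reaches the threshold. Cells rejected in this way could then be inspected as candidates for populations not seen during training. A suitable threshold would need to be chosen per dataset.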