scMoC: single-cell multi-omics clustering

Journal Article (2022)
Author(s)

Mostafa Eltager (TU Delft - Pattern Recognition and Bioinformatics)

Tamim R.M. Abdelaal (Leiden University Medical Center, TU Delft - Pattern Recognition and Bioinformatics)

A.M.E.T.A. Mahfouz (TU Delft - Pattern Recognition and Bioinformatics, Leiden University Medical Center)

Marcel Reinders (Leiden University Medical Center, TU Delft - Pattern Recognition and Bioinformatics)

Research Group
Pattern Recognition and Bioinformatics
Copyright
© 2022 M.A.M.E. Eltager, T.R.M. Abdelaal, A.M.E.T.A. Mahfouz, M.J.T. Reinders
DOI related publication
https://doi.org/10.1093/bioadv/vbac011
Publication Year
2022
Language
English
Issue number
1
Volume number
2
Pages (from-to)
1-8
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Motivation: Single-cell multi-omics assays simultaneously measure different molecular features from the same cell. A key question is how to benefit from the complementary data available and perform cross-modal clustering of cells.

Results: We propose Single-Cell Multi-omics Clustering (scMoC), an approach to identify cell clusters from data with co-measurements of scRNA-seq and scATAC-seq from the same cell. We overcome the high sparsity of the scATAC-seq data by using an imputation strategy that exploits the less-sparse scRNA-seq data available from the same cell. Subsequently, scMoC identifies clusters of cells by merging clusterings derived from both data domains individually. We tested scMoC on datasets generated using different protocols with variable data sparsity levels. We show that scMoC (i) is able to generate informative scATAC-seq data due to its RNA-guided imputation strategy and (ii) results in integrated clusters based on both RNA and ATAC information that are biologically meaningful either from the RNA or from the ATAC perspective.
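
The two ideas named in the abstract, RNA-guided imputation and merging of per-modality clusterings, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the neighbour search, the number of neighbours, or the merging rule, so the function names (`rna_guided_impute`, `merge_clusterings`), the parameters `k` and `min_cells`, the Euclidean distance, and the split rule below are all assumptions chosen for illustration. Cells are rows in both matrices, and row `i` of the RNA matrix is assumed to be the same cell as row `i` of the ATAC matrix.

```python
import numpy as np


def rna_guided_impute(rna, atac, k=3):
    """Smooth each cell's ATAC profile over its k nearest cells in RNA space.

    Assumption: Euclidean distance on the RNA matrix, neighbourhood includes
    the cell itself. The paper may use a different embedding or neighbour rule.
    """
    # Pairwise Euclidean distances between cells in RNA space (n_cells x n_cells).
    dist = np.linalg.norm(rna[:, None, :] - rna[None, :, :], axis=2)
    # Indices of the k closest cells for every cell (self has distance 0).
    nn = np.argsort(dist, axis=1, kind="stable")[:, :k]
    # Average the ATAC profiles of those neighbours: (n_cells, k, n_peaks) -> (n_cells, n_peaks).
    return atac[nn].mean(axis=1)


def merge_clusterings(rna_labels, atac_labels, min_cells=2):
    """Combine two clusterings of the same cells into one label per cell.

    Hypothetical rule: an RNA cluster is split along ATAC labels only when at
    least two ATAC sub-groups inside it contain min_cells or more cells.
    Cells in smaller sub-groups keep the parent RNA label.
    """
    rna_labels = np.asarray(rna_labels)
    atac_labels = np.asarray(atac_labels)
    merged = np.empty(len(rna_labels), dtype=object)
    for r in np.unique(rna_labels):
        in_r = rna_labels == r
        subs, counts = np.unique(atac_labels[in_r], return_counts=True)
        big = subs[counts >= min_cells]
        split = len(big) >= 2
        for i in np.flatnonzero(in_r):
            a = atac_labels[i]
            merged[i] = f"{r}.{a}" if split and a in big else f"{r}"
    return merged


# Toy data: six cells forming two well-separated groups in RNA space,
# with a very sparse ATAC matrix (one non-zero entry per group).
rna = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
                [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]])
atac = np.array([[1, 0], [0, 0], [0, 0],
                 [0, 1], [0, 0], [0, 0]], dtype=float)

imputed = rna_guided_impute(rna, atac, k=3)
labels = merge_clusterings([0, 0, 0, 0, 1, 1], [0, 0, 1, 1, 0, 0], min_cells=2)
```

In the toy data, the single ATAC signal in each group is shared across that group's three RNA neighbours, which is the sense in which the less-sparse modality guides imputation of the sparser one. The merge step keeps the RNA clustering as the backbone and refines it only where the ATAC clustering offers sufficiently large sub-groups; other merging strategies are equally compatible with the abstract's description.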