scMoC: single-cell multi-omics clustering

Journal Article (2022)
Author(s)

M.A.M.E. Eltager (TU Delft - Electrical Engineering, Mathematics and Computer Science)

T.R.M. Abdelaal (Leiden University Medical Center, TU Delft - Electrical Engineering, Mathematics and Computer Science)

A.M.E.T.A. Mahfouz (TU Delft - Electrical Engineering, Mathematics and Computer Science, Leiden University Medical Center)

M.J.T. Reinders (Leiden University Medical Center, TU Delft - Electrical Engineering, Mathematics and Computer Science)

Research Group
Pattern Recognition and Bioinformatics
DOI related publication
https://doi.org/10.1093/bioadv/vbac011 (final published version)
Publication Year
2022
Language
English
Issue number
1
Volume number
2
Article number
vbac011
Pages (from-to)
1-8
Collections
Institutional Repository
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Motivation: Single-cell multi-omics assays simultaneously measure different molecular features from the same cell. A key question is how to benefit from the complementary data available and perform cross-modal clustering of cells.

Results: We propose Single-Cell Multi-omics Clustering (scMoC), an approach to identify cell clusters from data with co-measurements of scRNA-seq and scATAC-seq from the same cell. We overcome the high sparsity of the scATAC-seq data with an imputation strategy that exploits the less sparse scRNA-seq data available from the same cell. Subsequently, scMoC identifies clusters of cells by merging the clusterings derived from each data domain individually. We tested scMoC on datasets generated using different protocols with variable data sparsity levels. We show that scMoC (i) generates informative scATAC-seq data thanks to its RNA-guided imputation strategy and (ii) yields integrated clusters, based on both RNA and ATAC information, that are biologically meaningful from either the RNA or the ATAC perspective.
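The two steps named in the abstract, RNA-guided imputation followed by merging of per-modality clusterings, can be illustrated with a minimal sketch. This is not the scMoC implementation. It assumes that imputation averages each cell's ATAC profile over its nearest neighbours in RNA space, and that merging assigns one integrated label to each observed pair of RNA and ATAC labels. The function names, the parameter k and the toy data are hypothetical and chosen only for illustration.

```python
import numpy as np


def rna_guided_impute(rna, atac, k=2):
    """Assumed strategy: average each cell's ATAC profile over its
    k nearest neighbours found in RNA space (cells are rows)."""
    # Pairwise squared Euclidean distances between cells in RNA space.
    sq = (rna ** 2).sum(axis=1)
    dist = sq[:, None] + sq[None, :] - 2.0 * (rna @ rna.T)
    # Indices of the k closest cells; a cell is normally its own nearest neighbour.
    nn = np.argsort(dist, axis=1, kind="stable")[:, :k]
    # Averaging over the neighbourhood yields a denser ATAC profile.
    return atac[nn].mean(axis=1)


def merge_clusterings(rna_labels, atac_labels):
    """Assumed strategy: every observed (RNA label, ATAC label) pair
    becomes its own integrated cluster id."""
    pairs = np.stack([rna_labels, atac_labels], axis=1)
    _, merged = np.unique(pairs, axis=0, return_inverse=True)
    return merged.ravel()


# Toy data: four cells, two RNA features, four ATAC peaks (hypothetical values).
rna = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
atac = np.eye(4)

imputed = rna_guided_impute(rna, atac, k=2)
merged = merge_clusterings(np.array([0, 0, 1, 1]), np.array([0, 1, 1, 1]))
```

In this toy example each cell's single ATAC signal is shared with its closest neighbour in RNA space, so sparse rows become denser. The merge step shown here only splits clusters; a practical method would also need a rule for when a split is supported by enough cells, which the abstract does not specify.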