Consequences and opportunities arising due to sparser single-cell RNA-seq datasets

Journal Article (2023)
Author(s)

Gerard A. Bouland (TU Delft - Electrical Engineering, Mathematics and Computer Science, Leiden University Medical Center)

Ahmed Mahfouz (TU Delft - Electrical Engineering, Mathematics and Computer Science, Leiden University Medical Center)

Marcel J.T. Reinders (TU Delft - Electrical Engineering, Mathematics and Computer Science, Leiden University Medical Center)

Research Group
Pattern Recognition and Bioinformatics
DOI related publication
https://doi.org/10.1186/s13059-023-02933-w (final published version)
Publication Year
2023
Language
English
Issue number
1
Volume number
24
Article number
86
Collections
Institutional Repository
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

The number of cells measured in single-cell RNA sequencing (scRNA-seq) datasets is increasing exponentially, and the datasets are concurrently becoming sparser because more zero counts are measured for many genes. We demonstrate here that downstream analyses on binarized gene expression give results similar to those of count-based analyses. Moreover, a binary representation allows up to ~50-fold more cells to be analyzed using the same computational resources. We also highlight the possibilities provided by binarized scRNA-seq data. Developing specialized, bit-aware tools for downstream analytical tasks will enable a more fine-grained resolution of biological heterogeneity.
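
To make the idea of a binary representation concrete, the following is a minimal sketch, not the authors' implementation. It assumes a small dense, simulated count matrix (cells as rows, genes as columns), treats any non-zero count as "detected", and packs the resulting bits with NumPy. The function names and the simulated data are hypothetical and chosen only for illustration.

```python
import numpy as np


def binarize(counts):
    """Return a boolean matrix that is True wherever a count is non-zero."""
    return np.asarray(counts) > 0


def pack_bits(binary):
    """Pack each row of a boolean matrix into bytes (8 entries per byte)."""
    return np.packbits(binary, axis=1)


def detection_rate(binary):
    """Fraction of cells in which each gene is detected (column means)."""
    return binary.mean(axis=0)


# Simulated sparse counts (assumption: Poisson with a low mean, int32 storage).
rng = np.random.default_rng(0)
counts = rng.poisson(0.1, size=(1000, 2000)).astype(np.int32)

binary = binarize(counts)
packed = pack_bits(binary)

# Memory of the dense int32 matrix relative to the bit-packed matrix.
ratio = counts.nbytes / packed.nbytes
rates = detection_rate(binary)
```

In this toy setting the bit-packed matrix is 32 times smaller than the dense 32-bit integer matrix, simply because each entry shrinks from 32 bits to 1 bit. That figure is a property of the storage types chosen here and should not be read as the ~50-fold scaling reported in the abstract, which depends on the data and on the analysis pipeline.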