Consequences and opportunities arising due to sparser single-cell RNA-seq datasets

Journal Article (2023)
Authors

G.A. Bouland (TU Delft - Pattern Recognition and Bioinformatics, Leiden University Medical Center)

Ahmed Mahfouz (TU Delft - Pattern Recognition and Bioinformatics, Leiden University Medical Center)

Marcel J T Reinders (TU Delft - Pattern Recognition and Bioinformatics, Leiden University Medical Center)

Research Group
Pattern Recognition and Bioinformatics
Copyright
© 2023 G.A. Bouland, A.M.E.T.A. Mahfouz, M.J.T. Reinders
To reference this document use:
https://doi.org/10.1186/s13059-023-02933-w
Publication Year
2023
Language
English
Issue number
1
Volume number
24
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

With the number of cells measured in single-cell RNA sequencing (scRNA-seq) datasets increasing exponentially, and sparsity increasing concurrently because more zero counts are measured for many genes, we demonstrate here that downstream analyses on binary-based gene expression give results similar to those of count-based analyses. Moreover, a binary representation allows up to ~50-fold more cells to be analyzed using the same computational resources. We also highlight the possibilities provided by binarized scRNA-seq data. Development of specialized tools for bit-aware implementations of downstream analytical tasks will enable a more fine-grained resolution of biological heterogeneity.
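As a rough illustration of what a binary representation means in practice, the sketch below binarizes a small count matrix (a gene counts as detected when its count is above zero) and packs the result into bits with NumPy. This is not the authors' implementation: the toy matrix, the function names `binarize_counts` and `pack_bits`, and the dense `int32` baseline are assumptions made for this example. The memory ratio it prints follows from those assumptions alone and is not the ~50-fold figure reported in the abstract.

```python
import numpy as np


def binarize_counts(counts):
    """Return a boolean matrix that is True where a gene was detected (count > 0)."""
    return np.asarray(counts) > 0


def pack_bits(binary):
    """Pack each row of a boolean matrix into bytes, 8 entries per byte."""
    return np.packbits(binary, axis=1)


# Toy count matrix: 3 cells (rows) x 8 genes (columns), values invented for illustration.
counts = np.array(
    [
        [0, 3, 0, 0, 1, 0, 0, 12],
        [5, 0, 0, 2, 0, 0, 0, 0],
        [0, 0, 0, 0, 0, 0, 7, 1],
    ],
    dtype=np.int32,
)

binary = binarize_counts(counts)
packed = pack_bits(binary)

# Fraction of cells in which each gene is detected, computed from the binary matrix.
detection_rate = binary.mean(axis=0)

print("dense int32 bytes:", counts.nbytes)
print("bit-packed bytes: ", packed.nbytes)
print("detection rate per gene:", detection_rate)
```

Each stored value shrinks from 32 bits to 1 bit here, which is where the saving in this toy case comes from. Real scRNA-seq matrices are usually held in sparse formats, so the achievable gain depends on the storage format and the sparsity of the data.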