Memory usage analysis of binary clustering algorithm

What is the gain in peak memory usage of the binary clustering algorithm compared to current state-of-the-art clustering methods?

Bachelor Thesis (2023)
Author(s)

P. Verigo (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

G.A. Bouland – Mentor (TU Delft - Pattern Recognition and Bioinformatics)

Marcel J. T. Reinders – Mentor (TU Delft - Pattern Recognition and Bioinformatics)

BHM Gerritsen – Graduation committee member (TU Delft - Computer Science & Engineering-Teaching Team)

Faculty
Electrical Engineering, Mathematics and Computer Science
Copyright
© 2023 Pavel Verigo
Publication Year
2023
Language
English
Graduation Date
28-06-2023
Awarding Institution
Delft University of Technology
Project
CSE3000 Research Project
Programme
Computer Science and Engineering
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

The rapid growth of single-cell RNA-seq datasets poses significant performance challenges for evaluation and information extraction. We investigate an alternative input data format based on binarization, with a focus on peak memory usage. We describe the design and implementation of the solution in depth, emphasizing the strategies used to minimize memory usage, and we analyze and validate memory usage patterns and asymptotic behavior with memory profiling tools. Our findings suggest, however, that the reduction in memory usage on large modern datasets is attributable only to the binarized data format itself, not to how the workflow interacts with that format: the memory usage of the workflow was found to be independent of the input format.
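
To illustrate why a binarized input format can lower memory on its own, the following minimal sketch compares a dense count matrix with a bit-packed binary version of it. It is not the thesis implementation: the function names, the matrix size, the Poisson rate, the threshold (count > 0), and the choice of bit-packing with NumPy are all assumptions made for illustration.

```python
import numpy as np


def binarize(counts: np.ndarray) -> np.ndarray:
    """Map a cells x genes count matrix to 0/1 (1 where the count is > 0).

    The threshold is an assumption; other binarization rules are possible.
    """
    return (counts > 0).astype(np.uint8)


def pack_bits(binary: np.ndarray) -> np.ndarray:
    """Store eight 0/1 entries per byte along the gene axis."""
    return np.packbits(binary, axis=1)


# Hypothetical toy data: 1000 cells x 2000 genes, mostly zeros.
rng = np.random.default_rng(0)
n_cells, n_genes = 1000, 2000
counts = rng.poisson(0.3, size=(n_cells, n_genes)).astype(np.float32)

binary = binarize(counts)
packed = pack_bits(binary)

dense_bytes = counts.nbytes    # 4 bytes per entry
packed_bytes = packed.nbytes   # 1 bit per entry

# The packed form is lossless with respect to the binary matrix.
restored = np.unpackbits(packed, axis=1, count=n_genes)

print(f"dense float32: {dense_bytes} bytes")
print(f"bit-packed:    {packed_bytes} bytes")
```

In this toy setting the saving comes entirely from the storage format (32 bits per entry versus 1 bit), which is consistent with the abstract's point that the gain stems from the data format rather than from the downstream workflow.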

Files

PVFinal.pdf (pdf | 0.815 MB)
License info not available