Exploring the computational feasibility limits of perplexity in t-SNE for scenarios of limited working memory

Bachelor Thesis (2025)

Author(s)

D. Netzov (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

K. Hildebrandt – Mentor (TU Delft - Computer Graphics and Visualisation)

C. Lofi – Graduation committee member (TU Delft - Web Information Systems)

Martin Skrodzki – Mentor (TU Delft - Computer Graphics and Visualisation)

Faculty

Electrical Engineering, Mathematics and Computer Science

To reference this document use:

https://resolver.tudelft.nl/uuid:c81211d5-c67e-47a7-b787-83317a27b4c3

Publication Year

2025

Language

English

Graduation Date

27-06-2025

Awarding Institution

Delft University of Technology

Project

CSE3000 Research Project

Programme

Computer Science and Engineering

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Modern data analysis often involves large multidimensional datasets. Visualizing such data leverages human intuition and pattern recognition to reveal hidden relationships. t-SNE is a widely used tool for creating these visualizations. Despite its popularity, it has drawbacks: its parameters are hard to tune, and no heuristic guarantees the best results. Because of the size of the datasets researchers work with, the algorithm can exceed the available memory, leading to slowdowns and crashes. This paper investigates how memory usage behaves with respect to the tunable perplexity parameter and the size of the data. For the popular openTSNE implementation of t-SNE, it provides a reliable way for researchers to predict memory consumption before running the algorithm. In addition, it presents a modification that reduces the peak memory usage of the implementation. Together, these contributions improve the reliability and efficiency of t-SNE pipelines in memory-constrained environments.

Files

Research_paper_final.pdf

(PDF | 0.429 MB)

License info not available