Approximated and User Steerable tSNE for Progressive Visual Analytics
Nicola Pezzotti (TU Delft - Electrical Engineering, Mathematics and Computer Science)
Boudewijn P.F. Lelieveldt (Leiden University Medical Center, TU Delft - Electrical Engineering, Mathematics and Computer Science)
Laurens van der Maaten (TU Delft - Electrical Engineering, Mathematics and Computer Science)
Thomas Höllt (TU Delft - Electrical Engineering, Mathematics and Computer Science)
Elmar Eisemann (TU Delft - Electrical Engineering, Mathematics and Computer Science)
Anna Vilanova (TU Delft - Electrical Engineering, Mathematics and Computer Science)
Abstract
Progressive Visual Analytics aims at improving the interactivity in existing analytics techniques by means of visualization as well as interaction with intermediate results. One key method for data analysis is dimensionality reduction, for example, to produce 2D embeddings that can be visualized and analyzed efficiently. t-Distributed Stochastic Neighbor Embedding (tSNE) is a well-suited technique for the visualization of high-dimensional data. tSNE can create meaningful intermediate results but suffers from a slow initialization that constrains its application in Progressive Visual Analytics. We introduce a controllable tSNE approximation (A-tSNE), which trades off speed and accuracy, to enable interactive data exploration. We offer real-time visualization techniques, including a density-based solution and a Magic Lens to inspect the degree of approximation. With this feedback, the user can decide on local refinements and steer the approximation level during the analysis. We demonstrate our technique with several datasets, in a real-world research scenario and for the real-time analysis of high-dimensional streams to illustrate its effectiveness for interactive data analysis.
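The speed and accuracy trade-off mentioned above can be illustrated with a small sketch. It assumes that the costly initialization is dominated by a nearest-neighbor search over the high-dimensional points; the abstract does not state this, so treat it as an assumption. The approximation used here (searching only a random subset of candidate points) is a deliberately simple stand-in and not the A-tSNE method itself. The names `knn_from_candidates`, `knn_precision`, and `n_candidates` are hypothetical.

```python
import numpy as np


def knn_from_candidates(X, k, n_candidates, rng):
    """Return, for each point, its k nearest neighbors among a random
    subset of n_candidates other points (a stand-in approximation).

    With n_candidates == len(X) - 1 every other point is examined,
    so the result is the exact k-nearest-neighbor set.
    """
    n = X.shape[0]
    n_candidates = min(n_candidates, n - 1)
    if k > n_candidates:
        raise ValueError("k must not exceed the number of candidates")
    neighbors = np.empty((n, k), dtype=int)
    for i in range(n):
        others = np.delete(np.arange(n), i)  # never pick the point itself
        cand = rng.choice(others, size=n_candidates, replace=False)
        d2 = np.sum((X[cand] - X[i]) ** 2, axis=1)  # squared Euclidean
        neighbors[i] = cand[np.argsort(d2)[:k]]
    return neighbors


def knn_precision(approx, exact):
    """Fraction of the exact neighbors that the approximation recovered."""
    hits = [len(set(a) & set(e)) for a, e in zip(approx, exact)]
    return float(np.mean(hits)) / exact.shape[1]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(300, 20))  # synthetic high-dimensional data
    exact = knn_from_candidates(X, k=10, n_candidates=299, rng=rng)
    for budget in (30, 100, 299):
        approx = knn_from_candidates(X, k=10, n_candidates=budget, rng=rng)
        print(budget, round(knn_precision(approx, exact), 3))
```

A smaller candidate budget means less work per point but a lower share of true neighbors recovered. This is the kind of quantity a user could inspect and then refine locally, in the spirit of the steerable approximation level described in the abstract.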