Hv

H.C. van Geffen

info

Please Note

2 records found

t-distributed stochastic neighbor embedding (t-SNE) is a dimensionality reduction technique able to embed high-dimensional data into 2 or 3 dimensions for the purpose of visualization, and has proven to be useful in various applications, such as single-cell analysis. Normally t-SNE embeds into Euclidean space, but recent work shows that embedding into hyperbolic space can lead to higher quality embeddings. Most notably on data containing hierarchical structures, as seen in single-cell measurements. As such it would be interesting to generalize computation of t-SNE embeddings to embeddings in hyperbolic space. A missing tool for this is the approximation of the objective and the gradient. This is needed as the cost for evaluating the objective and gradient is quadratic in the number of data points to be embedded. In practice such approximations are already used for Euclidean embeddings. In this thesis we introduce the first approximation scheme for hyperbolic t-SNE embeddings. To achieve this we generalize a Barnes-Hut approximation scheme to hyperbolic space. A major difficulty for this is to transfer the use of a quadtree in the Euclidean plane to hyperbolic space. We propose the usage of a polar quadtree in the Poincaré disk and show that our solution yields substantial gain in run time, in particular when embedding larger data sets, with the resulting embeddings being similar in quantitative and qualitative aspects. ...
Bachelor thesis (2019) - Tom Edixhoven, Hunter van Geffen, Bas Kruit, Mels Smit, Maurício Aniche, Otto Visser, Huijuan Wang
For a steel company it is advantageous to be able to easily track steel through the production process. At Tata Steel this is currently done with the Material Tracking Table. However, generating this table takes months. Therefore a new system had to be developed. This paper describes the building of such a new system, which generates this Material Tracking Table in less than 1 hour, as well as the related systems concerning the acquisition of the input data and the visualisation of the resulting output data. ...