Benchmarking VAE latent features in downstream tasks for cancer related predictions

van Groeningen, Boris

Title

Benchmarking VAE latent features in downstream tasks for cancer related predictions

Author

van Groeningen, Boris (TU Delft Electrical Engineering, Mathematics and Computer Science)

Contributor

Eltager, M.A.M.E. (mentor)
Abdelaal, T.R.M. (mentor)
Charrout, M. (mentor)
Makrodimitris, S. (mentor)
Reinders, M.J.T. (graduation committee)
Isufi, E. (graduation committee)

Degree granting institution

Delft University of Technology

Programme

Computer Science and Engineering

Project

CSE3000 Research Project

Date

2021-07-02

Abstract

Using RNA sequence data to predict patient properties is fairly common by now. In this paper, Variational Auto-Encoders (VAEs) are used to assist in this process. A VAE is a type of neural network that encodes data into a lower-dimensional representation called the latent space. The resulting latent features are then used as input to an MLP classifier for downstream prediction tasks: cancer type, survival time and cancer stage. The training process itself is also analyzed using UMAP plots. The purpose of this paper is to compare different VAE models on how effective their latent spaces are as training data for these predictions. When any of the latent spaces constructed by the VAE models is used as input to the MLP classifier, the predictions mostly amount to guessing. The NoVAE model is the only model with slightly better performance in terms of mean accuracy and standard deviation. The guessing issue is analyzed further with the help of UMAP plots. The VAEs are able to classify the input data during training, but this no longer holds when they are faced with new data. Both the learning rate and the β term yield interesting results, concerning the modification of the input data and the variational property respectively. A lower learning rate leads to better classification, but this is because the result then deviates less from the original input data. When a small β term is used with the β-VAE, the output is similar to that of the VanillaVAE, which suggests that the VanillaVAE does not perform better than a regular autoencoder.
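
To illustrate the role of the β term mentioned in the abstract, the following is a minimal NumPy sketch of a β-VAE objective in its commonly used form: a reconstruction error plus β times the KL divergence between the encoder's Gaussian and a standard normal prior. This is not the thesis's own code. The function names, the squared-error reconstruction term and the toy inputs are assumptions made for illustration only. With β = 1 the expression is the usual VAE objective, and as β approaches 0 only the reconstruction term remains, which is the objective of a regular autoencoder.

```python
import numpy as np


def kl_to_standard_normal(mu, log_var):
    """Per-sample KL( N(mu, diag(exp(log_var))) || N(0, I) ), summed over latent dims."""
    return -0.5 * np.sum(1.0 + log_var - mu**2 - np.exp(log_var), axis=1)


def beta_vae_loss(x, x_recon, mu, log_var, beta=1.0):
    """Hypothetical beta-VAE loss: squared reconstruction error + beta * KL, batch mean.

    Assumption: squared error is used as the reconstruction term; other
    choices (e.g. cross-entropy) are equally common.
    """
    recon = np.sum((x - x_recon) ** 2, axis=1)  # per-sample reconstruction error
    kl = kl_to_standard_normal(mu, log_var)     # per-sample KL regulariser
    return float(np.mean(recon + beta * kl))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=(4, 10))                       # toy "expression" vectors
    x_recon = x + 0.1 * rng.normal(size=x.shape)       # toy decoder output
    mu = rng.normal(size=(4, 3))                       # toy encoder means
    log_var = rng.normal(size=(4, 3))                  # toy encoder log-variances
    for beta in (0.0, 0.01, 1.0):
        print(beta, beta_vae_loss(x, x_recon, mu, log_var, beta=beta))
```

Under these assumptions, the loss at β = 0 equals the pure reconstruction error, so a model trained with a very small β is regularised almost like a plain autoencoder.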

Subject

VAE
cancer
machine learning
prediction
RNA

To reference this document use:

http://resolver.tudelft.nl/uuid:6aaeb628-2b7d-43fc-bacd-68c6b875701d

Part of collection

Student theses

Document type

bachelor thesis

Rights

Files

PDF

Research_Paper_BvG.pdf

772.9 KB