Data-Driven Soft Discriminant Maps

Class-aware Linear Feature Extraction in Imaging Mass Spectrometry

Master Thesis (2021)

Author(s)

T.C. Booij (TU Delft - Mechanical Engineering)

Contributor(s)

Raf Van de Plas – Mentor (TU Delft - Team Raf Van de Plas)

Gleb Vdovin – Graduation committee member (TU Delft - Team Raf Van de Plas)

M.W.E.M. Alfeld – Graduation committee member (TU Delft - Team Matthias Alfeld)

Faculty

Mechanical Engineering

Keywords

Supervised Learning, Feature Extraction, Dimensionality Reduction, Imaging Mass Spectrometry

To reference this document use:

https://resolver.tudelft.nl/uuid:52e2a426-623c-47b0-836d-f764a49be1c3

Publication Year

2021

Language

English

Graduation Date

29-01-2021

Awarding Institution

Delft University of Technology

Programme

Mechanical Engineering | Systems and Control

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Retrieving actionable information from large datasets is increasingly computationally expensive due to the current trend of ever-increasing dataset sizes. Reducing dataset sizes with dimensionality reduction techniques is often necessary for statistical analysis techniques, such as classification, to be computationally feasible. Most dimensionality reduction methods do not require any additional information to accomplish their task. However, datasets used for classification, for example, are accompanied by a set of class labels as well. This extra information can improve dimensionality reduction techniques by explicitly preserving features that explain differences between classes. A field where high-dimensional and large datasets are standard is Imaging Mass Spectrometry (IMS), a technique that simultaneously records the abundance and spatial location of molecules throughout biological tissue samples. Classification has been applied to IMS datasets for a wide range of scenarios, including the diagnosis of disease, distinguishing between tumour types for personalized treatment, and identifying biomarkers. A recently introduced dimensionality reduction method called Soft Discriminant Map (SDM), designed to incorporate class information and prevent overfitting when used on high-dimensional datasets, is a promising candidate to reduce the size and dimensionality of IMS datasets. However, SDM currently requires manual setting of a free parameter β that influences class separation in the newly constructed feature space. This thesis explores the use of SDM on IMS datasets in classification use cases and proposes a framework to set β in a data-driven way: Data-Driven Soft Discriminant Map (DD-SDM). Furthermore, the sensitivity of the classification performance to changes in β is examined. DD-SDM is compared to similar state-of-the-art dimensionality reduction methods in terms of classification performance.
The performed experiments show that DD-SDM successfully finds a value for β where the classification performance is on par with, or in some scenarios better than, state-of-the-art dimensionality reduction methods while using fewer features. Setting β either too low or too high results in a suboptimal feature space and worsens classification performance. Golden section search, the search strategy used to find the optimal β in DD-SDM, succeeds in finding the optimal β in fewer iterations than more naive methods. With the use of an artificial dataset in combination with a novel evaluation metric, the Peak Conservation Score (PCS), the distinctive ability of DD-SDM to discard features that are common between classes and to actively select for discriminative features is demonstrated. The DD-SDM framework is furthermore applied to real-world IMS measurements of rat brain and mouse kidney tissue.
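
The abstract names golden section search as the strategy for locating β. The block below is a minimal, generic sketch of that search in Python, not the thesis implementation. The function name `golden_section_maximize`, the search interval [0, 1], the tolerance, and the stand-in objective `toy_score` are all assumptions made for illustration. In a DD-SDM-like setting the objective would presumably be a classification performance measure obtained after projecting the data for a given β. The search also assumes the objective has a single peak on the interval, which matches the abstract's remark that setting β either too low or too high worsens performance.

```python
import math


def golden_section_maximize(score, lo, hi, tol=1e-3, max_iter=100):
    """Locate the maximizer of a unimodal function `score` on [lo, hi].

    Returns the midpoint of the final bracket and the number of
    times `score` was evaluated.
    """
    inv_phi = (math.sqrt(5) - 1) / 2  # 1 / golden ratio, about 0.618
    a, b = lo, hi
    # Two interior probe points, placed symmetrically in [a, b].
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc, fd = score(c), score(d)
    evaluations = 2
    for _ in range(max_iter):
        if b - a <= tol:
            break
        if fc > fd:
            # The peak lies in [a, d]; the old c becomes the new d.
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = score(c)
        else:
            # The peak lies in [c, b]; the old d becomes the new c.
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = score(d)
        evaluations += 1  # only one new evaluation per iteration
    return (a + b) / 2, evaluations


def toy_score(beta):
    """Hypothetical stand-in for classification performance; peaks at 0.3."""
    return 1.0 - (beta - 0.3) ** 2


best_beta, n_evals = golden_section_maximize(toy_score, 0.0, 1.0)
print(f"best beta ~ {best_beta:.3f} after {n_evals} evaluations")
```

Each iteration reuses one of the two previous probe points, so only one new evaluation of the objective is needed per step while the bracket shrinks by a constant factor of about 0.618. This is what makes the approach cheaper than evaluating a dense grid of β values when every evaluation involves training and testing a classifier.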

Files

Master_Thesis_Data_Driven_Soft... (pdf)

(pdf | 19.8 MB)

License info not available