Design and Interpretation of a Convolutional Neural Network Architecture for Imaging Mass Spectrometry Data

Abstract

Convolutional Neural Networks (CNNs) have emerged primarily from research on image classification tasks and, as a result, most of the well-motivated design choices found in the literature are relevant to computer vision applications. The application of CNNs to Imaging Mass Spectrometry (IMS) data is quite recent and involves new challenges, such as taking into account the unique structure of the data (e.g. both spatial and spectral dimensions).

In this thesis, we suggest a 1-D CNN architecture that extracts local features along the spectral dimension. The aim is to investigate whether CNNs improve the classification accuracy compared to classic Machine Learning (ML) methods such as linear models. Furthermore, we explore Neural Networks (NNs) that employ the novel Sharpened Cosine Similarity (SCS) as a feature extraction method, as opposed to convolution. We call these networks SCS-NNs, by analogy with Convolutional NNs (CNNs). To evaluate these methods, we apply our pipeline to various IMS datasets with different characteristics and classification tasks, using several performance metrics such as balanced accuracy and F1 score.
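
To illustrate how SCS differs from convolution as a feature extractor along the spectral dimension, the following is a minimal sketch of a single SCS kernel sliding over one spectrum. It assumes the commonly cited formulation of SCS, sign(s·k) · (|s·k| / ((‖s‖ + q)(‖k‖ + q)))^p, which may differ in detail from the one used in the thesis. The function names, the toy spectrum and the default values of `p` and `q` are hypothetical.

```python
import math


def sharpened_cosine_similarity(window, kernel, p=2.0, q=1e-3):
    """SCS between one spectral window and one kernel of equal length.

    p sharpens the response (assumed exponent), q is a small constant
    that keeps the denominator away from zero (assumed value).
    """
    dot = sum(s * k for s, k in zip(window, kernel))
    norm_s = math.sqrt(sum(s * s for s in window))
    norm_k = math.sqrt(sum(k * k for k in kernel))
    cosine = abs(dot) / ((norm_s + q) * (norm_k + q))
    # Raise the magnitude to the power p, then restore the sign of the dot product.
    return math.copysign(cosine ** p, dot)


def scs_1d(spectrum, kernel, stride=1, p=2.0, q=1e-3):
    """Slide one SCS kernel along the spectral axis of a single spectrum."""
    width = len(kernel)
    return [
        sharpened_cosine_similarity(spectrum[i:i + width], kernel, p, q)
        for i in range(0, len(spectrum) - width + 1, stride)
    ]


def conv_1d(spectrum, kernel, stride=1):
    """Plain sliding dot product (what a CNN layer computes), for comparison."""
    width = len(kernel)
    return [
        sum(s * k for s, k in zip(spectrum[i:i + width], kernel))
        for i in range(0, len(spectrum) - width + 1, stride)
    ]


# Toy spectrum with two peaks of the same shape but a tenfold intensity difference.
toy_spectrum = [0, 1, 2, 1, 0, 0, 10, 20, 10, 0]
peak_kernel = [1, 2, 1]
scs_response = scs_1d(toy_spectrum, peak_kernel)
conv_response = conv_1d(toy_spectrum, peak_kernel)
```

In this toy case the SCS response is close to 1 at both peaks because it depends on the shape of the window rather than its magnitude, whereas the sliding dot product scales with peak intensity.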

Moreover, we provide a detailed description of the methodology used for the CNN architecture design. The suggested methodology is based on the Tree-structured Parzen Estimator (TPE) algorithm, a Bayesian optimization technique, which we use for automated architecture selection. By implementing TPE, we efficiently explore and exploit a large and complex hyperparameter configuration space and automatically select optimal hyperparameters (such as the number of convolutional layers, kernel size, strides and learning rate). This automated approach reduces the time consumption, the errors, and the need for specialized knowledge in biology and biochemistry that would be associated with manual design. In addition to developing a pipeline for designing, training and evaluating a CNN for IMS data classification, we also apply a model-agnostic interpretation methodology based on SHapley Additive exPlanations (SHAP) and provide SHAP score maps that visualize the importance of features in the spatial dimension of the IMS datacube.
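
The core idea of TPE can be sketched for a single continuous hyperparameter. Past trials are split into a "good" and a "bad" group by a loss quantile, a Parzen (kernel density) estimator is fitted to each group, and the next configuration is the candidate that maximizes the ratio of the two densities. The sketch below is a deliberately simplified illustration and not the thesis pipeline: it assumes a fixed kernel bandwidth, one hyperparameter only, and a toy objective standing in for a validation loss. All names and default values are hypothetical.

```python
import math
import random


def parzen_density(points, bandwidth):
    """Return a Gaussian kernel density estimate built from the given points."""
    norm = len(points) * bandwidth * math.sqrt(2.0 * math.pi)

    def density(x):
        return sum(math.exp(-0.5 * ((x - p) / bandwidth) ** 2) for p in points) / norm

    return density


def tpe_suggest(history, low, high, rng, gamma=0.25, n_candidates=24):
    """Propose the next value from a history of (value, loss) pairs.

    Requires at least two past trials so that both groups are non-empty.
    """
    ranked = sorted(history, key=lambda trial: trial[1])
    n_good = max(1, math.ceil(gamma * len(ranked)))
    good = [x for x, _ in ranked[:n_good]]
    bad = [x for x, _ in ranked[n_good:]]
    bandwidth = (high - low) / 10.0  # fixed bandwidth: a simplifying assumption
    l_good = parzen_density(good, bandwidth)
    g_bad = parzen_density(bad, bandwidth)
    # Draw candidates from the "good" density, clipped to the search range.
    candidates = [
        min(high, max(low, rng.gauss(rng.choice(good), bandwidth)))
        for _ in range(n_candidates)
    ]
    # Keep the candidate with the highest good-to-bad density ratio.
    return max(candidates, key=lambda x: l_good(x) / (g_bad(x) + 1e-12))


def tpe_minimize(objective, low, high, n_trials=40, n_startup=8, seed=0):
    """Random startup trials followed by TPE-guided trials."""
    rng = random.Random(seed)
    history = []
    for trial in range(n_trials):
        if trial < n_startup:
            x = rng.uniform(low, high)
        else:
            x = tpe_suggest(history, low, high, rng)
        history.append((x, objective(x)))
    best = min(history, key=lambda t: t[1])
    return best, history


def toy_validation_loss(log10_lr):
    """Hypothetical stand-in for a validation loss, minimal at log10(lr) = -3."""
    return (log10_lr + 3.0) ** 2


best_trial, all_trials = tpe_minimize(toy_validation_loss, low=-6.0, high=0.0)
```

In practice, a search over the number of layers, kernel sizes and strides also involves discrete and conditional hyperparameters, which is what the tree-structured part of TPE handles. Established libraries implement the full algorithm.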

In this thesis, we present and analyse the automated selection of 1-D CNN architectures for IMS data classification based on the TPE algorithm. Furthermore, we investigate a novel alternative to convolution, SCS, and evaluate its strengths and weaknesses in IMS data classification. The experimental results show that the TPE-generated CNN architectures outperform all other classifiers applied. Finally, our interpretation of the CNN models reveals that classification accuracy alone might not be a sufficient criterion for trusting the model's output.