Self-supervised Learning for Tumor Microenvironment Analysis

Addressing Label Scarcity in Multiplexed Immunofluorescence Imaging with Novel Feature Extraction Techniques

Master Thesis (2023)
Author(s)

D.M. Spengler (TU Delft - Mechanical Engineering)

Contributor(s)

C.S. Smith – Mentor (TU Delft - BN/Nynke Dekker Lab)

Hayri E. Balcioglu – Mentor (Erasmus MC)

R. Van de Plas – Graduation committee member (TU Delft - Team Raf Van de Plas)

S. Korovin – Graduation committee member (TU Delft - Team Carlas Smith)

Faculty
Mechanical Engineering
Copyright
© 2023 Daniel Spengler
Publication Year
2023
Language
English
Graduation Date
17-05-2023
Awarding Institution
Delft University of Technology
Programme
Mechanical Engineering | Systems and Control
Sponsors
Erasmus MC
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Cancer is a disease characterized by the uncontrolled growth and spread of tumor cells, and the study of tumor microenvironments (TMEs) and their immune cell composition has become increasingly important for understanding tumor progression and patient outcomes. Tools such as the TME-Analyzer enable this kind of research, but their manual workflows highlight a common problem in medical imaging: the scarcity of labeled data. This scarcity limits how effectively supervised learning algorithms can be applied to improve such medical image analysis tools. Self-supervised learning algorithms offer a promising alternative because they learn feature representations without requiring labeled data. This thesis addresses label scarcity by exploring the potential of self-supervised learning models for TME analysis, specifically the classification of individual cells in multiplexed immunofluorescence (MxIF) microscopy images of triple-negative breast cancer (TNBC) tissue.

To enable the learning of feature representations from MxIF images with an arbitrary number of color channels, this thesis proposes pre-training an encoder network on every image channel separately using the SimCLR algorithm. Multi-channel images are then classified by feeding the concatenated per-channel feature representations to a classifier network, an arrangement referred to as the Siamese configuration. A hyperparameter search is conducted to optimize the SimCLR encoder's ability to learn high-quality feature representations of individual cells in MxIF images of TNBC tissue. Once an optimal set of hyperparameters has been obtained, the learned feature representations are assessed for how much they improve label efficiency in individual cell classification.
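
The data flow of the Siamese configuration described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: it assumes a single encoder whose weights are shared across channels (the usual reading of "Siamese", which the abstract does not spell out), and it replaces the pre-trained SimCLR encoder with a toy linear projection with random weights. All names (`encode_channel`, `siamese_features`, `FEATURE_DIM`, `CROP_SIZE`) are hypothetical and do not come from the thesis.

```python
import random

FEATURE_DIM = 4   # hypothetical size of each per-channel feature vector
CROP_SIZE = 3     # hypothetical side length of a single-cell crop, in pixels


def make_encoder_weights(n_pixels, feature_dim, seed=0):
    """Random weights standing in for a pre-trained SimCLR encoder."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(n_pixels)]
            for _ in range(feature_dim)]


def encode_channel(channel, weights):
    """Map one single-channel crop (a list of pixel rows) to a feature vector.

    Toy encoder: flatten the crop, apply a linear projection, then ReLU.
    """
    pixels = [p for row in channel for p in row]
    return [max(0.0, sum(w * p for w, p in zip(w_row, pixels)))
            for w_row in weights]


def siamese_features(image, weights):
    """Encode every channel with the same weights and concatenate the results."""
    features = []
    for channel in image:
        features.extend(encode_channel(channel, weights))
    return features


def random_image(n_channels, seed):
    """Generate a toy multi-channel crop with intensities in [0, 1)."""
    rng = random.Random(seed)
    return [[[rng.random() for _ in range(CROP_SIZE)]
             for _ in range(CROP_SIZE)]
            for _ in range(n_channels)]


weights = make_encoder_weights(CROP_SIZE * CROP_SIZE, FEATURE_DIM)

# The same encoder handles marker panels with different channel counts.
features_5 = siamese_features(random_image(5, seed=1), weights)
features_7 = siamese_features(random_image(7, seed=2), weights)
print(len(features_5), len(features_7))
```

In this sketch the length of the concatenated vector grows with the number of channels, so only the downstream classifier would need to match a given marker panel, while the encoder itself is reused unchanged.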

The results demonstrate that the proposed Siamese configuration improves the accuracy of classifying the inflammation status of TNBC tumor sections by 2.63%. The optimal set of hyperparameters identified through the search includes the normalized temperature-scaled cross-entropy (NT-Xent) loss function with a low temperature and an added image intensity thresholding term, together with zoom and brightness/contrast augmentations. Furthermore, the optimized self-supervised learning model improves label efficiency for individual cell classification: performance is maintained when only 40% of the labeled data is used and degrades only when the label fraction is reduced below this threshold.
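
For readers unfamiliar with the loss named above, the following is a minimal sketch of the standard NT-Xent loss as used in SimCLR, written in plain Python. It is not the thesis implementation: the image intensity thresholding term is omitted because the abstract does not define it, and the function names, the toy embeddings, and the temperature value of 0.1 are illustrative assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nt_xent(z_a, z_b, temperature=0.1):
    """Standard NT-Xent loss over a batch of positive pairs.

    z_a[i] and z_b[i] are embeddings of two augmented views of sample i;
    every other embedding in the batch acts as a negative.
    """
    z = z_a + z_b
    n = len(z_a)
    total = 0.0
    for i in range(2 * n):
        j = (i + n) % (2 * n)  # index of the positive partner of i
        logits = [cosine(z[i], z[k]) / temperature
                  for k in range(2 * n) if k != i]
        positive = cosine(z[i], z[j]) / temperature
        # Numerically stable log-sum-exp over all candidates except i itself.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denominator - positive
    return total / (2 * n)


# Toy embeddings (illustrative values only).
views_a = [[1.0, 0.0], [0.0, 1.0]]
views_b = [[1.0, 0.1], [0.1, 1.0]]

loss_aligned = nt_xent(views_a, views_b)            # views correctly paired
loss_mismatched = nt_xent(views_a, views_b[::-1])   # pairing deliberately broken
print(loss_aligned, loss_mismatched)
```

Dividing the similarities by a small temperature sharpens the softmax, so negatives that resemble the anchor are penalized more strongly. This is the general role of the temperature in NT-Xent and is offered here only as background for the "low temperature" finding.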
