Sensitivities and where to find them

Domain shift robustness, attacks, and training variations in visual learning

Abstract

Machine learning aims to solve a task with an algorithm or statistical model that is trained on data, with or without labels. As a subcategory of machine learning, deep learning achieves good performance through its flexibility in end-to-end representation learning and architecture design. Despite these successes, the output of a deep net can be sensitive to various factors. This work examines three such sensitivity factors: distribution shifts, attacks, and human impact.
One factor that can impair the performance of a deep net is a distribution shift between the training data and the test data. Depending on the availability of data and labels, coping strategies for distribution shifts include domain adaptation, domain generalization, transfer learning and multi-domain learning. We first show how domain adaptation can help to mitigate the gap between historical and modern photos for visual place recognition. We show that this can be realized by focusing the network on the buildings rather than the background with an attention module. In addition, we introduce a domain adaptation loss to align the source domain and the target domain. We then move to domain generalization and show that enforcing domain invariant representations does not necessarily lead to good domain generalization performance. We suggest relaxing the constraint of learning domain invariant representations by instead learning representations that guarantee a domain invariant posterior but are not themselves necessarily domain invariant. We coin this type of representation a hypothesis invariant representation. Finally, we study multi-domain learning and transfer learning by applying deep learning to classify Parkinson’s disease. We show that a temporal attention mechanism is key for transferring useful information from large non-medical public video datasets to Parkinson videos. Weights are learned for the various tasks in this Parkinson dataset to decide a final score for each patient.
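
To give a concrete sense of the alignment objective mentioned above, the following is a minimal sketch that adds a simple mean-matching penalty between source and target features to the usual classification loss. It is a sketch only: the names model.backbone, model.classifier and the weight da_weight are placeholders, and the actual domain adaptation loss and attention module used in this work may differ.

```python
import torch.nn.functional as F

def feature_alignment_loss(source_feats, target_feats):
    """First-order (mean-matching) penalty between source and target features.

    A minimal stand-in for a domain adaptation loss: the closer the mean
    source and mean target features are, the better the two domains are
    aligned in feature space.
    """
    return (source_feats.mean(dim=0) - target_feats.mean(dim=0)).pow(2).sum()

def training_step(model, source_x, source_y, target_x, da_weight=0.1):
    # Supervised loss on the labelled source domain (e.g., modern photos).
    source_feats = model.backbone(source_x)
    cls_loss = F.cross_entropy(model.classifier(source_feats), source_y)

    # Unsupervised alignment with the unlabelled target domain (e.g., historical photos).
    target_feats = model.backbone(target_x)
    da_loss = feature_alignment_loss(source_feats, target_feats)

    return cls_loss + da_weight * da_loss
```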
A deep net is also sensitive to malicious attacks, e.g., adversarial classification attacks or explanation attacks. Adversarial classification attacks manipulate the classification result, while explanation attacks change the explanation heatmap without altering the original classification result. We observe that robustness to adversarial classification attacks is linked to the shape of the softmax function and can be improved by using a polynomial alternative, softRmax, which is based on a Cauchy class conditional distribution. This also shows that the performance of deep learning is sensitive to the choice of class conditional distribution. Regarding explanation attacks, we design several ways to attack the GradCAM explanation heatmap so that it becomes a predetermined target explanation that does not explain the classification result.
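
As a rough illustration of the Cauchy-based idea (not necessarily the exact softRmax formulation used in this work), the sketch below models each class with a Cauchy-like density centred on a learned prototype, so that the posterior is a ratio of polynomials in the feature-to-prototype distance rather than the exponential ratio used by softmax. The function name cauchy_posterior, the prototypes and the uniform class prior are illustrative assumptions.

```python
import torch

def cauchy_posterior(features, prototypes):
    """Polynomial (softmax-alternative) posterior from Cauchy-like class conditionals.

    Assumes class k is modelled as p(x | k) proportional to 1 / (1 + ||x - mu_k||^2)
    with a learned prototype mu_k and a uniform class prior, so the posterior decays
    polynomially rather than exponentially in the distance to each prototype.

    features:   (batch, dim) feature vectors
    prototypes: (num_classes, dim) one learned prototype per class
    """
    sq_dist = torch.cdist(features, prototypes).pow(2)   # (batch, num_classes)
    likelihood = 1.0 / (1.0 + sq_dist)                    # Cauchy-shaped kernel
    return likelihood / likelihood.sum(dim=-1, keepdim=True)
```

Because the Cauchy kernel has heavy, polynomial tails, the output probabilities change more gradually far from the prototypes than under an exponential softmax, which is one intuition for the improved robustness to adversarial classification attacks discussed above.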
We further explore the influence of human trainers on hyperparameter tuning during the training of deep nets. A user study is designed to explore the correlation between the performance of a network and the human trainer’s experience with deep learning. We find that deep learning experience is indeed correlated with the performance of the resulting deep net.
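
As an illustration of the kind of analysis such a user study could report, the sketch below computes a rank correlation between self-reported experience and the accuracy each participant reached after tuning. The numbers are purely hypothetical placeholders, not results from the actual study.

```python
from scipy.stats import spearmanr

# Hypothetical user-study data (placeholders, not real results): each entry is one
# participant's self-reported deep learning experience (ordinal scale) and the
# validation accuracy of the network after that participant tuned its hyperparameters.
experience = [0, 1, 1, 2, 3, 4, 5]
accuracy = [0.61, 0.67, 0.64, 0.71, 0.78, 0.80, 0.83]

rho, p_value = spearmanr(experience, accuracy)
print(f"Spearman rho = {rho:.2f}, p = {p_value:.3f}")
```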
