Domain Adaptation for Rare Classes Augmented with Synthetic Samples

Bachelor Thesis (2021)
Author(s)

T. Das (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

R. Bruintjes – Mentor (TU Delft - Pattern Recognition and Bioinformatics)

A. Lengyel – Mentor (TU Delft - Pattern Recognition and Bioinformatics)

J.C. van Gemert – Mentor (TU Delft - Pattern Recognition and Bioinformatics)

S. Beery – Mentor (California Institute of Technology)

A. Zarras – Graduation committee member (TU Delft - Cyber Security)

Faculty
Electrical Engineering, Mathematics and Computer Science
Copyright
© 2021 Tuhin Das
Publication Year
2021
Language
English
Graduation Date
01-07-2021
Awarding Institution
Delft University of Technology
Project
CSE3000 Research Project
Programme
Computer Science and Engineering
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

To alleviate lower classification performance on rare
classes in imbalanced datasets, a possible solution is to augment the
underrepresented classes with synthetic samples. Domain adaptation can be
incorporated in a classifier to decrease the domain discrepancy between real
and synthetic samples. While domain adaptation is generally applied to
completely synthetic source domains and real target domains, we explore how
domain adaptation can be applied when only a single rare class is augmented
with simulated samples. As a testbed, we use a camera trap animal dataset with
a rare deer class, which is augmented with synthetic deer samples. We adapt
existing domain adaptation methods into two new methods for the single rare
class setting: DeerDANN, based on the Domain-Adversarial Neural Network (DANN),
and DeerCORAL, based on the deep correlation alignment (Deep CORAL)
architecture. Experiments show that DeerDANN yields the largest improvement in
deer classification accuracy over the baseline, 24.0%, versus 22.4% for
DeerCORAL. Further, both methods need fewer than the 10k synthetic samples used
by the baseline to achieve these higher accuracies. DeerCORAL requires the
fewest synthetic samples (2k deer), followed by DeerDANN (8k deer).
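
For readers unfamiliar with correlation alignment, the sketch below shows the
standard CORAL loss that Deep CORAL minimizes: the squared Frobenius distance
between the feature covariances of two domains, scaled by 1/(4 d^2). It is not
the thesis's DeerCORAL implementation. How DeerCORAL restricts the alignment to
the rare class is not described in this abstract, and the function name
coral_loss and the random "real" and "synthetic" feature batches are
illustrative assumptions.

```python
import numpy as np


def coral_loss(source_feats, target_feats):
    """Standard CORAL loss between two feature batches of shape (n, d).

    Illustrative helper (hypothetical name), not code from the thesis.
    """
    d = source_feats.shape[1]
    # Unbiased feature covariance matrices (columns are feature dimensions).
    cov_s = np.atleast_2d(np.cov(source_feats, rowvar=False))
    cov_t = np.atleast_2d(np.cov(target_feats, rowvar=False))
    # Squared Frobenius norm of the covariance gap, scaled by 1 / (4 d^2).
    return float(np.sum((cov_s - cov_t) ** 2) / (4.0 * d * d))


# Made-up feature batches standing in for real and synthetic deer features.
rng = np.random.default_rng(0)
real = rng.normal(size=(64, 8))
synthetic = 2.0 * rng.normal(size=(64, 8)) + 1.0

loss_same = coral_loss(real, real)          # identical statistics
loss_shift = coral_loss(real, real + 5.0)   # mean shift only
loss_diff = coral_loss(real, synthetic)     # different covariance
```

Because the loss compares only second-order statistics, a pure mean shift
between domains leaves it at (numerically) zero, while a change in feature
scale or correlation raises it. In a deep network this term would typically be
added to the classification loss with a weighting factor.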


