TrustNet

Learning from Trusted Data Against (A)symmetric Label Noise

Conference Paper (2021)
Author(s)

S. Ghiassi (TU Delft - Data-Intensive Systems)

Robert Birke (ABB (Switzerland))

Y. Chen (TU Delft - Data-Intensive Systems)

Research Group
Data-Intensive Systems
Copyright
© 2021 S. Ghiassi, Robert Birke, Lydia Y. Chen
DOI related publication
https://doi.org/10.1145/3492324.3494166
Publication Year
2021
Language
English
Pages (from-to)
52-62
ISBN (electronic)
9781450391641
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Big Data systems allow collecting massive datasets to feed data-hungry deep learning. Labelling these ever-bigger datasets is increasingly challenging, and label errors affect even highly curated sets. This makes robustness to label noise a critical property for weakly-supervised classifiers. Related work on resilient deep networks tends to focus on a limited set of synthetic noise patterns and holds disparate views on their impact, e.g., robustness against symmetric vs. asymmetric noise patterns. In this paper, we first extend the theoretical analysis of test accuracy to any given noise pattern. Based on these insights, we design TrustNet, which first learns the pattern of noise corruption, whether symmetric or asymmetric, from a small set of trusted data. Then, TrustNet is trained via a robust loss function, which weights the given labels against the labels inferred from the learned noise pattern. The weight is adjusted based on model uncertainty across training epochs. We evaluate TrustNet on synthetic label noise for CIFAR-10 and CIFAR-100, and on large real-world data with label noise, i.e., Clothing1M. We compare against state-of-the-art methods, demonstrating the strong robustness of TrustNet under a diverse set of noise patterns.
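To make the two ideas in the abstract concrete, the following is a minimal illustrative sketch and not the authors' implementation. It assumes that a noise pattern can be summarised as a transition matrix T with T[i][j] = P(noisy label = j | true label = i), that the pattern is estimated by simple counting over a trusted set in which both true and noisy labels are known, and that the robust loss is a convex combination of two cross-entropy terms with a weight `alpha` supplied by the caller. All function names, the "flip to the next class" asymmetric pattern, the uniform class prior, and the way `alpha` enters the loss are hypothetical choices made for illustration; the abstract does not specify them.

```python
import math
import random


def symmetric_noise_matrix(k, rate):
    """Symmetric noise: a label flips to each other class with equal probability."""
    off = rate / (k - 1)
    return [[1.0 - rate if i == j else off for j in range(k)] for i in range(k)]


def asymmetric_noise_matrix(k, rate):
    """One hypothetical asymmetric pattern: class i flips only to class (i + 1) % k."""
    T = [[0.0] * k for _ in range(k)]
    for i in range(k):
        T[i][i] = 1.0 - rate
        T[i][(i + 1) % k] += rate
    return T


def corrupt_labels(true_labels, T, rng):
    """Sample a noisy label for each true label according to row T[y]."""
    classes = range(len(T))
    return [rng.choices(classes, weights=T[y])[0] for y in true_labels]


def estimate_noise_matrix(true_labels, noisy_labels, k):
    """Estimate T by counting (true, noisy) pairs in the trusted set (assumed estimator)."""
    counts = [[0] * k for _ in range(k)]
    for y, y_noisy in zip(true_labels, noisy_labels):
        counts[y][y_noisy] += 1
    T_hat = []
    for i, row in enumerate(counts):
        total = sum(row)
        if total == 0:
            # No trusted samples for this class: fall back to "no noise".
            T_hat.append([1.0 if i == j else 0.0 for j in range(k)])
        else:
            T_hat.append([c / total for c in row])
    return T_hat


def infer_label_distribution(noisy_label, T_hat):
    """Posterior over true labels given a noisy label, assuming a uniform class prior."""
    column = [row[noisy_label] for row in T_hat]
    total = sum(column)
    if total == 0:
        return [1.0 / len(T_hat)] * len(T_hat)
    return [c / total for c in column]


def weighted_loss(probs, given_label, inferred_dist, alpha):
    """alpha * CE(given label) + (1 - alpha) * CE(inferred label distribution)."""
    eps = 1e-12  # avoids log(0)
    ce_given = -math.log(probs[given_label] + eps)
    ce_inferred = -sum(q * math.log(p + eps) for q, p in zip(inferred_dist, probs))
    return alpha * ce_given + (1.0 - alpha) * ce_inferred


if __name__ == "__main__":
    rng = random.Random(0)
    T_true = asymmetric_noise_matrix(4, 0.3)
    trusted_true = [rng.randrange(4) for _ in range(2000)]
    trusted_noisy = corrupt_labels(trusted_true, T_true, rng)
    T_hat = estimate_noise_matrix(trusted_true, trusted_noisy, 4)
    q = infer_label_distribution(1, T_hat)
    print(weighted_loss([0.1, 0.7, 0.1, 0.1], 1, q, alpha=0.5))
```

In this sketch, `alpha = 1` trusts the given labels completely and `alpha = 0` relies only on the labels inferred from the estimated noise pattern. How the weight is scheduled from model uncertainty across epochs is left out, since the abstract does not describe it.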