TrustNet

Learning from Trusted Data Against (A)symmetric Label Noise

Conference Paper (2021)

Author(s)

S. Ghiassi (TU Delft - Data-Intensive Systems)

Robert Birke (ABB (Switzerland))

Lydia Y. Chen (TU Delft - Data-Intensive Systems)

Research Group

Data-Intensive Systems

DOI related publication

https://doi.org/10.1145/3492324.3494166

Keywords

Deep neural networks; Noise estimation; Noise transition matrix; Noisy labels in big data; Robust loss function

To reference this document use:

https://resolver.tudelft.nl/uuid:23f38d97-b491-4b66-9830-6f0773e243c9

Publication Year

2021

Language

English

Pages (from-to)

52-62

ISBN (electronic)

9781450391641

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Big Data systems allow collecting massive datasets to feed data-hungry deep learning. Labelling these ever-bigger datasets is increasingly challenging, and label errors affect even highly curated sets. This makes robustness to label noise a critical property for weakly-supervised classifiers. Related work on resilient deep networks tends to focus on a limited set of synthetic noise patterns, with disparate views on their impact, e.g., robustness against symmetric vs. asymmetric noise patterns. In this paper, we first extend the theoretical analysis of test accuracy to any given noise pattern. Based on these insights, we design TrustNet, which first learns the pattern of noise corruption, be it symmetric or asymmetric, from a small set of trusted data. Then, TrustNet is trained via a robust loss function, which weights the given labels against the labels inferred from the learned noise pattern. The weight is adjusted based on model uncertainty across training epochs. We evaluate TrustNet on synthetic label noise for CIFAR-10 and CIFAR-100, and on big real-world data with label noise, i.e., Clothing1M. We compare against state-of-the-art methods, demonstrating the strong robustness of TrustNet under a diverse set of noise patterns.
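The two steps named in the abstract (learning a noise pattern from trusted data, then weighting given labels against inferred labels) can be illustrated with a minimal sketch. This is not the paper's implementation. It assumes the trusted set provides both a clean and a noisy label per sample, estimates the noise transition matrix by simple counting, assumes a uniform class prior when inferring labels, and uses a fixed weight `alpha` where the abstract describes a weight adjusted from model uncertainty. All function names are hypothetical.

```python
import math


def estimate_transition_matrix(clean_labels, noisy_labels, num_classes):
    """Estimate T[i][j] = P(noisy = j | clean = i) by counting (assumed estimator)."""
    counts = [[0.0] * num_classes for _ in range(num_classes)]
    for clean, noisy in zip(clean_labels, noisy_labels):
        counts[clean][noisy] += 1.0
    matrix = []
    for i, row in enumerate(counts):
        total = sum(row)
        if total > 0:
            matrix.append([c / total for c in row])
        else:
            # Class absent from the trusted set: assume its labels are not corrupted.
            matrix.append([1.0 if j == i else 0.0 for j in range(num_classes)])
    return matrix


def inferred_label_distribution(matrix, noisy_label):
    """P(clean = i | noisy label), assuming a uniform prior over clean classes."""
    column = [row[noisy_label] for row in matrix]
    total = sum(column)
    if total == 0:
        return [1.0 / len(column)] * len(column)
    return [value / total for value in column]


def weighted_loss(probs, noisy_label, matrix, alpha):
    """Mix cross-entropy on the given label with cross-entropy on the inferred labels.

    alpha is a fixed illustrative weight; the abstract describes a weight that is
    adjusted from model uncertainty across epochs, which is not modelled here.
    """
    eps = 1e-12  # avoids log(0)
    ce_given = -math.log(probs[noisy_label] + eps)
    inferred = inferred_label_distribution(matrix, noisy_label)
    ce_inferred = -sum(q * math.log(p + eps) for q, p in zip(inferred, probs))
    return alpha * ce_given + (1.0 - alpha) * ce_inferred


# Toy trusted set with two classes: one of four class-0 samples is flipped to class 1.
trusted_clean = [0, 0, 0, 0, 1, 1, 1, 1]
trusted_noisy = [0, 0, 0, 1, 1, 1, 1, 1]
T = estimate_transition_matrix(trusted_clean, trusted_noisy, num_classes=2)
loss = weighted_loss(probs=[0.3, 0.7], noisy_label=1, matrix=T, alpha=0.5)
```

In this toy example the noise is asymmetric: only class 0 is ever mislabelled, so a sample labelled 1 is treated as possibly belonging to class 0, while a sample labelled 0 is treated as certainly class 0.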

Files

3492324.3494166.pdf (PDF, 1.08 MB)