Robust Learning via Golden Symmetric Loss of (un)Trusted Labels

Conference paper (2023)

Authors

S. Ghiassi (Data-Intensive Systems)

Robert Birke (University of Turin)

Lydia Y. Chen (Data-Intensive Systems)

Research Group

Data-Intensive Systems (TU Delft)

Keywords

Deep learning models, Noisy labels, Robust training, Symmetric loss function

To reference this document use:

http://resolver.tudelft.nl/uuid:753e81de-d1cc-4c05-9c0a-1fe0a796a583

Published Date

2023

Language

English

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Faculty

Electrical Engineering, Mathematics and Computer Science

Department

Software Technology

Research Group

Data-Intensive Systems

Abstract

Learning deep models that are robust to noisy labels is increasingly critical, as today's data is commonly collected from open platforms and is subject to adversarial corruption. Information on the label corruption process, i.e., the corruption matrix, can greatly enhance the robustness of deep models, but methods that rely on it still fall behind in combating hard classes. In this paper, we propose to construct a golden symmetric loss (GSL) based on the estimated corruption matrix so as to avoid overfitting to noisy labels and to learn effectively from hard classes. GSL is the weighted sum of the corrected regular cross entropy and the reverse cross entropy. By leveraging a small fraction of trusted clean data, we estimate the corruption matrix and use it to correct the loss as well as to determine the weights of GSL. We theoretically prove the robustness of the proposed loss function in the presence of dirty labels. We provide a heuristic to adaptively tune the loss weights of GSL according to the noise rate and diversity measured from the dataset. We evaluate the proposed golden symmetric loss on both vision and natural language deep models subject to different types of label noise patterns. Empirical results show that GSL can significantly outperform existing robust training methods on different noise patterns, with accuracy improvements of up to 18% on CIFAR-100 and 1% on the real-world noisy dataset Clothing1M.
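To make the structure of the loss concrete, the following is a minimal NumPy sketch of a weighted sum of a corruption-corrected cross entropy and a reverse cross entropy. It is an illustration under stated assumptions, not the paper's implementation: the function names, the per-class averaging estimator for the corruption matrix, the fixed weights `alpha` and `beta`, and the `log_floor` constant standing in for log(0) are all hypothetical choices; the abstract does not give the exact formulas or the adaptive weight heuristic.

```python
import numpy as np

def softmax(logits):
    # Numerically stable row-wise softmax.
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def estimate_corruption_matrix(probs_trusted, true_labels, num_classes):
    """Assumed estimator: row i is the mean predicted noisy-label
    distribution over trusted samples whose clean label is i.
    `probs_trusted` are softmax outputs of a model trained on noisy labels."""
    C = np.zeros((num_classes, num_classes))
    for i in range(num_classes):
        mask = true_labels == i
        if mask.any():
            C[i] = probs_trusted[mask].mean(axis=0)
        else:
            C[i, i] = 1.0  # no trusted sample of class i: assume no corruption
    return C

def golden_symmetric_loss_sketch(logits, noisy_labels, C,
                                 alpha=1.0, beta=1.0, log_floor=-4.0):
    """Weighted sum of corrected cross entropy and reverse cross entropy.
    alpha, beta are placeholder weights (the paper tunes them adaptively);
    log_floor replaces log(0) for the one-hot label distribution."""
    n, k = logits.shape
    # Push clean-class predictions through C to predict the noisy label.
    p_noisy = np.clip(softmax(logits) @ C, 1e-7, 1.0)
    idx = np.arange(n)
    # Corrected cross entropy: -log p(noisy label | x).
    ce = -np.log(p_noisy[idx, noisy_labels])
    # Reverse cross entropy: -sum_k p_k * log q_k with q the one-hot label.
    log_q = np.full((n, k), log_floor)
    log_q[idx, noisy_labels] = 0.0
    rce = -(p_noisy * log_q).sum(axis=1)
    return float((alpha * ce + beta * rce).mean())
```

In this sketch the corruption matrix would be estimated once from the small trusted set, and `golden_symmetric_loss_sketch` would then be evaluated on the noisy training batches; an automatic-differentiation framework would be needed to train with it.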

Files

1.9781611977653.ch64.pdf

(.pdf | 1.43 Mb)

- Embargo expired on 11-01-2024