Multi Label Loss Correction against Missing and Corrupted Labels
S. Ghiassi (TU Delft - Data-Intensive Systems)
Robert Birke (University of Turin)
Y. Chen (TU Delft - Data-Intensive Systems)
Abstract
Missing and corrupted labels can severely degrade the learning process and, consequently, classifier performance. Multi-label learning, where each instance is tagged with a variable number of labels, is particularly affected. Although missing labels (false-negatives) are a well-studied problem in multi-label learning, it is considerably more challenging to handle false-negatives (missing labels) and false-positives (corrupted labels) simultaneously in multi-label datasets. In this paper, we propose Multi-Label Loss with Self Correction (MLLSC), a loss that is robust against coincident missing and corrupted labels. MLLSC computes the loss based on the true-positive (true-negative) or false-positive (false-negative) labels and the expertise of the deep neural network. To distinguish false-positive (false-negative) from true-positive (true-negative) labels, we use the output probability of the deep neural network during the learning process. As MLLSC can be combined with different types of multi-label loss functions, it also addresses the label imbalance problem of multi-label datasets. Empirical evaluation on real-world vision datasets, i.e., MS-COCO and MIR-FLICKR, shows that under medium (0.3) and high (0.6) corrupted- and missing-label probabilities, our method outperforms the state-of-the-art methods by, on average, 23.97% and 9.31% mean average precision (mAP) points, respectively.
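The self-correction idea described in the abstract — using the network's output probability to spot labels that are likely false-positives or false-negatives before computing the loss — can be illustrated with a minimal sketch. Note that the exact MLLSC formulation is not given on this page; the confidence threshold, the hard label-flipping rule, and the plain binary cross-entropy base loss below are all illustrative assumptions, not the authors' method.

```python
import numpy as np

def self_correcting_bce(probs, labels, threshold=0.9):
    """Hedged sketch of loss self-correction (NOT the exact MLLSC loss).

    When the network is highly confident that an observed label is wrong,
    the label is treated as flipped before computing binary cross-entropy.
    `threshold` and the flipping rule are illustrative assumptions.
    """
    probs = np.clip(probs, 1e-7, 1 - 1e-7)
    corrected = labels.astype(float).copy()
    # Suspected missing label (false-negative): annotated 0,
    # but the model is confident the label should be 1.
    corrected[(labels == 0) & (probs > threshold)] = 1.0
    # Suspected corrupted label (false-positive): annotated 1,
    # but the model is confident the label should be 0.
    corrected[(labels == 1) & (probs < 1 - threshold)] = 0.0
    # Standard binary cross-entropy on the corrected labels.
    loss = -(corrected * np.log(probs) + (1 - corrected) * np.log(1 - probs))
    return loss.mean()
```

Under this sketch, a label the model confidently contradicts contributes a small loss term (because it is flipped) instead of a large one, so confidently suspected noise stops dominating the gradient; the actual method additionally weighs in the network's "expertise" and composes with imbalance-aware multi-label losses.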