Multi-AL: Robust Active Learning for Multi-label Classifiers

Bachelor Thesis (2021)
Author(s)

M.J. Basting (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

Lydia Chen – Graduation committee member (TU Delft - Data-Intensive Systems)

Taraneh Younesian – Mentor (TU Delft - Data-Intensive Systems)

Amirmasoud Ghiassi – Mentor (TU Delft - Data-Intensive Systems)

F.A. Kuipers – Graduation committee member (TU Delft - Embedded Systems)

Faculty
Electrical Engineering, Mathematics and Computer Science
Copyright
© 2021 Mark Basting
Publication Year
2021
Language
English
Graduation Date
02-07-2021
Awarding Institution
Delft University of Technology
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Multi-label learning is becoming more and more important as real-world data often carries multiple labels. The dataset used to train such a classifier is of great importance, yet acquiring a correctly labelled dataset is a difficult task. Active learning is a method which can, given a noisy dataset, identify important instances for an expert to label. This greatly reduces the number of instances needed to train an accurate classifier, and thus reduces the cost of cleaning a noisy dataset. This paper therefore presents an active learning algorithm, focused on wrongly labelled data, combined with a deep neural network for multi-label image classification. The proposed active learning solution is divided into two measures, a mislabelling likelihood and an informativeness measure, together with an option to identify and use highly probable clean instances in the dataset. Experiments performed on the real-world Microsoft COCO dataset, with 20, 40 and 60% injected label noise, show that Multi-AL outperforms ASL, the current state-of-the-art multi-label learning algorithm, by 28% while using only 600 labelled instances in total and 250 extracted 'clean' instances. Multi-AL additionally outperforms random sampling by 3% on average for 20 and 40% random label noise when sampling from a wrongly labelled dataset of 23k instances.
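The abstract describes ranking noisy instances for expert relabelling by combining two per-instance measures. The sketch below illustrates one plausible way such a combination could work; the thesis's actual definitions of the two measures are not given here, so the entropy-based informativeness, the prediction–label disagreement score, and the weighting parameter `alpha` are all illustrative assumptions, not the author's method.

```python
import numpy as np

def entropy_informativeness(probs):
    """Informativeness as mean binary entropy over the label dimension
    (illustrative choice). probs: (n_instances, n_labels) in [0, 1]."""
    p = np.clip(probs, 1e-12, 1 - 1e-12)
    ent = -(p * np.log(p) + (1 - p) * np.log(1 - p))
    return ent.mean(axis=1)

def mislabelling_likelihood(probs, noisy_labels):
    """Mislabelling likelihood as mean disagreement between predicted
    probabilities and the (possibly noisy) 0/1 labels (illustrative)."""
    return np.abs(probs - noisy_labels).mean(axis=1)

def select_for_expert(probs, noisy_labels, budget, alpha=0.5):
    """Rank instances by a weighted sum of the two measures and return
    the indices of the top-`budget` candidates to hand to the expert."""
    score = (alpha * mislabelling_likelihood(probs, noisy_labels)
             + (1 - alpha) * entropy_informativeness(probs))
    return np.argsort(score)[::-1][:budget]
```

In a full active-learning loop, the classifier would be retrained after each relabelling round and the scores recomputed on the remaining noisy pool, with the expert budget (e.g. the 600 labelled instances mentioned above) spread across rounds.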
