Robust Multi-label Active Learning for Missing Labels


Multi-label classification has attracted considerable attention in the field of computer vision over the past couple of years. In this setting, each instance can belong to multiple classes simultaneously. There are numerous methods for multi-label classification, but all of them assume either that the training images are completely labelled or that label correlations are given. Since Active Learning is frequently used when little data is available, it can be used to determine the missing labels by querying an oracle. This paper proposes a novel solution that combines the current state of the art for multi-label classification with Active Learning to infer the missing labels. This is done with sampling strategies that try to select the most informative sample from the dataset by considering the number of missing labels. With these strategies, we try to minimize the relabeling cost for all samples while maximizing the information gained. The chosen method, called Hard sampling with entropy, selects those samples that both the model and we find informative. The chosen measure and the other measure are evaluated on a subset of the MSCOCO dataset at 20%, 40% and 60% noise. Hard sampling with entropy outperforms the state of the art by more than 30%, and the baseline sampling method by 2% at 60% noise.
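
The abstract does not define Hard sampling with entropy, so the following is only a minimal sketch of generic entropy-based uncertainty sampling for multi-label predictions, not the paper's method. It assumes the model outputs an independent probability per label, scores each sample by the sum of its per-label binary entropies, and queries the oracle for the highest-scoring samples. The function names, the `budget` parameter, and the example probabilities are all hypothetical.

```python
import math


def binary_entropy(p, eps=1e-12):
    """Entropy (in nats) of a Bernoulli variable with probability p."""
    # Clamp p away from 0 and 1 so that log() is always defined.
    p = min(max(p, eps), 1.0 - eps)
    return -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))


def sample_entropy(label_probs):
    """Total uncertainty of one sample: sum of its per-label entropies."""
    return sum(binary_entropy(p) for p in label_probs)


def select_queries(pred_probs, budget):
    """Return the indices of the `budget` most uncertain samples."""
    scores = [sample_entropy(row) for row in pred_probs]
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:budget]


# Hypothetical predicted label probabilities for three images, three labels.
pred_probs = [
    [0.99, 0.01, 0.98],  # model is confident about every label
    [0.55, 0.45, 0.60],  # model is uncertain about every label
    [0.90, 0.50, 0.10],  # model is uncertain about one label
]

# Indices of the two samples that would be sent to the oracle.
queries = select_queries(pred_probs, budget=2)
print(queries)
```

In this sketch the sample with probabilities closest to 0.5 is ranked first, since those predictions carry the least information about the true labels. A strategy that also accounts for the number of missing labels per sample, as the abstract describes, would need an additional term that this sketch does not attempt to reproduce.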