Relevance Detection of Unknown Classes through Cluster Distances

Based on Statistical Distance Measures in Feature Space

Master Thesis (2022)

Author(s)

D.D.D.T.D. Sitaldin (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

Kees Vuik – Mentor (TU Delft - Delft Institute of Applied Mathematics)

Gertjan Burghouts – Graduation committee member (TNO)

Faculty

Electrical Engineering, Mathematics and Computer Science

Keywords

Machine learning, Deep Learning, Computer Vision

To reference this document use:

https://resolver.tudelft.nl/uuid:7625c980-f44c-4c53-b171-a6425774d5c9

Publication Year

2022

Language

English

Graduation Date

25-10-2022

Awarding Institution

Delft University of Technology

Programme

Applied Mathematics

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

In the open world, machine learning (ML) models can encounter a multitude of unknown or novel classes. In surveillance, safety, or security use cases, unknown samples can pose potential threats that are hard to detect because the model has never been trained on them. At the same time, most of the unknowns encountered by a surveillance ML model will be harmless. Flagging every unknown therefore produces too many unwanted alerts and too much manual analysis of harmless samples.

In this thesis, for the first time (to the best of our knowledge), we develop a method that automatically assesses the relevance of unknown classes by modelling their image features as clusters (or distributions) and comparing them using statistical distance measures. Our use case lies in computer vision for military applications, where relevance is defined by user input. We define road vehicles as relevant classes and use them as our training set. Our aim is to build a model that classifies new, unseen road vehicles as ‘relevant unknowns’ while classifying harmless unknown birds, which are not part of the training set, as ‘irrelevant unknowns’. On the DomainNet dataset, we demonstrate that our novel method can very accurately determine the relevance of unknown classes at test time for both low- and high-dimensional data, with AUC scores ranging from 0.99 to a perfect 1.00.
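
The idea of comparing feature clusters with a statistical distance can be illustrated with a minimal sketch. It is not the thesis implementation: the abstract does not name the distance measure used, so the Bhattacharyya distance between fitted Gaussians is an assumption chosen for illustration, as are the function names (`fit_gaussian`, `bhattacharyya_distance`, `relevance_score`) and the synthetic 4-dimensional features standing in for real image features.

```python
import numpy as np


def fit_gaussian(features, eps=1e-6):
    """Model a cluster of feature vectors (n_samples x n_dims) as a Gaussian."""
    mean = features.mean(axis=0)
    # A small ridge keeps the covariance invertible (illustrative choice).
    cov = np.cov(features, rowvar=False) + eps * np.eye(features.shape[1])
    return mean, cov


def bhattacharyya_distance(mean_a, cov_a, mean_b, cov_b):
    """Bhattacharyya distance between two multivariate Gaussians."""
    cov_avg = 0.5 * (cov_a + cov_b)
    diff = mean_a - mean_b
    mean_term = 0.125 * diff @ np.linalg.solve(cov_avg, diff)
    _, logdet_avg = np.linalg.slogdet(cov_avg)
    _, logdet_a = np.linalg.slogdet(cov_a)
    _, logdet_b = np.linalg.slogdet(cov_b)
    cov_term = 0.5 * (logdet_avg - 0.5 * (logdet_a + logdet_b))
    return mean_term + cov_term


def relevance_score(unknown_features, known_clusters):
    """Distance from an unknown cluster to the nearest known (relevant) cluster.

    A smaller score means the unknown class lies closer to the relevant
    classes in feature space.
    """
    mean_u, cov_u = fit_gaussian(unknown_features)
    return min(
        bhattacharyya_distance(mean_u, cov_u, mean_k, cov_k)
        for mean_k, cov_k in known_clusters
    )


# Synthetic stand-ins for image features (assumption: real features would
# come from a trained network, which is not shown here).
rng = np.random.default_rng(0)
known_clusters = [
    fit_gaussian(rng.normal(loc=0.0, scale=1.0, size=(200, 4))),
    fit_gaussian(rng.normal(loc=3.0, scale=1.0, size=(200, 4))),
]
near_unknown = rng.normal(loc=0.5, scale=1.0, size=(200, 4))
far_unknown = rng.normal(loc=-6.0, scale=1.0, size=(200, 4))

near_score = relevance_score(near_unknown, known_clusters)
far_score = relevance_score(far_unknown, known_clusters)
print(f"near unknown: {near_score:.3f}, far unknown: {far_score:.3f}")
```

In this sketch, an unknown cluster with a small score would be a candidate ‘relevant unknown’ and one with a large score an ‘irrelevant unknown’. Where to place the threshold between the two is left open here, since the abstract does not specify it.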

Files

Thesis_Dewwret_Sitaldin.pdf

(PDF | 53.5 MB)

License info not available