Population-based Active Learning for Black-Box Regression

Master Thesis (2020)

Author(s)

M. van Deursen (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

Matthijs T.J. Spaan – Mentor (TU Delft - Algorithmics)

Marco Loog – Graduation committee member (TU Delft - Pattern Recognition and Bioinformatics)

H. Farah – Mentor (TU Delft - Transport and Planning)

Faculty

Electrical Engineering, Mathematics and Computer Science

Keywords

Active Learning, Population-based, Black-Box, Regression

To reference this document use:

https://resolver.tudelft.nl/uuid:dada244e-3263-488f-b3dc-4f8012923b0d

More Info

Publication Year

2020

Language

English

Graduation Date

19-10-2020

Awarding Institution

Delft University of Technology

Programme

Computer Science

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Many applications employ models to represent real-life environments efficiently. To make these models realistic, they are commonly fitted to a dataset of labeled samples. When obtaining a label for a sample from the environment is expensive, it is key that the dataset contains only those samples that contribute most to a realistic model. Active Learning (AL) provides search strategies for selecting these samples based on different heuristics: diversity, informativeness, and representativeness. This thesis focuses specifically on population-based AL for regression, where both the sample space and the output space are infinite. Its goal is to create a performant, efficient, extensible, and generally applicable selection strategy for this setting. To achieve general applicability, the strategy treats the model as a black box, so that it can be used with virtually any model. The strategy itself is modular, allowing for extensions. It iteratively concentrates on an interesting subregion of the sample space through three modular steps: discretizing the sample space, assigning fitness scores to this discretization, and further restricting the sample space based on these fitness scores. The strategy is applied to both a scientific polynomial setting and a car-following setting. Experiments show that this approach outperforms random sample selection in both cases, especially when a long labeling time is considered.
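
The three steps named in the abstract (discretize, score, restrict) can be illustrated with a minimal sketch. This is not the thesis's implementation. It assumes a one-dimensional sample space, uses only a diversity heuristic (distance to the nearest labeled sample) as the fitness score, and restricts the region by halving it around the best candidate of each round. All function names, parameters, and the example oracle are hypothetical.

```python
import random


def discretize(region, n_candidates, rng):
    """Step 1: draw a finite candidate population from the continuous region."""
    lo, hi = region
    return [rng.uniform(lo, hi) for _ in range(n_candidates)]


def fitness(candidate, labeled_xs):
    """Step 2: score a candidate. Assumption: diversity only, measured as the
    distance to the nearest already-labeled sample."""
    if not labeled_xs:
        return 0.0
    return min(abs(candidate - x) for x in labeled_xs)


def restrict(region, center, shrink):
    """Step 3: shrink the region around the best candidate, clipped to the
    current region. Assumption: a fixed shrink factor per round."""
    lo, hi = region
    half_width = shrink * (hi - lo) / 2
    return (max(lo, center - half_width), min(hi, center + half_width))


def select_sample(bounds, labeled_xs, n_rounds=3, n_candidates=20,
                  shrink=0.5, seed=0):
    """Iterate the three steps and return the best candidate seen overall."""
    rng = random.Random(seed)
    region = bounds
    best_x, best_score = None, float("-inf")
    for _ in range(n_rounds):
        candidates = discretize(region, n_candidates, rng)
        scores = [fitness(c, labeled_xs) for c in candidates]
        round_score, round_x = max(zip(scores, candidates))
        if round_score > best_score:
            best_x, best_score = round_x, round_score
        region = restrict(region, round_x, shrink)
    return best_x


def oracle(x):
    """Hypothetical expensive labeling function (a simple polynomial)."""
    return x ** 3 - x


# Active learning loop: the model is never inspected, only the labeled inputs.
xs, ys = [0.0], [oracle(0.0)]
for step in range(5):
    x_new = select_sample((-1.0, 1.0), xs, seed=step)
    xs.append(x_new)
    ys.append(oracle(x_new))
```

Because each step is a separate function, a different scoring rule (for example, one based on informativeness or representativeness) could replace `fitness` without touching the other two steps, which is the kind of modularity the abstract describes.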

Files

Max_van_Deursen_Thesis.pdf

(pdf | 2.44 Mb)

License info not available