Uncertainty-based Interactive Machine Learning

Abstract

Interactive machine learning describes a collection of methodologies in which a human user actively participates in a novice agent's learning process by providing corrective or evaluative feedback or demonstrative actions. A primary assumption in these methods is that user input is at worst near-optimal; however, a realistic set of demonstrations will often contain conflicting or poor examples, which degrade the quality of the learnt policies. This project explores methods for detecting such undesirable features in data and develops an algorithm for policy training with suboptimal demonstrations, while leveraging the generalisation and scalability of artificial neural networks. Uncertainty estimation, which offers a structured approach to quantifying a network's confidence in the accuracy of its output given the observed training data, is applied to detect unwanted features in a demonstration dataset. The particular focus of this project is conflicting data arising from scenarios with equivalent action choices, such as the obstacle avoidance setting. Following thorough testing in various environments, it is shown that novice policies can be trained to achieve a desired goal in multi-dimensional spaces with either discrete or continuous data, despite the presence of conflicts in the training data.
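To illustrate the core idea, the following is a minimal sketch (not the project's actual algorithm) of how uncertainty estimation can flag conflicting demonstrations. It uses a bootstrap ensemble of nearest-neighbour predictors in place of a neural network: where the same state is demonstrated with opposing actions (e.g. steering left or right around an obstacle), ensemble members disagree and the predictive spread is large. All data and function names here are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 1-D demonstration set: at state 0.0 the demonstrator
# sometimes steers left (-1.0) and sometimes right (+1.0) around an
# obstacle -- a conflicting example. Elsewhere the action is consistent.
states = np.array([-2.0, -2.0, -1.0, -1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 2.0, 2.0])
actions = np.array([0.0, 0.0, 0.0, 0.0, -1.0, 1.0, -1.0, 1.0, 0.0, 0.0, 0.0, 0.0])

def ensemble_disagreement(states, actions, query, k=20, rng=rng):
    """Predictive std. dev. of a bootstrap ensemble of 1-NN regressors.

    High disagreement at `query` suggests conflicting demonstrations.
    """
    n = len(states)
    preds = []
    for _ in range(k):
        idx = rng.integers(0, n, size=n)      # bootstrap resample
        s, a = states[idx], actions[idx]
        nearest = np.abs(s - query).argmin()  # 1-NN prediction
        preds.append(a[nearest])
    return float(np.std(preds))

# Disagreement is high at the conflicting state, low at a clean one.
u_conflict = ensemble_disagreement(states, actions, query=0.0)
u_clean = ensemble_disagreement(states, actions, query=-2.0)
```

States whose disagreement exceeds a threshold can then be treated specially during policy training rather than being averaged into an unsafe compromise action, which is the failure mode the abstract describes.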