Segmented Active Reward Learning

Master's thesis (2017)

Authors

R.M. Olsthoorn

Contributors

J. Kober (mentor)

Department

Delft Center for Systems and Control (Mechanical, Maritime and Materials Engineering) (TU Delft)

To reference this document use:

http://resolver.tudelft.nl/uuid:136dca36-139d-48fa-b2c4-de1f4a94dd6d

Published Date

26-04-2017

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Faculty

Mechanical, Maritime and Materials Engineering

Abstract

The use of robotic systems outside the range of tasks currently common in industry requires the development of novel intelligent control methods. This thesis aims to improve on a recent machine learning method known as active reward learning, which teaches a robotic system a task using human expert ratings of demonstrated robot trajectories. Current implementations of this method use information collected from complete trajectories without regard for time-specific features. This work incorporates time segmentation as a new feature in two extensions of the active reward learning framework. In the first extension, demonstrations are still rated over entire trajectories, leaving extraction of the important time segments to the learning algorithm. In the second extension, the expert rates trajectory segments directly. Both algorithms are tested in a robot simulator. The results show that the new methods can learn simple end-effector tasks using a reasonable number of queries and rollouts.

Files

Msc_Thesis_R.M.Olsthoorn.pdf

(PDF | 2.41 MB)