Do we use the Right Measure? Challenges in Evaluating Reward Learning Algorithms

Journal Article (2023)

Author(s)

Nils Wilde (TU Delft - Mechanical Engineering)

Javier Alonso-Mora (TU Delft - Mechanical Engineering)

Research Group

Learning & Autonomous Control

Human Robot Interaction, Reward Learning

To reference this document use

https://resolver.tudelft.nl/uuid:e71a68da-f7df-4b9b-a24e-8cbc211d763a

Publication Year

2023

Language

English

Volume number

205

Pages (from-to)

1553-1562

Event

6th Conference on Robot Learning, CoRL 2022 (2022-12-14 - 2022-12-18), Auckland, New Zealand

Collections

Institutional Repository

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Reward learning is a highly active area of research in human-robot interaction (HRI), allowing a broad range of users to specify complex robot behaviour. Experiments with simulated user input play a major role in the development and evaluation of reward learning algorithms due to the availability of a ground truth. In this paper, we review measures for evaluating reward learning algorithms used in HRI, most of which fall into two classes. In a theoretical worst-case analysis and several examples, we show that both classes of measures can fail to effectively indicate how good the learned robot behaviour is. Thus, our work contributes to the characterization of sim-to-real gaps of reward learning in HRI.

Files

Wilde23a.pdf

(pdf | 0.817 MB)

License info not available