Learning Human Preferences for Motion Planning in Robotic Manipulation
S. Avaei (TU Delft - Mechanical Engineering)
J. Kober – Mentor (TU Delft - Learning & Autonomous Control)
Luka Peternel – Mentor (TU Delft - Human-Robot Interaction)
L.F. van der Spaa – Mentor (TU Delft - Mechanical Engineering)
Abstract
Humans often demonstrate diverse behaviours due to their personal preferences, for instance related to their individual execution style or personal margin for safety. In this paper, we consider the problem of integrating such preferences into trajectory planning for robotic manipulators. We first learn
reward functions that represent the user's path and motion preferences from kinesthetic demonstrations. We then use a discrete-time trajectory optimization scheme to produce trajectories that adhere to both task requirements and user preferences. Our work goes beyond the state of the art by achieving generalization
of preferences to new task instances, and by designing a large feature set that enables capturing dynamic aspects of the manipulation, such as preferences about the timing of motion. We implement our algorithm on a Franka Emika Panda 7-DoF robotic arm, and demonstrate the functionality and flexibility of our approach in a user study. The results show that non-expert users are able to teach the robot their preferences with just a few iterations of feedback.
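The core idea described in the abstract can be sketched as follows: represent the reward as a weighted combination of trajectory features (with weights learned from demonstrations) and optimize discrete waypoints to maximize that reward while holding the task-required endpoints fixed. This is a minimal illustrative sketch only, not the paper's implementation; the feature names, the finite-difference gradient ascent, and all parameter values here are assumptions, and the paper's actual feature set and optimizer differ.

```python
import numpy as np

def feature_vector(traj):
    """Hypothetical trajectory features for a (T, d) array of waypoints:
    path length, smoothness (sum of squared accelerations), mean height
    (last coordinate). The paper's feature set is much larger."""
    vel = np.diff(traj, axis=0)
    acc = np.diff(vel, axis=0)
    return np.array([
        np.sum(np.linalg.norm(vel, axis=1)),  # total path length
        np.sum(acc ** 2),                     # smoothness penalty term
        np.mean(traj[:, -1]),                 # mean height
    ])

def optimize_trajectory(start, goal, weights, T=10, iters=50, lr=1e-2):
    """Gradient-ascent sketch of discrete-time trajectory optimization:
    maximize weights @ feature_vector(traj) over interior waypoints,
    keeping start and goal fixed (the task requirement)."""
    traj = np.linspace(start, goal, T)  # straight-line initial guess
    eps = 1e-4
    for _ in range(iters):
        grad = np.zeros_like(traj)
        base = weights @ feature_vector(traj)
        # Numerical gradient w.r.t. each interior waypoint coordinate.
        for t in range(1, T - 1):
            for d in range(traj.shape[1]):
                pert = traj.copy()
                pert[t, d] += eps
                grad[t, d] = (weights @ feature_vector(pert) - base) / eps
        traj[1:-1] += lr * grad[1:-1]  # ascend the preference reward
    return traj
```

In the full approach, the weights would be inferred from kinesthetic demonstrations (e.g. so that demonstrated trajectories score higher than alternatives) rather than set by hand; here they simply stand in for a learned user preference, such as favouring higher, smoother motions.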