Learning Human Preferences for Motion Planning in Robotic Manipulation
Abstract
Humans often demonstrate diverse behaviours due to their personal preferences, for instance regarding their individual execution style or personal margin for safety. In this paper, we consider the problem of integrating such preferences into trajectory planning for robotic manipulators. We first learn reward functions that represent the user's path and motion preferences from kinesthetic demonstrations. We then use a discrete-time trajectory optimization scheme to produce trajectories that adhere to both task requirements and user preferences. Our work goes beyond the state of the art by generalizing preferences to new task instances, and by designing a large feature set that captures dynamical aspects of the manipulation, such as preferences about the timing of motion. We implement our algorithm on a 7-DoF Franka Emika Panda robotic arm, and demonstrate the functionality and flexibility of our approach in a user study. The results show that non-expert users can teach the robot their preferences with just a few iterations of feedback.