L.F. van der Spaa | TU Delft Repository

Learning Human Preferences for Physical Human-Robot Cooperation

Doctoral thesis (2024) - L.F. van der Spaa, J. Kober, R. Babuska

Physical human-robot cooperation (pHRC) has the potential to combine human and robot strengths in a team that can achieve more than a human and a robot working on the task separately. However, how much of the potential can be realized depends on the quality of cooperation, in which awareness of the partner’s intention and preferences plays an important role. Preferences tend to be highly personal, and additionally depend on the cooperation partner and the cooperation itself. They can be hard to define in terms a robot would understand, and may change over time. This thesis focuses on learning ‘useful models’ from observed behavior, to let our robot adapt its behavior to better match its human partner’s preferences, and thus improve the cooperation.
The aim is to capture personalized approximate models of human preferences –how a person likes to do something– from very few interactive observations, providing only small amounts of imprecise data, such that the robot can use the model to improve each user’s comfort. First, we learn a model to predict and optimize the human ergonomics in a pHRC task, such that our robot can propose a plan, for both the human and itself, to solve the task in a way that is more ergonomic for its human partner. However, people do not necessarily prefer to act ergonomically, nor do we want to impose on them what a robot thinks best. Therefore, next, we apply inverse reinforcement learning (IRL) to capture less restrictive preference models: 1) path and velocity preferences for motion planning, and 2) on a higher level of abstraction, which (grasp or motion) action to initiate for proactive physical support. For learning to take the correct action in cooperation, we developed the disagreement-aware variable impedance (DAVI) controller to smoothly transition between providing active guidance and allowing the human to demonstrate alternative behavior. ...

Development and Evaluation of Advanced Cyclist Assistance Systems on a Bicycle Simulator

Conference paper (2024) - Yu Wang, Sonja Dorfbauer, Linda Van Der Spaa, Alexander G. Mirnig, Florian Michahelles, Philipp Wintersberger

Research on cycling safety has recently gained the attention of the HCI community. While there have been multiple proposals for automated driving features on bikes, we are unaware of a project that systematically aims to translate and evaluate driver assistance systems from the automotive to the bike domain to promote cycling safety in traffic. Thus, we implemented an adaptive cruise control and a lane-keeping/centering system with hard- and software on a motion-based bicycle simulator and investigated their potential in a virtual reality experiment. Based on performance measurements and subjective ratings, results showed significant improvements in technology acceptance, subjective workload, and driving performance regarding the cruise control. In contrast, the lane-centering and lane-keeping features were rated significantly worse than the baseline without such assistance. The paper concludes with a critical reflection on automated driving features for bicycles. ...

Simultaneously learning intentions and preferences during physical human-robot cooperation

Journal article (2024) - Linda van der Spaa, Jens Kober, Michael Gienger

The advent of collaborative robots allows humans and robots to cooperate in a direct and physical way. While this leads to amazing new opportunities to create novel robotics applications, it is challenging to make the collaboration intuitive for the human. From a system’s perspective, understanding the human intentions seems to be one promising way to get there. However, human behavior varies widely between individuals, for instance in preferences or physical abilities. This paper presents a novel concept for simultaneously learning a model of the human intentions and preferences incrementally during collaboration with a robot. Starting out with a nominal model, the system acquires collaborative skills step by step within only very few trials. The concept is based on a combination of model-based reinforcement learning and inverse reinforcement learning, adapted to fit collaborations in which human and robot think and act independently. We test the method and compare it to two baselines: one that imitates the human and one that uses plain maximum entropy inverse reinforcement learning, both in simulation and in a user study with a Franka Emika Panda robot arm. ...

An Incremental Inverse Reinforcement Learning Approach for Motion Planning with Separated Path and Velocity Preferences

Journal article (2023) - S. Avaei, L.F. van der Spaa, L. Peternel, J. Kober

Humans often demonstrate diverse behaviors due to their personal preferences, for instance, related to their individual execution style or personal margin for safety. In this paper, we consider the problem of integrating both path and velocity preferences into trajectory planning for robotic manipulators. We first learn reward functions that represent the user path and velocity preferences from kinesthetic demonstration. We then optimize the trajectory in two steps, first the path and then the velocity, to produce trajectories that adhere to both task requirements and user preferences. We design a set of parameterized features that capture the fundamental preferences in a pick-and-place type of object transportation task, both in the shape and timing of the motion. We demonstrate that our method is capable of generalizing such preferences to new scenarios. We implement our algorithm on a Franka Emika 7-DoF robot arm and validate the functionality and flexibility of our approach in a user study. The results show that non-expert users are able to teach the robot their preferences with just a few iterations of feedback. ...

Disagreement-Aware Variable Impedance Control for Online Learning of Physical Human-Robot Cooperation Tasks

Conference paper (2022) - L.F. van der Spaa, G. Franzese, J. Kober, Michael Gienger

In order to make the coexistence between humans and robots a reality, we must understand how they may cooperate more effectively. Modern robots, empowered with reliable controls and advanced machine learning reasoning, can face this challenge. In this article, we present a Disagreement-Aware Variable Impedance (DAVI) Controller, where the robot stiffness is regulated as a function of the perceived disagreement with the human cooperator. We tested the algorithm on a 7-DoF Franka Emika Panda robot learning a pick-and-place task with continuous adaptation of the goal location and the via-points through interactive human corrections, triggered by our proposed approach. A validation study was conducted with 5 users in order to understand the reliability of the method. ...
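
As a rough illustration of what "stiffness regulated as a function of the perceived disagreement" can look like, the sketch below lowers the stiffness linearly as a scalar disagreement signal grows. This is not the controller from the paper: the linear law, the gains `k_min` and `k_max`, the saturation level `d_max`, and the idea of feeding in a single scalar disagreement value are all assumptions made for illustration.

```python
def stiffness_from_disagreement(disagreement, k_min=50.0, k_max=600.0, d_max=0.1):
    """Map a scalar disagreement measure to a robot stiffness value.

    Hypothetical law (an assumption, not taken from the paper):
    full stiffness k_max at zero disagreement, decreasing linearly
    to k_min once the disagreement reaches d_max, and held there beyond.
    """
    # Normalize the disagreement to [0, 1] and saturate outside that range.
    ratio = min(max(disagreement / d_max, 0.0), 1.0)
    # Interpolate between stiff guidance (k_max) and compliance (k_min).
    return k_max - ratio * (k_max - k_min)


if __name__ == "__main__":
    for d in (0.0, 0.05, 0.1, 0.5):
        print(f"disagreement={d:.2f} -> stiffness={stiffness_from_disagreement(d):.1f}")
```

With a law of this kind the robot guides firmly while the human agrees with its plan and becomes compliant when the human pushes back, which leaves room for the human to demonstrate a correction.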

Predicting and Optimizing Ergonomics in Physical Human-Robot Cooperation Tasks

Conference paper (2020) - Linda van der Spaa, Michael Gienger, Tamas Bates, Jens Kober

This paper presents a method to incorporate ergonomics into the optimization of action sequences for bi-manual human-robot cooperation tasks with continuous physical interaction. Our first contribution is a novel computational model of the human that allows prediction of an ergonomics assessment corresponding to each step in a task. The model is learned from human motion capture data in order to predict the human pose as realistically as possible. The second contribution is a combination of this prediction model with an informed graph search algorithm, which allows computation of human-robot cooperative plans with improved ergonomics according to the incorporated method for ergonomic assessment. The concepts have been evaluated in simulation and in a small user study in which the subjects manipulate a large object with a 32 DoF bimanual mobile robot as partner. For all subjects, the ergonomics-enhanced planner yields a lower ergonomic cost than a baseline planner. ...

Unparameterized optimization of the spring characteristic of parallel elastic actuators

Journal article (2019) - Linda F. van der Spaa, Wouter J. Wolfslag, Martijn Wisse

In electrically actuated robots most energy losses are due to the heating of the actuators. This energy loss can be greatly reduced with parallel elastic actuators, by optimizing the elastic element such that it delivers most of the required torques. Previously used optimization methods relied on parameterizing the spring characteristic, thereby limiting the set of spring characteristics optimized over and with that the loss reduction that can be obtained. This letter shows that such parametrization is not necessary; a method is presented to compute the optimal characteristic as an analytic function of the trajectory. The efficacy of this method is demonstrated using two examples. The first example considers the optimal spring characteristic for a parallel elastic actuator supporting the human ankle during walking. The second example applies the method in combination with trajectory optimization on a single degree of freedom robot performing a specific pick-and-place task. The task at hand has a height difference between the pick and the place location. With the analytical optimal spring, it is shown that the robot can recover enough of the energy released by the package to function without external electric energy supply. ...
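
One way to see why no parameterization is needed is the following sketch. It is not the paper's derivation and rests on stated assumptions: heating losses are proportional to the squared motor torque, the spring acts in parallel so the motor supplies the required torque minus the spring torque, and the joint velocity is nonzero wherever the trajectory passes a given position. The symbols \tau_{\mathrm{req}}, \tau_s, q, and t_i are introduced here for illustration.

```latex
% Assumed loss model: motor torque = required torque - spring torque
E_{\mathrm{loss}} \;\propto\; \int_0^T \bigl(\tau_{\mathrm{req}}(t) - \tau_s(q(t))\bigr)^2 \,\mathrm{d}t

% Change variables t -> q on each monotone segment (dt = dq / |\dot q|) and
% minimize pointwise in q; t_i are all times at which the trajectory visits q:
\tau_s^{*}(q) \;=\;
  \frac{\displaystyle\sum_i \tau_{\mathrm{req}}(t_i) \,/\, \lvert \dot q(t_i) \rvert}
       {\displaystyle\sum_i 1 \,/\, \lvert \dot q(t_i) \rvert},
  \qquad q(t_i) = q
```

Under these assumptions the optimal spring torque at each position is a velocity-weighted average of the torques required whenever the trajectory visits that position, so it follows directly from the trajectory without choosing a spring family in advance.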