R.J. Perez Dattari | TU Delft Repository

TamedPUMA

Safe and stable imitation learning with geometric fabrics

Journal article (2025) - Saray Bakker, Rodrigo Pérez-Dattari, Cosimo Della Santina, Wendelin Böhmer, Javier Alonso-Mora

Using the language of dynamical systems, imitation learning (IL) provides an intuitive and effective way of teaching stable task-space motions to robots with goal convergence. Yet, IL techniques are affected by serious limitations when it comes to ensuring safety and fulfillment of physical constraints. With this work, we solve this challenge via TamedPUMA, an IL algorithm augmented with a recent development in motion generation called geometric fabrics. As both the IL policy and geometric fabrics describe motions as artificial second-order dynamical systems, we propose two variations where IL provides a navigation policy for geometric fabrics. The result is a stable imitation learning strategy within which we can seamlessly blend geometrical constraints like collision avoidance and joint limits. Beyond providing a theoretical analysis, we demonstrate TamedPUMA with simulated and real-world tasks, including a 7-DoF manipulator. ...

Scalable Task Planning via Large Language Models and Structured World Representations

Journal article (2025) - Rodrigo Pérez-Dattari, Z. Li, R. Babuska, J. Kober, C. Della Santina

Planning methods often struggle with computational intractability when solving task-level problems in large-scale environments. This work explores how the commonsense knowledge encoded in Large Language Models (LLMs) can be leveraged to enhance planning techniques for such complex scenarios. Specifically, we propose an approach that uses LLMs to efficiently prune irrelevant components from the planning problem's state space, thereby substantially reducing its complexity. We demonstrate the efficacy of our system through extensive experiments in a household simulation environment as well as real-world validation on a 7-DoF manipulator (video: https://youtu.be/6ro2UOtOQS4). ...
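The pruning idea described above can be illustrated with a minimal sketch. It assumes a planning state given as a dictionary of objects and a relevance predicate that decides which objects matter for the goal. The function `keyword_relevance` is a hypothetical stand-in for an LLM query (it only matches words, and is not the method used in the paper), and all object names are invented for illustration.

```python
def keyword_relevance(name, goal):
    """Hypothetical stand-in for an LLM relevance query: word overlap only."""
    goal_words = set(goal.lower().split())
    return any(token in goal_words for token in name.lower().split("_"))


def prune_state(objects, goal, is_relevant):
    """Keep only the objects that the relevance predicate marks as useful."""
    return {name: props for name, props in objects.items()
            if is_relevant(name, goal)}


# Invented household state: object name -> properties.
objects = {
    "coffee_mug": {"location": "living_room"},
    "kitchen_table": {"location": "kitchen"},
    "tv_remote": {"location": "living_room"},
    "bathroom_sink": {"location": "bathroom"},
}
goal = "bring the mug to the kitchen table"

pruned = prune_state(objects, goal, keyword_relevance)
print(sorted(pruned))  # the planner would now search over fewer objects
```

A planner run on `pruned` instead of `objects` searches a smaller state space, which is the source of the complexity reduction the abstract refers to. In practice the quality of the result depends entirely on the relevance predicate.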

PUMA

Deep Metric Imitation Learning for Stable Motion Primitives

Journal article (2024) - Rodrigo Pérez-Dattari, Cosimo Della Santina, Jens Kober

Imitation learning (IL) facilitates intuitive robotic programming. However, ensuring the reliability of learned behaviors remains a challenge. In the context of reaching motions, a robot should consistently reach its goal, regardless of its initial conditions. To meet this requirement, IL methods often employ specialized function approximators that guarantee this property by construction. Although effective, these approaches come with some limitations: 1) they are typically restricted in the range of motions they can model, resulting in suboptimal IL capabilities, and 2) they require explicit extensions to account for the geometry of motions that consider orientations. To address these challenges, we introduce a novel stability loss function that does not constrain the function approximator's architecture and enables learning policies that yield accurate results. Furthermore, it is not restricted to a specific state space geometry; therefore, it can easily incorporate the geometry of the robot's state space. Proof of the stability properties induced by this loss is provided and the method is empirically validated in various settings. These settings include Euclidean and non-Euclidean state spaces, as well as first-order and second-order motions, both in simulation and with real robots. More details about the experimental results can be found at https://youtu.be/ZWKLGntCI6w. ...

Generalizable Robotic Imitation Learning

Interactive Learning and Inductive Bias

Doctoral thesis (2024) - Rodrigo Pérez-Dattari, Jens Kober, Robert Babuska

Robots have the potential to assume tasks across various real-world scenarios. To achieve this, we require adaptable and reactive robots that can robustly deal with products and environments that present variability. For example, in the agro-food sector, each tomato plant inside a greenhouse is unique; hence, different robotic motions are required when interacting with different plants. Unfortunately, due to their simplicity, most robotic solutions currently employed are rigid and rely on hand-crafted rules. Such solutions perform well in controlled and repetitive environments; however, they fall short when these conditions are not met. As a consequence, a large family of problems remains unsolved. ...

Stable Motion Primitives via Imitation and Contrastive Learning

Journal article (2023) - Rodrigo Pérez-Dattari, Jens Kober

Learning from humans allows nonexperts to program robots with ease, lowering the resources required to build complex robotic solutions. Nevertheless, such data-driven approaches often lack the ability to provide guarantees regarding their learned behaviors, which is critical for avoiding failures and/or accidents. In this work, we focus on reaching/point-to-point motions, where robots must always reach their goal, independently of their initial state. This can be achieved by modeling motions as dynamical systems and ensuring that they are globally asymptotically stable. Hence, we introduce a novel Contrastive Learning loss for training deep neural networks (DNN) that, when used together with an Imitation Learning loss, enforces the aforementioned stability in the learned motions. Unlike previous work, our method does not restrict the structure of its function approximator, enabling its use with arbitrary DNNs and allowing it to learn complex motions with high accuracy. We validate it using datasets and a real robot. In the former case, motions are two- and four-dimensional, modeled as first- and second-order dynamical systems. In the latter, motions are three-, four-, and six-dimensional, of first and second order, and are used to control a 7-DoF robot manipulator in its end effector space and joint space. ...
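To make the stability requirement concrete, the following minimal sketch models a reaching motion as a first-order dynamical system and rolls it out from several initial states. The linear system x_dot = -gain * (x - goal) is an assumption chosen only because its global asymptotic stability is easy to verify; it stands in for the learned DNN policy and is not the method of the paper. The names `velocity`, `rollout`, `gain`, and `dt` are hypothetical.

```python
import math


def velocity(x, goal, gain=1.0):
    """Assumed stand-in dynamics: x_dot = -gain * (x - goal)."""
    return [-gain * (xi - gi) for xi, gi in zip(x, goal)]


def rollout(x0, goal, dt=0.01, steps=2000):
    """Integrate the system with forward Euler and return the final state."""
    x = list(x0)
    for _ in range(steps):
        v = velocity(x, goal)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


def distance(a, b):
    """Euclidean distance between two states."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


goal = [0.5, -0.2]
initial_states = [[2.0, 2.0], [-3.0, 1.0], [0.0, -4.0]]

# A stable reaching motion ends at the goal regardless of where it starts.
final_distances = [distance(rollout(x0, goal), goal) for x0 in initial_states]
print(final_distances)
```

Every rollout ends arbitrarily close to the goal, which is the behavior a stability guarantee is meant to ensure for a learned policy as well. A check like this over sampled initial states gives empirical evidence only; it does not replace a proof.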

Robotic Packaging Optimization with Reinforcement Learning

Conference paper (2023) - Eveline Drijver, Rodrigo Pérez-Dattari, Jens Kober, Cosimo Della Santina, Zlatan Ajanovic

Intelligent manufacturing is becoming increasingly important due to the growing demand for maximizing productivity and flexibility while minimizing waste and lead times. This work investigates automated secondary robotic food packaging solutions that transfer food products from the conveyor belt into containers. A major problem in these solutions is varying product supply, which can cause drastic productivity drops. Conventional rule-based approaches used to address this issue are often inadequate, leading to violations of the industry's requirements. Reinforcement learning, on the other hand, has the potential to solve this problem by learning a responsive and predictive policy based on experience. However, it is challenging to utilize it in highly complex control schemes. In this paper, we propose a reinforcement learning framework designed to optimize the conveyor belt speed while minimizing interference with the rest of the control system. When tested on real-world data, the framework exceeds the performance requirements (99.8% packed products) and maintains quality (100% filled boxes). Compared to the existing solution, our proposed framework improves productivity, has smoother control, and reduces computation time. ...

Visually-guided motion planning for autonomous driving from interactive demonstrations

Journal article (2022) - Rodrigo Pérez-Dattari, Bruno Brito, Oscar de Groot, Jens Kober, Javier Alonso-Mora

The successful integration of autonomous robots in real-world environments strongly depends on their ability to reason from context and take socially acceptable actions. Current autonomous navigation systems mainly rely on geometric information and hard-coded rules to induce safe and socially compliant behaviors. Yet, in unstructured urban scenarios these approaches can become costly and suboptimal. In this paper, we introduce a motion planning framework consisting of two components: a data-driven policy that uses visual inputs and human feedback to generate socially compliant driving behaviors (encoded by high-level decision variables), and a local trajectory optimization method that executes these behaviors (ensuring safety). In particular, we employ Interactive Imitation Learning to jointly train the policy with the local planner, a Model Predictive Controller (MPC), which results in safe and human-like driving behaviors. Our approach is validated in realistic simulated urban scenarios. Qualitative results show the similarity of the learned behaviors with human driving. Furthermore, navigation performance is substantially improved in terms of safety, i.e., number of collisions, as compared to prior trajectory optimization frameworks, and in terms of data-efficiency as compared to prior learning-based frameworks, broadening the operational domain of MPC to more realistic autonomous driving scenarios. ...

Imitation Learning with Inconsistent Demonstrations through Uncertainty-based Data Manipulation

Conference paper (2021) - Peter Valletta, Rodrigo Pérez-Dattari, Jens Kober

Aleatoric uncertainty estimation, based on the observed training data, is applied for the detection of conflicts in a demonstration data set. The particular focus of this paper is the resolution of conflicting data resulting from scenarios with equivalent action choices, such as obstacle avoidance, path planning or multiple joint configurations. In terms of the estimated uncertainty, the proposed algorithm aims to decrease this otherwise irreducible value through direct alteration of the accrued data set and to provide data that a policy-learning neural network is able to fit appropriately. The proposed algorithm was validated with real robot scenarios while learning from inconsistent demonstrations, where the resulting policies consistently achieved their prescribed objectives. A video showing our method and experiments can be found at: https://youtu.be/oGYnzlW9Ncw. ...

Interactive Learning with Corrective Feedback for Policies Based on Deep Neural Networks

Conference paper (2020) - Rodrigo Perez Dattari, Carlos Celemin, Javier Ruiz-del-Solar, Jens Kober

Deep Reinforcement Learning (DRL) has become a powerful strategy to solve complex decision-making problems based on Deep Neural Networks (DNNs). However, it is highly data demanding, which makes it infeasible in physical systems for most applications. In this work, we explore an alternative Interactive Machine Learning (IML) strategy for training DNN policies based on human corrective feedback, with a method called Deep COACH (D-COACH). This approach not only takes advantage of the knowledge and insights of human teachers as well as the power of DNNs, but also has no need for a reward function (which sometimes implies the need for external perception to compute rewards). We combine Deep Learning with the COrrective Advice Communicated by Humans (COACH) framework, in which non-expert humans shape policies by correcting the agent’s actions during execution. The D-COACH framework has the potential to solve complex problems without requiring much data or time. Experimental results validated the efficiency of the framework in three different problems (two simulated, one with a real robot), with state spaces of low and high dimensions, showing the capacity to successfully learn policies for continuous action spaces, as in the Car Racing and Cart-Pole problems, faster than with DRL. ...
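The corrective-feedback mechanism described above can be sketched as follows. This is a simplified illustration, not the authors' implementation: a linear policy stands in for the DNN, the human signal is assumed to be +1 (increase the action), -1 (decrease it), or 0 (no correction), and `error_magnitude` and `lr` are hypothetical parameters. The corrected action serves as a supervised target for one gradient step.

```python
def predict(weights, state):
    """Linear policy: action = w . s (a stand-in for a DNN policy)."""
    return sum(w * s for w, s in zip(weights, state))


def corrective_update(weights, state, feedback, error_magnitude=0.5, lr=0.1):
    """Take one supervised step toward the human-corrected action.

    feedback: +1 to increase the action, -1 to decrease it, 0 for no change.
    """
    if feedback == 0:
        return list(weights)
    action = predict(weights, state)
    # The human signal shifts the executed action to form a target label.
    target = action + error_magnitude * feedback
    # Gradient of 0.5 * (action - target)^2 with respect to each weight.
    error = action - target
    return [w - lr * error * s for w, s in zip(weights, state)]


weights = [0.0, 0.0]
state = [1.0, 2.0]

before = predict(weights, state)
weights = corrective_update(weights, state, feedback=+1)
after = predict(weights, state)
print(before, after)  # the action moves in the direction the human indicated
```

No reward function appears anywhere in the update: the only learning signal is the direction of the human correction, which is why this family of methods avoids reward design.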

Interactive Learning of Temporal Features for Control

Shaping Policies and State Representations From Human Feedback

Journal article (2020) - Rodrigo Perez-Dattari, Carlos Celemin, Giovanni Franzese, Javier Ruiz-del-Solar, Jens Kober

The ongoing industrial revolution demands more flexible products, including robots in household environments and medium-scale factories. Such robots should be able to adapt to new conditions and environments and be programmed with ease. As an example, let us suppose that there are robot manipulators working on an industrial production line and that they need to perform a new task. If these robots were hard-coded, it could take days to adapt them to the new settings, which would stop production at the factory. Robots that non-expert humans could easily program would speed up the process considerably. ...

Continuous control for high-dimensional state spaces

An interactive learning approach

Conference paper (2019) - Rodrigo Pérez-Dattari, Carlos Celemin, Javier Ruiz-Del-Solar, Jens Kober

Deep Reinforcement Learning (DRL) has become a powerful methodology to solve complex decision-making problems. However, DRL has several limitations when used in real-world problems (e.g., robotics applications). For instance, long training times are required and, unlike in simulated environments, cannot be accelerated, and reward functions may be hard to specify/model and/or to compute. Moreover, the transfer of policies learned in a simulator to the real world has limitations (reality gap). On the other hand, machine learning methods that rely on the transfer of human knowledge to an agent have been shown to be time efficient for obtaining well-performing policies and do not require a reward function. In this context, we analyze the use of human corrective feedback during task execution to learn policies with high-dimensional state spaces, by using the D-COACH framework, and we propose new variants of this framework. D-COACH is a Deep Learning based extension of COACH (COrrective Advice Communicated by Humans), where humans are able to shape policies through corrective advice. The enhanced version of D-COACH, which is proposed in this paper, largely reduces the time and effort required from a human to train a policy. Experimental results validate the efficiency of the D-COACH framework in three different problems (simulated and with real robots), and show that its enhanced version reduces the human training effort considerably and makes it feasible to learn policies within periods of time in which a DRL agent does not achieve any improvement. ...