A. Serra Gomez | TU Delft Repository

Motion Planning in Dynamic Environments with Learned Scalable Policies

Doctoral thesis (2025) - A. Serra Gomez, Javier Alonso-Mora, J.W. Böhmer

The application of multi-robot systems has gained popularity in recent years. Multi-robot systems show great potential in scaling up robotic applications in surveillance, monitoring, and exploration. Although single robots can already be used to automate search-and-rescue and surveillance tasks, their ability to solve a given task still depends on the size of the region of interest. For example, in expansive environments, a single robot’s ability to cover or surveil the entire area effectively diminishes, resulting in information gaps. To this end, this thesis aims to improve multi-robot coordination and navigation in active perception tasks, e.g. exploration and surveillance. The multi-robot coordination problem aims to determine when and with whom multiple robots should exchange information to avoid collisions while navigating to their goals. The active perception problem involves guiding a sensing robot to positions rich in task-relevant information. The primary objective of this dissertation is to provide algorithmic contributions, specifically focusing on obtaining robust policies that can adapt and scale to fleets and tasks of varying sizes. To this end, this thesis combines high-level policies, learned through Reinforcement Learning, with robust low-level optimal control policies.

Firstly, the communication-based coordination problem for decentralized multi-robot systems is formally defined and modeled as a Markov Decision Process. This thesis then proposes a novel communication policy for decentralized multi-robot systems, improving coordination and collision avoidance. The proposed policy, learned via Reinforcement Learning, allows robots to communicate selectively, requesting trajectory plans from robots that pose a potential collision risk while assuming constant velocities for the others. The policy uses an attention-based neural network architecture for scalability and is integrated with Non-Linear Model Predictive Control for safe and robust motion planning. When tested with 12 robots, it reduced communication compared to alternatives while maintaining safety. It also performed well with different team sizes and in the presence of observation noise. Real-world tests on quadrotors confirmed its practical applicability.

The Active Perception problem represents the second major challenge addressed in this dissertation. Specifically, this thesis addresses the challenge of using a drone to collect semantic information for classifying multiple moving targets, focusing on computing control inputs for optimal viewpoints. This task is complicated by the variable number of targets to classify in the region of interest, and by the use of a "black-box" classifier, like a deep learning neural network, which lacks clear analytical relationships between viewpoints and outputs. This thesis proposes an attention-based architecture, trained with Reinforcement Learning (RL), which determines the best viewpoints for the drone to gather evidence from multiple unclassified targets, considering their movement, orientation, and potential occlusions. A low-level MPC controller then guides the drone to these viewpoints. The approach outperforms various baselines and shows adaptability to new scenarios and scalability to numerous targets with varying movement dynamics.

To conclude the thesis, the previous approach is extended to more realistic and larger environments where targets need to be localized, tracked, and then classified. This thesis introduces a novel decentralized hybrid multi-camera system designed for surveillance and monitoring applications. Traditional fixed camera networks suffer from blind spots and backlighting issues. A decentralized hybrid framework is proposed that integrates both static and mobile cameras to actively and dynamically enhance critical information gathering. All networked cameras collaborate to monitor and localize people in the environment by comparing their local information. The mobile camera is guided by a viewpoint control policy to maximize semantic information from observed targets. The framework was implemented in a photorealistic environment using Unreal Engine, with distributed communications through the Robot Operating System (ROS), bridging the gap between simulation and real-world applications. Results in large environments demonstrate the advantages of collaborative mobile cameras over static and individual setups in both target identification and tracking accuracy. In crowded scenarios, mobile cameras excel in avoiding occlusions and capturing desired viewpoints, improving the percentage of classified tracked targets compared to static setups. Qualitatively, mobile cameras provide a target observation quality unmatched by the static framework.

In summary, this thesis makes significant contributions that are validated through extensive evaluations in simulated photo-realistic environments and with commercial drones, demonstrating the potential for practical applications. Despite the progress, the thesis acknowledges the remaining challenges in deploying multi-robot systems in real-world perception tasks, especially when the policy is learned, and suggests directions for future research.

Interaction-Aware Sampling-Based MPC with Learned Local Goal Predictions

Conference paper (2024) - W.T.C.M. Jansma, E. Trevisan, A. Serra Gomez, J. Alonso-Mora

Motion planning for autonomous robots in tight, interaction-rich, and mixed human-robot environments is challenging. State-of-the-art methods typically separate prediction and planning, predicting other agents’ trajectories first and then planning the ego agent’s motion in the remaining free space. However, agents’ lack of awareness of their influence on others can lead to the freezing robot problem. We build upon Interaction-Aware Model Predictive Path Integral (IAMPPI) control and combine it with learning-based trajectory predictions, thereby relaxing its reliance on communicated short-term goals for other agents. We apply this framework to Autonomous Surface Vessels (ASVs) navigating urban canals. By generating an artificial dataset in real sections of Amsterdam’s canals, adapting and training a prediction model for our domain, and proposing heuristics to extract local goals, we enable effective cooperation in planning. Our approach improves autonomous robot navigation in complex, crowded environments, with potential implications for multi-agent systems and human-robot interaction. Available at: autonomousrobots.nl/pubpage/IA_MPPI_LBM.html ...

Distributed multi-target tracking and active perception with mobile camera networks

Journal article (2024) - Sara Casao, Álvaro Serra-Gómez, Ana C. Murillo, Wendelin Böhmer, Javier Alonso-Mora, Eduardo Montijano

Smart cameras are an essential component in surveillance and monitoring applications, and they have typically been deployed in networks of fixed camera locations. The addition of mobile cameras, mounted on robots, can overcome some of the limitations of static networks, such as blind spots or backlighting, allowing the system to gather the best information at each time by active positioning. This work presents a hybrid camera system, with static and mobile cameras, where all the cameras collaborate to observe people moving freely in the environment and efficiently visualize certain attributes of each person. Our solution combines a multi-camera distributed tracking system, which precisely localizes all the people, with a control scheme that moves the mobile cameras to the best viewpoints for a specific classification task. The main contribution of this paper is a novel framework that exploits the synergies that result from the cooperation of the tracking and the control modules, obtaining a system closer to the real-world application and capable of high-level scene understanding. The static camera network provides global awareness to the control scheme that moves the robots. In exchange, the mobile cameras onboard the robots provide enhanced information about the people in the scene. We perform a thorough analysis of the people monitoring application performance under different conditions thanks to the use of a photo-realistic simulation environment. Our experiments demonstrate the benefits of collaborative mobile cameras with respect to static or individual camera setups. ...

Evaluating Dynamic Environment Difficulty for Obstacle Avoidance Benchmarking

Conference paper (2024) - Moji Shi, Gang Chen, Álvaro Serra Gómez, Siyuan Wu, Javier Alonso-Mora

Dynamic obstacle avoidance is a popular research topic for autonomous systems, such as micro aerial vehicles and service robots. Accurately evaluating the performance of dynamic obstacle avoidance methods requires a metric that quantifies the environment's difficulty, a crucial aspect that remains unexplored. In this paper, we propose four metrics to measure the difficulty of dynamic environments. These metrics aim to comprehensively capture the influence of obstacles' number, size, velocity, and other factors on the difficulty. We compare the proposed metrics with existing static environment difficulty metrics and validate them through over 1.5 million trials in a customized simulator. This simulator excludes the effects of perception and control errors and supports different motion and gaze planners for obstacle avoidance. The results indicate that the survivability metric outperforms the other metrics and has a monotonic relationship with the success rate, with a Spearman's Rank Correlation Coefficient (SRCC) of over 0.9. Specifically, for every planner, lower survivability leads to a higher success rate. This metric not only facilitates fair and comprehensive benchmarking but also provides insights for refining collision avoidance methods, thereby furthering the evolution of autonomous systems in dynamic environments. ...
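
To illustrate how a rank correlation of this kind can be computed, the following is a minimal Python sketch of Spearman's coefficient (the Pearson correlation of the ranks). The function names and the sample values are hypothetical and chosen only for illustration; they are not data or code from the paper.

```python
def ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Made-up example: one metric value and one success rate per environment.
metric_value = [0.1, 0.4, 0.35, 0.8, 0.6]
success_rate = [0.95, 0.70, 0.80, 0.20, 0.45]
print(spearman(metric_value, success_rate))
```

A coefficient close to +1 or -1 indicates a monotonic relationship between the two quantities, regardless of whether that relationship is linear.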

Active Classification of Moving Targets With Learned Control Policies

Journal article (2023) - Álvaro Serra-Gómez, Eduardo Montijano, Wendelin Böhmer, Javier Alonso-Mora

In this paper, we consider the problem where a drone has to collect semantic information to classify multiple moving targets. In particular, we address the challenge of computing control inputs that move the drone to informative viewpoints (position and orientation) when the information is extracted using a “black-box” classifier, e.g., a deep learning neural network. These algorithms typically lack analytical relationships between the viewpoints and their associated outputs, preventing their use in information-gathering schemes. To fill this gap, we propose a novel attention-based architecture, trained via Reinforcement Learning (RL), that outputs the next viewpoint for the drone, favoring the acquisition of evidence from as many unclassified targets as possible while reasoning about their movement, orientation, and occlusions. Then, we use a low-level MPC controller to move the drone to the desired viewpoint, taking into account its actual dynamics. We show that our approach not only outperforms a variety of baselines but also generalizes to scenarios unseen during training. Additionally, we show that the network scales to large numbers of targets and generalizes well to different movement dynamics of the targets. ...
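
One reason attention suits a variable number of targets is that it pools a set of any size into a fixed-size vector. The Python sketch below shows generic scaled dot-product attention with a single query. It is not the architecture from the paper: the function name, the feature vectors, and the absence of learned projections are all simplifying assumptions made for illustration.

```python
import math


def attention_pool(query, keys, values):
    """Pool a set of value vectors into one vector using one query.

    Generic scaled dot-product attention; a real network would first apply
    learned projections to obtain the query, keys, and values.
    """
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    # Softmax with the maximum subtracted for numerical stability.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(values[0])
    pooled = [sum(w * v[j] for w, v in zip(weights, values)) for j in range(dim)]
    return pooled, weights


# Hypothetical per-target features; the set may contain any number of targets.
drone_query = [1.0, 0.0]
two_targets = [[1.0, 0.0], [0.0, 1.0]]
four_targets = two_targets + [[1.0, 1.0], [0.5, -0.5]]

pooled_small, _ = attention_pool(drone_query, two_targets, two_targets)
pooled_large, _ = attention_pool(drone_query, four_targets, four_targets)
print(len(pooled_small), len(pooled_large))  # same output size for both sets
```

The pooled vector also does not depend on the order in which the targets are listed, which is a useful property when targets have no natural ordering.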

A Framework for Fast Prototyping of Photo-realistic Environments with Multiple Pedestrians

Conference paper (2023) - S. Casao, Andrés Otero, A. Serra Gomez, Ana C. Murillo, J. Alonso-Mora, Eduardo Montijano

Robotic applications involving people often require advanced perception systems to better understand complex real-world scenarios. To address this challenge, photo-realistic and physics simulators are gaining popularity as a means of generating accurate data labeling and designing scenarios for evaluating generalization capabilities, e.g., lighting changes, camera movements or different weather conditions. We develop a photo-realistic framework built on Unreal Engine and AirSim to easily generate scenarios with pedestrians and mobile robots. The framework is capable of generating random and customized trajectories for each person and provides up to 50 ready-to-use people models along with an API for their metadata retrieval. We demonstrate the usefulness of the proposed framework with a use case of multi-target tracking, a popular problem in real pedestrian scenarios. The notable feature variability in the obtained perception data is presented and evaluated. ...

Learning scalable and efficient communication policies for multi-robot collision avoidance

Journal article (2023) - Álvaro Serra-Gómez, Hai Zhu, B.F. Ferreira de Brito, Wendelin Böhmer, Javier Alonso-Mora

Decentralized multi-robot systems typically perform coordinated motion planning by constantly broadcasting their intentions to avoid collisions. However, the risk of collision between robots varies as they move and communication may not always be needed. This paper presents an efficient communication method that addresses the problem of “when” and “with whom” to communicate in multi-robot collision avoidance scenarios. In this approach, each robot learns to reason about other robots’ states and considers the risk of future collisions before asking for the trajectory plans of other robots. We introduce a new neural architecture for the learned communication policy which allows our method to be scalable. We evaluate and verify the proposed communication strategy in simulation with up to twelve quadrotors, and present results on the zero-shot generalization/robustness capabilities of the policy in different scenarios. We demonstrate that our policy (learned in a simulated environment) can be successfully transferred to real robots. ...

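¿Existe la misma exigencia en la obtención del doctorado (PhD) en todos los departamentos de cirugía de las universidades españolas? [Is the same standard required to obtain a doctorate (PhD) in all departments of surgery of Spanish universities?]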

Journal article (2022) - Xavier Serra-Aracil, Manel Armengol Carrasco, Joan Morote Robles, Eloy Espin Basany, Natalia Amat-Lefort, Álvaro Serra-Gómez, Salvador Navarro-Soto

Introduction: The doctorate is the third cycle of official university studies which, through the defense of a doctoral thesis, leads to the title of doctor, or PhD in Anglo-Saxon countries. Royal Decree law 99/2011 regulates doctoral programs, leaving a wide margin on quality requirements. The objective of this study is to find out whether the requirements of the doctoral programs vary across the departments of surgery of Spanish public universities, and to establish a quality scale. Methods: Cross-sectional observational study from 2/22/2021 to 3/3/2021, through a survey sent electronically to the professors of the departments of surgery. Results: Thirty-five departments of surgery were consulted, and 29 of them (82.9%) responded. The requirements varied mainly in the quality of the research project; in fact, 25 departments (86.2%) have no regulations on this. When the thesis is presented as a compendium of articles, the articles are required to be original in 15 programs (51.7%). Regarding authorship position, the doctoral student must be the preferred author of at least 2 articles in 14 (48.4%) of the programs. In 14 departments (48.4%) there are no regulations on the position of the articles and the quartiles of the journals. When the programs are scored according to their requirements, the variability is high, ranging between 2 and 19 points. Funding for the development of the doctorate is meager. Conclusions: There is wide variability in the requirements of doctoral programs. Homogeneous levels of demand must be defined to promote and protect higher-level doctorates. ...

Where to Look Next: Learning Viewpoint Recommendations for Informative Trajectory Planning

Conference paper (2022) - M. Lodel, B.F. Ferreira de Brito, A. Serra Gomez, L. Ferranti, R. Babuska, J. Alonso-Mora

Search missions require motion planning and navigation methods for information gathering that continuously replan based on new observations of the robot's surroundings. Current methods for information gathering, such as Monte Carlo Tree Search, are capable of reasoning over long horizons, but they are computationally expensive. An alternative for fast online execution is to train, offline, an information gathering policy, which indirectly reasons about the information value of new observations. However, these policies lack safety guarantees and do not account for the robot dynamics. To overcome these limitations we train an information-aware policy via deep reinforcement learning, that guides a receding-horizon trajectory optimization planner. In particular, the policy continuously recommends a reference viewpoint to the local planner, such that the resulting dynamically feasible and collision-free trajectories lead to observations that maximize the information gain and reduce the uncertainty about the environment. In simulation tests in previously unseen environments, our method consistently outperforms greedy next-best-view policies and achieves competitive performance compared to Monte Carlo Tree Search, in terms of information gains and coverage time, with a reduction in execution time by three orders of magnitude. ...

With whom to communicate: Learning efficient communication for multi-robot collision avoidance

Conference paper (2020) - Alvaro Serra-Gomez, Bruno Brito, Hai Zhu, Jen Jen Chung, Javier Alonso-Mora

Decentralized multi-robot systems typically perform coordinated motion planning by constantly broadcasting their intentions as a means to cope with the lack of a central system coordinating the efforts of all robots. Especially in complex dynamic environments, the coordination boost allowed by communication is critical to avoid collisions between cooperating robots. However, the risk of collision between a pair of robots fluctuates as they move, and communication is not always needed. Additionally, constant communication makes much of the still valuable information shared in previous time steps redundant. This paper presents an efficient communication method that solves the problem of "when" and "with whom" to communicate in multi-robot collision avoidance scenarios. In this approach, every robot learns to reason about other robots' states and considers the risk of future collisions before asking for the trajectory plans of other robots. We evaluate and verify the proposed communication strategy in simulation with four quadrotors and compare it with three baseline strategies: non-communicating, broadcasting, and a distance-based method that broadcasts information to quadrotors within a predefined distance. ...
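
The three baseline strategies named in this abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the strategy labels, the robot identifiers, and the positions are hypothetical, and the learned policy itself is not reproduced here.

```python
import math


def baseline_targets(strategy, ego_id, positions, radius=None):
    """Return the ids of the robots the ego robot communicates with.

    strategy: "none" (non-communicating), "broadcast" (all other robots),
    or "distance" (only robots within `radius` of the ego robot).
    """
    others = [robot_id for robot_id in positions if robot_id != ego_id]
    if strategy == "none":
        return []
    if strategy == "broadcast":
        return others
    if strategy == "distance":
        if radius is None:
            raise ValueError("the distance strategy needs a radius")
        ego = positions[ego_id]
        return [r for r in others if math.dist(ego, positions[r]) <= radius]
    raise ValueError(f"unknown strategy: {strategy}")


# Made-up 3D positions (metres) for four quadrotors.
positions = {
    "q1": (0.0, 0.0, 1.0),
    "q2": (1.0, 0.0, 1.0),
    "q3": (5.0, 5.0, 1.0),
    "q4": (0.0, 1.5, 1.0),
}
for strategy in ("none", "broadcast", "distance"):
    print(strategy, baseline_targets(strategy, "q1", positions, radius=2.0))
```

A learned policy would replace the fixed distance rule with a decision that depends on the estimated risk of future collisions rather than on proximity alone.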