L. Peters | TU Delft Repository

Game-Theoretic Motion Planning for Multi-Agent Interaction

Doctoral thesis (2026) - L. Peters, Javier Alonso-Mora, L. Ferranti

As robots leave factory floors and deploy into unstructured human environments, they must navigate safely and efficiently alongside people and other autonomous systems. In these dynamic settings, decisions are inherently interdependent: a robot's optimal action depends heavily on how others will respond, and vice versa. Anticipating and actively shaping these responses is central to competent multi-agent behavior. However, this interaction unfolds under significant uncertainty, with limited prior knowledge of other agents' objectives, constrained sensing capabilities, and tight computational limits.

This dissertation addresses these fundamental challenges by framing robot motion planning for interactive scenarios through the lens of non-cooperative game theory. This perspective provides a principled mathematical framework for modeling multiple self-interested decision-makers who act simultaneously with partially aligned objectives. Focusing on environments where a single controlled robot interacts with uncontrolled agents whose intents are not known a priori, this dissertation develops a comprehensive suite of game-theoretic tools. These tools enable robots to infer underlying intents from observations and generate motion plans that capture the complex interdependence of self-interested decision-making.

The core contributions of this dissertation span intent inference, real-time adaptation, uncertainty-aware planning, computational efficiency, and complex non-smooth dynamics.

First, we formalize the problem of learning unknown intents from observed past behavior as an inverse game. By casting this as maximum-likelihood estimation with equilibrium constraints, our transcription jointly estimates game parameters, hidden states, and future decisions, significantly improving inference accuracy. Second, we tightly integrate these inverse games with online planning. We propose a novel solution technique handling inequality constraints with a first-order update rule for amortized inference, yielding a game-theoretic planner that dynamically adapts to evolving intent estimates.

Third, we address situations demanding explicit reasoning over a distribution of possible intents. We introduce contingency games, an uncertainty-aware planning technique that jointly generates multi-hypothesis predictions of others alongside conditional plans for the robot. By explicitly anticipating future information gains, this approach seamlessly bridges the gap between conservatively ignoring uncertainty and assuming it will never resolve. Fourth, to alleviate the substantial computational burden of online game-theoretic planning, we introduce an amortized solver for mixed strategies. An offline model learns to propose dynamically feasible trajectory candidates, while a discrete game solved online rapidly computes competitive mixed Nash equilibria.

Finally, we tackle interaction domains with inherently non-smooth dynamics, such as multi-agent manipulation, where constraints are not continuously differentiable. We propose a data-driven approach leveraging probabilistic inference and generative diffusion models. This blends learning from single-agent demonstrations with reasoning about joint multi-agent costs, discovering collaborative strategies without requiring massive multi-agent datasets.

In summary, this dissertation advances interactive motion planning by equipping robots to accurately infer intents, act safely under uncertainty, and navigate complex interactions. These algorithmic contributions are extensively validated via simulation and ground robots across autonomous driving, mobile navigation, and multi-agent manipulation, accompanied by open-source libraries to accelerate future research.

Homotopy-Guided Potential Games for Congestion-Aware Navigation

Journal article (2026) - M. I. I. Sathyamangalam Imran, Lasse Peters, Michael Khayyat, Stefano Arrigoni, Francesco Braghin, Laura Ferranti

We address the multi-agent motion planning problem where interactions, collisions, and congestion co-exist. Conventional game-theoretic planners capture interactions among agents but often converge to conservative, congested equilibria. Homotopy planners, on the other hand, can explore topologically distinct paths, but lack mechanisms to account for the interdependence of agents' future actions. We propose a unified framework that leverages homotopy classes as structured strategy sets within a receding-horizon setup. At each planning stage, a deterministic homotopy planner generates topologically distinct paths for each agent, conditioned on the joint configuration. To avoid intractable growth of candidate paths, we propose a simple heuristic filtering step that selects a top-K subset of the most suitable congestion-free joint strategies to ensure computational tractability. These serve as initializations for a potential game that enforces homotopy-consistent constraints and yields a generalized open-loop Nash equilibrium (OLNE), with penalties discouraging abrupt strategy shifts in a receding-horizon setting. Simulations with three agents demonstrate improved efficiency (faster completion) and enhanced safety (greater inter-agent clearance, leading to reduced congestion) compared to a local baseline and NH-ORCA that do not reason about homotopies. Hardware trials with two robots and one human demonstrate robustness to irrational behaviors, where our method adapts by switching to alternative feasible equilibria while the baseline game fails. ...
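
The top-K filtering idea in this abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the authors' implementation: each candidate path is a hypothetical `(side, length)` tuple standing in for a homotopy class, and `joint_cost` is a made-up heuristic that adds path lengths and penalizes every pair of agents passing on the same side.

```python
import heapq
from itertools import combinations, product

CONGESTION_PENALTY = 10.0  # hypothetical weight, not taken from the paper


def joint_cost(joint_strategy):
    """Hypothetical score: total path length plus a penalty per congested pair."""
    total_length = sum(length for _, length in joint_strategy)
    shared = sum(1 for a, b in combinations(joint_strategy, 2) if a[0] == b[0])
    return total_length + CONGESTION_PENALTY * shared


def top_k_joint_strategies(paths_per_agent, k):
    """Enumerate all joint strategies and keep the k cheapest ones."""
    return heapq.nsmallest(k, product(*paths_per_agent), key=joint_cost)


# Three agents, each with one candidate path per side of an obstacle.
paths = [[("left", 5.0), ("right", 6.0)] for _ in range(3)]
selected = top_k_joint_strategies(paths, k=3)
for joint in selected:
    print(joint, joint_cost(joint))
```

Exhaustive enumeration grows exponentially with the number of agents, so the sketch only conveys the filtering step. In the framework described above, the selected joint strategies would then initialize the potential game.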

Updating Robot Safety Representations Online from Natural Language Feedback

Conference paper (2025) - Leonardo Santos, Zirui Li, Lasse Peters, Somil Bansal, Andrea Bajcsy

Robots must operate safely when deployed in novel and human-centered environments, like homes. Current safe control approaches typically assume that the safety constraints are known a priori, and thus, the robot can precompute a corresponding safety controller. While this may make sense for some safety constraints (e.g., avoiding collision with walls by analyzing a floor plan), other constraints are more complex (e.g., spills), inherently personal, context-dependent, and can only be identified at deployment time when the robot is interacting in a specific environment and with a specific person (e.g., fragile objects, expensive rugs). Here, language provides a flexible mechanism to communicate these evolving safety constraints to the robot. In this work, we use vision language models (VLMs) to interpret language feedback and the robot's image observations to continuously update the robot's representation of safety constraints. With these inferred constraints, we update a Hamilton-Jacobi reachability safety controller online via efficient warm-starting techniques. Through simulation and hardware experiments, we demonstrate the robot's ability to infer and respect language-based safety constraints with the proposed approach. ...

You Can't Always Get What You Want: Games of Ordered Preference

Journal article (2025) - Dong Ho Lee, Lasse Peters, David Fridovich-Keil

We study noncooperative games, in which each player's objective is composed of a sequence of ordered—and potentially conflicting—preferences. Problems of this type naturally model a wide variety of scenarios: for example, drivers at a busy intersection must balance the desire to make forward progress with the risk of collision. Mathematically, these problems possess a nested structure, and to behave properly players must prioritize their most important preference, and only consider less important preferences to the extent that they do not compromise performance on more important ones. We consider multi-agent, noncooperative variants of these problems, and seek generalized Nash equilibria in which each player's decision reflects both its hierarchy of preferences and other players' actions. We make two key contributions. First, we develop a recursive approach for deriving the first-order optimality conditions of each player's nested problem. Second, we propose a sequence of increasingly tight relaxations, each of which can be transcribed as a mixed complementarity problem and solved via existing methods. Experimental results demonstrate that our approach reliably converges to equilibrium solutions that strictly reflect players' individual ordered preferences. ...
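
The nested structure of ordered preferences can be illustrated for a single decision-maker choosing from a finite set. This is a simplified sketch under stated assumptions, not the method of the paper, which handles continuous multi-player problems via mixed complementarity formulations. The function names, the two cost functions, and the `slack` parameter are hypothetical; `slack` is only a loose illustration of softening the strict ordering.

```python
def lexicographic_choice(candidates, preferences, slack=0.0):
    """Filter candidates through preferences ordered from most to least important.

    At each level, only candidates within `slack` of the best value for that
    preference survive, so lower-priority preferences can only break ties.
    """
    surviving = list(candidates)
    for cost in preferences:
        best = min(cost(c) for c in surviving)
        surviving = [c for c in surviving if cost(c) <= best + slack]
    return surviving


def collision_risk(speed):
    # Hypothetical: speeds above 6 become increasingly risky.
    return max(0, speed - 6)


def lack_of_progress(speed):
    # Hypothetical: driving faster means more forward progress.
    return 10 - speed


speeds = range(11)
strict = lexicographic_choice(speeds, [collision_risk, lack_of_progress])
relaxed = lexicographic_choice(speeds, [collision_risk, lack_of_progress], slack=1)
print(strict, relaxed)
```

With `slack=0`, the driver picks the fastest speed that incurs no collision risk. A positive `slack` admits slightly riskier speeds in exchange for progress.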

Contingency Games for Multi-Agent Interaction

Journal article (2024) - Lasse Peters, Andrea Bajcsy, Chih Yuan Chiu, David Fridovich-Keil, Forrest Laine, Laura Ferranti, Javier Alonso-Mora

Contingency planning, wherein an agent generates a set of possible plans conditioned on the outcome of an uncertain event, is an increasingly popular way for robots to act under uncertainty. In this work we take a game-theoretic perspective on contingency planning, tailored to multi-agent scenarios in which a robot's actions impact the decisions of other agents and vice versa. The resulting contingency game allows the robot to efficiently interact with other agents by generating strategic motion plans conditioned on multiple possible intents for other actors in the scene. Contingency games are parameterized via a scalar variable which represents a future time when intent uncertainty will be resolved. By estimating this parameter online, we construct a game-theoretic motion planner that adapts to changing beliefs while anticipating future certainty. We show that existing variants of game-theoretic planning under uncertainty are readily obtained as special cases of contingency games. Through a series of simulated autonomous driving scenarios, we demonstrate that contingency games close the gap between certainty-equivalent games that commit to a single hypothesis and non-contingent multi-hypothesis games that do not account for future uncertainty reduction. ...
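
One way to write down the structure described in this abstract is sketched below. The notation is an assumption made for illustration and may differ from the paper: $\Theta$ is the set of intent hypotheses, $b(\theta)$ the robot's belief, $\tau^R_\theta$ and $\tau^O_\theta$ the robot's and the other agent's trajectories under hypothesis $\theta$, $u^R_{\theta,t}$ the robot's input at time $t$, and $t_b$ the branching time at which intent uncertainty is assumed to resolve.

```latex
\begin{aligned}
\text{Robot:}\quad
  & \min_{\{\tau^R_\theta\}_{\theta \in \Theta}}
    \ \sum_{\theta \in \Theta} b(\theta)\, J^R\!\left(\tau^R_\theta, \tau^O_\theta\right) \\
  & \ \text{s.t.}\quad u^R_{\theta,t} = u^R_{\theta',t}
    \quad \forall\, \theta, \theta' \in \Theta,\ \ t < t_b, \\[4pt]
\text{Other agent, for each } \theta \in \Theta:\quad
  & \min_{\tau^O_\theta}\ J^O_\theta\!\left(\tau^R_\theta, \tau^O_\theta\right).
\end{aligned}
```

Under this reading, the robot's plans must agree across hypotheses until $t_b$ and may branch afterwards. If $t_b$ lies at the end of the planning horizon, the constraint forces a single robot plan for all hypotheses, which corresponds to the non-contingent multi-hypothesis case mentioned above.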

Online and offline learning of player objectives from partial observations in dynamic games

Journal article (2023) - Lasse Peters, Vicenç Rubies-Royo, Claire J. Tomlin, Laura Ferranti, Javier Alonso-Mora, Cyrill Stachniss, David Fridovich-Keil

Robots deployed to the real world must be able to interact with other agents in their environment. Dynamic game theory provides a powerful mathematical framework for modeling scenarios in which agents have individual objectives and interactions evolve over time. However, a key limitation of such techniques is that they require a priori knowledge of all players’ objectives. In this work, we address this issue by proposing a novel method for learning players’ objectives in continuous dynamic games from noise-corrupted, partial state observations. Our approach learns objectives by coupling the estimation of unknown cost parameters of each player with inference of unobserved states and inputs through Nash equilibrium constraints. By coupling past state estimates with future state predictions, our approach is amenable to simultaneous online learning and prediction in receding horizon fashion. We demonstrate our method in several simulated traffic scenarios in which we recover players’ preferences for, e.g., desired travel speed and collision-avoidance behavior. Results show that our method reliably estimates game-theoretic models from noise-corrupted data, and that these models closely match ground-truth objectives, consistently outperforming state-of-the-art approaches. ...
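
The idea of recovering an objective parameter from partial observations through equilibrium conditions can be illustrated on a toy problem. The sketch below is an assumption-laden illustration, not the method of the paper: it uses a hypothetical static two-player scalar game with a closed-form Nash equilibrium, observes only player 2's decision, and fits player 1's unknown goal `theta` by gradient descent on a least-squares loss (the maximum-likelihood estimate under Gaussian noise).

```python
def nash_equilibrium(theta, goal_2=0.0, coupling=1.0):
    """Closed-form Nash equilibrium of a hypothetical two-player scalar game.

    Player 1 minimizes (x1 - theta)^2 + coupling * (x1 - x2)^2,
    player 2 minimizes (x2 - goal_2)^2 + coupling * (x2 - x1)^2.
    Setting both first-order conditions to zero gives a 2x2 linear system.
    """
    det = 1.0 + 2.0 * coupling
    x1 = ((1.0 + coupling) * theta + coupling * goal_2) / det
    x2 = (coupling * theta + (1.0 + coupling) * goal_2) / det
    return x1, x2


def estimate_theta(observations_x2, steps=200, step_size=1.0, coupling=1.0):
    """Fit theta by gradient descent on the mean squared error of x2 only."""
    theta = 0.0
    sensitivity = coupling / (1.0 + 2.0 * coupling)  # d x2 / d theta
    for _ in range(steps):
        _, x2 = nash_equilibrium(theta, coupling=coupling)
        residuals = [y - x2 for y in observations_x2]
        gradient = -2.0 * sensitivity * sum(residuals) / len(residuals)
        theta -= step_size * gradient
    return theta


true_theta = 2.0
_, x2_true = nash_equilibrium(true_theta)
noisy_x2 = [x2_true + e for e in (0.05, -0.02, -0.03)]  # made-up noise samples
estimate = estimate_theta(noisy_x2)
print(estimate)
```

Player 1's decision is never observed here. Its goal is identifiable only because the equilibrium conditions couple it to player 2's observed behavior.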

Cost Inference for Feedback Dynamic Games from Noisy Partial State Observations and Incomplete Trajectories

Journal article (2023) - Jingqi Li, Chih Yuan Chiu, Lasse Peters, Somayeh Sojoudi, Claire Tomlin, David Fridovich-Keil

In multi-agent dynamic games, the Nash equilibrium state trajectory of each agent is determined by its cost function and the information pattern of the game. However, the cost and trajectory of each agent may be unavailable to the other agents. Prior work on using partial observations to infer the costs in dynamic games assumes an open-loop information pattern. In this work, we demonstrate that the feedback Nash equilibrium concept is more expressive and encodes more complex behavior. It is desirable to develop specific tools for inferring players' objectives in feedback games. Therefore, we consider the dynamic game cost inference problem under the feedback information pattern, using only partial state observations and incomplete trajectory data. To this end, we first propose an inverse feedback game loss function, whose minimizer yields a feedback Nash equilibrium state trajectory closest to the observation data. We characterize the landscape and differentiability of the loss function. Given the difficulty of obtaining the exact gradient, our main contribution is an efficient gradient approximator, which enables a novel inverse feedback game solver that minimizes the loss using first-order optimization. In thorough empirical evaluations, we demonstrate that our algorithm converges reliably and has better robustness and generalization performance than the open-loop baseline method when the observation data reflects a group of players acting in a feedback Nash game. ...

Scenario-Game ADMM: A Parallelized Scenario-Based Solver for Stochastic Noncooperative Games

Conference paper (2023) - Jingqi Li, Chih Yuan Chiu, Lasse Peters, Fernando Palafox, Mustafa Karabag, Javier Alonso-Mora, Somayeh Sojoudi, Claire Tomlin, David Fridovich-Keil

Decision-making in multi-player games can be extremely challenging, particularly under uncertainty. In this work, we propose a new sample-based approximation to a class of stochastic, general-sum, pure Nash games, where each player has an expected-value objective and a set of chance constraints. This new approximation scheme inherits the accuracy of objective approximation from the established sample average approximation (SAA) method and enjoys a feasibility guarantee derived from the scenario optimization literature. We characterize the sample complexity of this new game-theoretic approximation scheme, and observe that high accuracy usually requires a large number of samples, which results in a large number of sampled constraints. To accommodate this, we decompose the approximated game into a set of smaller games with few constraints for each sampled scenario, and propose a decentralized, consensus-based ADMM algorithm to efficiently compute a generalized Nash equilibrium (GNE) of the approximated game. We prove the convergence of our algorithm to a GNE and empirically demonstrate superior performance relative to a recent baseline algorithm based on ADMM and an interior point method. ...
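
The consensus mechanism behind scenario-wise decomposition can be illustrated with a textbook consensus ADMM iteration. The sketch below is a deliberately simplified assumption: it solves a single shared-variable optimization problem with one hypothetical quadratic cost per scenario rather than a game, and it is not the algorithm from the paper. The names `consensus_admm`, `targets`, and `rho` are illustrative.

```python
def consensus_admm(targets, rho=1.0, iterations=100):
    """Scaled-form consensus ADMM for min_x sum_s 0.5 * (x - targets[s])**2."""
    n = len(targets)
    z = 0.0              # consensus variable shared by all scenarios
    duals = [0.0] * n    # one scaled dual variable per scenario
    local_solutions = [0.0] * n
    for _ in range(iterations):
        # Local step: each scenario solves its own small problem (parallelizable).
        local_solutions = [
            (a + rho * (z - u)) / (1.0 + rho) for a, u in zip(targets, duals)
        ]
        # Consensus step: average the local solutions plus scaled duals.
        z = sum(x + u for x, u in zip(local_solutions, duals)) / n
        # Dual step: accumulate each scenario's disagreement with the consensus.
        duals = [u + x - z for x, u in zip(local_solutions, duals)]
    return z, local_solutions


consensus, per_scenario = consensus_admm([1.0, 2.0, 6.0])
print(consensus, per_scenario)
```

For this quadratic toy problem the consensus value converges to the average of the scenario targets, and every local copy agrees with it at convergence. The local step is the part that can run in parallel across scenarios.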

Learning to Play Trajectory Games Against Opponents with Unknown Objectives

Journal article (2023) - Xinjie Liu, Lasse Peters, Javier Alonso-Mora

Many autonomous agents, such as intelligent vehicles, are inherently required to interact with one another. Game theory provides a natural mathematical tool for robot motion planning in such interactive settings. However, tractable algorithms for such problems usually rely on a strong assumption, namely that the objectives of all players in the scene are known. To make such tools applicable for ego-centric planning with only local information, we propose an adaptive model-predictive game solver, which jointly infers other players' objectives online and computes a corresponding generalized Nash equilibrium (GNE) strategy. The adaptivity of our approach is enabled by a differentiable trajectory game solver whose gradient signal is used for maximum likelihood estimation (MLE) of opponents' objectives. This differentiability of our pipeline facilitates direct integration with other differentiable elements, such as neural networks (NNs). Furthermore, in contrast to existing solvers for cost inference in games, our method handles not only partial state observations but also general inequality constraints. In two simulated traffic scenarios, we find superior performance of our approach over both existing game-theoretic methods and non-game-theoretic model-predictive control (MPC) approaches. We also demonstrate our approach's real-time planning capabilities and robustness in two-player hardware experiments. ...

Learning Mixed Strategies in Trajectory Games

Conference paper (2022) - L. Peters, David Fridovich-Keil, L. Ferranti, Cyrill Stachniss, Javier Alonso-Mora, Forrest Laine

In multi-agent settings, game theory is a natural framework for describing the strategic interactions of agents whose objectives depend upon one another’s behavior. Trajectory games capture these complex effects by design. In competitive settings, this makes them a more faithful interaction model than traditional “predict then plan” approaches. However, current game-theoretic planning methods have important limitations. In this work, we make two main contributions. First, we introduce an offline training phase which reduces the online computational burden of solving trajectory games. Second, we formulate a lifted game which allows players to optimize multiple candidate trajectories in unison and thereby construct more competitive “mixed” strategies. We validate our approach in a number of experiments using the pursuit-evasion game “tag.” ...
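
To illustrate why mixing over candidate trajectories matters in a competitive game such as tag, the sketch below approximates a mixed Nash equilibrium of a small zero-sum matrix game. It uses fictitious play, a classical technique swapped in here for simplicity; it is not the solver from the paper. The cost matrix is a made-up example in which `cost[i][j]` is the pursuer's cost (e.g., final distance) when pursuer candidate `i` meets evader candidate `j`.

```python
def fictitious_play(cost, iterations=5000):
    """Approximate a mixed Nash equilibrium of a zero-sum matrix game.

    The row player (pursuer) minimizes cost[i][j]; the column player (evader)
    maximizes it. Each round, both best-respond to the opponent's empirical
    play so far; the empirical frequencies approximate mixed strategies.
    """
    n_rows, n_cols = len(cost), len(cost[0])
    row_counts = [0] * n_rows
    col_counts = [0] * n_cols
    for _ in range(iterations):
        row_values = [
            sum(cost[i][j] * col_counts[j] for j in range(n_cols))
            for i in range(n_rows)
        ]
        col_values = [
            sum(cost[i][j] * row_counts[i] for i in range(n_rows))
            for j in range(n_cols)
        ]
        # Ties are broken toward the lowest index.
        row_counts[row_values.index(min(row_values))] += 1
        col_counts[col_values.index(max(col_values))] += 1
    row_mix = [c / iterations for c in row_counts]
    col_mix = [c / iterations for c in col_counts]
    return row_mix, col_mix


# Hypothetical candidates: both players can go "left" (0) or "right" (1).
# The pursuer pays nothing when it guesses the evader's side, 1 otherwise.
tag_cost = [[0.0, 1.0],
            [1.0, 0.0]]
pursuer_mix, evader_mix = fictitious_play(tag_cost)
print(pursuer_mix, evader_mix)
```

In this toy game any deterministic choice is exploitable by the opponent, so both empirical strategies approach an even mix over the two candidates.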