On Game-Theoretic Planning with Unknown Opponents' Objectives

Abstract

Many autonomous navigation tasks require mobile robots to operate in dynamic environments involving interactions between agents. Developing interaction-aware motion planning algorithms that enable safe and intelligent interactions remains challenging. Dynamic game theory provides a powerful mathematical framework for modeling these interactions rigorously as coupled optimization problems. By solving these coupled optimization problems for equilibrium solutions, game-theoretic models explicitly account for the interdependence of agents' decisions and achieve simultaneous prediction and planning. Coupled constraints between players, such as collision avoidance, can also be handled explicitly. However, most existing game-theoretic motion planning approaches rely on known objective models of all agents. This assumption is a key obstacle to real-world ego-centric planning applications of these methods, where only local information is available. This thesis investigates solution approaches that relax this assumption and explicitly account for the ego agent's uncertainty about other agents' objectives while adaptively conducting game-theoretic motion planning.

The main contribution of this work is an online adaptive model-predictive game-play (MPGP) framework that jointly infers other players' objectives and computes the corresponding generalized Nash equilibrium (GNE) strategies. These strategies serve both as predictions of the other players' behavior and as the control strategy for the ego agent. The adaptivity of the proposed approach is enabled by differentiating through a trajectory game solver, whose gradient signal is used for maximum likelihood estimation (MLE) of the opponents' objectives. Compared with existing objective inference solutions for dynamic games, the proposed approach handles general inequality constraints in games and further supports direct integration with other differentiable modules, such as neural networks (NNs). Two simulation experiments indicate that the proposed approach performs comparably to solving games with known objectives and outperforms game-theoretic and model-predictive control (MPC) baselines. Two hardware experiments further demonstrate the planner's real-time planning capability and real-world applicability.
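To make the inference mechanism concrete, the following is a minimal, hypothetical sketch of the idea of gradient-based objective estimation through a differentiable game solver. It is not the thesis's solver: it uses a toy two-player quadratic game whose Nash equilibrium reduces to a linear system, differentiates the equilibrium with respect to the opponent's unknown goal parameter via the implicit function theorem, and runs gradient descent on the prediction error (a simple stand-in for MLE). All names and values are illustrative.

```python
import numpy as np

G1 = 0.0  # ego agent's (known) goal position
C = 0.5   # coupling weight between the two players' decisions

def solve_game(theta):
    """Nash equilibrium of a toy quadratic game:
       J1(x1) = (x1 - G1)^2 + C*(x1 - x2)^2
       J2(x2) = (x2 - theta)^2 + C*(x2 - x1)^2
    Setting both first-order conditions to zero yields a linear system A x = b."""
    A = np.array([[1 + C, -C], [-C, 1 + C]])
    b = np.array([G1, theta])
    x = np.linalg.solve(A, b)
    # Equilibrium sensitivity dx/dtheta via the implicit function theorem:
    # A * (dx/dtheta) = db/dtheta, and here db/dtheta = [0, 1].
    dx_dtheta = np.linalg.solve(A, np.array([0.0, 1.0]))
    return x, dx_dtheta

def infer_theta(x2_observed, theta0=0.0, lr=0.5, steps=100):
    """Gradient descent on the squared error between the predicted
    equilibrium action of the opponent and its observed action."""
    theta = theta0
    for _ in range(steps):
        x, dx = solve_game(theta)
        grad = 2.0 * (x[1] - x2_observed) * dx[1]
        theta -= lr * grad
    return theta

# Simulate an opponent with hidden goal theta* = 2.0, then recover it
# from its observed equilibrium action alone.
true_theta = 2.0
x_true, _ = solve_game(true_theta)
theta_hat = infer_theta(x_true[1])
print(abs(theta_hat - true_theta) < 1e-6)  # → True
```

In the thesis's setting the "solver" returns equilibrium trajectories of a constrained dynamic game rather than a scalar, but the pattern is the same: the solver is treated as a differentiable map from objective parameters to equilibrium behavior, so its gradient can drive the estimator or be composed with other differentiable modules such as NNs.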

In addition to this main contribution, the second contribution of this work is a variational autoencoder (VAE) pipeline built upon the proposed differentiable game solver. This contribution aims to go beyond the point estimation of the first contribution and infer potentially multi-modal beliefs about players' objectives from observations. The main idea is to employ variational inference (VI) to approximate Bayesian inference of players' objectives; the VAE framework is used for amortization to avoid per-sample optimization. Initial results on a single-player example show that, after training, the proposed pipeline can: (i) generate a game objective distribution that resembles the underlying training data distribution; (ii) accurately predict a narrow, uni-modal posterior objective distribution when the observation is unambiguous given previously seen data; and (iii) generate a multi-modal belief distribution over the player's objective that captures the most likely modes in cases of high uncertainty.
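The variational-inference step described above can be summarized by the standard evidence lower bound (ELBO) that VAE training maximizes; the notation here is generic rather than the thesis's own:

```latex
\mathcal{L}(\phi, \psi)
  = \mathbb{E}_{q_\phi(\theta \mid o)}\!\left[\log p_\psi(o \mid \theta)\right]
  - D_{\mathrm{KL}}\!\left(q_\phi(\theta \mid o) \,\|\, p(\theta)\right)
  \;\le\; \log p(o)
```

Here $o$ denotes the observed trajectory, $\theta$ the unknown objective parameters, $p(\theta)$ a prior over objectives, $q_\phi(\theta \mid o)$ the amortized encoder approximating the posterior, and $p_\psi(o \mid \theta)$ the decoder, which in this pipeline routes sampled objectives through the differentiable game solver to reconstruct the observation. Amortization means the encoder is trained once over the dataset, so inference at deployment is a single forward pass rather than a per-sample optimization.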