Game-Theoretic Intention-Aware Planning for Autonomous Vehicles



As autonomous vehicles (AVs) navigate dynamic, constantly changing environments, they must account for how their actions influence the decisions of others in order to interact safely and efficiently with humans. Doing so requires anticipating how humans will behave in different situations given their intentions. This work addresses these challenges by proposing an expressive game-theoretic framework that models the interactions between AVs and human drivers as a multi-agent dynamic game in which each agent seeks to optimize its own objective. The game is solved by computing a Nash equilibrium, which accounts for the potentially non-cooperative behavior of human drivers. To incorporate intention into the game-theoretic formulation, we introduce for each agent a parameter known as Social Value Orientation (SVO), which reflects the degree to which the agent is willing to prioritize the welfare of others over its own. We then develop efficient methods to solve the resulting nonlinear optimization problem in a receding-horizon fashion, given the agents' SVOs. However, cluttered traffic scenarios are typically characterized by uncertainty about the intentions of other traffic participants, owing to noisy sensor data and to the existence of multiple equally admissible equilibrium strategies that humans may adopt to achieve their objectives. We therefore develop an approximate Bayesian inference method that infers the intentions of surrounding human participants by estimating the likelihood of SVO values from newly received state observations. We integrate this estimation module with the game-theoretic planning module in a combined framework and evaluate its predictive performance against algorithms that ignore these sources of uncertainty in two simulated traffic scenarios: ramp merging on a highway and crossing an uncontrolled intersection.
Our results show that the proposed inference method outperforms all other approaches evaluated, with the average prediction error approaching zero. This implies that dynamically updating the SVO values during planning effectively captures the true intentions of the surrounding agents.
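To make the two central ideas concrete, the following is a minimal sketch, not the implementation used in this work. It assumes the angular SVO parameterization common in the literature, in which an agent's utility is cos(φ) times its own reward plus sin(φ) times the other agent's reward, and it replaces the approximate Bayesian inference method with a plain discrete Bayes update over a few candidate SVO angles under an assumed Gaussian observation model. The function names, the scalar state, and the predicted positions are all hypothetical; in the full framework the predictions would come from solving the game for each candidate SVO.

```python
import math


def svo_utility(reward_self, reward_other, svo_angle):
    """SVO-weighted utility (assumed angular form): an angle of 0 is purely
    egoistic, pi/4 weighs both rewards equally, pi/2 is purely altruistic."""
    return math.cos(svo_angle) * reward_self + math.sin(svo_angle) * reward_other


def update_svo_belief(prior, predicted_states, observed_state, noise_std):
    """One discrete Bayes step over candidate SVO angles.

    prior:            dict mapping SVO angle -> prior probability
    predicted_states: dict mapping SVO angle -> state predicted under that SVO
                      (hypothetical placeholder for a game-theoretic prediction)
    observed_state:   newly received (noisy) scalar state observation
    noise_std:        assumed standard deviation of the Gaussian sensor noise
    """
    unnormalized = {}
    for angle, probability in prior.items():
        error = observed_state - predicted_states[angle]
        # Gaussian likelihood up to a constant factor, which cancels on normalization.
        likelihood = math.exp(-0.5 * (error / noise_std) ** 2)
        unnormalized[angle] = probability * likelihood
    total = sum(unnormalized.values())
    if total == 0.0:
        # Observation is implausible under every hypothesis: keep the prior.
        return dict(prior)
    return {angle: weight / total for angle, weight in unnormalized.items()}


# Three candidate orientations: egoistic, prosocial, altruistic.
candidates = [0.0, math.pi / 4, math.pi / 2]
prior = {angle: 1.0 / len(candidates) for angle in candidates}

# Hypothetical predicted longitudinal positions (metres) of a merging driver.
predicted = {0.0: 12.0, math.pi / 4: 10.0, math.pi / 2: 8.0}

posterior = update_svo_belief(prior, predicted, observed_state=10.2, noise_std=1.0)
most_likely_svo = max(posterior, key=posterior.get)
```

Repeating the update at each planning step as new observations arrive is one simple way to let the SVO estimate change during planning; the sketch ignores the multiplicity of equilibria that the full method also has to handle.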