HiPSS-LSTM: A Hierarchical Probabilistic LSTM model for motion forecasting

Abstract

When driving a car, people can usually predict the intentions of other road users with high confidence. They notice small variations in a recent trajectory and take into account road infrastructure, traffic rules, and other factors that influence a future trajectory. As a result, they can interact with other drivers smoothly and safely. Even though human drivers are generally very good at driving, they still make many errors, which lead to many accidents. Future technology should therefore support drivers in this task, and eventually replace them, to increase road safety. A future vehicle should be able to understand and predict the behavior of surrounding road users and react accordingly. However, predicting the trajectories of other road users is an extremely complex task for an algorithm, because many different factors influence road users' decisions. Existing approaches to trajectory prediction are mostly based only on past observed trajectories, which often results in very unrealistic predictions: the predicted trajectories are often inconsistent with the road infrastructure and do not take surrounding road users into account. Moreover, current methods usually predict only the single most likely trajectory, disregarding all other possibilities. The main objective of this thesis is to find a Deep Learning architecture that can predict several of the most likely future destinations for a target vehicle and generate accurate trajectories leading to each of them, while taking into account the road infrastructure, the possible trajectories on it, the target vehicle's recent behavior, and the other road users around the target vehicle. This work takes a well-performing architecture for pedestrian trajectory prediction in crowded spaces and develops it further by adapting it to a different domain: vehicle motion prediction in an urban environment. Moreover, this work extends the method to predict a set of possible trajectories.
This was achieved by implementing a two-stage architecture. The first stage predicts several high-level destinations, each an estimate of where the target vehicle will be after 3 seconds. Each high-level destination is assigned a probability indicating the confidence of the prediction. The second stage uses those predictions to generate low-level trajectories, each consisting of 30 consecutive locations leading from the last observed position towards the high-level goal. This two-stage architecture is trained on an extensive dataset created specifically for autonomous-vehicle applications. To make even better use of the dataset, the network was pretrained on the lane centerlines supplied with the dataset before it was trained on real trajectories. The results show high prediction accuracy, both for the high-level goals and for the low-level trajectories generated for each high-level goal.
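
The interface of the two stages can be illustrated with a minimal sketch. It is not the thesis implementation: both stages in the thesis are learned networks, whereas here the goal scores are supplied by hand and the second stage is replaced by straight-line interpolation, purely to show the shape of the outputs (several goals with probabilities, and 30 locations per goal). All names (`softmax`, `interpolate_trajectory`, `predict`) and the example coordinates are hypothetical.

```python
import math

HORIZON_STEPS = 30  # low-level locations per trajectory, as stated in the abstract


def softmax(scores):
    """Turn raw goal scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def interpolate_trajectory(last_position, goal, steps=HORIZON_STEPS):
    """Stand-in for the learned second stage (assumption): a straight line
    of `steps` points from the last observed position to the goal."""
    x0, y0 = last_position
    gx, gy = goal
    return [
        (x0 + (gx - x0) * k / steps, y0 + (gy - y0) * k / steps)
        for k in range(1, steps + 1)
    ]


def predict(last_position, candidate_goals, goal_scores):
    """Stage 1: attach a probability to each high-level goal.
    Stage 2: generate one low-level trajectory per goal."""
    probabilities = softmax(goal_scores)
    ranked = sorted(
        zip(probabilities, candidate_goals),
        key=lambda pair: pair[0],
        reverse=True,
    )
    return [
        {
            "probability": p,
            "goal": goal,
            "trajectory": interpolate_trajectory(last_position, goal),
        }
        for p, goal in ranked
    ]


# Hypothetical example: go straight, turn left, or turn right.
predictions = predict(
    last_position=(0.0, 0.0),
    candidate_goals=[(30.0, 0.0), (20.0, 15.0), (20.0, -15.0)],
    goal_scores=[2.0, 0.5, 0.1],
)
```

Each entry in `predictions` pairs one high-level goal and its probability with a 30-point trajectory ending at that goal, ordered from most to least likely.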