Path following and stabilization of a bicycle model using a reinforcement learning approach
Sebastian Weyrer (University of Innsbruck)
Peter Manzl (University of Innsbruck)
A. L. Schwab (TU Delft - Biomechatronics & Human-Machine Control)
Johannes Gerstmayr (University of Innsbruck)
Abstract
Over the years, complex control approaches have been developed to control the motion of a bicycle. Reinforcement Learning (RL), a branch of machine learning, promises an automated approach to solving optimal control problems. A so-called agent is trained by interacting with and observing an environment, and the trained agent ultimately serves as the learned controller. The present work introduces a pure RL approach to path following with a virtual bicycle model that simultaneously stabilizes the bicycle laterally. The bicycle, modeled using the Whipple benchmark model and multibody system dynamics, has no stabilization aids. The observation of the environment consists of the minimal position and velocity coordinates of the bicycle, together with information about the path ahead of the bicycle provided by moving preview points. Both path following and stabilization of the bicycle model are achieved exclusively by controlling the steering angle setpoint of the bicycle. Curriculum learning is applied as a state-of-the-art training strategy. Different settings for the RL approach are investigated and compared. The learned controllers are shown to follow complex paths, including full circles, slalom maneuvers, and lane changes, while stabilizing the bicycle model traveling at speeds between 2 m/s and 7 m/s. Explanatory methods for machine learning are used to analyze the learned controller and identify connections to research in bicycle dynamics.
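To make the observation described in the abstract more concrete, the following minimal Python sketch shows one way such a vector could be assembled from a bicycle state and a few preview points on the path ahead. It is an illustration under stated assumptions, not the authors' implementation: the function names, the choice of state entries, the preview spacing (`stride`), and the expression of preview points in a bicycle-fixed frame are all hypothetical and not taken from the paper.

```python
import math


def preview_points_local(path, x, y, yaw, start_idx, n_preview, stride):
    """Return n_preview path points ahead of the bicycle in a bicycle-fixed frame.

    Assumption (not from the paper): the path is a list of (x, y) world points,
    and preview points are taken every `stride` samples after `start_idx`.
    """
    c, s = math.cos(yaw), math.sin(yaw)
    points = []
    for k in range(1, n_preview + 1):
        # Clamp to the last path sample so the preview never runs off the path.
        idx = min(start_idx + k * stride, len(path) - 1)
        dx = path[idx][0] - x
        dy = path[idx][1] - y
        # Rotate the world offset into the bicycle frame (x forward, y left).
        points.append((c * dx + s * dy, -s * dx + c * dy))
    return points


def build_observation(state, preview):
    """Concatenate the bicycle state and the flattened preview points."""
    obs = list(state)
    for px, py in preview:
        obs.extend((px, py))
    return obs


# Hypothetical example: straight path along the x-axis, bicycle offset 0.5 m to the left.
path = [(float(i), 0.0) for i in range(20)]
# Hypothetical state entries: roll angle, steering angle, roll rate, steering rate.
state = (0.01, 0.0, 0.0, 0.0)
preview = preview_points_local(path, x=0.0, y=0.5, yaw=0.0,
                               start_idx=0, n_preview=3, stride=2)
observation = build_observation(state, preview)
```

In this sketch the agent would map `observation` to a single action, the steering angle setpoint, which matches the single control input named in the abstract. Because the preview points are recomputed at every step, they move along the path together with the bicycle.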