Reinforcement Learning Compensated Filter for Position and Orientation Estimation

Abstract

Pose estimation provides accurate, real-time position and orientation information for intelligent agents. The accuracy of the estimate directly affects the performance of subsequent tasks such as mapping, motion planning, and control. The Extended Kalman Filter (EKF) is a standard approach to nonlinear pose estimation that models state uncertainty as a Gaussian distribution. However, the EKF requires a proper initial estimate and proper system noise parameters to obtain a bounded optimal estimate. Moreover, model nonlinearity and non-Gaussian noise significantly degrade the performance of the EKF in practical applications. In this thesis, we focus on improving the performance of nonlinear pose estimation through reinforcement learning. By formulating the EKF measurement update as a Markov Decision Process (MDP), reinforcement learning agents can be trained to learn the estimator gain from data samples and then run as online estimators for pose estimation tasks.

Based on this idea, we propose a novel reinforcement learning-compensated EKF estimator (RLC-EKF), in which the RL agent performs a second measurement update that reduces the residual error of the standard EKF estimate. The estimator is developed and tested in two pose estimation scenarios. First, continuing a previous study, we replicate a framework for 3-DOF orientation estimation using an inertial sensor and a magnetometer. We then extend that framework by training with different RL algorithms and validating its robustness at multiple scales. Second, we implement the estimator in a feature-based 2D planar localization framework, which demonstrates the feasibility of the underlying algorithm on a localization task with a known map. Overall, the RLC-EKF estimator delivers superior performance and convincing robustness compared with classical methods under severe conditions such as varying initial states, noise intensities, and model covariances.
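
The two-stage structure described above can be illustrated with a minimal scalar sketch. This is not the thesis implementation: the function names `ekf_update` and `rl_compensated_update`, the `policy` callable, and the choice to feed the policy the EKF estimate and its residual are all assumptions made for illustration. The sketch only shows the idea of a standard EKF measurement update followed by a second correction whose gain comes from a learned policy.

```python
from typing import Callable, Tuple

Scalar = float


def ekf_update(
    x_pred: Scalar,
    p_pred: Scalar,
    z: Scalar,
    h: Callable[[Scalar], Scalar],
    h_jac: Callable[[Scalar], Scalar],
    r: Scalar,
) -> Tuple[Scalar, Scalar]:
    """Standard EKF measurement update for a scalar state."""
    H = h_jac(x_pred)                # measurement Jacobian at the prediction
    innovation = z - h(x_pred)       # measurement residual before the update
    s = H * p_pred * H + r           # innovation covariance
    k = p_pred * H / s               # Kalman gain
    x_upd = x_pred + k * innovation
    p_upd = (1.0 - k * H) * p_pred
    return x_upd, p_upd


def rl_compensated_update(
    x_pred: Scalar,
    p_pred: Scalar,
    z: Scalar,
    h: Callable[[Scalar], Scalar],
    h_jac: Callable[[Scalar], Scalar],
    r: Scalar,
    policy: Callable[[Scalar, Scalar], Scalar],
) -> Tuple[Scalar, Scalar]:
    """EKF update followed by a second, policy-driven correction.

    `policy` is a hypothetical stand-in for a trained RL agent: it maps
    (EKF estimate, remaining residual) to a compensation gain.
    """
    x_ekf, p_ekf = ekf_update(x_pred, p_pred, z, h, h_jac, r)
    residual = z - h(x_ekf)          # error left over after the EKF update
    gain = policy(x_ekf, residual)   # the agent's action (assumed form)
    x_comp = x_ekf + gain * residual
    return x_comp, p_ekf             # covariance is left unchanged in this sketch
```

With a policy that always returns zero, the compensated estimate reduces to the plain EKF estimate, so in this sketch the learned term acts purely as a correction on top of the classical filter.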