Learning a guidance policy to navigate among dynamic agents in constrained environments with continual reinforcement learning

Abstract

Mobile robots that operate in human environments must be able to navigate safely among humans and other obstacles. Existing approaches use Deep Reinforcement Learning (DRL) to obtain safe robot behavior in such environments, but they do not guarantee collision avoidance or trajectory feasibility. Methods that combine DRL with model predictive control (MPC) address this issue, yet they do not account for static obstacle avoidance. Moreover, DRL-based approaches train their networks on multiple environments of increasing difficulty to speed up training and improve generalization. When a model is trained sequentially on new (more complex) environments, the new knowledge can interfere with previously acquired knowledge, a problem known as catastrophic forgetting: the model performs well on the most challenging scenarios, but its performance on simpler scenarios deteriorates. This paper introduces a continual reinforcement learning (CRL) strategy that sequentially learns multiple navigation tasks while retaining performance on all previously learned tasks. For robot navigation, we combine MPC with DRL to develop an algorithm that avoids not only dynamic, interacting agents but also static obstacles. Our approach is shown to navigate safely to a goal position in multiple environments. In addition, by sequentially learning multiple tasks, it improves navigation performance over other DRL approaches in terms of success rate, travel time, and travel distance.
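The central idea in the abstract, training on a sequence of increasingly difficult navigation tasks while preserving performance on earlier ones, can be made concrete with a small sketch. The snippet below is not the paper's algorithm: the environment and policy classes are hypothetical placeholders, and rehearsal of stored transitions is only one of several possible continual-learning mechanisms, used here purely to illustrate how earlier tasks can be retained during sequential training.

```python
"""Minimal sketch of sequential (continual) training across navigation tasks.

Illustrative only: ToyNavEnv, ToyPolicy, and the rehearsal mechanism are
hypothetical stand-ins, not the method described in the paper.
"""
import random


class ToyNavEnv:
    """Placeholder environment; stands in for one navigation scenario."""
    def __init__(self, name, difficulty):
        self.name = name
        self.difficulty = difficulty

    def rollout(self, policy):
        # Fake "transitions"; a real setup would step a simulator.
        return [(self.name, step, policy.act(step)) for step in range(5)]


class ToyPolicy:
    """Placeholder policy with a trivial update rule."""
    def __init__(self):
        self.updates = 0

    def act(self, observation):
        return 0  # constant action; a real policy maps observations to actions

    def update(self, batch):
        self.updates += len(batch)


def train_sequentially(tasks, episodes_per_task=10, rehearsal_fraction=0.25):
    """Train on tasks in order, replaying old-task data to reduce forgetting.

    Rehearsal is just one possible continual-learning strategy; it is used
    here only to make "retaining previously learned tasks" concrete.
    """
    policy = ToyPolicy()
    memory = []  # transitions kept from previously learned tasks
    for env in tasks:
        for _ in range(episodes_per_task):
            batch = env.rollout(policy)
            if memory:
                k = max(1, int(rehearsal_fraction * len(batch)))
                batch += random.sample(memory, min(k, len(memory)))
            policy.update(batch)
        memory += env.rollout(policy)  # retain samples from this task
    return policy


if __name__ == "__main__":
    # Curriculum of increasingly difficult (hypothetical) navigation tasks.
    curriculum = [ToyNavEnv("static-obstacles", 1),
                  ToyNavEnv("dynamic-agents", 2),
                  ToyNavEnv("crowded-corridor", 3)]
    trained = train_sequentially(curriculum)
    print("policy updates:", trained.updates)
```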