Safe Curriculum Learning for Linear Systems With Unknown Dynamics in Primary Flight Control

Abstract

Safe Curriculum Learning constitutes a collection of methods that aim to enable Reinforcement Learning (RL) algorithms on complex systems and tasks while accounting for the safety and efficiency of the learning process. On the one hand, curricular reinforcement learning approaches divide the task into stages of gradually increasing complexity to promote learning efficiency. On the other hand, safe learning provides a framework for considering a system’s safety during the learning process. The latter’s contribution is significant for safety-critical systems, such as transport vehicles, where stringent safety requirements apply. This paper proposes a black-box safe curriculum learning architecture applicable to systems with unknown dynamics. It requires only knowledge of the orders of the state and action spaces for a given task and system. By adding system identification capabilities to existing safe curriculum learning paradigms, the proposed architecture ensures safe learning on tracking tasks without requiring initial knowledge of the internal system dynamics. More specifically, a model estimate is generated online to complement safety filters that rely on uncertain models for their safety guarantees. As a proof of concept, the experiments in this article explicitly target linearised systems with decoupled dynamics. The paradigm is first verified on a mass-spring-damper system. The architecture is then applied to a quadrotor, where it tracks the system’s four degrees of freedom independently, namely the attitude angles and altitude. The RL agent is able to safely learn an optimal policy that can track an independent reference on each degree of freedom.
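
As an illustration of what generating a model estimate online can look like for a linear system, the sketch below identifies a discrete-time mass-spring-damper model of the form x[k+1] = A x[k] + B u[k] from input and state samples. The abstract does not name the identification method, so recursive least squares (RLS) is an assumption made here for illustration only. The mass, damping, stiffness, time step, forward-Euler discretisation, random excitation input and all function names are likewise hypothetical. Only the orders of the state and action spaces (2 and 1) are given to the estimator.

```python
import random

def euler_msd(m, c, k, dt):
    """Forward-Euler discretisation of a mass-spring-damper (assumed model).
    State x = [position, velocity], input u = force."""
    A = [[1.0, dt], [-k / m * dt, 1.0 - c / m * dt]]
    B = [0.0, dt / m]
    return A, B

def step(x, u, A, B):
    """One step of x_next = A x + B u for a 2-state, 1-input system."""
    return [A[i][0] * x[0] + A[i][1] * x[1] + B[i] * u for i in range(2)]

def rls_update(theta, P, z, x_next):
    """One recursive least squares update for x_next = theta z, z = [x; u].
    theta (n_out x n_reg) and P (n_reg x n_reg) are modified in place."""
    n = len(z)
    Pz = [sum(P[i][j] * z[j] for j in range(n)) for i in range(n)]
    denom = 1.0 + sum(z[i] * Pz[i] for i in range(n))
    K = [v / denom for v in Pz]  # gain vector
    for r in range(len(theta)):
        err = x_next[r] - sum(theta[r][j] * z[j] for j in range(n))
        for j in range(n):
            theta[r][j] += err * K[j]
    # P <- P - K z^T P; P is symmetric, so z^T P equals Pz transposed.
    for i in range(n):
        for j in range(n):
            P[i][j] -= K[i] * Pz[j]

random.seed(0)
# Hypothetical "true" plant, unknown to the estimator.
A_true, B_true = euler_msd(m=1.0, c=0.5, k=2.0, dt=0.01)

# Estimator state: only the dimensions (2 states, 1 input) are assumed known.
theta_hat = [[0.0] * 3 for _ in range(2)]
P = [[1e6 if i == j else 0.0 for j in range(3)] for i in range(3)]

x = [1.0, 0.0]
for _ in range(500):
    u = random.uniform(-1.0, 1.0)  # random excitation input
    x_next = step(x, u, A_true, B_true)
    rls_update(theta_hat, P, x + [u], x_next)
    x = x_next

# Split the estimated parameter matrix into state and input parts.
A_hat = [row[:2] for row in theta_hat]
B_hat = [row[2] for row in theta_hat]
```

In this noise-free setting, `A_hat` and `B_hat` approach `A_true` and `B_true` once the input has excited both states. A safety filter that relies on a model could then be supplied with such an estimate, although how the paper couples the two is not specified in the abstract.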
