Safe Reinforcement Learning in Flight Control

Introduction to Safe Incremental Dual Heuristic Programming

Abstract

Online continuous reinforcement learning has shown promising results in flight control, achieving near-optimal control within seconds and adapting to sudden changes in the environment. However, it offers no safety guarantees, which are needed for use in general aviation. Furthermore, performance often depends on precise tuning of the system's hyperparameters. As a new initiative in providing safety guarantees, Safe Incremental Dual Heuristic Programming (SIDHP) is presented. SIDHP combines the fast learning speed of Incremental Dual Heuristic Programming (IDHP) with a safety layer that keeps the aircraft within a predetermined safe flight envelope. SIDHP is demonstrated and compared to IDHP in three separate experiments using a high-fidelity flight simulation of a Cessna Citation-II. SIDHP proves more robust than IDHP to changes in hyperparameters and results in fewer failures overall.
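The abstract does not say how the safety layer works internally. As a purely illustrative sketch, and not the SIDHP algorithm itself, the general idea of a safety layer can be pictured as a filter placed between the learning controller and the aircraft. The example below assumes a single scalar state (for instance a pitch angle), a hypothetical one-step linear prediction `x_next = x + g * u`, and a box-shaped envelope `[x_min, x_max]`. The function name `safe_action` and all of its parameters are invented for this illustration.

```python
def safe_action(x, u_proposed, g, x_min, x_max):
    """Clip a proposed control increment so the predicted next state
    stays inside the envelope [x_min, x_max].

    Assumption (not from the source): the next state is predicted by the
    one-step linear model x_next = x + g * u, where g is a known, constant
    control effectiveness.
    """
    if g == 0.0:
        # The control has no effect on this state, so there is nothing to correct.
        return u_proposed
    # Control increments that would place the next state exactly on each bound.
    u_a = (x_min - x) / g
    u_b = (x_max - x) / g
    # A negative g swaps which bound is the lower one, so sort them.
    u_lo, u_hi = min(u_a, u_b), max(u_a, u_b)
    # Pass the proposed action through unchanged when it is already safe.
    return min(max(u_proposed, u_lo), u_hi)


if __name__ == "__main__":
    # The learner proposes an increment that would leave the envelope.
    u = safe_action(x=0.0, u_proposed=1.0, g=1.0, x_min=-0.5, x_max=0.5)
    print(u)  # clipped to the largest increment that keeps x_next <= x_max
```

In this sketch the learner's action is only altered when the predicted state would leave the envelope, so learning proceeds unmodified inside the safe region. A real flight envelope involves several coupled states and an imperfect model, which this one-dimensional example deliberately ignores.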