Incremental Model Based Actor Critic Design for Optimal Adaptive Flight Control

Investigation and Implementation of Online Flight Control Methods

Abstract

To solve the optimal control problem for nonlinear systems, Actor Critic Designs (ACDs) can be used, which combine the concept of Reinforcement Learning (RL) with function approximators such as neural networks (NNs). Traditional ACD methods require a model NN that must be trained offline. Recently, research focus has shifted to model-free approaches that require no model information beforehand and can be applied to online control. This thesis advances online ACD methods by developing Incremental Model based Action Dependent Dual Heuristic Programming (IADDHP). In IADDHP, the local system dynamics are identified online, requiring no a priori knowledge of the system and thus making the method essentially 'model-free'. Experiments on a missile model for reference tracking control show that IADDHP finds near-optimal control policies in tasks with noise and system failure. It also outperforms the existing model-free ADDHP, which uses the finite difference method (FDM), and has an advantage over it in failure detection and adaptation. Being model-free, IADDHP should be applicable to reference tracking control of any system.
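The online identification of local system dynamics mentioned above is commonly done with an incremental (linearized) model whose matrices are estimated by recursive least squares. The sketch below is an illustrative example of that general idea, not the thesis's actual implementation; all names and parameters (the forgetting factor `gamma`, the initial covariance `p0`) are assumptions.

```python
import numpy as np

class IncrementalModelRLS:
    """Online estimation of the local incremental dynamics
        dx[t+1] ~= F dx[t] + G du[t]
    via recursive least squares with a forgetting factor.
    Illustrative sketch only; names and defaults are assumptions."""

    def __init__(self, n_states, n_inputs, gamma=0.99, p0=1e3):
        self.n = n_states
        self.gamma = gamma                         # forgetting factor
        # theta stacks the transposed [F G] matrices: shape (n+m, n)
        self.theta = np.zeros((n_states + n_inputs, n_states))
        self.P = np.eye(n_states + n_inputs) * p0  # covariance

    def update(self, dx, du, dx_next):
        """One RLS step from a state increment, input increment,
        and the observed next state increment."""
        phi = np.concatenate([dx, du]).reshape(-1, 1)   # regressor
        y = dx_next.reshape(1, -1)                      # target row
        denom = self.gamma + float(phi.T @ self.P @ phi)
        K = self.P @ phi / denom                        # RLS gain
        err = y - phi.T @ self.theta                    # prediction error
        self.theta += K @ err
        self.P = (self.P - K @ phi.T @ self.P) / self.gamma

    @property
    def F(self):
        return self.theta[:self.n].T   # state-transition estimate

    @property
    def G(self):
        return self.theta[self.n:].T   # input-distribution estimate
```

Because only local increments are fitted, the estimated `F` and `G` track the system around its current operating point, which is what allows the scheme to readapt after a failure changes the dynamics.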
