J.M. Hoogvliet
Please Note
<p>This page displays the records of the person named above and is not linked to a unique person identifier. This record may need to be merged to a profile.</p>
2 records found
1
Hierarchical Reinforcement Learning for Model-Free Flight Control
A sample efficient tabular approach using Q(lambda)-learning and options in a traditional flight control structure
Reinforcement learning (RL) is a model-free adaptive approach to learning a non-linear control law for flight control. However, for flat RL (FRL) the size of the search space grows exponentially with the number of states, resulting in low sample efficiency. This research aims to improve that efficiency with Hierarchical Reinforcement Learning (HRL). Performance, in terms of the number of samples and the mean tracking error, is evaluated on an altitude reference tracking task using a simulated F16 aircraft model. FRL is used as the baseline performance index. HRL is used to define a three-level learning structure that re-uses an existing flight control structure. Finally, options are used with HRL to add temporal abstraction. It is shown that re-using the flight control structure makes the learning process more sample efficient. Adding options further increases this efficiency, but does not lead to better tracking performance.
Bachelor thesis
(2015)
V. van den Bercken, S. Burger, N.M. Dekkers, J.M. Hoogvliet, A.V. Kassem, R. Kok, T.A.H. Kranen, G.P. van Marrewijk, F.T. Melman, B. Mulder, E. Mooij, F.S. Esrail, S. Woicke