Deep Reinforcement Learning for the Synthesis of Self-Triggered Sampling Strategies
R.J.F. de Ruijter (TU Delft - Mechanical Engineering)
M Mazo Espinosa – Mentor (TU Delft - Team Manuel Mazo Jr)
Daniel Jarne Ornia – Mentor (TU Delft - Learning & Autonomous Control)
Abstract
Control engineering researchers are increasingly embracing data-driven techniques such as reinforcement learning for control and optimisation. One case where reinforcement learning could be useful is the synthesis of near-optimal sampling strategies for self-triggered control. Self-triggered control is an aperiodic control method that aims to reduce the number of communications between the controller and the sensors in a control loop by predicting when a triggering condition will be met and transmitting a sample only then. Recent research has shown that greedily following the proposed sampling times can result in sub-optimal long-term average inter-sample times. Abstraction-based methods have been able to synthesise sampling strategies with better long-term average inter-sample times by allowing for early sampling and treating the proposed sampling times as deadlines. However, these abstraction-based methods suffer from the curse of dimensionality in the form of combinatorial explosion, which limits their practicality for more complex systems. This thesis proposes a novel deep reinforcement learning tool for finding near-optimal sampling strategies for self-triggered control of LTI systems. The tool is evaluated against a state-of-the-art abstraction-based method: it matches the performance of the abstraction-based method on smaller systems, while still achieving good results on more complex systems for which abstraction-based methods are intractable.
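
To illustrate the setup described in the abstract, the following is a minimal, hypothetical sketch of how the sampling problem could be cast as a reinforcement learning environment. The class name SelfTriggeredLTIEnv, the forward-Euler discretisation, and the norm-based triggering rule are illustrative assumptions and not the formulation used in the thesis: the agent chooses how many base time steps to wait before sampling, capped by the deadline implied by the triggering condition, and receives the achieved inter-sample time as its reward.

    import numpy as np

    # Hypothetical sketch (not the thesis's actual formulation): an LTI system
    # x' = A x + B u with state feedback u = K x_hat, where x_hat is the last
    # sampled state held constant between samples (zero-order hold). The agent
    # picks an inter-sample time in multiples of a base step h; a norm-based
    # triggering condition acts as a deadline, so the agent may sample early
    # but never later than the proposed sampling time.
    class SelfTriggeredLTIEnv:
        def __init__(self, A, B, K, h=0.01, sigma=0.1, max_steps=100):
            self.A, self.B, self.K = A, B, K
            self.h = h                  # base discretisation step
            self.sigma = sigma          # triggering threshold
            self.max_steps = max_steps  # cap on the number of base steps
            self.x = None

        def reset(self, x0):
            self.x = np.asarray(x0, dtype=float)
            return self.x.copy()

        def deadline(self):
            # Number of base steps after which the triggering condition
            # |x - x_hat| <= sigma * |x| is first violated (predicted by
            # simulating forward); this is the latest admissible sampling time.
            x, x_hat = self.x.copy(), self.x.copy()
            u = self.K @ x_hat
            for k in range(1, self.max_steps + 1):
                x = x + self.h * (self.A @ x + self.B @ u)  # forward-Euler step
                if np.linalg.norm(x - x_hat) > self.sigma * np.linalg.norm(x):
                    return k
            return self.max_steps

        def step(self, k):
            # Advance k base steps with the input held constant, then sample.
            # k is clipped to the deadline, mirroring the early-sampling idea.
            k = int(max(1, min(k, self.deadline())))
            u = self.K @ self.x
            for _ in range(k):
                self.x = self.x + self.h * (self.A @ self.x + self.B @ u)
            reward = k * self.h  # longer inter-sample time -> higher reward
            return self.x.copy(), reward

Because each decision is rewarded with the realised inter-sample time, maximising the return in such an environment corresponds to maximising the long-term average inter-sample time targeted above; an off-the-shelf deep reinforcement learning algorithm could, in principle, be trained on an environment of this form.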