Regret Analysis of Learning-Based Linear Quadratic Gaussian Control with Additive Exploration

Conference Paper (2024)
Author(s)

Archith Athrey (Student TU Delft)

Othmane Mazhar (Université Paris Cité)

Meichen Guo (TU Delft - Team Meichen Guo)

B.H.K. De Schutter (TU Delft - Delft Center for Systems and Control)

Shengling Shi (TU Delft - Team Bart De Schutter)

Research Group
Team Bart De Schutter
DOI
https://doi.org/10.23919/ECC64448.2024.10590739
Publication Year
2024
Language
English
Pages (from-to)
1795-1801
ISBN (electronic)
978-3-9071-4410-7
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

In this paper, we analyze the regret incurred by a computationally efficient exploration strategy, known as naive exploration, for controlling unknown partially observable systems within the Linear Quadratic Gaussian (LQG) framework. We introduce a two-phase control algorithm called LQG-NAIVE, which consists of an initial phase of injecting Gaussian input signals to obtain a system model, followed by a second phase that alternates between naive exploration and control in an episodic fashion. We show that LQG-NAIVE achieves a regret growth rate of Õ(√T), i.e., O(√T) up to logarithmic factors after T time steps, and we validate its performance through numerical simulations. Additionally, we propose LQG-IF2E, which extends the exploration signal to a 'closed-loop' setting by incorporating the Fisher Information Matrix (FIM). We provide compelling numerical evidence of the competitive performance of LQG-IF2E compared to LQG-NAIVE.
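The sketch below illustrates the two-phase structure described in the abstract: a first phase of pure Gaussian-input excitation followed by episodes of certainty-equivalent control with a decaying additive Gaussian exploration signal. It is not the paper's LQG-NAIVE algorithm; to stay short and self-contained it uses a simplified, fully observed LQR variant (the paper treats the partially observable LQG case with output feedback), and all system matrices, episode lengths, and excitation schedules are illustrative assumptions.

```python
# Minimal sketch of a two-phase naive-exploration controller on a
# simplified, fully observed linear system (illustrative only; the paper's
# LQG-NAIVE addresses the partially observable LQG setting).
import numpy as np
from scipy.linalg import solve_discrete_are

rng = np.random.default_rng(0)

# "Unknown" true system x_{t+1} = A x_t + B u_t + w_t (assumed values)
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
Q, R = np.eye(2), np.eye(1)
n, m, sigma_w = 2, 1, 0.1

def step(x, u):
    return A @ x + B @ u + sigma_w * rng.standard_normal(n)

def lqr_gain(A_hat, B_hat):
    # Certainty-equivalent LQR gain computed from the current estimate.
    P = solve_discrete_are(A_hat, B_hat, Q, R)
    return -np.linalg.solve(R + B_hat.T @ P @ B_hat, B_hat.T @ P @ A_hat)

def ls_estimate(X, U, Xn):
    # Least-squares estimate of [A B] from collected state-input data.
    Z = np.hstack([np.array(X), np.array(U)])
    Theta, *_ = np.linalg.lstsq(Z, np.array(Xn), rcond=None)
    AB = Theta.T
    return AB[:, :n], AB[:, n:]

# Phase 1: inject Gaussian inputs to obtain an initial model estimate.
X, U, Xn, x = [], [], [], np.zeros(n)
for _ in range(200):
    u = rng.standard_normal(m)
    x_next = step(x, u)
    X.append(x); U.append(u); Xn.append(x_next)
    x = x_next
A_hat, B_hat = ls_estimate(X, U, Xn)

# Phase 2: episodes alternating control and naive exploration, i.e.,
# certainty-equivalent feedback plus decaying Gaussian excitation,
# with the model re-estimated at the end of each episode.
episode_len, total_cost, t = 200, 0.0, 0
for episode in range(5):
    K = lqr_gain(A_hat, B_hat)
    sigma_e = 1.0 / (t + 1) ** 0.25   # excitation shrinks as data accumulates
    for _ in range(episode_len):
        u = K @ x + sigma_e * rng.standard_normal(m)
        total_cost += x @ Q @ x + u @ R @ u
        x_next = step(x, u)
        X.append(x); U.append(u); Xn.append(x_next)
        x = x_next
        t += 1
    A_hat, B_hat = ls_estimate(X, U, Xn)
    episode_len *= 2                   # growing episodes, fewer policy switches

print("accumulated cost:", total_cost)
```

The decaying excitation variance is the essence of naive exploration: enough persistent excitation to keep improving the model estimate, but small enough that the added control cost does not dominate, which is what makes a √T-type regret growth plausible in this style of analysis.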

Files

Regret_Analysis_of_Learning-Ba... (pdf, 0.58 MB)
Embargo expired on 24-01-2025; license info not available.