Regret Analysis of Learning-Based Linear Quadratic Gaussian Control with Additive Exploration

Conference Paper (2024)

Author(s)

Archith Athrey (Student TU Delft)

Othmane Mazhar (Université Paris Cité Grands)

Meichen Guo (TU Delft - Team Meichen Guo)

B. De Schutter (TU Delft - Delft Center for Systems and Control)

S. Shi (TU Delft - Team Bart De Schutter)

Research Group

Team Bart De Schutter

DOI related publication

https://doi.org/10.23919/ECC64448.2024.10590739

To reference this document use:

https://resolver.tudelft.nl/uuid:9d7488a1-1715-467a-8761-9c60a0eb2bcd

Publication Year

2024

Language

English

Pages (from-to)

1795-1801

ISBN (electronic)

978-3-9071-4410-7

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

In this paper, we analyze the regret incurred by a computationally efficient exploration strategy, known as naive exploration, for controlling unknown partially observable systems within the Linear Quadratic Gaussian (LQG) framework. We introduce a two-phase control algorithm called LQG-NAIVE, which involves an initial phase of injecting Gaussian input signals to obtain a system model, followed by a second phase of an interplay between naive exploration and control in an episodic fashion. We show that LQG-NAIVE achieves a regret growth rate of Õ(√T), i.e., O(√T) up to logarithmic factors after T time steps, and we validate its performance through numerical simulations. Additionally, we propose LQG-IF2E, which extends the exploration signal to a 'closed-loop' setting by incorporating the Fisher Information Matrix (FIM). We provide compelling numerical evidence of the competitive performance of LQG-IF2E compared to LQG-NAIVE.
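The two-phase input structure described in the abstract can be illustrated with a toy sketch. This is not the LQG-NAIVE algorithm from the paper. It assumes a scalar system, a least-squares fit of a one-step input-output model in place of the paper's identification step, a simple output-feedback gain in place of an LQG controller, episodes of doubling length, and an exploration variance that decays like 1/√t. The function name `run_two_phase`, its parameters, and all numerical values are hypothetical.

```python
import numpy as np


def run_two_phase(T=2000, T_warm=200, a=0.9, b=1.0, sigma_u=1.0, seed=0):
    """Toy scalar system: x[t+1] = a*x[t] + b*u[t] + w[t], y[t] = x[t] + v[t].

    Illustrative only: every modelling choice here is an assumption,
    not the method analysed in the paper.
    """
    rng = np.random.default_rng(seed)
    x = 0.0
    ys, us = [], []
    a_hat, b_hat = 0.0, 1.0   # placeholder model, unused before the first fit
    next_update = T_warm      # first model fit happens when the warm-up ends

    for t in range(T):
        y = x + 0.1 * rng.standard_normal()  # noisy observation of the state

        if t == next_update:
            # Episode boundary: regress y[s+1] on (y[s], u[s]) over all past data.
            Y, U = np.array(ys), np.array(us)
            Phi = np.column_stack([Y[:-1], U[:-1]])
            theta, *_ = np.linalg.lstsq(Phi, Y[1:], rcond=None)
            a_hat, b_hat = theta
            next_update *= 2  # assumed schedule: episodes of doubling length

        if t < T_warm:
            # Phase 1: pure Gaussian excitation to collect informative data.
            u = sigma_u * rng.standard_normal()
        else:
            # Phase 2: control from the current model plus additive
            # Gaussian exploration with decaying variance (assumed ~ 1/sqrt(t)).
            gain = a_hat / b_hat if abs(b_hat) > 1e-3 else 0.0
            eta = sigma_u * (t - T_warm + 1) ** -0.25 * rng.standard_normal()
            u = -gain * y + eta

        ys.append(y)
        us.append(u)
        x = a * x + b * u + 0.1 * rng.standard_normal()  # process noise w[t]

    return np.array(ys), np.array(us), (a_hat, b_hat)


ys, us, (a_hat, b_hat) = run_two_phase()
print(f"estimated a = {a_hat:.3f}, estimated b = {b_hat:.3f}")
```

In this sketch the exploration term is added to the control input independently of the data collected so far, which is the sense in which additive exploration is 'open-loop'. The FIM-based extension mentioned in the abstract is not modelled here.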

Files

Regret_Analysis_of_Learning-Ba... (pdf | 0.58 MB)

- Embargo expired on 24-01-2025

License info not available