Online policy iterations for optimal control of input-saturated systems

Conference Paper (2016)

Author(s)

S Baldi (TU Delft - Team Bart De Schutter)

Giorgio Valmorbida (University of Oxford)

Antonis Papachristodoulou (University of Oxford)

Elias B. Kosmatopoulos (Democritus University of Thrace, Centre for Research and Technology Hellas)

Research Group

Team Bart De Schutter

DOI related publication

https://doi.org/10.1109/ACC.2016.7526568

Convergence, Estimation, Optimal control, Trajectory, Lyapunov methods, Linear systems, Asymptotic stability

To reference this document use:

https://resolver.tudelft.nl/uuid:54f0e769-3fbc-4d5c-be5f-337321526f70

Publication Year

2016

Language

English

Pages (from-to)

5734-5739

ISBN (electronic)

978-1-4673-8682-1

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

This work proposes an online policy iteration procedure for the synthesis of sub-optimal control laws for uncertain Linear Time Invariant (LTI) Asymptotically Null-Controllable with Bounded Inputs (ANCBI) systems. The proposed policy iteration method relies on two steps: a policy evaluation step with a piecewise quadratic Lyapunov function in both the state and the deadzone functions of the input signals, and a policy improvement step which simultaneously guarantees near-optimality (exploitation) and persistence of excitation (exploration). The proposed approach guarantees convergence of the trajectory to a neighborhood of the origin. Moreover, the trajectories can be made arbitrarily close to the optimal one provided that the value function and the control policy are updated at a sufficiently fast rate. The inequalities required to hold at each policy evaluation step can be solved efficiently with semidefinite programming (SDP) solvers. A numerical example illustrates the results.
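The evaluate/improve loop named in the abstract can be illustrated with a much simpler stand-in. The sketch below is not the paper's algorithm: it is the classical model-based policy iteration for unsaturated LQR (Kleinman's iteration), run offline with a known model, with no piecewise quadratic Lyapunov function, no SDP, and no exploration term. It also shows a deadzone function under the common convention dz(u) = u - sat(u), which is assumed here rather than taken from the paper. All function names, the double-integrator example (an ANCBI system), the weights Q and R, and the initial gain K0 are hypothetical choices made for illustration.

```python
import numpy as np


def sat(u, u_max=1.0):
    """Componentwise saturation of u to the interval [-u_max, u_max]."""
    return np.clip(u, -u_max, u_max)


def dz(u, u_max=1.0):
    """Deadzone function (assumed convention): the part of u cut off by sat."""
    return u - sat(u, u_max)


def evaluate_policy(A, B, K, Q, R):
    """Policy evaluation: solve Ac' P + P Ac + Q + K' R K = 0, Ac = A - B K."""
    Ac = A - B @ K
    n = A.shape[0]
    M = Q + K.T @ R @ K
    # With row-major vectorization, vec(Ac' P + P Ac) = L vec(P).
    L = np.kron(Ac.T, np.eye(n)) + np.kron(np.eye(n), Ac.T)
    P = np.linalg.solve(L, -M.reshape(-1)).reshape(n, n)
    return 0.5 * (P + P.T)  # symmetrize against round-off


def improve_policy(B, P, R):
    """Policy improvement: K = R^{-1} B' P."""
    return np.linalg.solve(R, B.T @ P)


def policy_iteration(A, B, Q, R, K0, iters=20):
    """Alternate evaluation and improvement, starting from a stabilizing K0."""
    K = K0
    for _ in range(iters):
        P = evaluate_policy(A, B, K, Q, R)
        K = improve_policy(B, P, R)
    return K, P


# Hypothetical example: double integrator with unit weights.
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])
K0 = np.array([[1.0, 1.0]])  # stabilizing: A - B K0 has eigenvalues in the open left half-plane

K, P = policy_iteration(A, B, Q, R, K0)
print("gain:", K)
print("deadzone of [2.0, -0.5, -3.0]:", dz(np.array([2.0, -0.5, -3.0])))
```

For this example the iteration converges to the LQR gain K = [1, sqrt(3)], which can be checked by hand from the algebraic Riccati equation. The deadzone value is nonzero only for inputs that exceed the saturation level, which is why a Lyapunov function that depends on dz(u) reduces to an ordinary quadratic form in the unsaturated region.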

Files

Sat_resub_ACC3_final.pdf

(pdf | 0.218 Mb)

License info not available