Model+Learning-based Optimal Control
An Inverted Pendulum Study
S Baldi (TU Delft - Team Bart De Schutter, Southeast University)
Muhammad Ridho Rosa (Telkom University, Student TU Delft)
Yuzhang Wang (Student TU Delft, Southeast University)
Abstract
This work extends and compares some recent model+learning-based methodologies for optimal control with input saturation. We focus on two methodologies: a model-based actor-critic (MBAC) strategy, and a nonlinear policy iteration strategy. To evaluate their performance, these strategies are applied to the swing-up of an inverted pendulum. Numerical simulations show that the neural network approximation in the MBAC strategy can be poor, and the algorithm may converge far from the optimum. In the MBAC approach neither stabilization nor monotonic convergence can be guaranteed, and it is observed that the best value function does not always correspond to the last one. On the other hand, the nonlinear policy iteration approach guarantees that every new control policy is stabilizing and generally leads to a monotonically decreasing cost.
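As a concrete illustration of the benchmark problem, the following minimal sketch simulates a torque-saturated inverted pendulum and evaluates a candidate policy against a quadratic cost, the kind of rollout both compared strategies rely on. All parameter values (`m`, `l`, `b`, `u_max`, the weights `Q` and `R`) and the example policy are illustrative assumptions, not the paper's actual settings.

```python
import numpy as np

# Hypothetical pendulum parameters (the paper's actual values are not reproduced here)
m, l, g, b = 0.1, 0.5, 9.81, 0.01   # mass, length, gravity, viscous friction
u_max = 1.0                          # input saturation level
dt = 0.01                            # forward-Euler integration step

def dynamics(x, u):
    """Pendulum state x = [theta, theta_dot]; theta = 0 is upright. Torque u is saturated."""
    u = np.clip(u, -u_max, u_max)    # input saturation
    theta, omega = x
    alpha = (m * g * l * np.sin(theta) - b * omega + u) / (m * l ** 2)
    return np.array([omega, alpha])

def rollout(policy, x0, steps=500, Q=np.diag([5.0, 0.1]), R=1.0):
    """Simulate a policy from x0 and accumulate a quadratic state/input cost."""
    x, cost = np.array(x0, dtype=float), 0.0
    for _ in range(steps):
        u = float(np.clip(policy(x), -u_max, u_max))
        cost += (x @ Q @ x + R * u ** 2) * dt
        x = x + dt * dynamics(x, u)  # forward Euler step
    return cost

# Example: evaluate a naive proportional policy from the hanging-down position
print(rollout(lambda x: -2.0 * x[0] - 0.5 * x[1], x0=[np.pi, 0.0]))
```

In a policy iteration scheme, such rollouts (or their model-based counterparts) would be used to compare successive policies; the paper's observation is that this cost decreases monotonically for nonlinear policy iteration but not necessarily for MBAC.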