Conjugate Dynamic Programming

More Info
expand_more

Abstract

In decision making problems, the ability to compute the optimal solution can pose a serious challenge. Dynamic Programming (DP) aims to provide a framework to deal with a category of such problems, namely ones that involve sequential decision making. By dividing the original control problem into sub-problems and solving it backwards in time, from the end of the time horizon to the start, the method can compute a map of optimal solutions with respect to the initial condition. In order to divide the original problem into subproblems the DP method takes advantage of the principle of optimality, which states that a sub-solution of the optimal solution should be the optimal solution for the equivalent subproblem. In control systems, where the state and decision spaces are continuous, the original DP framework can be intractable due to the size of the discretization needed to simulate the continuous space. Therefore, efficient approximations are needed to solve such problems. One promising method is called Conjugate Dynamic Programming (CDP). The CDP algorithm is able to transform the original framework and solve problems in the conjugate domain providing a computational advantage over the standard method. In this work, we aim to improve and extend the setting under which the CDP algorithm operates, thus providing a more concrete advantage over standard method . In that regard, we will extract the optimal control actions from within the CDP algorithm, eliminating the need for solving an extra optimization problem for their computation. In addition, we will introduce a different interpolation technique that can outperform the current one, in certain scenarios, thus granting the user more choice when solving a decision making problem.