S.P. Nageshrao | TU Delft Repository

Model-based real-time control of a magnetic manipulator system

Conference paper (2017) - Jan Willem Damsteeg, Subramanya P. Nageshrao, Robert Babuška

Precise magnetic manipulation has numerous applications, ranging from manufacturing to the medical field. Owing to the nonlinear nature of the electromagnetic force, magnetic manipulation requires advanced nonlinear control. In this paper, we design and experimentally evaluate two nonlinear controllers for a magnetic manipulation (Magman) system, which consists of four electromagnetic coils arranged linearly. The current through the coils is controlled in order to accurately position a steel ball that rolls freely in a track above the coils. We benchmark two nonlinear control methods, namely feedback linearization and constrained state-dependent Riccati equation (SDRE) control. These methods are chosen for their widespread use in academia as well as in industrial applications. On the actual setup, constrained SDRE performed considerably better than feedback linearization in terms of settling time, overshoot, and control effort. ...
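The SDRE idea named in this abstract can be illustrated with a minimal scalar sketch. It is not the controller from the paper: the toy plant x' = x^3 + u, the factorization a(x) = x^2, b(x) = 1, the weights q and r, and the use of simple input clipping as the "constraint" are all assumptions made here for illustration. At each state the scalar Riccati equation 2*a*p - (b*p)^2/r + q = 0 is solved in closed form and the resulting gain is applied.

```python
import math


def sdre_gain(a, b, q=1.0, r=1.0):
    """Gain k = b*p/r, where p > 0 solves 2*a*p - (b*p)**2 / r + q = 0."""
    if b == 0.0:
        raise ValueError("b(x) must be nonzero for the scalar Riccati solution")
    p = r * (a + math.sqrt(a * a + b * b * q / r)) / (b * b)
    return b * p / r


def sdre_control(x, a_of_x, b_of_x, u_max, q=1.0, r=1.0):
    """Re-solve the Riccati equation at the current state, then clip the input.

    Clipping to [-u_max, u_max] is an assumed stand-in for the constraint
    handling; the paper's constrained SDRE may work differently.
    """
    a, b = a_of_x(x), b_of_x(x)
    u = -sdre_gain(a, b, q, r) * x
    return max(-u_max, min(u_max, u))


def simulate(x0=1.0, dt=0.01, steps=1000, u_max=5.0):
    """Forward-Euler simulation of the hypothetical plant x' = x**3 + u."""
    a_of_x = lambda x: x * x   # state-dependent coefficient: x**3 = a(x) * x
    b_of_x = lambda x: 1.0     # constant input gain (assumption)
    x = x0
    for _ in range(steps):
        u = sdre_control(x, a_of_x, b_of_x, u_max)
        x += dt * (a_of_x(x) * x + b_of_x(x) * u)
    return x
```

Calling `simulate()` drives the toy state from 1.0 toward the origin. With clipping active, convergence is not guaranteed for large initial states, because the saturated input may be too small to dominate the x^3 term.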

Actor-critic reinforcement learning for tracking control in robotics

Conference paper (2016) - Yudha P. Pane, Subramanya P. Nageshrao, Robert Babuska

In this article we provide experimental results and an evaluation of a compensation method which improves the tracking performance of a nominal feedback controller by means of reinforcement learning (RL). The compensator is based on the actor-critic scheme and adds a correction signal to the nominal control input, with the goal of improving the tracking performance through on-line learning. The algorithm has been evaluated on a 6-DOF industrial robot manipulator with the objective of accurately tracking different types of reference trajectories. An extensive experimental study has shown that the proposed RL-based compensation method significantly improves the performance of the nominal feedback controller. ...
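The additive structure described in this abstract, a learned correction added to a nominal control input, can be sketched as follows. This is a generic actor-critic with linear function approximation, not the algorithm from the paper: the class name, the feature vector `phi`, the learning rates, and the Gaussian exploration are all hypothetical choices made for illustration.

```python
import random


def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(ui * vi for ui, vi in zip(u, v))


class ActorCriticCompensator:
    """Linear actor-critic that outputs a correction to a nominal input."""

    def __init__(self, n_features, alpha_actor=0.01, alpha_critic=0.1,
                 gamma=0.95, sigma=0.1, seed=0):
        self.theta = [0.0] * n_features  # actor weights (correction policy)
        self.w = [0.0] * n_features      # critic weights (value estimate)
        self.alpha_actor = alpha_actor
        self.alpha_critic = alpha_critic
        self.gamma = gamma
        self.sigma = sigma
        self.rng = random.Random(seed)

    def correction(self, phi, explore=True):
        """Return (correction, exploration noise) for the features phi."""
        noise = self.rng.gauss(0.0, self.sigma) if explore else 0.0
        return dot(self.theta, phi) + noise, noise

    def update(self, phi, noise, reward, phi_next):
        """One temporal-difference update of critic and actor."""
        delta = reward + self.gamma * dot(self.w, phi_next) - dot(self.w, phi)
        for i, f in enumerate(phi):
            self.w[i] += self.alpha_critic * delta * f
            # Reinforce the explored direction when the TD error is positive.
            self.theta[i] += self.alpha_actor * delta * noise * f
        return delta


def compensated_control(u_nominal, compensator, phi, explore=True):
    """Total input = nominal feedback input + learned correction."""
    du, noise = compensator.correction(phi, explore)
    return u_nominal + du, noise
```

Before any learning the actor weights are zero, so with exploration switched off the compensated input equals the nominal one. The nominal controller therefore sets the baseline behavior and the learned term only adjusts it.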

Online learning algorithms

For passivity-based and distributed control

Doctoral thesis (2016) - Subramanya Nageshrao

Over the last couple of decades the demand for high precision and enhanced performance of physical systems has been steadily increasing. This demand often results in miniaturization and complex design, thus increasing the need for complex nonlinear control methods. Some state-of-the-art nonlinear methods are stymied by the requirement of full state information, model and parameter uncertainties, mathematical complexity, etc. For many scenarios it is nearly impossible to consider all the uncertainties during the design of a feedback controller. Additionally, there is no standard mechanism to incorporate performance measures when designing a model-based nonlinear controller. Some of these issues can be addressed by using online learning. Animals and humans have the ability to share, explore, act or respond, memorize the outcome and repeat the task to achieve a better outcome when they encounter the same or a similar scenario. This is called learning from interaction. One instance of this approach is reinforcement learning (RL). However, RL methods are hindered by the curse of dimensionality, non-interpretability and non-monotonic convergence of the learning algorithms. This can be attributed to the intrinsic characteristics of RL: it is a model-free approach, and hence no standard mechanism exists to incorporate a priori model information. In this thesis, learning methods are proposed which explicitly use the available system knowledge. This can be seen as a new class of approaches that bridge model-based and model-free methods. These methods can address some of the hurdles mentioned earlier. For example, i) prior system information can speed up the learning, ii) new control objectives can be achieved which otherwise would be extremely difficult to attain using only model-based methods, iii) physical meaning can be attributed to the learned controller.
The developed approach is as follows: the model of the given physical system is represented in the port-Hamiltonian (PH) form. For the system dynamics in PH form a passivity-based control (PBC) law is formulated, which often requires the solution to a set of partial differential equations (PDEs). Instead of finding an analytical solution, the PBC control law is parameterized using an unknown parameter vector. Then, by using a variation of the standard actor-critic learning algorithm, the unknown parameters can be learned online. Using the principles of stochastic approximation theory, a proof of convergence for the developed method is shown. The proposed methods are evaluated for the stabilization and regulation of mechanical and electro-mechanical systems. The simulation and experimental results show comparable learning curves. In the final part of the thesis a novel integral reinforcement learning approach is developed to solve the optimal output tracking control problem for a set of linear heterogeneous multi-agent systems. Unlike existing methods, this approach neither needs to solve the output regulator equation nor requires a p-copy of the leader’s dynamics in the agent’s control law. A detailed numerical evaluation has been conducted to show the feasibility of the developed method.

Optimal model-free output synchronization of heterogeneous systems using off-policy reinforcement learning

Journal article (2016) - H Modares, Subramanya Nageshrao, Gabriel Delgado Lopes, Robert Babuska, FL Lewis

This paper considers optimal output synchronization of heterogeneous linear multi-agent systems. Standard approaches to output synchronization of heterogeneous systems require either the solution of the output regulator equations or the incorporation of a p-copy of the leader’s dynamics in the controller of each agent. By contrast, in this paper neither is needed. Moreover, both the leader’s and the followers’ dynamics are assumed to be unknown. First, a distributed adaptive observer is designed to estimate the leader’s state for each agent. The output synchronization problem is then formulated as an optimal control problem, and a novel model-free off-policy reinforcement learning algorithm is developed to solve it online in real time. It is shown that this optimal distributed approach implicitly solves the output regulator equations without explicitly computing their solution. Simulation results are provided to verify the effectiveness of the proposed approach. ...

Learning complex behaviors via sequential composition and passivity-based control

Book chapter (2015) - GAD Lopes, E Najafi, SP Nageshrao, R Babuska