Learning-based control for pushing with a non-holonomic mobile robot


Abstract

Mobile robots are becoming increasingly common in warehouses, distribution centers, and factories, where they are used to boost productivity. At the same time, they are moving into the everyday world, where they need to operate in uncontrolled and cluttered environments. To extend the set of tasks that a robot can autonomously accomplish in such environments, we can equip mobile bases with a pushing skill. Possible use cases are delivering packages by pushing objects to a goal location, or clearing a path by pushing movable obstacles out of the way.

This thesis considers pushing with a non-holonomic mobile robot in cluttered environments. We develop a learning-based method that uses ensembles of probabilistic neural networks to learn the object's dynamics. The models capture the inherent stochasticity of pushing that arises from the variability of friction. In addition, the ensembles capture the uncertainty that arises from a lack of data, so we can reason about which regions to avoid during pushing and prevent the system from entering unrecoverable states.
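The two uncertainty sources named above can be separated with a standard ensemble decomposition: aleatoric uncertainty is the average noise predicted by the members, while epistemic uncertainty is the disagreement between their mean predictions. The following sketch illustrates this idea on a toy one-dimensional model; the class names and the linear stand-in for a neural network are illustrative assumptions, not the thesis implementation.

```python
import math
import random

class ProbabilisticModel:
    """Toy stand-in for one probabilistic neural network: predicts a
    mean and a variance for the pushed object's next state."""

    def __init__(self, seed):
        rng = random.Random(seed)
        self.w = 1.0 + 0.1 * rng.gauss(0, 1)  # member-specific weight
        self.log_var = -2.0                   # learned aleatoric noise level

    def predict(self, x):
        return self.w * x, math.exp(self.log_var)

class EnsembleDynamics:
    """Ensemble of probabilistic models with uncertainty decomposition."""

    def __init__(self, n_members=5):
        self.members = [ProbabilisticModel(s) for s in range(n_members)]

    def predict(self, x):
        means, variances = zip(*(m.predict(x) for m in self.members))
        mu = sum(means) / len(means)
        # Aleatoric: average predicted noise (stochasticity of friction).
        aleatoric = sum(variances) / len(variances)
        # Epistemic: disagreement between members (lack of training data).
        epistemic = sum((m - mu) ** 2 for m in means) / len(means)
        return mu, aleatoric, epistemic

ens = EnsembleDynamics()
mu, aleatoric, epistemic = ens.predict(2.0)
```

In regions far from the training data, the members extrapolate differently, so the epistemic term grows; a planner can penalize such regions to stay where the model is reliable.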

The model is combined with a Model Predictive Path Integral (MPPI) controller. At each timestep the controller optimizes over a finite time horizon, which allows it to recover from slight model inaccuracies. The MPPI controller can handle complex cost functions, which allows us to include the estimated uncertainty from the model and avoid uncertain regions. For obstacle-aware pushing, we provide the MPPI controller with a costmap grid containing the free pushing space, which the robot and object can use to maneuver to the goal location. Both the robot and the object are constrained to stay within the free pushing space, which is constructed from LiDAR measurements.
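The MPPI update described above follows a generic pattern: sample perturbed control sequences, roll each out through the dynamics model, score the rollouts with the cost function (including a large penalty for leaving the free pushing space), and average the sequences with exponential cost weights. The sketch below shows this pattern in one dimension; the dynamics, cost terms, and `in_free_space` check are placeholders, not the thesis code.

```python
import math
import random

def dynamics(x, u):
    """Placeholder point-mass model; the thesis uses a learned model."""
    return x + 0.1 * u

def cost(x, goal, in_free_space):
    c = (x - goal) ** 2          # distance-to-goal term
    if not in_free_space(x):
        c += 100.0               # costmap penalty for leaving free space
    return c

def mppi_step(x0, u_nom, goal, in_free_space,
              n_samples=64, horizon=10, sigma=0.5, lam=1.0, seed=0):
    rng = random.Random(seed)
    noises, costs = [], []
    for _ in range(n_samples):
        eps = [rng.gauss(0.0, sigma) for _ in range(horizon)]
        x, total = x0, 0.0
        for t in range(horizon):
            x = dynamics(x, u_nom[t] + eps[t])
            total += cost(x, goal, in_free_space)
        noises.append(eps)
        costs.append(total)
    # Exponentially weight the sampled sequences by their total cost.
    c_min = min(costs)
    w = [math.exp(-(c - c_min) / lam) for c in costs]
    w_sum = sum(w)
    return [u_nom[t]
            + sum(w[k] * noises[k][t] for k in range(n_samples)) / w_sum
            for t in range(horizon)]

u = mppi_step(0.0, [0.0] * 10, goal=1.0,
              in_free_space=lambda x: abs(x) < 2.0)
```

Because only the first control of the optimized sequence is executed before re-planning, small errors in the learned dynamics are corrected at the next timestep.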

We verify the method in simulation and compare it to two baseline methods: the state-of-the-art adaptive pushing controller proposed by Krivic et al. [31] and a pushing policy trained with the model-free reinforcement learning baseline Soft Actor-Critic (SAC). All three methods reach comparable success rates in simulation of 100%, 98%, and 96%, respectively. Furthermore, the learning-based MPPI improves accuracy by 50% and 23% relative to the two baselines. The results are consistent for pushing in cluttered environments, where we show a 43% increase in accuracy compared to Krivic et al. [31].

Lastly, we validate the approach in a real-world setting, where the robot and object are tracked with a motion capture system. The approach achieves an overall success rate of 94% and successfully pushes objects in a cluttered environment.