Optimizing a Robot Fleet Scheduling Model and Floorplan using Max-Plus Linear Algebra and Deep Q-Learning

Abstract

Machine automation is becoming increasingly widespread and advanced. One example is the use of robots at Prime Vision, which sorts parcels for postal services. Coordinating a fleet of robots that pick up and drop off many parcels while avoiding collisions, within a limited space and along predefined routes in a floorplan, is a complex scheduling problem.

This logistical challenge can be modelled effectively using max-plus linear algebra, which allows the route scheduling to be optimized, as was previously done by L. Smeets. The goal of this research is to improve the existing scheduling model and to use it to develop a reinforcement learning-based algorithm that determines the optimal floorplan for the parcel delivery robots. Two methods are applied to improve the existing scheduling model. First, nodes where no decisions are made are identified and removed. Second, certain constraints are removed to simplify the model.
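
As a brief illustration of why max-plus algebra suits scheduling, the sketch below shows the max-plus matrix-vector product, in which "addition" is max and "multiplication" is +. It is not the thesis model: the matrix A, the vector x0 and the function name maxplus_matvec are hypothetical values and names chosen for illustration only. Here an entry A[i][j] is read as the time that must pass after event j before event i can occur.

```python
NEG_INF = float("-inf")  # max-plus "zero": event j does not constrain event i


def maxplus_matvec(A, x):
    """Max-plus product: (A (x) x)[i] = max over j of (A[i][j] + x[j])."""
    return [max(a_ij + x_j for a_ij, x_j in zip(row, x)) for row in A]


# Hypothetical two-event system (illustrative numbers, not from the thesis).
A = [
    [3.0, NEG_INF],  # event 0 waits 3 time units after its own previous firing
    [5.0, 2.0],      # event 1 waits for event 0 (+5) and for itself (+2)
]
x0 = [0.0, 1.0]      # assumed initial firing times

x1 = maxplus_matvec(A, x0)  # firing times in the next cycle
x2 = maxplus_matvec(A, x1)  # and the cycle after that
print(x1, x2)
```

Each new firing time is the latest of all the times it must wait for, which is how synchronization constraints such as "a robot may only enter a segment once the previous robot has left it" become linear in this algebra.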

The results of the scheduler are used to determine a key performance indicator that allows a reinforcement learning-based algorithm to identify the optimal floorplan for the robots. The reinforcement learning algorithm employed a deep Q-learning approach, with the neural network trained using various action-space approaches, tuned rewards and hyperparameters. The epsilon-greedy method was applied to address the exploration vs. exploitation trade-off. While the scheduler improvements significantly reduced its computational cost, the neural network did not converge; the potential causes are discussed in detail.
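
The exploration rule mentioned above can be sketched in a few lines. This is a generic, minimal version of epsilon-greedy action selection, not the implementation used in the thesis: the function name epsilon_greedy, the example Q-values and the choice of epsilon are all assumptions made for illustration. In deep Q-learning the q_values list would come from the neural network's output for the current state.

```python
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """Pick a random action with probability epsilon, else the best-valued one."""
    if rng.random() < epsilon:
        # Explore: choose any action uniformly at random.
        return rng.randrange(len(q_values))
    # Exploit: choose the action with the highest estimated Q-value.
    return max(range(len(q_values)), key=lambda a: q_values[a])


# Hypothetical Q-value estimates for three candidate floorplan changes.
q_values = [0.2, 1.5, -0.3]

greedy_action = epsilon_greedy(q_values, epsilon=0.0)   # always exploits
random_action = epsilon_greedy(q_values, epsilon=1.0)   # always explores
print(greedy_action, random_action)
```

A common design choice, assumed here rather than taken from the source, is to start with a large epsilon and decrease it during training so that the agent explores early and exploits later.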

Files

Master_Thesis_-_Emma_Boelen.pd... (.pdf)

File under embargo until 30-08-2026