Scheduling a Flexible Manufacturing System

A reinforcement learning based approach

Master Thesis (2023)
Author(s)

C.H. Pennings (TU Delft - Mechanical Engineering)

Contributor(s)

T. Keviczky – Mentor (TU Delft - Team Tamas Keviczky)

N. Yorke-Smith – Graduation committee member (TU Delft - Algorithmics)

Faculty
Mechanical Engineering
Publication Year
2023
Language
English
Graduation Date
12-06-2023
Awarding Institution
Delft University of Technology
Programme
Mechanical Engineering, Systems and Control
Downloads counter
215
Collections
thesis
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

A flexible manufacturing system (FMS) has advantages over traditional manufacturing systems because it can respond to unforeseen circumstances, such as changes in demand or component breakdowns, by re-routing. However, this flexibility makes such a system more complex to control. Traditionally, the system model is simplified to reduce the solution space by removing intra-machine transportation complexities. This thesis explores how these complexities can be retained and accounted for during scheduling. A scheme is used in which short-term schedules are continuously recalculated to determine the optimal schedule over the next timeframe. The flexible job shop scheduling problem with transport (FJSPT) is used to represent the complexities of the FMS. Calculating part-schedules repeatedly requires a fast constructive search method, and the AlphaZero framework is identified as a fitting candidate. The FJSPT is translated into the reinforcement learning framework using a reduced action space, a graph neural network based state representation, and a normalized reward function. A naive normalization approach for the reward function is found to introduce problems in the sensitivity of the value function, while other adaptive methods show fundamental flaws. A novel normalization method is introduced that uses min-max adaptive normalization and suboptimal node inclusion to improve the value function training data. Implementing and training the algorithm shows that the method performs poorly in comparison to metaheuristic-based algorithms for the FJSPT. The value function is not able to converge on the training data, even though this convergence is critical for the self-improvement training of the algorithm. Future work should focus on developing a normalized value function that is sensitive to solution quality and able to converge. Despite these challenges, the work provides insights into the complexities of implementing AlphaZero for combinatorial optimization.
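
The abstract names min-max adaptive normalization of the reward but does not spell out a formula. The Python sketch below shows one generic reading of that idea, not the method from the thesis. It assumes the raw objective is a makespan to be minimized, that rewards are scaled to [-1, 1] as is common for AlphaZero-style value targets, and that the bounds adapt to the best and worst values observed so far. The class name `MinMaxRewardNormalizer` and the sample makespans are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class MinMaxRewardNormalizer:
    """Hypothetical helper: tracks the best and worst makespan seen so far."""

    best: float = float("inf")    # smallest makespan observed
    worst: float = float("-inf")  # largest makespan observed

    def update(self, makespan: float) -> None:
        """Adapt the bounds to a newly observed makespan."""
        self.best = min(self.best, makespan)
        self.worst = max(self.worst, makespan)

    def normalize(self, makespan: float) -> float:
        """Map a makespan to [-1, 1]; shorter schedules score higher."""
        if self.worst <= self.best:
            # Fewer than two distinct values seen: no range to scale against.
            return 0.0
        scaled = (self.worst - makespan) / (self.worst - self.best)
        scaled = min(1.0, max(0.0, scaled))  # clip values outside the bounds
        return 2.0 * scaled - 1.0


# Assumed sample makespans, for illustration only.
normalizer = MinMaxRewardNormalizer()
for observed in (120.0, 95.0, 140.0):
    normalizer.update(observed)

rewards = [normalizer.normalize(m) for m in (95.0, 117.5, 140.0)]
```

In this sketch a single very poor schedule widens the range, so near-optimal schedules are squeezed into a narrow band close to 1. This may be related to the sensitivity concern the abstract raises, although the abstract does not state the mechanism.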

Files

Thesis_Casper_Pennings.pdf
(pdf | 3.54 MB)
License info not available