Mixed Integer (Non-) Linear Programming Formulations of Graph Neural Networks

Master Thesis (2022)
Author(s)

T.H.J.N. Mc Donald (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

Artur Schweidtmann – Mentor (TU Delft - ChemE/Product and Process Engineering)

N. Yorke-Smith – Graduation committee member (TU Delft - Algorithmics)

Faculty
Electrical Engineering, Mathematics and Computer Science
Copyright
© 2022 Tom Mc Donald
Publication Year
2022
Language
English
Graduation Date
11-11-2022
Awarding Institution
Delft University of Technology
Programme
Applied Mathematics
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Recently, ReLU neural networks have been modelled as constraints in mixed integer linear programming (MILP), enabling surrogate-based optimisation in various domains as well as the efficient solution of machine learning verification problems. However, previous work has been limited to multilayer perceptrons (MLPs). The Graph Convolutional Neural Network (GCN) and GraphSAGE models can learn efficiently from non-Euclidean data structures. We propose a bilinear formulation for ReLU GCNs and an MILP formulation for ReLU GraphSAGE models. We benchmark our formulations against a Genetic Algorithm (GA), comparing solution times and optimality gaps on a dataset of molecular boiling points. Our method is guaranteed to solve optimisation problems with embedded trained GNNs to global optimality. Of our two formulations, the GraphSAGE model achieves accuracy similar to that of the GCN and solves faster when embedded as a surrogate model in an MILP problem. Finally, we present a computer-aided molecular design (CAMD) case study in which the formulations of the trained GNNs are used to find molecules with optimal boiling points.
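To illustrate what modelling a ReLU as MILP constraints can look like, the following minimal sketch uses the standard big-M encoding of a single unit y = max(0, x). It is not taken from the thesis, whose exact formulations are not given in this abstract. The bounds `lower` < 0 < `upper` on the pre-activation and the function names are assumptions made for illustration. With a binary variable z, the constraints are y >= x, y >= 0, y <= x - lower * (1 - z) and y <= upper * z; the code checks, for a fixed x, which values of y each choice of z permits.

```python
def bigm_relu_interval(x, z, lower, upper):
    """Range of y allowed by the big-M constraints for fixed x and binary z.

    Constraints (assumed standard encoding, with lower <= x <= upper):
        y >= x,  y >= 0,  y <= x - lower * (1 - z),  y <= upper * z
    Returns (lo, hi), or None when no y satisfies all four constraints.
    """
    lo = max(x, 0.0)                          # from y >= x and y >= 0
    hi = min(x - lower * (1 - z), upper * z)  # from the two upper bounds
    return (lo, hi) if lo <= hi else None


def feasible_outputs(x, lower, upper):
    """List (z, (lo, hi)) for every binary z that leaves a feasible y."""
    results = []
    for z in (0, 1):
        interval = bigm_relu_interval(x, z, lower, upper)
        if interval is not None:
            results.append((z, interval))
    return results


if __name__ == "__main__":
    # Hypothetical bounds on the pre-activation, chosen for illustration.
    for x in (2.0, -3.0):
        print(x, feasible_outputs(x, lower=-5.0, upper=5.0))
```

In both cases the feasible interval collapses to the single point y = max(0, x), which is why a solver can treat the trained network as an exact set of linear constraints with binary variables. Tighter bounds generally give a stronger relaxation.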

Files

TomMcDonaldThesis.pdf
(pdf | 18.2 Mb)
License info not available