Controlling Poisson Flow Generative Model

Implementing a class-conditional generative model

Bachelor Thesis (2023)
Author(s)

I. GEORGIADES (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

Y. Chen – Mentor (TU Delft - Data-Intensive Systems)

Z. Zhao – Graduation committee member (TU Delft - Data-Intensive Systems)

Faculty
Electrical Engineering, Mathematics and Computer Science
Copyright
© 2023 IOANNIS GEORGIADES
Publication Year
2023
Language
English
Graduation Date
27-06-2023
Awarding Institution
Delft University of Technology
Project
CSE3000 Research Project; Comics Illustration Synthesizer using Deep Generative Models
Programme
Computer Science and Engineering
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

This paper presents and explores the capabilities of the newly introduced Poisson Flow Generative Model (PFGM). More specifically, this work introduces the Conditional Poisson Flow Generative Model (CoPFGM), which extends the existing PFGM repository so that the model can be trained for conditional image sampling. The work aims to provide a modular solution that can be easily adapted to multiple datasets, both custom ones and datasets available directly from large Python libraries such as PyTorch and TensorFlow. The proposed CoPFGM consists of two steps: (i) modifying the input of the underlying UNet and (ii) modifying the loss function. For (i), the input channels of every image are augmented with one-hot-like class-conditional images; for (ii), an updated loss function is introduced that incorporates the cross-entropy loss of the generated images during training. The proposed model is evaluated on two datasets: MNIST and the Dilbert dataset, the latter consisting of 1,100 custom images of the faces of six characters from the Dilbert comic strip. Results are presented in the form of an ablation study, which demonstrates the conditional behavior induced by the channel augmentation and the improvement in class representation achieved with the cross-entropy loss.
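The channel-augmentation step (i) can be illustrated with a minimal sketch: each image gains one extra plane per class, with the plane matching the sample's label filled with ones and the rest with zeros. The function name, shapes, and the NumPy formulation below are illustrative assumptions, not taken from the CoPFGM code base.

```python
import numpy as np

def augment_with_class_channels(images, labels, num_classes):
    """Append one-hot-like class maps as extra input channels.

    images: (batch, channels, H, W) array; labels: (batch,) integer class ids.
    For each sample, num_classes constant planes are appended; the plane
    whose index equals the label is all ones, the others are all zeros.
    """
    b, c, h, w = images.shape
    class_maps = np.zeros((b, num_classes, h, w), dtype=images.dtype)
    class_maps[np.arange(b), labels] = 1.0  # set the labeled plane to ones
    return np.concatenate([images, class_maps], axis=1)

# Example: two 1-channel 4x4 "images" with labels 0 and 2
batch = np.random.rand(2, 1, 4, 4).astype(np.float32)
out = augment_with_class_channels(batch, np.array([0, 2]), num_classes=3)
# each sample now has 1 + 3 = 4 channels
```

The UNet then only needs its first convolution widened to accept the extra channels; at sampling time, filling in the desired class plane steers generation toward that class.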

Files

License info not available