Controlling Poisson Flow Generative Model

Implementing a class-conditional generative model

Bachelor Thesis (2023)
Author(s)

I. GEORGIADES (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

Y. Chen – Mentor (TU Delft - Data-Intensive Systems)

Z. Zhao – Graduation committee member (TU Delft - Data-Intensive Systems)

Faculty
Electrical Engineering, Mathematics and Computer Science
Copyright
© 2023 IOANNIS GEORGIADES
Publication Year
2023
Language
English
Graduation Date
27-06-2023
Awarding Institution
Delft University of Technology
Project
CSE3000 Research Project; Comics Illustration Synthesizer using Deep Generative Models
Programme
Computer Science and Engineering
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

This paper presents and explores the capabilities of the newly introduced Poisson Flow Generative Model (PFGM). More specifically, this work introduces the Conditional Poisson Flow Generative Model (CoPFGM), which extends the existing PFGM repository so that the model can be trained for conditional image sampling. The work aims to provide a modular solution that can be easily adapted to multiple datasets, both custom ones and datasets available directly from large Python libraries such as PyTorch and TensorFlow. The proposed CoPFGM consists of two steps: (i) modifying the input of the underlying UNet and (ii) modifying the loss function. For (i), the input channels of every image are augmented with one-hot-like class-conditional images; for (ii), an updated loss function is introduced that incorporates the cross-entropy loss of the generated images during training. The proposed model is evaluated on two datasets: MNIST and the Dilbert dataset, the latter consisting of 1,100 custom images of the faces of six characters from the Dilbert comic strip. Results are presented in the form of an ablation study, which demonstrates the conditional behavior induced by the channel augmentation and the improvement in class representation achieved with the cross-entropy loss.
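The channel-augmentation step (i) can be illustrated with a minimal sketch: each image gains one extra plane per class, with the plane matching the sample's label filled with ones and the rest with zeros. The function name, shapes, and the NumPy formulation below are illustrative assumptions, not taken from the CoPFGM code base.

```python
import numpy as np

def augment_with_class_channels(images, labels, num_classes):
    """Append one-hot-like class maps as extra input channels.

    images: (batch, channels, H, W) array; labels: (batch,) integer class ids.
    For each sample, num_classes constant planes are appended; the plane
    whose index equals the label is all ones, the others are all zeros.
    """
    b, c, h, w = images.shape
    class_maps = np.zeros((b, num_classes, h, w), dtype=images.dtype)
    class_maps[np.arange(b), labels] = 1.0  # set the labeled plane to ones
    return np.concatenate([images, class_maps], axis=1)

# Example: two 1-channel 4x4 "images" with labels 0 and 2
batch = np.random.rand(2, 1, 4, 4).astype(np.float32)
out = augment_with_class_channels(batch, np.array([0, 2]), num_classes=3)
# each sample now has 1 + 3 = 4 channels
```

The UNet then only needs its first convolution widened to accept the extra channels; at sampling time, filling in the desired class plane steers generation toward that class.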

Files

License info not available