Synthesizing Comics via Conditional Generative Adversarial Networks

Author: Morris, Darwin (TU Delft Electrical Engineering, Mathematics and Computer Science)
Contributors: Chen, Lydia Y. (mentor); Zhao, Z. (mentor); van Deursen, A. (graduation committee)
Degree granting institution: Delft University of Technology
Programme: Computer Science and Engineering
Project: CSE3000 Research Project
Date: 2021-07-01

Abstract:
The creation of comic illustrations is a complex artistic process resulting in a wide variety of styles, each unique to the artist. Conditional image synthesis refers to the generation of de novo images based on certain preconditions. Applying machine learning to conditionally generate novel comics is an intriguing yet difficult task. This paper aims to answer whether Generative Adversarial Networks (GANs) can be used for conditional comic synthesis. Recent advancements in GANs have raised image synthesis to hyper-realistic levels; despite this, the performance of GAN models is almost always assessed on photo-realistic images. To extend experimental knowledge of unconditional GAN performance into the domain of comics, an empirical analysis was performed on the unconditioned generative performance of three cutting-edge GAN architectures: Deep Convolutional GAN (DCGAN), Wasserstein GAN (WGAN), and Stability GAN (SGAN). This paper shows that the SGAN implementation far outperforms both the DCGAN and WGAN architectures on a dataset of Dilbert comics, achieving an FID score of 89.1. Due to their relative simplicity, comics are an intriguing candidate for conditional generation: a comic panel can often be described using a few specific labels (e.g. background and characters). Two conditional networks were created, using the SGAN architecture as a baseline.
Multi-Class SGAN (MC-SGAN) used a traditional multi-class conditional approach, while Multi-Label SGAN (ML-SGAN) used a multi-label auxiliary classification approach. Multiple experiments were performed between these two networks, amounting to hundreds of hours of training. While performance between the networks was quite similar on simple conditional tasks, MC-SGAN outperformed ML-SGAN on more complex tasks. MC-SGAN was able to conditionally generate comics based on character and color, with the desired conditions distinguishable in almost all outputs. Issues with traditional methods of auxiliary classifier training in the MC-SGAN implementation are additionally identified and discussed.

Subject: Generative Adversarial Network; Machine Learning; image synthesis
To reference this document use: http://resolver.tudelft.nl/uuid:2a21a50a-81c4-4d0d-9dbc-5dbce67f2936
Part of collection: Student theses
Document type: bachelor thesis
Rights: © 2021 Darwin Morris
Files: Darwin_Morris_Thesis.pdf (PDF, 3.95 MB)
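For readers unfamiliar with the two conditioning schemes contrasted in the abstract, the sketch below illustrates the standard way a conditional GAN generator's input is formed in each case. This is background illustration only, not the thesis code: the function names, latent dimension, and label counts are assumptions chosen for the example.

```python
import numpy as np

# Illustrative only -- these dimensions are assumptions, not the thesis setup.
LATENT_DIM = 64    # size of the random noise vector fed to the generator
NUM_CLASSES = 4    # multi-class: one mutually exclusive class per image
NUM_LABELS = 4     # multi-label: independent tags, e.g. character, color

def mc_generator_input(z, class_idx):
    """Multi-class conditioning (MC-style): append a one-hot class vector
    to the noise, so exactly one condition is active per sample."""
    one_hot = np.zeros(NUM_CLASSES)
    one_hot[class_idx] = 1.0
    return np.concatenate([z, one_hot])

def ml_generator_input(z, active_labels):
    """Multi-label conditioning (ML-style): append a multi-hot vector,
    so several conditions (e.g. a character AND a color) can be active
    at once; an auxiliary classifier would then predict these labels."""
    multi_hot = np.zeros(NUM_LABELS)
    multi_hot[list(active_labels)] = 1.0
    return np.concatenate([z, multi_hot])

z = np.random.randn(LATENT_DIM)
print(mc_generator_input(z, 2).shape)        # (68,)
print(ml_generator_input(z, [0, 3]).shape)   # (68,)
```

The practical difference is visible in the second function: a multi-hot vector lets the generator be asked for combinations of conditions (a specific character in a specific color) rather than one class drawn from a fixed set, which is why the abstract treats the two as distinct conditional tasks.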