HaarCNN: Detection and Recognition of Comic Strip Characters with Deep Learning and Cascade Classifiers

Bachelor Thesis (2021)
Author(s)

B.M. Kotlicki (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

Y. Chen – Mentor (TU Delft - Data-Intensive Systems)

Zilong Zhao – Mentor (TU Delft - Data-Intensive Systems)

Arie Van Deursen – Graduation committee member (TU Delft - Software Technology)

Faculty
Electrical Engineering, Mathematics and Computer Science
Copyright
© 2021 Bartlomiej Kotlicki
More Info
Publication Year
2021
Language
English
Graduation Date
01-07-2021
Awarding Institution
Delft University of Technology
Project
CSE3000 Research Project; Comics Illustration Synthesizer using Generative Adversarial Networks
Programme
Computer Science and Engineering
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Object detection and recognition is a computer vision problem commonly tackled with techniques such as convolutional neural networks or cascade classifiers. This paper applies similar methods to characters in comic strips. We combine cascade classifiers with various convolutional neural network architectures for character detection and recognition, with the aim of keeping computational overhead low. An alternative that adds a selective search step was also explored. We call the resulting pipeline HaarCNN and compare it to standard methods to check whether it offers an improvement. We evaluate on 750 images extracted from comic strips and achieve over 85% precision and around 80% recall on detected faces, with over 80% of main characters recognized correctly. The images were processed in around 200 seconds. Character annotations of this quality could be useful in deep learning sub-fields such as generative adversarial networks.
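The two-stage idea in the abstract (a detector proposes face boxes, a classifier labels each box) and the precision/recall figures it reports can be illustrated with a minimal sketch. This is not the thesis implementation: the function names, the `(x, y, width, height)` box format, the greedy matching rule, and the 0.5 IoU threshold are all assumptions made for illustration. In practice `detect` might wrap something like OpenCV's `cv2.CascadeClassifier.detectMultiScale` and `classify` a trained CNN, but here both are plain callables so the example stays self-contained.

```python
from typing import Any, Callable, List, Tuple

Box = Tuple[int, int, int, int]  # assumed format: (x, y, width, height)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    inter_w = max(0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union else 0.0


def precision_recall(
    detected: List[Box], truth: List[Box], threshold: float = 0.5
) -> Tuple[float, float]:
    """Greedily match detections to ground-truth boxes.

    A detection counts as a true positive if its best remaining
    ground-truth box overlaps it with IoU >= threshold (an assumed rule).
    """
    unmatched = list(truth)
    true_positives = 0
    for det in detected:
        best = max(unmatched, key=lambda t: iou(det, t), default=None)
        if best is not None and iou(det, best) >= threshold:
            unmatched.remove(best)  # each ground-truth box matches once
            true_positives += 1
    precision = true_positives / len(detected) if detected else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


def detect_and_recognize(
    image: Any,
    detect: Callable[[Any], List[Box]],
    classify: Callable[[Any, Box], str],
) -> List[Tuple[Box, str]]:
    """Stage 1 proposes boxes; stage 2 labels each proposed box."""
    return [(box, classify(image, box)) for box in detect(image)]


if __name__ == "__main__":
    # Hypothetical stand-ins for a cascade detector and a CNN classifier.
    stub_detect = lambda img: [(0, 0, 10, 10), (50, 50, 10, 10)]
    stub_classify = lambda img, box: "main" if box[0] == 0 else "other"
    print(detect_and_recognize(None, stub_detect, stub_classify))
```

Keeping the detector and classifier as interchangeable callables mirrors the comparison described in the abstract, where different CNN architectures (or a selective search step) can be swapped in without changing the rest of the pipeline.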
