Data Generation Methods for Multi-Label Images of LEGO Bricks

Bachelor Thesis (2020)
Author(s)

B.H. Kam (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

A. Lengyel – Mentor (TU Delft - Pattern Recognition and Bioinformatics)

J.C. Gemert – Mentor (TU Delft - Pattern Recognition and Bioinformatics)

Ricardo Guerra Marroquim – Graduation committee member (TU Delft - Computer Graphics and Visualisation)

Faculty
Electrical Engineering, Mathematics and Computer Science
Copyright
© 2020 Berend Kam
Publication Year
2020
Language
English
Graduation Date
22-06-2020
Awarding Institution
Delft University of Technology
Project
CSE3000 Research Project, Can deep learning recognize LEGO pieces?
Programme
Computer Science and Engineering
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Data collection and annotation have proven to be a bottleneck for computer vision applications. When faced with the task of data creation, alternative methods to traditional data collection should be considered, as time and cost may be reduced significantly. We introduce three novel datasets for multi-label classification purposes on LEGO bricks: a traditionally collected dataset, a rendered dataset, and a dataset with pasted cutouts of LEGO bricks. We investigate the accuracy of a ResNet classifier tested on real data, but trained on the different datasets. This research seeks to provide both insight into future dataset creation of LEGO bricks, as well as act as an advisor for general multi-label dataset creation. Our findings indicate that the traditionally collected dataset is prone to overfitting due to speedups used during collection, with 90% accuracy during training but 19% during testing. Furthermore, synthetic data techniques are applicable to multi-label LEGO classification but need improvement, with accuracy ranging from 34% to 45%.
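As an illustration of the "pasted cutouts" idea mentioned in the abstract, the following is a minimal sketch of how a synthetic multi-label sample could be assembled: cutouts are pasted at random positions onto a background, and a multi-hot label vector records which classes are present. It is not the thesis's actual pipeline. The class names, the `paste_cutouts` function, the use of `None` for transparent pixels, and the toy 8x8 image are all assumptions made for this example.

```python
import random

# Hypothetical class names; the abstract does not list the brick classes.
CLASSES = ["brick_2x4", "brick_2x2", "plate_1x2"]

def paste_cutouts(background, cutouts, rng):
    """Paste each (class_index, patch) pair onto a copy of the background.

    background: 2D list of pixel values.
    patch: 2D list of pixel values, with None marking transparent pixels.
    Returns (image, multi_hot_label).
    """
    image = [row[:] for row in background]  # leave the input untouched
    height, width = len(image), len(image[0])
    label = [0] * len(CLASSES)
    for class_index, patch in cutouts:
        patch_h, patch_w = len(patch), len(patch[0])
        # Pick a top-left corner so the patch fits inside the image.
        top = rng.randint(0, height - patch_h)
        left = rng.randint(0, width - patch_w)
        for dy in range(patch_h):
            for dx in range(patch_w):
                if patch[dy][dx] is not None:  # skip transparent pixels
                    image[top + dy][left + dx] = patch[dy][dx]
        label[class_index] = 1  # mark this class as present
    return image, label

rng = random.Random(0)  # fixed seed for reproducibility
background = [[0] * 8 for _ in range(8)]
cutouts = [(0, [[5, 5], [5, None]]), (2, [[9]])]
image, label = paste_cutouts(background, cutouts, rng)
print(label)
```

Because the label is built from the list of pasted classes, annotation comes for free with each generated image, which is the cost advantage over manual collection that the abstract refers to.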

Files

License info not available