Data Generation Methods for Multi-Label Images of LEGO Bricks

Bachelor Thesis (2020)

Author(s)

B.H. Kam (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

A. Lengyel – Mentor (TU Delft - Pattern Recognition and Bioinformatics)

J.C. Gemert – Mentor (TU Delft - Pattern Recognition and Bioinformatics)

Ricardo Guerra Marroquim – Graduation committee member (TU Delft - Computer Graphics and Visualisation)

Faculty

Electrical Engineering, Mathematics and Computer Science

Keywords

Computer vision, Multi-label classification, Data collection, Lego bricks, Synthetic Data-set, Data-set

To reference this document use:

https://resolver.tudelft.nl/uuid:5a7acca4-3c2f-4d5f-841f-dcb7a2ed007e

More Info

Publication Year

2020

Language

English

Graduation Date

22-06-2020

Awarding Institution

Delft University of Technology

Project

CSE3000 Research Project, Can deep learning recognize LEGO pieces?

Programme

Computer Science and Engineering

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Data collection and annotation have proven to be a bottleneck for computer vision applications. When faced with the task of data creation, alternatives to traditional data collection should be considered, as they may reduce time and cost significantly. We introduce three novel datasets for multi-label classification of LEGO bricks: a traditionally collected dataset, a rendered dataset, and a dataset with pasted cutouts of LEGO bricks. We investigate the accuracy of a ResNet classifier that is trained on each of these datasets and tested on real data. This research seeks both to provide insight into future dataset creation for LEGO bricks and to serve as a guide for general multi-label dataset creation. Our findings indicate that the traditionally collected dataset is prone to overfitting due to speedups used during collection, with 90% accuracy during training but 19% during testing. Furthermore, synthetic data techniques are applicable to multi-label LEGO classification but need improvement, with accuracy ranging from 34% to 45%.
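To make the idea of a pasted-cutout dataset concrete, the following is a minimal sketch, not the thesis's actual pipeline. It assumes each cutout is a small pixel grid with a binary mask, pastes it at a random position on a background, and records a multi-hot label vector. The function name `paste_cutouts`, the list-based image format, and the toy bricks are all hypothetical and chosen for illustration only.

```python
import random


def paste_cutouts(background, cutouts, num_classes, rng):
    """Paste (class_id, pixels, mask) cutouts onto a copy of the background.

    background: 2D list of pixel values.
    pixels, mask: 2D lists of equal size; mask is 1 for brick pixels, 0 for
    transparent ones. This format is an assumption made for the sketch.
    Returns the composited image and a multi-hot label vector.
    """
    image = [row[:] for row in background]  # leave the input untouched
    label = [0] * num_classes
    height, width = len(image), len(image[0])
    for class_id, pixels, mask in cutouts:
        h, w = len(pixels), len(pixels[0])
        # Random top-left corner such that the cutout fits in the image.
        top = rng.randint(0, height - h)  # randint bounds are inclusive
        left = rng.randint(0, width - w)
        for y in range(h):
            for x in range(w):
                if mask[y][x]:
                    image[top + y][left + x] = pixels[y][x]
        # Multi-label target: one entry per class present in the image.
        label[class_id] = 1
    return image, label


rng = random.Random(0)
background = [[0] * 8 for _ in range(8)]
brick_a = (2, [[9, 9], [9, 9]], [[1, 1], [1, 1]])  # toy 2x2 brick, class 2
brick_b = (4, [[7, 7, 7]], [[1, 0, 1]])            # toy 1x3 brick with a hole, class 4
image, label = paste_cutouts(background, [brick_a, brick_b], num_classes=6, rng=rng)
```

Because every pasted brick sets its own entry in `label`, one image can carry several classes at once, which is what distinguishes the multi-label setting from single-label classification. Later cutouts may overwrite earlier ones in this sketch; whether and how occlusion is handled in the actual dataset is not stated in the abstract.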

Files

Data_Generation_Methods_for_Mu... (pdf)

(pdf | 1.3 Mb)

License info not available