Data Generation Methods for Multi-Label Images of LEGO Bricks
B.H. Kam (TU Delft - Electrical Engineering, Mathematics and Computer Science)
A. Lengyel – Mentor (TU Delft - Pattern Recognition and Bioinformatics)
J.C. Gemert – Mentor (TU Delft - Pattern Recognition and Bioinformatics)
Ricardo Guerra Marroquim – Graduation committee member (TU Delft - Computer Graphics and Visualisation)
Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.
Abstract
Data collection and annotation have proven to be a bottleneck for computer vision applications. When faced with the task of data creation, alternatives to traditional data collection should be considered, as they may reduce time and cost significantly. We introduce three novel datasets for multi-label classification of LEGO bricks: a traditionally collected dataset, a rendered dataset, and a dataset with pasted cutouts of LEGO bricks. We investigate the accuracy of a ResNet classifier trained on each of these datasets and tested on real data. This research seeks both to provide insight into future dataset creation for LEGO bricks and to act as a guide for general multi-label dataset creation. Our findings indicate that the traditionally collected dataset is prone to overfitting due to speedups used during collection, with 90% accuracy during training but 19% during testing. Furthermore, synthetic data techniques are applicable to multi-label LEGO classification but need improvement, with accuracy ranging from 34% to 45%.
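To make the "pasted cutouts" idea concrete, the following is a minimal sketch of how a multi-label training sample could be composed from cutouts. It is not the pipeline used in this work, which the abstract does not describe. The function name `paste_cutouts`, the nested-list image representation, and the use of `None` to mark transparent pixels are all assumptions made for illustration.

```python
import random


def paste_cutouts(background, cutouts, num_classes, rng):
    """Paste cutouts at random positions on a copy of the background.

    Hypothetical helper, not taken from the thesis. `background` is a 2D
    list of pixel values; `cutouts` is a list of (class_id, patch) pairs,
    where each patch is a 2D list and None marks a transparent pixel.
    Returns the composed image and a multi-hot label vector.
    """
    height, width = len(background), len(background[0])
    image = [row[:] for row in background]  # leave the input untouched
    label = [0] * num_classes
    for class_id, patch in cutouts:
        patch_h, patch_w = len(patch), len(patch[0])
        # Pick a top-left corner so the whole patch fits in the image.
        top = rng.randint(0, height - patch_h)
        left = rng.randint(0, width - patch_w)
        for dy in range(patch_h):
            for dx in range(patch_w):
                pixel = patch[dy][dx]
                if pixel is not None:  # skip transparent pixels
                    image[top + dy][left + dx] = pixel
        label[class_id] = 1  # mark this brick class as present
    return image, label


# Toy usage: an 8x8 empty background and two made-up brick cutouts.
background = [[0] * 8 for _ in range(8)]
cutouts = [(0, [[5, 5], [5, None]]), (2, [[7]])]
image, label = paste_cutouts(background, cutouts, num_classes=4,
                             rng=random.Random(0))
```

Because the label is derived from the cutouts that were pasted, annotation comes for free, which is the main appeal of this kind of synthetic data. Note that a later cutout may partly cover an earlier one while the earlier class stays marked as present in the label.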