Using frequency information to improve accuracy of object detectors

Bachelor Thesis (2021)

Author(s)

P. Ulev (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

A. Lengyel – Mentor (TU Delft - Pattern Recognition and Bioinformatics)

Silvia Pintea – Mentor (TU Delft - Pattern Recognition and Bioinformatics)

E. Isufi – Graduation committee member (TU Delft - Multimedia Computing)

Faculty

Electrical Engineering, Mathematics and Computer Science

Keywords

Machine learning, Computer vision, Object detection, Fourier Transform, FFT, YOLO, YOLOv5

To reference this document use:

https://resolver.tudelft.nl/uuid:c8318e36-8351-4307-8125-bda9a00e9b50

Publication Year

2021

Language

English

Graduation Date

01-07-2021

Awarding Institution

Delft University of Technology

Project

CSE3000 Research Project

Programme

Computer Science and Engineering

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

This research paper analyses the effect of using frequency information on the accuracy of object detectors. Object detectors are complex networks that learn about objects from images and can then predict the location of those objects in new, unseen images. Some datasets are hard to learn from, partly because the images are taken in diverse and complex environments and partly because the objects to detect vary considerably in shape. The dataset considered in this paper is the Global Wheat Head Dataset (GWHD), provided through a Kaggle competition. An object detector is run on the original GWHD images, and its performance is compared with that of the same detector run on a frequency-filtered version of the images. The Fourier Transform is used to map each image from its spatial (pixel) domain to the frequency domain, where certain non-informative frequencies are filtered out, after which the image is mapped back to the spatial domain. Two experiments were conducted, and the results show that this specific filtering methodology yields no improvement on the GWHD with the YOLOv5 object detector. A pipeline was developed that supports custom filtering strategies and custom datasets. Related work has shown that working with images in the frequency domain can reduce computation time and increase the accuracy of an object detector, so the pipeline also provides a basis for further experiments.
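The filtering step described in the abstract (transform to the frequency domain, suppress some frequencies, transform back) can be illustrated with a minimal NumPy sketch. The abstract does not say which frequencies the thesis removes, so the circular low-pass mask, the function name `frequency_filter`, and the `keep_fraction` parameter below are assumptions chosen purely for illustration, not the author's method.

```python
import numpy as np


def frequency_filter(image, keep_fraction=0.25):
    """Low-pass filter a 2-D grayscale image in the frequency domain.

    Hypothetical helper: the mask shape and `keep_fraction` are
    illustrative assumptions, not taken from the thesis.
    """
    rows, cols = image.shape

    # Forward 2-D FFT, with the zero-frequency term shifted to the centre.
    spectrum = np.fft.fftshift(np.fft.fft2(image))

    # Circular mask that keeps only frequencies close to the centre.
    y, x = np.ogrid[:rows, :cols]
    cy, cx = rows // 2, cols // 2
    radius = keep_fraction * min(rows, cols) / 2
    mask = (y - cy) ** 2 + (x - cx) ** 2 <= radius ** 2

    # Zero out the masked frequencies and map back to the pixel domain.
    filtered = spectrum * mask
    return np.real(np.fft.ifft2(np.fft.ifftshift(filtered)))
```

Under these assumptions, a constant image passes through unchanged because only its zero-frequency term is non-zero, while fine-grained patterns such as a pixel-level checkerboard are removed. In a setup like the one the abstract describes, the filtered images would then be given to the detector in place of the originals.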

Files

Bachelor_Thesis.pdf

(pdf | 1.41 Mb)

License info not available