On the decomposition of visual sets using Transformers
A. Alfieri (TU Delft - Electrical Engineering, Mathematics and Computer Science)
J.C. van Gemert – Mentor (TU Delft - Pattern Recognition and Bioinformatics)
Silvia-Laura Pintea – Graduation committee member (TU Delft - Pattern Recognition and Bioinformatics)
Y. Chen – Graduation committee member (TU Delft - Data-Intensive Systems)
Abstract
Transformers can generate predictions auto-regressively, conditioning each sequence element on the previous ones, or produce entire output sequences in parallel. While prior research has mostly explored this difference on tasks that are sequential in nature, we study this contrast on visual set prediction tasks in order to analyze the core behaviour of the Transformer model. Multi-label classification, object detection, and polygonal shape prediction are all visual set prediction tasks. Precisely predicting polygons in images is an important set prediction problem because polygons can represent numerous types of objects, such as buildings, people, or obstacles for aerial vehicles. Set prediction is a difficult challenge for deep learning architectures because sets can have varying cardinalities and are permutation invariant. We provide evidence of the importance of natural orders for Transformers, analyze the strengths and weaknesses of different solutions that solve the set prediction task directly, and show the benefit of auto-regressively decomposing complex polygons into ordered sequences of points.
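
The contrast studied in this thesis can be made concrete in code. The sketch below is a minimal illustration, assuming a PyTorch-style Transformer decoder, of the two decoding modes side by side: parallel set decoding from learned queries (as in DETR-style detectors) and greedy auto-regressive decoding under a causal mask. All names and sizes (d_model, num_queries, the start-token id, the random stand-in for encoded image features) are illustrative assumptions, not the thesis implementation.

import torch
import torch.nn as nn

d_model, num_heads, num_layers = 64, 4, 2
num_queries, vocab_size = 8, 16          # e.g. up to 8 polygon points, 16 token classes

decoder_layer = nn.TransformerDecoderLayer(d_model, num_heads, batch_first=True)
decoder = nn.TransformerDecoder(decoder_layer, num_layers)
token_embed = nn.Embedding(vocab_size, d_model)
head = nn.Linear(d_model, vocab_size)

memory = torch.randn(1, 10, d_model)     # stand-in for encoded image features

# Parallel (set) decoding: all elements are predicted at once from learned queries,
# so no output position is conditioned on another.
queries = torch.randn(1, num_queries, d_model)    # learned embeddings in practice
parallel_logits = head(decoder(queries, memory))  # (1, num_queries, vocab_size)

# Auto-regressive decoding: each element is conditioned on the previous ones,
# which imposes (and can exploit) an order over the set elements.
tokens = torch.zeros(1, 1, dtype=torch.long)      # hypothetical start-token id 0
for _ in range(num_queries):
    tgt = token_embed(tokens)
    T = tgt.size(1)
    # Causal mask: position i may only attend to positions <= i.
    causal_mask = torch.triu(torch.full((T, T), float("-inf")), diagonal=1)
    logits = head(decoder(tgt, memory, tgt_mask=causal_mask))
    next_token = logits[:, -1].argmax(dim=-1, keepdim=True)  # greedy choice
    tokens = torch.cat([tokens, next_token], dim=1)

The parallel variant treats the outputs as an unordered set, while the auto-regressive loop fixes an order over them; for polygons, a natural order over the points is available, which is precisely the property the abstract argues matters.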