Top-Down Networks

A coarse-to-fine reimagination of CNNs

Master thesis (2020)

Authors

I. Lelekas Electrical Engineering, Mathematics and Computer Science

Contributors

Jan van Gemert Pattern Recognition and Bioinformatics (mentor)

Marcel J. T. Reinders Pattern Recognition and Bioinformatics (graduation committee member)

F. M. Vos ImPhys/Computational Imaging (graduation committee member)

Faculty

Electrical Engineering, Mathematics and Computer Science

Adversarial attacks, Deep Learning, Computer Vision, Convolutional Neural Networks, Adversarial robustness, Top-Down, Fine-to-Coarse, Coarse-to-Fine, Grad-CAM, Object localization

To reference this document use:

http://resolver.tudelft.nl/uuid:11888a7b-1e54-424d-9daa-8ff48de58345

Published Date

13-03-2020

Language

English

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Biological vision follows a coarse-to-fine information processing pathway, from the initial visual detection and binding of salient features of a visual scene to the enhanced and preferential processing given to relevant stimuli. In contrast, CNNs employ fine-to-coarse processing, moving from local, edge-detecting filters to more global ones that extract abstract representations of the input. In this paper we propose the extraction of top-down networks by reversing the feature extraction part of the baseline, bottom-up architecture. By blurring out higher-frequency information and restoring it only at later stages, this coarse-to-fine pathway offers a line of defence against attacks that introduce high-frequency noise. The high resolution of the final convolutional layer's feature map can contribute to the transparency of the network's decision-making process and favor object-driven decisions over context-driven ones, thus providing better-localized class activation maps. The paper offers empirical evidence for the applicability of the method to various existing architectures and to multiple visual recognition tasks.
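
The frequency argument in the abstract can be illustrated with a small sketch. It is not the architecture from the thesis, whose details are in the PDF. It only assumes that a coarse stage sees an average-pooled copy of the input, and it shows that a checkerboard perturbation (the highest spatial frequency a pixel grid can carry) vanishes at the coarse scales and reappears only at full resolution. The helper names `avg_pool` and `coarse_to_fine_inputs`, the 8x8 test image, and the pooling factors are all hypothetical choices made for this illustration.

```python
import numpy as np


def avg_pool(img, factor):
    """Downsample a 2-D array by averaging non-overlapping factor x factor blocks."""
    h, w = img.shape
    assert h % factor == 0 and w % factor == 0, "image size must be divisible by factor"
    return img.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))


def coarse_to_fine_inputs(img, factors=(4, 2, 1)):
    """Return copies of img ordered from coarsest to finest (assumed pooling factors)."""
    return [avg_pool(img, f) for f in factors]


# Hypothetical 8x8 test image: a smooth horizontal ramp.
h = w = 8
ys, xs = np.mgrid[0:h, 0:w]
clean = xs.astype(float)

# High-frequency perturbation: a checkerboard of +0.5 / -0.5.
noise = 0.5 * (-1.0) ** (ys + xs)
noisy = clean + noise

pyr_clean = coarse_to_fine_inputs(clean)
pyr_noisy = coarse_to_fine_inputs(noisy)

# Largest absolute difference between clean and perturbed input at each scale.
diffs = [float(np.abs(a - b).max()) for a, b in zip(pyr_clean, pyr_noisy)]
for factor, d in zip((4, 2, 1), diffs):
    print(f"pooling factor {factor}: max perturbation = {d:.2f}")
```

In this toy setting the perturbation averages out to zero at pooling factors 4 and 2 and is visible only at factor 1, which is the sense in which a coarse-first pathway delays exposure to high-frequency noise. Real adversarial perturbations are not pure checkerboards, so this should be read as intuition rather than as evidence of robustness.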

Files

TopDownNetworks.pdf

(pdf | 12.4 Mb)

License info not available