Deep Visual City Recognition Visualization

Conference Paper (2019)

Author(s)

Xiangwei Shi (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Seyran Khademi (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Jan van Gemert (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Research Group

Electronic Instrumentation

To reference this document use

https://resolver.tudelft.nl/uuid:3c430c3c-1a2c-4385-9681-8d3f16752697

More Info

Publication Year

2019

Language

English

Pages (from-to)

1-6

Event

NCCV 2019 – The Netherlands Conference on Computer Vision (2019-12-16 - 2019-12-17), Wageningen, Netherlands

Downloads counter

243

Collections

Institutional Repository

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Understanding how cities visually differ from one another is of interest to planners, residents, and historians. We investigate the interpretation of deep features learned by convolutional neural networks (CNNs) for city recognition. Given a trained city recognition network, we first generate weighted masks using the established Grad-CAM technique to select the most discriminative regions in the image. Because the image classification label is the city name, it carries no information about which objects are class-discriminative, so we investigate the interpretability of deep representations with two methods. (i) An unsupervised method is used to cluster the objects appearing in the visual explanations. (ii) A pretrained semantic segmentation model is used to label objects at the pixel level, and we then introduce statistical measures to quantitatively evaluate the interpretability of discriminative objects. We study the influence of network architecture and of random initialization during training on the interpretability of CNN features for city recognition. The results suggest that network architecture affects the interpretability of learned visual representations more than different initializations do.
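The pipeline described in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. The function `grad_cam` follows the standard Grad-CAM formulation (channel weights from spatially averaged gradients, followed by a ReLU over the weighted sum of feature maps), but it omits the upsampling to image resolution that a real pipeline would need. The function `label_share` is a hypothetical statistic, assumed here only for illustration, since the abstract does not specify the statistical measures used in the paper. All names, toy values, and label strings are assumptions.

```python
from collections import defaultdict


def grad_cam(activations, gradients):
    """Compute a Grad-CAM heatmap from K feature maps and their gradients.

    activations, gradients: lists of K maps, each an H x W nested list.
    Returns an H x W nested list scaled to [0, 1].
    """
    n_rows = len(activations[0])
    n_cols = len(activations[0][0])
    # Channel weight = global average of the class-score gradient over space.
    weights = [sum(sum(row) for row in g) / (n_rows * n_cols) for g in gradients]
    # Weighted sum of feature maps, then ReLU to keep positive evidence only.
    heat = [
        [
            max(0.0, sum(w * a[i][j] for w, a in zip(weights, activations)))
            for j in range(n_cols)
        ]
        for i in range(n_rows)
    ]
    peak = max(max(row) for row in heat)
    if peak > 0:
        heat = [[v / peak for v in row] for row in heat]
    return heat


def label_share(heatmap, labels):
    """Hypothetical measure: fraction of heatmap mass on each segmentation label."""
    mass = defaultdict(float)
    for heat_row, label_row in zip(heatmap, labels):
        for value, label in zip(heat_row, label_row):
            mass[label] += value
    total = sum(mass.values())
    if total == 0:
        return {}
    return {label: m / total for label, m in mass.items()}


# Toy example: two 2 x 2 feature maps with constant gradients (assumed values).
activations = [[[2.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 2.0]]]
gradients = [[[1.0, 1.0], [1.0, 1.0]], [[0.5, 0.5], [0.5, 0.5]]]
# Pixel-level labels as a pretrained segmentation model might emit them.
labels = [["building", "sky"], ["road", "tree"]]

heatmap = grad_cam(activations, gradients)
shares = label_share(heatmap, labels)
print(shares)
```

In this toy case the heatmap concentrates on the "building" and "tree" pixels, so those two labels account for all of the mask weight. Aggregating such shares over many images of one city would give one possible summary of which object classes the network relies on.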

Files

Shi_Deep_Visual_City_Recogniti... (pdf)

(pdf | 4.9 Mb)

License info not available