Representing CNN Feature Maps with Implicit Neural Representations

A Proof-of-Concept Study Using SIRENs

Master Thesis (2025)

Author(s)

B.Y. He (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

J.C. van Gemert – Graduation committee member (TU Delft - Electrical Engineering, Mathematics and Computer Science)

A.S. Gielisse – Mentor (TU Delft - Electrical Engineering, Mathematics and Computer Science)

K.A. Hildebrandt – Graduation committee member (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Faculty

Electrical Engineering, Mathematics and Computer Science

Deep Learning, Computer Vision, CNNs, INRs, Feature Maps

To reference this document use

https://resolver.tudelft.nl/uuid:960403d6-da69-4e1c-a0e9-24379a56437f

Publication Year

2025

Language

English

Graduation Date

11-12-2025

Awarding Institution

Delft University of Technology

Project

Computer Vision Group

Programme

Computer Science, Software Technology

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

High-resolution image analysis using deep Convolutional Neural Networks (CNNs) faces significant memory constraints because the size of intermediate feature maps grows quadratically with input resolution. This thesis investigates whether Implicit Neural Representations (INRs), specifically SIRENs, can effectively represent CNN feature maps to reduce the memory footprint during training. We address the unique challenge that CNN feature maps are not static signals but evolve continuously as network weights are updated through gradient-based optimization. Through three experiments on a modified All-CNN architecture trained on MNIST, we validate that: (1) SIRENs can fit static feature maps from frozen CNNs with high fidelity (PSNR > 30 dB) regardless of weight initialization; (2) SIRENs can track evolving feature maps during training, though with reduced reconstruction quality compared to static targets; and (3) SIREN-assisted feedforward, in which SIRENs predict missing activations in receptive fields, enables classification accuracy (20.97%) above random guessing (10%) but substantially below standard training (95%). While the results demonstrate the feasibility of using SIRENs to represent dynamic feature maps, significant challenges remain in maintaining reconstruction fidelity when SIRENs are integrated into the training loop. This proof-of-concept study provides empirical insights into bridging continuous implicit representations with discrete deep learning pipelines and highlights promising directions for future research in memory-efficient high-resolution image analysis.
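The three quantities the abstract relies on (quadratic feature-map memory, a sine-activated SIREN layer, and PSNR as the fidelity measure) can be illustrated with a minimal standard-library sketch. It is not taken from the thesis: the function names, the float32 storage assumption, the frequency factor `omega_0 = 30` (the common default in the SIREN literature), and the choice of the target's value range as the PSNR peak are all assumptions made here for illustration.

```python
import math


def feature_map_bytes(height, width, channels, bytes_per_value=4):
    """Memory of one dense feature map; float32 storage is an assumption."""
    return height * width * channels * bytes_per_value


def siren_layer(inputs, weights, biases, omega_0=30.0):
    """One sine-activated layer: sin(omega_0 * (W x + b)).

    `weights` is a list of rows, one per output unit. omega_0 = 30 is an
    assumed default, not a value confirmed by the thesis.
    """
    return [
        math.sin(omega_0 * (sum(w * x for w, x in zip(row, inputs)) + b))
        for row, b in zip(weights, biases)
    ]


def psnr(target, estimate, peak=None):
    """Peak signal-to-noise ratio in dB between two equal-length sequences.

    If `peak` is not given, the value range of `target` is used; feature
    maps have no fixed range, so this convention is an assumption.
    """
    if len(target) != len(estimate):
        raise ValueError("target and estimate must have the same length")
    mse = sum((t - e) ** 2 for t, e in zip(target, estimate)) / len(target)
    if mse == 0:
        return math.inf
    if peak is None:
        peak = max(target) - min(target)
    return 10 * math.log10(peak ** 2 / mse)


if __name__ == "__main__":
    # Doubling the side length quadruples the memory of a feature map.
    small = feature_map_bytes(56, 56, 96)
    large = feature_map_bytes(112, 112, 96)
    print(large // small)

    # A reconstruction that is off by 0.1 on a unit-range signal.
    print(round(psnr([0.0, 1.0], [0.1, 0.9]), 2))
```

Under the peak convention assumed here, a PSNR above 30 dB corresponds to a root-mean-square error below roughly 3.2% of the target's value range, which gives a rough sense of what the "high fidelity" threshold in the abstract means.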

Files

Submit_thesis.pdf

(pdf | 3.85 MB)

License info not available