Representing CNN Feature Maps with Implicit Neural Representations

A Proof-of-Concept Study Using SIRENs

Master Thesis (2025)
Author(s)

B.Y. He (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

J.C. van Gemert – Graduation committee member (TU Delft - Electrical Engineering, Mathematics and Computer Science)

A.S. Gielisse – Mentor (TU Delft - Electrical Engineering, Mathematics and Computer Science)

K.A. Hildebrandt – Graduation committee member (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Faculty
Electrical Engineering, Mathematics and Computer Science
Publication Year
2025
Language
English
Graduation Date
11-12-2025
Awarding Institution
Delft University of Technology
Project
Computer Vision Group
Programme
Computer Science, Software Technology
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

High-resolution image analysis using deep Convolutional Neural Networks (CNNs) faces significant memory constraints because the size of intermediate feature maps grows quadratically with input resolution. This paper investigates whether Implicit Neural Representations (INRs), specifically SIRENs, can represent CNN feature maps well enough to reduce the memory footprint during training. We address the unique challenge that CNN feature maps are not static signals but evolve continuously as network weights are updated through gradient-based optimization. Through three experiments on a modified All-CNN architecture trained on MNIST, we show that: (1) SIRENs can fit static feature maps from frozen CNNs with high fidelity (PSNR > 30 dB) regardless of weight initialization; (2) SIRENs can track evolving feature maps during training, though with lower reconstruction quality than for static targets; and (3) SIREN-assisted feedforward, in which SIRENs predict missing activations in receptive fields, yields a classification accuracy (20.97%) that is above random guessing (10%) but substantially below standard training (95%). While the results demonstrate that SIRENs can feasibly represent dynamic feature maps, significant challenges remain in maintaining reconstruction fidelity once SIRENs are integrated into the training loop. This proof-of-concept study provides empirical insights into bridging continuous implicit representations with discrete deep learning pipelines and highlights promising directions for future research in memory-efficient high-resolution image analysis.
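
To make the terms in the abstract concrete, the following is a minimal NumPy sketch of a SIREN forward pass over a feature-map coordinate grid, together with a PSNR measure. It is an illustration only, not the thesis implementation. The function names (`init_siren`, `siren_forward`, `pixel_grid`, `psnr`), the layer sizes, the frequency factor `w0 = 30`, the zero-initialised biases, and the choice of the target's dynamic range as the PSNR peak value are all assumptions made for this example. The weight initialisation follows the scheme commonly used for SIRENs (Sitzmann et al., 2020). No training loop is included.

```python
import numpy as np

def init_siren(layer_sizes, w0=30.0, seed=0):
    """Create (W, b) pairs using the usual SIREN initialisation (assumed here)."""
    rng = np.random.default_rng(seed)
    params = []
    for i, (fan_in, fan_out) in enumerate(zip(layer_sizes[:-1], layer_sizes[1:])):
        # First layer: U(-1/n, 1/n); later layers: U(-sqrt(6/n)/w0, sqrt(6/n)/w0).
        bound = 1.0 / fan_in if i == 0 else np.sqrt(6.0 / fan_in) / w0
        W = rng.uniform(-bound, bound, size=(fan_in, fan_out))
        b = np.zeros(fan_out)  # simplification: biases start at zero
        params.append((W, b))
    return params

def siren_forward(params, coords, w0=30.0):
    """Map (N, 2) coordinates to (N, C) predicted activations."""
    h = coords
    for W, b in params[:-1]:
        h = np.sin(w0 * (h @ W + b))  # sine activation on every hidden layer
    W, b = params[-1]
    return h @ W + b  # linear output layer

def pixel_grid(height, width):
    """Return (height * width, 2) coordinates in [-1, 1], row-major order."""
    ys = np.linspace(-1.0, 1.0, height)
    xs = np.linspace(-1.0, 1.0, width)
    yy, xx = np.meshgrid(ys, xs, indexing="ij")
    return np.stack([yy.ravel(), xx.ravel()], axis=1)

def psnr(target, recon, peak=None):
    """Peak signal-to-noise ratio in dB; peak defaults to the target's range."""
    mse = np.mean((target - recon) ** 2)
    if mse == 0:
        return float("inf")
    if peak is None:
        peak = target.max() - target.min()
    return 10.0 * np.log10(peak ** 2 / mse)

# Example: query an untrained SIREN for a hypothetical 14x14 feature map
# with 8 channels and reshape the output to (H, W, C).
coords = pixel_grid(14, 14)
params = init_siren([2, 64, 64, 8])
feature_map = siren_forward(params, coords).reshape(14, 14, 8)
```

In this sketch the memory argument is visible in the shapes: the stored feature map has `H * W * C` values, which grows quadratically with the side length of the map, whereas the SIREN's parameter count is fixed by `layer_sizes` and independent of `H` and `W`.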

Files

Submit_thesis.pdf
(pdf | 3.85 Mb)
License info not available