Adaptive Compression of Deep Learning Models for Edge Inference via Bayesian Decomposition and Quantization Gates

Master Thesis (2025)
Author(s)

J.J. van de Weg (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

J. Dauwels – Mentor (TU Delft - Signal Processing Systems)

Sinian Li – Mentor (TU Delft - Signal Processing Systems)

Faculty
Electrical Engineering, Mathematics and Computer Science
Publication Year
2025
Language
English
Graduation Date
12-09-2025
Awarding Institution
Delft University of Technology
Programme
Electrical Engineering | Signals and Systems
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

With the growing developments in Artificial Intelligence (AI), deep learning models have become an attractive solution for industrial applications such as machine health monitoring and predictive maintenance. To enable real-time analysis and reduce reliance on cloud infrastructure, it is often more practical to process sensor data directly on edge devices. However, while deep learning models offer improved performance, their high memory and computational demands often exceed the limited resources of edge devices. Moreover, model compression typically requires extensive hyperparameter tuning that is specific to each model, layer, and application. To address these limitations, this work utilizes dynamic Bayesian compression, which reduces model size and computational cost. By introducing learnable gate variables that control the quantization precision and the rank of the decomposed factors, the model can adaptively determine the most efficient configuration for each layer during training. This results in a flexible, end-to-end trainable compression scheme that maintains performance while significantly improving deployability on edge devices.
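
The gating idea can be illustrated with a minimal sketch (not the implementation from this thesis): a PyTorch-style low-rank linear layer whose rank-1 components are scaled by learnable sigmoid gates, so that a sparsity penalty on the gates lets training select the effective rank of each layer. All class and variable names are illustrative, and the thesis additionally gates quantization precision, which is omitted here.

import torch
import torch.nn as nn
import torch.nn.functional as F

class GatedLowRankLinear(nn.Module):
    """Illustrative linear layer factorized as W = U diag(g) V with learnable gates g."""

    def __init__(self, in_features, out_features, max_rank):
        super().__init__()
        # Decomposed factors of the weight matrix.
        self.U = nn.Parameter(torch.randn(out_features, max_rank) * 0.01)
        self.V = nn.Parameter(torch.randn(max_rank, in_features) * 0.01)
        # Gate logits; sigmoid maps them to (0, 1). Gates pushed toward zero by a
        # sparsity penalty effectively prune the corresponding rank-1 components.
        self.gate_logits = nn.Parameter(torch.zeros(max_rank))

    def gates(self):
        return torch.sigmoid(self.gate_logits)

    def forward(self, x):
        g = self.gates()
        weight = self.U @ torch.diag(g) @ self.V  # shape: (out_features, in_features)
        return F.linear(x, weight)

    def gate_penalty(self):
        # Added to the task loss to encourage a low effective rank.
        return self.gates().sum()

# Usage: the penalty term lets the optimizer trade off accuracy against rank.
layer = GatedLowRankLinear(in_features=128, out_features=64, max_rank=32)
x = torch.randn(8, 128)
out = layer(x)                                   # shape: (8, 64)
loss = out.pow(2).mean() + 1e-3 * layer.gate_penalty()
loss.backward()

In an adaptive scheme along these lines, components whose gates collapse toward zero can be dropped after training, so each layer ends up with its own compressed rank without per-layer manual tuning.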
