Offline Compression of Convolutional Neural Networks on Edge Devices
S.A. Tulling (TU Delft - Electrical Engineering, Mathematics and Computer Science)
Y. Chen – Mentor (TU Delft - Data-Intensive Systems)
Amirmasoud Ghiassi – Graduation committee member (TU Delft - Data-Intensive Systems)
B.A. Cox – Graduation committee member (TU Delft - Electrical Engineering, Mathematics and Computer Science)
M. Zuñiga Zamalloa – Coach (TU Delft - Embedded Systems)
Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.
Abstract
Edge devices and artificial intelligence are both important and rapidly growing fields of technology. Yet combining them remains difficult: the neural networks used in AI are becoming increasingly large and complex, while edge devices lack the resources to keep up with these developments. Neural network model compression allows edge devices to run these models by overcoming their memory constraints. This paper proposes using both singular value decomposition and canonical polyadic decomposition to decrease the size of convolutional neural networks at the cost of some accuracy. The compression pipeline can run on the edge device itself and is configurable, so the trade-off between file size and accuracy can be adjusted. This makes it possible to run convolutional neural networks natively on edge devices.
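To give a feel for the size/accuracy knob described in the abstract, the following is a minimal sketch of how the stored parameter count changes when a layer is replaced by low-rank factors. It is not the thesis's pipeline. It assumes, purely for illustration, that singular value decomposition is applied to a fully connected weight matrix and canonical polyadic decomposition to a 4-way convolution kernel; the function names, layer shapes, and ranks are hypothetical, and biases are ignored.

```python
# Illustrative parameter counts only; shapes and ranks below are hypothetical.

def dense_params(rows, cols):
    """Weights in an uncompressed rows x cols fully connected layer."""
    return rows * cols


def svd_params(rows, cols, rank):
    """Weights after a rank-`rank` truncated SVD, W ~ U_r @ V_r.

    Assumes the singular values are folded into one factor, so only
    a rows x rank and a rank x cols matrix are stored.
    """
    return rank * (rows + cols)


def conv_params(out_ch, in_ch, k_h, k_w):
    """Weights in an uncompressed out_ch x in_ch x k_h x k_w kernel."""
    return out_ch * in_ch * k_h * k_w


def cp_params(out_ch, in_ch, k_h, k_w, rank):
    """Weights after a rank-`rank` CP decomposition of the 4-way kernel.

    CP stores one factor matrix per tensor mode, each with `rank` columns.
    """
    return rank * (out_ch + in_ch + k_h + k_w)


if __name__ == "__main__":
    # Hypothetical fully connected layer: 4096 x 4096, kept rank 256.
    fc_before = dense_params(4096, 4096)
    fc_after = svd_params(4096, 4096, 256)
    print(f"FC:   {fc_before} -> {fc_after} ({fc_before / fc_after:.1f}x smaller)")

    # Hypothetical conv layer: 256 filters, 128 input channels, 3x3, rank 64.
    cv_before = conv_params(256, 128, 3, 3)
    cv_after = cp_params(256, 128, 3, 3, 64)
    print(f"Conv: {cv_before} -> {cv_after} ({cv_before / cv_after:.1f}x smaller)")
```

Lowering the rank shrinks the stored model further but discards more of the original weights, which is the file size versus accuracy trade-off the abstract refers to.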