Temporal Delta Layer

Training Towards Brain-Inspired Temporal Sparsity for Energy-Efficient Deep Neural Networks

Master Thesis (2021)
Author(s)

P. Preetha Vijayan (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

R. Leuken – Mentor (TU Delft - Signal Processing Systems)

Z. Al-Ars – Graduation committee member (TU Delft - Computer Engineering)

Amirreza Yousefzadeh – Coach (Stichting IMEC Nederland)

Manolis Sifalakis – Coach (Stichting IMEC Nederland)

Faculty
Electrical Engineering, Mathematics and Computer Science
Copyright
© 2021 P. Preetha Vijayan
Publication Year
2021
Language
English
Graduation Date
25-08-2021
Awarding Institution
Delft University of Technology
Programme
Electrical Engineering
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

In the recent past, real-time video processing using state-of-the-art deep neural networks (DNNs) has achieved human-like accuracy, but at the cost of high energy consumption, making such networks infeasible for deployment on edge devices. The energy consumed by running DNNs on hardware accelerators is dominated by the number of memory reads/writes and multiply-accumulate (MAC) operations required. As a potential solution, this work explores the role of activation sparsity in efficient DNN inference. Since the predominant operation in DNNs is the matrix-vector multiplication of weights with activations, skipping operations and memory fetches where at least one operand is zero can make inference more energy efficient. Although spatial sparsification of activations has been researched extensively, introducing and exploiting temporal sparsity is much less explored in the DNN literature. This work presents a new DNN layer, called the temporal delta layer, whose primary objective is to induce temporal activation sparsity during training. The temporal delta layer promotes activation sparsity by performing a delta operation between consecutive time steps, facilitated by activation quantization and an l1-norm-based penalty added to the cost function. During inference, the resulting model acts as a conventional quantized DNN with high temporal activation sparsity. The new layer was incorporated into the standard ResNet50 architecture, which was then trained and tested on the popular human action recognition dataset UCF101. The method yielded a 2x improvement in activation sparsity, at the cost of a 5% accuracy loss.
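The mechanism outlined in the abstract can be sketched roughly as follows. This is an illustrative sketch only, assuming a PyTorch-style module, a uniform quantization step size, and a straight-through estimator for the rounding operation; none of these details are taken from the thesis itself. The layer quantizes each activation map, emits the difference with respect to the previous time step, and an l1 penalty on these deltas is added to the task loss during training to push them toward zero.

```python
import torch
import torch.nn as nn


class TemporalDeltaLayer(nn.Module):
    """Sketch of a temporal delta layer (assumed interface, not the thesis code).

    Quantizes activations to a fixed step size, then outputs the difference
    (delta) between the current and previous time step. Near-zero deltas can
    then be skipped by sparsity-aware hardware at inference time.
    """

    def __init__(self, step: float = 0.125):
        super().__init__()
        self.step = step   # quantization step size (assumed hyperparameter)
        self.prev = None   # quantized activation from the previous time step

    def quantize(self, x: torch.Tensor) -> torch.Tensor:
        # Straight-through estimator: round in the forward pass,
        # pass gradients through unchanged in the backward pass.
        q = torch.round(x / self.step) * self.step
        return x + (q - x).detach()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        q = self.quantize(x)
        if self.prev is None:
            delta = q              # first frame: full activation
        else:
            delta = q - self.prev  # later frames: temporal delta
        # Detach the stored state so gradients only flow through the current step
        # (a simplification for this sketch).
        self.prev = q.detach()
        return delta

    def reset(self):
        # Call between video clips so deltas are not computed across clip boundaries.
        self.prev = None


def l1_sparsity_penalty(deltas, weight: float = 1e-4) -> torch.Tensor:
    """l1 penalty on the deltas, added to the task loss during training."""
    return weight * sum(d.abs().mean() for d in deltas)


# Usage sketch: total_loss = task_loss + l1_sparsity_penalty(collected_deltas)
```

In this reading of the method, the l1 term discourages activations from changing between frames, so that after training most deltas quantize to exactly zero and the model behaves as a conventional quantized DNN with high temporal activation sparsity.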
