Temporal Delta Layer

Training Towards Brain Inspired Temporal Sparsity for Energy Efficient Deep Neural Networks


Abstract

In the recent past, real-time video processing using state-of-the-art deep neural networks (DNNs) has achieved human-like accuracy, but at the cost of high energy consumption, making such networks infeasible for edge device deployment. The energy consumed by running DNNs on hardware accelerators is dominated by the number of memory reads/writes and multiply-accumulate (MAC) operations required. As a potential solution, this work explores the role of activation sparsity in efficient DNN inference. Since the predominant operation in DNNs is matrix-vector multiplication of weights with activations, skipping operations and memory fetches where at least one operand is zero can make inference more energy efficient. Although spatial sparsification of activations has been researched extensively, introducing and exploiting temporal sparsity is much less explored in the DNN literature. This work presents a new DNN layer, called the temporal delta layer, whose primary objective is to induce temporal activation sparsity during training. The temporal delta layer promotes activation sparsity by performing a delta operation between consecutive time steps, facilitated by activation quantization and an L1-norm-based penalty added to the cost function. During inference, the resulting model acts as a conventional quantized DNN with high temporal activation sparsity. The new layer was incorporated into the standard ResNet50 architecture, which was trained and tested on the popular human action recognition dataset UCF101. The method resulted in a 2x improvement in activation sparsity, with a 5% accuracy loss.
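To make the mechanism concrete, the following is a minimal PyTorch-style sketch of such a temporal delta layer. It is not the author's implementation: the quantization step size, the straight-through estimator, and the mean-based L1 term on the deltas are assumptions chosen for illustration. The layer quantizes its input, forwards only the difference with respect to the previous frame's quantized activation, and accumulates an L1 penalty that can be added to the training loss.

```python
import torch
import torch.nn as nn


class TemporalDeltaLayer(nn.Module):
    """Sketch of a temporal delta layer (illustrative, not the paper's code).

    Keeps the previous quantized activation as state so that only the
    frame-to-frame change propagates downstream; an L1 term on the deltas
    can be added to the training objective to push them toward zero.
    """

    def __init__(self, step: float = 1.0 / 16):
        super().__init__()
        self.step = step          # quantization step size (assumed value)
        self.prev = None          # previous quantized activation
        self.l1_penalty = torch.tensor(0.0)

    def quantize(self, x: torch.Tensor) -> torch.Tensor:
        # Straight-through estimator: round in the forward pass,
        # identity gradient in the backward pass.
        q = torch.round(x / self.step) * self.step
        return x + (q - x).detach()

    def reset(self):
        # Call at the start of every new video clip.
        self.prev = None

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        q = self.quantize(x)
        if self.prev is None:
            delta = q              # first frame passes through unchanged
        else:
            delta = q - self.prev  # temporal delta; mostly zero for static content
        self.prev = q.detach()
        # L1 penalty on the deltas, to be added to the training loss.
        self.l1_penalty = delta.abs().mean()
        return delta
```

During training, a term such as `loss = task_loss + lambda_l1 * layer.l1_penalty` (with a hypothetical weighting factor `lambda_l1`) would drive the deltas toward zero, i.e. toward the temporal activation sparsity that a delta-aware accelerator can exploit by skipping zero operands.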