SPARO: Scalable Sparsity-Aware Event-Driven Architecture for Low-Latency Edge Intelligence

Author: Upadhyay, Pankaj (TU Delft Electrical Engineering, Mathematics and Computer Science)
Contributor: Bishnoi, R.K. (mentor); Vadivel, Kanishkan (mentor); Hamdioui, S. (mentor)
Degree granting institution: Delft University of Technology
Programme: Computer Science
Date: 2024-05-30

Abstract:
Deep Neural Networks (DNNs) have revolutionized numerous computational fields, from image and speech recognition to autonomous driving and natural language processing. Yet, the substantial computational and energy requirements of DNNs, particularly Convolutional Neural Networks (CNNs), pose significant obstacles to their deployment on resource-constrained edge devices. This thesis presents SPARO, a novel Scalable Sparsity-Aware Event-Driven Architecture designed to overcome these challenges by effectively exploiting sparsity in both neural network weights and activations. SPARO’s architecture is founded upon a unique event-driven dataflow that harnesses the inherent sparsity of CNNs, thereby reducing computational burden and energy consumption. This dataflow is strategically divided into two distinct phases: the Update Phase and the Fire Phase. During the Update Phase, all computations essential for incoming events are executed, while the Fire Phase is dedicated to applying non-linear activation functions and pooling operations to the output feature maps (OFM). This meticulously designed phased approach streamlines data handling, eliminates redundant computations, and significantly boosts overall processing efficiency.

A cornerstone of SPARO’s innovation is its dynamic weight reuse mechanism, which intelligently maximizes the reuse of weights across multiple events. This significantly reduces the number of weight fetches needed, thereby improving arithmetic intensity.
Furthermore, SPARO leverages advanced sparse data representation techniques to minimize memory usage and further enhance computational efficiency.

The efficacy of SPARO is demonstrated through comprehensive evaluations using both synthetic benchmarks and real-world CNN applications, such as gesture recognition and object detection. In the same form-factor, SPARO achieves an impressive 8.5x speedup compared to the baseline Seneca system, delivering real-time performance while consuming only 14% of the energy for the TinyYolo vision task.

Subject: Edge-AI; Accelerators; Sparsity Exploitation
To reference this document use: http://resolver.tudelft.nl/uuid:1abd356b-acda-4a27-9b36-75df237ccdb4
Embargo date: 2026-05-30
Part of collection: Student theses
Document type: master thesis
Rights: © 2024 Pankaj Upadhyay
Files: file embargo until 2026-05-30