A.B. Gebregiorgis | TU Delft Repository

An Area and Energy Efficient Arithmetic Unit for Stacked Machine Learning Models

Mo Model Mo Problems Like... Hardware Design Problems

Master thesis (2024) - F.H. van der Kolk (author), Said Hamdioui (mentor), Anteneh Gebregiorgis (graduation committee member), René van den Berg (graduation committee member)

Machine learning on edge devices performs crucial identification or prediction tasks while limiting the amount of data that needs to be transmitted to more centralized computing nodes. However, strict area and energy requirements necessitate specialized hardware developed for the ...

Benchmarking and Algorithm Optimization for SENeCA

A RISC-V-based Neuromorphic Processor

Master thesis (2022) - Kevin Kevin Shidqi (author), Said Hamdioui (mentor), A.B. Gebregiorgis (graduation committee member)

With recent breakthroughs in AI (Artificial Intelligence) technology, the impact of AI on society can be felt in various fields. The market for AI software, for example, reached a valuation of \$62 billion in 2022. A growing number of new computer architectures specialized in running AI software have also been developed. At first, this software ran on conventional CPUs (Central Processing Units) and GPUs (Graphics Processing Units), but more specialized hardware later emerged, such as the TPU (Tensor Processing Unit). However, since the algorithms in AI software are generally data-intensive, power consumption became a problem. Because many of these algorithms are based on biological neural networks, there is growing interest in developing hardware that is likewise based on the principles found in these networks, in order to replicate their efficiency. This new class of architecture is known as neuromorphic architecture.

However, a new architecture does not come without challenges. As a nascent and fragmented field, neuromorphic computing in general lacks a standardized benchmarking suite or methodology. In other, more mature fields, benchmarks are a standard way of evaluating the performance of different designs objectively and fairly. This thesis aims to propose and demonstrate a benchmarking methodology and implementation flow for neuromorphic processors. This methodology aims to measure the important performance metrics for a neuromorphic processor, both on the small scale of individual synaptic operations, and the large scale of performing an actual workload. The chosen workload is a keyword spotting program based on a simple DNN architecture, which detects a specific phrase in an audio recording. This workload was chosen due to its potential application in an environment where energy is limited, such as an embedded device.

The neuromorphic processor that is the target of this benchmarking is SENeCA (short for Scalable Energy-efficient Neuromorphic Computer Architecture), a flexible and scalable design developed at IMEC The Netherlands. To implement the keyword spotting program on SENeCA, the program was rewritten and parsed. Since no physical chip implementation of SENeCA existed at the time of writing, the program was run on SENeCA using an HDL simulator. The execution time of the program was measured in detail, taking into account not only the total time but also the time required to complete the specific stages of the program. Afterwards, the power consumption of SENeCA during the execution of the program was measured using power estimation software, both for the entire chip and for its individual components. This was done both in average mode, which gives the average power consumption over the total execution time, and in time-based mode, which provides insight into the peak power and its fluctuations over time. Then, the energy to solution was calculated from the execution time and the power consumption. This process was repeated over multiple iterations, with a specific optimization applied in each iteration using SENeCA's accelerators. This provides insight into the impact of each optimization on power consumption and performance. Finally, the energy consumption of SENeCA per individual synaptic operation was also measured, allowing estimates of the energy consumption of future implementations.
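The energy-to-solution step described above can be illustrated with a minimal Python sketch. It is not taken from the thesis: the function names, the trapezoidal integration of the time-based power trace, and all numeric values are assumptions chosen purely for illustration.

```python
# Illustrative sketch only: names and numbers are hypothetical, not from the thesis.

def energy_average_mode(avg_power_w: float, exec_time_s: float) -> float:
    """Energy to solution from average power (W) and total execution time (s)."""
    return avg_power_w * exec_time_s


def energy_time_based(times_s: list[float], power_w: list[float]) -> float:
    """Energy from a sampled power trace, using trapezoidal integration (an assumption)."""
    energy_j = 0.0
    for i in range(1, len(times_s)):
        dt = times_s[i] - times_s[i - 1]
        energy_j += 0.5 * (power_w[i] + power_w[i - 1]) * dt
    return energy_j


def peak_power(power_w: list[float]) -> float:
    """Peak power observed in the time-based trace."""
    return max(power_w)


def energy_per_synaptic_op(energy_j: float, num_synaptic_ops: int) -> float:
    """Average energy per synaptic operation."""
    return energy_j / num_synaptic_ops


# Hypothetical trace: power sampled at three points in time.
times = [0.0, 1.0, 2.0]
trace = [0.0, 2.0, 0.0]
total_energy = energy_time_based(times, trace)
```

Average mode reduces the measurement to a single multiplication, while the time-based trace additionally exposes the peak power and how consumption varies across the stages of the program.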

RRAM-based fault-tolerant Binary Neural Networks

Master thesis (2021) - A. Zografou (author), S Hamdioui (mentor), R.K. Bishnoi (graduation committee member), A.B. Gebregiorgis (graduation committee member), T.G.R.M. van Leuken (graduation committee member)

Computation-In-Memory (CIM) employing Resistive-RAM (RRAM)-based crossbar arrays is a promising solution to implement Neural Networks (NNs) on hardware, such that they are efficient with respect to consumption of energy, memory, computational resources, and computation time. ...