An Area and Energy Efficient Arithmetic Unit for Stacked Machine Learning Models

Mo Model Mo Problems Like... Hardware Design Problems


Abstract

Machine learning on edge devices performs crucial identification or prediction tasks while limiting the amount of data that needs to be transmitted to more centralized computing nodes. However, strict area and energy requirements necessitate specialized hardware developed for the requirements of the device and model. This thesis is concerned with developing an area- and energy-efficient arithmetic unit as part of the implementation of a stacked machine learning model in embedded automotive devices. The model in question was previously designed to perform lifetime prediction with the goal of improving the reliability of semiconductor devices used in various automotive applications.

This thesis aims to achieve area and energy efficiency by exploiting the commonalities in the arithmetic operations of several of the internal learners of the stacked machine learning model. The use of a weighted figure of merit, taking into account area, energy and delay, allows for simple comparisons of designs at any operating frequency and gives easy insight into how the merit of designs would change if device requirements were to change. A sweep of the percentage of multiplications in the workload also gave insight into how design choices may change due to future redesigns of the stacked machine learning model.
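The abstract does not state the exact form of the weighted figure of merit; a common choice for combining area, energy and delay is a weighted product, where the weight exponents encode the relative importance of each metric. The function below is an illustrative sketch under that assumption, with hypothetical units and default weights; it is not the thesis's actual formula.

```python
def figure_of_merit(area_um2, energy_pj, delay_ns,
                    w_area=1.0, w_energy=1.0, w_delay=1.0):
    """Weighted-product figure of merit: lower is better.

    The product form, units, and default weights are illustrative
    assumptions, not taken from the thesis. Raising a weight makes
    the corresponding metric dominate the comparison.
    """
    return (area_um2 ** w_area) * (energy_pj ** w_energy) * (delay_ns ** w_delay)
```

With such a form, re-weighting (e.g. lowering `w_delay` for a low-frequency device) re-ranks candidate designs without re-running any synthesis, which matches the stated goal of easy insight into changing device requirements.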

It was found that the MAC (multiply-accumulate), multiply, divide and accumulate operations of the internal learners can best be supported by a single arithmetic unit containing a "Reduced Area" parallel multiplier (which still occupies most of the area), a small dedicated accumulator, and invariant integer division performed via the multiplier. It was also found that the ability to reconfigure the multiplier for different levels of bit-precision does not yield a performance improvement for the expected precision distribution.
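"Invariant integer division using the multiplier" typically refers to replacing division by a fixed (invariant) divisor with a multiply and a shift, in the style of Granlund and Montgomery: precompute a magic multiplier m and shift s once per divisor, then each division costs only one multiplication. The sketch below illustrates that general technique for unsigned 16-bit operands; the bit widths, constant-selection rule, and function names are assumptions for illustration, not the thesis's exact hardware design.

```python
def invariant_div_constants(d, n_bits=16):
    """Precompute (m, s) so that (x * m) >> s == x // d
    for all 0 <= x < 2**n_bits. d >= 1 is the invariant divisor.

    Uses s = n + ceil(log2 d) and m = floor(2**s / d) + 1, which
    satisfies the Granlund-Montgomery correctness condition
    2**s <= m*d <= 2**s + 2**(s - n_bits).
    """
    s = n_bits + (d - 1).bit_length()  # (d-1).bit_length() == ceil(log2 d)
    m = (1 << s) // d + 1
    return m, s

def invariant_div(x, m, s):
    # One multiply plus a shift replaces the divide; in hardware this
    # reuses the existing parallel multiplier instead of a divider.
    return (x * m) >> s
```

Because the constants depend only on the divisor, this maps naturally onto a unit whose area is dominated by the multiplier: divisions by model constants reuse that multiplier rather than requiring a dedicated divider.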