Neural network partitioning for resource-limited environments

Demand for deploying neural networks on edge devices has grown rapidly, as doing so allows designers to move away from expensive server-grade hardware. However, the limited resources of edge devices make it challenging to implement complex neural networks. This study selected the Kria KV260 SoM hardware platform for its affordability and for hardware capabilities sufficient to create a representative resource-constrained environment. By leveraging the FPGA's hardware acceleration for specific nodes of the MobileNetV1 model and offloading the remaining nodes to the onboard quad-core ARM Cortex-A53 CPU, the network was implemented on a hybrid combination of CPU and FPGA. Results show that executing MobileNetV1 in this hybrid configuration achieves a 2.8x total-runtime improvement over a pure CPU implementation. The study concludes that node-wise partitioning of the MobileNetV1 model is a practical solution, offering a cost-effective option for users who want an accessible way to run neural networks without expensive server-grade hardware.
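The node-wise partitioning described above can be sketched as follows. This is a minimal illustrative model, not the study's actual toolchain: the op names, the set of FPGA-supported ops, and all latency numbers are assumptions invented for the example. Each node is assigned to the FPGA if its operation type is supported by the accelerator, and to the CPU otherwise; the hybrid runtime is then the sum of per-node latencies on the assigned devices.

```python
from dataclasses import dataclass

@dataclass
class Node:
    name: str
    op: str
    cpu_ms: float   # illustrative latency on the CPU (made-up number)
    fpga_ms: float  # illustrative latency on the FPGA accelerator

# Hypothetical set of ops the FPGA accelerator can execute
FPGA_OPS = {"conv2d", "depthwise_conv2d"}

def partition(nodes):
    """Node-wise partitioning: supported ops go to the FPGA, the rest to the CPU."""
    return [(n, "fpga" if n.op in FPGA_OPS else "cpu") for n in nodes]

def runtimes(nodes):
    """Return (CPU-only runtime, hybrid CPU+FPGA runtime) in milliseconds."""
    plan = partition(nodes)
    cpu_only = sum(n.cpu_ms for n in nodes)
    hybrid = sum(n.fpga_ms if dev == "fpga" else n.cpu_ms for n, dev in plan)
    return cpu_only, hybrid

# Toy MobileNetV1-like slice with invented latencies
nodes = [
    Node("conv1", "conv2d", 12.0, 3.0),
    Node("dw1", "depthwise_conv2d", 8.0, 2.0),
    Node("softmax", "softmax", 1.0, float("inf")),  # unsupported op stays on CPU
]
cpu_only, hybrid = runtimes(nodes)
print(f"speedup: {cpu_only / hybrid:.2f}x")
```

With these invented latencies the hybrid plan keeps only the unsupported softmax on the CPU; in the real system, the measured per-node latencies on the KV260 would determine the achievable speedup.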