Extending AIfES with Depthwise Convolution

Implementation and Evaluation of Depthwise Convolution on Microcontrollers

Bachelor Thesis (2026)
Author(s)

Z.I.K. Krassenburg (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

H. Liu – Mentor (TU Delft - Electrical Engineering, Mathematics and Computer Science)

M.A. Zuñiga Zamalloa – Mentor (TU Delft - Electrical Engineering, Mathematics and Computer Science)

J.M. Weber – Graduation committee member (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Faculty
Electrical Engineering, Mathematics and Computer Science
Publication Year
2026
Language
English
Graduation Date
23-06-2026
Awarding Institution
Delft University of Technology
Project
CSE3000 Research Project
Programme
Computer Science and Engineering
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Artificial intelligence (AI) is increasingly used in everyday devices. However, most AI systems are designed to run on powerful computers or cloud servers rather than on small, low-power devices such as microcontrollers. Running AI directly on these devices can reduce energy consumption and lets systems operate without an internet connection. AIfES (Artificial Intelligence for Embedded Systems) is a machine learning framework that allows neural networks to be trained directly on microcontrollers. However, it currently lacks support for depthwise convolution, an important operation in efficient neural network architectures such as MobileNet. As a result, many modern computer vision models cannot be trained within the framework.

This project extends AIfES with support for depthwise convolution and integrates the new operator into the existing training pipeline. The implementation was validated with manually verified test cases, comparisons with TensorFlow, and image classification experiments on embedded hardware. The results show that the new operator functions correctly during both inference and training. Models containing the new layer successfully learned classification tasks and behaved similarly to equivalent TensorFlow models. By adding support for depthwise convolution, this work expands the range of neural network architectures that can be trained directly on microcontrollers and helps make on-device AI more practical and flexible.

Files

Research_paper_cse3000.pdf
(pdf | 0.657 MB)
License info not available