µLightDigit: A TinyML System for Contactless Digit Recognition using Ambient Light


Abstract

This thesis describes the design and implementation of µLightDigit, the second iteration of the LightDigit project: a contactless air-writing system based on ambient light detection with embedded deep learning. The system classifies digits 0–9 written in the air by detecting the dynamic hand shadow with a 3×3 array of simple photodiodes. A novel TinyML system is developed by moving LightDigit from a Raspberry Pi 4 single-board computer to an STM32H743 microcontroller. This transition comes with a significant reduction in computational power, while the goal is to create a more robust and adaptable system. To achieve this, a new photodiode array is designed to overcome the saturation issues of the photodiode array developed for LightDigit. A preliminary analysis investigates the performance of three deep learning models on the microcontroller: a tiny convolutional neural network (CNN), a tiny long short-term memory (LSTM) model, and a tiny ConvLSTM hybrid. The tiny CNN and ConvLSTM models performed best in the preliminary evaluation and are chosen for the final system evaluation. To test the system’s robustness, three different indoor setups and one outdoor setup are created. These setups are designed to be reproducible and serve to verify the system under different ambient lighting conditions and light intensities. Shadow patterns are recorded for each setup and compared to explain the digit recognition results. The tiny CNN model achieves an average accuracy of 84.6%, with a maximum of 97.0% and a minimum of 77.8%, across the four test setups. The tiny ConvLSTM model achieves an average accuracy of 82.4%, with a maximum of 89.3% and a minimum of 72.4%. The tiny ConvLSTM model outperforms the CNN model in the outdoor setup but is outclassed by the CNN in all indoor setups.
The tiny CNN is the lightest model, with 138k parameters (147 kB) and an inference time of 792 ms. The tiny ConvLSTM model is larger, with 446k parameters (494 kB) and an inference time of 947 ms. On the STM32H743 microcontroller, both models complete inference in under 1 second, which is fast enough for real-time applications.
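
The size and latency figures above can be cross-checked with a little arithmetic. The sketch below is illustrative only: it uses the numbers quoted in the abstract, and it assumes that 1 kB means 1000 bytes, which the abstract does not specify. The names `MODELS`, `bytes_per_parameter`, and `summarize` are hypothetical helpers, not part of the thesis code.

```python
# Figures quoted in the abstract (parameter count, model size, inference time).
MODELS = {
    "tiny CNN": {"params": 138_000, "size_kb": 147, "latency_ms": 792},
    "tiny ConvLSTM": {"params": 446_000, "size_kb": 494, "latency_ms": 947},
}

# Assumption: the abstract does not say whether kB means 1000 or 1024 bytes.
BYTES_PER_KB = 1000


def bytes_per_parameter(params, size_kb):
    """Average storage per parameter implied by the reported model size."""
    return size_kb * BYTES_PER_KB / params


def summarize(models):
    """Derive bytes per parameter and inferences per second for each model."""
    rows = {}
    for name, m in models.items():
        rows[name] = {
            "bytes_per_param": round(
                bytes_per_parameter(m["params"], m["size_kb"]), 2
            ),
            "inferences_per_s": round(1000 / m["latency_ms"], 2),
        }
    return rows


if __name__ == "__main__":
    for name, row in summarize(MODELS).items():
        print(name, row)
```

Under these assumptions both models come out at roughly one byte per parameter, which would be consistent with 8-bit weights plus some overhead, although the abstract does not state the storage format. Both models also complete slightly more than one inference per second.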

Files

Thesis_Koen_Goedemondt_uLightD... (pdf)
(pdf | 29.7 Mb)
- Embargo expired on 01-11-2023
Unknown license