Neural networks for non-contact oxygen saturation estimation from the face
J.J.M. Kok (TU Delft - Electrical Engineering, Mathematics and Computer Science)
J.C. Gemert – Mentor (TU Delft - Pattern Recognition and Bioinformatics)
M. Bittner – Graduation committee member (TU Delft - Biomechatronics & Human-Machine Control)
Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.
Abstract
COVID-19 has sharply increased the importance of non-contact healthcare methods. A low blood oxygen level, which can go unnoticed, is potentially a precursor of COVID-19. Contact-based methods for measuring blood oxygen saturation could spread the contagious disease. Therefore, this paper investigates non-contact, RGB camera-based estimation of peripheral oxygen saturation (SpO$_2$) using remote photoplethysmography (rPPG) methods. The novel aspects of non-contact oxygen saturation estimation that we examine are: (1) Applying SpO$_2$ predictor neural networks to rPPG signals obtained from facial regions instead of the less practical hand-based skin regions. More specifically, we show in a face-based setting that the traditional Ratio-of-Ratios (RoR) pulse oximetry principles fail in a relatively uncontrolled environment. In the leave-one-participant-out experiments, the RoR method achieved a correlation of $-0.05$, whereas neural networks proved capable of dealing with the inherent challenges of the PURE dataset, achieving a superior correlation of $0.64$. These challenges are lighting variations caused by subtle head motion and by clouds intermittently blocking the sun. (2) Introducing the first end-to-end neural networks for SpO$_2$ estimation by replacing traditional hard pixel region-of-interest selectors, which assign equal weight to each selected pixel, with convolutional soft-attention masks. (3) Using an adapted version of a recent heart and breathing rate estimator network, DeepPhys, to indicate that the current state of the art is far from optimal. This is done by comparing the window-based end-to-end neural networks with Adapted DeepPhys, which is based on single-frame differences. Finally, our research (code available at https://github.com/jimkok9/oxygenSaturation) shows that non-contact, face-based SpO$_2$ estimation by RGB camera remains a difficult task.
However, as our results indicate, more sophisticated deep learning models might become a viable diagnostic tool for this task in the future.
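For readers unfamiliar with the Ratio-of-Ratios principle mentioned in the abstract, the following is a minimal sketch of the idea and not the implementation used in this work. It assumes two colour-channel traces (red and blue here, a common choice for RGB cameras, though the abstract does not name the channels), uses the standard deviation as the pulsatile (AC) component and the mean as the baseline (DC) component, and maps the ratio to SpO$_2$ with a linear calibration. The function names and the calibration constants `a` and `b` are hypothetical placeholders that would have to be fitted against a reference oximeter.

```python
import numpy as np

def ratio_of_ratios(red, blue):
    """Return (AC_red / DC_red) / (AC_blue / DC_blue) for two channel traces.

    Assumption: AC is approximated by the standard deviation and DC by the
    mean of each trace over the analysis window.
    """
    red = np.asarray(red, dtype=float)
    blue = np.asarray(blue, dtype=float)
    ac_red, dc_red = red.std(), red.mean()
    ac_blue, dc_blue = blue.std(), blue.mean()
    return (ac_red / dc_red) / (ac_blue / dc_blue)

def estimate_spo2(red, blue, a=100.0, b=5.0):
    """Linear calibration SpO2 = a - b * R; a and b are placeholder values."""
    return a - b * ratio_of_ratios(red, blue)

# Synthetic example: a 1.2 Hz pulse sampled at 30 frames per second for 10 s.
t = np.linspace(0.0, 10.0, 300, endpoint=False)
red_trace = 100.0 + 1.0 * np.sin(2 * np.pi * 1.2 * t)
blue_trace = 50.0 + 1.0 * np.sin(2 * np.pi * 1.2 * t)
ratio = ratio_of_ratios(red_trace, blue_trace)
spo2 = estimate_spo2(red_trace, blue_trace)
```

Because both AC and DC come directly from raw pixel intensities, any lighting change that affects the channels unequally shifts the ratio, which is consistent with the failure of RoR reported above for an uncontrolled setting.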
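The difference between a hard region-of-interest selector and a soft-attention mask, as contrasted in point (2) of the abstract, can be illustrated with a small sketch. This is only an assumption-laden illustration: in the networks described in this work the attention mask is produced by convolutional layers and learned end to end, whereas here the per-pixel scores are simply passed in, and the softmax normalisation and the function names are hypothetical choices.

```python
import numpy as np

def hard_roi_pool(frame, mask):
    """Average the RGB values of the pixels selected by a binary mask.

    Every selected pixel contributes with equal weight.
    frame: array of shape (H, W, 3); mask: array of shape (H, W).
    """
    frame = np.asarray(frame, dtype=float)
    mask = np.asarray(mask, dtype=bool)
    return frame[mask].mean(axis=0)

def soft_attention_pool(frame, scores):
    """Weighted RGB average using a softmax over per-pixel scores.

    Assumption: scores stand in for the output of a learned attention
    branch; higher scores give a pixel more influence on the result.
    """
    frame = np.asarray(frame, dtype=float)
    scores = np.asarray(scores, dtype=float)
    weights = np.exp(scores - scores.max())  # subtract max for stability
    weights /= weights.sum()
    return (frame * weights[..., None]).sum(axis=(0, 1))

# Tiny 2x2 frame: the top row holds skin-like pixels, the bottom row is black.
frame = np.zeros((2, 2, 3))
frame[0, 0] = [10.0, 20.0, 30.0]
frame[0, 1] = [30.0, 40.0, 50.0]

hard = hard_roi_pool(frame, [[1, 1], [0, 0]])
soft_uniform = soft_attention_pool(frame, np.zeros((2, 2)))
soft_peaked = soft_attention_pool(frame, [[0.0, 0.0], [-1e9, -1e9]])
```

With uniform scores the soft pooling reduces to a plain average over the whole frame, and with scores that strongly favour the top row it reproduces the hard mask; a learned mask can take any value in between, so pixels need not be treated as either fully in or fully out of the region.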