Table tennis tutor

Forehand strokes classification based on multimodal data and neural networks

Journal Article (2021)
Author(s)

Khaleel Asyraaf Sanusi (Cologne Game Lab)

Daniele Di Mitri (DIPF - Leibniz Institute for Research and Information in Education)

Bibeg Hang Limbu (TU Delft - Web Information Systems)

Roland Klemke (Open University of the Netherlands, Cologne Game Lab)

Research Group
Web Information Systems
Copyright
© 2021 Khaleel Asyraaf Mat Sanusi, Daniele Di Mitri, B.H. Limbu, Roland Klemke
DOI related publication
https://doi.org/10.3390/s21093121
Publication Year
2021
Language
English
Issue number
9
Volume number
21
Pages (from-to)
1-18
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Beginner table tennis players require constant real-time feedback while learning the fundamental techniques. However, due to constraints such as the mentor’s limited availability and the cost of sensors and equipment for sports training, beginners are unable to get the immediate feedback they need during training. Sensors have been widely used to train beginners and novices in a range of skills, including psychomotor skills. Sensors enable the collection of multimodal data, which can be used with machine learning to classify training mistakes, give feedback, and further improve learning outcomes. In this paper, we introduce the Table Tennis Tutor (T3), a multi-sensor system consisting of a smartphone with its built-in sensors for collecting motion data and a Microsoft Kinect for tracking body position. We focused on detecting mistakes in the forehand stroke. We collected a dataset by recording an experienced table tennis player performing 260 short forehand strokes (correct) and mimicking 250 long forehand strokes (mistake). We analysed and annotated the multimodal data to train a recurrent neural network that classifies strokes as correct or incorrect. To investigate the accuracy of the aforementioned sensors, three combinations were validated in this study: smartphone sensors only, the Kinect only, and both devices combined. The results show that smartphone sensors alone perform worse than the Kinect, but that combining them with the Kinect yields similar performance with better precision. To further assess T3’s potential for training, an expert interview was held virtually with a table tennis coach to investigate the coach’s perception of a real-time feedback system that assists beginners during training sessions. The interview revealed positive expectations and provided further input that can benefit future implementations of T3.
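As an illustration of the three-way sensor comparison described in the abstract, the following minimal sketch shows how one recorded stroke could be restricted to a sensor set before classification, and how precision differs from accuracy. It is not the authors' implementation: the channel names, the `select_channels` and `precision_and_accuracy` helpers, and the convention that label 1 marks a correct stroke are all assumptions made for this example, and the recurrent neural network itself is omitted.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical channel names: the abstract does not list the actual channels.
SMARTPHONE = ["acc_x", "acc_y", "acc_z", "gyro_x", "gyro_y", "gyro_z"]
KINECT = ["wrist_x", "wrist_y", "wrist_z", "elbow_x", "elbow_y", "elbow_z"]

# The three combinations compared in the study.
SENSOR_SETS: Dict[str, List[str]] = {
    "smartphone": SMARTPHONE,
    "kinect": KINECT,
    "combined": SMARTPHONE + KINECT,
}

Frame = Dict[str, float]  # one time step: channel name -> reading


def select_channels(stroke: Sequence[Frame], sensor_set: str) -> List[List[float]]:
    """Return a time-by-channel matrix for one stroke, limited to a sensor set."""
    channels = SENSOR_SETS[sensor_set]
    return [[frame[name] for name in channels] for frame in stroke]


def precision_and_accuracy(
    y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1
) -> Tuple[float, float]:
    """Compute precision for the positive class and overall accuracy."""
    pairs = list(zip(y_true, y_pred))
    true_pos = sum(1 for t, p in pairs if p == positive and t == positive)
    false_pos = sum(1 for t, p in pairs if p == positive and t != positive)
    correct = sum(1 for t, p in pairs if t == p)
    predicted_pos = true_pos + false_pos
    precision = true_pos / predicted_pos if predicted_pos else 0.0
    accuracy = correct / len(pairs) if pairs else 0.0
    return precision, accuracy
```

Each matrix returned by `select_channels` would then be passed to a sequence classifier trained separately for that sensor set. Reporting precision alongside accuracy matters here because two configurations can classify a similar share of strokes correctly while differing in how often a stroke flagged as positive really is positive.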