Predicting Injury Risk with Machine Learning Methods using a Longitudinal Data Set

Sports injuries have long been a concern because of their effect on athletes' performance, finances, and psychological well-being. The concern has grown in recent years as sports have become more widely accessible to the general population. Reducing the chance of injury not only allows athletes to maintain optimal performance during training and competition but also benefits them psychologically. Amsterdam UMC has gathered weekly OSTRC data from 19 water polo athletes over 109 weeks. With this data, it is possible to build a model that indicates possible injury risk, helping athletes to manage and prevent injuries. Traditionally, statistical models are used for this kind of analysis. However, they often require a large amount of time and a priori knowledge. Hence, this study proposes an injury risk classifier based on longitudinal data (OSTRC) and machine learning (ML) algorithms. The ML algorithms chosen for this study are an artificial neural network (ANN), long short-term memory (LSTM), and Random Forest (RF). To investigate the importance of the time dependency between data entries, sliding window (SW) and forward chaining (FC) are used during data processing. All models are trained with minimally processed data, sliding window data, and forward chaining data to investigate the impact of time dependency on model accuracy. The OSTRC data set is pre-processed, the SW and FC methods are applied, and the data are then split into training and testing sets. The metrics used to assess model performance are accuracy and the confusion matrix. The results show that RF-SW and LSTM-SW achieved accuracies of 92.17% and 91.67%, respectively. The accuracy of the ANN on the training data indicates adequate performance, but its confusion matrix indicates poor performance. The confusion matrices of RF-SW and LSTM-SW show small variations due to differences in the test data sets, which can be mitigated as more data become available.
In conclusion, the high accuracy of both RF-SW and LSTM-SW indicates that these models can be used to provide insight to athletes, helping them to reduce the chance of injury.
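The abstract does not specify how the sliding window and forward chaining steps were implemented, so the following is only a minimal sketch of one common reading of the two ideas. It assumes that each athlete's record is a weekly series of a single score with a binary injury label, that a sliding window pairs the previous `width` weeks with the label of the following week, and that forward chaining means expanding-window train/test splits in which the model is always tested on weeks later than those it was trained on. The function names, the window width, and the toy values are hypothetical and are not taken from the study.

```python
from typing import List, Sequence, Tuple


def sliding_windows(
    scores: Sequence[float], labels: Sequence[int], width: int
) -> Tuple[List[List[float]], List[int]]:
    """Turn one weekly series into (window, following-week label) pairs."""
    features, targets = [], []
    for end in range(width, len(scores)):
        features.append(list(scores[end - width:end]))  # the previous `width` weeks
        targets.append(labels[end])                      # label of the week after the window
    return features, targets


def forward_chaining_splits(
    n_samples: int, n_folds: int
) -> List[Tuple[List[int], List[int]]]:
    """Expanding-window splits: train on everything before fold k, test on fold k."""
    fold = n_samples // (n_folds + 1)  # samples left over at the end are not used
    splits = []
    for k in range(1, n_folds + 1):
        train_idx = list(range(0, k * fold))
        test_idx = list(range(k * fold, (k + 1) * fold))
        splits.append((train_idx, test_idx))
    return splits


# Toy series for a single hypothetical athlete (values are illustrative only).
weekly_scores = [0, 0, 12, 30, 8, 0, 0, 22, 40, 5]
weekly_injured = [0, 0, 0, 1, 0, 0, 0, 0, 1, 0]

X, y = sliding_windows(weekly_scores, weekly_injured, width=3)
splits = forward_chaining_splits(len(X), n_folds=2)

for train_idx, test_idx in splits:
    print("train on samples", train_idx, "-> test on samples", test_idx)
```

In this sketch every test index is larger than every training index, so no split evaluates a model on weeks that precede its training data. With several athletes, the windows would be built per athlete before pooling so that no window spans two different athletes.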