Personalised prediction of skill development and retention using XGBoost and SHAP


Abstract

Predicting individual skill retention, the extent to which human operators retain learned skills over time, is limited by the lengthy experiments it requires and by the difficulty of identifying patterns in high-dimensional data. Machine learning could process these data, find such patterns, and yield a regression-based prediction of skill retention. This paper investigates the use of an Extreme Gradient Boosting (XGBoost) technique, trained on a dataset originating from a skill-based tracking task, for predicting a high-resolution individual skill retention curve. The training dataset is divided into feature classes and analyzed with SHapley Additive exPlanations (SHAP) to identify robust predictors. In addition, the proposed XGBoost model is trained separately on the experiment data and on synthetic data generated from the properties of the experiment data. Both models are evaluated on the experiment data. The synthetic data, unlike the experiment data, allows the model to capture individual skill retention curves on a day-to-day basis, resulting in a 21% prediction improvement for the learning curve data. Generalizing this XGBoost model to more types of tasks remains challenging, since experiments that gather concise task data are uncommon and (complex) skill experiments usually demand long time intervals. Nevertheless, this research shows that the experiment data of a skill-based tracking task can be used to predict individual skill decay curves, which could in the future help improve the retraining schedules of individuals.

Files

2022_BarryvanLeeuwen_MScThesis... (.pdf)

File under embargo until 31-01-2027