Detecting Concept Drift in Deployed Machine Learning Models

How well do Margin Density-based concept drift detectors identify concept drift on synthetic and real-world data?

Abstract

When deployed in production, machine learning models sometimes lose accuracy over time because the distribution of the incoming data changes, so that the model no longer reflects reality. This loss of accuracy, caused by a shift in the underlying data distribution, is known as concept drift. Drift detectors are algorithms that signal when such a drift occurs; they are important because they tell us when a classification model has become inaccurate, and some can even be used to detect adversarial attacks on machine learning systems. The detectors discussed in this paper are Margin Density drift detectors. They are evaluated in an unsupervised context, where no test labels are assumed to be available; this is often the case in real-world applications of machine learning models, as obtaining labels is costly. The experiments in this paper show that Margin Density detectors can be useful for detecting the first drift in synthetic data, although parameter tuning is required to achieve high accuracy on some datasets. In an unsupervised environment with more than one drift, the detectors proved unreliable, as the experiments on real-world data showed. An implementation of Margin Density detectors accompanies this paper.
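To make the underlying idea concrete, below is a minimal sketch of a margin density detector in the spirit of the MD3 family: the fraction of unlabelled samples falling inside a linear classifier's margin is tracked, and drift is signalled when that fraction deviates from a reference value by more than a chosen number of standard deviations. The class name, window size, and sensitivity threshold are illustrative assumptions and this is not the implementation evaluated in the paper.

```python
import numpy as np
from sklearn.svm import LinearSVC


class MarginDensityDetector:
    """Sketch of an MD3-style unsupervised drift detector (binary tasks).

    Drift is signalled when the margin density of a window of unlabelled
    samples deviates from the reference density by more than
    `sensitivity` standard deviations.
    """

    def __init__(self, reference_X, reference_y, window=500, sensitivity=3.0):
        reference_X = np.asarray(reference_X)
        self.clf = LinearSVC().fit(reference_X, reference_y)
        self.window = window
        self.sensitivity = sensitivity
        # Estimate the reference margin density and its spread on chunks
        # of the labelled training data.
        n_chunks = max(1, len(reference_X) // window)
        densities = [self._margin_density(chunk)
                     for chunk in np.array_split(reference_X, n_chunks)]
        self.md_ref = float(np.mean(densities))
        self.md_std = float(np.std(densities)) or 1e-6  # avoid a zero threshold
        self._buffer = []

    def _margin_density(self, X):
        # Fraction of samples with |w.x + b| <= 1, i.e. inside the margin.
        return float(np.mean(np.abs(self.clf.decision_function(X)) <= 1.0))

    def update(self, x):
        """Feed one unlabelled sample; returns True when drift is signalled."""
        self._buffer.append(x)
        if len(self._buffer) < self.window:
            return False
        md = self._margin_density(np.asarray(self._buffer))
        self._buffer.clear()
        return abs(md - self.md_ref) > self.sensitivity * self.md_std
```

Checking the density over fixed-size windows rather than per sample trades some detection latency for stability; the window size and sensitivity are exactly the kind of parameters that, as the experiments suggest, need tuning per dataset.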