Detecting Concept Drift in Deployed Machine Learning Models

How well do Margin Density-based concept drift detectors identify concept drift on synthetic and real-world data?

Abstract

When deployed in production, machine learning models sometimes lose accuracy over time because the distribution of the incoming data changes, so that the model no longer reflects reality. This loss of accuracy, caused by a shift in the underlying data distribution, is known as concept drift. Drift detectors are algorithms that signal when such a drift occurs; they are important because they tell us when a classification model has become inaccurate, and some can even be used to detect adversarial attacks on machine learning systems. The detectors discussed in this paper are Margin Density drift detectors. They are evaluated in an unsupervised context, where no test labels are assumed to be available; this is often the case in real-world applications of machine learning models, as obtaining labels is costly. The experiments in this paper show that Margin Density detectors can be useful for detecting the first drift in synthetic data, although parameter tuning is required to achieve high accuracy on some datasets. In an unsupervised environment with more than one drift, the detectors proved unreliable, as the experiments on real-world data showed. An implementation of Margin Density detectors accompanies this paper.
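To make the underlying idea concrete, below is a minimal sketch of a margin density detector in the spirit of the MD3 family: the fraction of unlabelled samples falling inside a linear classifier's margin is tracked, and drift is signalled when that fraction deviates from a reference value by more than a chosen number of standard deviations. The class name, window size, and sensitivity threshold are illustrative assumptions and this is not the implementation evaluated in the paper.

```python
import numpy as np
from sklearn.svm import LinearSVC


class MarginDensityDetector:
    """Sketch of an MD3-style unsupervised drift detector (binary tasks).

    Drift is signalled when the margin density of a window of unlabelled
    samples deviates from the reference density by more than
    `sensitivity` standard deviations.
    """

    def __init__(self, reference_X, reference_y, window=500, sensitivity=3.0):
        reference_X = np.asarray(reference_X)
        self.clf = LinearSVC().fit(reference_X, reference_y)
        self.window = window
        self.sensitivity = sensitivity
        # Estimate the reference margin density and its spread on chunks
        # of the labelled training data.
        n_chunks = max(1, len(reference_X) // window)
        densities = [self._margin_density(chunk)
                     for chunk in np.array_split(reference_X, n_chunks)]
        self.md_ref = float(np.mean(densities))
        self.md_std = float(np.std(densities)) or 1e-6  # avoid a zero threshold
        self._buffer = []

    def _margin_density(self, X):
        # Fraction of samples with |w.x + b| <= 1, i.e. inside the margin.
        return float(np.mean(np.abs(self.clf.decision_function(X)) <= 1.0))

    def update(self, x):
        """Feed one unlabelled sample; returns True when drift is signalled."""
        self._buffer.append(x)
        if len(self._buffer) < self.window:
            return False
        md = self._margin_density(np.asarray(self._buffer))
        self._buffer.clear()
        return abs(md - self.md_ref) > self.sensitivity * self.md_std
```

Checking the density over fixed-size windows rather than per sample trades some detection latency for stability; the window size and sensitivity are exactly the kind of parameters that, as the experiments suggest, need tuning per dataset.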