Title: How Well do Clustering Similarities-Based Concept Drift Detectors Identify Drift in case of Synthetic/Real-World Data?
Author: Pohl, Jindřich (TU Delft Electrical Engineering, Mathematics and Computer Science)
Contributor: Poenaru-Olaru, L. (mentor); Rellermeyer, Jan S. (mentor); Krijthe, J.H. (graduation committee)
Degree granting institution: Delft University of Technology
Programme: Computer Science and Engineering
Project: CSE3000 Research Project
Date: 2023-02-03

Abstract: Concept drift is an unforeseeable change in the underlying distribution of streaming data. After such a change, classifiers deployed on that data lose accuracy. Concept drift detectors are algorithms that detect this kind of drift. Unsupervised detectors do so without needing the data's actual labels, which can be expensive to obtain. This work implements two existing unsupervised, clustering-based concept drift detectors, UCDD and MSSW, and evaluates them on both synthetic and real-world data. Our biggest contribution is making the implementations publicly available. The evaluation also shows that UCDD detects drift earlier on simple numerical synthetic datasets, that MSSW detects drift earlier on more complex synthetic datasets with categorical features, and that neither seems suitable for real-world datasets.

To reference this document use: http://resolver.tudelft.nl/uuid:c5a19eff-04d4-4ab8-90cd-8338367898a5
Part of collection: Student theses
Document type: bachelor thesis
Rights: © 2023 Jindřich Pohl
Files: CSE3000_Paper.pdf (PDF, 440.29 KB)