Print Email Facebook Twitter Accelerating Cluster Assignment for SeqClu Title Accelerating Cluster Assignment for SeqClu Author Al-Obaidi, Rami (TU Delft Electrical Engineering, Mathematics and Computer Science; TU Delft Intelligent Systems) Contributor Nadeem, A. (mentor) Verwer, S.E. (mentor) Migut, M.A. (graduation committee) Degree granting institution Delft University of Technology Programme Computer Science and Engineering Project CSE3000 Research Project Date 2021-07-01 Abstract Clustering is a group of (unsupervised) machine learning algorithms used to categorize data into clusters. The most popular clustering algorithm is k-means clustering. K-means clustering clusters the data into k clusters where a cluster is represented by the mean of the data points called a centroid. Instead of using the mean as a centroid, a data point (medoid) can be used instead. This algorithm is called k-medoids algorithm. Both the algorithms work in an offline setting where all the data is known beforehand and usually use Euclidean distance to calculate the distance between any two points. SeqClu is another clustering algorithm for sequential data that works in an online setting and uses Dynamic Time Warping as its distance measure. It is based on k-medoids where it uses p sequences called prototypes, to represent a cluster. It assigns an incoming sequence to the cluster that has the lowest average distance between its prototypes and the incoming sequence. The issue in this approach is that many Dynamic Time Warping distance calculations need to be made which affects the clustering speed and using the average distance affects the clustering accuracy due to outliers being assigned as prototypes. This paper proposes an alternative algorithm with three variants for the cluster assignment process. This algorithm iterates through the prototypes in search for the closest prototype while excluding clusters that are deemed too far. It assigns the incoming sequence to the cluster of the closest prototype that it has found. Experiments on the UJI Pen Characters and UCR Synthetic Control datasets show an improvement in clustering speed and accuracy. Subject machine learningclusteringSequence Clusteringonline clustering algorithmsk-medoids To reference this document use: http://resolver.tudelft.nl/uuid:7ead7330-fe8a-4b94-a6cf-8f752083ecba Part of collection Student theses Document type bachelor thesis Rights © 2021 Rami Al-Obaidi Files PDF Accelerating_Cluster_Assi ... SeqClu.pdf 1.03 MB Close viewer /islandora/object/uuid:7ead7330-fe8a-4b94-a6cf-8f752083ecba/datastream/OBJ/view