SeqClu-PV: An extension of online K-medoids to efficiently cluster sequences real-time

Abstract

Real-time sequence clustering is the problem of clustering an infinite stream of sequences in real time with limited memory. A variant of the k-medoids algorithm called SeqClu has been suggested as an approach to this problem. It represents each cluster by its p most representative sequences, called prototypes, which keeps the cluster representation high in quality over time while requiring little memory. However, the computational cost of this algorithm is considerable because it performs many distance computations using Dynamic Time Warping (DTW), a computationally expensive distance measure for sequences that has been shown to be robust to noise and delays. This paper therefore proposes an extension of SeqClu called SeqClu-PV, characterised by a decision-making mechanism for updating prototypes that improves the balance between the number of distance computations and the cost incurred due to incorrect clustering, and reviews the performance of the extension.
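
To make the cost argument concrete, the following minimal Python sketch shows a textbook DTW distance and a prototype-based cluster assignment. It is an illustration under stated assumptions, not the SeqClu or SeqClu-PV implementation: the function names `dtw_distance` and `assign_to_cluster`, the absolute difference as local cost, and the use of the average distance to a cluster's prototypes as the assignment score are all hypothetical choices made here. The prototype update mechanism of SeqClu-PV is not specified in the abstract and is not modelled.

```python
import math


def dtw_distance(a, b):
    """Textbook DTW between two numeric sequences (assumed local cost: |x - y|)."""
    n, m = len(a), len(b)
    # cost[i][j] holds the minimal cumulative cost of aligning a[:i] with b[:j].
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible predecessor alignments.
            cost[i][j] = local + min(
                cost[i - 1][j],      # advance in a only
                cost[i][j - 1],      # advance in b only
                cost[i - 1][j - 1],  # advance in both
            )
    return cost[n][m]


def assign_to_cluster(sequence, clusters):
    """Return (index, score) of the cluster whose prototypes are closest on average.

    `clusters` is a list of prototype lists. Averaging over prototypes is an
    assumption of this sketch, not necessarily the rule used by SeqClu.
    """
    best_index, best_score = None, math.inf
    for index, prototypes in enumerate(clusters):
        # One DTW computation per prototype: this is where the cost accumulates.
        score = sum(dtw_distance(sequence, p) for p in prototypes) / len(prototypes)
        if score < best_score:
            best_index, best_score = index, score
    return best_index, best_score


if __name__ == "__main__":
    clusters = [
        [[0, 0, 0], [0, 1, 0]],  # prototypes of a hypothetical cluster 0
        [[5, 5, 5], [5, 6, 5]],  # prototypes of a hypothetical cluster 1
    ]
    print(assign_to_cluster([5, 5, 6, 5], clusters))
```

Each call to `dtw_distance` takes time proportional to the product of the two sequence lengths, and assigning one incoming sequence requires one such call per prototype per cluster. This is why the number of distance computations dominates the running time of a prototype-based approach.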