Impact of Considering Artificial Worst-Case Scenarios Within Clustering Algorithms
A case study through three newly adapted clustering algorithms
R.P.L. Novosel (TU Delft - Electrical Engineering, Mathematics and Computer Science)
G.A. Morales España – Mentor (TU Delft - Electrical Engineering, Mathematics and Computer Science)
M.B. Elgersma – Mentor (TU Delft - Electrical Engineering, Mathematics and Computer Science)
J.A. Baaijens – Graduation committee member (TU Delft - Electrical Engineering, Mathematics and Computer Science)
Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.
Abstract
Planning a long-term energy system relies on models that simulate system operation over many years at an hourly level, which is computationally expensive. A common remedy is temporal aggregation: grouping similar time periods and representing each group by one typical period, which shrinks the dataset the model must process. This speeds up the computation but tends to average away rare yet demanding conditions, such as days with high energy demand and low energy availability. These extreme periods, however, often determine how much capacity the system requires. This paper introduces three adaptations of widely used clustering algorithms that deliberately embed synthetic worst-case periods into the clustering process, so that the representative periods do not ignore the most demanding conditions. We evaluate them against four standard baselines: K-Means, K-Medoids, worst-case K-Medoids (K-Medoids WC), and Hull clustering. For each method, we measure how far its investment decisions deviate from those of a benchmark model that uses the full, unaggregated data, a gap we call relative regret. The standard methods often require a large number of representative periods to approach the benchmark, whereas the proposed worst-case method WCA-K-Means reaches near-benchmark decisions with far fewer periods. By capturing the conditions that drive capacity needs without partitioning the data into excessive detail, it represents a full year with a much smaller dataset, giving planners results that closely match a full-resolution model while substantially reducing the computational cost of solving the energy model.
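The averaging effect described in the abstract can be illustrated with a small sketch. This is not the WCA-K-Means method from the thesis, whose details are not given here. It is a plain K-Means on toy daily profiles, plus one assumed way to build a synthetic worst-case period (hourly maximum demand paired with hourly minimum availability). The function names, the toy data, and the relative-regret formula (cost gap divided by benchmark cost) are all illustrative assumptions.

```python
import random


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's K-Means on lists of floats; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each period to its nearest centroid (squared Euclidean distance).
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Move each centroid to the mean of its members; keep it if the cluster is empty.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels


def synthetic_worst_case(demand, availability):
    """Assumed construction: hourly max demand joined with hourly min availability."""
    worst_demand = [max(col) for col in zip(*demand)]
    worst_avail = [min(col) for col in zip(*availability)]
    return worst_demand + worst_avail


def relative_regret(cost_aggregated, cost_benchmark):
    """Assumed definition: cost gap to the full-data benchmark, as a fraction."""
    return (cost_aggregated - cost_benchmark) / cost_benchmark


# Toy data: 4 days x 2 hours (hypothetical values, not from the thesis).
demand = [[1.0, 2.0], [1.2, 2.1], [3.0, 4.0], [3.1, 4.2]]
availability = [[0.8, 0.9], [0.7, 0.9], [0.2, 0.3], [0.1, 0.4]]
n_hours = len(demand[0])

# Each day becomes one feature vector: demand hours followed by availability hours.
days = [d + a for d, a in zip(demand, availability)]
centroids, labels = kmeans(days, k=2)

# Peak demand seen by the model when it only uses the cluster centroids.
centroid_peak = max(max(c[:n_hours]) for c in centroids)
# Peak demand in the full data, which the synthetic worst-case period preserves.
true_peak = max(max(d) for d in demand)
worst = synthetic_worst_case(demand, availability)

# One assumed way to use it: keep the worst case as an extra representative period.
representatives = centroids + [worst]
```

Because each centroid is a mean of its members, `centroid_peak` cannot exceed `true_peak`, while the first `n_hours` entries of `worst` retain the hourly maxima. Appending `worst` to the representatives is only one possible way to keep the extreme condition visible to the investment model.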