Meta-learning the Best Caching Expert

Tuning caching policies with expert advice

Bachelor Thesis (2024)
Authors

M.J.A. de Vries (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Supervisors

George Iosifidis (TU Delft - Networked Systems)

Naram Mhaisen (TU Delft - Networked Systems)

Faculty
Electrical Engineering, Mathematics and Computer Science
Publication Year
2024
Language
English
Graduation Date
21-06-2024
Awarding Institution
Delft University of Technology
Project
CSE3000 Research Project
Programme
Computer Science and Engineering
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

In recent years, the framing of the caching problem as an Online Convex Optimisation (OCO) problem has led to the introduction of several online caching policies. These policies attain order-optimal regret bounds for any arbitrary request pattern, including adversarial ones. Nevertheless, their performance depends heavily on the tuning of their so-called hyperparameters (e.g. the learning rate), which is typically done under the presumption of adversarial conditions. Since not all request patterns encountered in practice are adversarial, this leaves room for further improvement. Unfortunately, tuning these hyperparameters to achieve optimal performance across non-adversarial request sequences remains an open challenge. This paper proposes an online meta-learner that combines several instances of the Online Gradient Ascent (OGA) caching policy, each with a distinct learning rate, framing the problem as a decision problem with expert advice. The proposed meta-learner dynamically shifts between caching experts and achieves a regret upper bound comparable to that of the best-performing caching expert. The penalty incurred for learning the best expert is limited to a logarithmic dependence on the number of experts, which improves upon previous works. Numerical evaluations on synthetic request traces demonstrate consistent improvements when the caching experts' relative performance varies over time.
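The approach described in the abstract can be illustrated with a minimal sketch: several OGA instances, each maintaining a fractional cache with a different learning rate, are combined by a multiplicative-weights (Hedge-style) meta-learner that reweights experts by their realised utility. All class names, parameters, and the projection routine below are illustrative assumptions, not the thesis's actual implementation.

```python
import numpy as np

def project_capped_simplex(y, C):
    # Euclidean projection onto {x : 0 <= x_i <= 1, sum x_i = C},
    # the feasible set of fractional caches of capacity C, found by
    # bisecting on a shift tau such that clip(y - tau, 0, 1) sums to C.
    lo, hi = y.min() - 1.0, y.max()
    for _ in range(60):
        tau = 0.5 * (lo + hi)
        if np.clip(y - tau, 0.0, 1.0).sum() > C:
            lo = tau
        else:
            hi = tau
    return np.clip(y - 0.5 * (lo + hi), 0.0, 1.0)

class OGAExpert:
    """One OGA caching policy instance (hypothetical sketch)."""
    def __init__(self, n_files, C, lr):
        self.C, self.lr = C, lr
        self.x = np.full(n_files, C / n_files)   # uniform feasible start
    def utility(self, req):
        return self.x[req]                        # fraction of request served
    def update(self, req):
        g = np.zeros_like(self.x)
        g[req] = 1.0                              # gradient of utility x[req]
        self.x = project_capped_simplex(self.x + self.lr * g, self.C)

class HedgeMeta:
    """Multiplicative-weights meta-learner over caching experts."""
    def __init__(self, experts, meta_lr):
        self.experts, self.meta_lr = experts, meta_lr
        self.w = np.ones(len(experts)) / len(experts)
    def cache(self):
        # A convex combination of feasible caches is itself feasible.
        return sum(w * e.x for w, e in zip(self.w, self.experts))
    def step(self, req):
        hit = float(self.cache()[req])            # meta utility before updating
        u = np.array([e.utility(req) for e in self.experts])
        self.w *= np.exp(self.meta_lr * u)        # reward well-performing experts
        self.w /= self.w.sum()
        for e in self.experts:
            e.update(req)
        return hit
```

Because the meta-cache is a convex combination of the experts' caches, it remains a valid fractional cache at every step, and the exponential reweighting is what yields the logarithmic (in the number of experts) learning penalty mentioned in the abstract.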

Files

RP_THESIS.pdf
(pdf | 1.52 Mb)
License info not available