Adaptive Feature Selection For Sparse Linear Bandits
Experimental study on strategies for Online Feature Selection in High-Dimensional Bandit Settings
M.S. Damyanov (TU Delft - Electrical Engineering, Mathematics and Computer Science)
Julia Olkhovskaya – Mentor (TU Delft - Sequential Decision Making)
Abstract
The Multi-armed Bandit (MAB) is a classic problem in reinforcement learning that exemplifies the exploration-exploitation dilemma: deciding when to gather more information and when to act on current knowledge. In the sparse variant, the feature vectors contain many irrelevant components, which is common in real-world scenarios and poses a significant efficiency bottleneck: querying the entire feature vector is often unnecessary when only a small subset of it is informative. In this work, we introduce two novel algorithms for dynamic feature selection in sparse linear bandit settings. These algorithms adaptively select relevant features at each round, enabling more efficient learning and decision-making. Empirical results demonstrate that our methods consistently outperform standard approaches such as LinUCB in sparse environments.
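To make the setting concrete, the following is a minimal sketch of a sparse linear bandit environment together with the LinUCB baseline mentioned above. All parameter values (dimension, sparsity level, noise scale, exploration coefficient) are illustrative assumptions, not the values used in the experimental study; the two proposed feature-selection algorithms are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Sparse linear bandit: the reward depends on only `sparsity` of the d features.
# All sizes below are illustrative, not the thesis's experimental settings.
d, sparsity, n_arms, horizon = 20, 3, 2000, 2000
theta = np.zeros(d)
support = rng.choice(d, size=sparsity, replace=False)
theta[support] = rng.uniform(0.5, 1.0, size=sparsity)

def pull(x):
    """Noisy linear reward for feature vector x."""
    return x @ theta + rng.normal(0.0, 0.1)

# LinUCB with ridge regularisation: it queries all d features each round,
# which is the inefficiency that sparse-aware feature selection targets.
alpha, lam = 1.0, 1.0
A = lam * np.eye(d)   # regularised Gram matrix of played contexts
b = np.zeros(d)       # reward-weighted sum of played contexts

n_arms = 10
total_reward = 0.0
for t in range(horizon):
    contexts = rng.normal(0.0, 1.0, size=(n_arms, d))  # fresh arms each round
    A_inv = np.linalg.inv(A)
    theta_hat = A_inv @ b
    # UCB score: estimated reward plus an exploration bonus per arm.
    bonus = np.sqrt(np.einsum("ij,jk,ik->i", contexts, A_inv, contexts))
    x = contexts[np.argmax(contexts @ theta_hat + alpha * bonus)]
    r = pull(x)
    A += np.outer(x, x)
    b += r * x
    total_reward += r
```

After enough rounds, `theta_hat` concentrates around the true sparse `theta`, but LinUCB pays for estimating all d coordinates; adaptively restricting attention to the informative subset is the motivation for the feature-selection algorithms studied here.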