Empirical Evaluation of Random Network Distillation for DQN Agents
A. Moreno (TU Delft - Electrical Engineering, Mathematics and Computer Science)
N. Yorke-Smith – Graduation committee member (TU Delft - Algorithmics)
P.R. van der Vaart – Mentor (TU Delft - Sequential Decision Making)
Abstract
This paper investigates how Random Network Distillation (RND), coupled with Boltzmann exploration, influences exploration behaviour and learning dynamics in value-based agents such as Deep Q-Networks (DQN) across a range of environments, from classic control tasks to behaviour suite benchmarks and contextual bandits. The study addresses the sensitivity of RND to key hyperparameters, the impact of exploration strategy design, and the transferability of settings across tasks. The results show that RND remains beneficial within DQN in both sequential and non-sequential tasks, but requires careful tuning of reward scaling, temperature, and network capacity to be effective. No universal hyperparameter configuration generalizes across environments, and inappropriate tuning can lead to unstable learning or suboptimal outcomes. These findings provide practical insights into the strengths and limitations of applying RND within value-based reinforcement learning frameworks.
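To make the two components named in the abstract concrete, below is a minimal sketch of how an RND intrinsic bonus and Boltzmann (softmax) action selection are typically combined with a DQN-style agent. It is illustrative only, not the paper's implementation: the callables target_net and predictor_net, and the scaling coefficient beta, are assumed placeholders.

```python
import numpy as np

def rnd_intrinsic_reward(obs, target_net, predictor_net):
    """Intrinsic reward as the prediction error between a fixed, randomly
    initialised target network and a predictor trained to imitate it.
    Both are assumed to be callables mapping an observation to a feature
    vector of the same shape."""
    target_feat = target_net(obs)      # never trained
    pred_feat = predictor_net(obs)     # trained on visited observations
    return float(np.mean((target_feat - pred_feat) ** 2))

def boltzmann_action(q_values, temperature):
    """Softmax exploration over Q-values; lower temperature concentrates
    probability mass on the greedy action."""
    prefs = np.asarray(q_values, dtype=np.float64) / temperature
    prefs -= prefs.max()               # numerical stability
    probs = np.exp(prefs) / np.exp(prefs).sum()
    return int(np.random.choice(len(probs), p=probs))

# Hypothetical combined reward used in the Q-learning target, where beta
# is the intrinsic-reward scale mentioned in the abstract's tuning discussion:
# r_total = r_extrinsic + beta * rnd_intrinsic_reward(next_obs, target_net, predictor_net)
```

In this sketch, the intrinsic-reward scale (beta), the Boltzmann temperature, and the capacity of the predictor network correspond to the hyperparameters the abstract identifies as requiring careful, per-environment tuning.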