This dissertation concerns the efficient quantification of uncertainty in the field of deep reinforcement learning. At the time of this writing, artificial intelligence is being adopted rapidly into the critical pipelines of numerous scientific and societal domains — from autonomous driving and medical diagnostics to scientific discovery. A particular class of machine learning models, deep neural networks, has been pivotal in this recent development due to their extraordinary scalability and expressive power. Such models learn by optimizing vast sets of parameters to shape predictions according to previous measurements, captured in large datasets. When we deploy such learned models for practical applications, however, they are asked to make predictions for novel inputs not represented in their training data. Such predictions are the result of inductive generalization — deriving insights about future situations from past experience — and are inherently subject to uncertainty. For these predictions to be actionable, they must often be accompanied by a reliable measure of confidence. This need to know what one does not know is addressed by the quantification of epistemic uncertainty, which arises from the imperfection of a learned model, often due to a lack of sufficient relevant data. This stands in contrast to aleatoric uncertainty — the irreducible, inherent randomness in a process — and it is this reducible, model-centric epistemic uncertainty that forms the central object of inquiry for this dissertation.
The challenge of epistemic uncertainty estimation becomes especially tangible in the context of sequential decision-making problems. In such settings, an agent’s actions can have long-term consequences that compound over time, shaping downstream outcomes and choices. Reinforcement learning, a paradigm in which agents learn such decision-making strategies through direct interaction with an environment, faces two fundamental challenges that hinge on reliable uncertainty estimation: efficient exploration and safe decision-making. Underpinning both is a principled understanding of an agent’s own epistemic uncertainty — the central topic of this thesis.
Examining the current research landscape of uncertainty quantification in deep learning, we observe a persistent tension between theoretically well-motivated yet computationally expensive techniques on one hand, and computationally efficient yet less understood methods on the other. Bayesian inference, widely regarded as the gold standard for reasoning about epistemic uncertainty, is generally intractable for modern, large-scale neural networks. This has led to a spectrum of approximate methods — including deep ensembles, advanced sampling techniques, and variational inference — that navigate this trade-off to varying degrees. More pragmatic solutions often offer substantial computational savings, but little is understood about what their uncertainty estimates represent or how they behave in practice. From this landscape, we derive the research mission for this dissertation: to develop and analyze uncertainty quantification methods that are both computationally tractable and theoretically well-motivated.
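To make this trade-off concrete, below is a minimal sketch of the deep-ensemble approach in a toy regression setting. The architecture, data, and hyperparameters are illustrative choices of ours, not taken from the dissertation: several independently initialized networks are trained on the same data, and the variance of their predictions serves as the epistemic uncertainty estimate.

```python
import torch

def make_mlp():
    return torch.nn.Sequential(
        torch.nn.Linear(1, 64), torch.nn.ReLU(), torch.nn.Linear(64, 1)
    )

# Independently initialized members trained on the same data supply the
# diversity that the ensemble's variance estimate relies on.
ensemble = [make_mlp() for _ in range(5)]

x_train = torch.rand(128, 1) * 2.0 - 1.0  # data only covers [-1, 1]
y_train = torch.sin(3.0 * x_train) + 0.1 * torch.randn_like(x_train)

for model in ensemble:
    opt = torch.optim.Adam(model.parameters(), lr=1e-2)
    for _ in range(500):
        opt.zero_grad()
        torch.nn.functional.mse_loss(model(x_train), y_train).backward()
        opt.step()

# Epistemic uncertainty as disagreement between members: small where
# the training data lies, larger outside [-1, 1].
x_test = torch.linspace(-3.0, 3.0, 7).unsqueeze(-1)
with torch.no_grad():
    preds = torch.stack([m(x_test) for m in ensemble])  # (members, points, 1)
print(preds.var(dim=0).squeeze(-1))
```

The variance stays small where training data is dense and grows away from it, at the cost of training and storing several networks; avoiding precisely this expense motivates the single-model methods developed later in the dissertation.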
Our first line of inquiry investigates deep neural network ensembles. We hypothesize that their efficacy is constrained not merely by the number of constituent models but by the quality of their diversity. Focusing on distributional reinforcement learning, where a projection operator maps Bellman-updated return distributions back onto a tractable parametric family such as a categorical or quantile representation, we show that such architectural components induce inductive biases that shape generalization. We develop diverse projection ensembles, which induce diversity through architecturally distinct members, and demonstrate empirically that this approach yields more robust uncertainty signals and improved exploration performance.
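To illustrate what architecturally distinct projections look like, the following sketch (our own toy construction, not the dissertation's implementation) contrasts a categorical representation (fixed support atoms with learned probabilities, as in C51) with a quantile representation (fixed quantile levels with learned locations, as in QR-DQN), and measures their disagreement with a discretized 1-Wasserstein distance, one natural candidate for an ensemble's uncertainty signal.

```python
import numpy as np

rng = np.random.default_rng(0)

# Member 1: categorical projection. The return distribution is a set of
# learned probabilities over a fixed support of atoms (as in C51).
support = np.linspace(-10.0, 10.0, 51)
probs = np.exp(-0.5 * (support - 1.0) ** 2)
probs /= probs.sum()

# Member 2: quantile projection. Fixed quantile levels with learned
# locations (as in QR-DQN); sorted samples stand in for learned values.
taus = (np.arange(32) + 0.5) / 32
quantile_locs = np.sort(rng.normal(1.2, 1.1, size=32))

# Disagreement between the two members' return distributions, measured
# by a discretized 1-Wasserstein distance between quantile functions.
cdf = np.cumsum(probs)
cat_quantiles = support[np.searchsorted(cdf, taus)]
w1 = np.abs(cat_quantiles - quantile_locs).mean()
print(f"1-Wasserstein disagreement: {w1:.3f}")
```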
Our second line of inquiry pursues the goal of emulating ensemble uncertainty within a single, efficient model. We introduce contextual similarity distillation, a technique enabling epistemic uncertainty estimation with a single model trained via gradient descent. Using neural tangent kernel theory, we reinterpret ensemble variance estimation as a tractable kernel regression problem. Complementarily, we provide a theoretical foundation for random network distillation, showing that, in the infinite-width limit, its uncertainty corresponds to the predictive variance of a deep ensemble. We further develop a Bayesian variant whose error signal matches the posterior predictive variance of an infinitely wide Bayesian neural network.
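Random network distillation itself is due to Burda et al. (2018). The sketch below shows its mechanism with an illustrative architecture of our own choosing: a predictor network is trained to match a fixed, randomly initialized target network on visited states, and its residual error on a new input is read as epistemic uncertainty.

```python
import torch

def make_net():
    return torch.nn.Sequential(
        torch.nn.Linear(2, 128), torch.nn.ReLU(), torch.nn.Linear(128, 16)
    )

# The target network is randomly initialized and never trained.
target = make_net()
for p in target.parameters():
    p.requires_grad_(False)

predictor = make_net()
opt = torch.optim.Adam(predictor.parameters(), lr=1e-3)

seen = torch.rand(256, 2)  # states the agent has visited
for _ in range(2000):
    opt.zero_grad()
    torch.nn.functional.mse_loss(predictor(seen), target(seen)).backward()
    opt.step()

# The residual error is low on familiar inputs and high on novel ones;
# the chapter's result says that, at infinite width, this error matches
# a deep ensemble's predictive variance.
novel = torch.rand(256, 2) + 2.0  # far from the visited states
with torch.no_grad():
    err_seen = (predictor(seen) - target(seen)).pow(2).mean()
    err_novel = (predictor(novel) - target(novel)).pow(2).mean()
print(err_seen.item(), err_novel.item())
```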
Finally, we address the estimation of long-term, cumulative uncertainty in reinforcement learning. We propose a novel single-model method — universal value-function uncertainties — that quantifies uncertainty over entire trajectories rather than individual predictions. Using neural tangent kernel theory, we show that this method yields estimates equivalent to those of a full ensemble, while retaining the computational efficiency of a single model.
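Because this summary does not spell out the construction, the sketch below is speculative: it shows one plausible way to propagate uncertainty over trajectories by using temporal-difference bootstrapping to accumulate the outputs of a fixed random network, so that the residual error reflects long-horizon rather than single-step unfamiliarity. The discount, architecture, and data are our own assumptions.

```python
import torch

gamma = 0.9  # assumed discount factor, chosen for illustration

# Fixed, randomly initialized network acting as a source of
# pseudo-rewards; it is never trained.
rand_net = torch.nn.Sequential(
    torch.nn.Linear(3, 64), torch.nn.ReLU(), torch.nn.Linear(64, 1)
)
for p in rand_net.parameters():
    p.requires_grad_(False)

# Online network trained to predict the discounted accumulation of the
# random network's outputs along trajectories.
u_net = torch.nn.Sequential(
    torch.nn.Linear(3, 64), torch.nn.ReLU(), torch.nn.Linear(64, 1)
)
opt = torch.optim.Adam(u_net.parameters(), lr=1e-3)

# Illustrative (s, s') transitions from some behavior policy.
s = torch.rand(512, 3)
s_next = (s + 0.1 * torch.randn_like(s)).clamp(0.0, 1.0)

for _ in range(2000):
    with torch.no_grad():
        # Bellman-style target: pseudo-reward plus bootstrapped tail.
        target = rand_net(s) + gamma * u_net(s_next)
    opt.zero_grad()
    torch.nn.functional.mse_loss(u_net(s), target).backward()
    opt.step()

# The residual reflects accumulated unfamiliarity along trajectories
# reachable from a state, rather than single-step novelty.
with torch.no_grad():
    residual = (u_net(s) - (rand_net(s) + gamma * u_net(s_next))).pow(2)
print(residual.mean().item())
```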
In conclusion, this dissertation advances uncertainty quantification in deep reinforcement learning by bridging the gap between theoretical rigor and computational practicality. It provides both methodological innovations and theoretical insights toward more reliable, uncertainty-aware autonomous agents.