Deep Reinforcement Learning (DRL) is a powerful framework for training autonomous agents in complex environments. However, testing these agents remains prohibitively expensive due to the need for extensive simulations and the rarity of failure events, such as collisions or timeouts, in which the agent fails to complete its task safely or correctly. Surrogate models, such as Multi-Layer Perceptrons (MLPs), offer a promising alternative by predicting failures without requiring full simulation runs. However, prior research has focused almost exclusively on MLPs, leaving it unclear whether other, more expressive machine learning models could improve performance. In this paper, we investigate whether Bayesian Neural Networks (BNNs), which incorporate probabilistic reasoning into neural architectures, can serve as more effective surrogates for failure prediction in DRL environments. We developed, trained, and evaluated a BNN surrogate and compared it against a pre-trained MLP baseline, using the HighwayEnv car parking environment as our test case. Our evaluation compared the models' predictive accuracy, precision, recall, F1-score, and Area Under the Receiver Operating Characteristic Curve (AUC-ROC) on the training data, and assessed their effectiveness in the DRL parking environment. The results show that the BNN surrogate outperforms the MLP baseline in terms of practical utility for failure discovery. These findings suggest that BNNs can be a more effective surrogate model for prioritising failure scenarios in DRL testing.