Multi-agent reinforcement learning for radar waveform design

Abstract

This thesis investigates the application of multi-agent reinforcement learning (MARL) to the optimization of radar waveforms. Radar technology is crucial in fields such as aviation, maritime navigation, and defense, but it must contend with interference and clutter while meeting demands for high resolution and accuracy. Cognitive radar, which adapts to environmental changes in real time, offers a promising solution. This research explores the potential of MARL for optimizing radar waveforms and examines whether incorporating domain knowledge can enhance performance.

The radar waveform optimization problem is formulated as a Decentralized Partially Observable Markov Decision Process (Dec-POMDP), which defines the radar environment, the agents' observations and actions, and the reward functions. The study experiments with different architectures, including decentralized actors with a centralized critic. Because the centralized critic has access to global state information, it helps stabilize learning and mitigates the non-stationarity and credit assignment problems. Graph neural networks (GNNs) are proposed as the centralized critic in order to exploit the sparsity of graph data and thereby improve scalability.
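
As a rough illustration of the decentralized-actor, centralized-critic arrangement, the following minimal sketch lets each agent choose an action from its own local observation while a single critic scores the joint situation. Everything in it is an assumption made for illustration: the sizes, the linear models standing in for neural networks, and the use of concatenated observations as a stand-in for the global state are not taken from the thesis.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical problem sizes, chosen only for illustration.
N_AGENTS, OBS_DIM, N_ACTIONS = 3, 4, 5

def softmax(x):
    """Numerically stable softmax over the last axis."""
    z = np.exp(x - x.max(axis=-1, keepdims=True))
    return z / z.sum(axis=-1, keepdims=True)

# One linear policy per agent: local observation -> action logits.
actor_weights = rng.normal(size=(N_AGENTS, OBS_DIM, N_ACTIONS))

# One linear critic over the concatenated observations
# (an assumed stand-in for the global state).
critic_weights = rng.normal(size=(N_AGENTS * OBS_DIM,))

def act(observations):
    """Each agent samples an action using only its own observation."""
    logits = np.einsum("io,ioa->ia", observations, actor_weights)
    probs = softmax(logits)
    actions = np.array([rng.choice(N_ACTIONS, p=p) for p in probs])
    return actions, probs

def centralized_value(observations):
    """The critic sees all agents' observations at once."""
    return float(observations.reshape(-1) @ critic_weights)

observations = rng.normal(size=(N_AGENTS, OBS_DIM))
actions, probs = act(observations)
value = centralized_value(observations)
```

In this arrangement the critic is needed only during training, so execution can remain fully decentralized: at run time each agent uses just its own policy and local observation.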

The proposed models are trained and tested in a radar-tracking scenario and evaluated in terms of Pareto optimality and optimization time. The results show that both the Independent Actor-Critic (IAC) and the Independent Actor with Centralized Critic (IACC) models outperform traditional methods in probability of detection, waveform duration, and optimization speed. The findings highlight the effectiveness of MARL approaches for radar waveform optimization and the benefits of centralized critics for robustness and coordination. However, the choice of architecture significantly affects performance, and although GNNs offer potential scalability advantages, using them to incorporate domain knowledge did not yield significant improvements in this study. This research lays a foundation for future exploration of MARL and GNNs in radar waveform optimization.
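
To make the notion of Pareto optimality concrete for the two waveform objectives named above, the sketch below finds the non-dominated candidates when probability of detection is maximized and waveform duration is minimized. The candidate values are invented for illustration and are not results from the thesis; the function name `pareto_front` is likewise hypothetical.

```python
import numpy as np

def pareto_front(pd, duration):
    """Return indices of non-dominated points (maximize pd, minimize duration)."""
    pd = np.asarray(pd, dtype=float)
    duration = np.asarray(duration, dtype=float)
    keep = []
    for i in range(len(pd)):
        # A point dominates i if it is no worse in both objectives
        # and strictly better in at least one.
        no_worse = (pd >= pd[i]) & (duration <= duration[i])
        strictly_better = (pd > pd[i]) | (duration < duration[i])
        if not np.any(no_worse & strictly_better):
            keep.append(i)
    return keep

# Hypothetical candidate waveforms (arbitrary units for duration).
pd = [0.90, 0.95, 0.80, 0.95]
duration = [10.0, 20.0, 15.0, 25.0]
front = pareto_front(pd, duration)
```

Here the third candidate is dominated by the first (lower detection probability and longer duration), and the fourth by the second (same detection probability but longer duration), so only the first two remain on the front.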