Quantum computing holds the potential to solve problems that are intractable for classical systems. However, physically realizing large-scale quantum systems remains a challenge because qubit counts are difficult to scale. Distributed Quantum Computing (DQC) offers a promising solution by interconnecting multiple Quantum Processing Units (QPUs), effectively increasing the number of usable qubits. This interconnection introduces additional operations and makes qubit routing and entanglement management during circuit execution more complex. A reinforcement learning (RL) approach by Promponas et al. shows promise for qubit routing and EPR management in DQC environments, but it suffers from inconsistent performance and cannot reliably compile larger circuits. To address these limitations, we introduce a novel action space that allows direct operations between arbitrary qubit pairs, rather than restricting interactions to neighbouring qubits. This significantly reduces solution depth but enlarges the action space, posing scalability challenges. We therefore propose a novel neural network architecture that computes Q-values for qubit pairs from the values of their individual qubits, considerably reducing the number of trainable parameters. Additionally, we extend the masking strategy to eliminate sub-optimal actions, constraining the branching factor and accelerating learning. Together, these enhancements enable the agent to compile larger circuits with improved speed and consistency, significantly outperforming the baseline. Overall, our contributions yield better performance and reduced training time, marking a step forward in scalable quantum circuit compilation for distributed quantum systems.
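As a rough illustration of the pairwise Q-value idea and action masking described above, the following minimal sketch builds a score for every qubit pair from per-qubit values and then picks the best unmasked pair. It is not the authors' architecture: the additive combination rule, the function names `pair_q_values` and `best_pair`, and the example numbers are all assumptions made for illustration. The sketch only shows why a head that outputs one value per qubit (n outputs) can stand in for one output per pair (n(n-1)/2 outputs).

```python
import math


def pair_q_values(qubit_values, valid_mask):
    """Score each unordered qubit pair (i, j), i < j, from per-qubit values.

    Assumption: a pair's score is the sum of its two per-qubit values.
    Pairs that are masked out (or not i < j) get -inf so they are never chosen.
    """
    n = len(qubit_values)
    q = [[-math.inf] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if valid_mask[i][j]:
                q[i][j] = qubit_values[i] + qubit_values[j]
    return q


def best_pair(q):
    """Return the (i, j) index of the highest-scoring pair, or None if all are masked."""
    best, best_val = None, -math.inf
    for i, row in enumerate(q):
        for j, val in enumerate(row):
            if val > best_val:
                best, best_val = (i, j), val
    return best


# Hypothetical per-qubit values, standing in for a network's output.
values = [0.2, 1.5, -0.3, 0.9]

# Hypothetical mask: every pair allowed except (1, 3), treated here as sub-optimal.
mask = [[True] * 4 for _ in range(4)]
mask[1][3] = False

q = pair_q_values(values, mask)
choice = best_pair(q)
```

Without the mask, pair (1, 3) would have the highest score. With it, the agent falls back to the best remaining pair, which is how masking narrows the branching factor in this toy setting.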