Routing in Distributed Quantum Systems using Reinforcement Learning

Master Thesis (2025)
Author(s)

J.J. van Veen (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

S. Feld – Mentor (TU Delft - QCD/Feld Group)

Luise Prielinger – Mentor (TU Delft - QID/Vardoyan Group)

Q. Wang – Graduation committee member (TU Delft - Embedded Systems)

Faculty
Electrical Engineering, Mathematics and Computer Science
Publication Year
2025
Language
English
Graduation Date
13-10-2025
Awarding Institution
Delft University of Technology
Programme
Computer and Embedded Systems Engineering
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Quantum computing holds the potential to solve problems that are intractable for classical systems. However, the physical realization of large-scale quantum systems remains a challenge because scaling qubit counts is difficult. Distributed Quantum Computing (DQC) offers a promising solution by interconnecting multiple Quantum Processing Units (QPUs), effectively increasing the number of usable qubits. This interconnection introduces additional operations and makes qubit routing and entanglement management during circuit execution more complex. A reinforcement learning (RL) approach by Promponas et al. shows promise for qubit routing and EPR pair management in DQC environments, but it suffers from inconsistent performance and cannot reliably compile larger circuits. To address these limitations, we introduce a novel action space that allows direct operations between arbitrary qubit pairs rather than restricting interactions to neighbouring qubits. This significantly reduces solution depth but enlarges the action space, posing scalability challenges. We therefore propose a novel neural network architecture that computes Q-values for qubit pairs from the values of their individual qubits, considerably reducing the number of trainable parameters. Additionally, we extend the masking strategy to eliminate sub-optimal actions, which constrains the branching factor and accelerates learning. Together, these enhancements enable the agent to compile larger circuits with improved speed and consistency, significantly outperforming the baseline while requiring less training time, marking a step forward in scalable quantum circuit compilation for distributed quantum systems.
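
The pairwise Q-value idea and the action masking described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the linear per-qubit scorer, the choice to combine two qubits' values by addition, and the names `per_qubit_values`, `valid_pair_mask`, `pairwise_q_values` and `select_action` are all assumptions made for illustration. The point it shows is that the number of trainable parameters depends on the per-qubit feature size rather than on the number of qubit pairs.

```python
import numpy as np


def per_qubit_values(features, weights):
    """Score each qubit with one shared linear layer (hypothetical stand-in
    for the network body). features: (n, d), weights: (d,). Returns (n,)."""
    return features @ weights


def valid_pair_mask(n):
    """Boolean (n, n) mask that allows each unordered pair once (i < j).
    A real agent would also mask actions it considers sub-optimal."""
    return np.triu(np.ones((n, n), dtype=bool), k=1)


def pairwise_q_values(features, weights, mask):
    """Build Q-values for all qubit pairs from individual qubit values.
    Assumption: Q(i, j) = v_i + v_j; the thesis may combine them differently."""
    v = per_qubit_values(features, weights)
    q = v[:, None] + v[None, :]          # broadcast to an (n, n) table
    return np.where(mask, q, -np.inf)    # masked actions can never win argmax


def select_action(features, weights, mask):
    """Greedy action: the allowed qubit pair with the highest Q-value."""
    q = pairwise_q_values(features, weights, mask)
    i, j = np.unravel_index(np.argmax(q), q.shape)
    return int(i), int(j)


if __name__ == "__main__":
    feats = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])  # 3 qubits, d = 2
    w = np.array([2.0, 3.0])                                # only d parameters
    print(select_action(feats, w, valid_pair_mask(3)))
```

In this sketch the scorer has `d` weights regardless of how many qubits exist, whereas a network with one output per pair would need an output layer that grows quadratically with the qubit count. Setting additional entries of the mask to `False` removes those pairs from consideration, which is how masking limits the branching factor.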

Files

Master_Thesis_Report.pdf
(pdf | 2.75 MB)
License info not available