Designing a DDS-Based SoC for High-Fidelity Multi-Qubit Control

The design of a large-scale quantum computer requires co-optimization of both the quantum bits (qubits) and their control electronics. This work presents the first systematic design of such a controller to simultaneously and accurately manipulate the states of multiple spin qubits or transmons. By employing both analytical and simulation techniques, the detailed electrical specifications of the controller have been derived for a single-qubit gate fidelity of 99.99% and validated using a qubit Hamiltonian simulator. Trade-offs between several architectures with different levels of digitization are discussed, resulting in the selection of a highly digital DDS-based solution. Starting from the system specifications, a complete error budget for the various analog and digital circuit blocks is drafted and their detailed electrical specifications, such as signal power, linearity, spurs and noise, are derived to obtain a digital-intensive power-optimized multi-qubit controller. A power consumption estimate demonstrates the feasibility of such a system in a nanometer CMOS technology node. Finally, application examples, including qubit calibration and multi-qubit excitation, are simulated with the proposed controller to demonstrate its efficacy. The proposed methodology, and more specifically, the proposed error budget lay the foundations for the design of a scalable electronic controller enabling large-scale quantum computers with practical applications.

In quantum chemistry, the electronic structure of molecular orbitals can be mapped onto a quantum processor to simulate the interaction between molecules. As a result, new molecules and reactions can be designed, and industrial chemical processes can be optimized [1], [2]. For instance, fertilizer production takes up to 1% of the world's energy supply, due to the currently employed high-temperature high-pressure Haber-Bosch industrial process. However, a very similar process, i.e., nitrogen fixation in plants, happens under ambient conditions. While the primary cofactor of the biological nitrogen-fixing enzyme nitrogenase (FeMo cofactor) is not yet fully understood, it could conceivably be simulated using a quantum computer [3].
The computing power of a quantum computer is directly related to the quality and quantity of its quantum bits (qubits), its fundamental computing unit. Although a quantum computer with only 53 qubits has surpassed the capability of even the most powerful supercomputers [4], large-scale quantum computers with thousands, or even millions, of qubits would be required for any practical computations, thus demanding a scalable qubit and control architecture.
Amongst various qubit topologies, solid-state qubit technologies, such as spin qubits [5] and transmons [4], promise scalability due to their small form factor and fabrication process. However, such qubits typically operate at temperatures below 100 mK inside a dilution refrigerator, while the control electronics is implemented with off-the-shelf equipment operating at room temperature and connected via at least a single RF cable per qubit. Such control setups hinder scalability both because of the excessive complexity of an equipment-based control and because it is impractical to fit thousands of cables inside a dilution refrigerator, while minimizing the heat load in the fridge and ensuring the reliability of the interconnects.
Although several setups have migrated from the use of generic equipment to custom-made electronics, the specifications of such systems are not tailor-made for qubit control, but rather the main goal is to reduce the amount and cost of equipment/interconnect used for qubit control [6]. The ideal solution to build large-scale quantum computers would be to operate both qubits and control electronics at the same cryogenic temperature [5]. Along such direction, custom-made PCBs with commercial off-the-shelf components operating at cryogenic temperatures have been designed to interface several control and read-out channels, thus minimizing cabling [7], [8]. Taking a step further, control and readout integrated circuits implemented in standard CMOS technology and operating at cryogenic temperature (cryo-CMOS) have been shown to operate at cryogenic temperatures as low as 4 K and even below, promising a viable solution for scalability [9], [10]. Additionally, such cryo-CMOS electronics can be in principle co-integrated with spin qubits on a single chip, thus providing a compact solution towards the realization of practical quantum computers [11], [12]. To design such systems, the circuit specifications need to be estimated/simulated to produce a power-efficient design. Furthermore, this is indispensable for circuits operating at cryogenic temperatures, due to limited cooling power of the dilution refrigerator.
In this paper, we address the above-mentioned issues by proposing a systematic design technique of the electronic controller for single-qubit operations, employing frequency multiplexing to reduce interconnects and power consumption. This work presents the architecture and specifications of a power-efficient qubit control system, to achieve a single-qubit gate fidelity up to 99.99% by complying with the signal specifications for qubit control, outlined in [13]. That work presents a systematic study of the impact of the classical electrical signals on the qubit fidelity for single-electron spin qubits, considering all operations, i.e., single-qubit rotations, two-qubit gates, and readout, in the presence of errors in the control electronics, such as static, dynamic, systematic, and random errors. Moreover, using case studies, [13] shows how preliminary signal specifications can be derived to achieve a specified gate fidelity. In this work, those results are used as the basis to find the specifications for a DDS-based system for multi-qubit control.
In the following, Section II presents the requirements for the qubit control system. Section III discusses the trade-offs between possible transmitter architectures and describes the chosen system architecture. In Section IV, the specifications for the different architectural sub-blocks are determined to assess the feasibility of the system. Finally, Section V demonstrates the flexibility of the design for various applications, and a conclusion follows in Section VI.

II. REQUIREMENTS
The main focus of this work is on single-electron spin qubits, since they are very promising both in terms of scaling opportunities and co-integration with CMOS electronics [5], and a co-simulation platform is readily available [13], [14]. Since the control signals required by spin qubits and transmons are very similar, we will also describe the minor changes required to ensure compatibility with transmons.

A. Qubit Control Signal Requirements
In a single-electron spin qubit, information is encoded in the spin of a single electron hosted in a quantum dot. The spin-up and spin-down states encode the $|0\rangle$ and $|1\rangle$ states, respectively. Since a large magnetic field is applied, the two states are separated by the Zeeman energy $E_z$, which is associated with the qubit frequency $f = E_z / h$, with $h$ the Planck constant. Any single-qubit operation can be represented as a rotation of the qubit state in the Bloch sphere, as indicated in Fig. 1. To achieve universal quantum computation, rotations around at least two different axes are required. The accuracy of the implemented qubit rotation can be measured by the fidelity of such operation [13]. The fidelity is the most commonly used quantity to benchmark the quality of quantum operations. However, for systematic errors, the actual error rate in an algorithm can deviate from the error predicted by the fidelity ($1 - F$), and is, in the worst case, bounded by the diamond norm ($\sqrt{1 - F}$). In current experiments, the observed error rate is usually well described by the fidelity and is hence used here.
To perform such a rotation for single-electron spin qubits, a microwave pulse needs to be applied to the qubit as either an electric (Electric Dipole Spin Resonance, EDSR) or magnetic field (Electron Spin Resonance, ESR), with the frequency accurately matched to the qubit frequency. The amplitude of the microwave pulse sets the rotation speed, called Rabi frequency $f_R$, and hence, together with the duration of the microwave pulse $T$, sets the rotation angle $\theta$ in Fig. 1. For a rectangular pulse envelope, $\theta = 2\pi f_R T$. The phase of the microwave signal needs to remain coherent with the phase of the qubit, which implies keeping a coherent phase for the whole duration of the quantum algorithm, even over different pulses. Changing the relative phase results in a rotation along a different axis in the Bloch sphere ($\phi$ in Fig. 1), and can be used to implement both X- and Y-rotations of the qubit.
Frequency multiplexing can be used to drive multiple qubits on the same driveline, with the advantage that the amount of interconnects can be reduced, and the control electronics can potentially be more area and power efficient. However, when applying a short microwave burst to one qubit, the spectrum can contain energy at frequencies corresponding to other qubits. Pulse shaping techniques must then be used to minimize spectral leakage, generally known as cross-talk, to other qubits [15]. However, even with pulse shaping, the frequency of a qubit slightly shifts when applying a signal at a different frequency. This so-called AC-Stark shift causes the qubit to acquire a phase offset. This results in an unintended Z-rotation on the qubit, which needs to be corrected [15].
Two-qubit operations, qubit initialization and qubit readout typically require unmodulated pulses to be applied to the quantum processor [16], and are here assumed to be generated by other control electronics and are therefore outside the scope of this paper.

B. System Specifications
For single-electron spin qubits, the qubit frequency is typically 12-40 GHz with microwave pulse duration in the order of 1 μs, and, to achieve a typical Rabi frequency of 1 MHz, a power of ∼ −45 dBm is usually required [17]-[20]. For future systems, it is desirable to operate at lower qubit frequencies and higher Rabi frequencies [12]. Hence, the system presented here will be designed for an output frequency range of 5-20 GHz, and Rabi frequencies in the range of 1-10 MHz, with a maximum rotation angle of π. This sets the nominal duration of a π-rotation to 50-500 ns. The required output power for spin qubits then ranges from −45 dBm to −25 dBm for the selected Rabi frequency range. However, as attenuators (e.g., with 6 dB loss) are typically employed before the qubits to reduce the heat injected into the quantum processor, and the sensitivity of the qubit can easily vary by ±50%, the required output power range is extended to −48 dBm to −16 dBm (50 mV peak). Current experiments on single-electron spin qubits typically do not use frequency multiplexing, and hence, rectangular envelopes for the microwave pulses are allowed [17], [20]. However, for our system, more complex pulse shaping, e.g., Gaussian envelopes, is necessary to support frequency multiplexing. Moreover, for flexibility, it is desirable to program any envelope, with support of I/Q-modulation for the benefit of having X- and Y-rotations.
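As a quick numerical check of the duration and power figures quoted above, the following Python sketch evaluates the rectangular-pulse relation θ = 2π f_R T and converts a peak voltage to dBm. The 50 Ω load and all function names are assumptions of this sketch and are not taken from the specification; the power scaling simply assumes that the Rabi frequency is proportional to the drive amplitude.

```python
import math

def pi_pulse_duration(rabi_hz, theta=math.pi):
    """Duration T of a rectangular pulse, from theta = 2*pi*f_R*T."""
    return theta / (2 * math.pi * rabi_hz)

def peak_voltage_to_dbm(v_peak, r_load=50.0):
    """Average power of a sine with peak amplitude v_peak into r_load (assumed 50 ohm)."""
    p_watt = v_peak ** 2 / (2 * r_load)
    return 10 * math.log10(p_watt / 1e-3)

def scale_power_dbm(p_ref_dbm, rabi_ref_hz, rabi_hz):
    """Rabi frequency assumed proportional to amplitude, so power scales as f_R squared."""
    return p_ref_dbm + 20 * math.log10(rabi_hz / rabi_ref_hz)

t_slow = pi_pulse_duration(1e6)              # pi-rotation at f_R = 1 MHz
t_fast = pi_pulse_duration(10e6)             # pi-rotation at f_R = 10 MHz
p_fast = scale_power_dbm(-45.0, 1e6, 10e6)   # power needed at 10 MHz
p_max = peak_voltage_to_dbm(50e-3)           # 50 mV peak into the assumed load

print(f"T(1 MHz)  = {t_slow * 1e9:.0f} ns")
print(f"T(10 MHz) = {t_fast * 1e9:.0f} ns")
print(f"P(10 MHz) = {p_fast:.1f} dBm")
print(f"P(50 mVp) = {p_max:.1f} dBm")
```

Under these assumptions the sketch reproduces the 50-500 ns duration range, the −25 dBm upper power level for spin qubits, and approximately −16 dBm for a 50 mV peak amplitude.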
The fidelity of single-qubit operations is typically above 99% for single-electron spin qubits [18]. For fault-tolerant quantum computing, a minimum qubit fidelity, typically around 99.9%, is required when using error-correction techniques [21]. In order not to limit the performance of the whole quantum computer, the proposed electronic interface targets a fidelity of 99.99% for a π-rotation performed on a spin qubit, when taking into account only the errors due to the electronic interface and assuming a perfect qubit. Considering frequency multiplexing, the system will be designed such that the addressed qubit achieves a 99.99% fidelity for the targeted π-rotation (which generally gives the lowest fidelity [13]) while the idle qubits reach a 99.99% fidelity for the identity, or idle, operation.
Since modern CMOS processes allow processing of extremely wide bandwidths, we aim at the maximum feasible bandwidth to maximize the number of qubits that can be served. Fig. 2 shows the number of qubits that can be multiplexed in a 1 GHz bandwidth for different microwave pulse envelopes, when assuming uniformly distributed qubit frequencies, a π-rotation at the maximum supported Rabi frequency of 10 MHz, and Z-corrections to compensate for the AC-Stark shift [13]. Fewer than 5 qubits can be served at a 99.9% fidelity with a rectangular envelope. By employing Gaussian pulses, this can be significantly improved, resulting in ∼40 qubits operating at a 99.99% fidelity in a 2 GHz bandwidth. For the system discussed here, 32 qubits are targeted in a 2 GHz bandwidth, since this number allows for easy addressing of the qubits ($32 = 2^5$), and for a theoretical fidelity >99.999%.
Fig. 2. The number of qubits that can be allocated in a 1 GHz band when driving with different envelopes and the required Z-correction, each performing a π-rotation in 50 ns.
Even though frequency multiplexing allows for operating on multiple qubits simultaneously, the system will be optimized assuming sequential execution of the operations on the different qubits, as more complicated measures than a simple Z-correction are required when operating on multiple frequency-multiplexed qubits simultaneously [15]. However, as a scalable solution is desired, the chosen system architecture should support the simultaneous excitation of multiple qubits.

C. Extending to Transmons
The control of transmons is very similar to that of spin qubits, but there are a few key differences that could affect the system specifications. The qubit frequency is typically around 6 GHz for transmons, and microwave pulses as short as 20 ns are used with a signal power of ∼ −60 dBm. Hence, the duration and output power specifications are extended to include this. Additionally, pulse shaping (Derivative Removal by Adiabatic Gate, DRAG) is typically used to minimize spectral leakage to higher energy levels of the same qubit. This specific pulse requires I/Q modulation, which is already supported to seamlessly allow X- and Y-rotations. Finally, as state-of-the-art transmons typically achieve fidelities not better than 99.99% [22], the control system will still not limit the achievable fidelity.
A summary of the discussed specifications is given in Table I.
Following the methods presented in [13], preliminary signal specifications can be estimated for performing a π-rotation on the addressed spin qubit with either a Rabi frequency of 1 or 10 MHz and a rectangular envelope, see Table II. Equal error contributions are assumed, and the value given for the amplitude inaccuracy assumes a peak amplitude of 50 mV, which corresponds to the maximum required output power. These preliminary specifications will be used to assess the

III. SYSTEM ARCHITECTURE
Based on the signal requirements for qubit control (Table II), the feasibility of several architectures is discussed and the chosen architecture is presented in this section.

A. Analog/RF Section
To generate the required envelopes for frequency-multiplexed qubits, the simplest architecture would be to design a digital-to-analog converter (DAC) operating at 40 GS/s, as shown in Fig. 3(a). However, the power consumption would be too high due to its large data bandwidth [23]. To reduce the power consumption, a multiple-return-to-zero (MRZ) DAC [Fig. 3(b)] exploiting higher Nyquist zones is capable of synthesizing frequencies up to 20 GHz [24]. However, limited flexibility in choosing the output frequency band (centered around $N \cdot f_s$) and an output spectrum corrupted by DAC replicas do not make this a good candidate. To overcome this, several low-speed DACs along with I/Q mixers [Fig. 3(c)] can be used to generate envelopes at distinct frequencies [10], each covering the bandwidth of one qubit, with the possibility of having an individual RF channel/output per qubit. However, this would require multiple local oscillator (LO) signals, thus making it power/area inefficient for multi-qubit control. Moreover, on-chip implementation of multiple LOs can cause frequency pulling and affect the spectral purity of the synthesizers, thus degrading the transmitter SFDR. Instead, a very high-speed DAC at 4 GS/s and a single mixer can be used for controlling multiple qubits from a single RF cable [Fig. 3(d)].
Single-sideband (SSB) modulation can be implemented instead of double-sideband (DSB) modulation to obtain the same bandwidth with half the DAC sampling frequency, at the cost of increased circuit complexity. This can be achieved by filtering each of the upper/lower sidebands (USB/LSB) and combining them at the output, as shown in Fig. 3(e). To achieve an image rejection ratio (IRR) > 44 dB for output frequencies close to the carrier (as required by the SFDR specification), filters with very high order or quality factor are essential. Instead, image rejection can be achieved using a Hartley modulator with I/Q DAC and mixer [Fig. 3(f)] at the cost of requiring an LO with quadrature phases. At the circuit level, instead of cascading the DAC and the mixer, a better solution would be to use a mixing DAC, i.e., combining the DAC and mixer at the circuit level, for power efficiency and linearity [25]. However, the mixing DAC output is corrupted by tones at alias frequencies, which may fall in the upconverted 2 GHz output band when the signal bandwidth is comparable to the carrier frequency, as will be shown in Section IV (Fig. 6). This would suggest exploring a high-speed DAC followed by a reconstruction filter and a mixer for better spectral purity [Fig. 3(f)].

B. Digital Signal Synthesis
To generate multiple SSB-modulated tones with this front-end design, a digital back-end is required. This work assumes the availability of a reprogrammable on-chip memory that is used to store calibrated waveforms for each of the desired qubit rotations. A qubit algorithm is then executed by playing the various stored waveforms in the desired order. The most straightforward approach is to store all possible combinations of qubit instructions in an SRAM, as shown in Fig. 4(a). The required memory of such an SRAM can be estimated as $\mathrm{SRAM}_{mem} = N \times f_s \times t_{pulse} \times m^n$, where $N$ and $f_s$ are the number of bits and the sampling frequency of the DAC, respectively, $t_{pulse}$ is the pulse duration, $m$ the number of possible instructions per qubit and $n$ the number of qubits. Assuming an 8-bit DAC operating at 2.5 GS/s to address 32 qubits and a maximum pulse duration of 500 ns, it would require an impractically large memory of $3.7 \cdot 10^{19}$ bits, considering merely 3 instructions per qubit. Moreover, since qubits require coherent control, intermittent sequential operations on any qubit demand keeping track of the phase of all qubits. Consequently, an individual reference clock would be required for each qubit.
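The exponential growth of the waveform memory can be made concrete with a short Python sketch of the estimate above, alongside the linear scaling obtained when only per-qubit envelopes are stored. The function names are illustrative, and the factor of two for separate I and Q DAC channels (`n_dacs=2`) is an assumption of this sketch, introduced because it reproduces the quoted order of magnitude; it does not appear in the memory formula itself.

```python
def full_waveform_bits(n_bits, f_s, t_pulse, m, n_qubits, n_dacs=1):
    """Memory to store every combination of m instructions on n_qubits qubits."""
    samples = round(f_s * t_pulse)          # samples per stored waveform
    return n_dacs * n_bits * samples * m ** n_qubits

def envelope_only_bits(n_bits, f_s, t_pulse, m, n_qubits):
    """Memory when one envelope is stored per instruction and per qubit (m * n scaling)."""
    samples = round(f_s * t_pulse)
    return n_bits * samples * m * n_qubits

params = dict(n_bits=8, f_s=2.5e9, t_pulse=500e-9, m=3, n_qubits=32)

single = full_waveform_bits(**params)             # formula as written
with_iq = full_waveform_bits(**params, n_dacs=2)  # assumed I and Q channels
envelopes = envelope_only_bits(**params)

print(f"all combinations, one DAC : {single:.3e} bits")
print(f"all combinations, I and Q : {with_iq:.3e} bits")
print(f"envelopes only            : {envelopes} bits")
```

With these inputs, the envelope-only storage stays below 1 Mbit, while storing all instruction combinations is many orders of magnitude beyond any practical on-chip memory.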
To reduce the required memory, an alternative approach is to store only the amplitude information in the SRAM, which can modulate the amplitude of a sinusoidal waveform with a programmable phase, as shown in Fig. 4(b). Under the mentioned assumptions, this would require less than 1-Mbit SRAM instead (scaling as $m \times n$ instead of $m^n$), consequently saving area at the cost of a higher power consumption. In order to update the phase for each qubit and ensure coherent control, sine and cosine waveforms scaled by appropriate coefficients can be combined to generate the required phase offset. However, this adds an overhead of 2 multipliers per qubit running at the full sampling speed. A power-efficient approach would be to use a Numerically Controlled Oscillator (NCO) for each qubit to generate both the required frequency and the phase offset [26]. A numerically controlled oscillator consists of a phase accumulator running at $f_s$. An input frequency tuning word (FTW) defines the step size of the phase accumulator to generate the desired output frequency $f_{out} = \mathrm{FTW} \times f_s / 2^N$, where $N$ is the number of bits in the phase accumulator and determines the frequency accuracy ($f_s / 2^N$). The sine Look-Up Table (LUT) generates a sinewave corresponding to the output phase of the NCO, which is then multiplied with an envelope (stored in the SRAM) to obtain the necessary modulated signal, as shown in Fig. 4(c). This allows for fewer multipliers and the same number of adders compared to Fig. 4(b), thereby saving substantial power, i.e., 2 multipliers per qubit running at 2.5 GHz. Another advantage of such a system is that the NCO can keep track of the phase of individual qubits, thus allowing coherent operation [27].
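The behavior of the phase accumulator can be illustrated with a minimal behavioral model in Python. This is only a sketch under stated assumptions: the class and method names are hypothetical, the 16-bit width and the 450 MHz tone are example values, and the `phase_offset` method shows one possible way of applying a phase update (for instance a Z-correction) directly in the accumulator.

```python
import math

class NCO:
    """Behavioral model of a phase-accumulator NCO (illustrative, not the actual design)."""

    def __init__(self, n_bits, f_s):
        self.n_bits = n_bits
        self.f_s = f_s
        self.modulus = 1 << n_bits   # accumulator wraps at 2**N
        self.ftw = 0
        self.acc = 0

    def set_frequency(self, f_out):
        """Pick the closest FTW and return the frequency actually generated."""
        self.ftw = round(f_out * self.modulus / self.f_s) % self.modulus
        return self.ftw * self.f_s / self.modulus

    def step(self):
        """Return the current phase word, then advance by one clock cycle."""
        phase = self.acc
        self.acc = (self.acc + self.ftw) % self.modulus
        return phase

    def phase_offset(self, radians):
        """Shift the accumulated phase, keeping the running phase otherwise coherent."""
        self.acc = (self.acc + round(radians / (2 * math.pi) * self.modulus)) % self.modulus

def sine_sample(phase_word, n_bits):
    """Ideal (unquantized) sine lookup for a given phase word."""
    return math.sin(2 * math.pi * phase_word / (1 << n_bits))

nco = NCO(n_bits=16, f_s=2.5e9)
f_actual = nco.set_frequency(450e6)
phases = [nco.step() for _ in range(3)]
nco.phase_offset(math.pi / 2)    # e.g. switch from an X- to a Y-rotation axis

print(f"FTW = {nco.ftw}, f_out = {f_actual / 1e6:.4f} MHz")
print("first phase words:", phases)
```

Because the accumulator keeps running whether or not a pulse is played, the phase reference of each qubit is maintained across pulses, which is the property exploited for coherent control.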

C. Final Architecture
Considering the above-mentioned trade-offs, a digitally intensive architecture based on direct digital synthesis (DDS) with digital modulation, as shown in Fig. 5, has been selected. Such an architecture benefits from the scaling advantage of advanced CMOS technology nodes in terms of speed and power efficiency and offers the flexibility and robustness of digital signal processing.
Multiple NCOs (one per qubit) are used to keep track of the phase evolution of the qubits. However, the NCO outputs are time-multiplexed to allow operation on one qubit at a time to reduce system complexity, as mentioned in Section II. The multiplexed output is fed into LUTs to generate the sinusoidal signals, which are then modulated by the envelope memory (ENV_I, ENV_Q) for various gate operations and pulse shaping [28], [29] providing flexibility in qubit control.
Because of the stringent IRR requirement of 44 dB (originating from SFDR) corresponding to a maximum phase and gain imbalance of 0.3° and 0.1 dB, respectively, an I/Q digital correction network is required to compensate for analog I/Q mismatch. Moreover, a DC offset correction is added to cancel the LO feed-through to the output.
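The link between the imbalance figures and the image rejection can be checked with the commonly used textbook expression for the IRR of a quadrature modulator with gain ratio 1 + ε and phase error φ. Using this particular expression is an assumption of the sketch below, and the function name is illustrative.

```python
import math

def irr_db(gain_imbalance_db, phase_imbalance_deg):
    """Image rejection ratio of a quadrature modulator (standard textbook expression)."""
    g = 10 ** (gain_imbalance_db / 20)        # amplitude ratio between I and Q paths
    phi = math.radians(phase_imbalance_deg)   # quadrature phase error
    wanted = 1 + 2 * g * math.cos(phi) + g * g
    image = 1 - 2 * g * math.cos(phi) + g * g
    return 10 * math.log10(wanted / image)

irr_spec = irr_db(0.1, 0.3)   # both imbalances present at the same time
print(f"IRR for 0.1 dB and 0.3 deg: {irr_spec:.1f} dB")
```

Under this model, a 0.1 dB gain imbalance combined with a 0.3° phase imbalance gives an IRR of approximately 44 dB.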
Finally, the same transmitter as in Fig. 3(f), comprising I/Q DACs, a reconstruction filter, and an I/Q mixer, translates its digital input to the RF band. The only required analog input is then a quadrature LO signal to drive the mixer.

IV. CIRCUIT SPECIFICATIONS
When increasing the signal dynamic range, the rate at which the power consumption increases is much lower in a digital circuit than in its analog counterpart, especially in nanometer CMOS technologies [30]. Therefore, the error budget for the digital section is set an order of magnitude tighter than the target fidelity, i.e. it is set to a 99.999% fidelity, so as to contribute negligibly to the target fidelity of the controller.
To this purpose, a MATLAB simulation model of the entire system is developed, comprising an accurate representation of the digital section (including quantization and rounding effects), an ideal model of every analog block, and a model of the 32-qubit quantum processor. The evolution of each qubit is represented by the Hamiltonian of a single-electron spin qubit under the excitation of the microwave current $i_{mw}(t)$ generated by the controller (Hamiltonian simulator implementation in [14]): where $\alpha$ and $\omega_0$ represent the sensitivity to the drive signal and the qubit frequency, respectively, of the qubit processor.
The following calculations are based on a rectangular envelope, while the simulations consider both a rectangular and Gaussian envelope. Moreover, as the specifications are typically stricter when operating at a Rabi frequency of 10 MHz, this will be the default assumption, unless otherwise specified.
Besides that, the lowest output frequency band of 5-7 GHz will be used in the simulations, as this band suffers more from sampling replicas. When simulating the idle qubits, any Z-error is ignored, as it can be corrected in software [31].
The design strategy is as follows. First, the sample rate is chosen (Section IV-A), which then allows for the selection of an appropriate reconstruction filter (Section IV-B). Next, the effects of a limited bit length in each digital block on the targeted and idle qubits are individually simulated while keeping the other blocks ideal, i.e. not quantized (Section IV-C). The results of this sensitivity analysis are used to select the number of bits required in each block to achieve the targeted fidelity. The final digital system, including all non-idealities, which are simultaneously accounted for, is simulated in a final verification step (Section IV-D). Finally, in Section IV-E, the specifications of analog blocks can be readily derived from the requirements in Table II.

A. Sample Rate
Due to the chosen I/Q-modulation architecture, there is individual control over the upper and lower sidebands of the upconverted signal. For the required 1-GHz sideband (for a 2-GHz bandwidth), it is sufficient to run the DACs at a sample rate of 2 GS/s to fulfill the Nyquist criterion. However, considering the inherent zero-order hold (ZOH) operation of DACs, the −3-dB bandwidth of a DAC is roughly 40% of the sample rate. Hence, in this design, a sample rate of 2.5 GS/s is chosen, thus resulting in a timing resolution of 400 ps for the microwave envelopes. The shortest operation of 20 ns is then supported (50 points), while the longest operation (500 ns) sets a minimum memory of, e.g., 160 kSa, assuming four instructions for each of the 32 qubits.
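The arithmetic behind the sample-rate choice can be reproduced with a few lines of Python. The sketch below uses the ideal sinc response of a zero-order hold to locate its −3-dB point numerically; the function and variable names are illustrative assumptions.

```python
import math

F_S = 2.5e9   # chosen DAC sample rate in samples per second

def zoh_3db_fraction():
    """Solve sin(pi*x)/(pi*x) = 1/sqrt(2) for x = f/f_s by bisection."""
    lo, hi = 0.01, 0.99
    target = 1 / math.sqrt(2)
    for _ in range(60):
        mid = (lo + hi) / 2
        if math.sin(math.pi * mid) / (math.pi * mid) > target:
            lo = mid   # response still above -3 dB, move up in frequency
        else:
            hi = mid
    return (lo + hi) / 2

def samples(duration):
    """Number of envelope samples for a pulse of the given duration."""
    return round(duration * F_S)

bw_fraction = zoh_3db_fraction()
resolution = 1 / F_S
shortest = samples(20e-9)
longest = samples(500e-9)
memory_samples = longest * 4 * 32   # four instructions for each of 32 qubits

print(f"ZOH -3 dB bandwidth : {bw_fraction:.3f} * f_s")
print(f"timing resolution   : {resolution * 1e12:.0f} ps")
print(f"20 ns pulse         : {shortest} samples")
print(f"envelope memory     : {memory_samples} samples")
```

For an ideal ZOH the −3-dB point lies slightly above 40% of the sample rate, so a 2 GS/s DAC would fall short of the 1-GHz sideband, whereas 2.5 GS/s covers it.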

B. Reconstruction Filter
The lowest qubit frequency of 5 GHz is achieved using a carrier frequency of 6 GHz and the 1 GHz sideband. A sketch of the output spectrum for this condition is shown in Fig. 6. The negative frequencies are shown for clarity to illustrate that negative sampling replicas fold back to positive frequencies and eventually fall back close to an in-band qubit. Since the ZOH suppression of the replicas corresponds to a worst-case SFDR of 21 dB, an additional attenuation of at least 33 dB is required at 11 GHz to achieve an SFDR better than 54 dB, as required for 99.999% fidelity (10 dB more than Table II for a 10× smaller error). Since at least a second-order filter is required, a second-order Chebyshev-I filter with 3-dB passband ripple and a 1.8-GHz corner frequency was chosen. The combination of the ZOH and reconstruction filter provides an SFDR better than 58 dB in all cases, resulting in a simulated fidelity of the idle qubit of >99.9996%.
In addition, the chosen filter improves the in-band flatness to 0.14 dB over the full 2-GHz data band. While this is not a strict requirement, this removes the need to predistort the envelopes. As a result, a qubit driven at 5.1 GHz with a rectangular envelope can achieve a fidelity of 99.99995% without any predistortion in an otherwise ideal system. In comparison, a third-order Butterworth filter with a 1.7-GHz corner frequency has an in-band flatness of 2.6 dB, which results in a fidelity of only 99.998% for a non-predistorted rectangular envelope. This is an important result, as it shows that, with proper design, one can use much simpler modulation schemes to achieve the intended performance.
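The replica-suppression budget of the reconstruction filter can be approximated with ideal magnitude responses. The Python sketch below assumes that the critical case is a 1 GHz baseband tone and its sampling replica at 4 f_s + 1 GHz = 11 GHz, models the ZOH as an ideal sinc, and interprets the 1.8-GHz corner as the ripple-band edge of an ideal second-order Chebyshev-I response. All of these are assumptions of the sketch, so the numbers are indicative only.

```python
import math

F_S = 2.5e9

def zoh_gain(f, f_s=F_S):
    """Magnitude of an ideal zero-order hold, |sinc(f/f_s)|."""
    x = f / f_s
    return 1.0 if x == 0 else abs(math.sin(math.pi * x) / (math.pi * x))

def cheby1_order2_gain(f, f_c=1.8e9, ripple_db=3.0):
    """Magnitude of an ideal 2nd-order Chebyshev-I low-pass filter."""
    eps2 = 10 ** (ripple_db / 10) - 1   # ripple factor squared
    x = f / f_c
    t2 = 2 * x * x - 1                  # Chebyshev polynomial T_2(x)
    return 1 / math.sqrt(1 + eps2 * t2 * t2)

f_sig = 1e9                 # baseband tone at the band edge
f_rep = 4 * F_S + f_sig     # replica assumed to fold back in band (11 GHz)

zoh_suppression_db = 20 * math.log10(zoh_gain(f_sig) / zoh_gain(f_rep))
extra_needed_db = 54.0 - zoh_suppression_db
filter_suppression_db = 20 * math.log10(
    cheby1_order2_gain(f_sig) / cheby1_order2_gain(f_rep))

print(f"ZOH suppression of the replica : {zoh_suppression_db:.1f} dB")
print(f"additional attenuation needed  : {extra_needed_db:.1f} dB")
print(f"ideal filter suppression       : {filter_suppression_db:.1f} dB")
```

Under these assumptions, the ZOH alone provides roughly 21 dB, and the ideal filter response exceeds the remaining requirement of about 33 dB.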

C. Digital Blocks
1) Number of NCO Accumulator Bits:
The number of bits in the accumulator register $b_{acc}$ (see Fig. 5) sets the frequency resolution $f_{res}$ of the numerically controlled oscillator according to [26]: $f_{res} = f_s / 2^{b_{acc}}$. This results in a maximum frequency error $\Delta f = f_{res}/2$, which results in a theoretical infidelity (Eq. 3) when performing a π-rotation using a rectangular envelope [13]. This result, along with the simulated fidelity in the case of both a rectangular and Gaussian envelope, is shown in Fig. 7. In the simulation, the target qubit frequency is chosen such that the frequency error is maximized. As the Gaussian envelope has a longer duration, a larger frequency error is accumulated. At least 16 accumulator bits are required to achieve a 99.999% fidelity.
Fig. 7. Infidelity of a π-rotation as a function of the NCO accumulator number of bits. Eq. 3, valid for rectangular envelopes, is plotted as the theoretically expected fidelity.

2) Number of LUT Entries:
For a more efficient design, the minimum number of entries ($2^{b_{lut}}$) should be used in the sine/cosine lookup table. However, as this requires the number of bits out of the accumulator ($b_{acc}$) to be reduced to the number of LUT address bits ($b_{lut}$), a periodic error would appear, and, as a result, the spectrum will show spurious tones. While the spectrum depends on the generated frequency (see Fig. 8), the spurs are associated with a limited SFDR [26]. As such a spurious tone can be at the frequency of an idle qubit, its infidelity is expected to increase, as given by Eq. 5 [13]. The above theoretical bound is compared to simulations in Fig. 9. As the effects of Gaussian and rectangular envelopes are similar, only the results of the Gaussian envelope are presented. Different target frequencies have been simulated, and, in each condition, an idle qubit is considered at the frequency of the largest spur. For the accumulator output bit reduction, both truncation and rounding are considered. Eq. 5 predicts the fidelity of the idle qubit well, both for rounding and truncation. In the case of rounding, the idle qubit requires at least 9 bits for a 99.999% fidelity.
In the case of truncation, the targeted qubit is affected more and at least 10 bits are required. When targeting a certain fidelity, the required $b_{lut}$ is one bit less when rounding the accumulator output. Note that saving 1 bit is significant as it halves the number of entries required in the LUT.
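The bit-width choices for the accumulator and the LUT address can be sanity-checked with first-order estimates in Python. Two assumptions are made in this sketch and are not taken from the text: the infidelity of a detuned rectangular π-pulse is approximated as (Δf / f_R)², and the phase-truncation spur level is approximated by the common rule of thumb of about 6.02 dB per LUT address bit. The function names are illustrative.

```python
import math

F_S = 2.5e9                # sample rate
F_RABI = 10e6              # default Rabi frequency
TARGET_INFIDELITY = 1e-5   # 99.999% fidelity
TARGET_SFDR_DB = 54.0      # SFDR associated with 99.999% fidelity

def freq_resolution(b_acc):
    """Frequency resolution of an NCO with a b_acc-bit accumulator."""
    return F_S / 2 ** b_acc

def detuning_infidelity(delta_f, f_rabi=F_RABI):
    """Assumed first-order infidelity of a pi-rotation with frequency error delta_f."""
    return (delta_f / f_rabi) ** 2

def min_accumulator_bits(target=TARGET_INFIDELITY):
    """Smallest accumulator width whose worst-case error f_res/2 meets the target."""
    b = 1
    while detuning_infidelity(freq_resolution(b) / 2) > target:
        b += 1
    return b

def min_lut_address_bits(sfdr_db=TARGET_SFDR_DB):
    """Smallest address width under the assumed 6.02 dB-per-bit rule of thumb."""
    return math.ceil(sfdr_db / 6.02)

b_acc = min_accumulator_bits()
b_lut = min_lut_address_bits()
print(f"accumulator bits: {b_acc}, f_res = {freq_resolution(b_acc) / 1e3:.1f} kHz")
print(f"LUT address bits: {b_lut}")
```

These estimates agree with the simulated minimum of 16 accumulator bits and 9 LUT address bits for the rounding case; the additional bit needed for truncation is not captured by such a simple model.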

3) Number of LUT Data Bits:
A finite number of data bits in the sine/cosine lookup table ($b_{data}$) results in a quantization error. Generally, such a quantization error can be modeled as white noise spread over the full Nyquist bandwidth $f_s/2$ with an associated Signal-to-Quantization-Noise Ratio. Since the qubit is only sensitive to noise in a bandwidth $\mathrm{ENBW} = f_R \cdot \pi / \theta$ due to the intrinsic noise filtering of the qubit [13], only the noise within this bandwidth contributes to the expected infidelity of the driven qubit. This noise affects both the targeted and idle qubits. For certain output frequencies, however, quantization noise is more tonal (similar to Fig. 8), and the spur could be at the frequency of an idle qubit. To capture these different cases, again, different offset frequencies are used, as when determining the number of LUT entries, and a victim qubit is simulated at the frequency of the highest spur. In Fig. 10, only the simulated fidelity for an offset frequency of 450 MHz is shown for clarity, as its spectrum shows many spurious tones that affect the qubit more than expected from the white-noise model. The simulations with the various offset frequencies show that at least 8 data bits are required for a 99.999% fidelity.
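The white-noise model can be illustrated numerically. The Python sketch below uses the standard expression SQNR = 6.02 b + 1.76 dB for a full-scale sine (an assumption, since the exact expression used in the analysis is not reproduced here) and the noise bandwidth ENBW = f_R · π/θ to estimate the signal-to-noise ratio seen by the qubit. The mapping from this ratio to a fidelity is deliberately left out; all names are illustrative.

```python
import math

def sqnr_db(bits):
    """Standard SQNR of an ideal quantizer driven by a full-scale sine."""
    return 6.02 * bits + 1.76

def enbw(f_rabi, theta=math.pi):
    """Equivalent noise bandwidth of the qubit, ENBW = f_R * pi / theta."""
    return f_rabi * math.pi / theta

def in_band_snr_db(bits, f_s, f_rabi, theta=math.pi):
    """SNR within the qubit noise bandwidth for noise spread evenly over f_s / 2."""
    nyquist = f_s / 2
    return sqnr_db(bits) + 10 * math.log10(nyquist / enbw(f_rabi, theta))

snr_8bit = in_band_snr_db(bits=8, f_s=2.5e9, f_rabi=10e6)
print(f"ENBW for a pi-rotation at 10 MHz : {enbw(10e6) / 1e6:.1f} MHz")
print(f"in-band SNR with 8 data bits     : {snr_8bit:.1f} dB")
```

The narrow noise bandwidth of the qubit relative to the Nyquist band provides roughly 21 dB of processing gain at a 10 MHz Rabi frequency, which is why a modest number of data bits can suffice when the quantization error is not tonal.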

4) Number of Envelope Bits:
A limited number of bits used for the envelope in the I/Q-modulation ($b_{env}$, signed) causes an error in the pulse amplitude. For a rectangular envelope, the resulting maximum amplitude inaccuracy leads to the infidelity given by Eq. 9 [13]. While the amplitude could be different for a rectangular envelope due to quantization noise, the shape of the envelope is unaffected. This is not the case for, e.g., a Gaussian envelope, where quantization leads to distortion of the envelope, affecting the signal spectrum. A simulation is performed by setting the qubit properties such that the ideal driving amplitude for a rectangular envelope is in-between two quantization levels. Although the effect of the quantization noise on another qubit may be relevant for a Gaussian envelope, and it is hence simulated as well, the results in Fig. 11 indicate that such effect is negligible. The simulated fidelity follows the prediction of Eq. 9, resulting in a minimum of 9 bits for a fidelity of 99.999%.
Fig. 11. The simulated infidelity versus the number of envelope bits when using a rectangular or Gaussian envelope for an offset frequency of 500 MHz.
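The 9-bit envelope requirement can be cross-checked with a simple amplitude-error model in Python. The sketch assumes a full-scale rectangular envelope whose worst-case error is half an LSB of a signed word, and uses the fact that an over- or under-rotation by Δθ about the intended axis gives an infidelity of sin²(Δθ/2). These modeling choices and the function names are assumptions of the sketch and may differ from the expressions used in the analysis.

```python
import math

def max_relative_amplitude_error(b_env):
    """Half an LSB of a signed b_env-bit word, relative to full scale (assumed worst case)."""
    lsb = 1 / 2 ** (b_env - 1)   # one bit is used for the sign
    return lsb / 2

def amplitude_infidelity(rel_err, theta=math.pi):
    """Infidelity of a rotation whose angle is off by theta * rel_err."""
    d_theta = theta * rel_err
    return math.sin(d_theta / 2) ** 2

def min_envelope_bits(target=1e-5):
    """Smallest signed word length meeting the target infidelity."""
    b = 2
    while amplitude_infidelity(max_relative_amplitude_error(b)) > target:
        b += 1
    return b

b_env = min_envelope_bits()
print(f"minimum envelope bits: {b_env}")
print(f"infidelity at {b_env} bits: {amplitude_infidelity(max_relative_amplitude_error(b_env)):.2e}")
```

Under these assumptions the model also yields 9 bits for a 99.999% fidelity.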

5) Number of Bits in the Correction Network:
The tolerable phase imbalance ($\phi$) and gain imbalance ($\Delta A/A$) follow from the required image rejection ratio (IRR) (Section 4.2.4 of [32]). An SFDR of 54 dB is needed to achieve a fidelity of 99.999% in the presence of the image spur (see Section IV B in [13]); the corresponding maximum gain imbalance and maximum phase imbalance are 0.4% (0.035 dB) and 0.32°, respectively. A correction network is added to compensate for inaccuracies in the analog blocks (see Fig. 5). In this correction network, the coefficients $\alpha_I$, $\alpha_Q$, $\beta_I$ and $\beta_Q$ are unsigned fractions of $b_{frac}$ bits.⁵ A gain imbalance can be compensated for by lowering either $\alpha_I$ or $\alpha_Q$, and for a maximum error of 0.4% at least 7 bits are required ($\Delta A/A = 1/2^{b_{frac}+1}$). However, since the relation is non-linear for phase imbalance, both the $\alpha$ and $\beta$ coefficients need to be adapted. As it is difficult to predict the worst-case scenario, a system-level simulation is performed where any phase imbalance from −25° to +25° is introduced and subsequently corrected using a finite number of bits. The situation of the worst-case IRR is further considered when simulating the system along with the quantum processor. The results of this simulation, when using a Gaussian envelope, are shown in Fig. 12.
It can be clearly seen that the fidelity of the qubit at the image frequency equals the fidelity as expected from the spur power (Eq. 5). Besides the victim qubit, the targeted qubit seems affected in the same way. From this simulation, it follows that at least 9 fractional bits in the fixed-point number are required to achieve a fidelity of 99.999%.
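The 7-bit requirement quoted for gain-imbalance correction can be checked numerically with the relation $\Delta A/A = 1/2^{b_{frac}+1}$ given in the text. The sketch below is only an illustration with hypothetical function names. It does not cover the non-linear phase-imbalance case, for which the text relies on system-level simulation.

```python
import math


def max_gain_residual(b_frac):
    """Worst-case residual gain error for an unsigned fraction of b_frac bits."""
    return 1.0 / 2 ** (b_frac + 1)


def min_frac_bits(max_error):
    """Smallest b_frac whose worst-case residual stays within max_error."""
    b_frac = 1
    while max_gain_residual(b_frac) > max_error:
        b_frac += 1
    return b_frac


gain_limit = 0.004                                   # 0.4% gain imbalance
gain_limit_db = 20.0 * math.log10(1.0 + gain_limit)  # same limit expressed in dB
bits_needed = min_frac_bits(gain_limit)
print(f"{gain_limit_db:.3f} dB -> at least {bits_needed} fractional bits")
```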

D. Total Digital System
To summarize, for a 99.999% fidelity, it was found that at least a 16-bit accumulator is required, of which the 9 most-significant bits, after rounding, are used to index the LUT holding 8-bit values. Moreover, both the envelope and I/Q-correction network require a 9-bit resolution. In the sensitivity analysis, only the part of the digital datapath under investigation was quantized, and hence none of the multiplier outputs were quantized. As an initial estimate for the entire digital system, these minimum specifications were used and all multiplier outputs were truncated to 9 bits, as at least 9 bits were found necessary for the envelope. Reducing the number of multiplier output bits is critical to save power and to find the minimum number of bits required for the DAC.
Fig. 12. The simulated infidelity when reducing the number of fractional bits in the fixed-point number in the I/Q-correction network. Each simulation is performed at the worst-case phase imbalance, and uses a Gaussian envelope to drive a qubit at an offset frequency of 500 MHz. The victim qubit is placed at the image frequency of -500 MHz, and its fidelity is estimated from the simulated SFDR following Eq. 5.
A full system simulation was done, where on each qubit a Gaussian-shaped microwave pulse was applied to perform a π-rotation at a 10 MHz Rabi frequency. The operating frequencies of the 32 qubits are evenly spaced over the available 2 GHz band. Furthermore, the system was again simulated with the worst DC and I/Q errors. The fidelity of the performed rotation is recorded, as well as the fidelity of all unaddressed qubits, including an additional one placed at the highest spectral spur.
The fidelity of the resulting system was limited to ∼99.996% by the unaddressed qubit at the image frequency when truncating the multiplier outputs. After implementing rounding in the multipliers of the I/Q correction network, the fidelity improved to ∼99.998%, limited by the unaddressed qubit at the highest spectral spur. When the number of LUT entry bits ($b_{lut}$) is increased by 1, the system is at the edge of achieving the desired fidelity. The result of this simulation is shown in Fig. 13. Finally, the number of accumulator bits ($b_{acc}$) is increased to 19 to ensure the required frequency accuracy when operating at the lowest Rabi frequency of 1 MHz. A summary of the specifications is given in Table III.
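The choice of a 19-bit accumulator can be made plausible with a short calculation. The sketch assumes a 2.5-GHz NCO clock and the 3.5-kHz frequency accuracy that this work quotes for a 1-MHz Rabi frequency. It further assumes that the frequency tuning word is rounded, so that the worst-case frequency error is half the NCO frequency step; this rounding assumption and the function names are not taken from the source.

```python
def nco_frequency_step(f_clk, b_acc):
    """Frequency resolution of a phase accumulator with b_acc bits."""
    return f_clk / 2 ** b_acc


def min_accumulator_bits(f_clk, max_freq_error):
    """Smallest accumulator width meeting a frequency-accuracy target.

    ASSUMPTION: the tuning word is rounded to the nearest code, so the
    worst-case error equals half a frequency step.
    """
    b_acc = 1
    while nco_frequency_step(f_clk, b_acc) / 2.0 > max_freq_error:
        b_acc += 1
    return b_acc


bits_1mhz = min_accumulator_bits(2.5e9, 3.5e3)
print(f"accumulator bits for 3.5 kHz accuracy: {bits_1mhz}")
```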

E. Analog Blocks
The coarse specifications for a 99.99% fidelity (100 ppm infidelity) in Table II assume an equal contribution from the different errors (∼10 ppm each), with the previously discussed digital system contributing another ∼10 ppm to the infidelity. While the assumption of equal error contribution is useful for drafting initial specifications, the trade-offs between these specifications are analyzed in this section in order to budget the different errors for feasibility.
Fig. 13. The simulated infidelity of the digital system with specifications in Table III.
TABLE III. SPECIFICATIONS FOR THE DIGITAL SYSTEM
As long as the digital clock frequency and analog gain are stable enough, the pulse amplitude, generated frequency, I/Q phase imbalance, and duration can be guaranteed by the digital section. Following Table II, a variable gain of 44 dB with stability of 0.22% is required from the analog circuit. The frequency accuracy of 3.5 kHz (for a 1-MHz Rabi frequency and a 20-GHz output) requires a 0.18 ppm frequency stability. Such stability can be achieved by a crystal oscillator [33], and easily satisfies the required duration accuracy of 0.11 ns/50 ns = 0.22%. Hence, the duration inaccuracy will hardly contribute to the infidelity.
Assuming that the same frequency generator is used to derive the clock and the LO, the tolerable frequency noise ($\sigma_f$) can be translated to the required jitter ($\sigma_t$) of the clock at frequency $f_0$, assuming the phase noise profile of a narrowband PLL with $\sim 1/f^2$ over the frequency range of interest from $f_a$ to $f_b$ ($f_a \ll f_b$). A qubit is only sensitive to noise in a bandwidth of $f_b = f_R \cdot \pi^2/4$ for a π-rotation at a Rabi frequency of $f_R$ [13]. For the case of a 1-MHz Rabi oscillation ($\sigma_f = 3.5$ kHz rms) and a 2.5-GHz clock, this requires an absolute jitter of $\sigma_t < 0.9$ ps rms ($f_a = f_b/100$ for a total duration of ∼100 quantum operations). Consequently, the timing jitter requirement of 0.11 ns rms is well satisfied, and this error source will hardly contribute to the infidelity. Achieving such a frequency noise is, however, not trivial; assuming the same phase noise profile, a single-sideband phase noise of −116 dBc/Hz is required at a 1-MHz offset from the carrier.
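The quoted jitter and phase-noise figures can be reproduced under a simple model. The sketch below is a reconstruction, not necessarily the paper's own expression. It assumes a single-sideband phase noise $L(f) = c/f^2$ between $f_a$ and $f_b$, so that the frequency-noise density is white ($2c$) and the rms phase error is $\sqrt{2c\,(1/f_a - 1/f_b)}$. Combining the two gives $\sigma_t = \sigma_f / (2\pi f_0 \sqrt{f_a f_b})$. The function names are hypothetical.

```python
import math


def clock_jitter_rms(sigma_f, f_0, f_a, f_b):
    """RMS jitter for a 1/f^2 phase-noise profile integrated from f_a to f_b."""
    return sigma_f / (2.0 * math.pi * f_0 * math.sqrt(f_a * f_b))


def ssb_phase_noise_dbc_hz(sigma_f, f_a, f_b, f_offset):
    """SSB phase noise (dBc/Hz) at f_offset, given L(f) = c / f^2."""
    c = sigma_f ** 2 / (2.0 * (f_b - f_a))  # from sigma_f^2 = 2 c (f_b - f_a)
    return 10.0 * math.log10(c / f_offset ** 2)


f_rabi = 1e6
f_b = f_rabi * math.pi ** 2 / 4.0  # qubit noise bandwidth for a pi-rotation
f_a = f_b / 100.0                  # ~100 quantum operations
sigma_t = clock_jitter_rms(3.5e3, 2.5e9, f_a, f_b)
l_1mhz = ssb_phase_noise_dbc_hz(3.5e3, f_a, f_b, 1e6)
print(f"jitter: {sigma_t * 1e12:.2f} ps rms, L(1 MHz): {l_1mhz:.1f} dBc/Hz")
```

With the values from the text, this model yields about 0.9 ps rms and about −116 dBc/Hz, in line with the stated requirements.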
As the maximum output swing of −16 dBm (50 mV$_p$) can be directly generated by the DAC, no gain is assumed in the following stages, thus each stage contributes equally to the noise⁶ and distortion. As a representation of those blocks, a single-stage CMOS class-A resistive-loaded common-source amplifier, which can serve as the 50-Ω output driver,⁷ is analyzed in the following.
The maximum RMS output voltage of such an amplifier and its RMS output noise voltage are set by the bias current $I_d$, the load resistance $R_L$, Boltzmann's constant $k_B$, the temperature $T$, the excess noise factor $\gamma \sim 2$ for sub-micron devices, and the device transconductance $g_m$; the Signal-to-Noise Ratio follows as the ratio of the two. Assuming $T = 300$ K, a transistor overdrive voltage for which $g_m/I_d \sim 10\ \mathrm{V}^{-1}$, and a bandwidth over which the qubit is sensitive to amplitude noise of $BW = f_R$ (for a π-rotation) [13], it is found that a bias current $I_d > 0.66$ μA is required to achieve the 50-dB SNR requirement with a 10-MHz Rabi frequency. Note that this is easily satisfied, as $I_d > 1$ mA is required to obtain the desired output voltage swing over a 50-Ω load.
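The 0.66-μA figure can be reproduced with a simple noise model. The sketch below is a reconstruction under stated assumptions, not necessarily the paper's own derivation. It takes a maximum RMS output voltage of $I_d R_L/\sqrt{2}$ and an output noise dominated by the transistor channel noise, $\sqrt{4 k_B T \gamma g_m \cdot BW}\, R_L$, which gives $SNR = I_d / (8 k_B T \gamma\, (g_m/I_d)\, BW)$. The function name is hypothetical.

```python
K_B = 1.380649e-23  # Boltzmann's constant in J/K


def min_bias_current(snr_db, bandwidth, gm_over_id=10.0, gamma=2.0, temp=300.0):
    """Bias current (A) needed to reach snr_db in the given noise bandwidth.

    ASSUMPTION: class-A stage with V_out,rms = I_d * R_L / sqrt(2) and
    channel thermal noise only; R_L then cancels out of the SNR.
    """
    snr_linear = 10.0 ** (snr_db / 10.0)
    return snr_linear * 8.0 * K_B * temp * gamma * gm_over_id * bandwidth


i_noise = min_bias_current(50.0, 10e6)  # 50-dB SNR, BW = f_R = 10 MHz
i_swing = 50e-3 / 50.0                  # 50 mV peak over a 50-ohm load
print(f"noise-limited: {i_noise * 1e6:.2f} uA, swing-limited: {i_swing * 1e3:.1f} mA")
```

The swing requirement exceeds the noise requirement by more than three orders of magnitude, which is why the text treats the SNR specification as easily satisfied.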
Assuming a CMOS single-ended amplifier and an ideal square-law device, the 2nd-order distortion (Section 5.7 of [32]) depends on the input voltage $V_{in}$, the gate-to-source voltage $V_{gs}$ and the device threshold voltage $V_T$. Given the requirement of HD2 < −44 dB, and assuming no gain ($V_{in,max} = 50$ mV$_p$), an unrealistic overdrive voltage $V_{gs} − V_T > 2$ V is required.
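The overdrive figure of about 2 V can be checked with the textbook second-harmonic ratio of an ideal square-law device, $HD2 = V_{in} / (4\,(V_{gs} - V_T))$. This follows from expanding $(V_{gs} - V_T + V_{in}\cos\omega t)^2$, and it is assumed here to match the expression referenced from [32]. The function name is hypothetical.

```python
def min_overdrive_for_hd2(v_in_peak, hd2_db):
    """Overdrive voltage (V) keeping HD2 of a square-law stage below hd2_db.

    ASSUMPTION: HD2 = V_in / (4 * (V_gs - V_T)) for an ideal square-law device.
    """
    hd2_linear = 10.0 ** (hd2_db / 20.0)
    return v_in_peak / (4.0 * hd2_linear)


v_ov_hd2 = min_overdrive_for_hd2(50e-3, -44.0)
print(f"required overdrive for HD2 < -44 dB: {v_ov_hd2:.2f} V")
```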
In order not to be limited by HD2, a differential circuit topology can be considered, whose 3rd-order distortion (Section 5.7 of [32]) depends on the peak output amplitude $V_{out,p}$. Achieving HD3 < −44 dB requires an overdrive $V_{gs} − V_T > 0.15$ V (assuming the gain $g_m \cdot R_L = 1$). For a device in saturation, $g_m = 2 I_d / (V_{gs} − V_T)$. With a $g_m$ of 1/(50 Ω), a bias current larger than 1.5 mA is required.⁸ Finally, an SFDR < −44 dB requires a DAC with an effective number of bits (ENOB) of 7.
⁶ The DAC quantization noise is already accounted for in the digital specifications.
⁷ The same noise analysis is also valid for e.g. a current-steering DAC.
To summarize, the proposed design requirements are specified in Table IV. Of these requirements, the reference clock stability and LO frequency noise requirements appear most stringent. As the duration accuracy, timing jitter and amplitude noise specifications of Table II are most easily satisfied, their error contribution can be reduced to relax the specifications on the more stringent ones to save power. However, the SFDR as specified in Table II is not part of any error budgeting as it is the only error source considered affecting idle qubits, and hence this specification cannot be relaxed.

F. Power Consumption Estimate
While an accurate estimation of the power consumption requires knowledge of the exact digital and analog circuit implementation, in this section an estimate is given based on the previously found specifications and implementation examples found in literature.
A direct digital synthesizer with similar specifications (9-bit amplitude, 2-GHz clock and 55-dB SFDR), has been implemented in 55-nm CMOS while consuming 25 mW in the 32-bit NCO and 37 mW in the phase-to-amplitude conversion [34]. Considering our system with 32 19-bit NCOs operating at 2.5 GHz and a single phase-to-amplitude conversion block, a power consumption of 640 mW is expected. Similarly, in 65-nm CMOS, a 10-bit multiplier operating at 2.5 GHz consumes 14 mW [35], and hence an additional 112 mW is expected in our digital modulation and I/Q correction network (8 multipliers), bringing the total digital power consumption to ∼750 mW. Based on the study presented in [36], a power consumption of ∼160 mW is expected in a 22-nm CMOS node, with 80% of the power consumed in the NCOs, i.e. 4 mW/NCO.
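The digital power figures above can be reproduced if the reference design in [34] is scaled linearly with accumulator width and clock frequency. This scaling rule is an inference from the quoted numbers and is not stated explicitly in the text; the function and variable names are hypothetical.

```python
def scale_power_mw(p_ref_mw, bits, bits_ref, f_clk_ghz, f_clk_ref_ghz):
    """Scale a reference power linearly with word width and clock frequency."""
    return p_ref_mw * (bits / bits_ref) * (f_clk_ghz / f_clk_ref_ghz)


# Reference DDS [34]: 32-bit NCO at 25 mW, phase-to-amplitude at 37 mW, 2 GHz
p_nco = scale_power_mw(25.0, 19, 32, 2.5, 2.0)  # one 19-bit NCO at 2.5 GHz
p_pac = scale_power_mw(37.0, 1, 1, 2.5, 2.0)    # shared phase-to-amplitude block
p_dds = 32 * p_nco + p_pac                      # 32 NCOs, one conversion block
p_mult = 8 * 14.0                               # 8 multipliers at 14 mW each [35]
p_digital = p_dds + p_mult

# 22-nm estimate from the text: ~160 mW total, 80% of it in the 32 NCOs
p_nco_22nm = 0.8 * 160.0 / 32
p_single_nco_22nm = 160.0 - 31 * p_nco_22nm
print(p_dds, p_digital, p_nco_22nm, p_single_nco_22nm)
```

Under this scaling, the NCOs dominate the digital power, which is consistent with the per-qubit cost of 4 mW/NCO quoted for the 22-nm node.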
As found from the analog specifications, a single-transistor bias current of 1.5 mA is required to meet the linearity requirement if the entire circuit consists of a single stage. However, a more realistic implementation consists of at least 2 stages contributing to the distortion, e.g. current-steering DACs driving a 50-Ω passive reconstruction filter, which in turn drives a double-balanced I/Q mixer driving the 50-Ω output load. Considering a 2-stage implementation, a single-transistor bias current of $1.5\ \mathrm{mA} \cdot \sqrt{2} = 2.1$ mA is required.⁹ As there are 2 stages, each differential, with I/Q, a total current of at least 17 mA is required (17 mW with a 1-V supply).
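The analog current budget can be tallied in a few lines. The sketch uses the square-law relation $I_d = g_m (V_{gs} - V_T)/2$ with the values given in the text. The comment on why the factor $\sqrt{2}$ appears for two stages is a plausible interpretation and is not taken from the source; the variable names are hypothetical.

```python
import math

g_m = 1.0 / 50.0              # transconductance in siemens (1 / 50 ohm)
v_ov = 0.15                   # overdrive voltage in volts for HD3 < -44 dB
i_single = g_m * v_ov / 2.0   # square-law device in saturation -> 1.5 mA

# Two stages share the distortion budget. INTERPRETATION: halving HD3 per stage
# needs sqrt(2) more overdrive and, at fixed g_m, sqrt(2) more bias current.
i_two_stage = i_single * math.sqrt(2.0)

n_stages, n_diff, n_iq = 2, 2, 2
i_total = i_two_stage * n_stages * n_diff * n_iq
p_analog_mw = i_total * 1.0 * 1e3  # 1-V supply, result in mW
print(f"{i_two_stage * 1e3:.1f} mA per transistor, {i_total * 1e3:.0f} mA total")
```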
As about 36 mW is expected for the digital section in case of a single NCO and further reduction in digital power is promised going to a more advanced CMOS node, the power consumption is well-balanced between the analog and digital section, with another 4 mW required for every NCO, i.e. qubit, that is added.
V. APPLICATION EXAMPLE
Compared to state-of-the-art controllers based on general-purpose instruments or tailor-made controllers employing FPGAs [16], the presented solution offers the highest number of frequency-multiplexed control channels and is maximally tailored to the quantum processor requirements, allowing for a reduced power consumption. Implementing the proposed controller as a CMOS SoC will reduce its form factor, potentially enabling operation of this power-efficient controller physically close to the qubits. The advantages of such a digital-intensive microwave signal generator can be observed by considering application examples for qubit control, as illustrated in this section.

A. Qubit Tune-Up
Besides the intended application of performing single-qubit operations, the control architecture can, for example, be used to tune up the qubit processor. Part of this tune-up protocol is to find the qubit resonance frequency. The adiabatic fast passage technique uses a chirp pulse to sweep the microwave frequency across the spin resonance frequencies of multiple qubits in an FDMA setup, thereby smoothly rotating all spins whose resonant frequencies lie within the range [37]. Such a chirp pulse can be readily generated with the presented system architecture by programming the in-phase (I) and quadrature-phase (Q) parts of the envelope (Eq. 17) for envelope samples $n = 1$ to $N$, resulting in a chirp from frequency $f_{min}$ to $f_{max}$ using $N$ samples (total chirp time $T_{chirp} = N/f_s$) and amplitude $A$.
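A linear chirp envelope of this kind can be sketched as follows. This is a generic constant-amplitude I/Q chirp obtained by accumulating the instantaneous frequency into a phase; it is not a transcription of Eq. 17. The function name, the zero-based sample index and the example parameters are assumptions.

```python
import math


def linear_chirp_envelope(f_min, f_max, n_samples, f_s, amplitude=1.0):
    """Return I and Q envelope samples of a linear chirp from f_min to f_max.

    Frequencies are baseband offsets (Hz) and may be negative.
    Requires n_samples >= 2.
    """
    i_env, q_env = [], []
    phase = 0.0
    for n in range(n_samples):
        # Instantaneous frequency ramps linearly over the N samples.
        f_inst = f_min + (f_max - f_min) * n / (n_samples - 1)
        i_env.append(amplitude * math.cos(phase))
        q_env.append(amplitude * math.sin(phase))
        phase += 2.0 * math.pi * f_inst / f_s  # accumulate phase per sample
    return i_env, q_env


# Hypothetical example: sweep -500 MHz to +500 MHz in 1000 samples at 2.5 GS/s
i_env, q_env = linear_chirp_envelope(-500e6, 500e6, 1000, 2.5e9, amplitude=0.5)
print(len(i_env), i_env[0], q_env[0])
```

Other sweep profiles can be obtained by replacing the linear ramp of `f_inst` with a different function of `n`.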
As an example, a binary search for the qubit frequency is shown in Fig. 14 using the designed system. Multiple frequency chirps are used over a frequency band that is halved every cycle of the search, narrowing down on the actual qubit frequency indicated by the black dashed line. In the first chirp, the frequency is swept over the lower sideband (LSB) from 5 to 6 GHz, and it is observed whether the qubit rotates or not. In case the qubit rotates, the qubit resonance frequency is in the LSB and the search continues there, otherwise the search continues in the upper sideband (USB). In order to keep the power spectral density the same when the frequency band is halved, the signal amplitude is gradually reduced with each step.
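The search procedure described above can be sketched as a plain binary search. The measurement callback `qubit_rotates` is hypothetical and stands in for applying a chirp over a band and reading out the qubit. Scaling the amplitude by $1/\sqrt{2}$ per step is an assumption that keeps the power spectral density constant when the chirp duration is fixed; the text only states that the amplitude is reduced with each step.

```python
import math


def binary_search_frequency(qubit_rotates, f_low, f_high, resolution, amplitude=1.0):
    """Narrow down the qubit frequency by halving the chirped band each cycle."""
    history = []
    while f_high - f_low > resolution:
        f_mid = 0.5 * (f_low + f_high)
        history.append((f_low, f_mid, amplitude))  # chirp over the lower half
        if qubit_rotates(f_low, f_mid, amplitude):
            f_high = f_mid                         # qubit is in the lower half
        else:
            f_low = f_mid                          # otherwise in the upper half
        amplitude /= math.sqrt(2.0)                # ASSUMPTION: constant PSD
    return 0.5 * (f_low + f_high), history


F_QUBIT = 6.3e9  # hypothetical qubit frequency used to mock the measurement


def mock_rotates(f_start, f_stop, amplitude):
    """Mock readout: the qubit rotates if its frequency lies in the chirp."""
    return f_start <= F_QUBIT < f_stop


estimate, history = binary_search_frequency(mock_rotates, 5e9, 7e9, 1e6)
print(f"estimate: {estimate / 1e9:.4f} GHz after {len(history)} chirps")
```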
In the presented example (Eq. 17), a linear frequency sweep is implemented. However, any other profile can be implemented as well, which could be more efficient in determining the qubit resonance frequency [37]. Thanks to the high-speed DACs and digital back-end that allows modulation over the full data bandwidth, such a frequency chirp can be easily implemented in the presented system.
Besides the qubit resonance frequency, the required pulse duration and amplitude should be determined during tune-up to calibrate the rotation angle. This is typically done by performing a Rabi oscillation where either the pulse duration or amplitude is incremented in small steps and the resulting rotation angle is measured. Due to the option to program any pulse envelope, the pulse duration and/or amplitude can easily be varied to perform such a Rabi oscillation and finalize the calibration of the qubit operation.

B. Multi-Qubit Simultaneous Excitation
As stated previously, the system is optimized by assuming sequential execution of operations on different qubits. However, the chosen system architecture supports the excitation of multiple qubits simultaneously when a digital modulator and correction network are provided for each channel (Fig. 15). The required DAC resolution then increases with the number of simultaneous channels $N_{ch}$. For the following example, the simulation model is adapted to allow for the simultaneous excitation of 2 qubits with a 10-bit DAC.
Fig. 15. Simplified block diagram of the system for the case of 2 simultaneous excitation channels.
When simultaneously exciting 2 qubits using standard Gaussian envelopes (Fig. 17, amplitude modulation only), the fidelity is limited when the qubits are close in frequency due to the AC-Stark shift [15]¹⁰ (Fig. 16). This effect shifts the resonance frequency of the qubit when an off-resonance pulse is applied. To account for this frequency shift and to compensate for its effect, phase modulation must be added besides the Gaussian amplitude modulation [15], as shown in Fig. 17 (top). The resulting in-phase and quadrature-phase components that are used for the digital modulation are shown in Fig. 17 (bottom). With these compensated Gaussian envelopes, a high fidelity can be achieved for 2 qubits spaced closely in frequency while being driven simultaneously in the presented control system (Fig. 16).
Thanks to the digital-intensive back-end that allows individual I/Q modulation for each channel, simultaneous excitation of multiple qubits is easily implemented in the presented system.

VI. CONCLUSION
Deriving the system specifications of the classical electronic controller for qubits and determining the optimal error budget are crucial in designing power-efficient circuits. To meet these specifications, several system architectures and their design trade-offs have been compared in this paper, resulting in the proposal of an efficient architecture exploiting frequency multiplexing for multi-qubit control. Co-simulation of the proposed electronic system and the qubits was used to assess the effect of non-idealities of each circuit block on qubit fidelity. Based on such analysis, the design specifications of each block have been determined to achieve the required gate fidelity while optimizing power consumption. Finally, the effectiveness and flexibility of such a system have been shown by demonstrating relevant practical applications, such as qubit tune-up and simultaneous qubit excitation. As a result of the proposed design methodology, we have obtained the blueprint for a power-efficient integrated electronic controller to realize single-qubit operations for practical large-scale quantum computers.