A 4 GHz Continuous-Time ΔΣ ADC With 70 dB DR and −74 dBFS THD in 125 MHz BW

Bolatkale, Muhammed; Breems, LJ; Rutten, Robert; Makinwa, Kofi

DOI
10.1109/JSSC.2011.2164963

Publication date
2011

Document Version
Accepted author manuscript

Published in
IEEE Journal of Solid-State Circuits

A 4GHz Continuous-Time ∆Σ ADC with 70dB DR and −74dBFS THD in 125MHz BW

Muhammed Bolatkale, Member, IEEE, Lucien J. Breems, Senior Member, IEEE, Robert Rutten, Kofi A. A. Makinwa, Fellow, IEEE

Abstract

A 4GHz 3rd-order continuous-time ∆Σ ADC is presented with a loop filter topology that absorbs the pole caused by the input capacitance of its 4-bit quantizer and also compensates for the excess delay caused by the quantizer’s latency. The ADC was implemented in 45nm-LP CMOS and achieves 70dB DR and −74dBFS THD in a 125MHz BW, while dissipating 260mW from 1.1V and 1.8V supplies. The ADC occupies 0.9mm² including the modulator, clock circuitry and decimation filter.

Index Terms

Analog-to-digital conversion, oversampling ADCs, CMOS analog integrated circuits, continuous-time sigma-delta modulation, delta-sigma modulator, continuous-time filters, multi-bit, wireless communication, radio receivers, base stations.

I. INTRODUCTION

Analog-to-Digital converter (ADC) developments are driven by the increasing demand for signal bandwidth and dynamic range in applications such as wireline and wireless communications, medical imaging and high-definition video processing. Multi-channel applications such as digital FM (DFM) and LTE-advanced require ADCs whose signal bandwidth ranges from 20MHz-100MHz and whose dynamic range (DR) is greater than 70dB [1]–[3]. To achieve high data rates, these applications rely on advanced digital modulation techniques that can be advantageously implemented in nanometer-CMOS, which motivates the development of suitable ADCs in these technologies. Positioning the ADC close to the RF input simplifies the design of the analog front-end. However, it makes greater demands on ADC performance, especially in terms of linearity and DR. Furthermore, the low supply voltages of nanometer-CMOS make ADC design even more challenging.

M. Bolatkale, L. J. Breems and R. Rutten are with NXP Semiconductors, Eindhoven, The Netherlands, Email: muhammed.bolatkale@nxp.com.

K. A. A. Makinwa is with the Electronic Instrumentation Laboratory, Delft University of Technology, Delft, The Netherlands.

A plot of dynamic range vs. bandwidth for various power efficient (<1pJ/conv.) state-of-the-art ADCs is shown in Fig. 1. As can be seen, many switched-capacitor Nyquist ADCs achieve both wide bandwidths and high DR. To achieve high DR, a Nyquist ADC requires a large input capacitance, which is determined by thermal noise requirements. However, they must be preceded by an anti-aliasing filter and an input buffer capable of driving their thermal-noise-limited input capacitance, both of which increase system complexity and power.

Pipeline ADCs are the most common type of Nyquist ADC. They can achieve sampling speeds of up to 125MHz in standard CMOS [4]–[6]. To achieve higher sampling rates, a Bi-CMOS or SiGe Bi-CMOS process can be used at the cost of higher power consumption due to their higher supply voltages (1.8V-3.0V) [7], [8]. A further drawback of pipeline ADCs is the fact that they typically rely on high-gain wideband residue amplifiers and/or complex calibration techniques to reduce gain errors [5]–[7], thus increasing their area and complexity.

Recently, Nyquist ADCs based on the successive approximation register (SAR) architecture have achieved signal bandwidths of up to 50MHz with 56-65dB DR and excellent power efficiency (<80fJ/conv.) [9]–[12]. Greater bandwidth can be achieved by using time-interleaving. However, the linearity of a time-interleaved SAR ADC is limited by gain, offset, and timing errors and so such ADCs also require extensive calibration [13]. Furthermore, time interleaving increases the input capacitance and chip area, since many slices are required for interleaving [14].

By contrast, continuous-time (CT) Delta-Sigma (ΔΣ) ADCs have a simple resistive input that does not require the use of a power hungry input buffer or an anti-aliasing filter. When implemented in CMOS, such ADCs have achieved signal bandwidths of up to 25MHz with 70-80dB dynamic range and good power efficiency (<350fJ/conv.) [15]–[17]. Typical CTΔΣ modulators employ a high-order loop filter with a multi-bit quantizer, which, for a 20MHz bandwidth, require sampling frequencies of 0.5-1GHz to achieve more than 70dB of dynamic range. Assuming that sampling frequency is proportional to bandwidth, sampling frequencies of 2.5-5GHz will then be required to achieve bandwidths greater than 100MHz. However, at GHz sampling rates, parasitic poles and quantizer latency can easily cause modulator instability.

CTΔΣ modulators with signal bandwidths up to 20-25MHz have been implemented in 90nm-130nm CMOS. The switching speed of an NMOS transistor in 45nm CMOS is approximately 1.6x better than in 90nm CMOS and 2.7x better than in 130nm CMOS [18]. Implementing a ΔΣ modulator in 45nm-LP CMOS is therefore advantageous for circuits such as quantizers and DACs, whose delay is important for stability. However, the dynamic range of circuits in 45nm CMOS is limited by the low intrinsic gain and poor matching of the transistors [19], [20]. Furthermore, the low operating supply (1.0V-1.1V) implies that cascaded stages are required to realize gain in blocks such as an OTA or a quantizer. Therefore, the intrinsic speed of 45nm-LP CMOS cannot be fully utilized. To realize CTΔΣ modulators with bandwidths greater than 100MHz, innovations are still required at the system level.

In this paper, a high-speed filter topology is proposed that overcomes these limitations and enables GHz sampling rates and state-of-the-art power efficiency. The 4GHz CTΔΣ ADC is implemented in 45nm-LP CMOS and achieves 70dB DR and −74dBFS THD in a 125MHz bandwidth [21]. Section II describes the fundamental limitations and tradeoffs encountered in the design of a CTΔΣ ADC at GHz sampling speeds. Section III discusses the implementation details and Section IV describes the ADC’s measurement setup and presents the measurement results.

II. SYSTEM-LEVEL DESIGN

A. CT ΔΣ Modulators at High Sampling Rates

In Fig. 2, a basic model of a single-loop ΔΣ modulator is shown. It has three main building blocks: a loop filter, a quantizer and a digital-to-analog converter (DAC) in the feedback path. The signal-to-quantization-noise-ratio (SQNR) and bandwidth of such a modulator depend on three main parameters: loop filter order, quantizer resolution, and sampling frequency (\(f_s\)). Signal bandwidth (BW) and \(f_s\) are linked via the oversampling ratio \(OSR = f_s/(2 \times BW)\). Fig. 3 illustrates the relation between the three design parameters in a CTΔΣ modulator. Each point in Fig. 3 is taken from simulation results and corresponds to 80dB\(^1\) SQNR in 125MHz BW. It can be seen that achieving bandwidths in excess of 100MHz requires GHz sampling frequencies. A 1-bit quantizer is the most suitable for high-speed operation since its relaxed offset requirements lead to low area and small parasitic capacitances. For example, a 35GHz 1-bit 2nd-order modulator has been demonstrated in SiGe BiCMOS [22] with 55dB DR in a 100MHz bandwidth. However, in currently available CMOS processes, such sampling frequencies are impractical. Moreover, for sampling frequencies greater than 30-40GHz, the DR of the ADC will be limited by non-idealities such as clock jitter and quantizer metastability [23].

\(^1\)To design a thermal noise limited ADC with DR of 70dB, the SQNR is set to at least 10dB better than target DR.
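The relation between bandwidth, sampling frequency, loop-filter order and quantizer resolution can be illustrated with a short calculation. The sketch below computes the OSR for the operating point discussed in this paper and, purely as a rough upper bound, evaluates the textbook SQNR expression for an ideal noise transfer function of the form (1 − z⁻¹)^L. That expression is an assumption introduced here for illustration; the points in Fig. 3 come from simulation and include the NTF scaling needed for stability, so they lie below this bound.

```python
import math

def osr(fs_hz, bw_hz):
    """Oversampling ratio, OSR = fs / (2 * BW), as defined in the text."""
    return fs_hz / (2.0 * bw_hz)

def ideal_sqnr_db(order, bits, osr_value):
    """Textbook peak SQNR for an ideal modulator with NTF = (1 - z^-1)^order.

    Assumptions (not from the paper): white quantization noise and no
    NTF scaling for stability, so this is an optimistic bound.
    """
    return (6.02 * bits + 1.76
            + (20 * order + 10) * math.log10(osr_value)
            - 10 * math.log10(math.pi ** (2 * order) / (2 * order + 1)))

ratio = osr(4e9, 125e6)                  # 4 GHz clock, 125 MHz bandwidth
bound_db = ideal_sqnr_db(3, 4, ratio)    # 3rd order, 4-bit quantizer
print(f"OSR = {ratio:.0f}")
print(f"Ideal SQNR bound = {bound_db:.1f} dB")
```

Under these assumptions the OSR is 16 and the ideal bound is a little under 90dB, which leaves some margin above the 80dB SQNR target.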

For the same SQNR, the sampling frequency of a CT∆Σ modulator can be reduced by using a multi-bit quantizer. However, the maximum sampling frequency will usually be limited by the quantizer’s latency and the parasitic loop-filter pole caused by its input capacitance. In practice, quantizers with up to 4-bit resolution are used as a compromise between complexity, latency and the power dissipation in the clock distribution network [1], [24], [25]. For a given quantizer resolution, increasing the loop-filter order also relaxes the sampling frequency. However, higher-order loop filters require more coefficients to stabilize the modulator, thus increasing its complexity. Moreover, the loop-filter coefficients will drift due to process, voltage and temperature (PVT) variations, and may cause SQNR degradation.

Despite the drawbacks of higher-order loop filters and multi-bit quantizers, they do facilitate lower sampling frequencies\(^2\). To meet the target specification of 80dB SQNR in a 125MHz bandwidth, a 3rd-order single-loop modulator with a 4-bit quantizer\(^3\) sampled at 4GHz was chosen. In the 45nm-LP process used, this choice was found to be a good trade-off between sampling frequency and circuit complexity. However, the use of a 4-bit quantizer reduces the maximum achievable sampling rate, due to its delay and input capacitance. This paper proposes a high-speed filter topology that overcomes these limitations and enables the use of GHz sampling frequencies.

\(^2\)MASH ∆Σ modulators offer another route to low sampling frequencies [26]. However, their signal bandwidths still depend on the signal bandwidth of a single-loop modulator. Although this work focuses on extending the signal bandwidth of a single-loop modulator, the results can also be used to increase the signal bandwidth of MASH modulators.

\(^3\)This architecture has been commonly used in CMOS ∆Σ modulators with 10-25MHz BW.
B. High-speed Capacitive Feedforward CT ∆Σ Modulator

In this work, the main challenge is the need to achieve both high DR and wide signal bandwidth with a CT ∆Σ modulator. To achieve the target DR, three requirements must be satisfied. The first is related to thermal noise and total-harmonic distortion (THD), which have to be better than -70dB in 125MHz BW and –70dBFS, respectively. The second is clock jitter, which, based on system-level simulations, requires clock buffers with less than 250fsec (rms) of jitter. The third, and most difficult, requirement is the need to maintain modulator stability while operating at a sampling frequency of 4GHz. The first two requirements can be met by dissipating more power in the associated circuitry. However, the relationship between modulator stability and power consumption is more complex. For instance, a quantizer must generate a valid digital output within a fraction of a sampling-clock cycle to maintain modulator stability, which implies that more power must be dissipated at higher sampling frequencies. Similar requirements exist for the loop filter and the DAC, since at GHz sampling rates, the delay associated with parasitic poles must be overcome by dissipating more power.

The circuit blocks of a ∆Σ modulator will all have a certain delay associated with the limited speed of the available transistors. As shown in Fig. 4a, both the 4-bit quantizer and DAC1 in this design are allowed to have a delay of half a clock period (125ps), so the total delay in the loop is one clock period, which would lead to an unstable modulator. An attractive solution that can be implemented in 45nm-LP CMOS is to compensate for the loop delay in the digital domain [27]. However, extra hardware is required, which introduces additional delay and further pushes the digital circuitry to its limits. Moreover, a part of the DR is used for compensating the delay in the digital domain [28]. Considering the drawbacks of the digital approach, an analog delay compensation method is used in this design. A second feedback path comprising a multi-bit D/A converter (DAC2) is employed (Fig. 4b) to compensate the loop delay [29]. This bypasses the loop filter and creates a stable 1st-order ∆Σ modulator at high frequencies. The presence of DAC2 stabilizes the modulator. However, it requires the implementation of a wideband summation node at the input of the quantizer.
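The delay budget described above follows directly from the sampling frequency. The following sketch reproduces the arithmetic; the variable names are illustrative and only the 4GHz clock and the half-period allocations are taken from the text.

```python
FS_HZ = 4e9  # sampling frequency stated in the text

def clock_period_ps(fs_hz):
    """Clock period in picoseconds for a given sampling frequency."""
    return 1e12 / fs_hz

period_ps = clock_period_ps(FS_HZ)

# Half a clock period is allocated to the quantizer and half to DAC1.
quantizer_delay_ps = 0.5 * period_ps
dac1_delay_ps = 0.5 * period_ps

# Total excess loop delay, expressed in clock periods.
loop_delay_periods = (quantizer_delay_ps + dac1_delay_ps) / period_ps

print(f"Clock period:    {period_ps:.0f} ps")
print(f"Quantizer delay: {quantizer_delay_ps:.0f} ps")
print(f"DAC1 delay:      {dac1_delay_ps:.0f} ps")
print(f"Loop delay:      {loop_delay_periods:.1f} clock periods")
```

A full clock period of excess loop delay is what makes the fast path through DAC2 necessary.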

\(^4\)The unity gain frequencies of the integrators and of the OTAs have been chosen such that the loop filter’s excess delay is negligible.

A summation node can be implemented either by using a voltage summing amplifier or by using the virtual ground node of the last integrator of the loop filter to sum differentiated signals in the current domain [15]. In both cases, an amplifier will be employed, which has a finite output resistance ($R_{OUT}$). As shown in Fig. 4c, this will interact with the input capacitance ($C_Q$) of the 4-bit quantizer to introduce an additional pole that degrades stability. To preserve stability, the amplifier that implements the summing node must have a wide bandwidth for low delay, as well as high gain for reducing the variation of the loop-filter coefficients over PVT. These stringent requirements result in a power hungry summing amplifier.

The proposed solution is to eliminate the active summation node and connect the loop filter directly to the quantizer. By implementing the last stage of the loop filter as a transconductor, the quantizer’s parasitic capacitance $C_Q$ can be used to realize one of the loop filter poles. The output current of the transconductor will then be directly integrated over $C_Q$ (Fig. 5). To ensure stability, however, there must still be a high-speed path around the quantizer to compensate for its latency. As shown in Fig. 5, this can be implemented with a current-steering DAC (DAC2) that is driven by a digital differentiator ($1 - z^{-0.5}$) [15].

Fig. 6 shows the block diagram of the proposed 3rd-order single-loop capacitive feedforward CT ΔΣ modulator. To minimize its power consumption, the loop filter employs a feedforward topology instead of a feedback topology. A feedback topology will require more DACs to implement feedback coefficients which, at GHz sampling frequencies, will significantly load the virtual grounds of the amplifiers. On the other hand, a feedforward topology requires a summation node for its feedforward coefficients. Since $C_Q$ can be used as a wideband passive summation node only for differentiated signals in the current domain, the feedforward voltages must be appropriately processed. This can be simply achieved by connecting capacitors $C_{A1}$ and $C_{A2}$ between the summing node and the outputs of the 1st and 2nd integrators. Furthermore, an overall feed-forward path is implemented by $C_{A0}$ to relax the requirements on the loop filter’s linearity [30] and to reduce the peaking in the signal transfer function of the modulator at the cost of lower anti-alias filtering. The feedforward coefficients can be expressed as:

$$a_n = \frac{C_{An}}{C_{TOTAL}}$$

where $C_{TOTAL} = C_{A0} + C_{A1} + C_{A2} + C_Q + C_{DAC2}$ is the total capacitance connected to the output of the loop filter. The feedforward capacitors ($C_{A0}$, $C_{A1}$, $C_{A2}$) are implemented by fringe capacitors. The total capacitance also includes parasitic capacitances such as the input capacitance of the 4-bit quantizer ($C_Q$) and the output capacitance of DAC2 ($C_{DAC2}$). The parasitic capacitances vary with the voltage swing present at the summing node, but this nonlinear part is negligible compared to $C_{TOTAL}$. The passive summation requires that $(a_0 + a_1 + a_2) = 1 - (C_Q + C_{DAC2})/C_{TOTAL}$, which can be guaranteed by design.
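The capacitive-divider relation for the feedforward coefficients can be checked numerically. In the sketch below, all capacitor values are hypothetical placeholders chosen only to make the arithmetic easy to follow; the paper does not state the actual values.

```python
def feedforward_coefficients(c_a, c_q, c_dac2):
    """Return ([a_0, a_1, a_2], C_TOTAL) for the passive summing node.

    c_a    : feedforward capacitors [C_A0, C_A1, C_A2]
    c_q    : quantizer input capacitance
    c_dac2 : DAC2 output capacitance
    All values must share the same unit.
    """
    c_total = sum(c_a) + c_q + c_dac2
    return [c / c_total for c in c_a], c_total

# Hypothetical values in fF (assumptions, not from the paper).
c_a = [40.0, 60.0, 80.0]
c_q = 150.0
c_dac2 = 70.0

coeffs, c_total = feedforward_coefficients(c_a, c_q, c_dac2)
lhs = sum(coeffs)                        # a_0 + a_1 + a_2
rhs = 1.0 - (c_q + c_dac2) / c_total     # 1 - (C_Q + C_DAC2) / C_TOTAL

print("coefficients:", coeffs)
print("sum of coefficients:", lhs, "expected:", rhs)
```

Because every coefficient shares the denominator $C_{TOTAL}$, a change in the parasitic capacitance scales all coefficients together rather than altering their ratios.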

III. IMPLEMENTATION DETAILS

A. CT ΔΣ ADC Architecture

Fig. 7 shows the architecture of the ADC in more detail. The first two integrators are implemented as RC integrators since these can operate at low supply voltages while providing the linearity required to achieve $-70\text{dB THD}$. To cancel the right-half plane zero introduced by the first integrator, a resistor ($R_z$) in series with $C_1$ is employed. The first and second OTAs are implemented as two-stage amplifiers with feed-forward frequency compensation [1]. To further optimize the gain in the band of interest, a resonator is implemented around the first two integrators by using a resistor ($R_3$). To compensate for RC spread, $C_1$, $C_2$ and $R_3$ can be individually calibrated via 5-bit networks, and the implemented tuning range is $\pm 50\%$. The third integrator is a Gm-C integrator whose linearity requirement is relaxed by the gain of the first two integrators. The third OTA is implemented as a resistively degenerated folded-cascode amplifier. Thanks to the high-speed capacitive feedforward loop filter architecture, the third OTA is not in the speed-critical path, which relaxes its bandwidth requirements. As a result, its power dissipation is negligible compared to that of the first two OTAs. The feedforward capacitors ($C_{A0}$, $C_{A1}$, $C_{A2}$) were not made trimmable, since their relative matching can be made sufficiently accurate by design. A further consideration is that the signal swing on the required selection switches could cause distortion via switches’ signal-dependent ON resistances. The bias current of the Gm-C integrator can also be programmed $\pm 50\%$ to calibrate its unity-gain frequency ($\omega_3 \propto 1/C_{TOTAL}$).

DAC2 is directly connected to the capacitive summing node. Its errors are suppressed by the loop filter’s gain, and so it was designed for 9-bit intrinsic matching. The 15-bit thermometer code output of the 4-bit quantizer is connected through a DAC driver to the 4-bit DAC1. The DAC driver resamples the high-speed data and generates digital copies for further processing. The ADC includes a thermometer-to-binary decoder, a decimation filter and low-voltage differential signaling (LVDS) buffers. The decoder demultiplexes the 4GHz data and converts the 15-bit thermometer code to $4 \times$ time-interleaved 4-bit binary code, which is then decimated by an on-chip polyphase decimation filter.
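The decoder's function can be sketched behaviorally. The example below assumes a ones-counting decoder and a simple round-robin demultiplexer; both are illustrative choices, since the text does not describe the custom decoder's internal logic.

```python
def level_to_thermometer(level, width=15):
    """Helper for the example: build a 15-bit thermometer code for 0..15."""
    if not 0 <= level <= width:
        raise ValueError("level out of range")
    return [1] * level + [0] * (width - level)

def thermometer_to_binary(code):
    """Decode a 15-bit thermometer code to an integer 0..15.

    Counting ones is an assumed decoding strategy, not the paper's circuit.
    """
    if len(code) != 15:
        raise ValueError("expected a 15-bit thermometer code")
    return sum(code)

def demux_by_4(samples):
    """Split one full-rate stream into 4 lanes at a quarter of the rate.

    Lane k receives samples k, k+4, k+8, ...
    """
    return [samples[k::4] for k in range(4)]

levels = [0, 3, 15, 7, 8, 1, 2, 4]                 # full-rate quantizer outputs
codes = [level_to_thermometer(v) for v in levels]
decoded = [thermometer_to_binary(c) for c in codes]
lanes = demux_by_4(decoded)
print(lanes)
```

With a 4GHz input stream, each of the four lanes carries 4-bit words at 1GHz.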

B. Quantizer Design and Timing Diagram of the Modulator

As shown in Fig. 8, the quantizer is a 4-bit flash converter. It consists of 15 unit elements, whose reference voltages are generated from a 15-tap resistive ladder. Since it is in the high-speed-path (Fig. 7), its delay must be less than half a sampling-clock period (125ps) to ensure loop stability. The combination of the 4-bit DAC1 and its driver (Fig. 7) must achieve similar delay while still meeting the linearity and noise requirements. Lastly, the excess delay in the path around the 4-bit flash converter (through DAC2) must be less than half a clock period. Therefore, each slice of the quantizer must drive a unit element of DAC2 with minimal buffering to avoid the excess delay and power dissipation associated with re-clocking the data at 4GHz. To meet these system level requirements, the unit elements of the 4-bit quantizer and the DAC1 driver were co-designed to minimize the total number of gates, and thus minimize the delay. Furthermore, the quantizer generates complementary digital outputs to drive DAC1 and DAC2 directly, while the high speed digital traces are routed differentially to reduce the noise injected to the substrate.
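An ideal behavioral model of the 15-comparator flash quantizer helps to make the thermometer output concrete. The sketch assumes uniformly spaced ladder thresholds and offset-free comparators; the reference voltages used are hypothetical.

```python
def ladder_thresholds(v_ref_n, v_ref_p, taps=15):
    """Uniformly spaced comparator thresholds from an ideal resistive ladder."""
    lsb = (v_ref_p - v_ref_n) / (taps + 1)
    return [v_ref_n + lsb * (k + 1) for k in range(taps)]

def flash_quantize(v_in, thresholds):
    """Return the thermometer code: one bit per comparator slice."""
    return [1 if v_in > t else 0 for t in thresholds]

# Hypothetical differential reference range of -0.5 V to +0.5 V.
thresholds = ladder_thresholds(-0.5, 0.5)

for v_in in (-0.6, 0.01, 0.6):
    code = flash_quantize(v_in, thresholds)
    print(f"v_in = {v_in:+.2f} V -> level {sum(code):2d}, code {code}")
```

Each list entry corresponds to one quantizer slice, which in the actual design drives one unit element of DAC2 directly.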

To realize high speed flash ADCs, several comparator units can be pipelined. In this design, however, the ADC must complete its operation in half a clock period, which severely limits the choice of architectures. Considering that at 4GHz the clock buffers will also consume considerable dynamic power, a three stage comparator consisting of a preamplifier, a latch and a D-FF (Fig. 8) was chosen as a trade-off between the power consumption of the clock buffers and the power consumption of a unit slice of the quantizer.

The preamplifier is a resistively-loaded NMOS pair with a reset switch connected across its output to enable fast overdrive recovery. The input pair is scaled for offset voltage and the preamplifier employs low-threshold transistors to reduce the bias current required for the intended bandwidth. The latch is realized as a differential pair that drives a cross-coupled latch. The D-FF consists of two stages: a double-tail sense amplifier [31] and a symmetrical slave latch (SL) [32]. The first stage of the D-FF is shown in Fig. 9. This architecture is suitable for low-voltage supplies since a maximum of three transistors are stacked between the supply rails. The second stage of the D-FF uses a symmetrical SL, which ensures that each of the D-FF’s outputs has equal delay, making it possible to drive DAC2 directly and thus avoid the extra delay associated with re-clocking the data. The DAC1 driver uses the same D-FF architecture.

To reduce the kickback noise on the loop filter and reference ladder, the first two stages of the comparator (the preamplifier and the latch) are biased with a static current such that their input pairs do not switch. Only the charge injection of the reset switches is then present at the input of the comparator, but this is a common-mode effect. Moreover, the kickback noise of the D-FF is suppressed by the gain of the first two stages of the quantizer. The D-FF is also designed for minimal kickback noise. The first stage of D-FF (Fig. 9) consists of a dynamic input stage ($M_{1,2}$) whose outputs ($b_n, b_p$) are connected to a cross coupled inverter ($M_{9-11}$) through $M_{7,8}$. Since the current of the latch can be optimized independently of the current of the input stage, the kickback noise caused by the switching of transistors $M_{1,2}$ can be minimized. Furthermore, $M_{7,8}$ isolates the input and output of the D-FF, which serves to further reduce the kickback noise.

The modulator’s timing is shown in Fig. 10. To ensure stability, the comparator outputs ($D_q$ and $\overline{D_q}$) must be valid after half a clock period, while the output of the DAC1 driver ($D_1$ and $\overline{D_1}$), which drives the unit current sources, must be valid in less than one clock period. To reduce the delay associated with the comparator, as well as the power in the clock buffers, a delayed-clocking scheme is adopted [33]. First, the preamplifier’s $Rst$ switch is disabled and the preamplifier starts amplifying. After a short delay (less than half a clock period), during which the preamplifier’s output settles to 4-bit accuracy, $CLK_{Latch}$ is activated, whereupon the signal is further amplified by the latch. Then $CLK_{DFF}$ is activated, after which the D-FF finalizes the comparison and generates a valid digital representation of the decision. A unit element of the DAC driver is shown in Fig. 8. It consists of a D-FF, a switch driver, and a data buffer. The thermometer output of each quantizer slice is directly connected to the corresponding unit element, where it is re-clocked on the rising edge of $CLK_{DAC1}$ (Fig. 10). The additional clocking of the data minimizes the jitter introduced by the D-FF’s data-dependent delay and metastability.

C. Feedback DACs

The ADC employs two 4-bit unary weighted DACs. DAC1 has the most stringent requirements in terms of linearity and noise, and it requires large devices to achieve the required matching. DAC2, which is connected to the output of the loop filter, has much more relaxed requirements, since its non-idealities are suppressed by the gain of the loop filter.

DAC1 is a 4-bit current-steering DAC designed for 11-bit intrinsic matching. Achieving this with MOS-only current sources consumes too much area and results in poor high-frequency linearity. Increasing the gate overdrive voltage also does not help much, and so resistively degenerated current sources were used. One unit element of the DAC is shown in Fig. 11. It consists of a resistively degenerated PMOS current source, which has better matching and lower noise than a MOS-only current source. By using a higher supply voltage for DAC1 (1.8V), R1 can be made larger, effectively reducing the noise contribution of DAC1 and reducing the ADC’s overall power consumption. Since the voltage drop on R1 is about 0.7V, $M_{1-8}$ can still be implemented using thin-oxide transistors. The D-FF and switch driver can then be optimized for the generation of the signals (with low crossover and steep edges) required to drive the PMOS switches ($M_{3,4}$) of DAC1. At high sampling rates, unequal rise and fall times at the output of DAC1 can cause inter-symbol interference (ISI) [34], [35]. To minimize this, DAC1 employs a fully differential architecture [36]. Moreover, the DAC1 driver’s D-FF and switch drivers are dimensioned to achieve better than 80dB SNR [34]. DAC1 is biased by low-noise on-chip circuitry, and for further noise suppression the bias voltage of the MOS current sources is filtered by an on-chip RC-filter. DAC1 does not use any calibration techniques such as data-weighted averaging or current-source calibration at start-up, so its linearity is limited by device matching.

D. Operational Transconductance Amplifier

As shown in Fig. 12, the OTAs of the first two integrators are implemented as two-stage feed-forward compensated amplifiers [1]. Transistors $M_{1-8}$ form the amplifier’s input stage, while transistors $M_{11,12}$ form its second stage. Transistors $M_{0,10}$ create a high-frequency feed-forward path between the input and the output, thus stabilizing the amplifier. The output common-mode voltage of the first stage is sensed by poly resistors that control the gate voltage of transistors $M_{7,8}$. Similarly, the output common-mode voltage of the second stage is controlled by an auxiliary common-mode amplifier which controls the bias voltage of transistor $M_{14}$. The designed OTA achieves 35dB DC gain and 8GHz UGBW, while consuming 23mA from a 1.1V supply. Since the second integrator’s OTA requires less bandwidth, its current is scaled down by a factor of two.
E. Decimation Filter

Fig. 13 illustrates the block diagram of the decoder and decimation filter. The decimation filter is included on the chip to relax the task of capturing the data and designing the test PCB. Moreover, the decoder and decimation filter act as a digital aggressor in close proximity to the ADC. Therefore, the robustness of the ADC’s performance to substrate noise injected by the digital circuitry can be evaluated. The 15-bit thermometer output of the modulator is clocked at 4GHz. Since the digital cells of the standard digital library could only be verified up to 1.2GHz, the data is first demultiplexed by a custom thermometer-to-binary decoder, which generates $4 \times$ time-interleaved 4-bit binary data with a sampling frequency of 1GHz. The two-stage polyphase decimation filter, whose stages are sampled at 1GHz and 500MHz respectively, generates 14-bit decimated outputs at 500MHz, so that the quantization noise spectrum just outside the 125MHz signal bandwidth can also be measured. The decimated outputs are then converted to LVDS signals on the chip and transmitted to LVDS repeaters on the measurement PCB.
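The data rates in the decoder and decimation chain can be tabulated with a few lines of arithmetic. The sketch only uses the rates quoted above; the per-stage decimation factors are not stated in the text and are therefore not modeled.

```python
MODULATOR_RATE_HZ = 4e9     # thermometer output rate
INTERLEAVE = 4              # time-interleaving factor of the decoder
OUTPUT_RATE_HZ = 500e6      # rate of the 14-bit decimated output
SIGNAL_BW_HZ = 125e6        # signal bandwidth

lane_rate_hz = MODULATOR_RATE_HZ / INTERLEAVE            # rate per decoder lane
overall_decimation = MODULATOR_RATE_HZ / OUTPUT_RATE_HZ  # 4 GHz -> 500 MHz
output_nyquist_hz = OUTPUT_RATE_HZ / 2
observable_out_of_band_hz = output_nyquist_hz - SIGNAL_BW_HZ

print(f"Lane rate:             {lane_rate_hz / 1e9:.1f} GHz")
print(f"Overall decimation:    {overall_decimation:.0f}x")
print(f"Output Nyquist:        {output_nyquist_hz / 1e6:.0f} MHz")
print(f"Observable out-of-band {observable_out_of_band_hz / 1e6:.0f} MHz")
```

The lane rate of 1GHz stays below the 1.2GHz limit of the standard-cell library, and the 500MHz output leaves the band from 125MHz to 250MHz visible for inspecting the shaped quantization noise.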

IV. EXPERIMENTAL RESULTS

A. Measurement Setup

The measurement setup used to evaluate the ADC is shown in Fig. 14. A signal source (Rohde & Schwarz SMA100A) drives a programmable 5th order bandpass filter, which attenuates its harmonics and noise. The resulting single-ended signal is converted into a differential signal by a balun and fed to the ADC. The ADC’s clock signal is generated by another signal source (Rohde & Schwarz SMIQ-06B), which outputs a 4GHz sinewave with 6dBm output power. The integrated jitter of the clock signal is 240fsec rms in a 1kHz to 2GHz bandwidth. The clock signal is converted into a differential signal ($CLK$, $\overline{CLK}$) by a 180°-hybrid and then AC-coupled to the ADC. The ADC divides it by 4 and outputs the result ($CLK_{\text{OUT}}$) to enable data capture and synchronization. A pulse generator (Agilent 81134A) is synchronized to $CLK_{\text{OUT}}$ and outputs a conditioned CLK to a high-speed FPGA (Altera Stratix III), which captures the data. LVDS repeaters on the test PCB buffer the decimated 14-bit output of the ADC and isolate it from the digital noise associated with the FPGA. The captured data is then downloaded to a PC for post processing in MATLAB. At GHz sampling speeds, capturing errors can degrade the measurement results; therefore, a double sampling scheme is adopted to capture data. The data is sampled twice by the FPGA, so each pair of consecutively captured samples will have the same value if the measurement setup has the correct timing and synchronization. This sampling scheme provides a first-order confirmation that no capturing errors have occurred.

August 5, 2011
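The double-sampling check lends itself to a simple post-processing routine. The sketch below is a hypothetical illustration of such a check, assuming the capture starts on a pair boundary; the paper does not describe the FPGA or MATLAB code actually used.

```python
def check_double_sampled(captured):
    """Verify that every ADC word appears twice in a row.

    captured : flat list of words as captured (each word sampled twice).
    Returns (ok, samples); samples holds one copy of each word when ok.
    Assumes the capture is aligned to a pair boundary.
    """
    if len(captured) % 2 != 0:
        return False, []
    pairs = list(zip(captured[0::2], captured[1::2]))
    if any(first != second for first, second in pairs):
        return False, []
    return True, [first for first, _ in pairs]

good_capture = [5, 5, 9, 9, 2, 2]
bad_capture = [5, 5, 9, 8, 2, 2]   # one mismatching pair: capture error

print(check_double_sampled(good_capture))
print(check_double_sampled(bad_capture))
```

A passing check does not prove the capture is error-free, since an error that corrupts both copies identically would go unnoticed, which is why the text calls it a first-order confirmation.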

B. Measurement Results

A chip photo of the fabricated ADC in 45nm-LP baseline CMOS is shown in Fig. 15. The ADC has an active area of 0.9mm$^2$. The modulator occupies 0.675mm$^2$, whereas clock buffers and decimation filter occupy 0.225mm$^2$. The ADC dissipates 256mW from a 1.1 V supply and 3.2mW from a 1.8V supply. To reduce interconnect resistances and capacitances, the high speed blocks are placed very close to each other. For example, DAC2 with its multi-bit differentiator is located just after the 4-bit quantizer. DAC1 is positioned very close to the input of the loop filter, so as to minimize the parasitics at the virtual ground of the first integrator. At the system level, the additional delay due to the long interconnect lines between 4-bit quantizer and DAC1 is compensated for by allocating a half clock cycle to the sum of its settling time and the interconnect delay. The clock buffers and digital circuits such as the decoder and the decimation filter are positioned close to the clocked circuits. Moreover, identical supply routing is used for DAC1, DAC2, and the quantizer to ensure that each unit element experiences the same $I \times R$ drop on its supply.

Fig. 16 shows an FFT of the measured-decimated output of the $\Delta \Sigma$ ADC with no input signal. The ADC’s noise floor is flat in the signal BW of 125MHz and rises slightly at higher frequencies due to the presence of out-of-band quantization noise. To measure the ADC’s distortion, sinusoidal input signals with a maximum input voltage of 2.0-$V_{p-p}$ differential were supplied to the ADC. The decimated output for a 41MHz input signal at $-0.5$dBFS has been captured in real-time, and its FFT is shown in Fig. 16. The total harmonic distortion (THD) is $-74$dBFS. As shown in Fig. 17, the ADC achieves 70dB DR in a 125MHz BW. The peak SNR/SNDR are 65.5/65dB at $-0.5$dBFS input respectively. For large signals ($-10$dBFS $\sim -0.5$dBFS), the residual non-linearity of DAC1 causes harmonic components and quantization errors to fold into the signal band, thus increasing the in-band noise.
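The measured SNDR and power can be related to the power-efficiency metric (energy per conversion step) used in the introduction. The sketch below assumes the commonly used Walden figure of merit, P / (2 · BW · 2^ENOB), with ENOB derived from the peak SNDR; this definition is an assumption made here, since the text does not spell out the formula.

```python
def enob_from_sndr(sndr_db):
    """Effective number of bits from SNDR (standard sine-wave relation)."""
    return (sndr_db - 1.76) / 6.02

def walden_fom_joules(power_w, bw_hz, sndr_db):
    """Energy per conversion step, P / (2 * BW * 2^ENOB) (assumed definition)."""
    return power_w / (2.0 * bw_hz * 2.0 ** enob_from_sndr(sndr_db))

power_w = 256e-3 + 3.2e-3   # 1.1 V and 1.8 V supply contributions
enob = enob_from_sndr(65.0)  # peak SNDR of 65 dB
fom = walden_fom_joules(power_w, 125e6, 65.0)

print(f"Total power: {power_w * 1e3:.1f} mW")
print(f"ENOB:        {enob:.2f} bits")
print(f"FoM:         {fom * 1e12:.2f} pJ/conv.")
```

Under these assumptions the result is about 0.7pJ per conversion step, which falls within the <1pJ/conv. range of the ADCs compared in Fig. 1.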

$^5$In Figures 16-18, the noise floor is the average of 4 measurements.
Fig. 18 shows the ADC’s measured intermodulation performance for 93MHz and 95MHz input signals at $-7.2\text{dBFS}$. This choice of input frequency was determined by the bandpass filters available in the measurement setup. The second order intermodulation distortion (IM2) and the third order intermodulation distortion (IM3) are $-73\text{dBc}$ and $-69\text{dBc}$ respectively. The measured linearity of the ADC is limited by the mismatches of DAC1 unit elements.

The jitter performance of a CTΔΣ ADC is commonly analyzed by assuming a clock source with white noise jitter. However, to generate GHz sampling frequencies, an on-chip clock source such as a PLL is required. This will multiply an input reference clock and generate the ADC’s sampling clock ($f_s$). As is typical in a PLL output spectrum, the clock would have spurious tones located at ($f_s \pm f_{\text{offset}}$). In multi-channel applications, these spurious tones can demodulate an adjacent channel or an interferer into the signal band and thus degrade the sensitivity of the receiver. For an input signal located at $f_{\text{in}}$, the amplitude of in-band jitter tones at the ADC’s output can be expressed as [37]:

$$JT_{f_{\text{in}} \pm f_{\text{offset}}} = ST^{\text{dBc}} + 20 \cdot \log_{10}\left(\frac{f_{\text{in}}}{f_s}\right) \; [\text{dBc}]$$

(2)

where $ST^{\text{dBc}}$ is the power of a spurious tone relative to the carrier. Since the implemented ADC does not have a PLL, a spurious tone located at $f_{\text{offset}} = 10\text{MHz}$ with $-32.4\text{dBc}$ power is introduced on the external clock signal\(^\text{6}\), as shown in Fig. 19a. To measure the in-band jitter tones, a 105MHz input signal at $-1\text{dBFS}$ is applied to the ADC input, and the resulting jitter tones are shown in Fig. 19b. The jitter tones are attenuated by $20\cdot\log_{10}(4\text{GHz}/105\text{MHz})=31.6\text{dB}$, and the resulting tones located at $f_{\text{in}} \pm f_{\text{offset}}$ have amplitudes of $-63.8\text{dBc}$ and $-63.9\text{dBc}$ respectively, which agrees with (2).
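The agreement with (2) can be checked numerically. The sketch below treats the $f_{\text{in}}/f_s$ factor in (2) as an amplitude ratio, so that it contributes $20\cdot\log_{10}$ in dB; the function name `jitter_tone_dbc` is an illustrative choice.

```python
import math


def jitter_tone_dbc(spur_dbc, f_in_hz, f_s_hz):
    """In-band jitter tone level in dBc for a clock spur of spur_dbc.

    Assumes the spur is scaled by the amplitude ratio f_in/f_s,
    i.e. by 20*log10(f_in/f_s) in dB.
    """
    return spur_dbc + 20 * math.log10(f_in_hz / f_s_hz)


# Values quoted in the text: 105MHz input, 4GHz clock, -32.4dBc spur.
attenuation_db = 20 * math.log10(105e6 / 4e9)
tone_dbc = jitter_tone_dbc(-32.4, 105e6, 4e9)
print(f"attenuation: {attenuation_db:.1f} dB")
print(f"predicted jitter tone: {tone_dbc:.1f} dBc")
```

The predicted level is within about 0.2dB of the measured $-63.8\text{dBc}$ and $-63.9\text{dBc}$ tones.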

\(^6\)A signal source generates a sinewave that is fed to a pattern generator (Agilent J-BERT N4903B), which divides the input clock signal by 2 and generates a 4GHz clock signal with 6dBm output power.

However, the maximum dynamic range of the ADC is defined by its sensitivity to white noise jitter. To measure this effect, bandwidth-limited white noise jitter is introduced by using a pattern generator. The signal-to-jitter-noise ratio (SJNR) due to the demodulation of the out-of-band quantization noise can be expressed as:

\[
SJNR_{JQ} = -10 \cdot \log_{10}(PND) - 10 \cdot \log_{10}(BW) + 10 \cdot \log_{10}\left(\left(\frac{N - 1}{0.7 + N - 2}\right)^{2}\right) + 6
\]

(3)

where PND is the average phase noise density per Hz, N is the number of quantizer levels, and BW is the signal bandwidth [38]. In (3) it is assumed that all the quantization noise is located at \(0.5 \times f_s\), which yields a conservative (lower) SJNR estimate for a given white noise jitter. Fig. 20a shows the phase noise spectrum of the clock generator around the carrier without additional white noise (clock source). The ADC normally achieves 70dB DR, but when \(-34.5\)dBc (1.05psec rms) white noise is applied to the clock (test clock\(^7\) in Fig. 20a), its DR degrades to 69dB, as shown in Fig. 20b. Using (3), the expected SJNR\(^8\) is 75.2dB, which reduces the DR by 1dB.
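Two of the numbers in this paragraph can be cross-checked with a short calculation. The sketch below assumes that the thermal-noise-limited DR and the jitter-limited SJNR add as uncorrelated noise powers, and that the quoted integrated phase noise is single-sideband, so the rms phase error is $\sqrt{2\cdot 10^{L/10}}$ radians. Both are assumptions made for illustration, and the function names are hypothetical.

```python
import math


def combine_db(*ratios_db):
    """Combine uncorrelated noise contributions given as ratios in dB."""
    total_noise = sum(10 ** (-r / 10) for r in ratios_db)
    return -10 * math.log10(total_noise)


def rms_jitter_s(integrated_pn_dbc, f_clk_hz):
    """RMS jitter in seconds from integrated single-sideband phase noise."""
    # Factor 2 accounts for both sidebands (assumption stated above).
    phase_rms_rad = math.sqrt(2 * 10 ** (integrated_pn_dbc / 10))
    return phase_rms_rad / (2 * math.pi * f_clk_hz)


# 70dB DR without added jitter, 75.2dB expected SJNR from (3).
dr_with_jitter_db = combine_db(70.0, 75.2)
# -34.5dBc integrated phase noise on the 4GHz clock.
jitter_ps = rms_jitter_s(-34.5, 4e9) * 1e12
print(f"DR with jitter: {dr_with_jitter_db:.1f} dB")
print(f"rms jitter: {jitter_ps:.2f} ps")
```

With these assumptions the combined DR comes out close to 69dB and the rms jitter close to 1.05psec, consistent with the values quoted above.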

However, in the presence of a large input signal, the white noise jitter in Fig. 20a appears around the input signal and degrades the SJNR significantly, as shown in Fig. 20b. The SNR degrades from 65dB to 61dB, as expected from (2). Therefore, in the presence of a large input signal, the spectral shape of the jitter noise limits the achievable SNR and DR of high-speed, wideband CT\(\Delta\Sigma\) ADCs.

Spurious tones are present at 25MHz, 80MHz, and 130MHz in both Fig. 19b and Fig. 20b. However, the clock spectrum in Fig. 20a does not have any spurious tones above 2MHz, so these high-frequency spurious tones are not due to clock spurs. Since the decimation filter is effectively running at 500MHz and does not have enough suppression, aliasing in the decimation filter might cause these tones. For example, the higher-order distortion tones of the modulator \((4^{th}, 5^{th}, 6^{th}, \ldots)\) can mix down with the clock of the decimation filter.
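This aliasing explanation can be illustrated with a simple folding model. The sketch below assumes the 105MHz input used for Fig. 19b and Fig. 20b and treats the decimation filter as an ideal sampler at 500MHz; this is a simplification for illustration, not an analysis of the actual filter, and `alias_frequency` is a hypothetical helper.

```python
def alias_frequency(f_mhz, f_clk_mhz):
    """Fold a tone at f_mhz into the first Nyquist zone of f_clk_mhz."""
    f = f_mhz % f_clk_mhz
    return min(f, f_clk_mhz - f)


F_IN_MHZ = 105   # input frequency used for Fig. 19b and Fig. 20b
F_DEC_MHZ = 500  # effective clock rate of the decimation filter

# Fold the 4th, 5th and 6th harmonics of the input tone.
aliases = {k: alias_frequency(k * F_IN_MHZ, F_DEC_MHZ) for k in (4, 5, 6)}
for k, f in aliases.items():
    print(f"harmonic {k}: {k * F_IN_MHZ} MHz -> {f} MHz")
```

Under this model the three harmonics fold to 80MHz, 25MHz, and 130MHz, which coincides with the observed spur locations.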

\(^7\)While generating white noise jitter, the test clock generates spurious tones located up to 2MHz offset from the carrier.

\(^8\)The measured integrated phase noise is \(-34.5\text{dBc}\) in 100MHz BW from the carrier frequency \((PND= -114\text{dBc/Hz})\). For the frequencies between 100MHz and 500MHz offset from the carrier, PND stays at \(-114\text{dBc/Hz}\) and for frequencies higher than 500MHz PND rolls off to \(-138\text{dBc/Hz}\). However, since the quantization noise is low enough for frequencies between 100MHz and 500MHz, the convolution of white noise jitter and quantization noise can be neglected. Therefore, the phase noise density can be assumed to be at \(-138\text{dBc/Hz}\). The total integrated phase noise (in the band of 1kHz-2GHz) is \(-34.2\text{dBc}\). The \(PND_{\text{dBc/Hz}}\) is \(-34.2\text{dBc}-10\log_{10}(0.5 \times f_s) = -127.2\text{dBc/Hz}\) and by using (3) the expected SJNR is 75.2dB.

Sixteen samples were measured and showed similar performance. Table I summarizes the performance of a typical ADC sample. Compared to the CMOS \(\Delta\Sigma\) ADCs, the proposed ADC achieves 5\(\times\) larger BW with similar dynamic range. When compared to non-CMOS ΔΣ ADCs, it achieves 125MHz BW with 10dB more DR, at a lower supply voltage and a lower sampling frequency (\(f_s=4\)GHz). Furthermore, its figure of merit (FOM) is more than 10\(\times\) better, where the FOM is defined as:

\[
FOM = \frac{\text{Power}}{2 \times \text{BW} \times 2^{\frac{\text{DR}-1.76}{6.02}}}
\]  

(4)

In the FOM calculation, the power consumption of the modulator, clock buffers, decoder and decimation filter is included. The proposed ADC owes its good power efficiency to its loop-filter architecture, which obviates the need for a power-hungry active summation node, and to the low power consumption of digital circuitry in nanometer-CMOS. Considering that the switching speed of a transistor increases by 1.6\(\times\) from 90nm CMOS to 45nm-LP CMOS, the rest of the improvement in signal bandwidth is achieved thanks to the use of a high-speed capacitive-feedforward loop filter architecture, and a low-latency 4-bit quantizer and DAC. Compared to the Nyquist ADC, the proposed ADC achieves similar BW but 1 bit less dynamic range. Since the DR of the proposed ADC is thermal noise limited, it can be improved by reducing its effective input-referred noise resistance. This will be at the expense of increased power consumption in the first integrator, which, however, contributes only 10% of the ADC’s total power dissipation. The proposed ADC has a better FOM than the Nyquist ADC, which implies that ΔΣ ADCs can be a power-efficient alternative for applications that require high dynamic range and wide bandwidths. Lastly, the active area of the proposed ADC is less than 1mm\(^2\), which is essential for low-cost integration.
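As a worked check of (4), the sketch below evaluates the FOM for the proposed ADC using the reported total power of 260mW and BW of 125MHz, once with the 70dB DR and once with the 65dB peak SNDR in place of DR. The function name `fom_pj_per_conv` is an illustrative choice.

```python
def fom_pj_per_conv(power_w, bw_hz, metric_db):
    """FOM per (4) in pJ/conv.; metric_db is DR (or SNDR) in dB."""
    effective_bits = (metric_db - 1.76) / 6.02
    return power_w / (2 * bw_hz * 2 ** effective_bits) * 1e12


fom_dr = fom_pj_per_conv(0.260, 125e6, 70.0)    # DR-based FOM
fom_sndr = fom_pj_per_conv(0.260, 125e6, 65.0)  # SNDR-based FOM
print(f"FOM (DR):   {fom_dr:.2f} pJ/conv.")
print(f"FOM (SNDR): {fom_sndr:.2f} pJ/conv.")
```

The DR-based result is about 0.40pJ/conv. and the SNDR-based result is about 0.71pJ/conv., in line with the two FOM entries listed for this work in Table I.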

V. CONCLUSIONS

This work demonstrates the implementation of a multi-bit GHz CTΔΣ ADC that achieves 70dB dynamic range in 125MHz signal bandwidth. This is a 5\(\times\) improvement in BW compared to state-of-the-art CMOS ΔΣ ADCs. Without any calibration, the ADC achieves \(-74\)dBFS THD in 125MHz bandwidth with a FOM of 0.4pJ/conv. while drawing only 256mW from a 1.1 V supply and 3.2mW from a 1.8 V supply. This performance is achieved thanks to the use of a high-speed capacitive-feedforward loop filter architecture, and a low-latency 4-bit quantizer and DAC. As the scaling of nm-CMOS continues, the bandwidth of a CTΔΣ ADC is also expected to scale with improvements in transistor switching speed. For example, 20% more signal BW can be achieved in 28nm-LP CMOS. Furthermore, the ADC's resistive input makes it easier to drive than Nyquist ADCs with switched-capacitor inputs. The result is an ADC design whose performance enlarges the application domain of \( \Delta \Sigma \) ADCs by an order of magnitude.

VI. ACKNOWLEDGEMENTS

The authors would like to thank Harish Kundur and Jingjing Hu for their careful layout.

REFERENCES


Fig. 1. Dynamic range vs. bandwidth of state of the art ADCs with power efficiency better than 1pJ/Conv. [39]. Even though they do not meet this criterion, ΔΣ ADCs with bandwidth of 100MHz and 125MHz [22], [40] are included to illustrate the state of the art.

Fig. 2. A basic single-loop ΔΣ modulator.
Fig. 3. System level trade-off in a single loop ΔΣ modulator for 80dB SQNR in 125MHz BW.
Fig. 4. A single loop $\Delta \Sigma$ modulator with excess loop delay (a), excess loop delay compensation (b), and a non-ideal summing amplifier which is modeled as $R_{OUT}$, and $C_Q$ is the input capacitance of the 4-bit flash quantizer (c).
Fig. 5. A single loop $\Delta\Sigma$ modulator with a wideband summation node for differentiated signals in current domain.

Fig. 6. The proposed high-speed capacitive feedforward $CT\Delta\Sigma$ modulator.
Fig. 7. The top level architecture of the 3rd order CTΔΣ ADC.

Fig. 8. The simplified schematic of a unit element of 4-bit quantizer and DAC driver.
Fig. 9. The input stage of the D-FF.

Fig. 10. The timing diagram of the CTΔΣ modulator.
Fig. 11. The schematic of a unit element of 4-bit main DAC.

Fig. 12. The schematic of operational transconductance amplifier.
Fig. 13. The block diagram of the implemented decimation filter.

Fig. 14. Measurement setup of the CTΔΣ ADC.
Fig. 15. Chip Micrograph.

Fig. 16. An FFT of measured decimated output for an input signal of -0.5dBFS at 41MHz.
Fig. 17. Measured SNR and SNDR vs. input power with a 41MHz input.

Fig. 18. An FFT of measured decimated output for a two-tone input signal of $-7.3\text{dBFS}$ at 93MHz and 95MHz.
Fig. 19. The measured phase noise of the clock source for a clock tone introduced at $f_c + 10\text{MHz}$ with $-32\text{dBc}$ power (a), and the measured output spectrum of the CT\(\Delta\Sigma\) ADC for an input signal of 105MHz at $-0.5\text{dBFS}$ (b).
Fig. 20. The measured phase noise of the clock source and the test clock with wideband white noise (a), and the measured output spectrum of the CT\(\Delta\Sigma\) ADC for an input signal of 105MHz at $-0.5\text{dBFS}$ (b).
TABLE I 
PERFORMANCE TABLE AND COMPARISON TO PRIOR WORK.

<table>
<thead>
<tr>
<th></th>
<th></th>
<th></th>
<th></th>
<th></th>
<th></th>
<th></th>
<th></th>
<th></th>
<th></th>
</tr>
</thead>
<tbody>
<tr>
<td>Architecture</td>
<td>ΔΣ</td>
<td>ΔΣ</td>
<td>ΔΣ</td>
<td>ΔΣ</td>
<td>ΔΣ</td>
<td>ΔΣ</td>
<td>ΔΣ</td>
<td>ΔΣ</td>
<td>Nyquist</td>
</tr>
<tr>
<td>F_s (GHz)</td>
<td>4</td>
<td>0.5</td>
<td>0.9</td>
<td>0.42</td>
<td>0.34</td>
<td>0.64</td>
<td>0.25</td>
<td>8</td>
<td>0.25</td>
</tr>
<tr>
<td>BW (MHz)</td>
<td>125</td>
<td>25</td>
<td>20</td>
<td>20</td>
<td>20</td>
<td>20</td>
<td>25</td>
<td>125</td>
<td>125</td>
</tr>
<tr>
<td>DR (dB)</td>
<td>70</td>
<td>70</td>
<td>81.2</td>
<td>72</td>
<td>77</td>
<td>76</td>
<td>59</td>
<td>52</td>
<td>77.5</td>
</tr>
<tr>
<td>SNR (dB)</td>
<td>65.5</td>
<td>64</td>
<td>81.2</td>
<td>72</td>
<td>71</td>
<td>76</td>
<td>59</td>
<td>52</td>
<td>77.5</td>
</tr>
<tr>
<td>SNDR (dB)</td>
<td>65</td>
<td>63.5</td>
<td>78.1</td>
<td>70</td>
<td>69</td>
<td>74</td>
<td>53</td>
<td>-</td>
<td>77.5</td>
</tr>
<tr>
<td>Power (mW)</td>
<td>260</td>
<td>8</td>
<td>87</td>
<td>28</td>
<td>58</td>
<td>38</td>
<td>650</td>
<td>1800</td>
<td>1000</td>
</tr>
<tr>
<td>VDD (V)</td>
<td>1.1/1.8</td>
<td>1.2</td>
<td>1.5</td>
<td>1.2</td>
<td>1.2</td>
<td>2.5</td>
<td>2.5</td>
<td>1.6/3.3</td>
<td>1.8/3.0</td>
</tr>
<tr>
<td>Area (mm²)</td>
<td>0.9</td>
<td>0.15</td>
<td>0.45</td>
<td>1</td>
<td>0.5</td>
<td>1.2</td>
<td>4</td>
<td>1.45</td>
<td>50</td>
</tr>
<tr>
<td>Technology</td>
<td>45nm</td>
<td>90nm</td>
<td>130nm</td>
<td>90nm</td>
<td>90nm</td>
<td>130nm</td>
<td>130nm</td>
<td>InP</td>
<td>BiCMOS</td>
</tr>
<tr>
<td>FOM<sup>b</sup> (pJ/conv.)</td>
<td>0.71<sup>a</sup></td>
<td>0.13</td>
<td>0.33</td>
<td>0.27</td>
<td>0.61</td>
<td>0.23</td>
<td>8.9</td>
<td>-</td>
<td>0.55</td>
</tr>
<tr>
<td>FOM<sup>c</sup> (pJ/conv.)</td>
<td>0.40<sup>a</sup></td>
<td>0.06</td>
<td>0.23</td>
<td>0.21</td>
<td>0.24</td>
<td>0.18</td>
<td>4.4</td>
<td>22</td>
<td>0.55</td>
</tr>
</tbody>
</table>

a) The power consumption of the decimation filter is included in the FOM.
b) FOM = Power/(2 × BW × 2^{(SNDR−1.76)/6.02})
c) FOM = Power/(2 × BW × 2^{(DR−1.76)/6.02})