

Delft University of Technology

#### Wideband circuits for optical communications

Vera Villarroel, Leonardo

DOI 10.4233/uuid:76b8664b-4e53-4549-8397-a5a9974f5abe

Publication date 2016

Document Version Final published version

Citation (APA) Vera Villarroel, L. (2016). Wideband circuits for optical communications. [Dissertation (TU Delft), Delft University of Technology]. https://doi.org/10.4233/uuid:76b8664b-4e53-4549-8397-a5a9974f5abe

#### Important note

To cite this publication, please use the final published version (if applicable). Please check the document version above.

Copyright Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

#### Takedown policy

Please contact us and provide details if you believe this document breaches copyrights. We will remove access to the work immediately and investigate your claim.

This work is downloaded from Delft University of Technology. For technical reasons the number of authors shown on this cover page is limited to a maximum of 10.

### Wideband Circuits for Optical Communications

Dissertation

for the purpose of obtaining the degree of doctor at Delft University of Technology, by the authority of the Rector Magnificus, prof. ir. K. C. A. M. Luyben, chair of the Board for Doctorates, to be defended publicly

on 8 September 2016 at 12:30

by

#### ARIEL LEONARDO VERA VILLARROEL

electrical engineer, born in Cochabamba, Bolivia.

#### This dissertation has been approved by the promotor:

#### Prof. dr. J. R. Long

Composition of the doctoral committee:

| Rector Magnificus              | Technische Universiteit Delft, chair      |
|--------------------------------|-------------------------------------------|
| Prof. dr. J. R. Long           | Technische Universiteit Delft, promotor   |
| Prof. dr. ir. W. A. Serdijn    | Technische Universiteit Delft (EWI)       |
| Prof. dr. ir. L.C.N. de Vreede | Technische Universiteit Delft (UD-EWI)    |
| Prof. dr. ir. B. Nauta         | Universiteit Twente                       |
| Prof. dr. ing. C. Scheytt      | Universität Paderborn                     |
| dr. ir. H. Veenstra            | Philips                                   |
| dr. B. J. Gross                | IBM                                       |

Reserve member: Prof. dr. ing. A. Neto

Technische Universiteit Delft

| Keywords:   | Broadband feedback Darlington amplifier, frequency multiplier, frequency divider, optical modulator driver, distributed amplifier, energy detector |
|-------------|----------------------------------------------------------------|
| Printed by: | Ipskamp drukkers, The Netherlands                              |

Copyright © 2016 by ARIEL LEONARDO VERA VILLARROEL

ISBN 978-94-6186-700-1

An electronic version of this dissertation is available at http://repository.tudelft.nl/

To my family

| Section | Title                                             |
|---------|---------------------------------------------------|
|         | SUMMARY                                           |
|         | SAMENVATTING                                      |
| 1       | INTRODUCTION                                      |
| 1.1     | Objectives of this thesis                         |
| 1.2     | Optical fiber communication                       |
| 1.2.1   | Optical modulation                                |
| 1.2.2   | Mach-Zehnder modulator (MZM)                      |
| 1.2.3   | Towards an integrated digital optical transmitter |
| 1.3     | IC technology for high-speed/wideband communication |
| 1.4     | Wideband circuits and technology benchmarking     |
| 1.4.1   | Wideband amplifiers                               |
| 1.4.2   | Frequency multiplier                              |
| 1.4.3   | Frequency divider                                 |
| 1.5     | Mach-Zehnder modulator driver                     |
| 1.5.1   | Conventional distributed amplifier limitations    |
| 1.5.2   | Digitally-controlled distributed amplifier        |
| 1.6     | Organization of this thesis                       |
|         | References                                        |

#### Part I: BENCHMARK CIRCUITS

| Section | Title                                                 |
|---------|-------------------------------------------------------|
| 2       | WIDEBAND AMPLIFIER                                    |
| 2.1     | Broadband amplifiers                                  |
| 2.2     | Darlington pair                                       |
| 2.3     | Darlington feedback amplifier                         |
| 2.3.1   | Low frequency gain, input and output resistance       |
| 2.3.2   | Transistors sizes                                     |
| 2.4     | Bandwidth enhancement                                 |
| 2.4.1   | Inductive peaking                                     |
| 2.4.2   | Cascoding                                             |
| 2.4.3   | Amplifier noise figure                                |
| 2.4.4   | Amplifier linearity                                   |
| 2.5     | Amplifier measurement and characterization            |
| 2.6     | Summary                                               |
|         | References                                            |
| 3       | FREQUENCY MULTIPLICATION                              |
| 3.1     | Passive and active frequency multipliers              |
| 3.1.1   | Active frequency multiplier topologies                |
| 3.2     | Low-voltage multiplier topology                       |
| 3.2.1   | Core input optimization                               |
| 3.2.2   | Output load                                           |
| 3.2.3   | Broadband and narrowband comparison                   |
| 3.3     | Wideband doubler prototype                            |
| 3.4     | Frequency quadrupler                                  |
| 3.5     | Narrowband quadrupler prototype                       |
| 3.6     | Summary                                               |
|         | References                                            |
| 4       | FREQUENCY DIVISION                                    |
| 4.1     | Static frequency divider                              |
| 4.2     | Dynamic frequency divider                             |
| 4.3     | Dual mode dynastat (dynamic-static) frequency divider |
| 4.4     | Dynastat prototype                                    |
| 4.5     | Low voltage dynastat divider                          |
| 4.6     | Summary                                               |
|         | References                                            |

#### Part II: DIGITALLY-CONTROLLED DISTRIBUTED AMPLIFIER

| Section | Title                                                  |
|---------|--------------------------------------------------------|
| 5       | BUILT-IN SELF-TEST CIRCUIT                             |
| 5.1     | PRBS maximum operation frequency analysis              |
| 5.2     | Clock distribution                                     |
| 5.3     | Shift register design                                  |
| 5.4     | XOR gate design                                        |
| 5.5     | Output MUX                                             |
| 5.6     | PRBS characterization                                  |
| 5.7     | Summary                                                |
|         | References                                             |
| 6       | DIGITALLY-CONTROLLED DISTRIBUTED AMPLIFIER             |
| 6.1     | Digitally-controlled distributed amplifier             |
| 6.1.1   | Output transmission line and back-termination resistor |
| 6.1.2   | Latch and limiting amplifier                           |
| 6.1.3   | Phase inverters, vector summer and clock buffer        |
| 6.1.4   | Injection-locked oscillator                            |
| 6.1.5   | Dynastat frequency divider                             |
| 6.2     | Built-in calibration                                   |
| 6.3     | 40-Gb/s digitally-controlled DA prototype              |
| 6.4     | Summary                                                |
|         | References                                             |
| 7       | CONCLUSIONS AND RECOMMENDATIONS                        |
| 7.1     | Major contributions                                    |
| 7.2     | Recommendations for future work                        |
|         | References                                             |
|         | APPENDIX A                                             |
|         | APPENDIX B                                             |
|         | LIST OF PUBLICATIONS                                   |
|         | FABRICATED ICs                                         |
|         | ACKNOWLEDGEMENTS                                       |


| Figure    | Caption |
|-----------|---------|
| Fig. 1-1  | Total global [1] and mobile data traffic [6]. |
| Fig. 1-2  | Mach-Zehnder modulator. |
| Fig. 1-3  | Simplified optical link transmitter block diagram. |
| Fig. 1-4  | Distributed amplifier circuit. |
| Fig. 1-5  | Distributed amplifier with digitally retimed input data. |
| Fig. 2-1  | Staggered amplifier [4]. |
| Fig. 2-2  | 2-stage Darlington amplifier [5]. |
| Fig. 2-3  | Cherry-Hooper amplifier [8]. |
| Fig. 2-4  | Darlington pair. |
| Fig. 2-5  | Shunt feedback, broadband reference amplifier. |
| Fig. 2-6  | Shunt feedback amplifier low frequency model. |
| Fig. 2-7  | Amplifier low frequency signal flow diagram. |
| Fig. 2-8  | Simplified small-signal model of the Darlington pair. |
| Fig. 2-9  | Small-signal model of a single-transistor feedback amplifier, including the base resistance. |
| Fig. 2-10 | S11 of a single transistor feedback amplifier. |
| Fig. 2-11 | Simplified small-signal circuit used for the frequency response analysis. |
| Fig. 2-12 | Options to implement inductive peaking. |
| Fig. 2-13 | Inductive series-peaked broadband amplifier. |
| Fig. 2-14 | Custom made inductor for the inductive series-peaking of a broadband amplifier. |
| Fig. 2-15 | Cascode broadband amplifier. |
| Fig. 2-16 | Darlington amplifier with a delay introduced in the collector current of Q2. |
| Fig. 2-17 | Simulated effect of a delay in the Darlington pair amplifier. |
| Fig. 2-18 | Noise figure comparison between a common-emitter and a cascode topology. |
| Fig. 2-19 | Schematic of a cascode amplifier, and Cjc vs. base-collector junction reverse bias voltage. |
| Fig. 2-20 | Third-order intermodulation distortion for different VBC in the cascode amplifier, simulation results. |
| Fig. 2-21 | Chip photomicrograph of the cascode amplifier. |
| Fig. 2-22 | Measured (solid line) vs. simulated (dashed) S21 for the three amplifiers. |
| Fig. 2-23 | Measured (solid) vs. simulated (dashed) S12 for the three amplifiers. |
| Fig. 2-24 | Measured (solid) and simulated (dashed) input and output reflection coefficients for the cascode amplifier. |
| Fig. 2-25 | Stability factor k and Δ extracted from measurements of the series-peaked and cascode amplifiers. |
| Fig. 2-26 | Group delay extracted from measured S-parameters for the three amplifiers. |

| Figure    | Caption |
|-----------|---------|
| Fig. 2-27 | Measured (solid) and simulated (dashed) noise figure (NF) for the amplifiers. |
| Fig. 2-28 | Measured (solid) vs. simulated (dashed) linearity for the cascode amplifier. |
| Fig. 2-29 | Forward transmission coefficient across four SiGe-BiCMOS generations. |
| Fig. 3-1  | Gilbert multiplier. |
| Fig. 3-2  | Frequency doubler core schematic. |
| Fig. 3-3  | Frequency doubler normalized transfer function. |
| Fig. 3-4  | Doubler transconductance derived from Eq. 4 for 3 different VK values. |
| Fig. 3-5  | Large-signal transient simulation results for ΔIout. |
| Fig. 3-6  | Large-signal spectral components vs. ΔVin. |
| Fig. 3-7  | Small-signal frequency response of ΔIout. |
| Fig. 3-8  | Modified circuit to improve its input bandwidth. |
| Fig. 3-9  | Bandwidth improvement of doubler core with series inductors. |
| Fig. 3-10 | Narrowband input interface to the multiplier core. |
| Fig. 3-11 | Narrowband multiplier load. |
| Fig. 3-12 | Wideband output load for the frequency multiplier. |
| Fig. 3-13 | Multiplier with feedback bias-circuit. |
| Fig. 3-14 | Op-amp schematic. |
| Fig. 3-15 | Feedback and active inductor circuits for DC offset suppression. |
| Fig. 3-16 | Block diagram for the active frequency multiplier. |
| Fig. 3-17 | Active input balun. |
| Fig. 3-18 | Schematic of the broadband output buffer. |
| Fig. 3-19 | Simulated CG vs. frequency for NB and WB doubler circuit examples. |
| Fig. 3-20 | Photomicrograph of the doubler testchip. |
| Fig. 3-21 | Doubler output spectrum measured in: a) V-band (50-75 GHz), and b) W-band (75-100 GHz). |
| Fig. 3-22 | Measured and simulated (dashed) doubler output vs. input power for a 40 GHz input signal. |
| Fig. 3-23 | Measured and simulated (dashed) doubler output power vs. frequency. |
| Fig. 3-24 | Phase noise for the doubler at 40 GHz output. |
| Fig. 3-25 | Quadrupler input/output waveforms. |
| Fig. 3-26 | Quadrupler ΔIout frequency components. |
| Fig. 3-27 | Frequency quadrupler prototype. |
| Fig. 3-28 | Modified Cherry-Hooper input buffer used in the quadrupler testchip. |
| Fig. 3-29 | Voltage controlled active inductor. |
| Fig. 3-30 | Quadrupler testchip photomicrographs. |
| Fig. 3-31 | Frequency quadrupler output power vs. frequency. |

| Figure    | Caption |
|-----------|---------|
| Fig. 3-32 | Frequency quadrupler output power for 3 control voltages. |
| Fig. 3-33 | Quadrupler phase noise at 90 GHz output signal. |
| Fig. 3-34 | Quadrupler and source phase noise difference. |
| Fig. 4-1  | Frequency divider based on master-slave D-FlipFlop. |
| Fig. 4-2  | Regenerative frequency division [5]. |
| Fig. 4-3  | Dynastat divide-by-two schematic. |
| Fig. 4-4  | Dynastat divide-by-two configured in the dynamic mode. |
| Fig. 4-5  | Block diagram of the frequency divide-by-8 prototype. |
| Fig. 4-6  | Dynastat prototype testchip micrograph. |
| Fig. 4-7  | Measured and simulated input sensitivity vs. frequency. |
| Fig. 4-8  | Measured phase noise for a 60 GHz input. |
| Fig. 4-9  | Phase noise difference between clock source and div-by-8. |
| Fig. 4-10 | Low voltage dynastat frequency divider schematic. |
| Fig. 4-11 | Shielded differential inductor layout. |
| Fig. 4-12 | Simulated low-voltage dynastat input sensitivity for a square waveshape (static) and sinusoidal input (static and dynamic). |
| Fig. 4-13 | Self-oscillation frequency vs. control voltage Vmode. |
| Fig. 5-1  | Register based on master-slave D flip-flop. |
| Fig. 5-2  | Registers in a closed loop and its time diagram. |
| Fig. 5-3  | Clock distribution options for registers in closed loop. |
| Fig. 5-4  | $2^{11}$-1 PRBS generator with trigger and monitor outputs running from a half-rate clock. |
| Fig. 5-5  | Half-rate-clock PRBS generator with trigger output. |
| Fig. 5-6  | Physical layout of the clock distribution between registers. |
| Fig. 5-7  | D-type flip-flop register schematic. |
| Fig. 5-8  | XOR gate with reset schematic. |
| Fig. 5-9  | Shielded differential inductor layout. |
| Fig. 5-10 | 2:1 multiplexer schematic. |
| Fig. 5-11 | Photomicrograph of integrated BiST block. |
| Fig. 5-12 | Measured PRBS half-rate output sequence vs. time. |
| Fig. 5-13 | Measured PRBS half-rate eye diagram. |
| Fig. 5-14 | Half-rate PRBS measured output spectrum (40 GHz clock). |
| Fig. 5-15 | PRBS half-rate measured discrete tones (40 GHz clock). |
| Fig. 6-1  | 40 Gb/s MZ modulator driver block diagram. |
| Fig. 6-2  | Timing phase control for individual clocks, n=1,2,3. |
| Fig. 6-3  | Cross-section of top metal (AM) output line and M2 substrate shield. |
| Fig. 6-4  | Latch, pre-driver and limiting amplifier schematic. |
| Fig. 6-5  | I/O clock phase inverter, vector summer, and clock buffer. |
| Fig. 6-6  | Simulated vector summing output within 1 quadrant for 10 code settings. |
| Fig. 6-7  | Clock buffer schematic and custom inductor layout. |
| Fig. 6-8  | 2-stage injection-locked oscillator. |

| Figure    | Caption |
|-----------|---------|
| Fig. 6-9  | Simulated frequency response for the 2-stage injection-locked oscillator. |
| Fig. 6-10 | DA output waveforms for different interstage delays. |
| Fig. 6-11 | Schematic of the proposed calibration circuit. |
| Fig. 6-12 | Op-amp schematic circuit. |
| Fig. 6-13 | Peak detector proposed by Meyer [7] compared to energy detector developed in this work. |
| Fig. 6-14 | Simulated frequency response for the 2-stage injection-locked oscillator. |
| Fig. 6-15 | Output voltage vs. clock delay time for the calibration circuit. |
| Fig. 6-16 | Calibration sequence for DA cells during input line phase adjustment. |
| Fig. 6-17 | 40 Gb/s MZ modulator driver prototype photomicrograph. |
| Fig. 6-18 | Modulator driver prototype power consumption. |
| Fig. 6-19 | DA output return loss. |
| Fig. 6-20 | Calibration sequence for DA cells during input line phase adjustment. |
| Fig. 6-21 | Measured output waveforms for two input phase settings. |
| Fig. 6-22 | Time domain driver output at 40 GHz eye diagram. |
| Fig. 6-23 | Time domain driver output with on-chip $2^{11}$-1 PRBS. |
| Fig. 6-24 | Output spectrum 40 GHz $2^{11}$-1 PRBS. |
| Fig. 6-25 | Discrete tones 40 GHz $2^{11}$-1 PRBS. |
| Fig. A-1  | Shunt feedback amplifier low frequency model. |
| Fig. A-2  | Shunt feedback amplifier low frequency signal flow diagram. |
| Fig. B-1  | Small-signal circuit for frequency response analysis. |
| Fig. B-2  | Simplified small-signal circuit for frequency response analysis. |

| Table     | Caption |
|-----------|---------|
| Table 2-1 | Measured amplifier linearity |
| Table 2-2 | Broadband amplifiers performance comparison |
| Table 3-1 | Simulated wideband and narrowband doubler performance comparison |
| Table 3-2 | Frequency doubler performance comparison |
| Table 3-3 | Frequency quadrupler performance comparison |
| Table 4-1 | Divider performance comparison |
| Table 5-1 | PRBS performance comparison |
| Table 6-1 | Modulator driver performance comparison |

## SUMMARY

#### Wideband Circuits for Optical Communications

Demand for data bandwidth drives the growth and adaptation of the technologies used in communications. Fiber-based communications systems require electronic circuits with increased bandwidth and improved energy efficiency. The electronic modulator driver, a key component in an optical link, is implemented as a distributed amplifier (DA). This thesis presents innovations in the architecture of a conventional DA developed to overcome performance limitations imposed by the DA input transmission line (TL). At the same time, increased functionality is incorporated into the driver by integrating an energy detector circuit, which is used for calibration, and a built-in self-test (BiST) circuit, which is used for characterization.

The need for greater speed requires the continued downscaling of transistors. However, reducing the area of active devices alters their electrical characteristics and complicates their use, for example by restricting them to a reduced bias voltage. In this work, different circuit blocks in the digitally-controlled modulator driver are optimized to operate with a lower supply voltage ($\leq 2.5$ V) than that found in conventional topologies. The new circuit topologies incorporate bandwidth extension techniques to maximize their operating frequency range.

This thesis investigates various wideband circuits in two parts. In the first part, the performance of an advanced SiGe-BiCMOS technology is benchmarked by implementing a Darlington feedback amplifier, frequency multipliers, and frequency dividers. In the second part, the concepts developed for the benchmarking circuits are applied to the design of a digitally-controlled modulator driver for optical communications.

The well-known Darlington feedback amplifier has been implemented in different technologies and is considered a generic broadband amplifier that can be used in many systems. Chapter 2 discusses the design of a Darlington feedback amplifier and the steps taken to extend its bandwidth. Three prototypes are tested to verify the equations, derived to support the design, that describe its low-frequency operation. The time constant analysis, which identifies the parasitics that limit the maximum frequency of the amplifier, makes it possible to optimize the topology and achieve 25 % and 53 % higher bandwidth using inductive peaking and cascoding, respectively, in two additional prototypes. The same chapter evaluates the amplifier's noise figure and linearity.
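As a rough illustration of how a time constant analysis points to the bandwidth-limiting parasitic, the sketch below uses the open-circuit time constant estimate $f_{3dB} \approx 1/(2\pi \sum_i \tau_i)$. Treating this as the analysis used in Chapter 2 is an assumption, and the node names, resistances, and capacitances are hypothetical values chosen only to make the example run.

```python
import math

def octc_bandwidth_hz(time_constants_s):
    """Open-circuit time constant estimate: f_3dB ~ 1 / (2*pi*sum(tau_i))."""
    return 1.0 / (2.0 * math.pi * sum(time_constants_s))

# Hypothetical (resistance in ohms, capacitance in farads) seen at each node.
# These are illustrative numbers, not values from the thesis.
nodes = {
    "input": (50.0, 40e-15),
    "interstage": (120.0, 25e-15),
    "output": (50.0, 30e-15),
}

# Time constant of each node, tau = R * C.
taus = {name: r * c for name, (r, c) in nodes.items()}

# The largest time constant marks the parasitic that limits bandwidth most.
dominant = max(taus, key=taus.get)
baseline_hz = octc_bandwidth_hz(taus.values())

# Halving the dominant time constant shows how much bandwidth is at stake there.
improved = dict(taus)
improved[dominant] = taus[dominant] / 2.0
improved_hz = octc_bandwidth_hz(improved.values())

print(f"dominant node: {dominant}")
print(f"baseline estimate: {baseline_hz / 1e9:.1f} GHz")
print(f"after halving dominant tau: {improved_hz / 1e9:.1f} GHz")
```

The estimate is conservative and ignores zeros and inductive effects, so it serves to rank parasitics rather than to predict the measured bandwidth.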

Chapters 3 and 4 discuss frequency multiplication and division, respectively. In Chapter 3, unbalanced cross-coupled differential pairs are used as the core of a frequency multiplier. This topology can be biased using a lower supply voltage than that required for traditional cascoded topologies. The circuit, which is shown to have an even-order transfer function, is used for even-order harmonic generation. The study of the core includes its implementation for broadband and narrowband operation. The core is then used to implement a frequency doubler and a frequency quadrupler prototype. The conversion gain of the broadband frequency doubler is positive within the DC-100 GHz range, while the narrowband frequency quadrupler is designed for a center frequency of 89 GHz and has a 3-dB bandwidth of 16 GHz.
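The link between an even-order transfer function and even-order harmonic generation can be checked numerically. The sketch below drives a stand-in even function, $v^2 + 0.5\,v^4$, with a unit-amplitude cosine and extracts the harmonic amplitudes with a direct Fourier sum. The polynomial is a hypothetical placeholder, not the transfer function derived in Chapter 3, and the function names are illustrative.

```python
import math

def harmonic_amplitudes(transfer, amplitude, n_harmonics=4, n_samples=256):
    """Amplitude of harmonics 0..n_harmonics of transfer(A*cos(theta))."""
    samples = [
        transfer(amplitude * math.cos(2.0 * math.pi * k / n_samples))
        for k in range(n_samples)
    ]
    amplitudes = []
    for h in range(n_harmonics + 1):
        re = sum(s * math.cos(2.0 * math.pi * h * k / n_samples)
                 for k, s in enumerate(samples))
        im = sum(s * math.sin(2.0 * math.pi * h * k / n_samples)
                 for k, s in enumerate(samples))
        scale = 1.0 if h == 0 else 2.0  # single-sided amplitude for h > 0
        amplitudes.append(scale * math.hypot(re, im) / n_samples)
    return amplitudes

def even_transfer(v):
    """Hypothetical even-order characteristic: f(-v) == f(v)."""
    return v**2 + 0.5 * v**4

amps = harmonic_amplitudes(even_transfer, 1.0)
for order, value in enumerate(amps):
    print(f"harmonic {order}: {value:.4f}")
```

Because the stand-in function is even, the fundamental and third harmonic vanish, while the DC term, second harmonic, and fourth harmonic remain. A doubler selects the second harmonic and a quadrupler the fourth.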

The operating frequency range of frequency dividers is limited by the maximum toggle frequency of static dividers and by the minimum operating frequency of dynamic dividers. The dynastat divider (Chapter 4), a dual-mode frequency divider that can operate in either static or dynamic mode, overcomes both limits.

Chapter 5 describes the design of a pseudo-random-bit-sequence (PRBS) generator with a length of $2^{11}$-1. The PRBS is implemented with a half-rate clock using linear shift registers and a multiplexer to generate the full-rate 40-Gb/s sequence. The topology of the register is modified so that it can be biased using a single -2.5 V supply. Synthetic transmission lines are used to distribute the clock, increasing the PRBS maximum operating frequency.
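A behavioural sketch of a $2^{11}$-1 PRBS assembled from two half-rate streams through a 2:1 multiplexer is given below. The feedback polynomial $x^{11} + x^{9} + 1$ is an assumption (it is the common PRBS-11 choice; the summary does not state which polynomial the design uses), and the function names are illustrative. The code models only the bit sequence, not the circuit timing.

```python
def prbs11_bits(n_bits, seed=0x7FF):
    """Return n_bits from an 11-stage LFSR, assumed polynomial x^11 + x^9 + 1."""
    state = seed & 0x7FF
    if state == 0:
        raise ValueError("an all-zero state never leaves zero")
    bits = []
    for _ in range(n_bits):
        feedback = ((state >> 10) ^ (state >> 8)) & 1  # taps at stages 11 and 9
        state = ((state << 1) | feedback) & 0x7FF
        bits.append(feedback)
    return bits


def mux_2_to_1(even_stream, odd_stream):
    """Interleave two half-rate streams into one full-rate stream."""
    full_rate_bits = []
    for even_bit, odd_bit in zip(even_stream, odd_stream):
        full_rate_bits.extend((even_bit, odd_bit))
    return full_rate_bits


PERIOD = 2**11 - 1                    # 2047 bits
full_rate = prbs11_bits(2 * PERIOD)   # two periods, so each half-rate stream has 2047 bits
even_stream = full_rate[0::2]         # bits selected on one phase of the half-rate clock
odd_stream = full_rate[1::2]          # bits selected on the other phase

print(len(even_stream), sum(full_rate[:PERIOD]))
```

Since 2 and 2047 share no common factor, each half-rate stream visits every bit position of the full-rate period exactly once, which is why both streams have the same length and the same number of ones as the full sequence.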

Chapter 6 describes the design of a digitally-controlled distributed amplifier (DA). The DA is designed to drive a balanced Mach-Zehnder optical modulator at 40 Gb/s with a 6 $V_{p-p}$ differential voltage. This type of amplifier replaces the analog input line of traditional implementations with a digital circuit that retimes the input data, thereby overcoming the dispersion, attenuation, and pulse distortion associated with an analog implementation. A calibration circuit is used together with a 3-step calibration algorithm to obtain minimum rise/fall times (<6 ps) after fabrication of the driver. The wideband circuit (28-48 GHz) relies on a new digitally-controlled clock phase generator. An on-chip 1-0 data sequence is used for built-in self-test (BiST) calibration, and the PRBS generator discussed in Chapter 5 is incorporated for characterization. The design of the digitally-controlled modulator driver is a step towards achieving a fully-digital driver with signal processing capability.
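To put the driver figures quoted above in proportion, the short sketch below converts them into a unit interval, a rise-time fraction, a per-output swing, and a fractional bandwidth. It assumes an ideally balanced differential output and uses the arithmetic center of the 28-48 GHz range; the helper names are illustrative.

```python
def unit_interval_ps(bit_rate_gbps):
    """Duration of one bit in picoseconds."""
    return 1000.0 / bit_rate_gbps

def per_output_swing_v(differential_swing_vpp):
    """Peak-to-peak swing on each output, assuming an ideally balanced pair."""
    return differential_swing_vpp / 2.0

def fractional_bandwidth(f_low_ghz, f_high_ghz):
    """Bandwidth relative to the arithmetic center frequency."""
    center_ghz = (f_low_ghz + f_high_ghz) / 2.0
    return (f_high_ghz - f_low_ghz) / center_ghz

ui_ps = unit_interval_ps(40.0)                  # one bit at 40 Gb/s
rise_fraction = 6.0 / ui_ps                     # 6 ps rise/fall time bound vs. one bit
swing_v = per_output_swing_v(6.0)               # each output of the 6 V p-p differential signal
clock_fbw = fractional_bandwidth(28.0, 48.0)    # quoted 28-48 GHz range

print(f"unit interval: {ui_ps:.1f} ps")
print(f"rise time bound: {100 * rise_fraction:.0f} % of a bit")
print(f"per-output swing: {swing_v:.1f} V p-p")
print(f"fractional bandwidth: {100 * clock_fbw:.1f} %")
```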

Chapter 7 presents the contributions of the thesis. The successful demonstrators validate the circuit analysis and design approach, which produced circuit topologies capable of operating with a reduced supply voltage (2.5 V) and an optimized maximum operating frequency.


## SAMENVATTING

#### Breedbandige Circuits voor Optische Communicatie

De vraag naar meer en meer bandbreedte stuwt de groei en innovatie van technologieën voor communicatie doeleinden. Optische communicatie systemen vereisen elektronische circuits met een hoge bandbreedte en energie efficiëntie. De elektrische uitgangstrap voor de modulator, een belangrijk element in een optische link, is geïmplementeerd als een gedistribueerde versterker (DA). Dit proefschrift presenteert innovaties in de architectuur van een conventionele DA om zodanig beperkingen in de prestaties van een optische transmissielijn (TL) te overwinnen. Bovendien is er aanvullende functionaliteit toegevoegd door een energie detectie circuit te integreren, welke gebruikt kan worden voor kalibratie, en een ingebouwde zelf-test voor karakterisatie doeleinden.

De behoefte aan hogere snelheden vereist een voortdurende verkleining van transistoren. Door echter de oppervlakte van de actieve componenten te verkleinen veranderen de elektrische eigenschappen en dit brengt uitdagingen teweeg, bijvoorbeeld het gebruik van zulke componenten in combinatie met lage voedingspanningen. In dit werk worden verschillende circuits in de digitaal gecontroleerde modulator geoptimaliseerd om ingezet te worden met een lagere voedingsspanning ( $\leq 2.5$  V) dan in een conventionele topologie.

This thesis discusses several wideband circuits in two parts. In the first part, the performance of an advanced SiGe-BiCMOS process is evaluated through the implementation of a feedback Darlington amplifier, frequency multipliers, and frequency dividers. In the second part of this thesis, the concepts from the first part are applied to the design of a digitally-controlled modulator for optical communication.

The well-known feedback Darlington amplifier has been implemented in various technologies. The amplifier is regarded as a generic wideband amplifier that can be used in many different systems. Chapter 2 covers the design of a feedback Darlington amplifier and the steps taken to increase its bandwidth. Three prototypes were tested to verify the equations that describe the low-frequency behavior. Analysis of the time constants, which identifies the parasitic effects that determine the maximum frequency of the amplifier, makes it possible to optimize the topology further and to realize a 25% to 53% higher bandwidth using inductive peaking and cascoding in two additional prototypes. The same chapter also covers the noise figure and linearity of the amplifier.

Chapters 3 and 4 discuss frequency multiplication and division. In Chapter 3, unbalanced cross-coupled differential stages are used as the basis of a frequency multiplier. This topology can operate from a lower supply voltage than a traditional cascoded topology. The circuit, which has an even transfer function, is used to generate the even harmonics. The discussion of this circuit also covers its implementation for wideband and narrowband applications. The circuit is then used in prototypes of a frequency doubler and a frequency quadrupler. The conversion gain of the wideband doubler is positive across the DC-100 GHz range. The narrowband quadrupler is designed for a frequency of 89 GHz and has a 3-dB bandwidth of 16 GHz.

The frequency range of frequency dividers is limited by the maximum toggle frequency for static dividers and by the minimum operating frequency for dynamic dividers. These limitations are overcome by means of a dynastat divider, a dual-mode frequency divider that can operate in either static or dynamic mode (Chapter 4).

Chapter 5 describes the design of a pseudo-random bit sequence (PRBS) generator with a length of $2^{11}$-1. The PRBS is realized with a half-rate clock, using linear shift registers and a multiplexer that generates the 40-Gb/s bitstream. The topology of the shift register is modified so that it can operate from a -2.5 V supply. Synthetic transmission lines are used to distribute the clock, which increases the maximum frequency.

Chapter 6 describes the design of a digitally-controlled distributed amplifier. The DA is designed to drive a balanced Mach-Zehnder optical modulator at 40 Gb/s with a 6-$V_{p-p}$ differential voltage. This type of amplifier replaces the analog input line of traditional implementations with a digital circuit that retimes the input data, thereby overcoming the dispersion, attenuation, and pulse distortion found in analog implementations. Calibration is used in combination with a 3-step algorithm to obtain minimum rise and fall times (<6 ps) after fabrication. The wideband circuit (28-48 GHz) uses a new digitally-controlled clock phase generator. A 1-0 data sequence is integrated on chip for built-in self-test calibration, and the PRBS from Chapter 5 is used for characterization. The design of this digitally-controlled modulator is a step towards a fully digital modulator with signal processing capability.

Chapter 7 presents the scientific contributions of this thesis. The circuit analysis and design strategies for topologies that can operate from a low supply voltage with an optimized maximum frequency are validated through the implementation of successful prototypes.

# 1 Introduction

Global internet data traffic grew 100-fold from 2000 to 2010. Today, traffic is growing 16 times faster, and recent forecasts project an increase of 250 exabytes/month<sup>1</sup> over the decade from 2010 to 2020 [1], [2]. The volume of data has grown in part because of the development of wireless devices such as smartphones, which have increased data processing capability and integrated sensors. Also, new technologies are being introduced that allow machines to communicate with each other directly, such as the internet of things (IoT) [3]. Satisfying the demand for higher data rates is expected to drive the establishment of new wireless communication standards, such as the 5<sup>th</sup> generation (5G) of wireless systems [4], [5].

Fig. 1-1 shows the trends in global internet [1] and mobile data traffic [6] between 2014 and 2019. This growth implies bandwidth requirements of: 1-10 Gb/s to the subscriber in a 5G network; 100-Gb/s in a wireless backhaul network (e.g., between mobile cell-sites); 1 Tb/s for data transport within a metropolitan-area network (MAN); and 1 Pb/s for the core transport network [7], which is often based on the internet protocol.

In addition to technologies that rely on the internet, industry continues to develop commercial and consumer applications, such as automotive radar [8], wireless personal-area networks with Gbit/s data transfer capability (e.g., WirelessHD [9] and WiGig [10]), and private wireless backhaul networks [11].

<sup>1.</sup> 1 exabyte = 1,000,000 terabytes


Fig. 1-1: Total global [1] and mobile data traffic [6].

Regardless of the application, data is transported after modulation onto an optical or electromagnetic (e.g., RF) carrier. Wireless links use electromagnetic signal transmission and offer mobility, but they suffer interference from other electromagnetic signals, attenuation due to time-varying atmospheric conditions, and multipath propagation effects. Transmission across an optical fiber does not suffer from these impairments. A fiber channel is immune to electromagnetic interference and multipath effects, and is the preferred medium for long-distance links because attenuation can be as low as 0.15 dB/km [12]. It is also possible to transmit terabits of data per second with error rates below 10<sup>-12</sup> across optical fibers using current technologies [7], [13].

Communications systems rely upon developments in technology, such as the continuous improvement in integrated circuit (IC) performance resulting from technology scaling, to satisfy the increasing demand for high data rate services. Networks that support the exchange of information may be classified according to the span of the data link: personal-area networks (PAN) operate within the range of an individual; local-area networks (LAN) operate within a limited area, such as a residence, school or building; metropolitan-area networks (MAN) cover a larger geographical area such as an urban area; and wide-area networks (WAN) span regions, countries and continents [14].

#### **1.1** Objectives of this thesis

This work studies the design of wideband circuits for broadband communications in two parts. In the first part, the performance of an advanced 90-nm SiGe-BiCMOS technology is benchmarked by implementing representative wideband analog and digital circuit building blocks. Novel amplifier, frequency multiplier, and frequency divider design concepts are developed to operate from a lower supply voltage than that found in conventional topologies. The benchmark circuits are also used to study the optimization of these circuits for maximum bandwidth.

In the second part of this thesis, wideband circuit concepts developed for the benchmarking circuits are applied to the design and verification of a 40-Gb/s Mach-Zehnder modulator driver for optical communications, which has integrated built-in self-test (BiST) and built-in calibration (BiC) features.

#### **1.2** Optical fiber communication

In optical communication, data modulates the intensity, phase, or polarization of the light emitted by a laser, or a combination of these properties in more complex modulation formats. Different modulation schemes have been developed to increase the optical channel capacity, from on-off keying, which encodes the data in the light intensity, to optical quadrature amplitude modulation, such as the 10 Gbaud 16-QAM over 20 km demonstrated in [15]. Furthermore, lightwaves of multiple wavelengths can be combined in a single fiber to increase the total channel capacity further using wavelength-division multiplexing (WDM) [16]. Digital signals can use orthogonal frequency-division multiplexing (OFDM) [17], and space-division multiplexing (SDM) based on multicore or multimode fibers has the potential for further bandwidth extension [18].

No matter which techniques are used to increase the optical channel capacity, the data needs to be encoded in the optical carrier via the process of optical modulation.

#### **1.2.1 Optical modulation**

The conversion of electrical signals into light can be achieved by applying the signals directly to the power source of a lightwave generator (direct modulation), or by using them to change the characteristics of a previously generated light beam (external modulation).

Direct modulation can be implemented by controlling the laser diode current. In this type of modulation, the bandwidth is limited by the frequency response of the driver circuit and the physical characteristics of the diode laser. Turning the laser on or off creates electrical and thermal stress, shifts the laser frequency over time (chirp), and reduces its operational lifetime. In addition, direct modulation of a laser produces oscillations on the rising edge of the pulse, known as relaxation oscillation. The maximum modulation frequency is limited by the relaxation oscillation frequency, which typically ranges from 1-10 GHz for a vertical cavity surface emitting laser (VCSEL) [19].

In external modulation, a continuous wave (CW) light beam is passed through a modulator, which changes the characteristics of the light according to the signal that is applied. There are two main types of external modulators: the electro-absorption (EA) modulator, and the electro-optic (EO) modulator. These are constantly undergoing development due to their promising characteristics. Electro-absorption modulation is based on changes in the absorption spectrum of the material when an electric field is applied (the Franz-Keldysh effect [20]). A voltage applied to an EA modulator switches it between transparent and opaque states, thus modulating the light passing through the modulator. Electro-optic modulators change the refractive index of a material by applying an electric field (e.g., Kerr effect; Pockels or electro-optic effect [21]). Lightwaves propagating through the material experience a phase shift when a voltage is applied. The phase modulation of the optical carrier induced by the Pockels effect can be transformed into intensity modulation using a Mach-Zehnder interferometer-based modulator.

#### **1.2.2 Mach-Zehnder modulator (MZM)**

The schematic of an interferometer-based modulator is shown in Fig. 1-2. The incoming light is split into two paths, one of which is subjected to electro-optic modulation, controlled by an electrical signal. Light travelling along the two paths recombines at the output, creating an output light beam whose optical power depends on the phase difference between the two paths.

The optical output power is a function of the external voltage, which defines the electric field that modulates the phase difference between the light beams in the modulator paths. The modulator response to the applied voltage can be described in terms of the half-wave voltage $V_{\pi}$, i.e., the voltage that must be applied to the electrode of the optical waveguide to induce a phase shift of 180°, which produces (ideally) zero optical power at the modulator output.
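As an illustration of the role of $V_{\pi}$, the ideal intensity response of the interferometer can be sketched in a few lines of Python. The raised-cosine form below is the textbook response of a lossless, perfectly balanced modulator biased for maximum transmission at 0 V; these conditions, and the function and parameter names, are assumptions made for the sketch rather than properties of the modulator targeted in this work.

```python
import math


def mzm_output_power(v, v_pi, p_in=1.0):
    """Ideal Mach-Zehnder intensity response (assumed lossless and balanced).

    v     : voltage applied to the electrode (V)
    v_pi  : half-wave voltage, i.e. the voltage giving a 180 degree phase shift (V)
    p_in  : optical input power (arbitrary units)
    """
    phase_difference = math.pi * v / v_pi  # radians between the two arms
    return 0.5 * p_in * (1.0 + math.cos(phase_difference))


if __name__ == "__main__":
    # Sweep from full transmission (0 V) to extinction (V_pi), using a
    # hypothetical V_pi of 5 V.
    for v in (0.0, 2.5, 5.0):
        print(f"V = {v:.1f} V -> P_out = {mzm_output_power(v, 5.0):.3f}")
```

In this idealized model the output falls from full power at 0 V to half power at $V_{\pi}/2$ and to zero at $V_{\pi}$, which is why the driver swing is specified relative to $V_{\pi}$.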

The interferometer can be constructed by implanting an optical waveguide into an electro-optic crystal, such as lithium niobate (LiNbO<sub>3</sub>). Enhanced phase-shift efficiency and reduced $V_{\pi}$ are obtained using compound semiconductors to fabricate the modulator [22].

Fig. 1-2: Mach-Zehnder modulator.

The voltage signal provided by the driver is applied to the modulator electrode, which is modelled electrically as a transmission line (TL). Standing waves, created by reflected waves in a TL, are suppressed using a termination load matched to the electrode characteristic impedance Zo, whose value is often 50 $\Omega$. Mach-Zehnder modulators are driven using single-ended or differential signals, and they require a maximum driving voltage in the range 5-6 V [23]. Moreover, the driver must provide an output voltage with minimum rise/fall times to reduce time and pulse distortion [24], and it must have an output return loss (ORL) better than 10 dB across a bandwidth that includes the third harmonic of a transmit pulse train (i.e., ORL better than 10 dB up to 60 GHz in a 40-Gb/s system) [25].
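The 60-GHz figure follows from the fastest pattern in the data: an alternating 1-0 sequence at bit rate $R$ is a square wave with a fundamental at $R/2$, so its third harmonic lies at $3R/2$. The short sketch below reproduces this arithmetic and, as a reminder of what a 10-dB return loss means, converts an impedance mismatch into return loss. The function names and the example impedances are illustrative assumptions.

```python
import math


def third_harmonic_ghz(bit_rate_gbps):
    """Third harmonic of an alternating 1-0 pattern at the given bit rate."""
    fundamental_ghz = bit_rate_gbps / 2.0  # one 1-0 period spans two bits
    return 3.0 * fundamental_ghz


def return_loss_db(z_ohm, z0_ohm=50.0):
    """Return loss of an impedance z_ohm seen from a line of impedance z0_ohm."""
    gamma = abs((z_ohm - z0_ohm) / (z_ohm + z0_ohm))  # reflection coefficient
    if gamma == 0.0:
        return math.inf  # perfect match, nothing is reflected
    return -20.0 * math.log10(gamma)


if __name__ == "__main__":
    print(third_harmonic_ghz(40.0))   # band edge for a 40-Gb/s system, in GHz
    print(return_loss_db(100.0))      # hypothetical 100-ohm mismatch on 50 ohm
```

Under these assumptions a 100-$\Omega$ impedance on a 50-$\Omega$ line reflects one third of the incident voltage, which corresponds to about 9.5 dB and would just miss a 10-dB target.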

#### 1.2.3 Towards an integrated digital optical transmitter

Future optical links will integrate signal processing circuits, the modulator, and its driver in the same die to reduce problems associated with the interconnection of multiple ICs, such as bandwidth limitation and crosstalk. However, further research is required before integrated digital optical transmitters are used in optical communications. A simplified block diagram of a transmitter in an optical link is shown in Fig. 1-3. A DAC provides the input signal to the driver, which is equalized according to the characteristics of the channel by a pre-emphasis component. Calibration of the driver is performed in a closed loop to compensate for static and dynamic errors in the driver, and a characterization component is used to evaluate the performance of the driver. Research topics for the design of integrated digital optical transmitters include the incorporation of pre-emphasis, DAC, and built-in calibration and characterization capability.

Fig. 1-3: Simplified optical link transmitter block diagram.

#### **1.3** IC technology for high-speed/wideband communication

High-data-rate systems require multi-gigahertz bandwidth circuits. Traditionally, III-V semiconductor technologies such as gallium arsenide (GaAs) [26] and indium phosphide (InP) [27] have been preferred to satisfy the output voltage and bandwidth requirements of modulator drivers for optical communications. The preference for III-Vs is explained by their higher breakdown voltage in comparison to a silicon-based technology with similar $f_T$. For example, the 0.25-µm InP/InGaAs dual heterojunction bipolar transistor (DHBT) found in [28] (173/470 GHz $f_T/f_{max}$) has a 12 V BV<sub>CEO</sub>, which is more than six times larger than the 1.8 V BV<sub>CEO</sub> of an HBT in a 0.13-µm SiGe-BiCMOS technology with 200/280 GHz $f_T/f_{max}$ [29].

However, co-integration of high-performance analog/RF and high-complexity digital circuitry is possible using SiGe technologies. These circuits can then be manufactured in high volume at a lower cost than a III-V equivalent, which enables applications beyond those encountered today.

#### 1.4 Wideband circuits and technology benchmarking

Development of BiCMOS technologies facilitates the design and demonstration of leading-edge circuits and systems operating at unprecedented speeds. Aside from faster circuits, advanced SiGe HBTs can also be used to mitigate PVT variations for higher yield and improved reliability, or the higher operating speed can be traded off for lower power consumption and improved energy efficiency. Circuit performance benchmarking is important to demonstrate the capabilities of new technologies for applications of interest. Three types of circuits were selected to implement benchmark circuits: general purpose broadband amplifiers, frequency multipliers, and frequency dividers.

The general purpose amplifier is intended to provide direct evaluation of the technology capability via its gain-bandwidth product. This circuit should be easy to test, and simple enough to correlate its performance to transistor metrics. Transceivers benefit from frequency converter circuits for up- or down-conversion. Moreover, limitations in the output power of amplifiers and the tuning range of oscillators operating near the device cut-off frequency make multipliers an important component in very high frequency transmitters. Frequency dividers are essential components for the frequency control of mm-wave oscillators in phase-locked loops, and they are a representative circuit used to benchmark the capability of digital circuits.

Transistors operating at high speed must be biased at current densities close to the peak-$f_T$ current density. Broadband circuits with reduced power consumption therefore require topologies that use lower supply voltages, which aligns with the trade-off between $f_T$ and breakdown voltage in advanced technologies (Johnson limit [30]). The benchmark circuits in this thesis use new topologies that consume less power than conventional designs, and they apply bandwidth extension techniques to maximize their operating frequency.

#### **1.4.1 Wideband amplifiers**

The Darlington pair is widely used in resistive feedback amplifiers, and it is selected for the implementation of the general purpose wideband amplifier benchmark circuit. It uses two transistors in its topology, and it has higher input impedance and current gain than a single transistor. Different techniques have been proposed to increase its bandwidth, including series-inductive peaking at the input [31], at the output [32], or within the feedback loop [33]. A complete analysis of the Darlington amplifier gain-bandwidth product and input and output impedances for HBT and HEMT devices was presented in [34]. However, the analysis provides little insight into how the circuit can be optimized. Chapter 2 investigates the use of a Darlington pair as the active device in a feedback amplifier. The chapter focuses on the design of the Darlington amplifier, and describes the optimization of its gain-bandwidth product. Findings on the design of resistive feedback Darlington amplifiers are validated with the implementation of three amplifier benchmark circuits. An evaluation of their noise figure and linearity is included in the chapter.

#### **1.4.2 Frequency multiplier**

Upconversion is widely used at upper mm-wave and sub-mm-wave frequencies, where a lower frequency source is upconverted via a multiplier or a chain of multiplier stages. There are passive, injection-locked, and active frequency multipliers. Passive multipliers typically comprise a non-linear device that generates an output with harmonics of its input signal, and filter(s) that select the frequency component of interest. The filter design trades off bandwidth, insertion loss and harmonic suppression. Injection-locked frequency multipliers use a regenerative circuit to obtain the desired harmonic, but they suffer from limited bandwidth [35].

Wideband active multipliers exploit a circuit transfer function to generate the desired harmonic. This type of multiplier is often implemented using the Gilbert cell topology. One of the drawbacks of this topology is the minimum voltage headroom required by the cascoded differential pairs used in the circuit. An alternative for active multiplication, studied by Kimura, is based on unbalanced cross-coupled differential pairs [36]. This topology is suitable for low voltage and low power consumption, and it is studied for broadband and narrowband applications in Chapter 3, including frequency doublers, and a frequency quadrupler.
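To see why a transfer characteristic with even symmetry produces only even harmonics, consider a memoryless circuit whose output contains only even powers of its input. The fourth-order truncation and the coefficients $a_0$, $a_2$, $a_4$ below are illustrative assumptions, not the transfer function of the circuit studied in Chapter 3. For an input $x(t) = A\cos\omega t$:

$$
y = a_0 + a_2 x^2 + a_4 x^4 = \left(a_0 + \frac{a_2 A^2}{2} + \frac{3 a_4 A^4}{8}\right) + \left(\frac{a_2 A^2}{2} + \frac{a_4 A^4}{2}\right)\cos 2\omega t + \frac{a_4 A^4}{8}\cos 4\omega t
$$

No odd harmonics appear. The $2\omega$ term is the component a doubler delivers, and the $4\omega$ term is the component a quadrupler would select.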

#### **1.4.3 Frequency divider**

A frequency divider produces an output signal whose frequency is $f_{in}/N$, where $f_{in}$ is the frequency of the input signal and N is the division factor. The phase noise of the divided signal is reduced by 20log(N) dB, so frequency division improves spectral purity. Emitter-coupled-logic (ECL) master-slave D flip-flops (MS-D-FF) are critical blocks used in microwave frequency direct synthesis, analog-digital converters and fiber-optic transmission chip-sets [37]. Feeding the inverted output of an ECL MS-D-FF back to its data input realizes a static frequency divider. The maximum clock frequency of a static frequency divider benchmarks the speed of digital circuits in a technology. Furthermore, the maximum divisible frequency is increased by applying the principle of regenerative frequency division [38], leading to dynamic dividers. This type of divider achieves a higher operating frequency, but it is constrained by a minimum operating frequency.
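Both statements can be illustrated with a short behavioral sketch. The first function evaluates the ideal 20log(N) phase-noise improvement, ignoring any noise added by the divider itself; the second models a static divide-by-two as a flip-flop that toggles on every rising clock edge, which is the behavior obtained when the inverted output is fed back to the data input. The function names and the sampled-clock representation are assumptions made for illustration.

```python
import math


def phase_noise_improvement_db(n):
    """Ideal phase-noise reduction for division by n (divider noise ignored)."""
    return 20.0 * math.log10(n)


def divide_by_two(clock_samples):
    """Behavioral static divider: the output toggles on each rising clock edge."""
    q = 0
    previous = 0
    output = []
    for clk in clock_samples:
        if previous == 0 and clk == 1:  # rising edge detected
            q ^= 1                      # inverted output becomes the new state
        output.append(q)
        previous = clk
    return output


if __name__ == "__main__":
    print(phase_noise_improvement_db(2))
    print(divide_by_two([0, 1, 0, 1, 0, 1, 0, 1]))
```

The divided waveform repeats every four samples while the clock repeats every two, i.e., the output frequency is half the input frequency.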

In Chapter 4, a frequency divider topology that can operate as a static or a dynamic divider is presented. Two versions of the dual operation mode (dynastat) divider topology (one using a 4.5 V and another a 2.5 V supply) are implemented in SiGe-BiCMOS technologies.

Bandwidth demand (Section 1.1) cannot be satisfied merely by scaling up the network capacity using current technologies because this would cause an exponential increase in energy consumption [39]. Optical networks with improved energy efficiency must be built using components that offer increased data rates and reduced power consumption. The second part of this thesis investigates the design of a Mach-Zehnder modulator driver, developed using a new architecture. Design concepts used in the benchmark circuits for low voltage operation are applied to the optical modulator driver. An overview of optical communications and the targeted capabilities of the modulator driver are presented in the following sections.


#### 1.5 Mach-Zehnder modulator driver

The modulator driver must be implemented using a wideband amplifier circuit. Alternatives for its implementation include the Darlington, staggered, Cherry-Hooper, and distributed amplifiers. Among them, the Darlington amplifier [40] is the only single-stage amplifier. It has broadband capability and can operate from a low supply voltage. The staggered amplifier [41] divides the signal amplification into frequency bands, and it provides overall wideband operation by combining multiple stages with an equalized frequency response. The Cherry-Hooper amplifier [42] is a two-stage (transconductance-transimpedance) amplifier.

The voltage driving the modulator is defined by the driver output current and the characteristic impedance of the modulator electrode, e.g., a 3 $V_{p-p}$ driving voltage in a 50-$\Omega$ electrode requires 60 mA. The Darlington, staggered, and Cherry-Hooper amplifiers require large output transistors to conduct the output current, and the parasitics associated with the large devices limit the frequency response of the driver.
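The 60-mA figure is a direct application of Ohm's law, as the sketch below shows. The optional `back_terminated` flag is an added assumption that models a driver whose output is also loaded by a matched back-termination resistor, as is the case for the distributed amplifier discussed in this section; the function and parameter names are hypothetical.

```python
def required_output_current_ma(v_pp, z0_ohm=50.0, back_terminated=False):
    """Peak-to-peak output current (mA) needed for a given electrode swing.

    v_pp            : peak-to-peak voltage required at the electrode (V)
    z0_ohm          : electrode characteristic impedance (ohm)
    back_terminated : if True, a matched back-termination appears in parallel
                      with the electrode and takes half of the output current
    """
    load_ohm = z0_ohm / 2.0 if back_terminated else z0_ohm
    return 1000.0 * v_pp / load_ohm


if __name__ == "__main__":
    print(required_output_current_ma(3.0))                        # electrode only
    print(required_output_current_ma(3.0, back_terminated=True))  # with back-termination
```

With a matched back-termination the output stage must supply twice the current for the same swing, which illustrates why back-termination halves the efficiency of the output stage.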

The distributed amplifier (DA) combines multiple gain stages (see Fig. 1-4). It distributes the input signal to the gain stages using an input transmission line (TL) and sums the gain stage outputs in an output TL. The chip area occupied by a DA is larger than that of the previously mentioned wideband amplifiers. Furthermore, the DA output TL characteristic impedance must be matched to its load impedance, i.e., the modulator electrode Zo. The DA output TL has back-termination resistors matched to the output line characteristic impedance, which are required to prevent reflections from the back of the output line. The back-termination dissipates half of the DA output current, which reduces the power efficiency of the DA. However, despite the large area and lower efficiency, the DA can simultaneously achieve a high gain-bandwidth (GBW) product and a multi-volt output voltage because its output current is conducted by multiple gain stages instead of a single stage with large devices. This makes it possible to design the input and output lines so that they include the (now) distributed parasitic capacitance.

Fig. 1-4: Distributed amplifier circuit.

#### **1.5.1 Conventional distributed amplifier limitations**

A conventional distributed amplifier has limited flexibility in its design. Correct phase alignment between signals traveling at the input and output is obtained by satisfying precise matching requirements between the input and output transmission lines. The performance of the DA can be degraded during fabrication due to process and mismatch variations, and during operation it can be sensitive to supply voltage and temperature variations. Moreover, attenuation of the signal traveling in the input line causes dispersion, and it limits the maximum number of gain stages in a DA [43]. The limitations associated with the input transmission line have been addressed with the design of a digitally-controlled distributed amplifier.

#### **1.5.2 Digitally-controlled distributed amplifier**

A simplified block diagram of a digitally-controlled distributed amplifier, demonstrated in [24], is shown in Fig. 1-5. Conventional latches replace the input transmission line to obtain replicas of the input signal at the inputs of the DA gain stages. The digitally-controlled DA eliminates the dispersion, attenuation, ringing and pulse distortion associated with the input line of a conventional implementation.



Fig. 1-5: Distributed amplifier with digitally retimed input data.

The clock used to retime the input signal in each latch is derived from an input clock. The phase of each retiming clock is controlled individually in a phase control circuit, which is used to match the delay of the replicas at the input of the DA gain stages to the delay of the signal propagating in the output line. Digital control of the retiming clocks facilitates the calibration of the circuit after fabrication by setting the phase of the clocks digitally via control bits b1 to bm.
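The delay-matching requirement can be made concrete with a small numerical sketch. It assumes a uniform delay per output-line section, a retiming clock of known period, and a phase generator with a fixed number of uniformly spaced settings per clock period. None of these values, nor the function and parameter names, are taken from the driver described in this thesis; they are placeholders chosen so that the arithmetic is easy to follow.

```python
def retiming_phase_codes(num_stages, section_delay_ps, clock_period_ps, bits):
    """Phase-control code for each gain stage (hypothetical uniform model).

    Stage k must launch its replica k * section_delay_ps after stage 0 so that
    it adds in phase with the signal travelling along the output line. Only the
    delay modulo one clock period matters for the clock phase setting.
    """
    steps = 2 ** bits  # number of phase settings per clock period
    codes = []
    for k in range(num_stages):
        delay_ps = k * section_delay_ps
        fraction = (delay_ps % clock_period_ps) / clock_period_ps
        codes.append(round(fraction * steps) % steps)
    return codes


if __name__ == "__main__":
    # Hypothetical example: 5 stages, 5 ps per section, 25 ps clock period,
    # 5 control bits per retiming clock.
    print(retiming_phase_codes(5, 5.0, 25.0, 5))
```

In practice the section delays are not known precisely after fabrication, which is why the optimum control settings are identified by calibration rather than computed from nominal values.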

#### **1.6** Organization of this thesis

Aside from the benchmarking circuits, this thesis presents the design of a digitally-controlled modulator driver that targets a 40-Gb/s data rate and a 6-$V_{p-p}$ differential output swing across a 100-$\Omega$ load. The digital control incorporated in the driver can be used to reduce imperfections of analog designs by suppressing the effect of process, supply voltage, and temperature variations. However, calibration of the digital input line requires a means to identify the optimum control settings. In this work, built-in calibration (BiC) capability is implemented in the driver to exploit the digital control functionality in the new modulator driver. For this goal, an on-chip 1-0 data source, an energy detector circuit, and a three-step calibration algorithm are integrated in the driver IC.

Characterization of the modulator driver using built-in self-test (BiST) capability reduces the time and complexity required during production. The modulator driver presented in this work incorporates a  $2^{11}$ -1 pseudo random bit sequence (PRBS) generator which operates at half-rate clock with an integrated frequency divider, and it outputs a trigger signal to be used for synchronization during characterization.
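A $2^{11}$-1 sequence can be modelled behaviorally with an 11-bit linear feedback shift register. The sketch below is a full-rate software model only: it does not represent the half-rate architecture of the on-chip generator, and the feedback taps (stages 9 and 11, a choice commonly used for PRBS11 test patterns) are an assumption, since the polynomial is not stated in this section. The function name and seed are likewise hypothetical.

```python
PERIOD = 2 ** 11 - 1  # 2047 bits before the sequence repeats


def prbs11_bits(count, seed=0x7FF):
    """Return `count` bits from an 11-stage LFSR with taps at stages 9 and 11."""
    state = seed & 0x7FF
    if state == 0:
        raise ValueError("seed must be non-zero: the all-zero state never changes")
    bits = []
    for _ in range(count):
        # XOR of stage 11 (bit 10) and stage 9 (bit 8) forms the feedback bit.
        feedback = ((state >> 10) ^ (state >> 8)) & 1
        bits.append(feedback)
        # Shift by one stage and insert the feedback bit at stage 1.
        state = ((state << 1) | feedback) & 0x7FF
    return bits


if __name__ == "__main__":
    sequence = prbs11_bits(2 * PERIOD)
    print(sequence[:PERIOD] == sequence[PERIOD:])  # sequence repeats after 2047 bits
    print(sum(sequence[:PERIOD]))                  # number of ones in one period
```

A maximal-length sequence of this kind contains 1024 ones and 1023 zeros per period, which gives the near-balanced, broadband data pattern needed for characterization.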

The 40-Gb/s MZM modulator driver developed in this work has a power consumption 10% lower than that of the circuit reported in [24] despite a fourfold increase in the data rate. Furthermore, it operates across data rates of 28-48 Gb/s enabled by a new digitally-controlled clock synchronizer circuit.

Chapter 2 presents the design of a single stage resistive feedback wideband amplifier. A Darlington pair is used as the active component embedded in a resistive feedback design. The wideband amplifier gain-bandwidth product is maximized using inductive peaking and a cascoded Darlington pair.

A frequency multiplier circuit topology, suitable for low voltage operation, is studied in Chapter 3. Unbalanced emitter-coupled pairs are used in the multiplier core, which is explored for narrow and broadband applications. The results lead to the design of a broadband frequency doubler, a narrowband frequency doubler, and a narrowband frequency quadrupler.

Frequency dividers capable of operating in dynamic or static mode (dynastat dividers) are presented in Chapter 4. The design of two dynastat divider prototypes is presented in this chapter, one biased from a 4.5 V supply and another biased from a -2.5 V supply.

The complete design of the 40-Gb/s digitally-controlled DA is presented in two chapters. In Chapter 5, the architecture, design and characterization of a 40-Gb/s 2<sup>11</sup>-1 PRBS generator is presented. The PRBS uses a half-rate clock distribution scheme, and it is designed to operate from a single -2.5 V supply. Moreover, it incorporates innovations in the clock distribution, and it outputs a trigger signal.

The design and characterization of the digitally-controlled modulator driver is presented in Chapter 6. The circuit operates from +5/-2.5 V supplies. The output stage of the 40-Gb/s modulator driver is a distributed amplifier, which delivers a 6 V<sub>p-p</sub> output voltage. The prototype includes a new clock synchronizer control circuit, an energy detector for BiC, and BiST capability implemented with the 40-Gb/s 2<sup>11</sup>-1 PRBS.

Chapter 7 presents the major research contributions and recommendations for future work.

#### References

[1] S. K. Korotky, "Semi-empirical description and projections of Internet traffic trends using a hyperbolic compound annual growth rate," *Bell Labs Technical Journal*, vol. 18, no. 3, pp. 5-21, Dec. 2013.

[2] M. Forzati, A. Bianchi, C. Jiajia, et al., "Next-generation optical access seamless evolution: concluding results of the European FP7 Project OASE," *IEEE/OSA Journal of Optical Communications and Networking*, vol. 7, no. 2, pp. 109-123, Feb. 2015.

[3] L. Coetzee, J. Eksteen, "The Internet of Things - promise for the future? An introduction," IST-Africa Conference Proceedings, pp.1-9, May 2011.

[4] T. S. Rappaport, S. Shu, R. Mayzus, Z. Hang, Y. Azar, K. Wang, G.N. Wong, J.K. Schulz, M. Samimi, F. Gutierrez, "Millimeter Wave Mobile Communications for 5G Cellular: It Will Work!," *IEEE Access*, vol. 1, pp. 335-349, 2013.

[5] L. Xichun, A. Gani, R. Salleh, O. Zakaria, "The Future of Mobile Wireless Communication Networks," International Conference on Communication Software and Networks, pp. 554-557, Feb. 2009.

[6] CISCO, "Cisco Visual Networking Index: Mobile Data Traffic Forecast Update, 2014-2019," 2015.

[7] C. Lin, Z. Ming, M.M.U. Gul, M. Xiaoli, C. Gee-Kung, "Adaptive Photonics-Aided Coordinated Multipoint Transmissions for Next-Generation Mobile Fronthaul," *Journal of Lightwave Technology*, vol.32, no.10, pp.1907-1914, 2014.

[8] J. Wenger, "Automotive radar – status and perspectives," in Proc. of Compound Semiconductor Integrated Circuit Symposium, pp. 21–24, Nov. 2005.

[9] H. Singh, Oh Jisung, Kweon ChangYeul, Q. Xiangping, S. Huai-Rong, N. Chiu, "A 60 GHz wireless network for enabling uncompressed video communication," IEEE Communications Magazine, vol. 46, no. 12, pp. 71-78, Dec. 2008.

[10] C. J. Hansen, "WiGiG: multi-gigabit wireless communications in the 60 GHz band," *IEEE Wireless Communications*, vol. 18, no. 6, pp. 6–7, Dec. 2011.

[11] J. Wells, "Faster than fiber: The future of multi-G/s wireless," *IEEE Microwave Magazine*, vol. 10, no. 3, pp. 104–112, May 2009.

[12] S. Makovejs, C.C. Roberts, F. Palacios, H.B. Matthews, D.A. Lewis, D.T. Smith, P.G. Diehl, J.J. Johnson, J.D. Patterson, C.R. Towery, S.Y. Ten, "Record-low (0.1460 dB/km) attenuation ultra-large aeff optical fiber for submarine applications," IEEE conference in Optical Fiber Communications (OFC), pp. 1-3, March 2015.

[13] H. Suzuki, K. Watanabe, K. Ishikawa, H. Masuda, K. Ouchi, T. Tanoue, and R. Takeyari, "Very-high speed InP/InGaAs HBT ICs for optical transmission system," *IEEE Journal of Solid-State Circuits* vol. 33, no. 9, pp. 1313–1320, Sept. 1998.

[14] IEEE Standard for Local and Metropolitan Area Networks: Overview and Architecture, in IEEE Std 802-2014 (Revision to IEEE Std 802-2001), pp.1-74, June 30 2014.

[15] T. Kuri, T. Sakamoto, T. Kawanishi, "Laser-phase-fluctuation-insensitive optical coherent transmission of 16-QAM radio-over-fiber signal with offset-frequency-spaced two-tone local light," Optical Fiber Communications Conference and Exhibition (OFC), pp. 1-3, March 2015.

[16] H. Ishio, J. Minowa, and K. Nosu, "Review of status of wavelength-division-multiplexing technology and its applications," *Journal of Lightwave Technology*, vol. 2, no. 4, pp. 448–463, Aug. 1984.

[17] D. Qian, M.-F. Huang, E. Ip, Y.-K. Huang, Y. Shao, J. Hu, and T. Wang, "High capacity/spectral efficiency 101.7-Tb/s WDM transmission using PDM-128QAM-OFDM over 165-km SSMF within C and L-bands," *Journal of Lightwave Technology*, vol. 30, no. 10, pp. 1540–1548, May 2012.

[18] G.M. Saridis, D. Alexandropoulos, G. Zervas, D. Simeonidou, "Survey and Evaluation of Space Division Multiplexing: From Technologies to Optical Networks," IEEE Communications Surveys & Tutorials, vol. 17, no. 4, pp. 2136-2156, Fourthquarter 2015.

[19] S. Ahmadipanah, R. Kheradmand, F. Prati, "Enhanced Resonance Frequency and Modulation Bandwidth in a Cavity Soliton Laser," *IEEE Photonics Technology Letters*, vol. 26, no. 10, pp. 1038-1041, May 2014.

[20] L. V. Keldysh, "Ionization in the Field of a Strong Electromagnetic Wave," J. Exptl. Theoret. Phys. (U.S.S.R.) 47, pp. 1945-1957, Nov. 1964.

[21] R. Paschotta, "Article on Pockels effect," Encyclopedia of Laser Physics and Technology, Wiley-VCH, 1st edition, October 2008.

[22] Jae Hyuk Shin, N. Dagli, "Ultralow Drive Voltage Substrate Removed GaAs/AlGaAs Electro-Optic Modulators at 1550 nm," *IEEE Selected Topics in Quantum Electronics Journal*, vol. 19, no. 6, pp. 150-157, Nov.-Dec. 2013.

[23] Jianfeng Ding, Hongtao Chen, Ruiqiang Ji, et al., "Low-Voltage, High Extinction Ratio Carrier-Depletion Mach-Zehnder Silicon Optical Modulator," Communications and Photonics Conference and Exhibition, pp. 1-6, Nov. 2011.

[24] Y. Zhao, L. Vera, J.R. Long, "A 10 Gb/s, 6 V p-p , Digitally Controlled, Differential Distributed Amplifier MZM Driver," *IEEE-JSSC*, vol.49, no.9, pp. 2030-2043, Sept. 2014.

[25] E. Säckinger, "Broadband Circuits for Optical Fiber Communication". New York, NY, USA: Wiley, 2005.

[26] H. L. Hung, G. M. Hegazai, T. T. Lee, F. R. Phelleps, J. L. Singer, and H. C. Huang, "V-band GaAs MMIC low-noise and power amplifiers," *IEEE Transactions on Microwave Theory and Techniques*, vol. 36, no. 12, pp. 1966–1975, Dec. 1988.

[27] K. W. Kobayashi, L. T. Tran, S. Bui, J. Velebir, D.Nguyen, A. K. Oki, and D. C. Streit, "InP based HBT millimeter-wave technology and circuit performance to 40 GHz," in *Proc. of IEEE Microwave and Millimeter-wave Monolithic Circuits Symposium*, 1993, pp. 85–88.

[28] N. Kashio, K. Kurishima, M. Ida, H. Matsuzaki, "InP/InGaAs double heterojunction bipolar transistors with BVCEO = 12 V and fmax = 470 GHz," *Electronics Letters*, vol. 51, no. 8, pp. 648-649, April 2015.

[29] B.A. Orner, Q.Z. Liu, B. Rainey, et al., "A 0.13 μm BiCMOS technology featuring a 200/280 GHz (fT/fmax) SiGe HBT," Proceedings of Bipolar/BiCMOS Circuits and Technology Meeting, pp. 203-206, Sept. 2003.

[30] E. Johnson, "Physical Limitations on Frequency and Power Parameters of Transistors," 1958 IRE International Convention Record, vol. 13, pp. 27-34, March 1966.

[31] P. Shen, W-R Zhang, H-Y Xie, D-Y Jin, W. Zhang, J. Li, J-N Gan, "An ultra-wideband Darlington low noise amplifier design based on SiGe HBT," International Conference on Microwave and Millimeter Wave Technology. pp.1372-1375, April 2008.

[32] Y. Kuriyama, J. Akagi, T. Sugiyama, S. Hongo, K. Tsuda, N. Iizuka, M. Obara, "DC to 40-GHz broad-band amplifiers using AlGaAs/GaAs HBT's," *IEEE-JSSC*, vol.30, no.10, pp.1051-1054, Oct 1995.

[33] H.S. Tsai, R. Kopf, R. Melendes, M. Melendes, A. Tate, R. Ryan, R. Hamm, T.-K. Chen, "90 GHz baseband lumped amplifier," *Electronics Letters*, vol.36, no.22, pp.1833-1834, Oct 2000.

[34] W. Shou-Hsien, C. Hong-Yeh, C. Chau-Ching, W. Yu-Chi, "Gain-Bandwidth Analysis of Broadband Darlington Amplifiers in HBT-HEMT Process," Transactions on Microwave Theory and Techniques, vol. 60, no.11, pp. 3458-3473, Nov. 2012.

[35] E. Monaco, M. Pozzoni, F. Svelto, A. Mazzanti, "Injection-Locked CMOS Frequency Doublers for μ-Wave and mm-Wave Applications," *Solid-State Circuits, IEEE Journal of*, vol.45, no.8, pp.1565-1574, Aug. 2010.

[36] K. Kimura, "A bipolar low-voltage quarter-square multiplier with a resistive-input based on the bias offset technique," *IEEE-JSSC*, vol.32, no.2, pp. 258-266, Feb 1997.

[37] Q. Lee, D. Mensa, J. Guthrie, S. Jaganathan, T. Mathew, Y. Betser, S. Krishnan, S. Ceran, M.J.W. Rodwell, "66 GHz static frequency divider in transferred-substrate HBT technology," IEEE-RFIC. pp. 87-90, 1999.

[38] R.L. Miller, "Fractional-Frequency Generators Utilizing Regenerative Modulation," Proceedings of the IRE, vol.27, no.7, pp.446-457, July 1939.

[39] J. Baliga, R.W.A. Ayre, K. Hinton, W.V. Sorin, S. Tucker Rodney, "Energy Consumption in Optical IP Networks," *IEEE Journal of Lightwave Technology*, vol.27, no.13, pp.2391-2403, July1, 2009.

[40] D. A. Hodges, "Darlington's Contributions to Transistor Circuit Design," *IEEE Transactions on Circuits and Systems I: Fundamental Theory and Applications*, vol. 46, no. 1, pp. 102–104, Jan. 1999.

[41] J.R. Nelson, "A Theoretical Comparison of Coupled Amplifiers with Staggered Circuits," Proceedings of the Institute of Radio Engineers, vol. 20, no. 7, pp. 1203-1220, July 1932.

[42] E.M. Cherry, D.E. Hooper, "The Design of Wide-band Transistor Feedback Amplifiers," Proceedings of the Institution of Electrical Engineers, vol. 110, no. 2, pp. 375-389, February 1963.

[43] T. Y. Wong, "Fundamentals of distributed amplification". Artech House, Inc., 1993.

[44] C. Moy, J. Palicot, "Software radio: a catalyst for wireless innovation," IEEE Communications Magazine, vol. 53, no. 9, pp. 24-30, September 2015.

# **Part I** BENCHMARK CIRCUITS

# 2 WIDEBAND AMPLIFIER

The design of a Darlington amplifier with shunt resistive feedback is investigated in this chapter. The circuit is biased from a 2.1-V supply, and its gain-bandwidth product benchmarks the performance of the technology. Simple design equations for the gain and the input/output impedances are derived. Moreover, inductive peaking and cascoding, which mitigate the effects of parasitics, are used to increase the circuit gain-bandwidth (GBW) product. Three broadband amplifier prototypes were fabricated to verify the optimization of the GBW product. The prototypes (fabricated in IBM SiGe-BiCMOS 9HP [1]) are designed for 50-$\Omega$ input/output matching and 12-dB forward transmission gain (S<sub>21</sub>).

#### 2.1 Broadband amplifiers

The performance of a technology can be evaluated by comparing the gain-bandwidth product of broadband amplifiers [1]. Alternatives for their implementation include the distributed, staggered, Cherry-Hooper, and Darlington amplifiers.

The distributed amplifier (DA) was described in Section 1.5. It distributes the input signal to the gain stages using an input transmission line (TL) and combines the gain stage outputs in an output TL. A distributed amplifier absorbs the gain stage capacitive loading in a synthetic transmission line at the input and output, achieving multi-volt output voltage and high gain-bandwidth (GBW) product simultaneously [2]. However, the DA occupies a larger chip area than single-stage wideband amplifiers, and it has lower power efficiency because at least one-half of the RF output power is dissipated in a back-termination resistor used to prevent reflections in the output TL.

A different approach, the staggered amplifier, sub-divides the desired passband for signal amplification into smaller frequency bands, and it provides overall broadband operation by combining the outputs of its equalized stages [3]. An example of this type of broadband amplifier is shown in Fig. 2-1 [4], which combines a Darlington feedback amplifier and degenerated gain stages to achieve 102 GHz bandwidth. Problems associated with staggered amplifiers, compared to single-stage amplifiers, include gain variation in their frequency response and increased group delay variation across their operating frequency range.

Broadband amplification and increased gain are obtained using cascaded amplifiers (see Fig. 2-2 [5]), which produce an overall amplification equal to the product of their individual voltage gains.



Fig. 2-2: 2-stage Darlington amplifier [5].



Fig. 2-3: Cherry-Hooper amplifier [8].

The Cherry-Hooper amplifier [6] is a two-stage (transconductance-transimpedance) amplifier, which was modified to increase its bandwidth [7] and gain [8]. Its schematic is shown in Fig. 2-3. One of the main limitations of this type of amplifier is its minimum supply voltage, which is constrained by the cascoded transistors in its topology.

# 2.2 Darlington pair

The Darlington pair, shown in Fig. 2-4a [9], is a compact stage with broadband capability that can operate from a reduced supply voltage. The unity-gain frequency ($f_T$) of the Darlington pair is about twice the $f_T$ of a single transistor, as shown in Fig. 2-4b. The increased maximum operating frequency of the Darlington pair makes it attractive for high-frequency, broadband circuits, which is why it has been widely used as a broadband gain stage [10], broadband mixer [11], active balun [12], low-noise amplifier [13], power amplifier [14], and gain stage in a distributed amplifier [15].

Fig. 2-4: Darlington pair.

A single-stage Darlington feedback amplifier is selected as the benchmark circuit for wideband amplification. It is an excellent benchmarking circuit because its performance relates directly to transistor metrics, and the circuit is relatively easy to characterize from 2-port measurements.

# 2.3 Darlington feedback amplifier

A Darlington amplifier with resistive feedback is shown in Fig. 2-5. The amplifier consists of a shunt feedback resistor ($R_F$), transistors $Q_1$ and $Q_2$ connected in a Darlington configuration, a bias current mirror ($Q_{b1}$-$Q_{b2}$), and a ballast resistor ($R_E$).

Without the ballast resistor ($R_E$), the collector current of $Q_2$ ($I_{C,Q2}$) increases with increasing temperature. After introducing $R_E$, the base-emitter voltage of $Q_2$ ($V_{BE,Q2}$) reduces when $I_{C,Q2}$ increases, which regulates $I_{C,Q2}$. Bias stability in the amplifier is therefore obtained at the cost of reduced transconductance.

The amplifier in Fig. 2-5 is embedded in a 50-$\Omega$ system (i.e., 50-$\Omega$ source and load), and it uses two external bias-Ts to simplify testing. The output bias-T supplies $V_{CC}$ to the amplifier, but it can be removed if the bias current is fed through the load ($R_L$). The amplifier power consumption increases if the output bias-T is removed, because $V_{CC}$ must be increased to account for the voltage drop across $R_L$.

Fig. 2-5: Shunt feedback, broadband reference amplifier.

A single transistor can implement the active gain. However, the capacitive loading that a single transistor places on the feedback network formed by $R_F$ and $R_G$ at the input is approximately twice the loading introduced by a Darlington pair. The Darlington pair gain-bandwidth (GBW) product is also approximately twice the GBW of a single transistor (i.e., $f_T$ doubling [16]).

Transistors Q<sub>1</sub> and Q<sub>2</sub> are biased via I<sub>b</sub>, V<sub>BB</sub> and V<sub>CC</sub>. The collector current of Q<sub>1</sub> is set by I<sub>b</sub> and current mirror Q<sub>b1</sub>-Q<sub>b2</sub>. The collector current of Q<sub>2</sub> is set by its base-emitter voltage, which is controlled via V<sub>BB</sub> and ballast resistor R<sub>E</sub> (i.e., $V_{b,Q2}=V_{BB}-V_{be,Q1}$ and $V_{e,Q2}=I_{e,Q2}\cdot R_E$). Moreover, V<sub>BB</sub> and V<sub>CC</sub> set the base-collector voltage of Q<sub>1</sub> (V<sub>BC,Q1</sub>=V<sub>BB</sub>-V<sub>CC</sub>) and Q<sub>2</sub> (V<sub>BC,Q2</sub>=V<sub>BB</sub>-V<sub>be,Q1</sub>-V<sub>CC</sub>), but only one bias-T is required if V<sub>BB</sub> or V<sub>CC</sub> is defined by the voltage drop across R<sub>F</sub>.

# 2.3.1 Low frequency gain, input and output resistance

The amplifier gain and its input and output impedances can be calculated using nodal or mesh analysis. For example, [17] analyzes the Darlington amplifier gain and bandwidth in an HBT-HEMT process. The transistor model used in that analysis includes its input resistance and parasitic capacitances. However, the expressions obtained provide little insight into the design and optimization of the circuit.

Following a different approach, the simplified low-frequency, small-signal model shown in Fig. 2-6 is used to analyze the amplifier. The schematic includes the generator resistance (R<sub>G</sub>), the amplifier equivalent shunt input resistance (R<sub>be</sub>), feedback resistance (R<sub>F</sub>), transistor transconductance (g<sub>m</sub>), and load resistance (R<sub>L</sub>). The signal flow graph of the circuit is shown in Fig. 2-7. Three node voltages (generator E<sub>g</sub>, input V<sub>i</sub> and output V<sub>o</sub>) are represented in the graph. Parameters $\alpha$, G<sub>a</sub>, G<sub>p</sub> and H symbolize the transmittances from one node to another via each branch in the circuit<sup>1</sup>. The amplifier voltage gain ($\Delta_v$) is obtained using Mason's gain rule,

$$\Delta_v = \frac{V_o}{E_g} = \frac{\alpha(G_p + G_a)}{1 - (G_p + G_a) \cdot H}.$$

Fig. 2-6: Shunt feedback amplifier low frequency model.

Since $\Delta_v = V_o/E_g$, when the amplifier input and output impedances are equal and matched to the generator and load resistances, $S_{21} = V_o/(E_g/2) = 2\Delta_v$. The forward transmission coefficient is then calculated from the amplifier gain as<sup>2</sup>

$$S_{21} = -g_{m-total}(R_F || R_L) \approx -g'_{m2}(R_F || R_L). \tag{1}$$

The amplifier transconductance  $(g_{m-total})$  is approximately the degenerated transconductance of Q<sub>2</sub>,  $g'_{m2} = \frac{i_o}{v_i} \approx \frac{g_{m2}}{1 + g_{m2}(R_E + r_{e2})}$ , where  $r_{e2}$  is the extrinsic emitter resistance of Q<sub>2</sub>, and R<sub>E</sub> is the ballast resistor.

The forward active gain and feedback resistor define the low-frequency input and output impedances of the amplifier ( $R_{out}$ ,  $R_{in}$ ), assuming the active device terminal impedances are >>  $R_{F}$ . This is a valid assumption in a bipolar implementation at low frequency.



Fig. 2-7: Amplifier low frequency signal flow diagram.

<sup>1.</sup> Calculation of the transmittances can be found in Appendix A.

<sup>2.</sup> Eq. 1 and Eq. 2 are derived in appendix A.

Then

$$R_{out} = R_{in} = R_F / (1 - S_{21}). \tag{2}$$

When the amplifier is embedded in a 50-$\Omega$ environment (see Fig. 2-5), $R_{out}=R_{in}=50 \ \Omega$. The amplifier gain and its input and output impedances are approximated by Eq. 1 and Eq. 2. For example, the feedback resistance for a 12-dB gain amplifier (i.e., $S_{21}=-4$) calculated using Eq. 2 is $R_F=250 \ \Omega$, and the transconductance required from the active device calculated from Eq. 1 is 96 mS. Eq. 1 and Eq. 2 are used to calculate $R_F$ and $g_m$ independently of the technology used to implement the amplifier.
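The design step above can be sketched numerically. The following is a minimal illustration of Eq. 1 and Eq. 2, not code from this work; the function names are hypothetical, and the extrinsic emitter resistance `r_e2` defaults to zero purely as an assumption.

```python
def parallel(a, b):
    """Parallel combination of two resistances in ohms."""
    return a * b / (a + b)


def design_feedback_amplifier(gain_db, r0=50.0):
    """Return (S21, R_F, g_m) for a matched shunt-feedback stage.

    Eq. 2 rearranged: R_F = R0 * (1 - S21), with R_in = R_out = R0.
    Eq. 1 rearranged: g_m = -S21 / (R_F || R_L), with R_L = R0.
    """
    s21 = -10 ** (gain_db / 20.0)      # inverting stage, so S21 < 0
    r_f = r0 * (1.0 - s21)
    g_m = -s21 / parallel(r_f, r0)
    return s21, r_f, g_m


def intrinsic_gm(gm_degenerated, r_e, r_e2=0.0):
    """Invert g'_m2 = g_m2 / (1 + g_m2 * (R_E + r_e2)) for g_m2.

    r_e2 = 0 is an illustrative assumption, not a value from the text.
    """
    return gm_degenerated / (1.0 - gm_degenerated * (r_e + r_e2))


s21, r_f, g_m = design_feedback_amplifier(12.0)
print(f"S21 = {s21:.2f}, R_F = {r_f:.0f} ohm, g_m = {1e3 * g_m:.0f} mS")
```

The unrounded 12-dB gain gives values within about 1 Ω and 1 mS of the 250 Ω and 96 mS quoted above, which use $S_{21}=-4$. The helper `intrinsic_gm` indicates how much intrinsic transconductance $Q_2$ must supply once a ballast resistor is added.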

#### **2.3.2 Transistors sizes**

A small-signal model of the Darlington pair for frequencies $\ll f_T$ is shown in Fig. 2-8. The transistor output resistance $r_{out}$ is assumed infinite in the model, and the Miller capacitance $C_{Mbc}$<sup>3</sup> is not included because the transistor impedance is dominated by $C_{be}$ in the amplifier to be designed<sup>4</sup>. The transconductance of the circuit is



Fig. 2-8: Simplified small-signal model of the Darlington pair.

<sup>3.</sup> $C_{Mbc} = C_{bc}(1 + \Delta_v)$, where $C_{bc}$ is the base-collector capacitance and $\Delta_v$ is the voltage gain.

<sup>4.</sup> For a 12-dB gain, $|1/j\omega C_{Mbc}| > |1/j\omega C_{be}|$, because $C_{be}/C_{bc} \approx 10$ for a bipolar transistor.

which is a valid approximation at low frequencies, since the difference $g_{m1} - g_{m2}$ is divided by a term much greater than 1 (e.g., $g_{m1}/sC_2 \approx \omega_T/j\omega$ for equal-area transistors biased at the same current density). The Darlington pair transconductance is therefore dominated by $g_{m2}$. The maximum gain-bandwidth product is obtained when $Q_2$ is biased at the peak-$f_T$ current density ($J_{C,fT}$). Therefore, $Q_2$ must be sized to provide the required transconductance (after introducing $R_E$) when biased at $J_{C,fT}$.
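One way to see why the pair transconductance reduces to $g_{m2}$ at low frequency is sketched below. The sketch assumes that the model of Fig. 2-8 contains only $C_1$ and $g_{m1}$ for $Q_1$, and $C_2$ and $g_{m2}$ for $Q_2$, with the emitter of $Q_2$ at AC ground ($R_E$ neglected). The base-emitter voltages $v_1$ and $v_2$ are notation introduced here for illustration. Kirchhoff's current law at the emitter of $Q_1$ gives $(g_{m1} + sC_1)\,v_1 = sC_2\,v_2$, while $v_i = v_1 + v_2$ and $i_o = g_{m1}v_1 + g_{m2}v_2$. Eliminating $v_1$ and $v_2$:

$$\frac{i_o}{v_i} = \frac{g_{m1}\,sC_2 + g_{m2}\,(g_{m1} + sC_1)}{g_{m1} + sC_1 + sC_2} = g_{m2} + \frac{g_{m1} - g_{m2}}{1 + \dfrac{C_1}{C_2} + \dfrac{g_{m1}}{sC_2}} \approx g_{m2} \quad \text{for } \left|\frac{g_{m1}}{sC_2}\right| \gg 1 .$$

Under the same assumptions, setting $i_{in} = sC_1 v_1$ reproduces the three terms of the current gain in Eq. 4, which suggests the assumed model is consistent with the one used in the text.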

To determine the optimum size for  $Q_1$ , the frequency response of the current gain in the circuit in Fig. 2-8 is calculated as

$$\frac{i_o}{i_{in}} = \frac{g_{m1} \cdot g_{m2}}{sC_1 \cdot sC_2} + \frac{g_{m1}}{sC_1} + \frac{g_{m2}}{sC_2} \quad . \tag{4}$$

The transconductance of a bipolar transistor divided by its input capacitance approximates the device transit frequency ( $\omega_T \cong g_m/C_{in}$ ), which is independent of the transistor area for a given bias current density. Then, Eq. 4 is

$$\frac{i_o}{i_{in}} = \frac{\omega_{T1} \cdot \omega_{T2}}{s^2} + \frac{\omega_{T1}}{s} + \frac{\omega_{T2}}{s} \approx \beta_1 \cdot \beta_2 + \beta_1 + \beta_2 \quad . \tag{5}$$

Equation 5 shows that the Darlington pair current gain is defined by the transit frequency of the transistors, which depends on their bias current densities. Transistor  $Q_2$  is biased at  $J_{C,fT}$  for maximum bandwidth, and it is sized to provide the transconductance required to implement the amplifier gain. However, the area of  $Q_1$  has no effect on the Darlington pair transconductance at low frequencies (Eq. 3), and only its bias current density affects the pair current gain (Eq. 5). In this work, the optimum area of  $Q_1$  is based on its effect on the input impedance, as explained next.
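The $f_T$ doubling mentioned in Section 2.2 can be checked directly from Eq. 5. The sketch below is only an illustration under the assumptions of that equation (single-pole transistors, no parasitics beyond $C_1$ and $C_2$). It finds the frequency at which the magnitude of the pair current gain falls to unity, normalized to the single-transistor $\omega_T$. The function names are hypothetical.

```python
import math


def darlington_current_gain(omega, omega_t1, omega_t2):
    """Eq. 5 evaluated at s = j*omega."""
    s = 1j * omega
    return omega_t1 * omega_t2 / s**2 + omega_t1 / s + omega_t2 / s


def unity_gain_frequency(omega_t1, omega_t2):
    """Frequency where |i_o/i_in| = 1, found by bisection on a log scale.

    |i_o/i_in| decreases monotonically with omega, so one crossing exists
    between the two bracketing frequencies chosen below.
    """
    lo = 0.1 * min(omega_t1, omega_t2)
    hi = 10.0 * max(omega_t1, omega_t2)
    for _ in range(200):
        mid = math.sqrt(lo * hi)  # geometric midpoint
        if abs(darlington_current_gain(mid, omega_t1, omega_t2)) > 1.0:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)


# Equal-area transistors at the same current density: omega_T1 = omega_T2 = 1.
ratio = unity_gain_frequency(1.0, 1.0)
print(f"pair unity-gain frequency / single-transistor omega_T = {ratio:.3f}")
```

For equal transit frequencies the ratio evaluates to about 2.06, in line with the statement that the $f_T$ of the pair is about twice that of a single transistor.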

The small-signal model of a single transistor feedback amplifier, including the transistor base resistance  $r_b$ , is shown in Fig. 2-9. The circuit input impedance is

$$\frac{v_{in}}{i_{in}} = Z_{in} \cong \frac{R_F + R_L}{1 + g_m R_L} + j\omega C \frac{(R_F + R_L)}{(1 + g_m R_L)^2} (g_m R_L r_b - (R_F + R_L)) . \tag{6}$$



Fig. 2-9: Small-signal model of a single-transistor feedback amplifier, including the base resistance.

The real component of the impedance (~50 $\Omega$), which approaches Eq. 2 for $R_F \gg R_L$, is determined by the transconductance (96 mS), load resistance (50 $\Omega$), and feedback resistance (250 $\Omega$). The imaginary component of the impedance is positive or negative depending on the value of the base resistance.
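A short numerical sketch of Eq. 6 shows how the sign of the reactance depends on $r_b$. The capacitance (50 fF), the frequency (30 GHz) and the two base resistances used in the example call are hypothetical placeholders chosen only for illustration. They are not values extracted from the designs in this chapter.

```python
import math


def input_impedance(freq_hz, g_m, r_f, r_l, r_b, c):
    """Evaluate Eq. 6 as written; returns a complex impedance in ohms."""
    omega = 2.0 * math.pi * freq_hz
    loop = 1.0 + g_m * r_l
    real = (r_f + r_l) / loop
    imag = omega * c * (r_f + r_l) / loop**2 * (g_m * r_l * r_b - (r_f + r_l))
    return complex(real, imag)


def crossover_base_resistance(g_m, r_f, r_l):
    """Base resistance at which the imaginary part of Eq. 6 changes sign."""
    return (r_f + r_l) / (g_m * r_l)


G_M, R_F, R_L = 0.096, 250.0, 50.0   # design values from the text
C_ASSUMED = 50e-15                   # hypothetical capacitance, illustration only

print(f"r_b at sign change: {crossover_base_resistance(G_M, R_F, R_L):.1f} ohm")
for r_b in (20.0, 100.0):            # hypothetical base resistances
    z = input_impedance(30e9, G_M, R_F, R_L, r_b, C_ASSUMED)
    print(f"r_b = {r_b:5.1f} ohm -> Z_in = {z.real:.1f} {z.imag:+.1f}j ohm")
```

With the design values above, Eq. 6 gives a real part of about 51.7 Ω and a reactance that changes sign at $r_b = (R_F + R_L)/(g_m R_L) = 62.5\ \Omega$: capacitive below this value and inductive above it.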

Variation of the input impedance with respect to $r_b$ is verified using simulations. A single-transistor, common-emitter amplifier (i.e., not a Darlington pair) is simulated from 1 GHz to 100 GHz. The transistor width is kept constant at 90 nm, and its length is varied from 2.5 µm to 6.5 µm in 1-µm increments. The active device is biased at the peak-f<sub>T</sub> current density for every size, and the ballast resistor value is changed in each case to maintain a 96-mS transconductance. The amplifier input reflection coefficient ($S_{11}$) for each simulation is plotted on a Smith chart in Fig. 2-10. The emitter length ($L_e$) and base resistance ($r_b$) for each simulation are indicated in the figure. At low frequencies, $Z_{in}$ equals 50 $\Omega$, as predicted by Eq. 2 and Eq. 6. At higher frequencies, the base resistance changes $Z_{in}$ according to Eq. 6. Thus, with $Q_1$ biased at $J_{C,fT}$, its area has a negligible effect on the amplifier gain at low frequencies but does affect the amplifier input impedance.

Fig. 2-10: S11 of a single transistor feedback amplifier.

The first broadband amplifier benchmark circuit is designed for 12-dB gain. The circuit requires 250- $\Omega$  feedback resistor and 96 mS transconductance, as previously calculated. A 90 nm x 7  $\mu$ m transistor and a 3.5- $\Omega$  resistor are chosen to implement Q<sub>2</sub> and R<sub>E</sub>, respectively. The transconductance  $g'_{m2}$  obtained from simulations with the transistor biased at peak-f<sub>T</sub> current density (J<sub>C,fT</sub> ~22 mA/ $\mu$ m<sup>2</sup> [1]) is 96 mS. Simulations of the Darlington amplifier with Q<sub>1</sub> implemented with a 90-nm x 4.2- $\mu$ m transistor show a capacitive input impedance at frequencies above 30 GHz, which is compensated by the input transmission line that connects the amplifier to the IC input pad. The reflection coefficient of the amplifier benchmark circuit, including the input TL, is presented in Section 2.5.

# 2.4 Bandwidth enhancement

The circuit shown in Fig. 2-5 was designed for a 12-dB gain and 50-$\Omega$ input/output matching in Section 2.3, and it is referred to as the reference amplifier. In this section, the frequency response of the circuit is studied. Two methods to increase the bandwidth of the reference amplifier are investigated, which result in two other benchmark wideband amplifiers, referred to as the series-peaked and cascoded amplifiers.

If the base-collector capacitances ($C_{bc}$) of $Q_1$ and $Q_2$ (see Fig. 2-5) are included in the calculation of the amplifier transfer function, the resulting equation provides little insight into the circuit. A different approach is therefore used to quantify the effect of these parasitics on the amplifier bandwidth. The circuit time constants, which are inversely proportional to the amplifier cut-off frequency, are obtained first. Then, the contribution of the circuit parasitics to these time constants is quantified to identify those which limit the bandwidth.

The simplified small-signal circuit of the amplifier (Fig. 2-11 [18]) is used to calculate the time constants. The base resistance of $Q_1$ affects the $R_F$ and $R_G$ branches, but the effect of $r_b$ in the feedback loop can be ignored assuming $r_b \ll R_F$. The base resistance of $Q_1$ and the generator resistance ($R_G$) are added in series ($R_{GT}$). The overall Miller capacitance ($C_{\mu}$) has contributions from $Q_1$ (32 %) and $Q_2$ (68 %), and its value is extracted from simulations of the reference amplifier. The total capacitance from the base of $Q_1$ to ground ($C_{in}$), and from the collectors of $Q_1$ and $Q_2$ to ground ($C_L$), are extracted from the same simulations. The load resistance ($R_{LT}$) is the parallel combination of $R_L$ and the output resistance of $Q_2$, and it equals $R_L$ when $r_o \gg R_L$, which is a valid assumption in a bipolar implementation.

The -3 dB bandwidth is inversely proportional to the circuit time constant<sup>5</sup> $\tau_i = C_{in}R_{GT} + (1 + g'_{m2}R_{LT})C_{\mu}R_{GT} + C_{\mu}R_{LT} + C_{L}R_{LT}$, which comprises four time constants whose values were determined using parasitics extracted from simulations of the reference amplifier. The contributions of the components to $\tau_i$ are $\tau_1 = C_{in}R_{GT}$ (46 %), $\tau_2 = (1 + g'_{m2}R_{LT})C_{\mu}R_{GT}$ (42.4 %), $\tau_3 = C_{\mu}R_{LT}$ (2.9 %), and $\tau_4 = C_{L}R_{LT}$ (8.7 %).



Fig. 2-11: Simplified small-signal circuit used for the frequency response analysis.

<sup>5.</sup> The derivation of  $\tau_i$  can be found in appendix B.

Time constants $\tau_1$ and $\tau_2$ dominate $\tau_i$. Because reducing $\tau_i$ increases the circuit -3 dB bandwidth, $\tau_1$ and $\tau_2$ need to be reduced to increase the bandwidth. Notice that the amplifier gain affects $\tau_2$ only. Circuit modifications are used to reduce $\tau_1$ and $\tau_2$, which lead to the inductive-peaked and cascoded amplifiers.
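The breakdown of $\tau_i$ lends itself to a quick what-if estimate. The sketch below relies only on the stated proportionality between bandwidth and $1/\tau_i$ and on the percentage shares quoted above. The scaling factor applied to $\tau_2$ is a hypothetical example rather than a result from this work, and the function names are illustrative.

```python
def time_constant_terms(c_in, c_mu, c_l, r_gt, r_lt, gm_deg):
    """Return the four terms of tau_i in seconds for SI-unit inputs."""
    return {
        "tau1": c_in * r_gt,
        "tau2": (1.0 + gm_deg * r_lt) * c_mu * r_gt,
        "tau3": c_mu * r_lt,
        "tau4": c_l * r_lt,
    }


def bandwidth_change(shares, scale):
    """Relative -3 dB bandwidth change when each term is scaled.

    shares: contribution of each term to tau_i (any common unit).
    scale:  multiplier per term name; terms not listed are left unchanged.
    Uses only the proportionality bandwidth ~ 1 / tau_i.
    """
    old = sum(shares.values())
    new = sum(value * scale.get(name, 1.0) for name, value in shares.items())
    return old / new - 1.0


# Shares of tau_i quoted in the text for the reference amplifier (percent).
reference_shares = {"tau1": 46.0, "tau2": 42.4, "tau3": 2.9, "tau4": 8.7}

# Hypothetical what-if: halve the Miller term tau2, leave the others untouched.
change = bandwidth_change(reference_shares, {"tau2": 0.5})
print(f"bandwidth change with tau2 halved: {100 * change:+.1f} %")
```

Halving $\tau_2$ alone would raise the estimated bandwidth by roughly 27 % under this first-order model, which illustrates why the Miller term is a natural target for the cascode modification.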

# 2.4.1 Inductive peaking

The bandwidth of the Darlington feedback amplifier can be extended using inductors in the feedback [19], input [20], and output [21] to peak the response. However, the trade-off between increased bandwidth and gain flatness needs to be considered. A comparison of the bandwidth improvement and peaking, obtained using an inductor in three different locations in the circuit, is shown in Fig. 2-12. In the figure, the percentage increase in the bandwidth with respect to the reference amplifier (Fig. 2-5), and the percentage increase in  $|S_{21}|$  caused by peaking of the frequency response with respect to  $S_{21}$  at 1 GHz are noted. The largest bandwidth improvement is obtained when an inductor is connected in series with the base of  $Q_1$ , which results in a 25 % greater bandwidth and 2% peaking of the forward gain. Notice that introducing  $L_1$  reduces the loading of  $C_{in}$  on the input signal source.



Fig. 2-12: Options to implement inductive peaking.



Fig. 2-13: Inductive series-peaked broadband amplifier.



Fig. 2-14: Custom made inductor for the inductive series-peaking of a broadband amplifier.

The series-peaked amplifier (Fig. 2-13) is implemented using the 72-pH inductor shown in Fig. 2-14. The top three metal layers (M6, M7, and M8) are used to implement the 12- $\mu$ m x 12- $\mu$ m, two-turn octagonal inductor. It is shielded from the substrate using an M1 shield (Fig. 2-14). Electromagnetic (EM) simulations using the 2.5D Method-of-Moments (MoM) predicted a self-resonant frequency of 196 GHz.

# 2.4.2 Cascoding

An alternative way to mitigate the Miller effect in the amplifier is the use of a cascode transistor (see [22], [23] for examples). The schematic of a Darlington amplifier with $Q_2$ cascoded by $Q_3$ is shown in Fig. 2-15. The on-chip decoupling capacitor ($C_D$) reduces the impedance from the base of $Q_3$ to ground. Note that the amplifier power consumption is unchanged after adding $Q_3$, because neither the supply voltage nor the bias current is increased. Cascoding $Q_2$ with $Q_3$ reduces the capacitive contribution of $Q_2$ to $C_{\mu}$ by reducing the Miller effect, thereby reducing $\tau_2$ and $\tau_3$ and increasing the amplifier bandwidth. Combining inductive series-peaking and cascoding via $Q_3$ extends the amplifier bandwidth by 53 % compared to that of the reference amplifier (Fig. 2-5).

Transistor $Q_3$ affects more than the bandwidth of the amplifier. A Darlington amplifier with a time delay in the collector current of $Q_2$ is shown in Fig. 2-16. The simulated forward and reverse transmission coefficients for the delay $\tau_{Q3}$ set to 0 and 1 ps are plotted in Fig. 2-17. The delay inserted at the collector of $Q_2$ degrades the reverse isolation of the amplifier, as seen from the simulation. A delay within the loop also affects the stability of the design, which is verified by simulations of the gain margin (GM) and phase margin (PM) for the three amplifiers. GM and PM decrease from the reference (GM = 15 dB, PM = 90°) to the inductive-peaked (GM = 11 dB, PM = 81°) and the cascoded (GM = 5 dB, PM = 76°) designs.

Fig. 2-16: Darlington amplifier with a delay introduced in the collector current of Q2.

Fig. 2-17: Simulated effect of a delay in the Darlington pair amplifier.

# 2.4.3 Amplifier noise figure

The reference (Fig. 2-5) and series-peaked (Fig. 2-13) amplifiers have similar noise figures (NF), which are higher than the NF of the cascode amplifier. The base-collector bias voltage of $Q_2$ ($V_{BC,Q2}$) in the reference amplifier (Fig. 2-5) equals $V_{BB}$-$V_{BE,Q1}$-$V_{CC}$, where $V_{BE,Q1}$ is ~0.9 V when $Q_1$ is biased at the peak-$f_T$ current density. The voltage $V_{BC,Q2}$ is reduced when $Q_3$ cascodes $Q_2$, assuming that the supply voltage is kept constant. The effect of $Q_3$ on the noise figure is evaluated by comparing two circuits: a common-emitter transistor and a cascoded transistor (shown in Fig. 2-18a). The NF values from 1 GHz to 20 GHz for both circuits are plotted in Fig. 2-18b. The NF of the common-emitter transistor, which is biased at 1.2 V reverse base-collector voltage, is considered first. The base-collector depletion layer extends into the base (Early effect) when $V_{BC}$ increases. A wider BC depletion layer increases the internal base resistance and the number of scattering events due to the high electric field [24]. Therefore, the transistor internal base resistance at 1.2 V $V_{BC}$ increases with respect to lower $V_{BC}$, which increases the noise contribution of $Q_2$. If the reverse base-collector voltage is reduced to 0.1 V, the internal base resistance and NF decrease. In a common-emitter configuration, increasing $V_{BC}$ from 0.1 to 1.2 V raises the NF by 0.6 dB.

Fig. 2-18: Noise figure comparison between a common-emitter and a cascode topology.

In a cascode configuration, $V_{BC,Q2}$ is set by the base voltage of $Q_3$ ($V_{CASC}$) minus its $V_{BE}$. Notice that when $Q_3$ is biased at $J_{C,fT}$, its $V_{BE}$ approaches 0.9 V. The value of $V_{CASC}$ is selected to set $V_{BC,Q2}$ to 0.1 V. The NF of the cascode stage is shown in Fig. 2-18b. The contribution of $Q_3$ to the NF is negligible, because the cascode stage NF value equals the NF of $Q_2$ (alone) biased at the same $V_{BC}$. Thus, the cascode amplifier is expected to have a lower NF compared to the reference and series-peaked amplifiers.

# 2.4.4 Amplifier linearity

Odd-order distortion at low input power levels creates third-order intermodulation distortion (IM3). The cascode amplifier IM3 was simulated for different voltages across the base-collector junction of  $Q_2$  ( $V_{BC_Q2}$ ), controlled via  $V_{CASC}$  as shown in Fig. 2-19a.



Fig. 2-19: Schematic of a cascode amplifier, and Cjc vs. base-collector junction reverse bias voltage

The amplifier IM3 depends on the (non-linear) base-collector depletion capacitance ($C_{jc}$) [25]. Its variation vs. the collector-base bias voltage ($V_{CBbias}$) is shown in Fig. 2-19b. At high $V_{CBbias}$, $C_{jc}$ remains nearly constant when the base-collector voltage is modulated by the output voltage. At reduced $V_{CBbias}$, however, the value of $C_{jc}$ varies with the base-collector voltage, which causes distortion.

The amplifier has a fixed supply voltage, but $V_{CASC}$ can be used to set the collector-base bias voltages of $Q_2$ and $Q_3$. The amplifier IM3 vs. input power, for $V_{CB,Q2} = -0.3, 0, 0.3, \text{ and } 0.7 \text{ V}$, is plotted in Fig. 2-20. At a given input power ($P_{in}$), IM3 reduces when $V_{CB,Q2}$ increases (e.g., see $P_{in}$ = -15 dBm, and $V_{CB,Q2}$ from -0.3 V to 0 V). Increasing $V_{CB,Q2}$ beyond 0 V degrades IM3 because $V_{CB,Q3}$ reduces (see $P_{in}$ = -15 dBm, and $V_{CB,Q2}$ from 0.3 V to 0.7 V).

Fig. 2-20: Third-order intermodulation distortion for different  $V_{BC}$  in the cascode amplifier, simulation results.

Better linearity is expected in the reference and inductive-peaked amplifiers compared to the cascode amplifier, because they have higher $V_{BC,Q2}$. The cascode amplifier IP3 can be improved at the cost of higher supply voltage and increased NF. Improvement of IP3 could be investigated following a similar approach to [22], which implements a Darlington cascode amplifier in GaAs PHEMT and GaN HEMT technologies. The PHEMT devices have a quadratic relationship between I<sub>D</sub> and $V_{GS}$, which generates fewer harmonics than the exponential relationship between I<sub>C</sub> and $V_{BE}$ in the bipolar devices. Moreover, the higher breakdown voltage ($BV_{DG} > 14 V$) of the 0.5-µm PHEMT used in [22] allowed the use of a 5-V supply to obtain 44.3 dBm OIP<sub>3</sub> (measured) after adding a load at the gate of the cascode transistor. The impedance tuning circuit at the gate of the cascode transistor improved IP3 by 2 dB with respect to the conventional Darlington amplifier implemented in the same technology.

# 2.5 Amplifier measurement and characterization

Three Darlington amplifiers have been implemented in IBM-9HP SiGe-BiCMOS technology (300/350 GHz  $f_T/f_{max}$  [1]). The reference (Fig. 2-5), series-peaked (Fig. 2-13) and cascode (Fig. 2-15) amplifiers have been configured for on-wafer probing using GSG probes for the RF path, and DC probes for biasing. The photomicrograph of the cascode amplifier is shown in Fig. 2-21, which has a total area of 0.20 mm<sup>2</sup> (including pads), of which only 0.003mm<sup>2</sup> is required by the active circuitry (i.e., peaking inductor, biasing, transistors and feedback resistor). The base of Q<sub>3</sub> is biased at a voltage V<sub>CASC</sub>=1.85 V, and the supply V<sub>CC</sub>=2.1 V for all three prototypes. Each amplifier draws 23 mA from the supply. The Agilent N5251A vector network analyzer was calibrated using the TRL method to de-embed the cables and probes. Effects of the pad parasitics and the on-chip input and output



Fig. 2-21: Chip photomicrograph of the cascode amplifier.

transmission lines were included in the amplifier design, and the following measurements include their effects.

Simulations and measurements of the forward transmission coefficient (|S21|) from 1 to 110 GHz are shown in Fig. 2-22, and the -3 dB bandwidth is indicated for each of the amplifiers. The measured feedback resistor ( $R_F$ ) is 246  $\Omega$ , only 1.6 % below the design value of 250  $\Omega$ . The low-frequency gain is 12 dB for the three



Fig. 2-22: Measured (solid line) vs. simulated (dashed)  $|S_{21}|$  for the three amplifiers.



Fig. 2-23: Measured (solid) vs. simulated (dashed)  $|S_{12}|$  for the three amplifiers.

amplifiers, as designed using Eq. 2. Simulations for the reference (Fig. 2-5), series-peaked (Fig. 2-13) and cascode (Fig. 2-15) amplifiers predict -3 dB bandwidths of 79, 106 and 123 GHz, respectively, and show excellent agreement with the measured values for the reference (80 GHz) and series-peaked (100 GHz) amplifiers. The -3 dB bandwidth of the cascode amplifier exceeds the maximum frequency of the VNA (i.e., 110 GHz); a 123-GHz bandwidth is predicted from simulation.

The reference and series-peaked amplifiers have relatively flat reverse transmission coefficients (|S12| in Fig. 2-23), with a variation from -16 dB to -14 dB between 1 and 110 GHz. The behavior is different for the cascode amplifier, changing from -16 dB to -8.7 dB over the same frequency range. The difference may arise from the delay introduced in the forward path by the cascode transistor  $Q_3$ , as shown in Fig. 2-17.

Simulated and measured input (IRL) and output (ORL) reflection coefficients are plotted on a Smith chart in Fig. 2-24 for the cascode amplifier. The ORL is inductive at low frequencies due to the 110-µm transmission line, while the input is a combination of the 150-µm input transmission line and the base resistance as shown



Fig. 2-24: Measured (solid) and simulated (dashed) input and output reflection coefficients for the cascode amplifier.

by Eq. 6 in Section 3.1.2. Furthermore, all amplifiers have a measured input and output return loss better than 10 dB within their -3 dB bandwidth.

The three amplifiers tested are unconditionally stable (i.e., k > 1 and  $|\Delta| < 1$  according to Rollett's stability factor [26]). The series-peaked and cascode amplifier stability factors (k) and determinants of the measured S-parameters ( $\Delta$ ) are plotted in







Fig. 2-26: Group delay extracted from measured S-parameters for the three amplifiers.

Fig. 2-25a. In the figure, k and  $\Delta$  of the reference amplifier follow those of the series-peaked closely, and have been omitted to simplify the plots. The stability factor  $\mu$  [27] was calculated using the measured data for the 3 amplifiers, and simulated data for the cascode amplifier. The results are shown in Fig. 2-25b. Multiple amplifiers can be cascaded due to their stability.
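As a minimal sketch of how these stability figures are obtained from S-parameters at one frequency point, the standard expressions for Rollett's k, the determinant $\Delta$, and the geometrically derived $\mu$ factor can be evaluated as follows. The S-parameter values in the example are hypothetical (roughly 12-dB gain, -16 dB reverse transmission and 20-dB return loss, all with zero phase) and are not measured data from this work.

```python
def stability_factors(s11, s12, s21, s22):
    """Return (k, |delta|, mu) for one frequency point of a two-port."""
    delta = s11 * s22 - s12 * s21
    k = (1 - abs(s11) ** 2 - abs(s22) ** 2 + abs(delta) ** 2) / (2 * abs(s12 * s21))
    mu = (1 - abs(s11) ** 2) / (abs(s22 - delta * s11.conjugate()) + abs(s12 * s21))
    return k, abs(delta), mu


# Hypothetical single-frequency example (linear magnitudes, zero phase).
k, mag_delta, mu = stability_factors(
    complex(0.1), complex(0.16), complex(4.0), complex(0.1)
)
unconditionally_stable = k > 1 and mag_delta < 1
print(k, mag_delta, mu, unconditionally_stable)
```

Either test (k > 1 together with $|\Delta| < 1$, or $\mu > 1$ alone) indicates unconditional stability at that frequency; in practice the check is repeated at every measured frequency point.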

Constant group delay over frequency is desired to avoid the distortion created by the dispersion of the signal, which can cause intersymbol interference in digital data transmission. Fig. 2-26 shows the measured group delay of the amplifiers, along with the simulation for the cascode amplifier. Variation of the group delay is within  $\pm 1$  ps.
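Group delay is the negative derivative of the transmission phase with respect to angular frequency. The sketch below shows one way it could be extracted from sampled $S_{21}$ data using phase unwrapping and central differences; the function name and the ideal 10-ps delay line used as input are illustrative assumptions, not the extraction procedure used for Fig. 2-26.

```python
import cmath
import math


def group_delay(freqs_hz, s21):
    """Group delay (s) at interior frequency points: -d(phase)/d(omega)."""
    phase = [cmath.phase(s) for s in s21]
    # Unwrap the phase so that 2*pi jumps do not corrupt the derivative.
    for i in range(1, len(phase)):
        while phase[i] - phase[i - 1] > math.pi:
            phase[i] -= 2 * math.pi
        while phase[i] - phase[i - 1] < -math.pi:
            phase[i] += 2 * math.pi
    delays = []
    for i in range(1, len(phase) - 1):
        d_omega = 2 * math.pi * (freqs_hz[i + 1] - freqs_hz[i - 1])
        delays.append(-(phase[i + 1] - phase[i - 1]) / d_omega)
    return delays


# Hypothetical test input: an ideal 10-ps delay sampled from 1 to 110 GHz.
tau = 10e-12
freqs = [1e9 * n for n in range(1, 111)]
s21 = [cmath.exp(-2j * math.pi * f * tau) for f in freqs]
gd = group_delay(freqs, s21)
print(min(gd), max(gd))
```

For measured data the frequency step must be fine enough that the phase changes by less than $\pi$ between adjacent points, otherwise the unwrapping step is ambiguous.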

The noise figure of each amplifier was measured up to 18 GHz (the noise source was available only up to that frequency). The results are presented in Fig. 2-27 and compared to simulation results. Lower NF is obtained from the cascode amplifier due to the lower noise contribution from  $Q_2$  when biased at smaller  $V_{BC_Q2}$ , as discussed in Section 2.4.3 and shown in Fig. 2-18. Simulations show that the dominant sources of



Fig. 2-27: Measured (solid) and simulated (dashed) noise figure (NF) for the amplifiers.

noise are the base resistances of  $Q_1$ ,  $Q_2$  and the feedback resistor  $R_{F}$ .

Second- and third-order intermodulation distortion (IM2 and IM3, respectively) measured at 5 GHz is shown in Fig. 2-28 for the cascode amplifier, together with simulation results. Furthermore, Table 2-1 lists the 1-dB compression point (P1dB), IIP2 and IIP3 for the three amplifiers. As described in Section 3.2.4, a lower base-collector reverse bias voltage degrades the amplifier linearity. The



Fig. 2-28: Measured (solid) vs. simulated (dashed) linearity for the cascode amplifier.

linearity of the cascode amplifier is improved by increasing the supply voltage and  $V_{CASC}$ . Further improvement is obtained at the cost of higher power consumption. For example, the ballast resistor value can be increased to degenerate  $Q_2$  further, but to maintain the gain, the area and bias current of  $Q_2$  need to be increased too.

|               | P1dB (dBm) | IIP2 (dBm) | IIP3 (dBm) |
|---------------|----------|----------|----------|
| Reference     | -9       | 22       | 14.5     |
| Series-peaked | -9       | 22.5     | 15       |
| Cascode       | -15      | 19       | 12       |

 Table 2-1: Measured amplifier linearity

The fabricated amplifiers are benchmarked against other published amplifiers in Table 2-2.

Distributed amplifiers fabricated in CMOS technology are included in the table for comparison. Reference [28] is a distributed amplifier with 9-dB gain and 92-GHz bandwidth, and reference [29] is a 4-level, tapered distributed amplifier with 14-dB gain and 73.5-GHz bandwidth. Ref. [29] has a GBW/ $P_{DC}$  value that is 0.89 higher than that of [28] (i.e., 4.39 vs. 3.5), but it also occupies a larger area (1.72 mm<sup>2</sup> vs. 0.45 mm<sup>2</sup>). The larger area and smaller GBW/P<sub>DC</sub> compared to the cascode amplifier (i.e., 0.152 mm<sup>2</sup> and 9.1 GHz/mW, respectively) show the performance advantages of the SiGe HBT. Reference [30] uses a multi-stage design of four emitter followers and one cascode stage, achieving an overall -3 dB bandwidth of 84.6 GHz. Its 990-mW power consumption yields a  $GBW/P_{DC}$  of 0.85, which is less than one-tenth of the  $GBW/P_{DC}$  realized by the cascode amplifier (Fig. 2-15). Reference [31] consists of a cascade of stagger-tuned stages equalized for broadband response and low ripple. However, the group delay (GD) of the design has a large variation (±6 ps) compared to the single-stage GD variation (±1 ps for the reference, series-peaked and cascode amplifiers). Furthermore, the 10-dB gain and 102-GHz bandwidth amplifier consumes 73 mW

| Source | Technology / f<sub>T</sub> | Gain (dB) | BW (GHz) | Power dissipation (mW) / V<sub>supply</sub> | FoM = GBW/P<sub>DC</sub> (GHz/mW) | Active area (mm<sup>2</sup>) | NF (dB) | Topology |
|---|---|---|---|---|---|---|---|---|
| Reference, Fig. 2-5 | 90nm SiGe-BiCMOS / 300GHz | 12 | 79 | 48 / 2.1V | 6.58 | 0.152 | 8.8 | Single-stage Darlington. Single-ended |
| Series-peaked, Fig. 2-13 | 90nm SiGe-BiCMOS / 300GHz | 12 | 100 | 48 / 2.1V | 8.3 | 0.152 | 8.9 | Single-stage Darlington. Single-ended |
| Cascode, Fig. 2-15 | 90nm SiGe-BiCMOS / 300GHz | 12 | >110, 123 (sims) | 48 / 2.1V | 9.1, 10.25 (sims) | 0.197, 0.003 (core) | 7.5 | Single-stage Darlington. Single-ended |
| [28] MWCL, 2011 | 45nm CMOS SOI / 350GHz | 9 | 92 | 73.5 / 1.2 | 3.5 | 0.45 | - | Distributed amplifier. Single-ended |
| [29] MTT, 2009 | 90nm CMOS | 14 | 73.5 | 84 / 1.2 | 4.393 | 1.72 | - | Tapered cascaded distributed amplifiers. |
| [30] JSSC, 2007 | 0.18µm SiGe bipolar / 200GHz | 20 | 84.6 | 990 / 5.5V | 0.85 | 0.63 | 21.5 | 4 cascaded EF + 1 cascode. Differential |
| [31] JSSC, 2011 | 0.12μm SiGe-BiCMOS | 10 | 102 | 73 / 2 | 4.425 | 0.29 | 5.8 | Staggered. Single-ended |
| [32] BCTM, 2013 | 0.13μm SiGe-BiCMOS / 200GHz | 20 | >67, 82 (sim) | 92 / 2.7V | 7.28, 8.91 (sim) | 0.28, 0.04 (core) | 6 | 2-stage Darlington. Single-ended |

Table 2-2: Broadband amplifier performance comparison

and has a GBW/P<sub>DC</sub> close to half the value of the cascode amplifier (i.e., 4.425 vs. 9.1). Reference [32] uses a 2-stage Darlington amplifier achieving 20-dB gain and a simulated bandwidth of 82 GHz (measured only up to 67 GHz due to setup limitations), and it has a GBW/P<sub>DC</sub> of 7.28 (8.91 using its simulated -3 dB bandwidth). For comparison, the measured data for the cascode amplifier were used to simulate two cascaded stages. The two-stage amplifier has 23.9-dB gain and 116-GHz bandwidth. The GBW product of the simulation is 1.82 THz and the power consumption is 96 mW. The resulting GBW/P<sub>DC</sub> of 19 exceeds the values of 10.25 and 8.9 for the cascode amplifier and ref. [32], respectively.
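The figure-of-merit values quoted above appear consistent with taking the linear voltage gain times the -3 dB bandwidth in GHz, divided by the DC power in mW. That definition is inferred here from the tabulated numbers rather than stated explicitly in the text, so the following sketch should be read as an assumption; the function name is illustrative.

```python
def gbw_per_pdc(gain_db, bw_ghz, p_dc_mw):
    """Assumed figure of merit: linear voltage gain x bandwidth / DC power (GHz/mW)."""
    voltage_gain = 10 ** (gain_db / 20)
    return voltage_gain * bw_ghz / p_dc_mw


# (gain in dB, bandwidth in GHz, DC power in mW), taken from Table 2-2 and the text.
entries = {
    "Reference": (12, 79, 48),
    "Series-peaked": (12, 100, 48),
    "Cascode (measured, >110 GHz)": (12, 110, 48),
    "Cascode (simulated)": (12, 123, 48),
    "[30]": (20, 84.6, 990),
    "[32] (measured, >67 GHz)": (20, 67, 92),
    "Two cascaded cascode stages": (23.9, 116, 96),
}

for name, (gain_db, bw_ghz, p_dc_mw) in entries.items():
    print(f"{name}: {gbw_per_pdc(gain_db, bw_ghz, p_dc_mw):.2f} GHz/mW")
```

With this assumed definition the computed values agree with the tabulated ones to within rounding.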

Techniques used to extend the -3 dB bandwidth of the amplifier increase the performance capability of a technology. The broadband amplifier without extended bandwidth can be used to track different device parameters and their effect on the circuit performance (e.g.,  $r_b$ ,  $C_{be}$ ,  $C_{bc}$ , etc.). The Darlington amplifier in Fig. 2-5 was designed in four SiGe-BiCMOS technologies following the design procedure of Section 2.3. A comparison of the designed amplifiers is shown in Fig. 2-29 ([1]). All of the simulated amplifiers have 12-dB gain and are matched to 50  $\Omega$  at the input and



Fig. 2-29: Forward transmission coefficient across four SiGe-BiCMOS generations.

output. The amplifier GBW products across technologies are 54 GHz (0.5-μm - 5HP), 110 GHz (180-nm - 7HP), 216 GHz (130-nm - 8HP), and 316 GHz (90-nm - 9HP). The Darlington amplifier serves as an effective technology benchmark circuit because its GBW product provides a means of technology evaluation and can be used to compare technologies.

# 2.6 Summary

The design of a broadband Darlington feedback amplifier was studied in this chapter. Two simple equations (Eq. 1 and Eq. 2) were used for the design of the low-frequency gain and input/output matching. The effect of the transistor sizes on the gain, bandwidth and matching was reviewed. Furthermore, the transistor parameters that limit the amplifier maximum operating frequency have been identified by calculating the dominant circuit time constant. The effects of parasitics were reduced using inductive peaking and adding a cascode stage, obtaining an overall bandwidth improvement of 53% with respect to the (reference) Darlington amplifier. Three prototypes with 12-dB gain and matched to 50- $\Omega$  input/output impedances were fabricated in 90-nm SiGe-BiCMOS technology. The broadband amplifier prototypes validate the low-frequency design equations and bandwidth extension techniques proposed for the Darlington feedback amplifier.

#### References

[1] J.J. Pekarik, J. Adkisson, P. Gray, et al., "A 90nm SiGe BiCMOS technology for mmwave and high-performance analog applications," IEEE Bipolar/BiCMOS Circuits and Technology Meeting, pp. 92-95, Sept. 2014.

[2] S. Mohammadi, J.-W. Park, D. Pavlidis, et al., "Design Optimization and Characterization of High-Gain GaInP/GaAs HBT Distributed Amplifiers for High-bit-rate Telecommunication," *IEEE Transactions on Microwave Theory and Techniques*, vol. 48, no. 6, pp. 1038–1044, Jun. 2000.

[3] J.R. Nelson, "A Theoretical Comparison of Coupled Amplifiers with Staggered Circuits," Proceedings of the Institute of Radio Engineers, vol. 20, no. 7, pp. 1203-1220, July 1932.

[4] K. Joohwa, J.F. Buckwalter, "Staggered Gain for 100+ GHz Broadband Amplifiers," *IEEE Journal of Solid-State Circuits*, vol. 46, no. 5, pp. 1123-1136, May 2011.

[5] Z. Xuan, R. Ding, T. Baehr-Jones, M. Hochberg, "A 92 mW, 20 dB gain, Broadband Lumped SiGe amplifier with Bandwidth Exceeding 67 GHz," Proceedings of the IEEE-BCTM, Bordeaux FR, pp. 107-110, Oct. 2013.

[6] E.M. Cherry, D.E. Hooper, "The Design of Wide-band Transistor Feedback Amplifiers," Proceedings of the Institution of Electrical Engineers, vol. 110, no. 2, pp. 375-389, February 1963.

[7] N. Ishihara, O. Nakajima, H. Ichino, Y. Yamauchi, "9 GHz Bandwidth, 8-20 dB Controllable-gain Monolithic Amplifier Using AlGaAs/GaAs HBT Technology," *Electronics Letters*, vol. 25, no. 19, pp. 1317-1318, Sept. 1989.

[8] Y.M. Greshishchev, P. Schvan, "A 60-dB Gain, 55-dB Dynamic Range, 10-Gb/s Broad-band SiGe HBT Limiting Amplifier," *IEEE Journal of Solid-State Circuits*, vol. 34, no. 12, pp. 1914-1920, Dec. 1999.

[9] D. A. Hodges, "Darlington's Contributions to Transistor Circuit Design," *IEEE Transactions on Circuits and Systems I: Fundamental Theory and Applications*, vol. 46, no. 1, pp. 102–104, Jan. 1999.

[10] J.F. Kukielka and C.P. Snapp, "Wideband Monolithic Cascadable Feedback Amplifiers Using Silicon Bipolar Technology," *Monolithic Microwave Integrated Circuits*, IEEE Press, 1985, pp. 330-331.

[11] M.-D. Tsai, C.-S. Lin, C.-H. Lien, and H.Wang, "Broad-band MMICs Based on Modified Loss-compensation Method Using 0.35-µm SiGe BiCMOS Technology," *IEEE Transactions on Microwave Theory and Techniques*, vol. 53, no. 2, pp. 496–505, Feb. 2005.

[12] S.-H.Weng, H.-Y. Chang, and C.-C. Chiong, "A DC–21 GHz Low Imbalance Active Balun Using Darlington Cell Technique for High Speed Data Communications," *IEEE Microwave and Wireless Components Letters*, vol. 19, no. 11, pp. 728–730, Nov. 2009.

[13] J. Lee and J. D. Cressler, "Analysis and Design of an Ultra-wideband Low-noise Amplifier Using Resistive Feedback in SiGe HBT Technology," *IEEE Transactions on Microwave Theory and Techniques*, vol. 54, no. 3, pp. 1262–1268, Mar. 2006.

[14] K. W. Kobayashi, Y. C. Chen, I. Smorchkova, et al., "1-Watt Conventional and Cascoded GaN-SiC Darlington MMIC Amplifiers to 18 GHz," IEEE RFIC Symposium, June 2007, pp. 585–588.

[15] K. W. Kobayashi, R. Esfandiari, and A. K. Oki, "A Novel HBT Distributed Amplifier Design Topology Based on Attenuation Compensation Techniques," *IEEE Transactions on Microwave Theory and Techniques*, vol. 42, no. 12, pp. 2583–2589, Dec. 1994.

[16] C. T. Armijo and R. G. Meyer, "A new wide-band Darlington amplifier," in *IEEE-JSSC*, vol. 24, no. 4, pp. 1105-1109, Aug 1989.

[17] W. Shou-Hsien, C. Hong-Yeh, C. Chau-Ching, W. Yu-Chi, "Gain-Bandwidth Analysis of Broadband Darlington Amplifiers in HBT-HEMT Process," *IEEE Transactions on Microwave Theory and Techniques*, vol. 60, no. 11, pp. 3458-3473, Nov. 2012

[18] L. Vera, J.R. Long, B.J. Gross, "A Low-power SiGe Feedback Amplifier with Over 110GHz Bandwidth," Proceedings of IEEE-BCTM, San Diego CA, pp.1-4, Sept. 2014.

[19] D. Costa, A. Khatibzadeh, "A Wideband AlGaAs/GaAs Heterojunction Bipolar Transistor Amplifier Optimized for Low-near-carrier-noise Applications up to 18 GHz," IEEE MTT-S Digest, pp.1645-1648 vol.3, May 1994.

[20] H.S. Tsai, R. Kopf, R. Melendes, et al., "90 GHz Baseband Lumped Amplifier," *Electronics Letters*, vol. 36, no. 22, pp. 1833-1834, Oct 2000.

[21] S.S. Mohan, M.D.M. Hershenson, S.P. Boyd, T.H. Lee, "Bandwidth Extension in CMOS with Optimized on-chip Inductors," *IEEE Journal of Solid-State Circuits* vol.35, no.3, pp. 346-355, March 2000.

[22] K.W. Kobayashi, "Linearized Darlington Cascode Amplifier Employing GaAs PHEMT and GaN HEMT Technologies," *IEEE Journal of Solid-State Circuits* vol. 42, no. 10, pp. 2116-2122, Oct. 2007.

[23] H. Shanwen, W. Zhong, G. Huai, G.P. Li, "A Novel Darlington Cascode Broadband Drive Power Amplifier in 2µm InGaP/GaAs HBT Technology," Proceedings of Wireless and Microwave Technology Conference, pp. 1-6, April 2012.

[24] M. Schroter and A. Chakravorty, "Compact Hierarchical Modeling of Bipolar Transistors with HICUM", World Scientific, Singapore, ISBN 978-981-4273-21-3, 2010.

[25] H. E. Abraham and R. G. Meyer, "Transistor design for low distortion at high frequencies," in IEEE Transactions on Electron Devices, vol. 23, no. 12, pp. 1290-1297, Dec 1976.

[26] J. Rollett, "Correction to Stability and Power-Gain Invariants of Linear Two Ports," *IEEE Transactions on Circuit Theory*, vol.10, no.1, pp. 107-107, March 1963.

[27] M. L. Edwards and J. H. Sinsky, "A new criterion for linear 2-port stability using a single geometrically derived parameter," in IEEE Transactions on Microwave Theory and Techniques, vol. 40, no. 12, pp. 2303-2311, Dec 1992.

[28] K. Joohwa and J.F. Buckwalter, "A 92 GHz Bandwidth Distributed Amplifier in a 45 nm SOI CMOS Technology," *IEEE Microwave and Wireless Components Letters*, vol. 21, no. 6, pp. 329-331, June 2011.

[29] A. Arbabian, A.M. Niknejad, "Design of a CMOS Tapered Cascaded Multistage Distributed Amplifier," *IEEE Transactions on Microwave Theory and Techniques*, vol. 57, no. 4, pp. 938-947, April 2009.

[30] S. Trotta, H. Knapp, K. Aufinger, et al., "An 84 GHz Bandwidth and 20 dB Gain Broadband Amplifier in SiGe Bipolar Technology," *IEEE Journal of Solid-State Circuits*, vol. 42, no. 10, pp. 2099-2106, Oct. 2007.

[31] K. Joohwa, J.F. Buckwalter, "Staggered Gain for 100+ GHz Broadband Amplifiers," *IEEE Journal of Solid-State Circuits*, vol. 46, no. 5, pp. 1123-1136, May 2011.

[32] Z. Xuan, R. Ding, T. Baehr-Jones, M. Hochberg, "A 92 mW, 20 dB gain, Broadband Lumped SiGe amplifier with Bandwidth Exceeding 67 GHz," Proceedings of the IEEE-BCTM, Bordeaux FR, pp. 107-110, Oct. 2013.

3

# **FREQUENCY MULTIPLICATION**

Frequency multipliers are necessary for power generation in the upper mm-wave and sub-mm-wave frequencies. The limitation in the output power of amplifiers and the tuning range of oscillators operating near the device cut-off frequency make multipliers an important component in higher frequency transmitters. Distribution and generation of local oscillator carriers at lower frequencies (to optimize phase noise) and scaling to higher frequencies using multiplication is attractive from a system design perspective.

# **3.1** Passive and active frequency multipliers

Frequency multipliers may be classified as passive or active. Passive multipliers often use a single non-linear device for harmonic generation (e.g., a transistor or diode), followed by a gain stage and passive narrowband filtering to select the desired overtone. Amplification is required to restore the RF signal amplitude lost in the conversion process, which consumes DC power. Because of their simplicity and efficiency in narrowband applications, passive multipliers have been designed which operate up to sub-mm-wave frequencies [1], [2]. However, input-to-output isolation for passive multipliers can be poor without high-order, narrowband filtering, which increases the chip area and manufacturing cost for the complete circuit.

Active multipliers provide conversion gain when translating the input to the desired output frequency. Isolation and suppression of potential spurious signals can be realized by selection of the circuit topology [3], which simplifies integration of an oscillator with up-conversion blocks in a transceiver. This relaxes the filtering requirements needed to control output spurs at harmonics of the input frequency, making it possible to realize higher spectral purity across a wide band of frequencies within a compact area when compared to passive multipliers.

# 3.1.1 Active frequency multiplier topologies

Transistor-based multipliers typically use the translinear transfer of a differential pair to generate energy at the desired harmonic. Gilbert [4] devised a symmetric circuit topology for a translinear multiplier that suppresses even-order spurious components and cancels signal feedthrough, thereby avoiding the need for high-order post-multiplication filters that bandlimit the frequency response.

The Gilbert multiplier (shown in Fig. 3-1) was benchmarked in many silicon technologies, with examples realizing 16-dB maximum conversion gain (CG<sub>max</sub>) from DC-17 GHz using Si-BJTs [3], 18-42 GHz bandwidth with 8.6-dB CG<sub>max</sub> using SiGe-HBTs [5], and across 100 GHz with 1-dB CG<sub>max</sub> in an InP DHBT technology [6]. The main limitations of the circuit, as a broadband frequency multiplier, are its minimum supply voltage and spurious suppression. Undesired harmonic outputs are suppressed well in the 4-quadrant analog multiplier circuit originally proposed by Gilbert. However, output spurious levels increase at odd harmonics of the input when conversion gain and operating frequency are increased through circuit simplifications



Fig. 3-1: Gilbert multiplier.

(e.g., removing the predistortion stage). In addition, overdriving the circuit to maximize the multiplier conversion gain unbalances the circuit and generates unwanted even-order harmonics, and its bandwidth is limited by the difference in response between the 2 inputs of the translinear cascode.

The Gilbert cell topology relies on the cascoding of current-steering stages and requires 2.7 V of headroom in a silicon bipolar implementation, or ~1.8 V in an advanced CMOS technology. It is therefore less suitable in the deep-submicron technology era, where ~1 V supplies for CMOS are common. The focus in this chapter is on the low-voltage Kimura multiplier topology [7], which is selected for its low-power capability.

# 3.2 Low-voltage multiplier topology

The multiplier core consists of two asymmetrically biased differential pairs with cross-coupled collectors that realize an even-order transfer function. It is capable of wideband operation as a frequency doubler or quadrupler at bias voltages below 2 V in a bipolar technology. A schematic of the Kimura multiplier core is shown in Fig. 3-2. Initially proposed by Ogawa and Kusakabe as a multiplier in 1978 [8], the circuit was developed by Kimura as a pseudo-logarithmic rectifier [9], and later as a 4-quadrant multiplier using unbalanced differential pairs in MOS and bipolar technologies [7].

A differential input signal  $(V_{i+}-V_{i-})$  drives the multiplier core, which generates a current output  $\Delta I_{out}$  (i.e.,  $I_{op}$ - $I_{om}$ ) at even-order multiples of the input



Fig. 3-2: Frequency doubler core schematic.

signal. The multiplier circuit input loop consists of the signal source and its impedance, and the differential (base) inputs of  $Q_{2,3}$  and  $Q_{1,4}$  (with reduced voltage,  $V_K$  on  $Q_1$  and  $Q_3$ ). The output loop consists of the load at the multiplier output and the collectors of transistors  $Q_{2,4}$  and  $Q_{1,3}$ . The DC offset caused by asymmetry of the bias currents flowing in the core transistors is minimized by setting the core output nodes ( $V_{op}$  and  $V_{om}$  in Fig. 3-2) to the same DC voltage, as discussed later in the chapter.

The differential output current ( $\Delta I_{out}$ ) as a function of the differential input voltage ( $\Delta V_{in}$ ) for the circuit of Fig. 3-2 is given by

$$\Delta I_{out} = \alpha_F I_0 \left[ \tanh\left( \frac{\Delta V_{in} + V_K}{2V_T} \right) - \tanh\left( \frac{\Delta V_{in} - V_K}{2V_T} \right) \right], \tag{1}$$

where  $\alpha_F$  is the forward transfer ratio of the BJT/HBT,  $V_T$  is the thermal voltage kT/q,  $I_0$  is the DC current biasing each differential pair, and  $V_K$  is an offset voltage.  $V_K$  can be implemented using transistors with different emitter areas in each pair or as a DC bias. Either method creates the asymmetry required to realize the transfer function of Eq. 1. When  $V_K$  is developed with unequal emitter areas,  $V_K = \ln(K)V_T$ , where K is the ratio of areas used in the (now) asymmetrically biased differential pairs.

Eq. 1 can be rewritten as

$$\Delta I_{out} = \frac{2\alpha_F I_0 (e^{V_K/V_T} - e^{-V_K/V_T})}{e^{V_K/V_T} + e^{-V_K/V_T} + e^{\Delta V_{in}/V_T} + e^{-\Delta V_{in}/V_T}} = \frac{2\alpha_F I_0 \sinh(V_K/V_T)}{\cosh(V_K/V_T) + \cosh(\Delta V_{in}/V_T)} \quad , \tag{2}$$

using identities for the hyperbolic functions. Taking the first two terms of the Taylor series  $\cosh(x) = \sum_{n=0}^{\infty} \frac{x^{2n}}{(2n)!}$  for  $\cosh(\Delta V_{in}/V_T)$ , Eq. 2 can be simplified to

$$\Delta I_{out} = \frac{2\alpha_F I_0 \sinh(V_K/V_T)}{1 + \cosh(V_K/V_T) + \Delta V_{in}^2/(2V_T^2)}, \tag{3}$$

where the  $\Delta V_{in}^2$  term, which gives rise to the even-order harmonics in the output current, is readily apparent.



Fig. 3-3: Frequency doubler normalized transfer function.

Fig. 3-3 shows a plot of the normalized differential output current from Eq. 1 and Eq. 3 versus input voltage for  $V_T=26$  mV and  $V_K=2.29V_T$ . It is clearly an even-order function that will produce an output rich in even-order harmonics and little odd-order harmonic energy. The error in the approximation of Eq. 3 is less than 10 % for  $|\Delta V_{in}| \le 50 mV$ .
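The relationship between Eq. 1, its closed form Eq. 2 and the quadratic approximation Eq. 3 can be checked numerically. The sketch below normalizes the output current to $\alpha_F I_0$ and uses the same $V_T = 26$ mV and $V_K = 2.29V_T$ as Fig. 3-3; the function names are illustrative and not taken from the original work.

```python
import math

V_T = 0.026          # thermal voltage (V), as used for Fig. 3-3
V_K = 2.29 * V_T     # offset voltage (V)


def delta_i_eq1(dv, v_k=V_K, v_t=V_T):
    """Eq. 1, normalized to alpha_F * I_0."""
    return math.tanh((dv + v_k) / (2 * v_t)) - math.tanh((dv - v_k) / (2 * v_t))


def delta_i_eq2(dv, v_k=V_K, v_t=V_T):
    """Eq. 2 (hyperbolic closed form), normalized to alpha_F * I_0."""
    return 2 * math.sinh(v_k / v_t) / (math.cosh(v_k / v_t) + math.cosh(dv / v_t))


def delta_i_eq3(dv, v_k=V_K, v_t=V_T):
    """Eq. 3 (two-term Taylor approximation of cosh), normalized to alpha_F * I_0."""
    return 2 * math.sinh(v_k / v_t) / (
        1 + math.cosh(v_k / v_t) + dv ** 2 / (2 * v_t ** 2)
    )


def approximation_error(dv):
    """Relative error of Eq. 3 with respect to Eq. 1."""
    exact = delta_i_eq1(dv)
    return abs(delta_i_eq3(dv) - exact) / exact


for dv_mv in (0, 10, 25, 50, 100):
    dv = dv_mv * 1e-3
    print(dv_mv, delta_i_eq1(dv), delta_i_eq3(dv), approximation_error(dv))
```

The error of Eq. 3 grows with $|\Delta V_{in}|$ because the truncated series underestimates $\cosh(\Delta V_{in}/V_T)$, so the approximation overestimates the output current at larger input amplitudes.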

The small-signal transconductance  $g_m = d(\Delta I_{out})/d(\Delta V_{in})$  of an ideal frequency doubler should be a linear function of  $\Delta V_{in}$ <sup>1</sup>. Otherwise, the transfer function from  $\Delta V_{in}$  to  $\Delta I_{out}$  is higher than quadratic in order (i.e., the approximation made to derive Eq. 3 no longer holds) and a single input tone produces significant harmonics above the 2<sup>nd</sup> in the output current. From differentiation of Eq. 1, the transconductance of the circuit in Fig. 3-2 is

$$g_m = \frac{d(\Delta I_{out})}{d(\Delta V_{in})} = \frac{\alpha_F I_0}{2V_T} \left[ \operatorname{sech}^2 \left( \frac{\Delta V_{in} + V_K}{2V_T} \right) - \operatorname{sech}^2 \left( \frac{\Delta V_{in} - V_K}{2V_T} \right) \right]. \tag{4}$$

It has been shown in [7] that a  $V_K$  of 2.29 $V_T$  yields a transconductance that is approximately linear across the widest input voltage range. Fig. 3-4 is derived from Eq. 4. It shows that the peak transconductance decreases by 21 % and that the linear

<sup>1.</sup> Assuming an ideal doubler where  $\Delta I_{out} = \Delta V_{in}^2$ :  $g_m = d(\Delta I_{out})/d(\Delta V_{in}) = 2\Delta V_{in}$ , which is linear.



Fig. 3-4: Doubler transconductance derived from Eq. 4 for 3 different  $V_{\rm K}$  values.

input voltage range is  $|\Delta V_{in}| < 20$ mV when  $V_K$  is set to  $1.29V_T$ , while  $V_K=2.29V_T$  gives a linear  $g_m$  within  $|\Delta V_{in}| < 40$ mV. The highest transconductance (i.e.,  $g_m=(\alpha_F I_0)/(2V_T)$  for  $|\Delta V_{in}| = 90$ mV) is obtained when  $V_K=3.29V_T$  (see Fig. 3-4), but  $g_m$  is then no longer linear within  $|\Delta V_{in}| < 90$ mV.
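The trade-off between peak transconductance and linear range can be explored by evaluating Eq. 4 directly. The following sketch normalizes $g_m$ to $\alpha_F I_0/(2V_T)$ and assumes $V_T = 26$ mV; the search range and step count in `peak_gm` are arbitrary illustrative choices.

```python
import math

V_T = 0.026  # thermal voltage (V), assumed


def gm_normalized(dv, v_k, v_t=V_T):
    """Eq. 4 divided by alpha_F * I_0 / (2 * V_T)."""
    def sech2(x):
        return 1.0 / math.cosh(x) ** 2

    return sech2((dv + v_k) / (2 * v_t)) - sech2((dv - v_k) / (2 * v_t))


def peak_gm(v_k, dv_max=0.25, steps=2500):
    """Largest |g_m| (normalized) found on a grid of input voltages 0..dv_max."""
    return max(abs(gm_normalized(i * dv_max / steps, v_k)) for i in range(steps + 1))


for ratio in (1.29, 2.29, 3.29):
    print(f"V_K = {ratio} V_T: peak normalized |g_m| = {peak_gm(ratio * V_T):.3f}")
```

The normalized peak approaches unity as $V_K$ increases, because the two $\operatorname{sech}^2$ terms of Eq. 4 overlap less, which agrees with the limiting value $g_m = (\alpha_F I_0)/(2V_T)$ quoted for $V_K = 3.29V_T$.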

Higher harmonics are generated when the Kimura core is overdriven by a differential input ( $|\Delta V_{in}|$ ) larger than approximately 1.5V<sub>T</sub>. Transient simulations of the differential output current (normalized to  $\alpha_F I_0$  and the input signal period) for amplitudes  $\Delta V_{in}$  ranging from 10 mV to 250 mV are plotted in Fig. 3-5. The



Fig. 3-5: Large-signal transient simulation results for  $\Delta I_{out}$ .

coefficients of a Fourier series for the DC,  $2^{nd}$ ,  $4^{th}$ , and  $6^{th}$  harmonics in the simulated output current are plotted against the peak differential input voltage in Fig. 3-6. The second harmonic reaches a maximum of  $\sim 0.8\alpha_F I_0$  for  $\Delta V_{in-pk}$  at 135 mV, and the magnitude of the fourth harmonic approaches  $0.15\alpha_F I_0$ . Therefore, the input signal should be backed off in amplitude when the circuit is used as a doubler (e.g., to  $\sim 65 \text{ mV}_{pk-diff}$ ) in order to suppress the (spurious) fourth harmonic at the output.
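The harmonic content can also be estimated from the ideal static transfer function of Eq. 1 driven by a sinusoid. The sketch below is a simplified illustration under that assumption: it ignores all transistor parasitics and frequency dependence, so its values are not expected to match the transient simulation results of Fig. 3-6 exactly. Currents are normalized to $\alpha_F I_0$.

```python
import math

V_T = 0.026        # thermal voltage (V), assumed
V_K = 2.29 * V_T   # offset voltage (V)


def delta_i(dv):
    """Eq. 1, normalized to alpha_F * I_0."""
    return math.tanh((dv + V_K) / (2 * V_T)) - math.tanh((dv - V_K) / (2 * V_T))


def harmonic(amplitude, n, samples=4096):
    """n-th cosine Fourier coefficient of delta_i(A*cos(theta)) over one period."""
    total = 0.0
    for i in range(samples):
        theta = 2 * math.pi * i / samples
        total += delta_i(amplitude * math.cos(theta)) * math.cos(n * theta)
    scale = 1.0 if n == 0 else 2.0
    return scale * total / samples


for amp_mv in (10, 65, 135, 250):
    amp = amp_mv * 1e-3
    h = [abs(harmonic(amp, n)) for n in (1, 2, 3, 4)]
    print(amp_mv, [round(x, 4) for x in h])
```

Because Eq. 1 is an even function of $\Delta V_{in}$, the odd-order coefficients vanish, and the ratio of fourth to second harmonic grows with drive amplitude, which is the reason for backing off the input.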

The current output of the Kimura multiplier is rich in even-order harmonics and (ideally) contains no signal at the fundamental frequency or odd harmonics of the input, as seen in Fig. 3-6. Analysis of the response for the multiplier is therefore simplified by the difference in frequency between currents flowing in the input and output loops.

The isolation between input and output loops due to this frequency difference implies that changes in one part of the circuit (e.g., at the input) do not affect the other (e.g., the output) because the two loops operate independently. Moreover, parasitic feedback from the output back to the input has little effect on the frequency response because the Miller effect at the fundamental frequency is negligible in this circuit. Any residual higher frequency energy fed back from the output (e.g., via the transistor's Miller capacitance) may be filtered out by tuning (i.e., narrowband



Fig. 3-6: Large-signal spectral components vs.  $\Delta V_{\text{in}}.$ 

design), or by driving the core from a low-impedance source in a wideband implementation. The largest tone at the output is one octave away from the input signal for a doubler (even higher for greater multiplier orders), and leakage back to the input is easily suppressed with a low-order filter (e.g., single L-C resonator). Isolation between output and input improves when small-area transistors with minimal parasitics are employed in the core.

# 3.2.1 Core input optimization

Voltage  $V_K$  can be implemented as a DC bias or by using transistors with different emitter areas in each pair. However, the latter approach introduces an imbalance in parasitic capacitance between the input and output of the circuit due to transistor mismatch. Transistors  $Q_1$  to  $Q_4$  are therefore made identical in area, and a DC voltage offset is used to implement  $V_K$ . This minimizes the effects of parasitic mismatch on the multiplier response.

Optimization of the frequency response on the input side begins with a simulation of the multiplier core of Fig. 3-2 using 50- $\Omega$  sources and a short-circuit load to enforce complete isolation between the input and output loops. A total bias current of 9 mA is chosen (i.e., I<sub>0</sub>=4.5 mA), and large-signal Spectre-RF simulations predict that the widest frequency response is realized when transistors Q<sub>1</sub> to Q<sub>4</sub> are 90 nm x 2.5 µm in area.

The transistor's parasitic emitter resistance ( $R_E$ ) decreases the peak output current and increases the input voltage range of the multiplier core as described in [7]. The effect of  $R_E$  (~3  $\Omega$ ) on the multiplier performance is small, even for the small-area transistors used in the actual implementation. Simulations of the prototype circuit's transfer function at low frequency reveal an 8 % drop in peak output current and a 5 % increase in the input voltage range when compared to the predictions of the closed-form solution of Eq. 1.



Fig. 3-7: Small-signal frequency response of  $\Delta I_{out}$ .

The differential output current ($\Delta I_{out}$), normalized to the maximum output current of the doubler, is plotted versus frequency in Fig. 3-7 (indicated as $Q_1$-$Q_4$ only). It peaks near DC and rolls off with increasing frequency, dropping to 20 % of the low-frequency maximum at 36 GHz.

In order to extend the response further, emitter followers $Q_5$ to $Q_8$ are added at the circuit inputs (see Fig. 3-8). The transistors are also 90 nm x 2.5 µm in area and biased close to peak $f_T$ (i.e., 4 mA). They lower the impedance driving the base-emitter capacitance of the core transistors. Simulations predict that the output impedance of the followers (i.e., $Q_5$-$Q_8$ in Fig. 3-8) rises from 16 $\Omega$ (1/gm) at low frequency to 42 + j25 $\Omega$ at 100 GHz. The inductive part of the follower output



Fig. 3-8: Modified circuit to improve its input bandwidth.

impedance peaks the high frequency response of the multiplier core, thereby extending its bandwidth. With this modification, the simulated output current drops by 20% at 104 GHz (i.e., a 70 GHz increase over the circuit without followers), as shown in Fig. 3-7 (i.e., with  $Q_5$ - $Q_8$ ).

Substantial feedback from the collector to the base occurs when the ratio  $X_{Cjc}/r_b$  approaches unity. However,  $X_{Cjc}/r_b=9$  at 100 GHz for the small-area core transistors used in this work (from small-signal simulation). Consequently, feedback from output to input is negligible when a low impedance driving source such as an emitter follower is used.

Emitter follower buffers also simplify the implementation of the offset voltage at the inputs of $Q_1$ and $Q_3$. Resistors $R_1$ and $R_2$ inserted in series with the emitters of $Q_5$ and $Q_7$, respectively (15 $\Omega$ each as seen from Fig. 3-8), lower the base-emitter voltage (V<sub>BE</sub>) of $Q_1$ and $Q_3$ by 60 mV (i.e., V<sub>K</sub>=2.29V<sub>T</sub>) compared to the $V_{BE}$ of $Q_2$ and $Q_4$. Quiescent currents of 0.25 mA and 4.25 mA flow through identically sized diff pairs Q<sub>1</sub>, Q<sub>3</sub> and Q<sub>2</sub>, Q<sub>4</sub>, respectively. However, inserting the series resistors disrupts the anti-phase relationship between I<sub>op</sub> and I<sub>om</sub> due to the low-pass filtering effect of R<sub>1</sub>, R<sub>2</sub> and the input capacitance of transistors Q<sub>1</sub> and Q<sub>3</sub>. This R-C low-pass filter delays the signal current $i_{C1}+i_{C3}$ with respect to $i_{C2}+i_{C4}$ at the output. A pair of 70-pH inductors (i.e., $L_1$ and $L_2$ in Fig. 3-9a) is added in series with the inputs of $Q_1$ and $Q_3$ to extend the frequency response further, as shown by the simulation results in Fig. 3-9b. Inductors $L_1$ and $L_2$ are made series resonant with the transistor input capacitance. The series peaking also compensates for phase distortion in the output currents. A transient simulation with a 50 GHz input signal predicts 11% higher differential and 7% higher common-mode output current after peaking. The increased CMRR (i.e., differential/common-mode ratio) indicates less phase distortion between signal currents $i_{C1}+i_{C3}$ and $i_{C2}+i_{C4}$ after the insertion of L<sub>1</sub> and L<sub>2</sub>.
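The 60 mV offset quoted above can be cross-checked with a short calculation. The sketch below assumes that essentially all of the 4 mA follower bias current flows through the 15 $\Omega$ series resistor and that the junction temperature is 300 K; neither assumption is stated explicitly in the text, and the function names are illustrative only.

```python
# Sketch: V_BE offset produced by the emitter series resistors R1 and R2.
# Assumptions (not from the text): the whole follower bias current flows
# through the resistor, and the temperature is 300 K.
BOLTZMANN = 1.380649e-23           # J/K
ELECTRON_CHARGE = 1.602176634e-19  # C


def thermal_voltage(temp_kelvin: float) -> float:
    """Return the thermal voltage V_T = kT/q in volts."""
    return BOLTZMANN * temp_kelvin / ELECTRON_CHARGE


def resistor_offset(bias_current_a: float, resistance_ohm: float) -> float:
    """Return the DC drop across the series resistor (the offset V_K)."""
    return bias_current_a * resistance_ohm


v_k = resistor_offset(4e-3, 15.0)            # 4 mA through 15 ohm
v_k_normalized = v_k / thermal_voltage(300.0)
print(f"V_K = {v_k * 1e3:.1f} mV = {v_k_normalized:.2f} V_T")
```

Under these assumptions the drop is 60 mV, or roughly 2.3 V<sub>T</sub>, which is close to the 2.29 V<sub>T</sub> quoted in the text; the small difference depends on the temperature assumed for V<sub>T</sub>.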

Chapter 3



Fig. 3-9: Bandwidth improvement of doubler core with series inductors.

For a narrowband (NB) design, a tuned L-C network matching the Kimura core to the RF source on the input side realizes the desired input voltage swing of 100-150 mV, while consuming less RF power than a wideband 50-$\Omega$ termination. Consider the trifilar transformer of Fig. 3-10 as the interface to the multiplier pairs for a NB application. Each pair of transistors is DC biased independently via the center tap (e.g., 1.54 V to Q<sub>1</sub>,Q<sub>3</sub> and 1.6 V to Q<sub>2</sub>,Q<sub>4</sub>) to enforce the desired input offset, V<sub>K</sub>. The RF source is connected to the primary winding with a total self inductance (L<sub>P</sub>) of 70 pH. The real part of the shunt-equivalent impedance of a bipolar differential pair is typically very large at the input (i.e., 5 k$\Omega \parallel$ 16 fF for a 90 nm x



Fig. 3-10: Narrowband input interface to the multiplier core.

1.25  $\mu$ m area HBT), so a termination resistor is still required. Placing the termination at the transformer secondary outputs (e.g.,  $R_{T1}=R_{T2}=250 \ \Omega$  in Fig. 3-10) fulfills this requirement, and the step-up in voltage between primary and secondary reduces loading on the RF source (i.e., turns ratios 1:1.45:1.45). A differential input swing of 120 mV<sub>pk-pk</sub> is realized across the inputs of each pair for an RF input power of -18.4 dBm. The emitter followers used at the input for a wideband circuit are not required in the NB case. Resonant tuning at the input compensates for roll-off effects of the conversion gain due to transistor parasitic capacitance across a narrow frequency range without consuming DC power.
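As a rough plausibility check of the input levels quoted above, the available RF power can be converted to a voltage swing. The sketch below assumes a sinusoidal source delivering its power to a matched 50 $\Omega$ load and an ideal, lossless 1:1.45 voltage step-up; the real transformer, its termination and the tuning capacitors will shift the result, so this is an estimate only and not the simulation used in the thesis.

```python
import math


def dbm_to_peak_volts(power_dbm: float, r_ohm: float = 50.0) -> float:
    """Peak voltage of a sinusoid delivering power_dbm into r_ohm."""
    power_w = 1e-3 * 10 ** (power_dbm / 10)
    return math.sqrt(2 * power_w * r_ohm)


STEP_UP = 1.45  # primary-to-secondary voltage ratio, assumed ideal

v_pk_primary = dbm_to_peak_volts(-18.4)       # at the 50-ohm input
v_pp_secondary = 2 * STEP_UP * v_pk_primary   # peak-to-peak at a secondary
print(f"primary: {v_pk_primary * 1e3:.0f} mV-pk, "
      f"secondary: {v_pp_secondary * 1e3:.0f} mV pk-pk")
```

The estimate gives about 38 mV-pk at the primary and about 110 mV pk-pk at each secondary, which is of the same order as the 120 mV<sub>pk-pk</sub> swing obtained from simulation.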

The trifilar transformer balun of Fig. 3-10 is comprised of an overlay of metals with thicknesses of 3  $\mu$ m (copper primary) and 0.81  $\mu$ m (aluminum secondary), respectively. The oxide thickness between metal layers is 2.13  $\mu$ m, while the secondary resides over 7.125  $\mu$ m thick oxide on a 300  $\mu$ m thick, 10  $\Omega$ -cm silicon substrate [10]. Simplified, equivalent circuit parameters for the transformer are listed on the schematic of Fig. 3-10. Tuning capacitors C<sub>S1</sub> and C<sub>S2</sub> are padded by the

parasitic capacitance of the transformer windings to equalize the capacitance across the secondaries. A coupling coefficient ( $k_m$ ) of 0.7 at low frequency (1 MHz) is predicted between the windings from electromagnetic simulations. The fractional bandwidth at the input is 22%, i.e., 10 GHz bandwidth centered at 45 GHz, and the simulated passband at the input ranges from 39 GHz to 49 GHz with  $|S_{11}| < -10$  dB.

#### **3.2.2 Output load**

At the output, DC offset between the quiescent currents of  $Q_1$  and  $Q_3$  with respect to  $Q_2$  and  $Q_4$  of the multiplier core must be eliminated in order to realize a wideband circuit that can be DC coupled to a load (e.g., an output buffer). The offset voltage at the load resulting from the large difference in bias currents could be suppressed using a parallel resonant circuit for narrowband applications (as shown in Fig. 3-11). The small DC winding resistance ( $R_L$ ) of an inductor load ( $L_L$ ) with parasitic capacitance ( $C_L$ ) minimizes the offset voltage between outputs, which simplifies offset compensation. The AC load seen by the multiplier core is then a parallel LC circuit with a center frequency and Q-factor that determines the passband of the output loop. The overall frequency response of the multiplier is simply a cascade of the input and output responses as the 2 loops are isolated sufficiently. Simulations predict 10.5 GHz bandwidth at the output (84.5 to 95 GHz) for an inductance of 140 pH in parallel with 22 fF (formed by load capacitance  $C_L$  and the



Fig. 3-11: Narrowband multiplier load.



Fig. 3-12: Wideband output load for the frequency multiplier.

total parasitic capacitance at the collectors of  $Q_1$  to  $Q_4$ ). Overall, 11% fractional bandwidth is achieved for the NB doubler with a maximum conversion gain of 6 dB at 90 GHz (see performance summary of Table 3-1 on page 69).
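The resonant frequency and fractional bandwidth quoted for the narrowband load follow from the standard parallel L-C relations. The sketch below uses the 140 pH and 22 fF values from the text and the simulated band edges; the loaded-Q estimate (center frequency divided by -3 dB bandwidth) is a textbook approximation rather than a value taken from the thesis.

```python
import math


def lc_resonance_hz(l_henry: float, c_farad: float) -> float:
    """Resonant frequency of an ideal L-C tank."""
    return 1.0 / (2 * math.pi * math.sqrt(l_henry * c_farad))


f0 = lc_resonance_hz(140e-12, 22e-15)     # 140 pH in parallel with 22 fF

f_low, f_high = 84.5e9, 95e9              # simulated -3 dB band edges
bandwidth = f_high - f_low
f_center = 0.5 * (f_low + f_high)
fractional_bw = bandwidth / f_center
loaded_q = f_center / bandwidth           # textbook estimate of loaded Q

print(f"f0 = {f0 / 1e9:.1f} GHz, fractional BW = {fractional_bw:.1%}, "
      f"Q = {loaded_q:.1f}")
```

The ideal tank resonates near 90.7 GHz, and the band edges correspond to a fractional bandwidth of roughly 11-12 % and a loaded Q between 8 and 9.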

Wideband operation of the Kimura multiplier requires a broadband load, and the offset caused by asymmetry in the collector DC currents must be suppressed by other means. The wideband, active load with common-mode feedback (CMFB) is shown in Fig. 3-12. Any DC offset between  $V_{op}$  and  $V_{om}$  at the outputs of the multiplier core is sensed by the op-amp inputs. Voltage  $V_{CTRL}$  (i.e., the DC bias voltage of active load  $Q_{16}$  and  $R_{p2}$ ) is used to set  $V_{op}$  equal to  $V_{om}$  by varying the common-mode bias at the multiplier output in response to any error voltage. The feedback action eliminates the offset in the differential output voltage of the multiplier.

The schematic for the 2-stage CMOS op-amp used in the CMFB loop is shown in Fig. 3-14. The thick-oxide MOS transistors that comprise the amplifier can handle common-mode inputs exceeding the 4.5V supply. It is compensated by resistor  $R_C$  (4.6 k $\Omega$ ) and capacitor  $C_C$  (308 fF) in series [11]. The output node has an



Fig. 3-13: Multiplier with feedback bias-circuit.

80 fF capacitor connected to  $V_{DD}$  to lower the AC impedance of  $V_{CTRL}$  at RF. The op-amp was designed for a 3-dB bandwidth of 1 MHz, and closed-loop stability was verified from small-signal and transient simulations. At high frequencies, the finite output impedance of the op-amp does not affect the circuit because the load impedance is dominated by  $R_{L1,2}$ . Small-signal simulations predict 9.4-dB gain margin and 84.4° of phase margin for the loop including the op-amp,  $R_{P2}$ ,  $Q_{16}$  and  $R_{L2}$  (see Fig. 3-13).



Fig. 3-14: Op-amp schematic.



Fig. 3-15: Feedback and active inductor circuits for DC offset suppression.

The active load consisting of  $Q_T$  and resistor  $R_P$  (see Fig. 3-15a) illustrates the effect of  $Q_{15,16}$  and  $R_{P1,2}$ . Its simplified small-signal equivalent circuit is shown in Fig. 3-15b. The impedance seen looking into the emitter is given by

$$Z_{ACT} = \frac{g_m + R_p(\omega C_{\pi})^2}{g_m^2 + (\omega C_{\pi})^2} + j \frac{\omega C_{\pi}(R_p g_m - 1)}{g_m^2 + (\omega C_{\pi})^2} \quad . \tag{5}$$

For $R_p g_m \gg 1$, Eq. 5 can be simplified to the series R-L equivalent

$$Z_{ACT} \approx \frac{1}{g_m} + j \frac{\omega C_\pi R_p}{g_m} \quad . \tag{6}$$

The inductive reactance of the load is used to peak the multiplier response by compensating for parasitic capacitance at the outputs. However, it should be noted that the difference in DC bias currents from the multiplier produces a different $g_m$ in each active load. Therefore, different resistor values for $R_{P1}$ (200 $\Omega$) and $R_{P2}$ (160 $\Omega$) are selected to equalize the reactive components according to Eq. 6. The difference between the DC resistance seen looking into each active load is small (~6 $\Omega$) compared to the 145 $\Omega$ of $R_{L1}$ and $R_{L2}$ connected in series. The simulated reactance of the active inductor is positive up to 148 GHz. However, the total reactance of the load is dominated by the parasitic capacitance of $R_{L1,2}$.
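Eq. 5 and its series R-L approximation in Eq. 6 can be compared numerically. In the sketch below the values of $g_m$ and $C_\pi$ are hypothetical placeholders chosen only for illustration (they are not given in the text); $R_p$ = 200 $\Omega$ matches $R_{P1}$. Eq. 6 implies an equivalent series inductance of $C_\pi R_p / g_m$.

```python
import math


def z_act_exact(freq_hz: float, gm: float, c_pi: float, r_p: float) -> complex:
    """Eq. 5: impedance seen looking into the emitter of the active load."""
    w_c = 2 * math.pi * freq_hz * c_pi
    den = gm ** 2 + w_c ** 2
    return complex((gm + r_p * w_c ** 2) / den, w_c * (r_p * gm - 1) / den)


def z_act_approx(freq_hz: float, gm: float, c_pi: float, r_p: float) -> complex:
    """Eq. 6: series R-L approximation, valid for R_p * g_m >> 1."""
    w_c = 2 * math.pi * freq_hz * c_pi
    return complex(1 / gm, w_c * r_p / gm)


def equivalent_inductance(gm: float, c_pi: float, r_p: float) -> float:
    """Series inductance implied by Eq. 6."""
    return c_pi * r_p / gm


# Hypothetical device values (assumptions, not from the text).
GM, C_PI, R_P = 0.1, 50e-15, 200.0

z_exact = z_act_exact(10e9, GM, C_PI, R_P)
z_approx = z_act_approx(10e9, GM, C_PI, R_P)
print(z_exact, z_approx, equivalent_inductance(GM, C_PI, R_P))
```

For these placeholder values the two expressions agree to within a few percent at 10 GHz, and the reactance is positive (inductive) because $R_p g_m > 1$.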

## 3.2.3 Broadband and narrowband comparison

The block diagram of a broadband active frequency multiplier built around the Kimura core is shown in Fig. 3-16. The single-ended RF input is terminated on-chip and then buffered to the multiplier differential pairs. The DC offset caused by asymmetry of the bias currents flowing in the core transistors is minimized by setting the core output nodes ($V_{op}$ and $V_{om}$ in Fig. 3-16) to the same DC voltage using a common-mode feedback (CMFB) loop comprised of an op-amp and an active load, as explained previously. Finally, an output buffer drives the following stage or a low impedance off-chip load (e.g., 50-$\Omega$ equipment used for testing and characterization).

A schematic of the broadband input balun for the prototype testchip is shown in Fig. 3-17. The response of the balun is widened by resonating peaking inductors $L_1$ and $L_2$ with the parasitic capacitance at the collectors of $Q_1$ and $Q_2$, respectively. Feedthrough via the base-emitter capacitance is compensated by $C_1$ (22 fF). The phase error at the differential output is reduced by selecting different areas for $Q_1$ to $Q_4$ (i.e., $l_{emitter}=2 \ \mu m$ for $Q_1$ and $Q_4$, 6 $\mu m$ for $Q_2$ and $Q_3$, $w_{emitter}=90 \ nm$). Simulations of the balun for a 50 GHz input predict that the phase imbalance at the output changes by 0.8% (i.e., from 1.1% to 1.9%), while the difference in output amplitudes falls from 20% to 3% after these modifications.



Fig. 3-16: Block diagram for the active frequency multiplier.



Fig. 3-17: Active input balun.

The circuit shown in Fig. 3-18 buffers  $V_{op}$  and  $V_{om}$  from the multiplier outputs to drive 50- $\Omega$  loads. Shunt feedback decreases the buffer output impedance, and peaking inductors  $L_{F1}$  and  $L_{F2}$  increase its small-signal bandwidth from 70 GHz to 130 GHz, as in [12]. The bases of  $Q_3$  and  $Q_4$  are shorted together on chip (base node  $V_{CASC}$  in Fig. 3-18), and AC-grounded via a 500-fF capacitor shunted by a 1.12-pF capacitor (damped by an on-chip 3- $\Omega$  series resistor). Further decoupling of the voltage biasing  $V_{CASC}$  by 1  $\mu$ F in parallel with 22  $\mu$ F is added off-chip during testing and characterization of the prototype described later in this chapter.



Fig. 3-18: Schematic of the broadband output buffer.




Fig. 3-19: Simulated CG vs. frequency for NB and WB doubler circuit examples.

Simulated conversion gains for the wideband (WB, Fig. 3-16) and narrowband (NB, Fig. 3-10 and Fig. 3-11) multipliers are compared in Fig. 3-19, and their simulated performance is summarized in Table 3-1.

 Table 3-1: Simulated wideband and narrowband doubler performance comparison

| Parameter                  | Wideband (WB)                                   | Narrowband (NB)                    |
|----------------------------|-------------------------------------------------|------------------------------------|
| Bandwidth                  | DC-105 GHz (CG > 0 dB)                          | 84.5-95 GHz (-3 dB BW)             |
| RF input amplitude         | 65 mV-pk                                        | 40 mV-pk                           |
| Conversion gain            | 12 dB max.                                      | 6 dB max.                          |
| Spurious suppression       | 28 dBc                                          | 90 dBc                             |
| Number of transistors      | 8                                               | 4                                  |
| Input/Output configuration | Emitter follower (input),<br>broadband (output) | Transformer input,<br>LC-tank load |
| Core transistor length     | 2.5 µm                                          | 1.5 µm                             |
| Core power dissipation     | 25 mA from 4.5 V: 112.5 mW                      | 3 mA from 1.8 V: 5.4 mW            |

The potential for low power operation of a frequency doubler based on unbalanced emitter coupled pairs is clearly seen for the NB example. Only 1.8 V is required to supply the narrowband multiplier (i.e., 1.2 V for the core and 0.6 V for a MOS tail current source), because the DC drop across the L-C resonator load at the output is negligible. It consumes a fraction of the power dissipated by the wideband circuit (5.4 mW vs. 112.5 mW). Comparable conversion gains are realized in-band using a lower g<sub>m</sub> from the multiplier core in the NB case (8 mS at I<sub>0</sub> = 1.5 mA, as per Eq. 4), because the 800 $\Omega$ impedance of the L-C load at resonance (tank Q of 9) is approximately 3x larger than the 290 $\Omega$ (active) load of the WB doubler. The RF input signal is also reduced from 65 mV (WB) to 40 mV (NB) thanks to the step-up of the transformer input balun in the NB design.

The NB multiplier has clear advantages in power consumption and efficiency compared to the wideband multiplier. However, a wideband doubler could be used in a multiband transceiver that covers the 57-64 GHz, 71-76 GHz, 81-86 GHz, and 92-95 GHz bands proposed for mm-wave communication. This may be advantageous in a base-station application, where the higher power consumption could be supported easily. A wideband prototype is therefore developed further in the next section of this chapter. It is less conventional in its design, and provides proof of the Kimura multiplier concept at RF.

#### **3.3** Wideband doubler prototype

A wideband prototype has been implemented according to the block diagram of Fig. 3-16 and it is used to prove the Kimura multiplier concept at RF.

A photomicrograph of the wideband doubler test circuit fabricated in IBM's 90-nm SiGe-BiCMOS technology [10] is shown in Fig. 3-20. The multiplier core and active load occupy just 7200 $\mu$m<sup>2</sup> of the 0.37 mm<sup>2</sup> area (incl. bondpads). This is a small fraction of the area required by a typical passive mm-wave doubler, e.g., the 20 GHz, 1.05 mm x 0.8 mm passive frequency doubler in [13]. The currents drawn by



Fig. 3-20: Photomicrograph of the doubler testchip.

the principal circuit blocks in the doubler prototype are: 25 mA in the core and active load, 9.5 mA by the input balun, 12 mA in the output buffer, and 8.5 mA for biasing (55.6 mA in total). When powered from a single 4.5 V supply, it consumes 250 mW in total.

On-die characterization using a Rohde & Schwarz FSUP spectrum analyzer was performed in 3 bands: 10 GHz to 50 GHz, 50-75 GHz (i.e., V-band), and



Fig. 3-21: Doubler output spectrum measured in: a) V-band (50-75 GHz), and b) W-band (75-100 GHz).

75-100 GHz (W-band) via mm-wave downconverting mixers. The RF powers applied to the input and measured at the output are corrected for losses of the probes (0-1 dB, depending on frequency) and cables (1 to 2.5 dB) comprising the test set-up.

Fig. 3-21 shows the V- and W-band output spectra measured for 30, 35, 40, 45, and 50 GHz inputs overlapped on the plots. Measured conversion gain (CG) for the doubler decreases from +12 dB to 0 dB as the output frequency ranges from DC to 100 GHz.

The relationship between input and output powers ($P_{in}$ at 40 GHz and $P_{out}$ at 80 GHz) for the doubler is plotted in Fig. 3-22. The output power increases almost linearly (in dB) up to -15 dBm input power, beyond which the output power saturates at approximately -7.5 dBm. Measured (power) gain for an 80 GHz output is 6.6 dB when -15 dBm is applied at the input. Gain expansion is observed between measured input power levels of -20 dBm and -15 dBm at 80 GHz output, which agrees with the simulations shown in Fig. 3-22. The greatest suppression of the fundamental frequency (28 dBc) is realized at an input power, $P_{in}$, of -14 dBm. The 4<sup>th</sup> harmonic (i.e., x4=160 GHz at the output) could not be measured due to bandwidth limitations of the test set-up; however, simulation data is plotted in Fig. 3-22 for comparison.



Fig. 3-22: Measured and simulated (dashed) doubler output vs. input power for a 40GHz input signal.




Fig. 3-23: Measured and simulated (dashed) doubler output power vs frequency.

The x4 component increases rapidly above -15 dBm input power (i.e., peak  $V_{in} > \sim 6V_T$ ), but lies below the fundamental frequency for input powers below -10 dBm.

In Fig. 3-23, the measured and simulated output power (x2) vs. frequency are plotted from 10-110 GHz. Suppression of the fundamental frequency is also shown, as the measured and simulated outputs at the fundamental are plotted on the same figure. The fundamental frequency is suppressed by more than 22 dBc across 10 to 100 GHz (28 dBc maximum, see Table 3-2), while the x4 component is suppressed by at least 30 dBc across the same frequency range (using the x2 output as reference). A slow roll-off in the output power is observed in both simulations and measurements.

Phase noise measurements for the doubler prototype are shown in Fig. 3-24a. The measured phase noise of the input signal generator at 20 GHz is plotted on the same figure. Below 0.8 MHz offset, the phase noise seen at the output of the doubler is 6 dB larger than the phase noise produced by the source (i.e., 20log(N), where N is the multiplication factor). Above 0.8 MHz, the doubler noise floor increases the phase noise difference to more than 6 dB with respect to the signal generator.
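The ideal phase noise penalty of a frequency multiplier follows directly from the 20log(N) relation quoted above. The short sketch below evaluates it for multiplication factors of 2 and 4; the function name is illustrative, and the result is the ideal limit that ignores any noise floor added by the circuit.

```python
import math


def phase_noise_increase_db(multiplication_factor: int) -> float:
    """Ideal phase noise increase, 20*log10(N), of a times-N multiplier."""
    return 20 * math.log10(multiplication_factor)


for n in (2, 4):
    print(f"x{n}: +{phase_noise_increase_db(n):.2f} dB")
```

This gives approximately 6 dB for a doubler and 12 dB for a quadrupler.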

The difference in noise between the input (signal generator) and output (frequency doubler) is seen clearly from the plot of Fig. 3-24b. White noise is added to the sinusoidal input applied to the multiplier (in simulation) to ensure that the output phase noise is well above the doubler noise floor. The simulation results indicate that the difference in phase noise between the doubler input and output is close to the ideal of 6 dB across 1 kHz to 1 MHz offset. The simulations also predict that noise added by the output buffer exceeds the multiplier noise above 1 MHz offset. The buffer noise can be reduced at the expense of higher power consumption.

| Ref.                    | Mult.<br>Fact. | Max.<br>CG<br>(dB) | Output<br>BW<br>(GHz) | Input<br>Power<br>(dBm) | Max./Min.<br>Suppr. of<br>fin (dBc) | DC Power (mW)/<br>VSupply (V) | Technology             |
|-------------------------|----------------|--------------------|-----------------------|-------------------------|-------------------------------------|-------------------------------|------------------------|
| This<br>work            | x2             | 12                 | DC-100                | -16                     | 28/22                               | 250/4.5                       | 90-nm SiGe-<br>BiCMOS  |
| [6],<br>TMTT<br>2005    | x2             | 1                  | DC-100                | -12                     | 30/-                                | 150/4                         | 200 nm InP<br>HBT      |
| [14],<br>MWCL<br>2008   | x2             | 5.7                | 3-50                  | 0                       | 30/15                               | 600/-                         | GaAs<br>PHEMT          |
| [15],<br>MWCL<br>2009   | x2             | 10.2               | 36-80                 | -8                      | 36/20                               | 137/3.3                       | 180 nm SiGe-<br>BiCMOS |
| [16],<br>ICICDT<br>2010 | x2             | 6                  | 106-128               | 0                       | -/-                                 | 23/-                          | 65nm CMOS              |

 Table 3-2: Frequency doubler performance comparison

The prototype doubler presented in this chapter is compared with other published frequency doublers in Table 3-2. Compared to the Kimura doubler, only example [6] - a Gilbert multiplier core implemented in InP - has similar bandwidth. However, this multiplier requires 4 dB higher input signal amplitude from the driving source and has a maximum conversion gain of only 1 dB compared to 12 dB CG for the SiGe doubler prototype. Example [14] extracts the x2 frequency output from the common-drain node of a differential amplifier (the same principle used in [17]), but it




Fig. 3-24: Phase noise for the doubler at 40GHz output.

requires 16 dB higher input power (i.e., 0 dBm). It develops 6 dB lower conversion gain while consuming twice as much DC power, and operates across one-half of the bandwidth of our SiGe doubler. The Gilbert multiplier with transmission line loads implemented in SiGe [15] offers the highest spurious suppression, i.e., 36 dBc and 20 dBc maximum and minimum across 36-80 GHz, respectively. However, it requires 8 dB higher input power and operational bandwidth is less than one-half that of the wideband SiGe prototype developed in this work. Finally, example [16] is a

narrowband, injection-locked multiplier implemented in a 65-nm CMOS technology. Its performance is inferior to the SiGe prototype in conversion gain (i.e., only 6 dB conversion gain at 115 GHz), input sensitivity (min. 0 dBm input power required) and bandwidth (106-128 GHz locking range). Power consumption can be traded off against bandwidth (as seen from Table 3-1 on page 69), and greater efficiency would be realizable from a narrowband Kimura-type doubler operating from the minimum supply voltage (i.e., ~1.8 V).

# **3.4** Frequency quadrupler

Higher-order multiplication may be realized by cascading frequency doublers, as the broadband response of the Kimura frequency doubler obviates the need for tuning to align center frequencies in a cascade. However, it is possible to design the multiplier to select a higher harmonic and suppress the other (unwanted) frequencies, and obtain greater efficiency using a single multiplier stage. For example, an output rich in the 4<sup>th</sup> harmonic is obtained by driving the core (Fig. 3-2) with a square-waveshape signal rather than sinusoidal input (V<sub>in</sub>, as in Fig. 3-25).



Fig. 3-25: Quadrupler input/output waveforms.



Fig. 3-26: Quadrupler  $\Delta I_{out}$  frequency components.

Fig. 3-26 shows that the 4<sup>th</sup> harmonic output current approaches 45% of $I_0$ for an input voltage between 0.17 V and 0.22 V (i.e., 15% more than for a sinusoid as in Fig. 3-6). The 2<sup>nd</sup> and 6<sup>th</sup> harmonics in the output current must then be suppressed. The design of a quadrupler based on the Kimura core targets 0 dB CG at 90 GHz, and it is described next.

The quadrupler prototype (see Fig. 3-27) replaces the active input balun used for the doubler with a fully differential buffer (with dual 50- $\Omega$  on-chip terminations). A CMFB loop (similar to the one used in the doubler prototype) is used to eliminate offset in the differential voltage at the output of the multiplier caused by asymmetry



in currents biasing the diff pairs ($I_{4p}$ and $I_{4m}$). The tuned active load of the quadrupler is designed to filter out all harmonics except the 4<sup>th</sup>. A passive inductor is a potential alternative in this implementation; however, it would occupy a larger area in the physical layout, and the winding resistance of the inductor would produce a DC offset at the quadrupler output. An active inductor requires less area and can be made tunable, but it increases the required supply voltage and generates unwanted noise and distortion at the output [18].

The quadrupler uses different emitter areas to implement the required offset voltage ($V_K$). Like the doubler core in Fig. 3-13, the quadrupler uses emitter followers ($Q_5$ to $Q_8$, 2.5 µm emitter length, biased at 4 mA) to lower the impedance driving the core inputs. The emitter lengths of the core transistors are 6 µm for $Q_1$ and $Q_3$, and 2 µm for $Q_2$ and $Q_4$. Both pairs are biased at 4 mA ($I_0$).
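The offset voltage produced by unequal emitter areas can be estimated from the ideal exponential bipolar model, in which two transistors carrying equal currents differ in $V_{BE}$ by $V_T \ln(A_1/A_2)$. The sketch below applies this textbook relation to the 6 µm and 2 µm emitter lengths; it assumes equal emitter widths, saturation current proportional to emitter area, and a temperature of 300 K. It is an idealized estimate and not a value reported in the thesis.

```python
import math

BOLTZMANN = 1.380649e-23           # J/K
ELECTRON_CHARGE = 1.602176634e-19  # C


def area_ratio_offset(area_ratio: float, temp_kelvin: float = 300.0) -> float:
    """Ideal V_BE offset (volts) between transistors with the given area ratio."""
    v_t = BOLTZMANN * temp_kelvin / ELECTRON_CHARGE
    return v_t * math.log(area_ratio)


ratio = 6.0 / 2.0                   # emitter length ratio at equal width
offset_in_vt = math.log(ratio)      # offset expressed in units of V_T
v_k_estimate = area_ratio_offset(ratio)
print(f"V_K ~ {offset_in_vt:.2f} V_T = {v_k_estimate * 1e3:.1f} mV")
```

Under these assumptions a 3:1 area ratio corresponds to an offset of about 1.1 V<sub>T</sub>, i.e., roughly 28 mV at 300 K.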

A Cherry-Hooper (CH, [19]) amplifier is used to implement the input buffer. The CH amplifier uses a transconductance stage ( $Q_1$ ,  $Q_6$  in Fig. 3-28) to convert the input voltage to a current and a second (transimpedance) stage defines the output voltage. The design guidelines proposed in [20] were followed for the implementation of the CH amplifier. The circuit in [19] was modified as follows: transistors  $Q_7$ ,  $Q_8$  are used to broaden the bandwidth [21], and resistors  $R_3$ ,  $R_4$  are added to increase the gain [22]. Darlington pairs  $Q_2$ ,  $Q_3$  and  $Q_4$ ,  $Q_5$  in Fig. 3-28 are



Fig. 3-28: Modified Cherry-Hooper input buffer used in the quadrupler testchip.

used to decrease the effective input capacitance of the transimpedance stage and broaden its bandwidth. The -3 dB bandwidth of the transimpedance section rises from 48 GHz to 65 GHz after this modification. However, the supply voltage is increased from 3.3 V to 4.5 V to supply the voltage headroom required by the added Darlington pairs.

The quadrupler active load is shown in Fig. 3-29a. The impedance seen at the emitter of  $Q_2$  follows Eq. 6, which is repeated here for convenience,  $Z_{ACT} \approx \frac{1}{g_m} + j \frac{\omega C_{\pi} R_p}{g_m}$ . The effective capacitance  $C_{\pi}$  of the Darlington pair  $Q_{1,2}$  is controlled in this topology, making the load tunable. The reactance component of  $Z_{ACT}$  changes with  $C_{\pi}$ , which is controlled by the current flowing in  $Q_1$ . Voltage  $V_{Tune}$  is used to vary the bias current. Fig. 3-29b shows the normalized impedance magnitude vs. frequency for three different values of  $V_{Tune}$ . With this approach, the



Fig. 3-29: Voltage controlled active inductor.

active tuned load is used to set the band-pass center frequency after fabrication, minimizing the effects of process variations on the design.

## **3.5** Narrowband quadrupler prototype

The IC photomicrograph is shown in Fig. 3-30. It has an active area of  $0.0132 \text{ mm}^2$ . The prototype input buffer consumes 17 mA, the core (incl. followers), op-amp and tuned active load consume 25 mA, the output buffer 12 mA, and the biasing circuitry 10 mA (total of 64 mA). The testchip is powered using a single +4.5 V supply and it consumes a total of 288 mW.
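The total supply current and DC power quoted above can be verified by summing the block currents. The sketch below simply restates the figures from the text; the dictionary keys are descriptive labels chosen for illustration.

```python
# Supply currents of the quadrupler testchip blocks, in mA (from the text).
block_currents_ma = {
    "input buffer": 17.0,
    "core, op-amp and tuned active load": 25.0,
    "output buffer": 12.0,
    "biasing": 10.0,
}
SUPPLY_VOLTS = 4.5

total_current_ma = sum(block_currents_ma.values())
total_power_mw = total_current_ma * SUPPLY_VOLTS   # mA * V = mW
print(f"{total_current_ma:.0f} mA, {total_power_mw:.0f} mW")
```

The block currents sum to 64 mA, which corresponds to 288 mW from the 4.5 V supply.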

The testchip was measured in the V- and W-bands using downconversion mixers. The quadrupled output measured at $V_{tune}=1$ V is plotted in Fig. 3-31. Maximum conversion gain (0 dB) is measured for an input signal of 70 mV (-20 dBm) to the quadrupler. The bandpass response of the measured conversion gain is narrower than predicted by simulation. This is caused primarily by the parasitics arising from the deliberate asymmetry (i.e., different emitter areas) used in the Kimura multiplier core, which affect the phase relationship between the currents driving the load as the frequency increases. Also, the change of the active load impedance across frequency causes the quadrupler output power to vary. Unlike the doubler, compensation in the core input to adjust the response is not used in the prototype, but



Fig. 3-30: Quadrupler testchip photomicrographs.



Fig. 3-31: Frequency quadrupler output power vs. frequency.

could be implemented in a similar manner (i.e., using series peaking of the larger-area transistors in each diff pair). Implementing this change improves the frequency response, as shown by the (expected) simulated response in Fig. 3-31.

Biasing of the quadrupler active load is controlled by  $V_{tune}$ . Output power measurements at 4 input frequencies (i.e., 20, 22, 24, and 26 GHz) for 3 bias control settings ( $V_{tune}$ =0.5, 1, and 2V) are shown in Fig. 3-32. The output power at 80 GHz is measured with a 20 GHz input signal and annotated for each setting of  $V_{tune}$  in the same figure. The output power varies by approximately 7 dB as  $V_{tune}$  is adjusted, ranging from -29 dBm for  $V_{tune}$  of 2 V, to -25 dBm at 1 V, and -22 dBm for a  $V_{tune}$  of 0.5 V.

Measured phase noise for the quadrupler is plotted in Fig. 3-33. The phase noise at the output is consistently 12 dB higher than the phase noise of the input source (both shown in Fig. 3-33) up to ~100 kHz offset from the carrier. The measured data corresponds to the expected noise increase (i.e., $20\log_{10}(4) = 12 \text{ dB}$). The simulated quadrupler noise floor is also included in the figure. The measured difference between input and output noise is plotted in Fig. 3-34. The difference in phase noise between output and input increases monotonically at offsets beyond 100 kHz, as the thermal noise floor of the output buffer dominates the phase noise at



Fig. 3-32: Quadrupler output power measured at different tuning voltages.

higher offsets. The simulated quadrupler and source phase noise difference shown in Fig. 3-34 uses a signal source with higher phase noise than the quadrupler noise floor, which helps to identify the 12 dB difference across the entire range.



Fig. 3-33: Quadrupler phase noise at 90GHz output signal.




Fig. 3-34: Quadrupler and source phase noise difference.

Frequency quadruplers reported in recent literature are compared with the quadrupler prototype in Table 3-3 on page 84. All of the circuits listed in the table are designed as quadruplers, i.e., they are not cascaded frequency doubling circuits. Following references [23] to [25], the -3 dB bandwidth at the output is quoted for each quadrupler. The 16 GHz bandwidth of example [23] is comparable to the performance of the SiGe prototype from this work (16 GHz). Although it consumes just 35.3 mW (compared to 288 mW for the SiGe prototype), it requires 17 dB higher input power, i.e., -3 dBm RF input power compared to -20 dBm for the Kimura-based quadrupler. The multiplier in [24] has 23 GHz bandwidth and consumes less than one-half the power of the Kimura-based quadrupler (i.e., 117 mW compared to 288 mW), with comparable spurious suppression. Nevertheless, [24] has 18 dB conversion loss, and requires 28 dB more input power (i.e., 8 dBm) than our quadrupler. Example [25] has only 4 GHz bandwidth, requires 12 dB higher input power compared to the Kimura quadrupler, and it has a conversion loss 7.5 dB higher than the SiGe quadrupler prototype.


| Ref. | Mult. factor | Max. CG (dB) | Output BW (GHz) | Input power (dBm) | Max. suppr. of f<sub>in</sub> (dBc) | DC power (mW) / supply voltage (V) | Technology |
|------|--------------|--------------|-----------------|-------------------|-------------------------------------|------------------------------------|------------|
| This work | x4 | 0 | 81-97 | -20 | 17 | 288/4.5 | 90 nm SiGe-BiCMOS |
| [23], ISSCC 2012 | x4 | 0.6 | 121-137 | -3 | - | 35.2/1.6 | 0.13 μm SiGe |
| [24], EuMIC 2010 | x4 | -18 | 52-75 | 8 | 15 | 117/- | 0.25 μm BiCMOS |
| [25], EuMC 2001 | x4 | -7.5 | 74-78 | 8 | - | - | 0.15 μm PHEMT |

Table 3-3: Frequency quadrupler performance comparison

#### 3.6 Summary

A low-voltage, active IC multiplier suitable for narrowband (NB) and wideband (WB) applications at mm-wave frequencies was investigated in this chapter. The multiplier core comprises asymmetric cross-coupled differential pairs, whose asymmetry can be implemented using different transistor sizes or using matched differential pairs with asymmetric biasing. Experimental results for wideband frequency doubler and narrowband frequency quadrupler prototypes designed in a 90-nm SiGe-BiCMOS technology were also presented. The frequency response of the WB doubler is peaked at the input using on-chip series inductors, and at the output by a wideband active load in a common-mode feedback loop. Measured conversion gain for the doubler ranges from +12 dB to 0 dB across DC-100 GHz, with 28 dBc to 22 dBc suppression of the fundamental frequency at the output across this frequency range. The measured results agree very well with the predictions from simulation. The 81-97 GHz frequency quadrupler was implemented with unmatched emitter area transistors in a Kimura-type core. Measured conversion gain of the quadrupler is 0 dB (maximum) and spurious outputs with respect to the quadrupled output are suppressed by 17 dBc. The performance of the bipolar multiplier prototypes is competitive with conventional active IC designs. Nevertheless, the design can be improved by using same-size transistors asymmetrically biased via a transformer at the input, and an LC network at the output.

The performance of the Kimura-type bipolar multipliers developed in this chapter is consistent with other active IC designs (e.g., a Gilbert multiplier). However, Kimura multipliers require less bias voltage headroom than active multipliers built from cascoded stages. The NB bipolar doubler proposed in this work operates from a single 1.8 V supply with a simulated peak conversion gain of 6 dB centered at 89 GHz and a fractional bandwidth of 11%. The trade-off between current consumption (25 mA at 4.5 V vs. 3 mA at 1.8 V for WB and NB circuits, respectively) and operating bandwidth (100 GHz vs. 10.5 GHz) was clearly delineated for WB vs. NB multipliers. Finally, spurious outputs of these circuit topologies are (ideally) limited to even-order harmonics of the fundamental frequency. This relaxes the demands for output filtering and simplifies the control of spurs in a monolithic system implementation.

#### References

[1] T. W. Crowe, W.L. Bishop, D.W. Porterfield, J.L. Hesler, and R.M. Weikle, "Opening the Terahertz Window with Integrated Diode Circuits," *IEEE Journal of Solid-State Circuits*, vol. 40, no. 10, pp. 2104–2110, Oct. 2005.

[2] G. Chattopadhyay, E. Schlecht, J.S. Ward, J.J. Gill, H.H.S. Javadi, F. Maiwald, and I. Mehdi, "An All-solid-state Broadband Frequency Multiplier Chain at 1500 GHz," *IEEE Transactions on Microwave Theory and Techniques*, vol. 52, no. 5, pp. 1538-1547, May 2004.

[3] P. Weger, G. Schultes, L. Treitinger, E. Bertagnolli, K. Ehinger, "Gilbert Multiplier as an Active Mixer with Conversion Gain Bandwidth of up to 17 GHz," *Electronics Letters*, vol. 27, no. 7, pp. 570-571, March 1991.

[4] B. Gilbert, "A precise Four-quadrant Multiplier with Subnanosecond Response," *IEEE Journal of Solid-State Circuits*, vol. 3, no. 4, pp. 365-373, Dec. 1968.

[5] S. Hackl and J. Bock, "42 GHz Active Frequency Doubler in SiGe Bipolar Technology," Proc. of the IEEE-ICMMT, Beijing, China, pp. 54–57, Aug. 2002.

[6] V. Puyal, A. Konczykowska, P. Nouet, et al., "DC-100-GHz Frequency Doublers in InP DHBT Technology," *IEEE Transactions on Microwave Theory and Techniques*, vol. 53, no. 4, pp. 1338-1344, April 2005.

[7] K. Kimura, "A Bipolar Four-quadrant Analog Quarter-square Multiplier Consisting of Unbalanced Emitter-coupled Pairs and Expansions of its Input Ranges," *IEEE Journal of Solid-State Circuits*, vol. 29, no. 1, pp. 46-55, Jan. 1994.

[8] A. Ogawa and H. Kusakabe, "Frequency Doubling Circuit," Japan. Patent 1360758: Japanese Examined Patent Publication 61-025242B, July 14, 1986.

[9] K. Kimura, "Some Circuit Design Techniques for Bipolar and MOS Pseudologarithmic Rectifiers Operable on Low Supply Voltage," *IEEE Transactions on Circuits and Systems I: Fundamental Theory and Applications*, vol. 39, no. 9, pp. 771-777, Sept. 1992.

[10] J.J. Pekarik, J. Adkisson, P. Gray, et al., "A 90nm SiGe BiCMOS Technology for mm-Wave and High-Performance Analog Applications," Proc. of the IEEE-BCTM, San Diego CA, pp. 92-95, Oct. 2014.

[11] P. R. Gray; R. G. Meyer, "MOS Operational Amplifier Design A Tutorial Overview," *IEEE Journal of Solid-State Circuits*, vol. 17, no. 6, pp. 969-982, Dec. 1982.

[12] L. Vera, J.R. Long, and B.J. Gross, "A Low-Power SiGe Feedback Amplifier with Over 110 GHz Bandwidth," Proc. of the IEEE-BCTM, San Diego CA, pp. 1-4, Oct. 2014.

[13] M. Adnan, E. Afshari, "A Low Conversion Loss Passive Frequency Doubler," Proc. of the IEEE-ICCC, San Jose CA, pp. 1-4, Sept. 2011.

[14] L. Yu, Y. Tao, Y. Ziqiang, C. Jia, "A 3–50 GHz Ultra-Wideband PHEMT MMIC Balanced Frequency Doubler," *IEEE Microwave and Wireless Components Letters*, vol. 18, no. 9, pp. 629-631, Sept. 2008.

[15] A.Y-K. Chen, et al., "A 36–80 GHz High Gain Millimeter-Wave Double-Balanced Active Frequency Doubler in SiGe BiCMOS," *IEEE Microwave and Wireless Components Letters*, vol. 19, no. 9, pp. 572-574, Sept. 2009.

[16] E. Monaco, M. Pozzoni, F. Svelto, A. Mazzanti, "A 6mW, 115GHz CMOS Injection-locked Frequency Doubler with Differential Output," International Conference on IC Design and Technology (ICICDT), pp. 236-239, June 2010.

[17] C. Aichin and J.R. Long, "A 5-6-GHz Bipolar Quadrature-phase Generator," *IEEE Journal of Solid-State Circuits*, vol. 39, no. 10, pp. 1737-1745, Oct. 2004.

[18] T.H. Wu, C. Meng, T.-H. Wu, "5.2 GHz SiGe HBT upconverter using active-inductor LC current mirror," *Electronics Letters*, vol. 42, no. 15, pp. 859-860, July 2006.

[19] E.M. Cherry, D.E. Hooper, "The design of wide-band transistor feedback amplifiers," *Proceedings of the Institution of Electrical Engineers*, vol. 110, no. 2, pp. 375-389, February 1963.

[20] C.D. Holdenried, J.W. Haslett, M.W. Lynch, "Analysis and design of HBT Cherry-Hooper amplifiers with emitter-follower feedback for optical communications," *IEEE-JSSC*, vol. 39, no. 11, pp. 1959-1967, Nov. 2004.

[21] N. Ishihara, O. Nakajima, H. Ichino, Y. Yamauchi, "9 GHz bandwidth, 8-20 dB controllable-gain monolithic amplifier using AlGaAs/GaAs HBT technology," *Electronics Letters*, vol.25, no.19, pp. 1317-1318, 14 Sept. 1989.

[22] Y.M. Greshishchev, P. Schvan, "A 60-dB gain, 55-dB dynamic range, 10-Gb/s broadband SiGe HBT limiting amplifier," *IEEE-JSSC*, vol. 34, no. 12, pp. 1914-1920, Dec. 1999.

[23] W. Yong, L.G. Wang, X. Yong-Zhong, "A 9% power efficiency 121-to-137GHz phase-controlled push-push frequency quadrupler in 0.13µm SiGe BiCMOS," IEEE-ISSCC Conference Digest, pp. 262-264, Feb. 2012.

[24] N. Kuo, Z. Tsai, K. Schmalz, J. C. Scheytt, and H. Wang, "A 52–75 GHz frequency quadrupler in 0.25-µm SiGe BiCMOS process," Proc. of the European Microwave Integrated Circuits Conference (EuMIC), 2010, pp. 365–368.

[25] Y. Campos-Roca, L. Verweyen, M. Fernandez-Barciela, M.C. Curras-Francos, E. Sanchez, A. Hulsmann, M. Schlechtweg, "Millimeter-wave Active MMIC Frequency Multipliers," European Microwave Conference Digest, pp. 1-4, Sept. 2001.

# 4 Frequency Division

Frequency dividers using emitter-coupled logic (ECL) master-slave (MS) flip-flops are employed in mm-wave frequency synthesizers, flash-type analog-to-digital converters and fiber-optic transmission chipsets [1]. In these and other applications, the operating frequency range of the divider limits its use. Therefore, increasing the divider frequency range expands the range of wideband applications.

Frequency dividers can be designed to operate as static or dynamic dividers, and both operation modes are introduced next.

#### 4.1 Static frequency divider

This type of divider is based on a bistable cell, such as a D-type flip-flop (see Fig. 4-1). The flip-flop can be designed in emitter-coupled logic (ECL) in a bipolar process, or in source-coupled logic (SCL) in a CMOS process.

Emitter-coupled-logic (ECL) master-slave D flip-flops (MS-D-FF) reach a maximum clock frequency that depends on the technology in which they are implemented. The optimization of their design, based on weighted time constants derived from ECL XOR ring oscillators, was reviewed in [2]. The maximum operating frequency of the static frequency divider ($f_{max-toggle}$) has been increased using inductive peaking [3] and layout optimization [4]. However, it remains lower than the maximum operating frequency of a dynamic frequency divider.

Fig. 4-1: Frequency divider based on master-slave D-FlipFlop.

#### 4.2 Dynamic frequency divider

A dynamic frequency divider may reach frequencies beyond $f_{max-toggle}$ by employing the regenerative frequency division principle, which can be explained using the block diagram in Fig. 4-2. A balanced modulator produces two sidebands at frequencies $f_0 \pm f_1$, where $f_0$ and $f_1$ are the frequencies applied to the modulator. A filter selects $f_1$, which is amplified and fed back to the modulator. The component $f_1$ sustains itself in the feedback path when $f_0 \pm f_1 = f_1$, a condition that is satisfied by the lower sideband if $f_1 = f_0/2$ [5].
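As a brief worked check of this locking condition (a sketch that uses only the symbols $f_0$ and $f_1$ defined above), the two sidebands can be examined separately:

$$
f_0 - f_1 = f_1 \;\Rightarrow\; f_1 = \frac{f_0}{2}, \qquad f_0 + f_1 = f_1 \;\Rightarrow\; f_0 = 0 .
$$

Only the lower sideband yields a non-trivial solution, so the loop can sustain a component at one-half of the input frequency.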

Static operation is sacrificed in dynamic dividers because the bandpass response of the dynamic circuit topology limits the minimum frequency of operation ($f_{dyn-min}$). Therefore, the use of dynamic dividers at low frequencies is restricted by $f_{dyn-min}$, while static dividers cannot operate beyond $f_{max-toggle}$.

In this chapter a divider designed to operate in either static or dynamic mode is presented. This new type of divider is described in the rest of the chapter. It is capable of wideband frequency division (in static mode), and division close to the maximum possible toggle frequency of a dynamic divider.



Fig. 4-2: Regenerative frequency division [5].

#### 4.3 Dual mode dynastat (dynamic-static) frequency divider

The dual-mode dynastat divider allows mode selection using bias control of an ECL M/S D-FF (shown in Fig. 4-3). The concept is proven with the implementation of a prototype in a SiGe-BiCMOS technology, but it can be implemented in other technologies (e.g., CMOS). Low-voltage operation is also investigated with a second prototype implemented as part of a built-in self-test (BiST) circuit, which is presented in Section 4.5 on page 96.

The schematic of the dynastat divide-by-two is shown in Fig. 4-3. Cascaded master and slave latches implement the divide-by-two function, as in a fully-static ECL divider. However, emitter followers at each FF output split the signal into separate paths. From the master (M), $Q_7$-$Q_8$ buffer the signal to the slave (S) stage. Followers $Q_9$-$Q_{10}$ feed back the same signal to a differential pair ($Q_3$-$Q_4$) forming the latch in the master. The second modification combines the bias currents from both latches. Buffer pairs $Q_5$-$Q_6$ in the master and $Q_{13}$-$Q_{14}$ in the slave are biased by the same current source ($Q'_1$ and $R'_1$ in Fig. 4-3). Latch pairs $Q_3$-$Q_4$ of the master and $Q_{15}$-$Q_{16}$ in the slave are biased by a second current source ($Q'_2$-$R'_2$ in Fig. 4-3). Bias currents for the latching differential pairs and the emitter followers that feed the signal back into the latches are controlled by voltage $V_{mode}$, while bias for the rest of the circuit is controlled by $V_{bias}$.

Fig. 4-3: Dynastat divide-by-two schematic.

These control voltages set the operating mode of the divider. When  $V_{mode}$  equals the voltage  $V_{bias}$  (i.e., 0.96 V), the divider is configured to operate in the static mode. When  $V_{mode}$  equals 0 V, the latching pairs and emitter followers driving them (in both master and slave latches) are biased "off", and the divider works in the dynamic mode (see Fig. 4-4).

The divider now operates in static or dynamic mode [6], with an overall wider operating frequency range compared to either a static or dynamic divider. The emitter areas of switching transistors $Q_1$-$Q_2$ (master) and $Q_{11}$-$Q_{12}$ (slave) are 3.3 µm × 90 nm, and those of holding transistors $Q_3$-$Q_6$ (master) and $Q_{13}$-$Q_{16}$ (slave) are 3 µm × 90 nm. The emitter currents are 5 mA when active. Load resistors $R_{L1}$-$R_{L4}$ are 45 $\Omega$, and substrate-shielded coils $L_1$-$L_4$ are 74 pH. The divider differential output amplitude is 225 mVp-p in the static mode, and drops to ~200 mVp-p in the dynamic mode.

#### 4.4 Dynastat prototype

A divide-by-8 prototype (ref. Fig. 4-5) was designed to benchmark the dynastat concept. An input buffer with on-chip terminations is followed by the dynastat divider and then by two ECL M/S D-type divide-by-two circuits. Two buffers amplify the clocks between the divider stages, and a 50-$\Omega$ output buffer drives the spectrum analyzer used to characterize the performance of the divider.



Fig. 4-5: Block diagram of the frequency divide-by-8 prototype.

The 0.46 mm<sup>2</sup> prototype (incl. bondpads, as in Fig. 4-6) is fabricated in IBM's 90 nm SiGe-BiCMOS 9HP technology [7]. The dynastat divider occupies $80 \times 90\ \mu\text{m}^2$ and consumes 38 mA operating in the static mode. DC current consumption drops to 19 mA (a 50% decrease) in the dynamic mode. The complete divide-by-8 testchip draws 160 mA from a +4.5 V supply.
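The currents above can be cross-checked against the DC power figures listed for the dynastat in Table 4-1. The following is a minimal sketch using only the supply voltage and mode currents quoted in the text; the function and variable names are illustrative.

```python
# Supply voltage and mode currents are taken from the text; names are illustrative.
V_SUPPLY = 4.5        # V, testchip supply
I_STATIC_MA = 38.0    # mA, dynastat divider in static mode
I_DYNAMIC_MA = 19.0   # mA, dynastat divider in dynamic mode


def dc_power_mw(current_ma: float, supply_v: float = V_SUPPLY) -> float:
    """DC power in mW for a current in mA drawn from a supply in volts."""
    return current_ma * supply_v


p_static = dc_power_mw(I_STATIC_MA)
p_dynamic = dc_power_mw(I_DYNAMIC_MA)
saving_pct = 100.0 * (1.0 - p_dynamic / p_static)

print(f"static mode:  {p_static:.1f} mW")
print(f"dynamic mode: {p_dynamic:.1f} mW")
print(f"power saving: {saving_pct:.0f} %")
```

The two results correspond to the 171 mW and 85.5 mW entries for the dynastat in the comparison table.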



Fig. 4-6: Dynastat prototype testchip micrograph.

The differential input clock used for testing in the 5 GHz - 50 GHz range was generated using a single-ended source and passive baluns. A V-band magic-T provided the differential clock from a single-ended signal from 50 GHz to 75 GHz. Evaluation of the dynastat by driving the input clock beyond the V-band was not possible because of equipment limitations. The self-oscillation frequency in each divider mode was measured at 78 GHz in the static mode, and 129 GHz in the dynamic mode. Excellent agreement between simulation and measurement below 75 GHz gives us confidence when predicting agreement between the simulations and measurements at higher frequencies.

The divider input signal sensitivity vs. frequency is plotted in Fig. 4-7. The static and dynamic modes of operation overlap (shaded in Fig. 4-7) for input signals between 85 GHz and 117 GHz and a maximum input swing of 200 mV<sub>pk</sub> differential. In the static mode, the minimum input amplitude required for division falls (i.e., the divider becomes more sensitive) as the input frequency approaches the self-oscillation frequency (SOF) of 78 GHz (measured). Agreement between measurement and simulation is very good across this range. Simulations predict that the required input amplitude rises in the static mode above $SOF_{static}$, whereas it falls in the dynamic mode across the same frequency range. For $f_{in}=109$ GHz the input sensitivities for the two modes are equal. The required amplitude in the dynamic mode continues to fall up to $SOF_{dyn}=129$ GHz (measured) and rises beyond $SOF_{dyn}$, reaching 200 mV-pk (differential) at $f_{in}=153$ GHz.

Fig. 4-7: Measured and simulated input sensitivity vs. frequency.

Fig. 4-8: Measured phase noise for a 60 GHz input.

The phase noise measured at the divide-by-8 output is plotted in Fig. 4-8. The output phase noise is 18 dB below that of the input signal (i.e., $20\log_{10}(N)$ in dB, where $N=8$).



Fig. 4-9: Phase noise difference between clock source and div-by-8.

Fig. 4-8 also shows the noise floor (-150 dBc) of the divider predicted from a periodic steady-state (PSS) simulation. The difference in phase noise between the input signal and the divide-by-8 output is plotted in Fig. 4-9. Simulation and measurement track the (ideal) 18 dB difference predicted from theory between input and output up to 10 MHz offset.
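The interplay between the ideal $20\log_{10}(N)$ improvement and the divider noise floor can be sketched numerically. The example below power-sums the ideally divided source noise with the floor; it assumes that the -150 dBc floor quoted above is a density in dBc/Hz, and the source phase-noise values are hypothetical and chosen only for illustration.

```python
import math


def divided_phase_noise(input_dbc_hz: float, n: int, floor_dbc_hz: float) -> float:
    """Ideal divide-by-n phase noise, power-summed with the divider's own floor."""
    ideal = input_dbc_hz - 20.0 * math.log10(n)
    total_lin = 10.0 ** (ideal / 10.0) + 10.0 ** (floor_dbc_hz / 10.0)
    return 10.0 * math.log10(total_lin)


N = 8
FLOOR = -150.0  # simulated divider floor from the text, assumed to be in dBc/Hz

# Hypothetical source phase-noise levels (dBc/Hz), not measured data.
for src in (-90.0, -120.0, -135.0):
    out = divided_phase_noise(src, N, FLOOR)
    print(f"source {src:7.1f} -> output {out:7.2f}, difference {src - out:5.2f} dB")
```

While the divided source noise lies well above the floor, the difference stays close to 18 dB; it shrinks once the divided noise approaches the floor.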

Performance comparable to other dividers (static and dynamic in Table 4-1) is realized by the dynastat circuit. The SiGe dividers of [8] and [11] are comparable in SOF and power consumption to the dynastat in static and dynamic modes, respectively. Example [9] reaches a higher frequency using higher speed (III-V) transistors, while the lower power CMOS dynamic divider [10] has an SOF less than one-half that of the dynastat in SiGe, and cannot divide inputs below 53.4 GHz with 0 dBm input power.

| Ref.            | Mode     | Self-osc. Freq., GHz | DC Power, mW | Technology         |
|-----------------|----------|----------------------|--------------|--------------------|
| Dynastat        | Stat/Dyn | 78/129               | 171/85.5     | 90-nm SiGe-BiCMOS  |
| [8], BCTM 2006  | Static   | 77                   | 122          | 130-nm SiGe-BiCMOS |
| [9], CSICS 2010 | Static   | 143                  | 592          | 250-nm InP HBT     |
| [10], JSSC 2013 | Dynamic  | 62                   | 2.9          | 65-nm CMOS         |
| [11], SiRF 2009 | Dynamic  | 136                  | 72.6         | 130-nm SiGe-BiCMOS |

Table 4-1: Divider performance comparison

#### 4.5 Low voltage dynastat divider

The dynastat frequency divider is capable of operating in static or dynamic modes [6], which gives it a wider operating frequency range overall compared to either static or dynamic divider circuits alone.

Lower supply voltages in current and future systems respond to the demand for lower power consumption. Technology scaling is reducing breakdown voltages in order to increase device speed, as mentioned in Chapter 1. Therefore, there is a need for circuit topologies consuming less power and operating from lower supply voltages.

Fig. 4-10: Low voltage dynastat frequency divider schematic.

In Fig. 4-10, a frequency divider capable of dual-mode operation is implemented using a single -2.5 V supply, i.e., a low-voltage dynastat divider. Bipolar differential pairs cascode NFET devices, which are switched by the differential input signal (CLK<sub>+</sub>-CLK<sub>-</sub>). The topology avoids the use of current sources biasing the latches. Transistors $M_1$ to $M_4$ conduct current $I_T$ when switched on; therefore, $I_T$ depends on the NFET area and the input signal common-mode voltage. The low-voltage dynastat circuit topology has two main disadvantages: 1) the common-mode bias at the base nodes of $Q_1$ and $Q_2$ must be regulated, and 2) the immunity to noise on the $V_{EE}$ supply is reduced.



Fig. 4-11: Shielded differential inductor layout.

Inductive peaking is used to extend the divider frequency response. The layout of the $37 \times 37\ \mu\text{m}^2$ differential inductors used to implement L<sub>1</sub>-L<sub>2</sub> and L<sub>3</sub>-L<sub>4</sub> is shown in Fig. 4-11. The inductance and k-factor are 130 pH and 0.45, respectively.

Referring to the dynastat divider of Fig. 4-10, the divider operates in the static mode with $V_{mode}$ biased at -2.5 V. When $V_{mode}$ is set to -1.5 V, transistors $M_2$ and $M_4$ are turned off and the divider operates in the dynamic mode. Inductors $L_1$ to $L_4$ suppress the effects of capacitive loading at the collector nodes of each stage in the divider.

A low impedance at the sources of $M_2$ and $M_4$ is essential for proper operation; therefore, a capacitance of ~200 pF is distributed across the layout to achieve a reactance of less than 1 $\Omega$ at 1 GHz. When $V_{mode}$ equals -1.5 V, $Q_{5,6}$ and $Q_{11,12}$ are turned off and the divider operates in the dynamic mode.
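The reactance figure follows from $|X_C| = 1/(2\pi f C)$. A minimal numerical check, using the 200 pF and 1 GHz values quoted above (the function name is illustrative):

```python
import math


def capacitive_reactance_ohm(freq_hz: float, cap_f: float) -> float:
    """Magnitude of the reactance of an ideal capacitor, in ohms."""
    return 1.0 / (2.0 * math.pi * freq_hz * cap_f)


x_c = capacitive_reactance_ohm(1e9, 200e-12)
print(f"|X_C| of 200 pF at 1 GHz: {x_c:.2f} ohm")
```

The result is roughly 0.8 $\Omega$ for an ideal capacitor, consistent with the sub-1 $\Omega$ target; parasitic series resistance and inductance of the distributed capacitor are not modelled here.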

The simulated sensitivity vs. frequency of the low-voltage dynastat for a sinusoidal input is plotted in Fig. 4-12 for both operation modes. The divider self-oscillation frequency (i.e., the frequency at which the required clock input amplitude is minimum) is 32 GHz in the static mode and 52 GHz in the dynamic mode.



(static and dynamic).

The maximum toggle frequency in dynamic mode is 56 GHz when driven by a clock signal with a differential amplitude of 400 mVp-p. The divider operates down to 10 MHz in static mode with a 275 mVp-p square-wave input.

The divider sensitivity at 40 GHz is 400 mVp-p in static mode, and degrades beyond 800 mVp-p in dynamic mode. Monte Carlo simulations predict a minimum required amplitude of 480 mVp-p in static mode due to process variation, but this requirement is relaxed by controlling the divider sensitivity using $V_{mode}$. When $V_{mode}$ is increased from -2.5 V, junction capacitances $C_{je}$ and $C_{jc}$ (of $Q_{5,6}$ and $Q_{11,12}$) decrease because their bias currents are reduced, shifting the clock sensitivity minimum to higher frequencies. Voltage $V_{mode}$ can therefore be adjusted to maximize the clock input sensitivity at 40 GHz (e.g., $V_{mode}$=-2.35 V brings the divider self-oscillation frequency to 40 GHz, as shown in Fig. 4-12).

As mentioned, $V_{mode}$ can be used to increase the input voltage sensitivity at frequencies in between the static and dynamic self-oscillation frequencies. Fig. 4-13 shows the simulated self-oscillation frequency vs. $V_{mode}$. Process, voltage, and temperature variations affect the sensitivity of the divider; however, the control implemented via $V_{mode}$ can be used to modify it after fabrication. The divider sensitivity was maximized at 40 GHz in [12] during the characterization of a 40-Gb/s $2^{11}$-1 PRBS circuit, which is presented in the next chapter.

Fig. 4-13: Self-oscillation frequency vs. control voltage V<sub>mode</sub>.

#### 4.6 Summary

A dual-mode, dynastat frequency divider capable of operation in static or dynamic mode was investigated. Two topologies implementing dual-mode frequency dividers were presented. A test-chip prototype implemented in 90-nm SiGe-BiCMOS was characterized. Dynamic operation enables a 65% increase in range beyond the static divider mode, while reducing power consumption of the divide-by-2 from 171 mW to 85.5 mW (-4.5 V supply). The 50% decrease in power consumption compared to the static mode is realized without affecting the supply voltage or increasing chip area of the divider. The second prototype is part of a built-in self-test circuit, designed to work from a single -2.5 V supply. The low-voltage dynastat divider is used in a BiST circuit for a 40-Gb/s optical modulator driver, and it is presented in the next chapter.

#### References

[1] Q. Lee, et al., "66 GHz static frequency divider in transferred-substrate HBT technology," IEEE-RFIC, Anaheim CA, pp. 87-90, June 1999.

[2] W. Fang, A. Brunnschweiler, P. Ashburn, "An analytical maximum toggle frequency expression and its application to optimizing high-speed ECL frequency dividers," *IEEE-JSSC*, vol. 25, no. 4, pp. 920-931, Aug. 1990.

[3] Kun-Hung Tsai; Jia-Hao Wu; Shen-Iuan Liu, "Frequency dividers with enhanced locking range," Radio Frequency Integrated Circuits Symposium, pp.661-664, April 2008.

[4] Z. Griffith, et al. "An Ultra Low-Power (<13.6 mW/latch) Static Frequency Divider in an InP/InGaAs DHBT Technology," Microwave Symposium Digest, 2006. IEEE MTT-S International , pp. 506-509, June 2006.

[5] R.L. Miller, "Fractional-Frequency Generators Utilizing Regenerative Modulation," Proceedings of the IRE, vol. 27, no. 7, pp.446-457, July 1939.

[6] L. Vera and J.R. Long, "A Dynastat Frequency Divider with DC-153 GHz Range," *Electronics Letters*, vol. 51, no. 12, pp. 908-910, June 2015.

[7] J.J. Pekarik, et al., "A 90nm SiGe BiCMOS Technology for mm-wave and High-Performance Analog Applications," *Proc. of the IEEE-BCTM*, San Diego CA, pp. 92-95, Oct. 2014.

[8] E. Laskin, et al., "Low-Power, Low-Phase Noise SiGe HBT Static Frequency Divider Topologies up to 100 GHz," *Proc. of the IEEE-BCTM*, Maastricht NL, pp.1-4, Oct. 2006.

[9] Z. Griffith, et al., "A 204.8GHz Static Divide-by-8 Frequency Divider in 250nm InP HBT," IEEE-CSICS, pp.1-4, Oct. 2010.

[10] C. Yue, H.C. Luong, "Analysis and Design of a 2.9-mW 53.4–79.4-GHz Frequency-Tracking Injection-Locked Frequency Divider in 65-nm CMOS," *IEEE-JSSC*, vol. 48, no.10, pp. 2403-2418, Oct. 2013.

[11] E. Laskin, A. Rylyakov, et al., "A 136-GHz Dynamic Divider in SiGe Technology," *IEEE-SiRF*, San Diego CA, pp. 1-4, Jan. 2009.

[12] L. Vera and J.R. Long, "A 40 Gb/s Low-Power 2<sup>11</sup>-1 PRBS with Distributed Clocking and Trigger Countdown Output," TCASii, vol., no., pp., 2015.

# **Part II** DIGITALLY-CONTROLLED DISTRIBUTED AMPLIFIER

# 5 Built-In Self-Test Circuit

A 40 Gb/s built-in self-test (BiST) generator for optical transmitter test and characterization designed in 0.13-µm SiGe-BiCMOS is described in this chapter. The sub-0.5-mm<sup>2</sup> area BiST produces a 2<sup>11</sup>-1 PRBS and a low-rate trigger output for verification of transmitter operation at data rates ranging from 10 Mb/s to 40 Gb/s. The PRBS generator is based on linear shift registers. This architecture is limited by the delay through the clock distribution network [1]. Higher speeds are realized by generating lower-rate identical sequences that are combined to construct a full-rate PRBS [2]. However, generators built from less than a quarter rate are unattractive because of their increased power consumption and complexity [3].

A half-rate clock scheme is used to implement the PRBS. Synthetic transmission lines distribute the clock reliably and the topology of the registers is modified to reduce the generator power consumption. A single DC supply (-2.5 V) is used in the PRBS design. The trigger output, required to synchronize test measurements performed off-chip with the half- or full-rate output streams produced by the data generator, is derived from the sequence using a countdown circuit.

The maximum operating frequency of a PRBS realized using a linear feedback shift register (LFSR) topology is analyzed next, assuming that the delay introduced by the XOR gate (which forms part of this topology) is equal to a register's delay.

#### 5.1 PRBS maximum operation frequency analysis

A block diagram of a master-slave D flip-flop register (negative edge clock) is shown in Fig. 5-1a, and its timing diagram including input signal $D_1$, clock CLK (with period $T_{CLK}$), and output signal $Q_1$ is shown in Fig. 5-1b. In the following analysis, the times for high-to-low and low-to-high transitions are assumed to be equal. Input $D_1$ must remain unchanged before and after the clock active edge for proper operation, and under this constraint the time constants $t_{setup\_min}$ and $t_{hold\_min}$ are defined. The minimum time the input signal must be stable before the clock active edge is $t_{setup\_min}$, and the minimum time the input signal must remain stable after the clock active edge is $t_{hold\_min}$. Furthermore, the register propagation delay ($t_{pd}$) is defined as the time the register takes to change its output after an active clock edge.

Fig. 5-1: Register based on master-slave D flip-flop.

Consider the four registers forming a closed loop in the data path (Fig. 5-2a). The registers have interleaved 1-0 inputs as initial conditions. The register clocks are distributed using a single line, which introduces a time difference between the clocks of the registers. The time difference between the clocks of two consecutive registers is represented by the clock delay  $t_d$ .

Correct operation of the registers in a closed loop is shown in Fig. 5-2b. The output of the last register is the input of the first register ($Q_4=D_1$). The time that the input signal of the first register is stable between falling clock edges is $t_{st} = T_{CLK} - t_{pd} - t_d \cdot (N_{reg} - 1)$, where $N_{reg}=4$ for this example. The PRBS operates properly if $t_{st} \ge t_{setup\_min}$. Increasing the clock frequency reduces $T_{CLK}$ and $t_{st}$. To reduce the effect of $t_d$ on $t_{st}$, the clock can be distributed using two paths, as shown in Fig. 5-3a. Then, the stable time becomes $t_{st} = T_{CLK} - t_{pd} - t_d \cdot (N_{reg}/2 - 1)$.

Fig. 5-2: Registers in a closed loop and their timing diagram.

Increasing the number of registers in the PRBS limits the maximum operating frequency due to $t_d$. However, if the clock is distributed as shown in Fig. 5-3b, the maximum operating frequency is independent of the number of registers. In this configuration $t_{st} = T_{CLK} - t_{pd} - t_d$. Then, the minimum clock period for proper operation is obtained when $t_{st} = t_{setup\_min}$ and the PRBS maximum clock frequency is

Fig. 5-3: Clock distribution options for registers in a closed loop.

$$f_{CLKmax} = (T_{CLKmin})^{-1} = (t_{setup\_min} + t_{pd} + t_d)^{-1} .$$
(1)
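The three clock-distribution cases discussed above can be compared with a short numerical sketch. The scheme labels and all timing values below are hypothetical and serve only to exercise the expressions for $t_{st}$ and equation (1); they are not extracted from the design.

```python
def stable_time(t_clk: float, t_pd: float, t_d: float, n_reg: int, scheme: str) -> float:
    """Time (s) the first register's input is stable between active clock edges."""
    if scheme == "single_line":      # one clock line through all registers
        skew = t_d * (n_reg - 1)
    elif scheme == "two_paths":      # clock split into two paths (Fig. 5-3a)
        skew = t_d * (n_reg / 2 - 1)
    elif scheme == "per_stage":      # distribution of Fig. 5-3b
        skew = t_d
    else:
        raise ValueError(f"unknown scheme: {scheme}")
    return t_clk - t_pd - skew


def f_clk_max(t_setup_min: float, t_pd: float, t_d: float) -> float:
    """Maximum clock frequency from equation (1), in Hz."""
    return 1.0 / (t_setup_min + t_pd + t_d)


# Hypothetical timing values, for illustration only.
T_CLK, T_PD, T_D, T_SETUP, N_REG = 50e-12, 10e-12, 1e-12, 5e-12, 4

for scheme in ("single_line", "two_paths", "per_stage"):
    t_st = stable_time(T_CLK, T_PD, T_D, N_REG, scheme)
    print(f"{scheme:12s}: t_st = {t_st * 1e12:.1f} ps, ok = {t_st >= T_SETUP}")

print(f"f_CLKmax = {f_clk_max(T_SETUP, T_PD, T_D) / 1e9:.1f} GHz")
```

With these values, only the single-line case loses stable time in proportion to the number of registers, which is the motivation for the distribution of Fig. 5-3b.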

The configuration in Fig. 5-3b is adopted in the implementation of the PRBS generator, which is shown in Fig. 5-4. It consists of 2 XOR gates and eleven registers running at one-half of the input clock rate. Output  $F_6$  and the XOR of outputs  $F_{10}$  and  $F_{11}$  (ref. Fig. 5-4) are used to produce a full-rate output that implements the polynomial:  $x^{11}+x^9+1$ .

The resulting  $2^{11}$ -1 sequence has 1024 transitions per cycle, with an equal number of 0-to-1 and 1-to-0 transitions (i.e., no DC content). Therefore, the trigger output is derived by counting down the half-rate sequence through 9 divide-by-two stages (divide-by-512 in total). The trigger rate is the clock divided by  $2 \cdot (2^{11}$ -1), or ~9.77 MHz for a 40 GHz clock.
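The sequence properties quoted above can be checked with a plain full-rate Fibonacci LFSR. This sketch is not the half-rate, multiplexed architecture of Fig. 5-4, and the tap convention (feedback taken from stages 11 and 9) and the all-ones seed are assumptions made for illustration.

```python
def prbs_period(taps=(11, 9), seed=0x7FF):
    """Return one full period of output bits from a Fibonacci LFSR.

    taps are 1-indexed register stages that are XORed to form the feedback bit.
    """
    n = max(taps)
    mask = (1 << n) - 1
    start = seed & mask
    state = start
    bits = []
    while True:
        bits.append((state >> (n - 1)) & 1)   # output taken from the last stage
        fb = 0
        for t in taps:
            fb ^= (state >> (t - 1)) & 1
        state = ((state << 1) | fb) & mask
        if state == start:
            return bits


seq = prbs_period()
pairs = list(zip(seq, seq[1:] + seq[:1]))     # cyclic neighbours
transitions = sum(a != b for a, b in pairs)
rising = sum(a == 0 and b == 1 for a, b in pairs)
trigger_hz = 40e9 / (2 * (2**11 - 1))         # 40 GHz clock, half-rate sequence

print(len(seq), transitions, rising)
print(f"trigger rate: {trigger_hz / 1e6:.2f} MHz")
```

The number of rising edges per period equals $2^9$, which is why nine divide-by-two stages yield one trigger cycle per sequence period.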

#### 5.2 Clock distribution

A block diagram of the PRBS generator is shown in Fig. 5-5. The external clock is buffered to a Dynastat divider [4] (described in Chapter 5, Section 5.4), which can operate in either static or dynamic mode for increased input frequency range. The divided clock outputs are used to generate half-rate, 2<sup>11</sup>-1 pseudo-random sequences. The phase of the timing clock driving the 2:1 MUX (used to construct the PRBS output) is aligned to the half-rate sequences by selecting one of 4 possible phases from the Dynastat divider (i.e., true and inverted I/Q). The reset function precludes an all-zero output sequence.



Fig. 5-4: 2<sup>11</sup>-1 PRBS generator with trigger and monitor outputs running from a half-rate clock.



The input clock is divided by 2 in frequency, buffered, split into two paths, and then distributed to the PRBS registers by differential pairs $Q_{1,2}$ and $Q_{3,4}$ (see Fig. 5-4). Each pair is loaded by a synthetic transmission line terminated by pull-up resistors (i.e., $R_{L1}$ to $R_{L4}$). Interconnect wiring realizes a series inductance (L) of 80 pH for each transmission line section between the registers. Fig. 5-6 shows details of the clock distribution. Symmetric inductors $L_1$ and $L_2$ are implemented using a 145 µm long, 2 µm wide top aluminum track (M7) over slotted metals M1 and M2 as a ground plane. The high magnetic coupling coefficient between vertical sections of the inductor layout (k=0.6) raises the inductance of each section, maximizing the inductance within the available area.



Fig. 5-6: Physical layout of the clock distribution between registers.

Differential shunt capacitance (C in Fig. 5-4) is provided by interconnect and transistor parasitics at the clock inputs of each register. The total capacitance per section is C=15 fF, yielding a characteristic impedance,  $Z_{line} = \sqrt{L/C} = 73\Omega$ , where L and C are the lumped equivalent differential inductance and capacitance per section, respectively (incl. clock input loading). The single-ended parasitic capacitance to ground at the flip-flop clock input is reduced from 19 to 10 fF by adding emitter followers Q<sub>1,2</sub> (see Fig. 5-7), at the cost of consuming 2.6 mA more current. The delay between consecutive stages is  $t_d \approx \sqrt{LC}$ .
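As a quick numerical check of the per-section values quoted above (L = 80 pH and C = 15 fF, both taken from the text), the line impedance and the stage-to-stage delay work out as follows. The delay value is derived here from the stated L and C; it is not quoted in the text.

$$
Z_{line} = \sqrt{\frac{L}{C}} = \sqrt{\frac{80\ \text{pH}}{15\ \text{fF}}} \approx 73\ \Omega, \qquad t_d \approx \sqrt{LC} = \sqrt{80\ \text{pH} \times 15\ \text{fF}} \approx 1.1\ \text{ps}
$$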

The common-mode voltage at the true clock input of registers  $F_1$ - $F_5$  is set by the sum of the voltages dropped across  $R_{L1}$  (set by current source  $M_3$  in Fig. 5-4), the gate-source voltage of  $M_5$ , and  $R_1$ · $I_{CTRL}$  (set by  $I_{CTRL}$ ). Biasing at the other clock inputs is realized similarly.

#### 5.3 Shift register design

Each register consists of series-connected BiCMOS flip-flops. The flip-flops themselves are comprised of emitter-coupled bipolar logic cascoded onto clocked CMOS pairs operating from a -2.5 V supply (see Fig. 5-7). Push-pull current steering logic permits fast clocking from a low supply voltage (i.e., diff pairs  $M_1$  to  $M_4$ )



Fig. 5-7: D-type flip-flop register schematic.

without the headroom consumed by bias current sources used in [5]. A 400 mV<sub>p-p</sub> input clock is required to switch M<sub>1</sub> to M<sub>4</sub> fully across the anticipated process (best to worst cases), supply voltage (-2.3 to -2.7 V) and temperature ranges (25 to 85 °C). The clock common-mode voltage is controlled (using an on-chip DAC) to set the drain currents of M<sub>1-4</sub> near zero when turned "off", and to current I<sub>T</sub> when biased "on" across PVT variations. The steered current defines the desired output logic swing, I<sub>T</sub>R<sub>1</sub>=400 mV. Transistors M<sub>1-4</sub> are minimum length devices biased at the current density yielding peak f<sub>T</sub> (J<sub>NMOS</sub>=2.5 mA/µm<sup>2</sup>) when conducting I<sub>T</sub>. Their width is optimized to realize maximum switching speed in simulation.

The bipolars in the flip-flop are of minimum emitter width and optimized in length for the fastest switching times. The optimized area carries ~1.5 times the current density for peak transistor  $f_T$  when conducting  $I_T$  (i.e.,  $J_{NPN}=15 \text{ mA}/\mu\text{m}^2$ ).

Removing the bias sources normally used in CML reduces noise immunity to the  $V_{EE}$  supply. Retiming the full-rate sequence in the output MUX (ref. Fig. 5-5) is required to remove output jitter caused by such noise sources.

#### 5.4 XOR gate design

The low-voltage XOR gate (shown in Fig. 5-8) requires a differential input voltage >  $4V_T$ . Inductive peaking of the XOR load is used to widen its bandwidth to 80 GHz. The 37 x 37  $\mu$ m<sup>2</sup> layout of differential inductors L<sub>1</sub>-L<sub>2</sub> and L<sub>3</sub>-L<sub>4</sub> is shown in Fig. 5-9. The inductor self-resonant frequency is 121 GHz, and the inductance and k-factor are 130 pH and 0.45, respectively.

The propagation delays from the inputs to either output of the XOR are identical. Resetting XOR1 (Fig. 5-8) at start-up prevents an all-zero sequence from propagating. Transistors $Q_{19}$, $Q_{20}$ and $Q_{21}$ sink current from the tail current sources when the RESET input is 0 V, and set the XOR outputs to a logical 1. The XOR gate consumes 12 mA from a -2.5 V supply.



Fig. 5-8: XOR gate with reset schematic.



Fig. 5-9: Shield differential inductor layout.

### 5.5 Output MUX

The MUX of Fig. 5-10 interleaves the half-rate sequences from  $F_6$  and the XOR of outputs  $F_{10}$  and  $F_{11}$  (from Fig. 5-4) to generate the full-rate PRBS output (Q). Buffer BUF2 minimizes the loading on  $F_6$  and matches the propagation delay of



Fig. 5-10: 2:1 multiplexer schematic.

the path via XOR2 in order to maximize the output speed. The clock driving transistors  $M_5$  and  $M_6$  (ref. Fig. 5-10) select the desired input, which must be well defined and stable during each half clock period (i.e., no bit transitions). One of the four output phases from the Dynastat divider (buffered by followers  $Q_1-Q_2$  and  $Q_9-Q_{10}$ ) is selected to align the MUX clock and input sequences. Bits b1 to b4 select one of the 4 clock phases available via  $M_1$  to  $M_4$  for optimal clocking of the MUX.
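The interleaving principle can be illustrated in software. The sketch below is a simplified model under stated assumptions, not a description of Fig. 5-4: it generates a serial 2<sup>11</sup>-1 sequence with a hypothetical helper `prbs11_period`, forms a half-rate stream by taking every second bit, and then interleaves that stream with a copy of itself offset by 1024 half-rate bits. The 1024-bit offset follows from the sequence length (2·1024 = 2048, which is 1 modulo 2047); it is not taken from the tap positions $F_6$, $F_{10}$ and $F_{11}$ used in the circuit.

```python
N = 2**11 - 1  # sequence length in bits


def prbs11_period(seed=0x7FF):
    """One period of a serial PRBS (assumed feedback: stage 11 XOR stage 9)."""
    state = seed & 0x7FF
    bits = []
    for _ in range(N):
        bits.append((state >> 10) & 1)
        feedback = ((state >> 10) ^ (state >> 8)) & 1
        state = ((state << 1) | feedback) & 0x7FF
    return bits


def half_rate_stream(full):
    """Every second bit of the cyclic full-rate sequence (N bits long)."""
    return [full[(2 * k) % N] for k in range(N)]


def mux_2to1(a, b):
    """Interleave two half-rate streams: a on one clock phase, b on the other."""
    out = []
    for bit_a, bit_b in zip(a, b):
        out.extend((bit_a, bit_b))
    return out


def is_cyclic_shift(x, y):
    """True if list y is a rotation of list x."""
    doubled = x + x
    return len(x) == len(y) and any(
        doubled[i:i + len(x)] == y for i in range(len(x))
    )


full = prbs11_period()
half = half_rate_stream(full)
offset = (N + 1) // 2                         # 1024 half-rate bits
half_delayed = half[offset:] + half[:offset]  # same stream, shifted in time
full_from_mux = mux_2to1(half, half_delayed)  # 2N bits = two full-rate periods
```

In this model the half-rate stream is itself a shifted copy of the same pseudo-random sequence, and the multiplexed output reproduces the full-rate sequence over two consecutive periods.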

### 5.6 PRBS characterization

The PRBS was fabricated in IBM's 0.13-µm SiGe-BiCMOS 8HP technology [6]. The frequency divider in Fig. 5-5 is the low-voltage dynastat divider



Fig. 5-11: Photomicrograph of integrated BiST block.

described in Chapter 4 (Section 4.5). The PRBS is designed as a built-in self-test (BiST) function for the optical modulator driver presented in the next chapter. It occupies an active area of 0.49 mm<sup>2</sup> (Fig. 5-11), and it includes the clock divider and all components shown in Fig. 5-4. A single -2.5 V source supplies DC current to the Dynastat divider (12 mA), buffers (52 mA), registers (11 x 14 mA), XOR gates (2 x 12 mA) and MUX (8 mA), for a total of 250 mA. The trigger countdown (9 divide-by-2 stages plus output buffer) consumes 220 mA in total. The output buffer drives a 100-$\Omega$ differential off-chip load at 380 mV<sub>p-p</sub> (differential).
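The supply current budget quoted above can be tallied directly. The power figures below are derived here from the stated currents and the -2.5 V supply; the 625 mW result matches the entry for this work in Table 5-1.

$$
I_{PRBS} = 12 + 52 + (11 \times 14) + (2 \times 12) + 8 = 250\ \text{mA}, \qquad P_{PRBS} = 250\ \text{mA} \times 2.5\ \text{V} = 625\ \text{mW}
$$

$$
P_{trigger} = 220\ \text{mA} \times 2.5\ \text{V} = 550\ \text{mW}
$$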

The PRBS output for a half-rate clock (PRBS<sub>HALF</sub> in Fig. 5-4) is buffered to ground (G) and signal (S) pads configured GSGSG for probing on-wafer. The half-rate (20-Gb/s) output sequence shown in Fig. 5-12 was measured using a 40-GHz input clock and Tektronix 60-GHz (80E09) sampling modules. The output amplitude agrees with the 380 mV<sub>p-p</sub> (differential) predicted from simulation. The TDS-8000B digital sampling oscilloscope was synchronized by the 9.77-MHz trigger output.



Fig. 5-12: Measured PRBS half-rate output sequence vs. time.

The eye pattern measured for a single-ended output is shown in Fig. 5-13. The 0-1 and 1-0 transitions give an open, almost symmetric eye (50% duty cycle). Ringing at the 1-to-0 transitions is attributed to two sources. A portion arises from the second harmonic present in the single-ended output voltage. The larger contribution is caused by ringing from the capacitive loading of the emitter followers used in the register and output buffer stages. The ringing could be reduced by decreasing the extrinsic base resistance of the follower transistors (i.e., increasing their emitter areas), at the cost of greater current consumption.



Fig. 5-13: Measured PRBS half-rate eye diagram.



Fig. 5-14: Half-rate PRBS measured output spectrum (40 GHz clock).

The frequency spectrum measured at the half-rate PRBS output is shown in Fig. 5-14. The notch at 20 GHz is characteristic of a pseudo-random sequence at 20 Gb/s. The spectrum consists of tones with $f_{clk}/(n(2^{11}-1)) = 9.77$ MHz tone spacing for the half-rate sequence (n=2) generated by the 40-GHz input clock used for this test. The tone spacing for two tones around 1 GHz is shown in Fig. 5-15.

The PRBS generator developed in this work is compared to PRBS generators from the recent literature, see Table 5-1. Reference [1] implements an 80-Gb/s, half-rate PRBS that consumes 1 W from a 3.3-V supply, but does not have a trigger output or a frequency divider. Both are generally required in a BiST circuit. The



Fig. 5-15: PRBS half-rate measured discrete tones (40 GHz clock).

half-rate PRBS developed in this work consumes 53 mA less from a -2.5 V supply. Reference [7] is implemented in InP, and it uses differential transmission lines to distribute the clock in a half-rate clock scheme. Due to its short sequence length $(2^{7}-1)$, the minimum data rate is 457 Mb/s compared to 19.5 Mb/s for the SiGe-PRBS prototype. The InP example does not include a frequency divider and consumes roughly 3x the power. Ref. [8] is a quarter-rate PRBS with only one on-chip frequency divide-by-two, and it requires another divide-by-two before it can be used as a BiST. It has a $2^{31}$-1 sequence and can operate up to 80 Gb/s (i.e., using a 40 GHz clock). The CMOS-bipolar cascodes used for logic and current sources require a 3.3 V supply, and it consumes 9.8 W, or 15x the power dissipation of the prototype reported here. In [9] the supply voltage is lowered to 2.5 V by avoiding cascoded devices in the register, MUX, and XOR logic blocks, but the reduction in voltage demands higher current from the supply. It realizes a power consumption comparable to our new design, but it outputs a shorter sequence (i.e., $2^{7}$-1 vs. $2^{11}$-1), and it does not include a frequency divider or a trigger output.
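The minimum data rates and power ratios in this comparison are not derived in the text. They appear to follow from dividing each maximum bit rate by the sequence length, and from the power column of Table 5-1, respectively. Under that assumption:

$$
\frac{58\ \text{Gb/s}}{2^{7}-1} \approx 457\ \text{Mb/s}, \qquad \frac{40\ \text{Gb/s}}{2^{11}-1} \approx 19.5\ \text{Mb/s}
$$

$$
\frac{1750\ \text{mW}}{625\ \text{mW}} = 2.8, \qquad \frac{9800\ \text{mW}}{625\ \text{mW}} \approx 15.7
$$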

| Ref.              | Length /<br>Max. bit rate (b/s) | Divider / MUX<br>rate scheme | DC current (mA) /<br>V<sub>supply</sub> (V) /<br>Power (mW) | Technology /<br>f<sub>T</sub>/f<sub>max</sub> (GHz) |
|-------------------|---------------------------------|------------------------------|-------------------------------------------------------------|-----------------------------------------------------|
| This work         | 2<sup>11</sup>-1 /<br>40G       | yes / yes<br>half-rate       | 250 / -2.5 / 625                                            | 130nm SiGe-BiCMOS /<br>200/280                      |
| EuMIC<br>2014 [1] | 2<sup>11</sup>-1 /<br>80G       | no / yes<br>half-rate        | 303 / 3.3 / 1000                                            | 350nm SiGe-Bipolar /<br>200/250                     |
| ESSCIRC<br>2004 [7] | 2<sup>7</sup>-1 /<br>58G      | no / yes<br>half-rate        | 500 / 3.5 / 1750                                            | 1µm InP HBT /<br>170/-                              |
| JSSC<br>2005 [8]  | 2<sup>31</sup>-1 /<br>80G       | single / yes<br>quad-rate    | 2970 / 3.3 / 9800                                           | 130nm SiGe-BiCMOS /<br>150/150                      |
| JSSC<br>2006 [9]  | 2<sup>7</sup>-1 /<br>55G        | no / yes                     | 220 / 2.5 / 550                                             | 120nm SiGe-BiCMOS /<br>120/-                        |

 Table 5-1: PRBS performance comparison

#### 5.7 Summary

A low-power, 2<sup>11</sup>-1 PRBS generator with synthetic transmission line clock distribution and a low-frequency trigger output has been demonstrated in a 130-nm SiGe-BiCMOS technology. The PRBS cell consists of a dual-mode frequency divider, a linear feedback shift register running from a half-rate clock, and a 2:1 interleaving output MUX. Synthetic transmission lines embed the input capacitance between register stages in the PRBS layout in order to maximize the output data rate at the lowest power consumption. The PRBS generator consumes a total of 250 mA, while the trigger countdown circuit consumes 220 mA, both from a single -2.5-V supply.

#### References

[1] A. Gharib, A. Talai, R. Weigel, et al., "A 1.16 pJ/bit 80 Gb/s 2<sup>11</sup>-1 PRBS generator in SiGe bipolar technology," Proc. of EuMIC, pp. 277-280, Oct. 2014.

[2] F. Sinnesbichler, et al., "Generation of high-speed pseudorandom sequences using multiplex-techniques," *IEEE Transactions on Microwave Theory and Techniques*, vol. 44, no. 12, pp. 2738-2742, Dec. 1996.

[3] E. Laskin, S.P. Voinigescu, "A 60 mW per lane,  $4 \times 23$ -Gb/s  $2^7$  PRBS generator," Proc. of CSICS, pp. 3, Oct. 2005.

[4] L. Vera and J.R. Long, "A Dynastat Frequency Divider with DC-153 GHz Range," *Electronics Letters*, vol. 51, no. 12, pp. 908-910, June 2015.

[5] T.O. Dickson, R. Beerkens, S.P. Voinigescu, "A 2.5-V 45-Gb/s decision circuit using SiGe BiCMOS logic," *IEEE Journal of Solid-State Circuits*, vol. 40, no. 4, pp. 994-1003, April 2005.

[6] B.A. Orner, Q.Z. Liu, B. Rainey, et al., "A 0.13 μm BiCMOS technology featuring a 200/280 GHz (fT/fmax) SiGe HBT," Proc. of the IEEE-BCTM, Toulouse, France. pp. 203-206, Sept. 2003.

[7] H. Veenstra, "1–58 Gb/s PRBS generator with <1.1 ps RMS jitter in InP technology," Proc. of ESSCIRC, pp. 359–362, Sept. 2004.

[8] T.O. Dickson, E. Laskin, I. Khalid, et al., "An 80-Gb/s 2<sup>31</sup>-1 pseudorandom binary sequence generator in SiGe BiCMOS technology," *IEEE Journal of Solid-State Circuits*, vol. 40, no. 12, pp. 2735–2745, Dec. 2005.

[9] D. Kucharski, K.T. Kornegay, "2.5 V 43-45 Gb/s CDR Circuit and 55 Gb/s PRBS Generator in SiGe Using a Low-Voltage Logic Family," *IEEE Journal of Solid-State Circuits*, vol. 41, no. 9, pp. 2154-2165, Sept. 2006.

# 6 Digitally-Controlled Distributed Amplifier

In this chapter, the design of a Mach-Zehnder modulator (MZM) driver using a digitally-controlled distributed amplifier (DA) is presented. It is designed and fabricated to produce  $6-V_{p-p}$  differential output swing at 40-Gb/s (6 ps rise/fall times) in a production 0.13-µm SiGe-BiCMOS technology. Innovations developed for the 40-Gb/s MZM driver include: 1) built-in calibration (BiC) of the digital input line facilitated by an on-chip energy detector and a 3-step calibration algorithm, 2) a shielded output line in standard aluminum top metal (not copper), 3) 10% less power consumption despite the 4x increase in data rate, 4) wideband operation across 28-48 Gb/s data rates enabled by a digitally-controlled clock phase generator, and 5) an on-chip 1-0 data sequence and a  $2^{11}$ -1 PRBS data source (presented in Chapter 5) used for built-in self-test (BiST), calibration and characterization.

# 6.1 Digitally-controlled distributed amplifier

A block diagram of the driver prototype is shown in Fig. 6-1. A -2.5-V DC source supplies all circuit blocks in the driver. A second +5-V supply biases the output stages via two 50-$\Omega$ metal-film, on-chip back-termination resistors seen at the upper right in Fig. 6-1. The high-speed clock input is terminated in 50 $\Omega$, and buffered to 3 circuit blocks: an injection-locked quadrature oscillator (ILO via CB<sub>1</sub>), the input data retiming flip-flop F<sub>1</sub> via CB<sub>2</sub>, and the Dynastat clock divide-by-two circuit [1]. The divider outputs a square wave (i.e., a 1-0 pattern) synchronized to the clock, and also drives a pseudo-random-bit-sequence (PRBS) generator. Thus, 3 data sources are possible: 1) external data via a differential input with 60-mV<sub>p-p</sub>



Fig. 6-1: 40 Gb/s MZ modulator driver block diagram.

sensitivity, 2) an on-chip 1-0 pattern for calibration purposes, and 3) a 2<sup>11</sup>-1 length PRBS generator for test and characterization at 40-Gb/s.

The data source is selected via multiplexer $M_1$ and retimed by $F_1$. Full retiming provides greater margin against variations in processing, supply voltage, and temperature (PVT) at the cost of a slight increase in power consumption (i.e., 8.5 mA consumed by the latch). The retimed data is buffered (DB<sub>2</sub> to DB<sub>4</sub>) and distributed to latches L<sub>1</sub> to L<sub>3</sub>. Each latch output drives a limiting amplifier stage in the distributed amplifier with a replica of the input signal timed to match the phase of the signal traveling along the output line. Three DA stages load the 730-µm long output TL, which is terminated on-chip by 50-$\Omega$ metal film resistors that are AC-decoupled to ground by $C_D$. An energy detector circuit connected to the output measures the mean-square amplitude of the output signal. It is used during calibration of the driver to optimize the output amplitude and thereby minimize rise/fall times.

Quadrature (I/Q) clock streams are generated by the 2-stage ILO when injection locked to the input clock. The I/Q clock outputs from the ILO are distributed to the retiming synchronization circuits via buffers CB<sub>4</sub>-CB<sub>7</sub>, shown in Fig. 6-1. The synchronization (sync) circuit sets the phase of the clock used to retime the data input to each stage of the DA. Each clock synchronizer (see Fig. 6-2) consists of an inverter driving a vector summing phase shifter biased by 2 current DACs addressed by control bits $b3_n-b7_n$. Summation of the quadrature vectors input to the phase shifter produces a clock output with a phase range of 90 degrees (i.e., 1 Cartesian quadrant). This is extended to a full 360° of range by selecting a Cartesian quadrant (i.e., differential I or $\overline{I}$, and differential Q or $\overline{Q}$) using bits $b1_n$ and $b2_n$ to control the outputs of the 2 inverters driven by differential, quadrature clocks I and Q, respectively. Thus, a 7-bit word controls each phase shifter, with the two least significant bits (LSBs) controlling the inverters (i.e., 0° or 180° selection of differential I and Q). The resolution of the vector summer determines the accuracy when attempting to match the time delay between stages of the DA along the output line. Matching the delay between stages to $3.0 \pm 0.3$ ps per stage (i.e., within 10 %



Fig. 6-2: Timing phase control for individual clocks, n=1,2,3.

error) is possible when the minimum phase step is less than  $4.3^{\circ}$  for a 40-GHz clock. A 5-bit current DAC biasing the vector summer has a resolution of  $90^{\circ} \div 2^{5} = 2.8^{\circ}$ , which corresponds to a 200-fs change in delay time per step at a 40-GHz clock. Note that jitter in the DA output signal is determined by the I/Q clock jitter (transferred to the DA output via retiming clocks CLK<sub>1</sub> to CLK<sub>3</sub> from the synchronizers) and any jitter added by the data retiming circuitry.
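The relationship between phase resolution and timing resolution used above can be sketched as follows. This is a minimal illustration that assumes an ideal vector summer with uniform phase steps across its 90° span; the function names are hypothetical.

```python
def phase_to_delay_s(phase_deg, f_clk_hz):
    """Time shift equivalent to a phase shift at the clock frequency."""
    return (phase_deg / 360.0) / f_clk_hz


def delay_to_phase_deg(delay_s, f_clk_hz):
    """Phase shift equivalent to a time shift at the clock frequency."""
    return 360.0 * delay_s * f_clk_hz


def summer_step_deg(n_bits, span_deg=90.0):
    """Ideal phase step of a vector summer covering span_deg with n_bits."""
    return span_deg / 2**n_bits


if __name__ == "__main__":
    f_clk = 40e9
    max_step = delay_to_phase_deg(0.3e-12, f_clk)  # +/-0.3 ps tolerance
    step = summer_step_deg(5)                      # 5-bit current DAC
    step_fs = phase_to_delay_s(step, f_clk) * 1e15
    print(f"largest allowed step: {max_step:.2f} deg")
    print(f"5-bit step: {step:.2f} deg, or {step_fs:.0f} fs")
```

Under these assumptions the 0.3-ps tolerance corresponds to 4.32° at 40 GHz, and the 5-bit step of 2.81° corresponds to about 195 fs, which the text rounds to 200 fs.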

#### 6.1.1 Output transmission line and back-termination resistor

The 0.13- $\mu$ m SiGe-BiCMOS technology used to implement the driver prototype has 7 levels of metal available for interconnections, comprised of 5 (thin) copper layers beneath 2 (thick) aluminium metals at the top of the stack. The 4- $\mu$ m thick topmetal has the lowest parasitic capacitance to the substrate and is therefore used to implement the output transmission line (TL). Attenuation across the output line is further minimized by implementing a floating metal shield [2] in the second-level copper layer (M2). First metal (i.e., M1, below the shield) is reserved for supply and low-frequency circuit wiring. The output TL for the DA is synthesized in L-C-L lumped-element sections, where the series inductance (L) is derived from a (differential) 2-wire topmetal line, and the capacitance (C) arises from parasitics of each limiting amplifier (e.g., LA<sub>1</sub> to LA<sub>3</sub> in Fig. 6-1) and the substrate shield. The capacitive loading of each transistor on the output line is C<sub>LA</sub>=46 fF.

The physical layout for each TL section connecting consecutive stages of the DA is shown in Fig. 6-3. The 11- $\mu$ m wide (W), 4- $\mu$ m thick (T) aluminum topmetal transmission lines are separated by 200  $\mu$ m (G). The cross-sectional area is chosen to satisfy DC and RMS current restrictions imposed by electromigration requirements for the SiGe-8HP technology (i.e., 6 mA/ $\mu$ m-width of topmetal) at 6-V<sub>p-p</sub> differential output swing across dual 50  $\Omega$  loads. The section length is 215  $\mu$ m (D), yielding a self-inductance (L<sub>line</sub>) of 160 pH and parasitic capacitance (C<sub>line</sub>) of 18 fF for two topmetal lines. Each limiting amplifier connects to the output TL at the center of the



Fig. 6-3: Cross-section of top metal (AM) output line and M2 substrate shield.

section (i.e., 107.5 µm from either end), forming an L-C-L topology network. The characteristic impedance (Z<sub>0</sub>) synthesized by the topmetal interconnect and the limiting amplifier combined is $Z_0 = \sqrt{L_{line}/(C_{line} + C_{LA})}$, or 50 $\Omega$ for this design. Note that the transmission line sections connecting the back-termination resistors and the output pads to the TL are extended by 42.5 µm to a total length of 149.5 µm. This adds approximately 32 pH of self-inductance to these sections, which is used to compensate for the capacitive load of 13 fF added by the back-terminations, bondpads, and associated wiring.
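Substituting the section values given above into the impedance expression confirms the 50-$\Omega$ result. The second line is only an interpretation: it treats the added 32 pH and the 13-fF load as one extra L-C section, which the text does not state explicitly.

$$
Z_0 = \sqrt{\frac{L_{line}}{C_{line} + C_{LA}}} = \sqrt{\frac{160\ \text{pH}}{18\ \text{fF} + 46\ \text{fF}}} = 50\ \Omega
$$

$$
\sqrt{\frac{32\ \text{pH}}{13\ \text{fF}}} \approx 49.6\ \Omega
$$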

The silicon substrate shield (see Fig. 6-3) is comprised of metal-2 (M2) fingers that are floating (i.e., not connected to any other conductors). The M2 shield fingers reside 11.8  $\mu$ m below the balanced, output transmission line, and shield the top conductors from the substrate by electric induction [2]. Digital control signals wired in metal-1 (M1) are therefore shielded from the differential output signal. Each

shield finger is 0.32-µm thick (t), 220-µm long (d), and 2-µm wide (w), with a 1-µm space between fingers (s). An opening in the shield of 37 µm (g in Fig. 6-3) facilitates connections between the limiting amplifiers and the output line in the layout. Unwanted common-mode components in the output signal penetrate the floating shield and are attenuated by the resistive substrate. Large-signal simulations of the limiting amplifiers connected to the output line predict a time delay of 3 ps between consecutive output stages.

The 50-$\Omega$ back-terminations are tantalum nitride (TaN) thin-film resistors. They feed DC bias current from the +5 V supply to each DA cell. The desired differential output swing of 6 V<sub>p-p</sub> requires switching 120 mA of current between the 50-$\Omega$ loads at the differential outputs. This current is divided equally between the output stages, and therefore each stage switches 40 mA. Each back-termination resistor conducts 60 mA of DC current continuously in normal operation. Given the current limit of 0.5 mA/µm and sheet resistance of 60 $\Omega$/sq. for the TaN film, each back-termination resistor is therefore sized at 120 µm in width and 100 µm in length. The 13-fF parasitic capacitance to substrate of the back-termination is compensated by 32 pH of self-inductance realized by extending the output transmission line to the first limiting amplifier stage.
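The resistor sizing described above can be reproduced with a short calculation. This is a sketch under stated assumptions: the function names are hypothetical, and the switched-current helper assumes that each output sees the on-chip 50-Ω resistor in parallel with an equal off-chip load.

```python
def resistor_width_um(i_dc_ma, j_max_ma_per_um):
    """Minimum film width that keeps the DC current within the current limit."""
    return i_dc_ma / j_max_ma_per_um


def resistor_length_um(r_ohm, r_sheet_ohm_sq, width_um):
    """Length giving r_ohm for a film of the stated sheet resistance and width."""
    return (r_ohm / r_sheet_ohm_sq) * width_um


def switched_current_ma(v_swing_se, r_back_ohm, r_load_ohm):
    """Current needed for a single-ended swing of v_swing_se volts.

    Assumption: the back termination and the external load appear in
    parallel at each output of the driver.
    """
    r_parallel = r_back_ohm * r_load_ohm / (r_back_ohm + r_load_ohm)
    return 1e3 * v_swing_se / r_parallel


if __name__ == "__main__":
    width = resistor_width_um(60.0, 0.5)            # 60 mA DC, 0.5 mA/um limit
    length = resistor_length_um(50.0, 60.0, width)  # 50 ohm at 60 ohm/sq.
    total = switched_current_ma(3.0, 50.0, 50.0)    # 6 V p-p differential
    print(f"width = {width:.0f} um, length = {length:.0f} um")
    print(f"switched current = {total:.0f} mA, {total / 3:.0f} mA per stage")
```

With the values from the text, the sketch returns a 120 µm by 100 µm resistor and 120 mA of switched current, or 40 mA for each of the three stages.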

### 6.1.2 Latch and limiting amplifier

Schematics of the latch, pre-driver and limiting amplifier stages are shown in Fig. 6-4. Bipolar logic in the data path is cascoded onto push-pull CMOS pairs  $M_1$  and  $M_2$  for latches  $L_1$  to  $L_3$  (from Fig. 6-1). A supply voltage of -2.5 V may be used, because headroom for a current source to bias the CMOS current steering pair is not required. Eliminating the tail current source normally used in a differential pair also permits faster clocking because a significant source of parasitic capacitance is eliminated. However, a disadvantage of eliminating the tail current source is that the common-mode bias at the gates of  $M_1$  and  $M_2$  must be regulated. Therefore  $M_1$  and



Fig. 6-4: Latch, pre-driver and limiting amplifier schematic.

 $M_2$  are biased at the gate (i.e.,  $V_{GS}$ ) to conduct a current ( $I_T$ ) of 4 mA when switched "on", and near zero when turned "off". The gate bias voltage is set by the clock buffer preceding the latch ( $CB_n$  in Fig. 6-2). The buffer common-mode output voltage is controlled using an on-chip DAC to set the voltage drop in  $R_9$  (see Fig. 6-5).

A 400 mV<sub>p-p</sub> clock drives each latch input (see Fig. 6-4). Simulations predict full switching of transistors  $M_1$  and  $M_2$  across the anticipated process (best and worst case), supply voltage (-2.3 to -2.7 V) and temperature variations (0 to 85 °C) for this clock amplitude. The gate width of 5.2 µm at minimum gate length (0.13 µm) for the FETs biases the transistors at peak  $f_T$ , and yields the fastest switching time in simulations. Bipolar transistors  $Q_3$  to  $Q_6$  (see Fig. 6-4) are 0.13-µm wide (W), 2-µm in length (L), and biasing at 4 mA (I<sub>T</sub>) maximizes switching performance. The current density selected for the npn corresponds to 1.5 times the current density yielding peak  $f_T$ . The 400-mV<sub>p-p</sub> output voltage of the latch is determined by the voltage drop across 100- $\Omega$  polyresistors  $R_1$  and  $R_2$  when conducting the full current, I<sub>T</sub>.

Emitter followers  $Q_7$  and  $Q_8$  (W = 0.13  $\mu$ m, L = 6  $\mu$ m) buffering the latch outputs are biased at peak f<sub>T</sub> (8 mA). They drive a differential limiting amplifier pair

formed by Q<sub>9</sub> and Q<sub>10</sub> (W = 0.13 µm, L = 9.5 µm). Emitter followers Q<sub>11,12</sub> are 0.13 µm by 12 µm long transistors biased at peak f<sub>T</sub> (17 mA). The driving amplifier transistors Q<sub>13</sub> to Q<sub>16</sub> are 2 x 0.13 x 9 µm<sup>2</sup> in area and switch 40 mA of current. Cascode stage Q<sub>15,16</sub> reduces the Miller effect seen at the input of the differential driver, and increases the output impedance of the amplifier (i.e., larger r<sub>out</sub> and smaller parasitic capacitance) to reduce loading of the output TL. The maximum voltage swing at the output before avalanche breakdown of Q<sub>15,16</sub> is 6 V (BV<sub>CBO</sub>) because of the low impedance between the base terminal and ground for the common-base transistors in the cascode [3]. The large BV<sub>CBO</sub> value suggests that an output voltage swing beyond 3 V<sub>p-p</sub> (single-ended) is possible. However, transistor beta degrades [4] as the base-collector swing approaches BV<sub>CBO</sub>, which may affect reliability. In the absence of measured data, a larger margin was adopted for this design. A study of reliability and the maximum output swing tolerable for the limiting amplifier stages of the driver is an area for future work.
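The bias points quoted for the latch and limiting amplifier transistors can be compared by converting each to a current density. This is a rough sketch that assumes the drawn emitter dimensions given in the text equal the effective emitter area, and that "2 x 0.13 x 9" denotes two emitter stripes; the device labels and the helper name are illustrative.

```python
def current_density(i_ma, emitter_w_um, emitter_l_um, n_emitters=1):
    """Collector current per unit emitter area, in mA/um^2."""
    return i_ma / (n_emitters * emitter_w_um * emitter_l_um)


# (current in mA, emitter width in um, emitter length in um, number of stripes)
devices = {
    "Q3-Q6 (latch)": (4.0, 0.13, 2.0, 1),
    "Q7,Q8 (followers)": (8.0, 0.13, 6.0, 1),
    "Q11,Q12 (followers)": (17.0, 0.13, 12.0, 1),
    "Q13-Q16 (driver, conducting)": (40.0, 0.13, 9.0, 2),
}

if __name__ == "__main__":
    for name, args in devices.items():
        print(f"{name}: {current_density(*args):.1f} mA/um^2")
```

Under these assumptions the followers sit near 10 to 11 mA/µm<sup>2</sup> and the latch transistors near 15 mA/µm<sup>2</sup>, a ratio of about 1.5 that is consistent with the bias points stated in the text.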

Each output stage (LA<sub>1</sub> to LA<sub>3</sub> in Fig. 6-1) may be turned on/off independently using control voltages $V_{IO1}$ to $V_{IO3}$. Control voltage $V_{IO}$ steers bias currents from sources $M_{E1}$ to $M_{E4}$ away from the followers in the latch using transistors $Q_{1b,2b}$ and $Q_{7b,8b}$ (ref. Fig. 6-4), thereby preventing any AC signal from reaching the driver outputs. These controls are used to bias the limiting amplifier output "off" during system startup, for example.

#### 6.1.3 Phase inverters, vector summer and clock buffer

Data is clocked into latches  $L_1$  to  $L_3$  with approximately a 3-ps delay between consecutive stages to match the propagation delay of the signal along the output transmission line. Independent control of the retiming clock (i.e., phases of CLK<sub>1</sub>, CLK<sub>2</sub> and CLK<sub>3</sub> in Fig. 6-1) is realized using the synchronization circuit of Fig. 6-2. The I/Q phase selector, vector summing, and buffer sub-circuit schematics are shown in Fig. 6-5. Control bits  $b_1$  and  $b_2$  select the I or Q phase of the clock. The magnitude



of the vector sum produced at the outputs of differential pairs  $Q_{17,18}$  and  $Q_{19,20}$  varies as the tail currents are controlled by 5-bit current D/A converters  $DAC_{1a}$  (I), and  $DAC_{1b}$  (Q). The total bias current supplied by the DACs is 4 mA, and a 340-mV<sub>pk</sub> phase-variable clock is produced across 85  $\Omega$  loads  $R_{5,6}$  at the summer output. The amplitude varies by 10% as the phase is varied across its 90° control range (Fig. 6-6).



Fig. 6-6: Simulated vector summing output within 1 quadrant for 10 code settings.


Since the timing of the data output from a latch tracks the retiming clock phase, the 3-ps delay required between output stages can be set to within +/-200 fs (i.e., DAC resolution) by trimming input codes to the DACs in calibration.

The resolution of the DAC introduces an error in the DA output voltage rise/fall times, which was found using transient simulations. The percent error in the 20-80% rise/fall times corresponding to the DAC resolution is +/-3.2 % at 40-Gb/s data rate. The percentage error in the rise/fall times decreases when the clock frequency is increased, and vice versa (e.g., the error in the rise/fall times increases to 7.3 % for a 20 GHz clock). Furthermore, phase error between the ILO outputs causes variation in the resolution of the vector summer, which depends upon the quadrant selected by the phase inverter stages of the synchronizer. For example, an I-Q error of +10° (max. error across 30-50 GHz from simulation) in the first and third Cartesian quadrants (i.e., $\phi_{\epsilon-1,3}=\phi_Q-\phi_I-90^\circ=10^\circ$) yields a minimum phase step of 3.13° (i.e., 100°/32 states), or 217 fs at 40 GHz. The phase error if either the second or fourth Cartesian quadrant is selected would be $\phi_{\epsilon-2,4}=\phi_I-\phi_Q-90^\circ=-10^\circ$, and the minimum phase step decreases to 2.5° (174 fs at 40 GHz). If the delay between stages is matched to within 217 fs (i.e., worst case for 10° I/Q error) then a maximum change of +/-3.5 % in the 20-80% rise/fall time at 40 GHz clock is predicted from simulation.
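The quadrant-dependent step sizes quoted above follow from spreading the actual I-Q angle over the 32 DAC states and converting to time using the 25-ps period of a 40-GHz clock:

$$
\Delta\phi_{1,3} = \frac{90^\circ + 10^\circ}{2^5} \approx 3.13^\circ \;\Rightarrow\; \Delta t_{1,3} = \frac{3.13^\circ}{360^\circ} \times 25\ \text{ps} \approx 217\ \text{fs}
$$

$$
\Delta\phi_{2,4} = \frac{90^\circ - 10^\circ}{2^5} = 2.5^\circ \;\Rightarrow\; \Delta t_{2,4} = \frac{2.5^\circ}{360^\circ} \times 25\ \text{ps} \approx 174\ \text{fs}
$$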

The clock output from the synchronizer is buffered before retiming the data. Shunt peaking of the differential pair buffer (see Fig. 6-7) is used to extend the 3-dB



bandwidth from 43 GHz to 72 GHz, and the buffer common-mode output voltage is trimmable via  $V_{CMMD}$  (see Fig. 6-7a). The layout of the custom-made inductor is shown in Fig. 6-7b.

#### 6.1.4 Injection-locked oscillator

The 2-stage injection-locked oscillator (ILO) shown in Fig. 6-8 generates a quadrature-phase (I/Q) retiming clock synchronized to the data source (external or internal). An alternative for the generation of quadrature signals is a polyphase filter (PPF, e.g., [5]). However, the multiple PPF stages required to realize acceptable I-Q phase accuracy and wide operating bandwidth would introduce insertion loss and occupy considerable chip area. The tunable ILO for quadrature clock generation developed in this work offers a wide operating range (25-52 GHz) and very good I/Q accuracy ($\pm 10^{\circ}$), and it occupies just 90 µm x 90 µm of chip area.

The ILO first stage injects the external clock to the second stage using the differential pair  $Q_{1,4}$ , resistors  $R_{1,4}$  and emitter followers  $Q_{5,6}$ . The differential pair  $Q_{2,3}$  and loads  $R_{2,3}$  generate the in-phase voltage signal. Emitter followers  $Q_{7,8}$  buffer the voltage driving the second stage of the ILO, and  $Q_{9,10}$  are used to tune the frequency response. The second stage of the ILO generates the quadrature-phase signal via  $Q_{11,12}$  and loads  $R_{5,6}$ . It uses emitter followers  $Q_{13}$  to  $Q_{16}$  for buffering and



Fig. 6-8: 2-stage injection-locked oscillator.



Fig. 6-9: Simulated frequency response for the 2-stage injection-locked oscillator.

frequency tuning. The free-running frequency of the ILO is determined partly by low-pass filtering from load resistors $R_{2,3}$ and $R_{5,6}$ and parasitic capacitances contributed by $Q_5$ to $Q_8$, $Q_{13,14}$, $Q_{9,10}$ and $Q_{15,16}$. Emitter followers (EF) $Q_5$ to $Q_8$ connect to the second stage differential pair $Q_{11,12}$, and EF $Q_{13,14}$ feed back the signal. Transistors $Q_1$ and $Q_4$ sum the injected clock with the first-stage output by a wired-OR connection of followers at the second stage input (see Fig. 6-8). Simulations predict that injection of a 400 mV-peak signal at these nodes increases the lock range of the ILO by 4 GHz compared to injecting the signal in the first stage directly via collectors $Q_{1,2}$ and $Q_{3,4}$ (i.e., summing the injected and first stage currents in loads $R_2$ and $R_3$). Injection via the emitter followers is less sensitive to capacitive loading of the injecting circuit.

The ILO lock range defines the operating frequency range for the entire DA. The lock range is extended by avoiding the use of LC tuning in the stages, and by minimizing the Q-factor of the closed-loop circuit [6]. Electronic tuning of the self-oscillation frequency is realized by adjusting the input capacitance of the emitter followers ( $Q_{9,10}$  and  $Q_{15,16}$ ) loading the collector nodes of  $Q_{2,3}$  and  $Q_{11,12}$  via their DC bias current, which is set by  $V_{BIAS3}$ . This extends the lower frequency at which the ILO injection locks from 42 GHz down to 34 GHz when  $V_{BIAS3}$  is switched from -2.5 V to -0.9 V, as shown in Fig. 6-9a. When  $Q_{9,10}$  and  $Q_{15,16}$  are biased "off" ( $V_{BIAS3}$ =-2.5 V), the self-oscillation frequency is 42 GHz, and the ILO can lock to 400-mV<sub>pk</sub> amplitude injected clocks between 34 GHz and 52 GHz, as desired for the 40-Gb/s MZM driver application. The 25-GHz to 52-GHz lock range predicted from post-layout simulation of the ILO accommodates PVT variations anticipated for the circuit. PVT simulations (200 Monte Carlo trials spanning best to worst cases) that include process, supply voltage (-2.3 to -2.7 V), and temperature (25 to 85 °C) variations predict that the self-oscillation frequency changes by ±7.5 GHz ( $V_{BIAS3}$  "on" to "off" states). The amplitude and phase variations in I/Q across 30 to 50 GHz are ±0.5 % of the nominal peak and ±10°, respectively. Fig. 6-9b shows the simulated amplitude difference between the in-phase and quadrature-phase outputs and the error in I and Q phases (i.e., w.r.t. 90°), for a 40-GHz input signal and  $V_{BIAS3}$ =-2.5 V.

#### 6.1.5 Dynastat frequency divider

The low-voltage dynastat divider presented in Chapter 4, Section 4.5, is used in the digitally-controlled DA. It overcomes the maximum toggle frequency limit of a static divider topology and the narrow clock bandwidth of a dynamic divider implementation in a given technology, as described in [1] and [7]. The dynastat for the MZM driver is designed to work from a single -2.5 V supply. For  $V_{MODE}$  equal to -2.5 V, the divider operates as a static frequency divider. When  $V_{MODE}$  is set to -1.5 V, it operates as a dynamic frequency divider. The clock frequency ranges of the two modes overlap, giving a higher maximum toggle frequency than a fully-static circuit and a minimum toggle frequency approaching DC.

#### 6.2 Built-in calibration

Mismatch in timing between the input and output lines of the DA results in distortion of the desired pulse waveshape (e.g., over- or under-shooting) and sub-optimal rise/fall times. Therefore, clocks driving the DA input latches are calibrated at start-up to synchronize the re-timing of data at each stage with the propagation delay between stages across the output line. Static timing errors caused by parameter drift (e.g., from PVT variations) during operation may also be corrected through (periodic) recalibration of the clock timing. Monte Carlo simulations for 200 trials with varying supply voltage (-2.3 to -2.7 V) and temperature (25°C to 85°C) predict a ±30 % change in the nominal rise/fall times. The rise/fall time is 4.2 ps at 25°C (after calibration) and degrades to 7.8 ps at 85°C; however, it can be reduced to 7.05 ps at 85°C after recalibration. Simulations predict that process variations are suppressed by calibration of the circuit.

The DA rising and falling edge rates are adjusted by controlling the phase of each clock driving the retiming latches ( $L_1$  to  $L_3$  in Fig. 6-1) individually. The edge rates (i.e., rise/fall times) at the output vary according to the timing relationship between clocks CLK1, CLK2 and CLK3. Transient simulations of the DA for an alternating 1-0 input data pattern at 40 Gb/s (see Fig. 6-10) show the fastest edge rates for 3 ps delay between the clocks (6 ps rise/fall time), while rise/fall times (20-80 %) slow to 12 ps when the interstage delay between the clocks is 9 ps. As expected, the fastest edge rates are realized when the clock delay time matches the 3 ps propagation delay time between stages along the output line.

It is clear from observation of the waveshapes that the width and opening of the data eye are largest at the driver outputs when the rise/fall times are as short as possible. Therefore, calibration of the driver is aimed at optimization of the clock timing to realize the largest possible data eye opening. Direct measurement of the rise/fall times at the output is avoided by noting that the mean-square output voltage ( $V_{MSq}$ ) is proportional to the data eye opening, where  $V_{MSq} = \frac{1}{T_2 - T_1} \int_{T_1}^{T_2} V_{out}^2 \, dt$ . In fact,  $V_{MSq}$  is largest when the interstage clock delay at the latch input matches the interstage propagation delay time across the output line. The proposed calibration circuit outputs a DC voltage proportional to the mean-square DA output voltage for a predetermined test pattern (e.g., a square wave).
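
The link between edge rate and mean-square voltage can be illustrated with a minimal numerical sketch. It assumes an idealized 1-0 waveform with linear edges swinging between -1.5 V and +1.5 V (DC offset ignored) and a 25 ps bit period; the waveform model, function names, and sample count are illustrative assumptions and are not taken from the simulations reported here. For a linear edge, the 20-80 % rise time is 0.6 times the full transition time, so 6 ps and 12 ps correspond to 10 ps and 20 ps full transitions, respectively.

```python
def trapezoid_wave(t_ps, bit_period_ps, t_edge_ps, amplitude):
    """Idealized alternating 1-0 waveform with linear 0-100 % edges (assumed shape)."""
    phase = t_ps % (2.0 * bit_period_ps)
    if phase < t_edge_ps:                          # rising edge
        return -amplitude + 2.0 * amplitude * phase / t_edge_ps
    if phase < bit_period_ps:                      # settled "1"
        return amplitude
    if phase < bit_period_ps + t_edge_ps:          # falling edge
        return amplitude - 2.0 * amplitude * (phase - bit_period_ps) / t_edge_ps
    return -amplitude                              # settled "0"


def mean_square(t_edge_ps, bit_period_ps=25.0, amplitude=1.5, n=20000):
    """Midpoint-rule estimate of (1/(T2-T1)) * integral of Vout^2 dt over one 1-0 period."""
    period = 2.0 * bit_period_ps
    dt = period / n
    total = 0.0
    for k in range(n):
        v = trapezoid_wave((k + 0.5) * dt, bit_period_ps, t_edge_ps, amplitude)
        total += v * v
    return total * dt / period


if __name__ == "__main__":
    for t_2080 in (6.0, 12.0):
        t_full = t_2080 / 0.6                      # 20-80 % time -> full transition time
        print(f"{t_2080:4.1f} ps (20-80 %) -> V_MSq = {mean_square(t_full):.3f} V^2")
```

In this simplified model the mean-square value falls monotonically as the edges slow down, which is the property the calibration circuit relies on.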

The calibration circuit is shown in Fig. 6-11. It consists of sense, rectifying and amplifying sections that measure the mean-square voltage at the differential outputs of the DA (Fig. 6-1) without affecting the signal quality adversely (e.g., due to capacitive loading). Resistors  $R_1$  to  $R_4$  (390  $\Omega$  each) connected in parallel at the DA output attenuate the amplitude by 6-dB before measurement. The parasitic capacitance to ground (single-ended including post-layout parasitic extraction) of the attenuating resistors and the 45 x 65- $\mu$ m<sup>2</sup> bondpads is 18 fF. Series inductive peaking compensates the effect of this capacitance. The required self-inductance (35 pH) is set by extending the length of the output transmission line by 40  $\mu$ m.



Fig. 6-11: Schematic of the proposed calibration circuit.

Chapter 6

The attenuated output signal is rectified at the collector-emitter nodes of  $Q_{1,2}$ . Transistors  $Q_1$  and  $Q_2$  in Fig. 6-11 are minimum width, 1.5-µm in length, and are biased using a 2-mA tail current flowing through  $R_7$ . The differential impedance between the bases of  $Q_1$  and  $Q_2$  is 4 k $\Omega$  in shunt with 8.3 fF, and the loading seen at the driver outputs is minimal at approximately 1.5 k $\Omega$  across the DC-30 GHz bandwidth. The collector-emitter bias voltage across the dummy pair  $Q_{3,4}$  in Fig. 6-11 matches the bias of the differential rectifier  $Q_{1,2}$ . Bias offset in the rectified signal is removed by subtracting the common-mode in  $V_{CE}$  from  $Q_{1,2}$  and  $V_{CE}$  from  $Q_{3,4}$  using error (OA<sub>1</sub> and OA<sub>2</sub>, with a gain of 2) and summing (OA<sub>3</sub>, with a gain of 7) amplifiers (see Fig. 6-11). These yield an OA<sub>3</sub> output voltage proportional to the mean-square DA output.

Low-pass filters C<sub>1</sub>-R<sub>5</sub>, and C<sub>2</sub>-R<sub>7</sub> (330 $\Omega \parallel$  500 fF, see Fig. 6-11) filter the full-wave rectified signal at the respective collector and emitter nodes of Q<sub>1,2</sub>. Simulations predict that the ripple in the mean-square output is less than 10 % after filtering. The ripple ( $\Delta V_{MSq}$ ) can be estimated from  $\frac{dV_{MSq}}{dt} \approx \frac{\Delta V_{MSq}}{\Delta t} = -\frac{I}{C}$ , where C<sub>1</sub>=C<sub>2</sub>=C, and I is the DC current biasing Q<sub>1,2</sub>. Time  $\Delta t$  is approximately one-half the period of the AC signal output from the rectifier [8].
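
As a rough worked example of the ripple estimate, the sketch below evaluates  $|\Delta V_{MSq}| \approx I \, \Delta t / C$ . The choice of I = 2 mA (the tail current quoted for  $Q_{1,2}$ ), C = 500 fF, and a 20-GHz input (the fundamental of a 1-0 pattern at 40 Gb/s) are assumptions made for illustration. Full-wave rectification doubles the ripple frequency, and  $\Delta t$  is taken as one-half of the rectified period.

```python
def ripple_estimate(i_bias_a, c_filter_f, f_input_hz):
    """Magnitude of the ripple from |dV| ~ I * dt / C (first-order estimate)."""
    f_ripple = 2.0 * f_input_hz        # full-wave rectifier output is at twice the input frequency
    delta_t = 0.5 / f_ripple           # roughly one-half period of the rectified signal
    return i_bias_a * delta_t / c_filter_f


if __name__ == "__main__":
    # Assumed values: 2 mA bias, 500 fF filter capacitor, 20 GHz input tone.
    dv = ripple_estimate(2e-3, 500e-15, 20e9)
    print(f"Estimated ripple magnitude: {dv * 1e3:.1f} mV")
```

The result is only an order-of-magnitude check under these assumptions; the simulated ripple quoted above remains the reference.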

Series-connected  $R_3$  (29 k $\Omega$ ) and C1 (400 fF) realize the frequency compensation of the amplifiers shown in Fig. 6-12. Simulations predict 51° phase margin and 12 dB gain margin when the op-amp is loaded at the output by 5 pF.



Fig. 6-12: Op-amp schematic circuit.



Fig. 6-13: Peak detector proposed by Meyer [8] compared to the energy detector developed in this work.

The simulated output of the rectifier section for a 20-GHz sinewave input is plotted in Fig. 6-13b. The rectified output voltage is linear across the 0 to 1.5 V input amplitude range anticipated for the driver. The error at 2-V input compared to the (ideal) linear response is just 4%. Compared to a simple emitter follower peak detector [8], the linear range of the differential full-wave rectifier ( $Q_{1,2}$  in Fig. 6-11) with respect to the input voltage is 50 % wider, as shown in Fig. 6-13.

Fig. 6-14 shows a timing diagram of the 3-step calibration sequence developed in this work. Data at the input of the latches (DB1 to DB3) are assumed to be identical waveforms with zero skew. The data is assumed to be a 1-0 calibration sequence (i.e., bit period =  $T_{CLK}$ ). Data must be stable for at least  $t_{setup}$  before the latch is clocked. The clock signals driving each latch (CLK1 to CLK3) are also shown in the figure, and three steps for the DA calibration are identified. The first step (Step-1) in the calibration sequence adjusts the phase of CLK1. Any violation of the setup time requirement for the latch (e.g., CLK1 rising transition before stable D input) is detected at the calibration circuit output by a decrease in voltage  $V_{MSq}$ . It should be noted that  $t_{setup}$  of each latch (typ.  $t_{setup}=2.4$  ps for the latches in this work) depends on parasitics, timing jitter in the data, temperature variation, etc.

The second calibration step (Step-2) is used to adjust the time delay of CLK2 with respect to CLK1 to approximately 3 ps (i.e., the interstage delay across the output line). Finally, the third step in the calibration (Step-3) adjusts CLK3 to ~3 ps delay with respect to CLK2. For example, the simulation result shown in Fig. 6-15 illustrates the effect of varying (simultaneously) the delay of CLK<sub>2</sub> with respect to CLK<sub>1</sub>, and CLK<sub>3</sub> with respect to CLK<sub>2</sub> by the same amount, over the range of -6 ps to +5 ps. Clock timing for the first latch in the DA (i.e., CLK<sub>1</sub>) is delayed by 2.4 ps with respect to the bit transition at the D-input of Latch 1. Voltage V<sub>MSq</sub> reaches a maximum of 1.55 V for 2.7-ps delay between clocks, which is in close agreement with the calculated propagation delay  $t_{pd} = \sqrt{L_{line} \cdot (C_{line} + C_{LA})} = 3.2$  ps.

To perform the desired calibration, data selector M1 (ref. Fig. 6-1) is set to the 1-0 pattern generated by the dynastat divider (i.e., input  $I_2$ ). The clock delay for the first DA stage only (i.e., all other stages biased "off") is selected from a total of



Fig. 6-15: Output voltage vs. clock delay time for the calibration circuit.

 $2^7 = 128$  phase settings of the clock synchronizer's 7-bit control word. Having CLK1 set to its optimum phase value, the relative delay of CLK2 is then selected from its 128 possible phase settings. Finally, the optimum relative delay of CLK3 is found by testing all 128 phase settings for the third-stage clock. All 384 combinations were tested during measurements of the prototype.
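
The benefit of searching the three clocks one after another, rather than jointly, can be checked with a few lines of arithmetic. The sketch below is illustrative only; the function name is hypothetical, and the ~1 ms per reading is the ADC conversion time quoted later for the measurements.

```python
def search_sizes(bits_per_synchronizer=7, n_clocks=3):
    """Number of phase settings tested sequentially vs. exhaustively."""
    codes = 2 ** bits_per_synchronizer       # settings per clock synchronizer
    sequential = n_clocks * codes            # one clock swept at a time
    exhaustive = codes ** n_clocks           # every joint combination of the three codes
    return codes, sequential, exhaustive


if __name__ == "__main__":
    adc_time_s = 1e-3                        # assumed time per V_MSq reading
    codes, sequential, exhaustive = search_sizes()
    print(f"{codes} codes per clock")
    print(f"sequential: {sequential} readings, ~{sequential * adc_time_s:.2f} s")
    print(f"exhaustive: {exhaustive} readings, ~{exhaustive * adc_time_s / 60:.0f} min")
```

The sequential sweep needs 3 x 128 = 384 readings, whereas a joint search over all three 7-bit codes would need 128 cubed readings.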

A flowchart for the three-step calibration sequence is illustrated in Fig. 6-16. Steps one to three adjust clocks CLK1 to CLK3 in sequence. During Step-1 only the first DA stage is biased "on" (i.e.,  $V_{IO1}$ =-1.5 V in Fig. 6-1). Voltage  $V_{MSq}$  from the calibration circuit is monitored while bits  $b_1$  to  $b_7$  controlling the clock synchronizer producing CLK1 are incremented from 0 to 127. The code corresponding to the minimum output voltage from the calibration circuit is called  $t_{D1}$ . It corresponds to approximate time alignment between simultaneous transitions of the clock and data applied to latch L1. As the latch captures D=0 or D=1 with approximately equal probabilities,  $V_{MSq}$  approaches its lowest value. Once  $t_{D1}$  has been identified, the synchronizer for CLK1 is set to  $t_{D1} + t_{setup}$ . The calibration sequence continues in Step-2, with the first and second latch on ( $V_{IO1}$ = $V_{IO2}$ =-1.5 V in Fig. 6-1). A loop tests all 128 possible combinations for bits  $b_8$  to  $b_{14}$  addressing clock synchronizer 2 while monitoring  $V_{MSq}$ . The code which yields the maximum



Fig. 6-16: Calibration sequence for DA cells during input line phase adjustment.

 $V_{MSq}$  is identified. In the final step (Step-3), the third output stage is biased ON ( $V_{IO3}$ =-1.5 V) and following a similar procedure to Step-2, bits  $b_{15}$  to  $b_{21}$  addressing clock synchronizer 3 are used to vary the phase of CLK3, and the code where  $V_{MSq}$  is maximum is determined.
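
The three-step sequence described above can be sketched in software as follows. This is a minimal illustration, not the controller code used with the prototype: `measure_vmsq` is a hypothetical hook standing in for "write the codes over the serial interface, bias the requested stages on, and read the calibration voltage", `toy_vmsq` is a made-up behavioral model used only to make the example runnable, and expressing  $t_{setup}$  as a number of code steps (with wrap-around at 128) is an assumption.

```python
import math

N_CODES = 128  # 2**7 phase settings per clock synchronizer


def calibrate(measure_vmsq, setup_codes):
    """Sequential search: dip for CLK1, then peaks for CLK2 and CLK3."""
    codes = [0, 0, 0]

    # Step-1: only stage 1 on; the V_MSq minimum marks clock/data edge alignment (t_D1).
    readings = [measure_vmsq([c, 0, 0], 1) for c in range(N_CODES)]
    t_d1 = min(range(N_CODES), key=lambda c: readings[c])
    codes[0] = (t_d1 + setup_codes) % N_CODES   # wrap-around is an assumption

    # Step-2: stages 1 and 2 on; keep CLK1 fixed and pick the CLK2 code with maximum V_MSq.
    readings = [measure_vmsq([codes[0], c, 0], 2) for c in range(N_CODES)]
    codes[1] = max(range(N_CODES), key=lambda c: readings[c])

    # Step-3: all stages on; pick the CLK3 code with maximum V_MSq.
    readings = [measure_vmsq([codes[0], codes[1], c], 3) for c in range(N_CODES)]
    codes[2] = max(range(N_CODES), key=lambda c: readings[c])
    return codes


def toy_vmsq(codes, stages_on, edge_code=40, stage_delay_codes=12):
    """Made-up stand-in for the chip: a dip at the data edge, peaks at matched delays."""
    if stages_on == 1:
        return 1.0 - 0.8 * math.exp(-((codes[0] - edge_code) / 3.0) ** 2)
    v = 1.0
    for k in range(1, stages_on):
        err = (codes[k] - codes[k - 1]) - stage_delay_codes
        v += 0.5 * math.exp(-(err / 6.0) ** 2)
    return v


if __name__ == "__main__":
    print(calibrate(toy_vmsq, setup_codes=10))
```

Each step fixes the codes found in the previous steps, so a single pass over 128 settings per clock is sufficient.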

# 6.3 40-Gb/s digitally-controlled DA prototype

The 3-mm<sup>2</sup> digitally-controlled MZM-DA driver prototype is shown in Fig. 6-17. It has been fabricated in IBM's 0.13- $\mu$ m SiGe-BiCMOS 8HP technology [9] (BV<sub>CEO</sub>=1.8 V, BV<sub>CBO</sub>=5.9 V). DC and RF on-wafer probes are used to characterize the driver. An external clock source generates the 40-GHz clock which is fed to the chip via 40-GHz GSGSG probes. The 1-GHz trigger for eye pattern measurements is generated by mixing the clock and a second generator set at 39-GHz (i.e., 1 GHz below the clock synthesizer frequency). The active area of the complete circuit shown in Fig. 6-1 is 1.5 x 1.2 mm<sup>2</sup> including bondpads.



Fig. 6-17: 40 Gb/s MZ modulator driver prototype photomicrograph.

The prototype is powered from dual +5/-2.5 V supplies and a breakdown of the power consumption is shown in Fig. 6-18. The limiting amplifiers (LA<sub>1,2,3</sub>) consume 120 mA from the +5 V and -2.5 V supplies. All the other circuitry is powered from a single -2.5 V supply. The pre-drivers, latches, data buffers, and the retiming flip-flop consume 197 mA, and the clock buffers, DACs, vector summers, phase inverters, injection-locked oscillator, buffers to distribute the clock, and input clock buffers consume 211 mA, for a total DA current consumption of 408 mA (DA power consumption is 1.92 W). The power consumption reduces to 1.55 W when external bias-Ts are used for DC biasing rather than biasing via the back terminations. The dynastat divider that creates the 1-0 data stream consumes 27 mA from 0 V to -2.5 V and the calibration circuit consumes 10 mA from the +5 V supply to ground. The calibration circuits (i.e., the divider for pattern generation and calibration circuit) increase the power consumption by 117 mW, but these blocks are biased off after



Fig. 6-18: Modulator driver prototype power consumption.

calibration. The 40-Gb/s 2<sup>11</sup>-1 PRBS generator consumes 250 mA, the countdown trigger generator 220 mA, and the PRBS MUX 18 mA, all from -2.5 V, for a total of 1.22 W for blocks that are required during characterization only.
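
The power figures quoted above can be cross-checked from the stated currents and supply voltages. The sketch below is a bookkeeping aid only; it assumes that the 120 mA drawn by the limiting amplifiers flows across the full 7.5 V between the +5 V and -2.5 V rails, which is an interpretation that happens to reproduce the 1.92 W total.

```python
def power_w(current_ma, voltage_span_v):
    """DC power in watts for a current (mA) flowing across a voltage span (V)."""
    return current_ma * 1e-3 * voltage_span_v


# Limiting amplifiers: assumed to span +5 V to -2.5 V (7.5 V).
p_la = power_w(120, 5.0 + 2.5)
# Data path (197 mA) and clock path (211 mA), both from the -2.5 V supply.
p_core = power_w(197 + 211, 2.5)
p_da = p_la + p_core

# Calibration blocks: divider (27 mA at 2.5 V) and calibration circuit (10 mA at 5 V).
p_cal = power_w(27, 2.5) + power_w(10, 5.0)

# Characterization-only blocks: PRBS, trigger countdown, and PRBS MUX from -2.5 V.
p_test = power_w(250 + 220 + 18, 2.5)

if __name__ == "__main__":
    print(f"DA total:          {p_da:.2f} W")
    print(f"Calibration extra: {p_cal * 1e3:.1f} mW")
    print(f"Test-only blocks:  {p_test:.2f} W")
```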

The output return loss (ORL) was measured, with all output stages biased "on", using an Agilent 65 GHz PNA-X 5247A. A thru-reflect-line (TRL) standard was used for calibration. Differential S-parameters were computed from single-ended measurements at the output following the procedure from ref. [10]. The measured ORL includes the electrical behavior of the loaded output TL (output stages biased on), the top-metal aluminum output line, back-termination resistors, and the output bondpads. Excellent agreement is seen between the measured and simulated output



Fig. 6-19: DA output return loss.

return loss, as shown in Fig. 6-19. Measured ORL is better than -20 dB from 1 GHz to 58 GHz, and better than -15 dB up to 65 GHz.
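
The conversion from single-ended to differential S-parameters mentioned above follows the mixed-mode formulation of [10]. For the two output pads treated as one differential port, a minimal sketch of the differential reflection coefficient is given below. The sample S-parameter values are invented for illustration and are not measured data; equal reference impedances at both single-ended ports are assumed.

```python
import math


def sdd11(s11, s12, s21, s22):
    """Differential-mode reflection of a port pair from single-ended S-parameters."""
    return 0.5 * (s11 - s12 - s21 + s22)


def scc11(s11, s12, s21, s22):
    """Common-mode reflection of the same port pair."""
    return 0.5 * (s11 + s12 + s21 + s22)


def reflection_db(gamma):
    """Reflection magnitude in dB (negative values, as plotted for the ORL)."""
    return 20.0 * math.log10(abs(gamma))


if __name__ == "__main__":
    # Hypothetical single-ended data at one frequency point.
    s11 = s22 = 0.08 + 0.02j
    s12 = s21 = 0.03 - 0.01j
    gamma_dd = sdd11(s11, s12, s21, s22)
    print(f"Sdd11 = {gamma_dd:.3f}, {reflection_db(gamma_dd):.1f} dB")
```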

The calibration sequence proposed in Section 6.2 was applied to set the timing of the clock that retimes data into each driver stage. Bidirectional control data is transferred via the on-chip serial-to-parallel interface (SPI, see Fig. 6-1). A 1-0 test pattern was applied during calibration and the effect of clock phase on the output waveform was measured using a Tektronix TDS-8000B sampling oscilloscope and 80E09 (60 GHz bandwidth) sampling modules.

The calibration circuit output measured during the process described in Section 6.2 is shown in Fig. 6-20. The first step in the calibration procedure is to detect the rising edges of the 1-0 test pattern applied to the driver data input with only one output stage biased "on". The data transition is detected as a dip in the output voltage as clock phase CLK<sub>1</sub> is swept digitally (i.e., following Step-1 in Fig. 6-16). The phase codeword for the clock synchronizer driving stage LA<sub>1</sub> of the DA is then adjusted to ensure that the set-up time requirement for latch L1 is not violated



(i.e., ~2.4 ps delay between data and clock transitions). The second stage of the DA is then turned "on" (i.e., both LA<sub>1</sub> and LA<sub>2</sub> are "on" with  $V_{IO1}=V_{IO2}=-1.5$  V). The amplitude of the calibration circuit output is monitored while the phase control codeword to the synchronizer producing CLK<sub>2</sub> is varied, as shown in Fig. 6-20. During this step of the calibration process (Step-2), the code applied to the synchronizer generating CLK<sub>1</sub> remains fixed. The minimum rise/fall time is realized when the output voltage of the calibration circuit reaches its peak value. All 3 driver stages are then biased "on" for the final calibration step (Step-3 of Fig. 6-16), where the codeword applied to the clock synchronizer generating CLK<sub>3</sub> is swept in value. Again, the peak voltage output of the calibration circuit indicates that the minimum rise/fall time is realized. The time required to execute the entire calibration sequence using an external microcontroller is < 0.5 seconds (ADC conversion time of ~1 ms for the calibration voltage output).

Digital control over the inputs to the DA opens the possibility of generating other edge profiles at the output. For example, the slowest and fastest measured waveforms are shown in Fig. 6-21. The slow output (see Fig. 6-21a) is triangular in



Fig. 6-21: Measured output waveforms for two input phase settings.

waveshape, with symmetric, 12-ps rise/fall times (20-80%) and a single-ended output amplitude of 3  $V_{p-p}$ . This output is obtained for clock synchronizer control codes corresponding to 9 ps interstage delay. The fast output with 6-ps rise/fall times (see Fig. 6-21b) is obtained for 3 ps interstage delay.

After calibration, the data source is switched from the 1-0 pattern to the on-chip  $2^{11}$ -1 PRBS generator for characterization. The output eye measured for the calibrated DA at 40-Gb/s is shown in Fig. 6-22a, and it is identical at both single-ended outputs (except for a phase inversion). The common-mode content in the output signal (aside from DC offset) is negligible, thus the differential output is simply twice the amplitude of the single-ended output signal. The rise and fall times



Fig. 6-22: Time domain driver output at 40GHz eye diagram.

(20%-80%) for 6- $V_{p-p}$  differential output (3- $V_{p-p}$  single-ended) are identical at 6 ps each. The differential output simulated after parasitic extraction is shown in Fig. 6-22b. The on-chip trigger countdown output is used to synchronize the sampling scope measurement. The rms time jitter in the output signal is 797 fs, which is very close to the 724 fs rms measured for the synthesizer supplying the 40-GHz input clock. Thus, jitter added by the DA is 330 fs rms.
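
The added jitter quoted above is consistent with subtracting the source jitter from the total in quadrature, which is valid when the two contributions are uncorrelated (an assumption made here). A short check, with the 25 ps bit period at 40 Gb/s used to express the result as a fraction of the unit interval:

```python
import math


def added_jitter_rms(total_rms, source_rms):
    """RMS jitter added by a block, assuming uncorrelated contributions."""
    return math.sqrt(total_rms ** 2 - source_rms ** 2)


if __name__ == "__main__":
    added_fs = added_jitter_rms(797.0, 724.0)      # values in femtoseconds
    bit_period_fs = 25_000.0                       # 1 / (40 Gb/s)
    print(f"added jitter: {added_fs:.0f} fs rms "
          f"({100.0 * added_fs / bit_period_fs:.1f} % of the bit period)")
```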

The single-ended outputs of the driver, with the on-chip PRBS signal used as the input, are shown in Fig. 6-23. Ringing observed in the 1-to-0 transition arises from second harmonic content in the single-ended output voltage. The interconnect between the pre-drivers and limiting amplifiers (LA<sub>1,2,3</sub>) crosses one of the output transmission lines. Simulation of the output line used during the design of the DA did not include the crossing, which introduces a parasitic coupling that creates ~200 mV difference between outputs OUT and  $\overline{OUT}$ . Resimulation of the DA including the interconnection coupling matches the measurements. The parasitic capacitance of the crossover in the layout can be compensated easily by adding dummy interconnections in the other output line to balance the interconnection parasitics.



Fig. 6-23: Time domain driver output with on-chip 2<sup>11</sup>-1 PRBS.



Fig. 6-24: Output spectrum 40 GHz 2<sup>11</sup>-1 PRBS.

The output signal spectrum measured with an R&S FSU-50 spectrum analyzer is shown in Fig. 6-24. The notch at 40 GHz is characteristic of a 40-Gb/s PRBS. The frequency spacing between tones in the spectrum of Fig. 6-25 is equal to  $f_{clk}/(2^{11}-1) = 19.54$  MHz, as expected for a  $2^{11}-1$  length PRBS sequence generated from a 40-GHz clock. Operating the driver with a longer data sequence (e.g.,  $2^{31}-1$  PRBS) requires increased decoupling of the common-mode of the back-termination (capacitor C<sub>D</sub> in Fig. 6-1). Extra capacitance can be added external to the IC to augment the on-chip portion of C<sub>D</sub>.
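
The tone spacing follows directly from the sequence length, which can be verified with a small linear-feedback shift register model. The feedback polynomial  $x^{11} + x^{9} + 1$  used below is the common PRBS-11 choice and is an assumption, since the polynomial implemented on-chip is not stated in this section.

```python
def prbs_period(order=11, taps=(11, 9)):
    """Count shifts until a Fibonacci LFSR returns to its (non-zero) starting state."""
    mask = (1 << order) - 1
    start = mask                        # all-ones seed; any non-zero seed works
    state = start
    count = 0
    while True:
        feedback = 0
        for tap in taps:                # XOR of the tapped register bits
            feedback ^= (state >> (tap - 1)) & 1
        state = ((state << 1) | feedback) & mask
        count += 1
        if state == start:
            return count


def tone_spacing_hz(f_clk_hz, order=11):
    """Spacing between spectral lines of a repeating 2**order - 1 bit sequence."""
    return f_clk_hz / (2 ** order - 1)


if __name__ == "__main__":
    print(f"sequence length: {prbs_period()} bits")
    print(f"tone spacing at 40 GHz: {tone_spacing_hz(40e9) / 1e6:.2f} MHz")
```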



Fig. 6-25: Discrete tones 40 GHz 2<sup>11</sup>-1 PRBS.

The prototype is compared to other drivers in Table 6-1. The three commercially available drivers listed (5882-picosecond, 810-SHF, and TGA4942-Qorvo) have an output return loss of only -10 dB up to 40 GHz, or 10 dB poorer than the ORL measured for the digitally-controlled DA driver, which is better than 20 dB below 57 GHz.

| Ref. | Data rate (Gb/s) | Vout p-p (V) | Vin p-p (V) | P<sub>DC</sub> (W) | V<sub>DC</sub> (V) | ORL (dB) | Rise/Fall (ps) | Jitter rms (ps) | Area (mm x mm) | Application/ Technology |
|------|------------------|--------------|-------------|--------------------|--------------------|----------|----------------|-----------------|----------------|-------------------------|
| [11] PSPL5882 Tektronix | 40 | 2.7 S.E. | 0.6 S.E. | 1.3 | +8, -5 | -10 (<40 GHz) | 9/9 (10-90%) | - | - | Electroabsorption mod. |
| [12] 810 SHF Comm. Tech. AG | 40 | 6.4 S.E. | 0.33 S.E. | 2.2 | +10 | -10 (<40 GHz) | 9/9 (20-80%) | 0.55 | - | MZM |
| [13] TGA4942-SL Qorvo | 43 | 8 S.E. | 0.4 S.E. | 1.4 | +6 | -10 (<40 GHz) | 10/10 (20-80%) | 0.4 | - | Modulator driver |
| [14], JSSC 2004 | 40 | 6.3 Diff. | 2.0 Diff. | 1.7 | +5, -4.3 | -20 (<45 GHz) | 6/7 (20-80%) | 0.6 | 1.0 x 1.7<br>0.5 x 1.5 | 1.2µm InGaAs-InP HBT |
| [15], RFIC 2003 | 40 | 7.5 Diff. | 1 Diff. | 3 | +1.5, +4, -4.5 | -10 (<20 GHz) | 10/10 (20-80%) | 0.8 | 1.4 x 1.7<br>0.9 x 1.4 | 0.15µm PHEMT |
| [16], JSSC 2003 | 40 | 6 Diff. | 1.4 Diff. | 2.8 | -5.2 | -15 (<37 GHz) | 12/12 (20-80%) | 1.0 | 1.95 x 4 | 0.15µm GaAs PHEMT |
| This work | 40 | 6 Diff. | 0.3 Diff. | 1.92 | -2.5, +5 | -20 (<57 GHz) | 6/6 (20-80%) | 0.33 | 2.0 x 1.5<br>1.0 x 0.8 | MZM / 0.13µm SiGe-BiCMOS |

 Table 6-1: Modulator driver performance comparison

The power consumption of the 5882 Picosecond PL driver is 1.3 W to deliver a 2.7  $V_{p-p}$  single-ended output voltage. Two of them are therefore required to deliver 5.4  $V_{p-p}$  differential, at a total power consumption of 2.6 W (i.e., about 35 % higher power to deliver a 10 % smaller voltage compared to the 6- $V_{p-p}$  output voltage in this work, which consumes 1.92 W). The 810 driver from SHF AG has a 6.6 % higher output voltage, but consumes 15 % more DC power and achieves a rise/fall time of 9 ps, or 50 % larger than the 6 ps symmetric rise/fall times of this work. The TGA4942-SL from Qorvo has 33 % higher output voltage (i.e., 8  $V_{p-p}$ ) while consuming 27 % less power than our design; nevertheless, it has 66 % higher rise/fall times (i.e., 10/10 ps), making it the slowest among the commercially available drivers. The driver designed in a III-V technology from 2004 (ref. [14]) achieved rise/fall times of 6/7 ps (respectively) at a 6.3  $V_{p-p}$  output voltage while consuming 11 % less DC power compared to our prototype. Nevertheless, it requires a data input voltage of 2  $V_{p-p}$ , i.e., 6.6x larger than our driver. This higher input voltage requires a pre-driver stage that will increase the overall power consumption when used in a practical implementation, and it is likely that the pre-driver combined with the driver will consume a greater DC power in total. Ref. [15] achieved 25 % higher output voltage (i.e., 7.5  $V_{p-p}$ ) using a 0.15  $\mu$ m III-V PHEMT technology, yet the output rise/fall times are 10 ps (i.e., 4 ps longer than our design), and it consumes 3 W of DC power (56 % higher power consumption compared to the 1.92 W of the SiGe implementation). Another III-V example [16] consumes 45 % higher DC power and realizes 12 ps rise/fall times (i.e., 2x the 6 ps of the digitally-controlled design presented here).
It should be noted that this work presents the only driver with trimmable rise/fall times, and includes calibration and BiST circuits to compensate for PVT variations, making it unique among optical modulator drivers at this data rate.
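
Several of the relative figures in this comparison can be reproduced from the entries of Table 6-1. The helper below is a simple illustration; the function name is hypothetical and the values are copied from the table, with this work (1.92 W, 6 ps) as the reference.

```python
def rel_diff_pct(value, reference):
    """Relative difference of value with respect to reference, in percent."""
    return 100.0 * (value - reference) / reference


P_REF_W = 1.92      # DC power of this work
T_REF_PS = 6.0      # rise/fall time of this work

# (label, DC power in W, rise/fall time in ps) taken from Table 6-1.
ENTRIES = [
    ("[12] SHF 810", 2.2, 9.0),
    ("[13] TGA4942-SL", 1.4, 10.0),
    ("[15] PHEMT", 3.0, 10.0),
    ("[16] GaAs PHEMT", 2.8, 12.0),
]

if __name__ == "__main__":
    for label, p_w, t_ps in ENTRIES:
        print(f"{label:16s} power {rel_diff_pct(p_w, P_REF_W):+6.1f} %, "
              f"rise/fall {rel_diff_pct(t_ps, T_REF_PS):+6.1f} %")
```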

### 6.4 Summary

A digitally-controlled 40-Gb/s modulator driver prototype produces a 6- $V_{p-p}$  differential output voltage swing with excellent symmetry at the outputs. The distributed amplifier with a digital input line is the first driver reported that can deliver edge rates trimmable from 6 ps (min. rise/fall time) to 12 ps (max.) under digital control at 40 Gb/s. The rise/fall times realized by the DA prototype are faster than most drivers implemented in silicon or III-V technologies reported to date. Timing jitter added by the driver is 0.33 ps rms (i.e., 1.3 % of the period at 40 Gb/s). The measured output return loss is below -20 dB up to 57 GHz, and better than -15 dB up to 65 GHz. The power consumption of this 40-Gb/s DA driver is 10 % less than a 10-Gb/s driver reported previously, despite the 4x increase in output data rate, and it can be reduced 16 % more (to 1.55 W) if bias-Ts are used in the SiGe prototype. The on-chip energy detector circuit facilitates the calibration of the digital input line using an integrated 1-0 pattern generator and a relatively simple, 3-step calibration sequence. The  $2^{11}$ -1 PRBS integrated on-chip also enables built-in self-testing of the driver.

## References

[1] L. Vera and J.R. Long, "A Dynastat frequency divider with DC-153 GHz Range," *Electronics Letters*, vol. 51, no. 12, pp. 908-910, June 2015.

[2] T.S.D. Cheung, J.R. Long, "Shielded passive devices for silicon-based monolithic microwave and millimeter-wave integrated circuits," *IEEE-JSSC*, vol. 41, no. 5, pp. 1183-1200, May 2006.

[3] H. Veenstra, G.A.M. Hurkx, D. van Goor, H. Brekelmans, J.R. Long, "Analysis and design of bias circuits tolerating output voltages above BV<sub>CEO</sub>," *IEEE-JSSC*, vol.40, no.10, pp. 2008-2018, Oct. 2005.

[4] K. Jonggook, A. Sadovnikov, P. Menz, J. Babcock, "Considerations for forward active mode reliability in an advanced hetero-junction bipolar transistor," Proceedings of Bipolar/BiCMOS Circuits and Technology Meeting, pp. 1-4, Sept. 2012.

[5] W. L. Chan, J. R. Long and J. J. Pekarik, "A 56-to-65GHz Injection-Locked Frequency Tripler with Quadrature Outputs in 90nm CMOS," IEEE-ISSCC, San Francisco, CA, 2008, pp. 480-629.

[6] B. Razavi, "A study of injection locking and pulling in oscillators," *IEEE-JSSC*, vol. 39, no. 9, pp. 1415-1424, Sept. 2004.

[7] L. Vera and J.R. Long, "A 40 Gb/s low-power 2<sup>11</sup>-1 PRBS with distributed clocking and trigger countdown output," Accepted for publication in TCASii, 2016.

[8] R.G. Meyer, "Low-power monolithic RF peak detector analysis," *IEEE-JSSC*, vol. 30, no. 1, pp.65-67, Jan 1995.

[9] B.A. Orner, Q.Z. Liu, B. Rainey, et al., "A 0.13 μm BiCMOS technology featuring a 200/280 GHz (fT/fmax) SiGe HBT," Proceedings of Bipolar/BiCMOS Circuits and Technology Meeting, pp. 203-206, Sept. 2003.

[10] D.E. Bockelman, W.R. Eisenstadt, "Combined differential and common-mode scattering parameters: theory and simulation," IEEE Transactions on Microwave Theory and Techniques, vol.43, no.7, pp. 1530-1539, Jul. 1995.

[11] Tektronix, "40 Gb/s Broadband Amplifier," PSPL 5882 datasheet, April 2013 [revised Sept. 2014]

[12] SHF Communication Technologies, "Datasheet SHF 810 Broadband Amplifier," SHF 810 datasheet, Feb. 2007.

[13] Qorvo, "TGA4942-SL 43Gb/s DPSK Modulator Driver," TGA4942 datasheet, Dec. 2012.

[14] Y. Baeyens, N. Weimann, P. Roux, et al., "High gain-bandwidth differential distributed InP D-HBT driver amplifiers with large (11.3 Vpp) output swing at 40 Gb/s," *IEEE-JSSC*, vol. 39, no. 10, pp. 1697-1705, Oct. 2004.

[15] Y. Baeyens, P. Paschke, V. Houtsma, et al., "Compact high-gain lumped differential 40 Gb/s driver amplifiers in production  $0.15 \,\mu m$  PHEMT technology," Proc. of RFIC, pp. 67-70, June 2003.

[16] D.S. McPherson, F. Pera, M. Tazlauanu, S.P. Voinigescu, "A 3-V fully differential distributed limiting driver for 40-Gb/s optical transmission systems," *IEEE-JSSC*, vol. 38, no. 9, pp. 1485-1496, Sept. 2003.

# 7 Conclusions and Recommendations

Continuous development of integrated circuits is required to satisfy the demand for data bandwidth in communications. In optical communications, an electronic driver circuit delivers the electrical signal used to encode data onto the optical carrier. This circuit was implemented in this thesis using a digitally-controlled distributed amplifier (DA) to reduce limitations associated with the input transmission line of a conventional DA. Also, broadband benchmark circuits were examined and characterized separately. Challenges in the design of these broadband circuits included the operation from a reduced bias supply voltage, and optimization of their operating frequency. These challenges were addressed by innovations in circuit topologies and integration of custom-made passive components.

Techniques used to design the benchmark circuits were applied in the design of a 40-Gb/s digitally-controlled Mach-Zehnder modulator driver. The driver has built-in calibration capability using an energy detector circuit and a 3-step calibration sequence. Moreover, built-in self-test is incorporated with a 40-Gb/s  $2^{11}$ -1 PRBS generator. The driver achieves increased bandwidth (4x) and reduced power consumption (10% less) compared to previous work [1].

Details of the design and characterization of the benchmark circuits and the digitally-controlled modulator driver were presented in Chapters 2 to 6 of this thesis. The demonstrators were implemented in 90-nm and 130-nm SiGe-BiCMOS technologies [2] [3]. However, similar design techniques can be applied in other technologies (e.g., CMOS).

# 7.1 Major contributions

Contributions to the design of broadband circuits (amplifiers, frequency multipliers, frequency dividers, PRBS generator, energy detectors, and optical modulator drivers) are summarized in this section.

#### **Broadband amplifier**

Broadband amplifiers implemented using a Darlington pair with resistive feedback were investigated in Chapter 2. A signal flow analysis was used to derive equations for its low frequency gain and input/output matching. The optimum size of the Darlington pair transistors was investigated in a BiCMOS technology, and the trade-offs of inductive peaking and cascoding were considered. Moreover, three demonstrators implemented in IBM 90-nm SiGe-BiCMOS [3] were fabricated to verify the findings.

The sizes of the transistors in the first prototype were optimized for maximum bandwidth. This amplifier was used as the reference. The second prototype implements series-peaking, which increases the bandwidth by 25 % with respect to the reference amplifier. The third prototype combines series-peaking and cascoding to increase the bandwidth by a further 28 %, i.e., a total bandwidth improvement of 53 % with respect to the reference amplifier.

Different characteristics of the amplifier were reviewed and verified from measurements, including noise figure, stability, and intermodulation distortion. The study of the Darlington broadband amplifier circuit in this thesis resulted in the benchmark circuit with the highest gain-bandwidth product over DC power consumption reported among amplifiers of this type [4], i.e., 9.1 GHz/mW for 12-dB gain and more than 110 GHz bandwidth, with a power consumption of 48 mW from a 2.1 V supply.
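The quoted figure of merit can be reproduced from the numbers above. This is a sketch that assumes the gain-bandwidth product is formed from the linear voltage gain ($10^{12/20} \approx 3.98$) and the 110 GHz lower bound on the bandwidth:

$$\frac{GBW}{P_{DC}} \geq \frac{10^{12/20} \times 110\ \text{GHz}}{48\ \text{mW}} \approx \frac{3.98 \times 110\ \text{GHz}}{48\ \text{mW}} \approx 9.1\ \text{GHz/mW}$$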

#### **Frequency multipliers**

Unbalanced cross-coupled differential pairs were used as the core of the frequency multiplier presented in Chapter 3. The topology can be implemented in bipolar or CMOS technologies [5], and it can be biased using a low supply voltage, e.g., 1.8 V when implemented in a BiCMOS technology (1.2 V for the core and 0.6 V for a MOS tail current source). Its transfer function generates even-order harmonics only, and it was used to implement a frequency doubler and a frequency quadrupler. The optimum input voltage for maximum conversion gain and minimum generation of undesired harmonics was estimated from the transfer function and verified from measurements. A narrowband and a broadband design were used to investigate the performance of the topology used for frequency multiplication. The input and output loops of the circuit were optimized to increase the bandwidth of the broadband design, and to reduce the power consumption of the narrowband design [6]. The measured conversion gain of the fabricated broadband frequency doubler is positive from DC to 100 GHz [7]. Moreover, the frequency quadrupler prototype implemented using active tunable loads has 0 dB conversion gain, an output centered at 89 GHz, and an 81-97 GHz 3-dB bandwidth. The performance of these prototypes confirms the potential of the topology for high-frequency broadband and narrowband frequency multiplication.
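The even-harmonic property can be illustrated numerically. The sketch below is not the transfer function derived in Chapter 3. It assumes the textbook idealization of two cross-coupled bipolar pairs with opposite offsets, $\tanh((v+V_K)/2V_T) - \tanh((v-V_K)/2V_T)$ with $V_K = V_T \ln K$, and the emitter area ratio `area_ratio`, the drive amplitude, and the function names are hypothetical choices made for illustration.

```python
import cmath
import math

V_T = 0.02585  # thermal voltage near 300 K, in volts (assumed)


def unbalanced_pair_transfer(v_in: float, area_ratio: float = 8.0) -> float:
    """Normalized differential output of two cross-coupled unbalanced pairs.

    Idealized model (assumption): each pair behaves like a tanh with an
    input offset of +V_K or -V_K, where V_K = V_T * ln(area_ratio).
    The difference of the two tanh terms is an even function of v_in.
    """
    v_k = V_T * math.log(area_ratio)
    return (math.tanh((v_in + v_k) / (2 * V_T))
            - math.tanh((v_in - v_k) / (2 * V_T)))


def harmonic_amplitude(k: int, amplitude: float, n_samples: int = 256) -> float:
    """Amplitude of the k-th harmonic for a cosine input, via a direct DFT."""
    acc = 0j
    for n in range(n_samples):
        theta = 2 * math.pi * n / n_samples
        y = unbalanced_pair_transfer(amplitude * math.cos(theta))
        acc += y * cmath.exp(-1j * k * theta)
    return 2 * abs(acc) / n_samples


if __name__ == "__main__":
    for k in range(1, 5):
        print(k, harmonic_amplitude(k, amplitude=0.05))
```

Because the modeled transfer function is even, the output repeats every half period of the input, so the odd harmonics vanish to within rounding error while the second and fourth harmonics remain.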

#### **Frequency divider**

Frequency dividers operate in either static or dynamic mode. Static dividers cover a frequency range from DC to a maximum toggle frequency ($f_{max\_toggle}$). Dynamic dividers operate at frequencies above $f_{max\_toggle}$ using the principle of regenerative frequency division, but are limited to a minimum operating frequency ($f_{dyn-min}$). In Chapter 4, a dual-mode frequency divider was proposed. The dynamic/static, or dynastat, divider concept was proven with the implementation of a stand-alone prototype that is biased using a 4.5 V supply [8]. The dual modes of operation were verified by measuring the self-oscillation frequency in each mode (78 GHz in static mode and 133 GHz in dynamic mode). A low-voltage dynastat divider was also implemented using a -2.5 V DC supply. In this second implementation, the voltage that controls the operating mode was used to increase the sensitivity to the external 40-GHz clock. The new divider topology maximizes the operating bandwidth and introduces control of the divider input sensitivity.

#### **BiST, BiC, and digitally-controlled DA**

Replacing the input TL of a distributed amplifier with a digitally-controlled input interface was demonstrated to eliminate the dispersion, attenuation, ringing and pulse distortion [1]. Furthermore, the integration of calibration and test capabilities, demonstrated in this thesis, increases the yield and reduces the time and complexity required during production testing.

Chapter 5 presented the design of a $2^{11}$-1, 40-Gb/s PRBS generator, which is based on a linear feedback shift register operating at a half-rate clock, and a 2:1 interleaving output MUX. The design uses synthesized transmission lines for clock distribution, and it generates a trigger output via nine cascaded divide-by-two circuits. Furthermore, the registers adopt a new topology to operate from a -2.5 V supply voltage. The PRBS and trigger generator draw 250 mA and 220 mA, respectively, both from the same -2.5 V DC supply [9].
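As a behavioral illustration of a $2^{11}$-1 sequence, the sketch below models an 11-stage linear feedback shift register in software. The feedback polynomial $x^{11} + x^{9} + 1$ is an assumption (it is a commonly used PRBS-11 polynomial, and the text above does not state which one the chip implements). The half-rate clocking and the 2:1 output MUX are not modeled, and the function names are hypothetical.

```python
from typing import List

N_STAGES = 11
MASK = (1 << N_STAGES) - 1  # 0x7FF


def lfsr_step(state: int) -> int:
    """Advance the register by one bit period.

    Assumed taps: stage 11 (bit 10) and stage 9 (bit 8).
    """
    feedback = ((state >> 10) ^ (state >> 8)) & 1
    return ((state << 1) | feedback) & MASK


def prbs11(seed: int = MASK, nbits: int = 2047) -> List[int]:
    """Return nbits output bits taken from the last register stage."""
    if not 0 < seed <= MASK:
        raise ValueError("seed must be a non-zero 11-bit value")
    state = seed
    bits = []
    for _ in range(nbits):
        bits.append((state >> 10) & 1)
        state = lfsr_step(state)
    return bits


def sequence_period(seed: int = MASK) -> int:
    """Count the steps needed for the register state to return to the seed."""
    state = lfsr_step(seed)
    steps = 1
    while state != seed:
        state = lfsr_step(state)
        steps += 1
    return steps


if __name__ == "__main__":
    print("period:", sequence_period())
    print("ones per period:", sum(prbs11()))
```

A maximal-length register of this size repeats after $2^{11}-1 = 2047$ bits. The nine cascaded divide-by-two stages mentioned above correspond to a division ratio of $2^9 = 512$ for the trigger output.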

The design and characterization of a digitally-controlled MZM driver was presented in Chapter 6. Compared to previous work [1], innovations in the design of the new driver include: 1) built-in calibration capability for the digital input line facilitated by an on-chip energy detector and a 3-step calibration algorithm, 2) a shielded output line in standard aluminum top metal (not copper), 3) 10% less power consumption despite the 4x increase in data rate, 4) wideband operation across 28-48 Gb/s data rates enabled by a digitally-controlled clock phase generator, and 5) an on-chip 1-0 data sequence and a $2^{11}$-1 PRBS data source used for built-in self-test (BiST), calibration and characterization.

The driver produces up to 6 $V_{p-p}$ differential output voltage swing with excellent symmetry. It is the first to deliver edge rates trimmable from 6 to 12 ps rise/fall times. The timing jitter added by the driver is 1.3% of the period at 40 Gb/s, and the output return loss measured across 57 GHz is below -20 dB. The prototype was fabricated in a 0.13-µm BiCMOS technology and it consumes 1.99 W, which can be reduced to 1.55 W using bias-Ts at the output of the circuit.
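To put these figures in absolute terms, a short calculation follows. It assumes that "period" refers to the bit period $T_b$ at 40 Gb/s:

$$T_b = \frac{1}{40\ \text{Gb/s}} = 25\ \text{ps}, \qquad 0.013 \times T_b \approx 0.33\ \text{ps}, \qquad \frac{1.99\ \text{W} - 1.55\ \text{W}}{1.99\ \text{W}} \approx 22\,\%$$

Under this assumption the added jitter is roughly a third of a picosecond, and the bias-T option lowers the power consumption by about 22 %.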

### 7.2 Recommendations for future work

Recommendations for future work on benchmark circuits, similar to the ones presented here, and digitally-controlled modulator drivers are outlined in the following sub-sections.

#### **Broadband amplifiers**

Techniques used to extend the bandwidth of a Darlington pair feedback amplifier were reviewed in Chapter 2. However, variations of the Darlington pair result from combining bipolar and CMOS devices in a BiCMOS technology. Understanding the trade-offs of different device combinations, and implementing the amplifier in a differential configuration, are areas for future work. Examples include the use of transformers to extend the operating frequency, or neutralization using non-linear capacitors (C-V response) to improve the linearity.

#### **Frequency multipliers**

Broadband and narrowband frequency multipliers benefit from the low supply voltage required by unbalanced cross-coupled differential pairs. The narrowband frequency doubler presented in Chapter 3 should be implemented to demonstrate the low-power capability of this topology, and the frequency quadrupler presented in the same chapter should be improved further by applying the same optimization used in the frequency doubler.

The input and output networks used for broadband operation increased the overall power consumption. Applications that benefit from broadband operation or reconfigurability (e.g., a multi-standard transceiver) require further research on reconfigurable input and output networks to be used together with the low-voltage multiplier core presented in Chapter 3.

#### **Frequency dividers**

A frequency divider that can operate in dynamic or static mode (dynastat divider) overcomes the frequency range limitations of single-mode dividers. Furthermore, control of the operation mode via an electrical signal can also be used to adjust the divider sensitivity at frequencies between the static and dynamic self-oscillation frequencies. Automatic switching between the operation modes will allow the extension of the overall operating frequency without the need for external control. The implementation of automatic mode control remains to be investigated.

The dynastat divider concept was proven in a BiCMOS technology. However, frequency dividers implemented in CMOS technologies take advantage of the complementary devices. Therefore, further research is required to extend the concept of dual-mode dividers to other divider topologies.

#### **Distributed Amplifier for Multi-Gb/s Optical Modulator Drivers**

The 40 Gb/s driver with 6 $V_{p-p}$ output voltage demonstrated in this work introduced a new phase alignment approach compared to previous work. An energy detector circuit, a 3-step calibration sequence, and a $2^{11}$-1 PRBS BiST circuit were implemented successfully. However, integration of an analog-to-digital converter (ADC), microcontroller, and memory, which would facilitate autocalibration of the driver, was not realized on the same chip, and is proposed for future implementations.

For example, the microcontroller, data and instruction SRAM, and ADC implemented in a 0.13- $\mu$ m CMOS process in [10], which occupies 0.8  $\mu$ m x 0.8  $\mu$ m, can be integrated in the 0.13- $\mu$ m SiGe-BiCMOS process used to implement the driver.

The calibration circuit presented in this work requires a 1-0 data sequence. However, a pattern-independent calibration functionality should be investigated. For example, the retiming clocks of the digitally-controlled DA can be varied while the instantaneous power at the output of each gain stage ($P_{DAgain\_stage}$) and the DA output power ($P_{DAout}$) are measured. The code for minimum rise/fall times then corresponds to the one for which $|P_{DAout} - \Sigma P_{DAgain\_stage}|$ is minimum.
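The proposed search can be sketched in a few lines. This is only an outline of the idea under stated assumptions, not an implemented calibration routine. The callables `stage_powers` and `output_power` stand in for on-chip power measurements that do not yet exist, and the toy model used to exercise the search (`toy_stage_powers`, `toy_output_power`, `IDEAL_CODE`) is entirely hypothetical.

```python
from typing import Callable, Iterable, Sequence


def find_alignment_code(
    codes: Iterable[int],
    stage_powers: Callable[[int], Sequence[float]],
    output_power: Callable[[int], float],
) -> int:
    """Return the code that minimizes |P_DAout - sum(P_DAgain_stage)|."""
    best_code = None
    best_error = float("inf")
    for code in codes:
        error = abs(output_power(code) - sum(stage_powers(code)))
        if error < best_error:
            best_code, best_error = code, error
    if best_code is None:
        raise ValueError("no codes were supplied")
    return best_code


# Hypothetical stand-in for measurements: four stages of unit power, and an
# output power that drops as the retiming code moves away from IDEAL_CODE.
N_STAGES = 4
IDEAL_CODE = 11


def toy_stage_powers(code: int) -> Sequence[float]:
    return [1.0] * N_STAGES


def toy_output_power(code: int) -> float:
    misalignment = abs(code - IDEAL_CODE)
    return N_STAGES / (1.0 + 0.05 * misalignment)


if __name__ == "__main__":
    print(find_alignment_code(range(32), toy_stage_powers, toy_output_power))
```

An exhaustive sweep is used here for clarity. Whether a faster search is safe depends on how the measured error varies with the code, which remains to be characterized.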

The complexity of optical networks will continue to increase, and multiple standards should be expected to coexist. Operation of the driver at lower frequencies is limited by the phase alignment circuitry, which derives the phases for the DA stages using weighted I/Q vectors generated from a reference clock. This approach creates frequency-dependent phases, as explained in section 7.1.3 of Chapter 7. A frequency-independent clock phase generator is therefore required. A circuit with a variable time response could be realized using a reconfigurable ring oscillator that changes the number of stages according to the desired delay.

Silicon technologies have proven their high integration capacity in radio systems. Overall, software defined radio (SDR) integrates digital processing and facilitates flexible designs (reconfigurability) [11]. Similarly, the integration of data processing can also benefit optical links, where integration increases the optical channel capacity and reduces problems associated with the interconnection of different ICs. Further research is needed to integrate the functionality of different components of the optical link, such as equalization, serial-to-parallel conversion, symbol mapping, etc.

Future designs must limit their maximum output voltage with the reliability of the driver in mind, for example, the reduction of the time-to-failure caused by beta degradation.

This work was conservative because the output voltage remained below the device $BV_{CBO}$. However, the definition of a safe-operating-area (SOA) that considers all degradation mechanisms remains to be used as the limit for maximum operating conditions. Not only the definition of an SOA, but also its implementation as part of the device compact model, remain to be investigated.

Finally, scaling in CMOS technologies allows more functionality to be integrated in a given area, which makes them excellent candidates for the implementation of a digitally-controlled driver with higher functionality. A 10 Gb/s $6-V_{pp}$ differential voltage driver in a 65 nm CMOS process was presented in [12]. However, the limitations on the maximum speed and maximum output voltage remain to be investigated.

#### References

[1] Y. Zhao, L. Vera and J.R. Long, "A 10-Gb/s, 6-V p-p, digitally controlled, differential distributed amplifier MZM driver," *IEEE-JSSC*, vol. 49, no. 9, pp. 2030-2043, Sept. 2014.

[2] B.A. Orner, Q.Z. Liu, B. Rainey, et al., "A 0.13 $\mu$m BiCMOS technology featuring a 200/280 GHz (fT/fmax) SiGe HBT," Proc. of the IEEE-BCTM, Toulouse, France, pp. 203-206, Sept. 2003.

[3] J.J. Pekarik, J. Adkisson, P. Gray, et al., "A 90nm SiGe BiCMOS technology for mm-wave and high-performance analog applications," Proceedings of the IEEE-BCTM, San Diego CA, pp. 92-95, Oct. 2014.

[4] L. Vera, J.R. Long, and B.J. Gross, "A low-power SiGe feedback amplifier with over 110GHz bandwidth," Proceedings of IEEE-BCTM, San Diego CA, pp. 1-4, Sept. 2014.

[5] K. Kimura, "A bipolar four-quadrant analog quarter-square multiplier consisting of unbalanced emitter-coupled pairs and expansions of its input ranges," *IEEE Journal of Solid-State Circuits*, vol. 29, no. 1, pp. 46-55, Jan. 1994.

[6] L. Vera and J.R. Long, "A DC-100 GHz active frequency doubler with a low-voltage multiplier core," *IEEE Journal of Solid-State Circuits*, vol. 50, no. 9, pp. 1963-1973, Sept. 2015.

[7] L. Vera, J.R. Long, and B.J. Gross, "An active frequency doubler with DC-100GHz range," Proc. of the IEEE-BCTM, San Diego CA, pp. 9-12, Oct. 2014.

[8] L. Vera and J.R. Long, "A dynastat frequency divider with DC-153 GHz range," *Electronics Letters*, vol. 51, no. 12, pp. 908-910, June 2015.

[9] L. Vera and J.R. Long, "A 40 Gb/s low-power $2^{11}$-1 PRBS with distributed clocking and trigger countdown output," *IEEE Transactions on Circuits and Systems II*, accepted for publication, 2016.

[10] M. Khayatzadeh, Z. Xiaoyang, T. Jun Tan, et al., "A 0.7-V 17.4-µW 3-lead wireless ECG SoC," IEEE Transactions on Biomedical Circuits and Systems, vol. 7, no. 5, pp. 583-592, Oct. 2013.

[11] C. Moy, J. Palicot, "Software radio: a catalyst for wireless innovation," IEEE Communications Magazine, vol. 53, no. 9, pp. 24-30, September 2015.

[12] Y. Kim, W. Bae and D. K. Jeong, "A 10-Gb/s 6-Vpp differential modulator driver in 65-nm CMOS," 2014 IEEE International Symposium on Circuits and Systems (ISCAS), Melbourne VIC, pp. 1869-1872, 2014.

# **Appendix A**

A small-signal model for the Darlington resistive feedback amplifier is shown in Fig. A-1a. The model is valid at frequencies where the impedances are dominated by their real part (i.e., $f \ll f_T$).



Fig. A-1: Shunt feedback amplifier low-frequency model: (a) small-signal model, (b) signal flowgraph.

A flowgraph for the circuit is shown in Fig. A-1b. The generator, transistor base node, and output node voltages are represented by $E_g$, $V$, and $V_o$, respectively. The transmittances from one node to another are represented by $\alpha$, $G_a$ (active gain), $G_p$ (passive gain), and $H$ (feedback). The transmittance $\alpha = \frac{V}{E_g}\Big|_{V_o=0} = \frac{R_F||R_{be}}{R_G+R_F||R_{be}}$, and the active gain $G_a = \frac{V_o}{V}\Big|_{E_g=0} = -g_m \cdot (R_L||R_F)$. The passive gain via bridging resistance $R_F$ is $G_p = \frac{V_o}{V}\Big|_{E_g=0} = \frac{R_L}{R_L+R_F}$, where $R_o$ is assumed $\gg R_L$. Finally, $H = \frac{V}{V_o}\Big|_{E_g=0} = \frac{R_G||R_{be}}{R_F+R_G||R_{be}}$.



Fig. A-2: Shunt feedback amplifier low frequency signal flow

The signal flow diagram of the circuit, shown in Fig. A-2, includes its transmittances, and it is used to calculate the circuit voltage gain using Mason's gain rule

$$A_{v} = \frac{\alpha(G_{p} + G_{a})}{1 - (G_{p} + G_{a})H} . \tag{1}$$

Substituting the transmittances from Fig. A-2 into Eq. 1, and taking $R_{be}$ to be much larger than $R_F$ and $R_G$, gives

$$A_{v} = \frac{R_L - g_m R_F R_L}{R_L + R_F + R_G + g_m R_G R_L} = \frac{R_L}{R_L + R_F + R_G + g_m R_G R_L} - \frac{g_m R_F R_L}{R_L + R_F + R_G + g_m R_G R_L} \quad , \tag{2}$$

and because the first term is negligible  $(1/(1 + R_F/R_L + R_G/R_L + g_mR_G) \ll 1)$ , the voltage gain reduces to

$$A_{v} = -\frac{g_m R_F R_L}{R_L + R_F + R_G + g_m R_G R_L} \tag{3}$$

If the amplifier input and output impedances are matched (i.e.,  $R_G = R_L = R$ ), the voltage gain equals

$$A_{v} = \frac{-g_{m}R_{F}R}{2R + R_{F} + g_{m}R^{2}} \approx \frac{-g_{m}R_{F}R}{2(R + R_{F})} = \frac{-g_{m}(R_{F} \parallel R)}{2} . \tag{4}$$

In a matched amplifier, half the voltage from the generator reaches the amplifier input (the other half is dropped across the generator impedance). Therefore, the forward transmission coefficient $(S_{21})$ of a matched amplifier equals twice the voltage gain, and from Eq. 4

$$S_{21} = 2A_v = -g_m(R_F \parallel R) . \tag{5}$$

The input and output impedance can be obtained from the feedback impedance divided by  $(1 - G_a)$ , assuming that the passive gain  $G_p$  is negligible, and the resistances  $R_{be}$  and  $R_o$  are much larger than the feedback impedance. Then,

$$Z_{in} = Z_{out} = \frac{R_F}{1 - S_{21}} \tag{6}$$
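The chain from Eq. 1 to Eq. 6 can be checked numerically. The sketch below evaluates Mason's gain rule directly from the four transmittances and compares it with the closed forms. All component values are hypothetical and chosen only for illustration: $g_m = 0.1$ S, $R = 50\ \Omega$, a very large $R_{be}$, and $R_F = g_m R^2 = 250\ \Omega$, the value for which the approximation in Eq. 4 holds with equality.

```python
import math


def parallel(a: float, b: float) -> float:
    """Resistance of a and b in parallel."""
    return a * b / (a + b)


def mason_gain(g_m: float, r_f: float, r_g: float, r_l: float, r_be: float) -> float:
    """Voltage gain from Eq. 1, using the transmittances of Fig. A-2."""
    alpha = parallel(r_f, r_be) / (r_g + parallel(r_f, r_be))
    g_a = -g_m * parallel(r_l, r_f)
    g_p = r_l / (r_l + r_f)
    h = parallel(r_g, r_be) / (r_f + parallel(r_g, r_be))
    return alpha * (g_p + g_a) / (1 - (g_p + g_a) * h)


def gain_eq2(g_m: float, r_f: float, r_g: float, r_l: float) -> float:
    """Closed-form gain of Eq. 2 (R_be taken as infinite)."""
    return (r_l - g_m * r_f * r_l) / (r_l + r_f + r_g + g_m * r_g * r_l)


def gain_eq3(g_m: float, r_f: float, r_g: float, r_l: float) -> float:
    """Eq. 3: Eq. 2 with the small passive term dropped."""
    return -g_m * r_f * r_l / (r_l + r_f + r_g + g_m * r_g * r_l)


def s21_matched(g_m: float, r_f: float, r: float) -> float:
    """Eq. 5 for R_G = R_L = R."""
    return -g_m * parallel(r_f, r)


def z_in(g_m: float, r_f: float, r: float) -> float:
    """Eq. 6."""
    return r_f / (1 - s21_matched(g_m, r_f, r))


if __name__ == "__main__":
    g_m, r, r_be = 0.1, 50.0, 1e12   # illustrative values (assumptions)
    r_f = g_m * r * r                # 250 ohm
    print("Eq. 1:", mason_gain(g_m, r_f, r, r, r_be))
    print("Eq. 2:", gain_eq2(g_m, r_f, r, r))
    print("Eq. 3:", gain_eq3(g_m, r_f, r, r))
    print("S21 (dB):", 20 * math.log10(abs(s21_matched(g_m, r_f, r))))
    print("Z_in (ohm):", z_in(g_m, r_f, r))
```

With these values the input impedance from Eq. 6 comes out close to the 50 $\Omega$ generator impedance, which is consistent with the matched condition assumed in Eq. 4.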

# **Appendix B**

A small-signal model for a transistor embedded within a generator ( $E_g$  with series resistance  $R_G$ ) and a load ( $R_L$ ) is shown in Fig. B-1. The model is valid at frequencies where the transistor impedances are dominated by their imaginary component (i.e.,  $f \approx f_T/3$ ).



Fig. B-1: Small-signal circuit for frequency response analysis.

The frequency response of the circuit in Fig. B-1 can be estimated by calculating its dominant pole:

$$s\{[C_{in} + C_F(1 + g_m R_L)]R_G + (C_L + C_F)R_L\} = sC_{in}R_G + sC_F R_G(1 + g_m R_L) + sC_L R_L + sC_F R_L \tag{1}$$

Fig. B-2: Simplified small-signal circuit for frequency response analysis.

If resistive feedback is included in the circuit, as shown in Fig. B-2, $sC_F$ is replaced by $(sC_{\mu}R_F + 1)/R_F$ in Eq. 1, which becomes

$$\begin{aligned}
& sC_{in}R_{G} + (sC_{\mu} + 1/R_{F})R_{G}(1 + g_{m}R_{L}) + sC_{L}R_{L} + (sC_{\mu} + 1/R_{F})R_{L} \\
&= sC_{in}R_{G} + sC_{\mu}R_{G}(1 + g_{m}R_{L}) + R_{G}/R_{F}(1 + g_{m}R_{L}) + sC_{L}R_{L} + sC_{\mu}R_{L} + R_{L}/R_{F} \\
&= (1/R_{F})(R_{G}(1 + g_{m}R_{L}) + R_{L}) + s\{[C_{in} + C_{\mu}(1 + g_{m}R_{L})]R_{G} + (C_{L} + C_{\mu})R_{L}\}.
\end{aligned} \tag{2}$$

Then, the circuit frequency response can be calculated as

$$\frac{R_F}{g_m R_L R_G + R_G + R_L} s\{ [C_{in} + C_\mu (1 + g_m R_L)] R_G + (C_L + C_\mu) R_L \} , \qquad (3)$$

and the circuit time constant $\tau_i$ is proportional to the components:

$$\tau_i \propto C_{in} R_G + C_{\mu} R_G (1 + g_m R_L) + C_L R_L + C_{\mu} R_L \tag{4}$$
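A small numerical sketch of Eqs. 3 and 4 follows. It reads the coefficient of $s$ in Eq. 3 as the dominant time constant and converts it to a bandwidth with the usual first-order estimate $f_{3dB} \approx 1/(2\pi\tau)$, which is an assumption and not a result stated above. The component values are hypothetical and serve only to show the effect of the resistive feedback factor.

```python
import math


def open_loop_time_constant(c_in: float, c_mu: float, c_l: float,
                            g_m: float, r_g: float, r_l: float) -> float:
    """Sum of the RC terms in Eq. 4 (seconds)."""
    return (c_in * r_g
            + c_mu * r_g * (1 + g_m * r_l)
            + c_l * r_l
            + c_mu * r_l)


def feedback_time_constant(c_in: float, c_mu: float, c_l: float,
                           g_m: float, r_g: float, r_l: float,
                           r_f: float) -> float:
    """Coefficient of s in Eq. 3: the Eq. 4 sum scaled by the feedback factor."""
    scale = r_f / (g_m * r_l * r_g + r_g + r_l)
    return scale * open_loop_time_constant(c_in, c_mu, c_l, g_m, r_g, r_l)


def bandwidth_estimate(tau: float) -> float:
    """First-order (single-pole) bandwidth estimate in Hz."""
    return 1.0 / (2.0 * math.pi * tau)


if __name__ == "__main__":
    # Illustrative values only (assumptions, not taken from the thesis).
    params = dict(c_in=50e-15, c_mu=5e-15, c_l=20e-15,
                  g_m=0.1, r_g=50.0, r_l=50.0)
    tau_open = open_loop_time_constant(**params)
    tau_fb = feedback_time_constant(r_f=250.0, **params)
    print("tau without feedback factor (ps):", tau_open * 1e12)
    print("tau with feedback factor (ps):", tau_fb * 1e12)
    print("estimated bandwidth (GHz):", bandwidth_estimate(tau_fb) / 1e9)
```

For these values the feedback factor $R_F/(g_m R_L R_G + R_G + R_L)$ is less than one, so the time constant shrinks and the estimated bandwidth increases. The Miller-multiplied term $C_{\mu} R_G (1 + g_m R_L)$ is the second-largest contribution in this example.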

### LIST OF PUBLICATIONS

### **Journal Papers**

**L. Vera** and J.R. Long, "A 40-Gb/s, 6-V<sub>pp</sub>, Digitally-Calibrated MZM Driver with On-Chip Calibration and 2<sup>11</sup>-1 PRBS Circuits," *IEEE Journal of Solid-State Circuits*, accepted for publication, 2016.

**L. Vera** and J.R. Long, "A 40 Gb/s Low-Power 2<sup>11</sup>-1 PRBS with Distributed Clocking and Trigger Countdown Output," *IEEE Transactions on Circuits and Systems II*, accepted for publication, 2016.

**L. Vera** and J.R. Long, "A DC-100 GHz Active Frequency Doubler with a Low-Voltage Multiplier Core," *IEEE Journal of Solid-State Circuits*, vol. 50, no. 9, pp. 1963-1973, Sept. 2015.

**L. Vera** and J.R. Long, "A Dynastat Frequency Divider with DC-153 GHz Range," *Electronics Letters*, vol. 51, no. 12, pp. 908-910, June 2015.

Y. Zhao, **L. Vera**, J.R. Long, "A 10 Gb/s, 6 V p-p, Digitally Controlled, Differential Distributed Amplifier MZM Driver," *IEEE Journal of Solid-State Circuits*, vol. 49, no.9, pp. 2030-2043, Sept. 2014.

J. R. Long, Y. Zhao, W. Wu, M. Spirito, **L. Vera**, and E. Gordon, "Passive Circuit Technologies for mm-wave Wireless Systems On Silicon," *IEEE Transactions on Circuits and Systems I*: Regular papers, vol. 59, no. 8, pp. 1680–1693, Aug. 2012.

### **Conference Papers**

**L. Vera**, J.R. Long, and B.J. Gross, "A Low-Power SiGe Feedback Amplifier with Over 110 GHz Bandwidth," Proceedings of the IEEE Bipolar/BiCMOS Circuits and Technology Meeting, San Diego CA, pp. 1-4, Oct. 2014.

**L. Vera**, J.R. Long, and B.J. Gross, "An Active Frequency Doubler with DC-100GHz Range," Proceedings of the IEEE Bipolar/BiCMOS Circuits and Technology Meeting, San Diego CA, pp. 9-12, Oct. 2014.

J.J. Pekarik, J. Adkisson, P. Gray, Q. Liu, R. Camillo-Castillo, M. Khater, V. Jain, B. Zetterlund, A. DiVergilio, X. Tian, A. Vallett, J. Ellis-Monaghan, B.J. Gross, P. Cheng, V. Kaushal, Z. He, J. Lukaitis, K. Newton, M. Kerbaugh, N. Cahoon, **L. Vera**, Y. Zhao, J.R. Long, A. Valdes-Garcia, S. Reynolds, W. Lee, B. Sadhu, D. Harame, "A 90nm SiGe BiCMOS Technology for mm-wave and High-Performance Analog Applications," Proceedings of the IEEE Bipolar/BiCMOS Circuits and Technology Meeting, San Diego CA, pp. 92-95, Oct. 2014.

Y. Zhao, **L. Vera**, J. R. Long, and D. L. Harame, "A 10 Gb/s 6  $V_{pp}$  Differential Modulator Driver in 0.18 µm SiGe-BiCMOS," in Technical Digest of IEEE International Solid-State Circuits Conference, Feb. 2013, pp. 132–133.

# FABRICATED ICs



DC-79 GHz Darlington broadband amplifier - BBA (BiCMOS 9HP)



DC-100 GHz Frequency doubler (BiCMOS 9HP)



DC-100 GHz Peaked Darlington BBA (BiCMOS 9HP)



DC-100 GHz Frequency doubler (BiCMOS 9HP)



DC-123 GHz Cascoded, peaked Darlington BBA (BiCMOS 9HP)



81-97 GHz Frequency quadrupler (BiCMOS 9HP)



DC-129 GHz Dynastat Frequency divider (BiCMOS 9HP)



10 Gb/s Digitally controlled DA and BiST PRBS (BiCMOS 7WL)



40 Gb/s PRBS (BiCMOS 8HP)



10 Gb/s Digitally controlled DA (BiCMOS 7WL)



40 Gb/s Digitally controlled DA, BiST PRBS, and BiC power detector (BiCMOS 8HP)

## ACKNOWLEDGEMENTS

There have been many people supporting me through the realization of this thesis. My gratitude, affection and a humble *thank you* to all of them. Pursuing the understanding of circuit design has been made a more enjoyable experience thanks to my colleagues Simonetta, Wei Liat, Gunjan, Yi, Andres, Nitz, Luca, Yanyu, Vahid, Zhebin, Amir, Masouds, Iman, Augusto, Wuanghua and Akshay. Furthermore, I would like to thank my friends from the Huygens Talent Circle, with whom I could share the passion for what we do. Interesting, smart and genuine: Eszti, Zack, Yenni, Steffie, Genia, Knut and Brenda.

Marion and Rosario, a special thanks for your constant support, positive attitude and team spirit. Antoon, Ali, and Atef are always there for anyone who needs their knowledge and skills, and anyone who gets to work with them appreciates it. IC design is not complete without the practical skills required to prepare a chip for testing, and everybody realizes this once they start to work with Will. Thanks for the countless times you were there to help and teach those skills.

During my years at TU Delft I shared the office with people who became friends. Thanks Armin for your friendship and always positive attitude. Thijmen and Janneke, you opened your hearts to me, thank you very much for the wonderful friendship and for those many times you invited me into your life. Loek and Inneke, you are terrific people and I feel blessed for having you in my life.

During my years in the Netherlands, I gained three new brothers: Patrick, Alejandro, and Marijn. Each of you has an incredible life partner by your side, Giovanna, Ilse, and Lin, and now your little ones Sebastian, Faye and Danya. All of you are in my mind and heart. Part of my PhD was realized in the USA, where I made many professional and personal friends: Thomas, Jeff, Biz, Jack, Ginny, David, Mary Lou and Peter. A special couple became part of my world-wide family, Kurt and Donna. You (along with all my other family members) helped me to find a balance in my life.

Being from Bolivia, I come from a big family, more than thirty people who have supported me in more than one way. My family is always there for me and they know I am there for them. My two sisters (Paola and Fabiola), with their husbands (Hernan and Rodrigo) and my wonderful nephews and nieces (Camilo, Monserrat, Andres and Fernanda) are the world to me. My parents Carlos and Sonia always set the example and inspiration for permanent improvement as a human being. My wife Alejandra, the love of my life, whom I admire and love, thanks for being by my side bringing happiness to our home.

TU Delft is a university known worldwide, and the people in this institution are the reason for that. At an excellent academic level, there are plenty of professors who not only teach you but inspire you: Professors Wouter, Marco, Leo, Chris, and Nick. Your lectures, discussions and excellent work contribute every day to the education in our field, and each of you, with your own personality, makes the experience at TU Delft unique.

Prof. John R. Long granted me the opportunity to pursue my Ph.D. I am not the person to summarize your knowledge and contributions to IC design; that is clear from the professional respect you have from all your colleagues. I want to thank the man, John, a private person whom I am privileged to know. Your consistency in life, incredible generosity and attitude to do what is right are more than inspiring. All these characteristics are difficult to find in a person, and that is why you have the admiration of the people who get to know you. I am grateful to be one of them. Thanks John.