

## M.Sc. Thesis

## Digital Cartesian feedback linearization of power amplifiers

Alejandro Enrique Viteri Vogel

#### Abstract

The efficient use of the power budget in mobile and small-satellite applications is of primary importance because the power sources are small, so the limited supply has to be spent wisely. In a transmitter, the power amplifier is the main consumer of that budget, which is why it has to be as efficient as possible. Unfortunately, high efficiency in power amplifiers is strongly related to non-linearity and hence distortion. Several techniques help to improve the linearity of the power amplifier; one of the most powerful is negative feedback.

In this work, a mixed-signal Cartesian feedback system is analyzed. Cartesian feedback poses two challenges: stability and phase shift. A feedback system with a time delay in the loop is prone to instability. The phase shift results from the time delay in the loop and from the non-linearities of the power amplifier.

First, a model of the mixed-signal system is proposed and its stability is analyzed. Second, a model of the phase shift is proposed, and the conditions under which the phase shift can be reduced are given. The model is implemented in the digital domain. To realize this, a design consisting of a phase shift detector, a signal rotation and a magnitude computation was written in VHDL and then synthesized, targeting both an FPGA and a 90[nm] CMOS technology.

The FPGA implementation shows a power consumption of 33.31[mW] out of a total budget of 1.7[W] (1.96% of the total budget). The system reaches a phase margin of 60° with a loop gain of 10, for a bandwidth of 9.6[kHz]. The results show that Cartesian feedback can improve linearity at the expense of bandwidth.



## Digital Cartesian feedback linearization of power amplifiers

THESIS

submitted in partial fulfillment of the requirements for the degree of

MASTER OF SCIENCE

in

MICROELECTRONICS

by

Alejandro Enrique Viteri Vogel, born in Arica, Chile

This work was performed in:

Circuits and Systems Group, Department of Microelectronics & Computer Engineering, Faculty of Electrical Engineering, Mathematics and Computer Science, Delft University of Technology



**Delft University of Technology**

Copyright © 2010 Circuits and Systems Group. All rights reserved.

#### Delft University of Technology Department of Microelectronics & Computer Engineering

The undersigned hereby certify that they have read and recommend to the Faculty of Electrical Engineering, Mathematics and Computer Science for acceptance a thesis entitled "Digital Cartesian feedback linearization of power amplifiers" by Alejandro Enrique Viteri Vogel in partial fulfillment of the requirements for the degree of Master of Science.

Dated: July 5, 2010

Chairman:

prof.dr.ir. Edoardo Charbon, CAS, TU Delft

Advisor:

dr.ir. Nick van der Meijs, CAS, TU Delft

Committee Members:

ir. Thijmen Hamoen, ELCA, TU Delft

dr. Amir Zjajo, CAS, TU Delft

## Acknowledgments

This adventure has come to an end for me. It was a long adventure, longer than I, and probably many others, had expected. But, finally, it is finished. In the end, what is left are the pleasant feelings I have experienced while working on this thesis and all the knowledge that I have gained during this process.

Like every adventure, it has had both its tough and its delightful moments. Tough situations have taught me that quitting is not an option and that you sometimes have to fight to reach your goal. Delightful situations have made me realize that life is wonderful and that I have to be thankful for the life I am leading now.

None of my adventure experiences would have had a meaning without my accomplices:

- The Circuits and Systems Group. Thanks for the opportunity to do my M.Sc. thesis work with you.

- Nick van der Meijs. Thanks for accepting me under your supervision. Your constant feedback on details of form and format was as important as the content of this work. Without your keen eye this thesis would not have had the proper presentation of an academic work.

- Thijmen Hamoen. Thanks for your dedication and time spent on reviewing the analog modeling and thanks for your valuable feedback.

- Amir Zjajo. Thanks for your interest in my work and your dedication in reviewing my document.

- Thanks to the Committee Members for their time and corrections.

A special thanks to my friend and colleague Ariel Leonardo Vera Villarroel. Thanks for all the hours spent in long discussions; your opinions and suggestions were always helpful and welcome. We have proved that Chileans and Bolivians can be friends and that we can contribute to a common future of happiness and prosperity.

Ilse, my friend, my partner, my love. Thank you, thank you, thank you. You have shown me that unconditional love really exists. You met me as a student, offering no certainty or stability for the future, but still you chose me. Thank you for your patience and for the long wait. Part of this work is thanks to you. You have contributed to improving my written English. Even though you did not understand much of it, everyone else will enjoy a fluent and well-punctuated document.

And last but never least, my parents, for their unconditional support given from a distance. Thanks for your love and care in every moment of my life. Without you I would not have embarked on this amazing adventure.

Alejandro Enrique Viteri Vogel

Delft, The Netherlands

July 5, 2010

### Contents

- Acknowledgments
- 1 Introduction
  - 1.1 Motivation
  - 1.2 Related work
  - 1.3 Project goal
  - 1.4 Synopsis
- 2 Background
  - 2.1
    - 2.1.1 Parallel tuned Class E power amplifier simulation
  - 2.2
  - 2.3
    - 2.3.1 Distortion effects due to the on-resistance
    - 2.3.2 AM-AM distortion due to gate-drain capacitance
    - 2.3.3 AM-PM distortion due to gate-drain capacitance
  - 2.4
    - 2.4.1 Multi-Carrier CFB
    - 2.4.2
    - 2.4.3
  - 2.5 Summary
- 3 Mixed-signal Cartesian feedback
  - 3.1
    - 3.1.1 The asymptotic-gain model
    - 3.1.2 Distortion in a feedback system
  - 3.2 Mixed-signal Cartesian feedback modeling
    - 3.2.1
    - 3.2.2
    - 3.2.3
    - 3.2.4 The mixed-signal model in the frequency domain
    - 3.2.5
  - 3.3
    - 3.3.1
    - 3.3.2
    - 3.3.3
    - 3.3.4 Checking the phase shift detection and correction
  - 3.4
  - 3.5 Summary
- 4 Mixed-signal Cartesian feedback digital design
  - 4.1 Mixed-signal Cartesian feedback system description
  - 4.2 Determining the bit resolution
    - 4.2.1 Behavioral HDL co-simulation
  - 4.3 Designing for low power consumption
  - 4.4 Digital block design alternatives
    - 4.4.1 Comparator
    - 4.4.2 Phase detector
    - 4.4.3 Phase error correction (Rotation)
    - 4.4.4 Magnitude
  - 4.5 Architecture comparisons
  - 4.6 Architecture realization
    - 4.6.1 Datapath
    - 4.6.2 Control blocks
  - 4.7 System specifications for twelve bits
  - 4.8 Components requirements
  - 4.9 Summary
- 5 Digital implementation
  - 5.1 Component selection
    - 5.1.1 Adders
    - 5.1.2 Multipliers
    - 5.1.3 Adders and multipliers in FPGA
  - 5.2 Design flow
  - 5.3 Verification
    - 5.3.1 Test bench
    - 5.3.2 Simulation
  - 5.4 Implementation
    - 5.4.1 FPGA
    - 5.4.2 ASIC 90 nm CMOS technology
  - 5.5 Summary
- 6 Conclusions and recommendations
  - 6.1 Conclusions
  - 6.2 Recommendations
- A Cartesian feedback background
  - A.1 Study of a typical Cartesian feedback
    - A.1.1 Mathematical representation of the delay in the frequency domain
    - A.1.2 Stability analysis
  - A.2 The problem of instability in a Cartesian feedback system
- B Simulation environments
  - B.1 MATLAB®/Simulink® model simulation environment
  - B.2 ADS® HDL co-simulation environment
    - B.2.1 Analog part (Block$_\epsilon$)
    - B.2.2 Lead compensator
    - B.2.3 Data generator
- C Detailed summary tables
  - C.1 FPGA implementation summaries
    - C.1.1 FPGA power
    - C.1.2 FPGA area summary
  - C.2 ASIC implementation summaries
    - C.2.1 ASIC timing
    - C.2.2 ASIC power
    - C.2.3 ASIC area
- Bibliography

## List of Figures

| Figure | Caption |
|--------|---------|
| 2.1 | Class E power amplifier circuit diagram |
| 2.2 | Class E power amplifier circuit diagram with an on-resistance in the switch model |
| 2.3 | Time domain simulation of Class E power amplifier switch |
| 2.4 | Envelope elimination restoration diagram |
| 2.5 | Class E power amplifier circuit diagram with an on-resistance and gate-drain capacitance in the switch model |
| 2.6 | AM-AM and AM-PM distortion caused by the on-resistance |
| 2.7 | AM-AM distortion caused by the $C_{GD}$ capacitance |
| 2.8 | AM-PM distortion caused by the $C_{GD}$ capacitance |
| 2.9 | Power amplifier linearization techniques |
| 2.10 | Cartesian feedback description |
| 2.11 | Multi-carrier CFB |
| 2.12 | Dynamically biased CFB |
| 2.13 | Dynamically biased CFB with separated dynamically bias circuit |
| 2.14 | Cartesian feedback with self correcting phase delay |
| 2.15 | Soft landing Cartesian feedback |
| 2.16 | Cartesian feedback with automatic phase alignment |
| 2.17 | Automatic phase alignment for a fully integrated Cartesian feedback |
| 2.18 | Mixed-signal Cartesian feedback with automatic phase alignment |
| 3.1 | Superposition model |
| 3.2 | Feedback system diagram with distortion sources |
| 3.3 | Mixed-signal model of the Cartesian feedback system |
| 3.4 | Reconstruction of the digital signal |
| 3.5 | Frequency spectrum of the baseband signal and its images after sampling and hold |
| 3.6 | Diagram of the information signal up and down conversion |
| 3.7 | Frequency spectrum of the feedback signal after down-conversion |
| 3.8 | Single path diagram of the mixed-signal Cartesian feedback |
| 3.9 | Equivalent diagram of the mixed-signal model in the frequency domain |
| 3.10 | Root locus for a system with two poles |
| 3.11 | Realization of an active lead compensator filter |
| 3.12 | Filter behavior with the addition of one zero in the forward path |
| 3.13 | Automatic phase alignment for a fully integrated Cartesian feedback |
| 3.14 | Phase regulation diagram |
| 3.15 | Cartesian feedback stability plots |
| 3.16 | Two-tones test Open and Closed loop comparison |
| 3.17 | Phase shift versus time |
| 4.1 | Mixed-signal Cartesian feedback model |
| 4.2 | Error step response |
| 4.3 | The effect of the bit length resolution in the frequency domain |
| 4.4 | The effect of the bit length resolution in the time domain |
| 4.5 | Scheme of the digital block |
| 4.6 | Comparator realization |
| 4.7 | Phase detection realization |
| 4.8 | IIR in transposed direct form II |
| 4.9 | Rotation realization with real multipliers and LUTs |
| 4.10 | Complex multiplier realization |
| 4.11 | Realization of a CORDIC algorithm in circular rotation mode |
| 4.12 | Square root realization for four bit input |
| 4.13 | Realization of a CORDIC algorithm in circular vectoring mode |
| 4.14 | Absolute value realization |
| 4.15 | Architecture realization of the digital block |
| 4.16 | Rotation computation timing diagram |
| 4.17 | Magnitude computation timing diagram |
| 5.1 | Power and area Pareto points for parallel adders architectures |
| 5.2 | Power and area Pareto points for several multiplier architectures |
| 5.3 | Xilinx and ASIC design flow |
| 5.4 | ADS test bench diagram |
| 5.5 | ModelSim test bench |
| 5.6 | Test vectors signals |
| 5.7 | Output test vectors signals |
| 5.8 | Baseband frequency spectrum of the information signal |
| 5.9 | FPGA critical path description |
| 5.10 | FPGA floor plan after Place & Route |
| 5.11 | FPGA routing after Place & Route |
| 5.12 | ASIC critical path description |
| A.1 | Diagram of a typical Cartesian feedback system |
| A.2 | Root locus of the delay approximation |
| A.3 | Representation of $e^{-Ts}$ with different orders |
| A.4 | Equivalent diagram of a first-order Cartesian loop |
| A.5 | Typical Cartesian feedback with phase misalignment |
| B.1 | Simulink model of the Cartesian feedback system |
| B.2 | ADS model of the Cartesian feedback system |
| B.3 | ADS model of the analog block |
| B.4 | Forward filter |
| B.5 | Feedback filter |
| B.6 | Lead compensator |
| B.7 | Data generator schematic |
| B.8 | Input data |

## List of Tables

| Table | Caption |
|-------|---------|
| 2.1 | Comparison of the Class E power amplifier efficiency before and after tuning |
| 2.2 | Class E lumped elements design values used to analyze distortion |
| 3.1 | Satellite transmitter specifications |
| 3.2 | Signal to noise ratio for several bit words length |
| 3.3 | Order of the forward path low pass filter for several sampling frequencies |
| 3.4 | Order of the feedback path low pass filter for several cutoff frequencies |
| 3.5 | Cartesian feedback system setup for stability |
| 3.6 | Cartesian feedback system setup for different bandwidths |
| 4.1 | Bit resolution length comparison |
| 4.2 | IIR coefficients values |
| 4.3 | Complex Multiplier maximum speed and power consumption for various supply voltages |
| 4.4 | Systolic squarer speed and power consumption |
| 4.5 | Square rooter power consumption comparison |
| 4.6 | Architectures |
| 4.7 | Architectures |
| 4.8 | Cartesian feedback system specifications for a twelve bits architecture |
| 4.9 | Component requirements for a twelve bits architecture |
| 5.1 | Maximum and minimum test vectors signal values |
| 5.2 | FPGA timing summary |
| 5.3 | FPGA digital design power consumption |
| 5.4 | FPGA resource usage summary |
| 5.5 | Faraday FSD0A_A library characteristics |
| 5.6 | ASIC critical path summary |
| 5.7 | ASIC Core power summary |
| 5.8 | ASIC Core area summary |
| 6.1 | Implementation results |
| C.1 | FPGA power summary |
| C.2 | FPGA resource usage by components |
| C.3 | ASIC critical path summary |
| C.4 | ASIC Core power summary |
| C.5 | ASIC Core area summary |

## Introduction

#### 1.1 Motivation

In November 2004 the Delfi program was born. Its mission: to put a nanosatellite in orbit. This program, carried out by students of the faculties of Aerospace Engineering and Electrical Engineering, Mathematics and Computer Science, consists of several missions in which different payloads will be tested. The first mission, Delfi-$C^3$, was successfully launched in April 2008. The next mission, Delfi-n3Xt, will consist of improvements on Delfi-$C^3$ as well as new payloads. One of these payloads is the ITRIX, an efficient and modular transceiver module.

One of the key aspects of the ITRIX is a highly efficient, power-agile switching power amplifier in the transmitter. A highly efficient power amplifier loses little power to dissipation, which extends the lifetime of the batteries and/or makes better use of the power delivered by the solar panels.

However, the high efficiency of the switching power amplifier has one important drawback: non-linearity. A highly efficient switching power amplifier is also highly non-linear. Therefore, to use such an amplifier in the ITRIX's transmitter, a mechanism to linearize it is required. One of the most powerful means to linearize a system is negative feedback. It will prove to be an excellent solution to the linearization problem, but some difficulties have to be dealt with: stability, loop gain and power consumption. The system has to be stable in order to accomplish its function, and sufficient loop gain has to be provided in order to improve linearity. The linearization system also has to consume as little power as possible, so that as much of the available power as possible is left for amplification.
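As a rough illustration of why loop gain matters, classical feedback theory says that distortion injected at the output of the amplifier is suppressed by approximately the factor 1 + L, where L is the loop gain. The sketch below is a minimal, idealized calculation that assumes a linear, frequency-independent loop gain; the function name and the example values are assumptions chosen for illustration and are not results from this thesis.

```python
import math


def distortion_reduction_db(loop_gain: float) -> float:
    """Idealized suppression, in dB, of distortion injected at the output
    of a negative-feedback loop: 20 * log10(1 + L).

    Assumption: L is a real, non-negative, frequency-independent loop gain.
    """
    if loop_gain < 0:
        raise ValueError("loop gain must be non-negative")
    return 20.0 * math.log10(1.0 + loop_gain)


# Hypothetical loop gains, for illustration only.
for loop_gain in (1.0, 10.0, 100.0):
    print(f"L = {loop_gain:5.1f} -> {distortion_reduction_db(loop_gain):.1f} dB")
```

In a real loop the gain falls with frequency and the delay limits how much gain can be applied before the system becomes unstable, which is why stability and loop gain have to be treated together.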

The work described in this thesis focuses on the design of a digital Cartesian feedback linearization system for a switched mode power amplifier.

The design will be implemented in a field programmable gate array (FPGA) and synthesized for an ASIC in a 90[nm] CMOS process. The power consumption of both implementations will be estimated and compared.

#### 1.2 Related work

Feedback linearization has its origins in the late 1970s and early 1980s. Polar-loop [1] and Cartesian [2] feedback were two alternative forms of feedback linearization. Polar-loop decomposes the signal into phase and magnitude (polar coordinates), while Cartesian feedback decomposes the signal into its perpendicular (in-phase and quadrature) components. However, these techniques did not become popular until two decades later. It was not until the early 2000s that Cartesian feedback became a matter of study again [3][4]. Advances in CMOS IC fabrication, with reduced production costs and improved technology, made it possible to fabricate such systems.

Nowadays, with the increased demand for computation and the reduced size of mobile devices and satellites, power consumption is a main issue. The power budget has to be used as efficiently as possible. This means that the most power-hungry component, the power amplifier, has to be highly efficient. Most of the previous work on Cartesian feedback has focused on linear power amplifiers such as Class A, B and AB. These classes of power amplifiers are linear, but not sufficiently efficient. To improve efficiency, a different class of power amplifier, such as a switched-mode power amplifier, becomes a better choice. Recent work has tried to linearize a Class E switched-mode power amplifier [5], but the linearization technique applied was polar-loop. In 2006 a digital implementation of Cartesian feedback was presented in [6, 7].
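The difference between the two decompositions can be sketched for a single complex baseband sample. The snippet below is only an illustration: the helper names and the sample value are hypothetical, and it says nothing about how the loops in [1] or [2] are actually built.

```python
import cmath
import math
from typing import Tuple


def to_cartesian(sample: complex) -> Tuple[float, float]:
    """Return the in-phase (I) and quadrature (Q) components."""
    return sample.real, sample.imag


def to_polar(sample: complex) -> Tuple[float, float]:
    """Return the magnitude and the phase (in radians)."""
    return abs(sample), cmath.phase(sample)


# Hypothetical baseband sample: magnitude 2, phase 60 degrees.
sample = cmath.rect(2.0, math.pi / 3)

i_part, q_part = to_cartesian(sample)
magnitude, phase = to_polar(sample)

print(f"Cartesian: I = {i_part:.3f}, Q = {q_part:.3f}")
print(f"Polar:     |s| = {magnitude:.3f}, phase = {math.degrees(phase):.1f} deg")
```

Both pairs describe the same sample; a Cartesian loop feeds back I and Q as two similar paths, whereas a polar loop handles magnitude and phase separately.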

#### 1.3 Project goal

This project aims to design a digital Cartesian feedback system that linearizes a Class E power amplifier. To achieve this, several steps have to be taken.

First, a model of the system that meets the specifications of the transceiver has to be developed. For that, a study of feedback theory applied to amplifiers is necessary.

Second, once a suitable model has been found, simulations will be carried out to validate the correct behavior of the system. In addition, the algorithms to detect and correct the phase will be tested.

Third, different digital architectures will be analyzed with respect to area usage and minimum delay. The selected digital architecture will be coded in VHDL and synthesized targeting an FPGA and ASIC in 90[nm] CMOS technology.

#### 1.4 Synopsis

This thesis is organized as follows:

**Chapter 2** presents the parallel tuned Class E power amplifier and the envelope elimination restoration (EER) technique. The main sources of distortion in the Class E PA are studied. As distortion leads to non-linear behavior, a linearization mechanism has to be applied to the Power Amplifier. Different Cartesian feedback linearization techniques are introduced.

**Chapter 3** is devoted to the study of the Cartesian feedback system. A linearization model for the Class E PA is presented, followed by an analysis of stability. The main factors that cause the system to become unstable are addressed, and ways to cope with them are presented. All the conditions and restrictions found during the analysis will be used to determine the digital architecture of the system.

**Chapter 4** is dedicated to the selection of the architecture. Different ways of implementation will be presented with their advantages and disadvantages. A combination of arithmetic units such as CORDIC, multipliers, adders/subtractors and look-up tables will be used to build the digital signal processing circuit that implements the Cartesian feedback system.

**Chapter 5** is devoted to the implementation of the architecture that was selected in the previous chapter. Following the design flow presented in this chapter, the Register Transfer Level (RTL) architecture, written in a Hardware Description Language (HDL), will be simulated, synthesized, and placed and routed. The resulting design will be implemented on an FPGA and its power will be estimated. The same design will be implemented in a 90[nm] technology and its power will be estimated.

**Chapter 6** contains the conclusions and recommendations for future work on this topic.

## Background

A Radio Frequency (RF) system is composed of several electronic circuit blocks, each with a specific function. One of them is the Power Amplifier (PA), whose function is to amplify the signal before it is delivered to the antenna for transmission. The PA, as its name indicates, amplifies the power of the signal to be radiated by the antenna. The amount of power is determined by the requirements of the application.

As a consequence, the PA is the main consumer of the battery power budget in a transmitter device. Power consumption is of the utmost importance in mobile and satellite applications. In today's appliances, low weight and small dimensions limit the size of batteries and solar panels. In order to cope with a limited power budget, the electronics must be as efficient as possible.

Efficiency, in terms of power, is defined as the relationship between the available power and the output power delivered to the load. For a PA to be efficient all of the available power has to be delivered to the output with no losses on the way.
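The definition above can be written as a one-line calculation. The sketch below is illustrative only: the function name and the power values are hypothetical and do not come from measurements in this work.

```python
def power_efficiency(p_out_w: float, p_available_w: float) -> float:
    """Ratio of the power delivered to the load to the available power."""
    if p_available_w <= 0:
        raise ValueError("available power must be positive")
    return p_out_w / p_available_w


# Hypothetical values: 2.0 W available, 1.5 W delivered to the load.
p_available = 2.0
p_out = 1.5

eta = power_efficiency(p_out, p_available)
dissipated = p_available - p_out  # power lost on the way to the load

print(f"efficiency = {eta:.0%}, dissipated = {dissipated:.2f} W")
```

Whatever is not delivered to the load is dissipated, which is why a small gain in PA efficiency translates directly into a longer battery lifetime.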

The most power efficient types of RF PAs are generally non-linear. Non-linear means that the output is not proportional to the input: the output signal of the PA is not simply a scaled copy of the input signal. The output signal is distorted.

Distortion of the envelope and phase generates intermodulation components outside the frequency band of interest and therefore pollutes adjacent frequency bands. Moreover, the characteristics of an amplifier do not remain static: operating conditions such as temperature, both internal and external, and aging will affect them. To counter this non-linear behavior, a linearization mechanism must be applied to reduce the distortion, and it has to be robust enough to cope with these changes.

The benefits of a linear PA are the capability to amplify signals with any combination of amplitude and phase modulation, broadening the selection of modulation schemes and using the power budget more efficiently.

When efficiency is important, a switched-mode PA is the best choice. As the name suggests, the transistor acts as a switch. Ideally a switch does not dissipate power. In reality it does, but with a well tuned network the dissipated power can be reduced to a minimum.

The switched-mode PAs include the Class D and Class E power amplifiers. The Class E power amplifier is of particular interest for this work. It is highly efficient for reasons that will be discussed later, but it has the drawback that it eliminates the envelope of the modulated signal. Therefore, it is a perfect choice for constant envelope modulation schemes. In order to use it for non-constant envelope modulation schemes, the transistor's power supply has to be modulated. One alternative is to use another PA as a modulator. This technique is known as envelope elimination and restoration (EER). A Class E power amplifier combined with the EER technique yields a highly efficient PA for any type of known modulation scheme. However, linearity remains an important concern. The switching behavior of the PA is not ideal: parasitic capacitances and a variable on/off resistance introduce non-linearities. To cope with this non-linearity, a linearization technique has to be applied to the PA.

Cartesian Feedback (CFB) is one of the techniques used to linearize PAs. It uses negative feedback of the modulation components. Different approaches to implement the technique have been developed depending on the application and the class of PA to be used.

In the following sections the Class E power amplifier and the EER technique are reviewed and the sources of distortion are explained. Different linearization techniques are introduced and the Cartesian feedback is analyzed in detail.

#### 2.1 The Class E power amplifier

The Class E power amplifier was first proposed in [8], in 1975. The basic circuit of a Class E power amplifier is depicted in Figure 2.1. The load network configuration consists of the shunt capacitance, $C_1$, the series inductance, $L_1$, and a series filter consisting of $C_0$, $L_0$ and $R_{load}$, tuned to the fundamental frequency to provide a high level of harmonic suppression. The loaded quality factor Q of the series resonant circuit, consisting of the inductor $L_0$ and the capacitor $C_0$, tuned to the fundamental frequency $\omega_0 = 1/\sqrt{L_0C_0}$, should be sufficiently high for the output current to be sinusoidal.



Figure 2.1: Class E power amplifier circuit diagram. [8]

The transistor operates as an on-off switch. The shapes of the current and voltage waveforms are such that high current and high voltage do not occur simultaneously; this minimizes the power dissipation and maximizes the power amplifier efficiency.

The Class E power amplifier utilizes a technique called soft switching. This technique describes, in the time domain, the behavior of the current and voltage signals such that the transition of the transistor from ON to OFF (and vice versa) occurs when the current or the voltage is at its minimum value. The PA is tuned to fulfill the soft switching conditions described below:

1. Minimum voltage across the device when the current flows through it.
2. Minimum current through the device when the voltage exists across it.
3. Minimum switching time.
4. Voltage delay at switch turn off.
5. Voltage return to zero at switch turn on.
6. Zero voltage slope at switch turn on.
7. The voltage and current transient response waveforms should have a flat top.

These conditions are, mathematically, expressed as:

$$\text{Class E} \Leftrightarrow \begin{cases} v_{DS}(t_1) = 0\\ \left.\frac{dv_{DS}(t)}{dt}\right|_{t=t_1} = 0 \end{cases} \tag{2.1}$$

where  $t_1$  is the instant in which the switch closes, and  $v_{DS}(t_1)$  is the transistor drain source voltage at switching time.

When choosing the transistor it is important to consider the following:

1. The drain voltage that occurs when the switch is open is high [5],

$$v_{DS,max} = 2\pi \left[\frac{\pi}{2} - \arctan\left(\frac{\pi}{2}\right)\right] \times V_{DD} \approx 3.5620 \times V_{DD} \tag{2.2}$$

where  $V_{DD}$  is the supply voltage.

2. A key property of the Class E power amplifier is the time separation of drain voltage and drain current. High voltage and high current never coincide, and when the transistor starts to conduct current, the drain voltage is close to zero due to the external passive network. As such, the Class E amplifier is limited by oxide breakdown and junction breakdown but not by hot carrier injection. The breakdown drain voltage of a CMOS technology, for a zero drain current, is typically two to three times the supply voltage [5].
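The factor in (2.2) can be checked numerically. The following is a minimal sketch that only evaluates the closed-form expression; the 12 [V] supply is the value used in the simulations later in this chapter, and the function name is chosen here for illustration.

```python
import math

def class_e_peak_voltage_factor():
    """Ratio v_DS,max / V_DD according to (2.2)."""
    return 2.0 * math.pi * (math.pi / 2.0 - math.atan(math.pi / 2.0))

factor = class_e_peak_voltage_factor()
v_dd = 12.0  # supply voltage in volts (value used in the simulations of this chapter)

print(f"v_DS,max / V_DD = {factor:.4f}")
print(f"v_DS,max        = {factor * v_dd:.2f} V for V_DD = {v_dd:.0f} V")
```

The result is the ideal-switch peak voltage that the transistor breakdown voltage has to exceed; a simulation that includes an on-resistance may give a slightly different value.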

In the parallel tuned Class E power amplifier,  $L_0$  and  $C_0$  are not part of the soft switch tuning. They are tuned to the fundamental switching frequency. The parallel tuned Class E power amplifier is the configuration selected to be linearized using the Cartesian feedback technique. The main reasons for that selection are [9]:

1. **Reliability**: The voltage across the switch is smaller, therefore the transistor is less likely to break down.
2. **Integrability**: $L_1$ can be made small enough to fit into a chip.
3. **Distortion**: The suppression of the $2^{nd}$ and $3^{rd}$ harmonics is better due to the same output power during switch on and switch off.

#### 2.1.1 Parallel tuned Class E power amplifier simulation



Figure 2.2: Class E power amplifier circuit diagram with an on-resistance in the switch model.

The procedure to obtain $L_1$, $C_1$, $L_0$ and $C_0$ is described in [9]. The relationships obtained for these components are as follows:

$$L_0 = \frac{Q R_{load}}{2\pi f_c} \tag{2.3}$$

$$C_0 = \frac{1}{2\pi f_c Q R_{load}} \tag{2.4}$$

$$L_1 = 0.7322 \frac{R_{load}}{2\pi f_c} \tag{2.5}$$

$$C_1 = \frac{0.6858}{2\pi f_c R_{load}} \tag{2.6}$$

Simulations, assuming the relationships described above, were carried out for a carrier frequency of 144 [MHz] with a load resistance of 50 [$\Omega$]. The behavior of the current and voltage in the transistor was plotted and the efficiency was obtained. The results show that for an ideal switch model (no on-resistance) the efficiency is 99.7%. Using the tuning method described in [9], it is possible to optimize $L_1$ and $C_1$ for maximum efficiency of the PA when the value of the on-resistance is included in the model. As a consequence the efficiency is reduced, because the switch model is closer to the real behavior. Table 2.1 shows the comparison of the values before and after tuning.
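As an illustration of (2.3) to (2.6), the sketch below evaluates the four design equations exactly as written, in SI units. The carrier frequency and load resistance are the values stated above, and Q = 6.0 is the value quoted with Table 2.1; the function and variable names are chosen here for illustration. With these inputs, $L_0$, $L_1$ and $C_1$ reproduce the "Before" column of Table 2.1.

```python
import math

def class_e_components(r_load, f_c, q):
    """Initial (untuned) component values from (2.3)-(2.6), in henry and farad."""
    omega_c = 2.0 * math.pi * f_c
    return {
        "L0": q * r_load / omega_c,          # (2.3)
        "C0": 1.0 / (omega_c * q * r_load),  # (2.4)
        "L1": 0.7322 * r_load / omega_c,     # (2.5)
        "C1": 0.6858 / (omega_c * r_load),   # (2.6)
    }

values = class_e_components(r_load=50.0, f_c=144e6, q=6.0)
for name, value in values.items():
    # Inductors are printed in nH, capacitors in pF.
    unit, scale = ("nH", 1e9) if name.startswith("L") else ("pF", 1e12)
    print(f"{name} = {value * scale:.2f} {unit}")
```

These are starting values only; the tuning step described in [9] adjusts $L_1$ and $C_1$ once the on-resistance is included.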

The signals shown in Figure 2.3 were obtained by a time simulation of the circuit shown in Figure 2.2. The transistor was replaced by an ideal switch model with settable on-resistance, set to 0.5 [$\Omega$]. The switch was driven with a 144 [MHz] signal and the supply voltage, $V_{DD}$, was set to 12 [V]. The figure shows the soft switching behavior. The overlap between $V_{DS}$ and $I_{DS}$ is small; consequently, the power dissipated in the transistor will be small. It is also possible to see the maximum value that $V_{DS}$ reaches, 43.24 [V]. This is 3.6 times the supply voltage, which agrees with (2.2). The breakdown voltage of the transistor is an important parameter when selecting an adequate transistor for this application.

Table 2.1: Comparison of the Class E power amplifier efficiency before and after tuning. $R_{load}=50\,[\Omega]$, $f_c=144\,[\text{MHz}]$, $Q=6.0$ and $V_{DD}=12\,[\text{V}]$.

| Parameter  | Before           | After            |
|------------|------------------|------------------|
| $r_{on}$   |                  | $0.5 \ [\Omega]$ |
| $L_0$      | $331.57 \; [nH]$ | $331.57 \; [nH]$ |
| $C_0$      | $11.57 \; [pF]$  | $11.57 \; [pF]$  |
| $L_1$      | $40.46 \; [nH]$  | $25.42 \; [nH]$  |
| $C_1$      | $15.16 \; [pF]$  | $33.04 \; [pF]$  |
| Efficiency | 99.7%            | 96.99%           |

It has been shown that the Class E power amplifier is highly efficient [8, 9]. However, its switching characteristic discards the envelope, which makes it suitable only for modulation schemes in which the envelope is constant. Nevertheless, it is desired to use this highly efficient power amplifier to transmit not only phase modulation but also amplitude modulation. The only way in which the output voltage can be changed is by changing the supply voltage ($V_{DD}$) of the amplifier. The efficiency of the Class E power amplifier does not depend on the supply voltage, therefore changing the supply voltage is an efficient way to change the amplitude of the output signal. A technique such as envelope elimination and restoration (EER) is required to restore the envelope of the input signal. The EER technique is reviewed in the following section.



Figure 2.3: Time domain simulation of Class E power amplifier switch.  $V_{DS}$  (left vertical axis) and  $I_{DS}$  (right vertical axis) for the parallel tuned Class E power amplifier.  $R_{load} = 50 \ [\Omega], f_c = 144 [MHz].$ 



Figure 2.4: Envelope elimination restoration diagram and its signals at different stages in the system. [5]

#### 2.2 Envelope elimination restoration (EER)

The envelope elimination and restoration (EER) method was first proposed in [10], in 1952. Its advantage is that it allows the use of non-linear RF power amplifiers, which are more efficient than linear ones.

Figure 2.4 shows a realization of the EER technique and the signals at each stage in the system. The modulated RF signal, $v_{in}(t)$, is decomposed into phase and magnitude. The magnitude, $A(t)$, is obtained via an envelope detector, which drives a low frequency power amplifier (LF-PA). Because the transistor of the PA works as a switch, the amplitude of its input signal is not important. A limiter is used to eliminate the envelope and drive the input of the PA with a signal that varies only in phase.

The output of the LF-PA, $V_{DD}(t)$, is an amplification of the input signal's envelope. This signal becomes the supply voltage of the PA. As a result the output signal of the RF PA, $v_{out}(t)$, is modulated, restoring the envelope.

$$\begin{aligned}
v_{in}(t) &= A(t) \cdot \cos\left(\omega_c t + P(t)\right) \\
V_{DD}(t) &= \gamma \cdot A(t) \\
v_{out}(t) &= \gamma \cdot A(t) \cdot \cos\left(\omega_c t + P(t)\right)
\end{aligned}$$
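The three relations can be illustrated with a short numerical sketch. It assumes, purely for illustration, that the modulated signal is described by a pair of baseband components $(I, Q)$, so that $A = \sqrt{I^2 + Q^2}$ and $P = \arctan(Q/I)$; this representation, the gain `gamma` and all function names are assumptions introduced here, and the PA is idealized as scaling a unit-amplitude carrier by its supply voltage.

```python
import math

def envelope_and_phase(i, q):
    """Split a baseband sample into envelope A and phase P (radians)."""
    return math.hypot(i, q), math.atan2(q, i)

def eer_output(i, q, t, omega_c, gamma):
    """One sample of v_out(t) assembled the EER way."""
    a, p = envelope_and_phase(i, q)
    v_dd = gamma * a                      # LF-PA output: amplified envelope
    limited = math.cos(omega_c * t + p)   # limiter output: carrier with phase only
    return v_dd * limited                 # idealized PA: supply scales the carrier

def linear_output(i, q, t, omega_c, gamma):
    """Reference: the same sample from an ideal linear amplifier of gain gamma."""
    return gamma * (i * math.cos(omega_c * t) - q * math.sin(omega_c * t))

omega_c = 2.0 * math.pi * 144e6  # carrier frequency used in this chapter
gamma = 12.0                     # hypothetical LF-PA gain
samples = [(0.6, 0.3), (-0.2, 0.9), (0.0, -0.5)]
worst = max(
    abs(eer_output(i, q, n * 1e-9, omega_c, gamma)
        - linear_output(i, q, n * 1e-9, omega_c, gamma))
    for n, (i, q) in enumerate(samples)
)
print(f"largest difference between EER and linear output: {worst:.2e}")
```

Under these idealizations the two outputs agree to within rounding error, which is the sense in which the envelope is restored.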

An important difference between the Class E power amplifier and the LF-PA is the large difference between their time constants. The PA operates in the megahertz or gigahertz range while the LF-PA operates in the kilohertz range, a difference of three to six orders of magnitude. On the one hand, the PA sees the supply voltage as a constant value over a single period of the RF frequency; on the other hand, the LF-PA sees the PA as a load.

For a detailed analysis of an EER solution see [5], where different approaches have been presented with their advantages and disadvantages.

| Component    | Value  | Units      |
|--------------|--------|------------|
| $L_0$        | 331.57 | [nH]       |
| $C_0$        | 11.57  | [pF]       |
| $Q$          | 6.0    |            |
| $L_1$        | 25.42  | [nH]       |
| $C_1$        | 33.04  | [pF]       |
| $R_{load}$   | 50     | $[\Omega]$ |
| $f_c$        | 144    | [MHz]      |
| $r_{on}$     | 0.5    | $[\Omega]$ |
| $V_{DD}$     | 12     | [V]        |

Table 2.2: Class E lumped elements design values used to analyze distortion.

#### 2.3 Distortion in the Class E power amplifier

To analyze distortion, the ideal switch in Figure 2.1 has been replaced with a switch model that includes an on-resistance and a gate-drain capacitance to model parasitics. It was shown in the previous section that including an on-resistance $r_{on}$ produces a small loss in power and therefore in efficiency. The on-resistance also produces a distortion of the signal.

Including a small parasitic capacitance between drain and gate, $C_{GD}$, creates a direct path from input to output, which results in AM-AM and AM-PM distortion. Figure 2.5 shows the circuit diagram of the Class E power amplifier with the transistor modeled as a switch with a series on-resistance and a capacitance in series with the input terminal. $L_1$, $C_1$, $L_0$ and $C_0$ were calculated in the previous section and correspond to the values obtained after tuning the PA. Table 2.2 lists the values used to analyze distortion.

#### 2.3.1 Distortion effects due to the on-resistance

The on-resistance of a transistor is not constant and depends on the relationship between the drain-source voltage and the drain current. For a large swing of the drain-source voltage the on-resistance will show a noticeable variation. Of course, this variation will depend on the technology and the dimensions of the transistor.

Figure 2.5: Class E power amplifier circuit diagram with an on-resistance and gate-drain capacitance in the switch model.

Figure 2.6: AM-AM and AM-PM distortion caused by the on-resistance: (a) Relationship between the output envelope and the on-resistance; (b) Relationship between the phase and the on-resistance.

To see the effect of the on-resistance in the Class E power amplifier, a signal with a fundamental frequency, $f_c$, of 144 [MHz] is used to drive the circuit of Figure 2.5. The supply voltage, $V_{DD}$, is kept at 12 [V] and the on-resistance is swept from 0 to 1 [$\Omega$]. The capacitance $C_{GD}$ is short circuited.

Figure 2.6(a) shows the results of the simulation. Here the output envelope is normalized to the value obtained for an on-resistance of 0.5 [$\Omega$]. The output signal changes for different values of the on-resistance. Figure 2.6(b) shows the effect on the phase; here the output phase is given relative to the one obtained for an on-resistance of 0.5 [$\Omega$]. In conclusion, the on-resistance creates both amplitude and phase distortion.

#### 2.3.2 AM-AM distortion due to gate-drain capacitance

To observe the effect of the gate-drain capacitance, the circuit of Figure 2.5 is simulated. This time the on-resistance is kept at 0.5 [$\Omega$] and the supply voltage is varied. Three simulations were run for three different values of $C_{GD}$: 1 [pF], 2 [pF] and 3 [pF].

Figure 2.7(a) shows the relationship between the supply voltage and the output voltage. The relationship is linear for high values of the supply voltage for any of the three values of $C_{GD}$. A closer look shows that the linearity disappears for low values of the supply voltage (Figure 2.7(b)). The effect of the gate-drain capacitance is more noticeable at low supply voltages. If the supply voltage changes (which is the case when the amplitude is modulated by means of EER), it is preferable to use a modulation scheme in which the trajectories of the transmitted symbols do not cross the origin, as is the case with $\pi/4$-DQPSK. This reduces the AM-AM distortion originating from the Class E power amplifier.



Figure 2.7: AM-AM distortion caused by the  $C_{GD}$  capacitance: (a) Relationship between supply voltage and output voltage; (b) Closer look around the origin to show the non-linear behavior.

#### 2.3.3 AM-PM distortion due to gate-drain capacitance

The effect on the phase caused by the gate-drain capacitance is shown in Figure 2.8. The presence of a gate-drain capacitance (Figure 2.5) changes the behavior of the output phase. With  $C_{GD}$  a direct path from gate to drain exists.

Figure 2.8(a) shows the effect of the gate-drain capacitance. When the ideal switch is closed by means of a high voltage at the input, an RC circuit composed of $C_{GD}$ in series with $r_{on}$ is present. This circuit adds some dynamics to the behavior of the switch. For higher frequencies it becomes more noticeable due to the small values of $C_{GD}$. As a consequence, a shift from 180° will be present for low supply voltages. These results show that a gate-drain capacitance generates a distortion in the phase for low supply voltages.



Figure 2.8: AM-PM distortion caused by the  $C_{GD}$  capacitance: (a) Relationship between supply voltage (log scale) and output phase; (b) Relationship between supply voltage (linear scale) and output phase.

In the present section the distortion produced by the Class E PA has been shown. The on-resistance as well as the gate-drain capacitance cause AM-AM and AM-PM distortion. This distortion is more noticeable at low supply voltage levels. When amplitude modulation is required these types of distortion will be present, therefore a mechanism to reduce distortion is required to improve linearity. The following section presents the existing linearization techniques using Cartesian feedback. Cartesian feedback is the linearization technique selected for the present work.

#### 2.4 Cartesian feedback linearization techniques

Figure 2.9 shows different linearization techniques that have been proposed. For a description of them see [4, 11]. The technique of interest for the present work is Cartesian feedback (red line flow in Figure 2.9). Cartesian feedback is a technique used to linearize power amplifiers. It was first suggested in [12] to reduce the distortion of a quadrature delta modulator. As shown in Figure 2.9, Cartesian feedback falls under the baseband category because the feedback loop is closed at baseband instead of at the carrier frequency.

In [2], Cartesian feedback is used, for the first time, with conventional modulators. The results for an experimental transmitter (a phasing-type single-sideband (SSB) generator with the addition of feedback) with a 1 [W] peak envelope power (PEP) Class AB power amplifier, operating at a carrier frequency of 2.5 [MHz] and an information signal bandwidth of 100 [kHz], showed a suppression of 70 [dB] for the third order products; without feedback, the suppression was only 42 [dB].



Figure 2.9: Power amplifier linearization techniques [4]



Figure 2.10: Cartesian feedback: (a) System diagram. (b) Symbol rotation due to phase shift.

**Cartesian feedback operates as follows:** The information signal is decomposed into its orthogonal components, In-phase (I) and Quadrature (Q), as shown in Figure 2.10(a). These signals are compared with the feedback signals (I') and (Q') and then converted into a single modulated signal. The process of modulation and signal combination is called up-conversion. The modulated signal is then amplified and sent to the antenna. In the feedback path a sample of the output signal is down-converted in order to recover the in-phase and quadrature signals. A block called "Loop Compensator" is included in the forward path to indicate that some method of compensation could be required in order to ensure stability.

A consequence of the Cartesian feedback is the phase shift between the input signals and the feedback signals. This phase shift is caused by delays and non-linearities of the loop components. A correction mechanism will be required to reduce the phase shift. Figure 2.10(b) shows the symbol rotation due to the phase shift.
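The symbol rotation can be illustrated with a few lines of complex arithmetic. The sketch below treats the loop as an ideal unity-gain path that only adds a phase shift, which is a simplifying assumption; the function name is chosen here for illustration.

```python
import cmath
import math

def rotate_symbol(i, q, phase_shift_deg):
    """Return (I', Q') seen at the feedback input for a given loop phase shift."""
    z = complex(i, q) * cmath.exp(1j * math.radians(phase_shift_deg))
    return z.real, z.imag

for shift in (0.0, 45.0, 90.0, 180.0):
    i_fb, q_fb = rotate_symbol(1.0, 0.0, shift)
    print(f"shift {shift:5.1f} deg -> I' = {i_fb:+.3f}, Q' = {q_fb:+.3f}")
```

At 90° a purely in-phase symbol appears entirely on the quadrature channel, and at 180° the feedback changes sign, so the comparison with the input would add the feedback instead of subtracting it. This is why the phase shift has to be corrected before the loop is closed.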

Cartesian feedback improves linearity by reducing the distortion of the non-linear elements that are present in the forward path, such as the up-converter and the PA. An in-depth review of Cartesian feedback will be given in Appendix A.

Different Cartesian feedback linearization techniques have been developed (automatically supervised, multi-loop and dynamically biased), as Figure 2.9 shows. The following sections describe them.

#### 2.4.1 Multi-Carrier CFB

The work presented in [13] uses CFB as a way to linearize a multi-carrier power amplifier. The main idea is to eliminate the cavity filter that is required for each of the carriers, before entering the combiner. This cavity filter allows filtering crossmodulation and intermodulation.

If the PA is linear, these types of distortion will be reduced and the cavity filter is not required. Cartesian feedback modules (CFBM), Figure 2.11(a), are deployed in each carrier to correct the phase and amplitude. The scheme is similar to that shown in Figure 2.10. Figure 2.11(b) shows the complete system. The feedback signal is obtained from the output via a coupler, then a splitter separates each carrier to feed the corresponding CFBM. This technique requires a phase adjustment, which can be done by delaying each local oscillator (LO) or by using another technique such as the automatically supervised technique.

Figure 2.11: Multi-carrier CFB [13]. (a) Single Cartesian feedback module (CFBM) for a single carrier linearization. (b) Scheme of several CFBMs combined to drive a multi-carrier PA.

Time domain simulations of the system in Figure 2.11(b) show a suppression of unwanted intermodulation products of about 30 [dB] for a setup of five channels separated by 1250 [kHz] and a 50 [W] Class AB PA.

#### 2.4.2 Dynamically Biased CFB

Linearization schemes can be classified in two groups: one that modifies the power supply of the PA and the other that modifies the drive of the PA. The dynamically biased technique uses the two schemes to improve linearity and achieve high efficiency.

The work presented in [14] proposes an RF PA in which feedback is used to control amplitude and phase distortion at the same time. This technique was applied to a bipolar transistor at the onset of gain saturation, sensing the output signal and comparing it to the input signal in order to correct the collector supply voltage. Here only the envelope is used as feedback to correct phase and voltage, as Figure 2.12(a) shows; however, it is also possible to use phase feedback with small changes in the configuration.

An improvement is shown in Figure 2.12(b): the magnitude and sign of the RF input signal are sensed to give an additional control signal that adapts the feedback conditions of the amplifier to maintain a constant impedance match over the full power range. Results of this implementation show a -40 [dB] reduction of spurious output for a two-tone test at VHF and 70% efficiency.

Figure 2.12: Dynamically biased CFB [14]. (a) Envelope feedback. (b) Improvement for input impedance match.

Figure 2.13: Dynamically biased CFB with separated dynamic bias circuit [4].

The work presented in [4] proposes an alternative implementation that differs from that presented in [14] in that the linearization process is completely separated from the dynamic bias process, as shown in Figure 2.13. The Dynamic Bias Circuits (dashed box) are included in the feedback system to provide bias and power supply. The magnitude is obtained from the orthogonal inputs $(R = \sqrt{I_{in}^2 + Q_{in}^2})$, and two blocks, MAP C (collector) and MAP B (base), select the optimum amplifier RF power supply and base bias respectively.

Results from measurements in [4] show that, for Class A and Class AB amplifiers, the intermodulation distortion is improved by 36 [dB] for a two-tone test and by 44 [dB] for a $\pi/4$-QPSK signal (the latter is better because $\pi/4$-QPSK avoids the zero crossing), with a collector efficiency of 42%.

#### 2.4.3 Automatically Supervised CFB

Automatically Supervised CFB, proposed in [15], aims to include a self-correcting system that detects the phase difference between the feedback and input signals and uses the error signal to control a phase-shifting network. Figure 2.14(b) describes the phase error detection and compensation control. The phase-shifting network takes the output of the control signal. Sine and cosine signals, generated by two 8-bit ROMs, are multiplied by the local oscillator and added to form the output signal. The angle of the sine and cosine is the phase correction. The control block continuously monitors the phase error of the feedback signals. A digital phase detector, implemented with an XOR gate, is used to detect the phase variation. Its output is proportional to the cosine of the phase difference. To continuously monitor the phase error, the DC value must be extracted. A low pass filter with a 5 [Hz] cut-off frequency extracts the DC and switches a threshold detector to indicate the start of the phase compensation. The problem is that its delay is too high (135 [ms] per step) to be used for a complete cycle of compensation (8.5 [s] for 64 steps). Therefore a 2.5 [kHz] tone replaces the message once the DC has been detected, opening the loop. A second filter with a cut-off frequency of 1.9 [kHz] is then used to detect the DC component, lowering the delay to 0.55 [ms] per step (40 [ms] for 64 steps). This detection switches another threshold, which indicates the end of the phase compensation and closes the loop.

Figure 2.14: High frequency Cartesian feedback with self-correcting phase delay [15]. (a) High frequency Cartesian-loop transmitter. (b) Phase error detection and compensation control.

The routine followed to compensate starts with the detection of the DC by the 5 [Hz] filter. Then the message signal is replaced by the 2.5 [kHz] tone and the phase error is sensed. The phase-shifting network starts to rotate until the phase error falls below four degrees; then the second threshold is activated, halting the routine. The phase correction remains unchanged until the phase error exceeds the value set for the threshold (25 degrees). With this scheme the phase error of the feedback signal is corrected to within 4 degrees in a time of 40 [ms].
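The routine can be summarized as a small search loop. The sketch below is a behavioral illustration only, not the circuit of [15]: it assumes that the 64 steps of the phase-shifting network cover one full turn (5.625 degrees per step) and it replaces the filters and the XOR detector with a direct computation of the residual phase error. The two thresholds are the ones quoted in the text; all names are chosen here for illustration.

```python
START_THRESHOLD_DEG = 25.0  # error that triggers a compensation cycle (from the text)
STOP_THRESHOLD_DEG = 4.0    # error below which the routine halts (from the text)
STEPS = 64                  # number of phase-shifter steps (from the text)
STEP_DEG = 360.0 / STEPS    # assumption: the 64 steps cover one full turn

def wrap_deg(angle):
    """Wrap an angle in degrees to the interval [-180, 180)."""
    return (angle + 180.0) % 360.0 - 180.0

def compensate(loop_phase_deg, correction_deg=0.0):
    """Return (correction, steps_used) after one supervision pass."""
    residual = wrap_deg(loop_phase_deg - correction_deg)
    if abs(residual) <= START_THRESHOLD_DEG:
        return correction_deg, 0  # loop stays closed, nothing to do
    for step in range(1, STEPS + 1):
        # Rotate the phase shifter by one step and sense the error again.
        correction_deg = wrap_deg(correction_deg + STEP_DEG)
        residual = wrap_deg(loop_phase_deg - correction_deg)
        if abs(residual) < STOP_THRESHOLD_DEG:
            return correction_deg, step
    return correction_deg, STEPS

correction, steps_used = compensate(100.0)
print(f"correction = {correction:.3f} deg after {steps_used} steps")
```

Because the assumed step size is smaller than the 8 degree window around zero error, the search always terminates within the 64 steps.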

**Soft landing CFB** is another type of automatically supervised technique. Developed for TDMA [16], soft landing CFB provides a method for adaptive tracking control of the local phase, as shown in Figure 2.15(a). An endless phase shifter (EPS), Figure 2.15(b), controls the phase of the local oscillator used in the demodulator, and a variable gain amplifier (VGA) controls the loop gain by varying the gain of the baseband error signals.



Figure 2.15: Soft landing Cartesian feedback [16]. (a) Configuration of the SL-CFB amplifier. (b) Endless phase shifter (EPS) control circuit.

Automatic phase alignment, presented in [18], proposes a scheme to automatically detect and correct the phase misalignment for a Cartesian feedback. A continuous control of the phase is developed in which the phase difference between the input signal and the demodulated feedback signal is computed as:

$$IQ' - QI' = rr'\sin(\theta - \theta') \tag{2.7}$$

where I and Q are the in-phase and quadrature input signals and I' and Q' are the in-phase and quadrature feedback signals; r and r' are the magnitude of the input and feedback signals respectively.
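The cross product in (2.7) is easy to check numerically. The sketch below assumes that $\theta$ and $\theta'$ denote the angles of the input and feedback vectors, i.e. $I = r\cos\theta$, $Q = r\sin\theta$, $I' = r'\cos\theta'$ and $Q' = r'\sin\theta'$; these definitions and all names are assumptions made here. Under this assumption the product evaluates to $rr'\sin(\theta' - \theta)$, which is (2.7) up to the sign convention chosen for the phase difference.

```python
import math

def phase_detector(i, q, i_fb, q_fb):
    """Cross product I*Q' - Q*I' of the input and feedback vectors."""
    return i * q_fb - q * i_fb

r, theta = 1.0, math.radians(30.0)        # input magnitude and angle
r_fb, theta_fb = 0.8, math.radians(40.0)  # feedback magnitude and angle

i, q = r * math.cos(theta), r * math.sin(theta)
i_fb, q_fb = r_fb * math.cos(theta_fb), r_fb * math.sin(theta_fb)

detected = phase_detector(i, q, i_fb, q_fb)
expected = r * r_fb * math.sin(theta_fb - theta)
small_angle = r * r_fb * (theta_fb - theta)  # sin(x) ~ x for small misalignment

print(f"detected    = {detected:.6f}")
print(f"r r' sin    = {expected:.6f}")
print(f"r r' dtheta = {small_angle:.6f}")
```

For small misalignments the detector output is approximately proportional to the phase difference itself, scaled by both magnitudes.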

Figure 2.16(a) shows a Cartesian feedback transmitter with automatic phase alignment. The phase correction is applied to the local oscillator that drives the upconversion system. The implementation of (2.7) can be understood as a linearized phase-locked loop (PLL) model, where $\Delta \theta$, a voltage proportional to the phase misalignment (in radians), is computed and used to correct the signals prior to upconversion, as shown in Figure 2.16(b).

In [17, 3] a fully integrated Cartesian feedback is designed. In this work the phase misalignment is not corrected in the LO, but directly in the quadrature signals via a rotation matrix, as shown in Figure 2.17. Results show a 6 [dB] reduction in the third harmonic for a 1 [kHz] sine test signal driving the I channel while the Q channel is grounded. Also, the phase regulation never exceeds nine degrees, which is adequate to keep the CFB loop stable.

Figure 2.16: Cartesian feedback with automatic phase alignment [17]. (a) Typical CFB system with the inclusion of automatic phase alignment (in dashed lines). (b) System-level diagram of phase alignment system.

Figure 2.17: Phase alignment concept [17, 3].

A mixed-signal Cartesian feedback with automatic phase alignment, presented in [6], corrects the phase misalignment in the digital domain. In this work the intermediate frequency (IF) has been eliminated, reducing intermediate stages and hardware costs while linearity is improved. In contrast to [17], the treatment of the signals is digital and the phase shift correction is applied to the feedback signals before making the comparison with the input signals, as Figure 2.18(a) shows.

Figure 2.18(b) shows the architecture of the digital part. Three stages can be identified. In the phase error computation stage, look-up tables (LUTs) map the arctangent of the quotient of the input signals and of the quotient of the feedback signals; the difference gives the phase error. The complex multiplier stage performs the matrix rotation; two LUTs are required to map the cosine and sine of the phase error. The last stage, the subtractor, compares the input and feedback signals.
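The three stages can be sketched in floating point as follows. This is an illustration of the data flow only, not the architecture of [6]: the look-up tables are replaced by `math.atan2`, `math.cos` and `math.sin`, the sign convention of the phase error (feedback angle minus input angle) is an assumption, and all function names are chosen here.

```python
import math

def phase_error(i, q, i_fb, q_fb):
    """Stage 1: angle of the feedback vector minus angle of the input vector."""
    return math.atan2(q_fb, i_fb) - math.atan2(q, i)

def rotate(i_fb, q_fb, angle):
    """Stage 2: rotate the feedback vector by -angle (complex multiplication)."""
    c, s = math.cos(angle), math.sin(angle)
    return i_fb * c + q_fb * s, q_fb * c - i_fb * s

def error_signals(i, q, i_fb, q_fb):
    """Stage 3: subtract the phase-aligned feedback from the input."""
    angle = phase_error(i, q, i_fb, q_fb)
    i_al, q_al = rotate(i_fb, q_fb, angle)
    return i - i_al, q - q_al

# Hypothetical test vector: feedback is the input scaled by 0.9 and rotated by 35 degrees.
i_in, q_in = 1.0, 0.5
shift = math.radians(35.0)
i_fb = 0.9 * (i_in * math.cos(shift) - q_in * math.sin(shift))
q_fb = 0.9 * (q_in * math.cos(shift) + i_in * math.sin(shift))

e_i, e_q = error_signals(i_in, q_in, i_fb, q_fb)
print(f"error after alignment: I = {e_i:.4f}, Q = {e_q:.4f}")
```

After the rotation only the amplitude difference remains in the error signals, so the loop is no longer driven by the phase shift of the feedback path.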



Figure 2.18: Mixed-signal Cartesian feedback with automatic phase alignment [6]. (a) Mixed-signal Cartesian feedback architecture. (b) Cartesian feedback digital part architecture.

Results of this work show a linearization performance of 27 [dB] Adjacent Channel Power Ratio (ACPR) improvement at 5 [MHz] offset and 15 [dB] ACPR improvement at 10 [MHz].

Most, if not all, of the techniques have been developed to linearize PAs of Class A, AB and C, but none of them gives an indication as to how well the Cartesian feedback could respond to a switched mode PA such as the Class E power amplifier. One work was found in which a Class E power amplifier with envelope elimination restoration was used for a transmitter, but the linearization technique used in that work was a polar feedback [5].

The present work will consist of a digital implementation of the automatic phase shift detection and correction for a Cartesian feedback system in baseband in which the target PA is a parallel tuned Class E with envelope elimination restoration.

#### 2.5 Summary

In this chapter the motivation for this work has been presented. It was shown that amplification is required in order to transmit a RF signal. The amplification requires a power budget that usually is limited, therefore it has to be used efficiently. The most efficient way is to withdraw all the power into the antenna, this means the PA does not have to dissipate power, making it 100% efficient. It was shown that an efficient PA is also highly non-linear. Therefore a highly efficient PA will degrade the RF signal being transmitted. The PA selected for this work is the parallel tuned Class E that shows efficiencies around 95%. The Class E power amplifier has one disadvantage: it eliminates the envelope of the RF signal. This makes it a very good choice to transmit modulation schemes in which only the phase places a role in the RF information signal. However, the goal is to design a transmitter that supports modulation schemes in which phase and amplitude place a role in the RF information signal. To make this possible a mechanism to recover the envelope has to be attached to the Class E power amplifier. The EER technique is presented as a solution. The supply voltage of the Class E power amplifier is modulated in such a way that the original RF signal is amplified.

So far a highly efficient PA with phase and amplitude modulation has been selected, but linearity is still a problem. To linearize a PA, the sources of distortion have to be understood. A review of the most important sources of distortion was carried out. It was found that the on-resistance as well as the gate-drain capacitance create AM-AM and AM-PM distortions, and that these are more noticeable at low levels of DC supply voltage. When the RF signal is modulated in phase and in amplitude these distortions will be present, degrading the transmitted information.

To achieve linearity, Cartesian feedback has been selected as the mechanism to linearize the Class E power amplifier. Different types of Cartesian feedback linearization techniques were presented and a selection for this work was made.

In the present work a digital Cartesian feedback with automatic phase alignment will be implemented to linearize a Class E power amplifier. An important condition for a feedback system is stability. When designing such a system, the goal is to have as much loop gain as possible in order to reduce distortion. Unfortunately, achieving this goal is not free of problems: the stability condition for the system can limit the amount of loop gain that can be set.

There are three factors that determine the stability of the system: loop gain, bandwidth and system delay. The value of the first two factors is preferably as large as possible. However, the third factor will set a limitation on them. A trade-off has to be found. The bandwidth is known from the beginning as it is part of the specifications for which the system is designed. It has to be carefully selected though in order to achieve a feasible design. Values for the loop gain and system delay must be found in order to fulfill requirements such as distortion reduction, phase margin and gain margin.

In the following sections a review of negative feedback is the starting point for understanding its influence on the linearization of the power amplifier. The distortion is modeled to understand how Cartesian feedback helps to reduce it. With an understanding of the conditions that influence the behavior of the Cartesian feedback system, a mixed-signal model is described and the conditions for stability and linearization are formulated.

## 3.1 Negative feedback

### 3.1.1 The asymptotic-gain model

Figure 3.1 shows a model for describing electronic feedback systems. The superposition model is presented in [19] and repeated here for convenience.

Superposition of signals is valid in linear systems, therefore, as long as the system is linear, the superposition holds and (3.1), (3.2) and (3.3) are valid.



Figure 3.1: Superposition model[19]

$$y(s) = r(s)G_{t0}(s) + v(s)\nu(s)$$
(3.1)

$$e(s) = r(s)\xi(s) + v(s)\beta(s)$$
(3.2)

$$v(s) = e(s)G(s) \tag{3.3}$$

Obtaining the transfer function y(s)/r(s) is straightforward:

$$\frac{y(s)}{r(s)} = G_{t0}(s) + \nu(s)\xi(s)\frac{G(s)}{1 - G(s)\beta(s)}$$
(3.4)

Where  $G_{t0}(s)$  is the direct transfer factor caused by parasitic coupling between the input and the output.  $\xi(s)$  is the non-ideal coupling between the input source and the controlled system and  $\nu(s)$  is the non-ideal coupling between the controlled system and the load. The value of these factors should be equal to one in a properly designed circuit.  $\beta(s)$  is the feedback transfer function and G(s) is the forward transfer function.

The transfer function of each block is described as a function of "s" to show its frequency dependence. If the coupling with the source and the load is well designed, then  $\nu$  and  $\xi$  become frequency independent with unity gain ( $\nu = \xi = 1$ ).

The product  $G(s)\beta(s)$  is called the open loop transfer function or "loop gain" and is represented by L(s). Re-writing (3.4) leads to:

$$\frac{y(s)}{r(s)} = G_{t0}(s) + \frac{1}{\beta(s)} \frac{L(s)}{1 - L(s)}$$
(3.5)

Making the loop gain large, i.e.  $L(s) \to \infty$ 

$$\lim_{L(s) \to \infty} \frac{y(s)}{r(s)} = G_{t0}(s) - \frac{1}{\beta(s)} = G_{t\infty}$$
(3.6)

and replacing it in (3.5),

$$\frac{y(s)}{r(s)} = \frac{G_{t0}(s)}{1 - L(s)} - G_{t\infty} \frac{L(s)}{1 - L(s)}$$
(3.7)

the asymptotic-gain model is obtained. If the loop gain is considerably larger than one  $(L(s) \gg 1)$ , the first term can be neglected. Also, in the bandwidth of interest, assuming that the parasitic coupling,  $G_{t0}(s)$ , is not present or is small compared to L(s) and  $1/\beta(s)$ , the transfer function of the system reduces to:

$$\frac{y(s)}{r(s)} = \frac{1}{\beta(s)} \frac{L(s)}{1 - L(s)}$$
(3.8)

The gain of the transfer function is dominated by the reciprocal of the feedback transfer function times a correction factor dependent on the open loop gain. For a large value of L(s), in the frequency range of interest, the closed loop transfer function will be determined by  $1/\beta(s)$ .
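As a quick numerical illustration of (3.8), the following Python sketch evaluates the closed-loop gain for increasing forward gain. The feedback factor `beta = -0.1` and the values of `G` are hypothetical, chosen only to make the trend visible. With the sign convention of (3.4) a negative `beta` yields a negative loop gain, and the result approaches $-1/\beta$, as in (3.6) with $G_{t0} = 0$.

```python
def closed_loop_gain(G: float, beta: float) -> float:
    """Closed-loop gain of (3.8): (1/beta) * L / (1 - L), with L = G * beta."""
    L = G * beta
    return (1.0 / beta) * L / (1.0 - L)


# Hypothetical feedback factor (assumption, not a design value).
beta = -0.1

for G in (10.0, 100.0, 1000.0, 10000.0):
    L = G * beta
    print(f"G = {G:8.0f}   L = {L:9.1f}   y/r = {closed_loop_gain(G, beta):.4f}")
```

The printed gain moves towards 10 (the magnitude of $1/\beta$) as the loop gain grows, while the forward gain itself changes by three orders of magnitude.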



Figure 3.2: Feedback system diagram with distortion sources

### 3.1.2 Distortion in a feedback system

In the previous section the concept of negative feedback was shown. This model is distortion free and does not make clear the effect of negative feedback on distortion. The following model uses the asymptotic-gain model plus two sources of distortion, one in the forward path and the other in the feedback path. These sources model the distortion generated by non-linearities of the components in the forward and feedback paths, respectively.

Equation (3.9) expresses the output of the system shown in Figure 3.2, modeling the frequency response of a feedback system when distortion is present.

$$y(s) = \frac{1}{\beta(s)} \frac{L(s)}{1 - L(s)} r(s) + \frac{D_{fw}(s)}{1 - L(s)} + \frac{L(s)}{1 - L(s)} D_{fb}(s)$$
(3.9)

Where y(s) represents the transmitter output, r(s) represents the input signal,  $L(s) = G(s)\beta(s)$  is the loop gain. G(s) represents the forward transfer function in the Cartesian feedback: the dynamics of the baseband filters, up-converters and power amplifier.  $\beta(s)$  represents the feedback transfer function: the dynamics of baseband filters, attenuation and down-converter.

 $D_{fw}(s)$  models the distortion introduced by the components in the forward path and  $D_{fb}(s)$  models the distortion introduced by the components in the feedback path.

The first term in (3.9) is the closed loop transfer function where, for a large loop gain (L(s) >> 1) the gain is dominated by the reciprocal of  $\beta(s)$ . The second term represents the distortion introduced in the forward path. As can be seen the distortion is reduced by the loop gain. The last term represents the distortion introduced in the feedback path. For a large loop gain the term  $\frac{L(s)}{1-L(s)}$  is close to -1 making no contribution to the reduction of the distortion in the feedback path.

It is possible to see that when applying feedback with a large loop gain the distortion in the forward path is reduced. This is an advantage because the biggest source of distortion, the PA, is in the forward path. At the same time, distortion in the feedback path is not affected by the loop gain; therefore, the only way to control the distortion in the feedback path is to use highly linear components in it.

Given the conclusions above it makes sense to take the following into consideration at the design level:

- 1. The inclusion of any non-linear component, required in the system, must be in the forward path.
- 2. In the feedback path all components should be as linear as possible.
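The effect of the loop gain on the three terms of (3.9) can be checked numerically. The sketch below is only an illustration: the function name `output_terms` and all numerical values (feedback factor, distortion amplitudes, loop gains) are assumptions, not values from the design.

```python
def output_terms(L: float, beta: float, r: float, d_fw: float, d_fb: float):
    """Return the three terms of (3.9): (signal, forward distortion, feedback distortion)."""
    one_minus_L = 1.0 - L
    signal = (1.0 / beta) * L / one_minus_L * r
    forward = d_fw / one_minus_L          # reduced by the loop gain
    feedback = L / one_minus_L * d_fb     # tends to -d_fb for large |L|
    return signal, forward, feedback


# Hypothetical values: equal distortion injected in both paths.
beta, r, d_fw, d_fb = -0.1, 1.0, 0.1, 0.1

for L in (-1.0, -10.0, -100.0):
    s, fw, fb = output_terms(L, beta, r, d_fw, d_fb)
    print(f"L = {L:7.1f}  signal = {s:7.4f}  forward = {fw:8.5f}  feedback = {fb:8.5f}")
```

With equal distortion injected in both paths, the forward-path term shrinks as the magnitude of the loop gain grows, whereas the feedback-path term stays close to the injected value, which is the reason for the two design rules listed above.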

## 3.2 Mixed-signal Cartesian feedback modeling

In the present section a model for the mixed-signal Cartesian feedback is defined. It has been called mixed-signal because all the filtering, the quadrature modulation and demodulation, and the amplification are in the analog domain, while the correction of the signal is done digitally.

### 3.2.1 System specifications

The Cartesian feedback has to be implemented for a Class E power amplifier that will transmit digital data that have been generated in the micro satellite. Table 3.1 shows the specification of the transmitter.

| Description       | Value                            | Units              |
|-------------------|----------------------------------|--------------------|
| Data rate         | 1.2 - 9.6                        | [kbps]<sup>1</sup> |
| Carrier frequency | 145.9<sup>2</sup>                | [MHz]              |
| Power output      | 1.7<sup>3</sup>                  | [W]                |
| Supply voltage    | 3.3<sup>4</sup> - 12<sup>5</sup> | [V]                |
| Modulation scheme | Constant envelope                |                    |
| IMD suppression   | -40                              | [dBc]              |
| Data bit length   | 8<sup>6</sup>                    | [bits]             |
| Antenna load      | 50                               | $[\Omega]$         |

Table 3.1: Satellite transmitter specifications.

### 3.2.2 Mixed-signal model

Figure 3.3 shows a modification of the typical Cartesian feedback system of Figure 2.10(a). The mixed-signal Cartesian feedback includes the digital to analog (DAC), the analog to digital (ADC) converter and the digital circuitry for signal processing (in grey color). The low pass filter in the forward path (anti-imaging filter) is used to smooth the sampled and held signal delivered by the DAC. The low pass filter in the feedback path (anti-aliasing filter) is used to eliminate second order harmonics of the carrier frequency that arise as a result of the down-conversion.

<sup>1</sup>Kilo bits per second. The bandwidth of the information signal depends on the modulation scheme and the type of pulse shaping used.

<sup>2</sup>Within the VHF spectrum, 144 to 146 [MHz].

<sup>3</sup>Total power, Tx/Rx. Of that 1[W] budget for the PA.

<sup>4</sup>DC supply for the digital circuitry.

<sup>5</sup>DC supply for the PA.

<sup>6</sup>It is the desired word length, but it will depend on the resolution.



Figure 3.3: Mixed-signal model of the Cartesian feedback system.

At this level of the design there is not yet a complete picture of what the digital circuit for signal processing will look like. What is known is that the digital processing of the signals will produce some latency.

The digitized feedback signals have to be compared with the inputs and some corrections have to be made before converting them back to analog signals. All this digital signal processing will take a certain amount of time that can be modeled as a delay.

### 3.2.3 Filters in the loop

#### 3.2.3.1 The anti-imaging filter

The output signal of the DAC in the forward path looks like the one shown in Figure 3.4(a). This signal is the result of sampling the transcoded data and holding the value for one sampling period. In order to reconstruct the signal as shown in Figure 3.4(b) the higher frequency components must be removed.

Figure 3.5 shows the frequency spectrum of the sampled signal at baseband. A sampled signal is periodic in the frequency spectrum with periods equal to the sampling frequency,  $f_s$ , (light grey in the figure). The sample & hold process reduces the magnitude of the images but does not eliminate them completely [20]. In order to eliminate the images a low pass filter is required.

Figure 3.4: Reconstruction of the digital signal. (a) Sampled & held signal. (b) Reconstructed signal (black line).

Figure 3.5: Frequency spectrum of the baseband signal and its images after sampling and hold.

The purpose of the low pass filter is to preserve the spectrum of the baseband signal. The filter's magnitude,  $A_{pass}$ , has to be flat along the signal bandwidth or pass band,  $f_B$ , and then roll off with a certain slope. At the frequency where the first image starts,  $f_s - f_B$ , called the stop band, the magnitude  $A_{stop}$  should be at its minimum.

$$\text{Filter order} = \frac{A_{TB|dB}}{20\log_{10}\left(\frac{f_s - 2f_B}{f_B}\right)} \tag{3.10}$$

Equation (3.10), taken from [20], is used to determine the order of the low pass filter. The filter's order depends on: the amount of attenuation required,  $A_{TB|dB}$ , the sampling frequency,  $f_s$  and the baseband bandwidth,  $f_B$ .  $A_{TB|dB}$  is the attenuation in decibels: the difference between  $A_{pass}$  and  $A_{stop}$ .

When the attenuation is related to the SNR of the converter it is possible to obtain a relationship between the numbers of bits and the order of the filter. See also [20].

#### 3.2.3.2 Bit resolution and the signal to noise ratio

Ideally each bit of resolution added improves the SNR by 6.02 [dB]<sup>7</sup>. Table 3.2 shows the SNR for different word lengths.

Equation (3.10) requires the sampling frequency to find the order of the filter. The minimum sampling frequency is two times the bandwidth of the information signal, which is called the Nyquist frequency.

<sup>7</sup>6.02 [dB] when a sinusoidal signal is used as reference; 6 [dB] when the reference is a triangular signal.

| Bit length | SNR [dB] |
|------------|----------|
| 8          | 48.16    |
| 10         | 60.20    |
| 12         | 72.24    |
| 14         | 84.28    |
| 16         | 96.32    |

Table 3.2: Signal to noise ratio for several word lengths.

From the system specifications, the input data is provided in eight-bit words. Assuming a conversion of 8 bits, the SNR in the converters will be 48.16[dB]. If the analog signal's SNR is equivalent to the converter's SNR it is possible to assume that  $A_{TB|dB}$  is equivalent to 48.16 [dB]. This assumption is valid because the quantization error has to be, at least, equal to the noise; otherwise the LSB of the converted signal will be digitized noise instead of relevant information.

Table 3.3: Order of the forward path low pass filter for several sampling frequencies $f_s$ (column headers).<sup>8</sup>

| Bits                | SNR [dB] | 192 [kHz] | 384 [kHz] | 768 [kHz] | 1152 [kHz] | 1536 [kHz] | 1920 [kHz] | 2496 [kHz] |
|---------------------|----------|-----------|-----------|-----------|------------|------------|------------|------------|
| 8                   | 48.16    | 1.92(2)   | 1.52(2)   | 1.27(2)   | 1.16(2)    | 1.10(2)    | 1.05(2)    | 1.00(1)    |
| 10                  | 60.20    | 2.40(3)   | 1.91(2)   | 1.59(2)   | 1.45(2)    | 1.37(2)    | 1.31(2)    | 1.25(2)    |
| 12                  | 72.24    | 2.88(3)   | 2.29(3)   | 1.91(2)   | 1.74(2)    | 1.64(2)    | 1.57(2)    | 1.50(2)    |
| 14                  | 84.28    | 3.36(4)   | 2.67(3)   | 2.23(3)   | 2.03(3)    | 1.92(2)    | 1.83(2)    | 1.75(2)    |
| 16                  | 96.32    | 3.84(4)   | 3.05(4)   | 2.55(3)   | 2.32(3)    | 2.19(3)    | 2.10(3)    | 2.00(2)    |
| Oversampling factor |          | 10        | 20        | 40        | 60         | 80         | 100        | 130        |

Table 3.3 shows the required order of the filter for different oversampling factors (when the sampling frequency is higher than the Nyquist frequency the signal is oversampled). The signal bandwidth,  $f_B$ , is 9.6 [kHz]. It is important to note that (3.10) computes a fractional number. Therefore, to obtain the order of the filter the result has to be rounded up to the next integer (shown in parentheses).
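The entries of Table 3.3 can be reproduced with a few lines of Python. This is a minimal sketch of (3.10) under the ideal 6.02 [dB] per bit assumption; the function names `snr_db` and `anti_imaging_order` are illustrative and do not come from the original design files.

```python
import math


def snr_db(bits: int) -> float:
    """Ideal converter SNR, assuming 6.02 dB per bit of resolution."""
    return 6.02 * bits


def anti_imaging_order(att_db: float, fs: float, fb: float) -> float:
    """Fractional filter order from (3.10) for attenuation att_db [dB]."""
    return att_db / (20.0 * math.log10((fs - 2.0 * fb) / fb))


F_B = 9.6e3  # signal bandwidth [Hz]

for bits in (8, 10, 12, 14, 16):
    for fs in (192e3, 384e3, 768e3):
        order = anti_imaging_order(snr_db(bits), fs, F_B)
        # The implemented order is the fractional result rounded up.
        print(f"{bits:2d} bits, fs = {fs / 1e3:6.1f} kHz: "
              f"{order:.2f} -> order {math.ceil(order)}")
```

For 8 bits and a sampling frequency of 192 [kHz] the sketch evaluates to a fractional order just above 1.9, which rounds up to a second order filter.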

#### 3.2.3.3 The anti-aliasing filter

In the feedback path, anti-aliasing is ensured by a low pass filter. This eliminates high frequency components before converting the baseband information signal into a digital signal. Figure 3.6 shows that, as a result of down-conversion, a spurious image at two times the carrier frequency is present. To eliminate this unwanted frequency spectrum the same mechanism used for anti-imaging is applied here.

Figure 3.7 shows the frequency spectrum of the down-converted signal. Because the image is at a much higher frequency (second harmonic of the carrier frequency), the cutoff frequency of the low pass filter can be much higher than the signal bandwidth  $(f_{co1}$  in the figure). Selecting a higher cutoff frequency  $(f_{co2} \text{ or } f_{co3})$  will increase the slope of the roll-off as well as the order of the filter. The advantage of this selection is that the poles introduced by this low pass filter are not dominant.

<sup>8</sup>The values shown in the table correspond to the result obtained using (3.10). The order of the filter will be the nearest higher integer value.

Figure 3.6: Diagram of the information signal up and down conversion. The down-converted signal includes a high frequency component at  $2\omega_c$

The order of the low pass filter is obtained as follows:

$$Filter \ order = \frac{A_{TB|dB}}{20log_{10}\left(\frac{2f_c - (f_B + f_{co})}{f_{co}}\right)}$$
(3.11)

where  $A_{TB|dB} = A_{pass} - A_{stop}$  is the attenuation required,  $f_c$  is the carrier frequency,  $f_B$  is the information signal bandwidth and  $f_{co}$  is the filter cutoff frequency.

Table 3.4 shows the order of the low pass filter (shown in parentheses) for an information signal bandwidth of 9.6 [kHz]. For a cutoff frequency that is two orders of magnitude higher than the signal bandwidth, a second order filter is a good choice.

The sampling frequency of the ADC can be made equal to the one found for the DAC. With an oversampling of 10 and a baseband signal bandwidth of 9.6 [kHz] it is ensured that no overlapping of spectrum images will occur in the feedback path.



Figure 3.7: Frequency spectrum of the feedback signal after down-conversion.

 $<sup>^{9}</sup>$ The values shown in the table correspond to the result obtained using (3.11). The order of the filter will be the nearest higher integer value

| N⁰   | SNR   |          |          | $f_{co}$ [ | kHz]     |          |          |
|------|-------|----------|----------|------------|----------|----------|----------|
| Bits | [dB]  | 960      | 1152     | 1344       | 1536     | 1728     | 1920     |
| 8    | 48.16 | 0.971(1) | 1.003(1) | 1.032(1)   | 1.058(1) | 1.083(1) | 1.106(2) |
| 10   | 60.2  | 1.214(2) | 1.254(2) | 1.290(2)   | 1.323(2) | 1.354(2) | 1.382(2) |
| 12   | 72.24 | 1.456(2) | 1.505(2) | 1.548(2)   | 1.588(2) | 1.624(2) | 1.659(2) |
| 14   | 84.28 | 1.699(2) | 1.755(2) | 1.806(2)   | 1.852(2) | 1.895(2) | 1.935(2) |
| 16   | 96.32 | 1.942(2) | 2.006(2) | 2.064(2)   | 2.117(3) | 2.166(3) | 2.212(3) |

Table 3.4: Order of the feedback path low pass filter for several cutoff frequencies.<sup>9</sup>
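The fractional orders listed in Table 3.4 follow directly from (3.11). The Python sketch below evaluates that expression for the carrier frequency and signal bandwidth of Table 3.1; the function name `anti_aliasing_order` is illustrative only.

```python
import math


def anti_aliasing_order(att_db: float, fc: float, fb: float, fco: float) -> float:
    """Fractional filter order from (3.11) for the feedback path low pass filter."""
    return att_db / (20.0 * math.log10((2.0 * fc - (fb + fco)) / fco))


F_C = 145.9e6   # carrier frequency [Hz]
F_B = 9.6e3     # information signal bandwidth [Hz]
ATT_8BIT = 48.16  # attenuation equal to the 8-bit SNR [dB]

for fco in (960e3, 1152e3, 1344e3, 1536e3, 1728e3, 1920e3):
    order = anti_aliasing_order(ATT_8BIT, F_C, F_B, fco)
    print(f"fco = {fco / 1e3:6.0f} kHz: fractional order = {order:.3f}")
```

Because the image sits near twice the carrier frequency, the fractional order changes only slowly with the cutoff frequency, which is why a low order filter suffices over the whole range considered.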

### 3.2.4 The mixed-signal model in the frequency domain



Figure 3.8: Single path diagram of the mixed-signal Cartesian feedback.

In the previous sections the orders of the anti-imaging and anti-aliasing filters were established and the sampling frequency was determined. In the present section a model for the mixed-signal Cartesian feedback will be developed. The new model includes the filter in the forward path. The filter in the feedback path does not contain dominant poles, therefore it is left out of the model.

Assuming that there is no cross-coupling between I and Q, a single path can be analyzed as is shown in Figure 3.8 for the In-phase signal. The same block diagram can be used to analyze the Quadrature signal as well.

Figure 3.8 resembles Figure A.4, but there are two differences: there is a delay included in the feedback path and the forward path includes an additional pole. Figure 3.9 shows an equivalent diagram of the system with its frequency domain characteristic.



Figure 3.9: Equivalent diagram of the mixed-signal model in the frequency domain.

• Open-loop transfer function:

$$L(s) = \frac{A\beta}{\left(\frac{s}{p_1} + 1\right)\left(\frac{s}{p_2} + 1\right)} e^{-(T_1 + T_2)s}$$
(3.12)

• Closed-loop transfer function:

$$H(s) = \frac{y(s)}{r(s)} = \frac{Ap_1p_2e^{-T_1s}}{s^2 + (p_1 + p_2)s + p_1p_2 + A\beta p_1p_2e^{-(T_1 + T_2)s}}$$
(3.13)

• Magnitude (or Gain):

$$Gain(s) = |L(s)| = \frac{A\beta p_1 p_2 e^{-(T_1 + T_2)\sigma}}{\sqrt{(\sigma + p_1)^2 + \omega^2}\sqrt{(\sigma + p_2)^2 + \omega^2}}$$
(3.14)

• Phase:

$$Phase(s) = \angle L(s) = -\left[tan^{-1}\left(\frac{\omega}{\sigma+p_1}\right) + tan^{-1}\left(\frac{\omega}{\sigma+p_2}\right) + \omega\left(T_1 + T_2\right)\right]$$
(3.15)

where  $s = \sigma + j\omega$ . Equation (3.15) was obtained by using trigonometric identities [21].

An important condition has to be ensured: the loop gain, L(s), has to provide enough gain over the whole signal bandwidth. Using the -3 [dB] criterion, the pair of poles should be located at a frequency of at least 1.5569 times the signal bandwidth  $(f_{B-eff} = 1.5569 \cdot f_B)$ . Making  $p_1 = p_2 = 2\pi \cdot f_{B-eff}$  the loop gain condition is ensured. Then, the gain and phase margin are given by:

• Gain margin:

$$GM = \frac{(2\pi f_{B-eff})^2 + \omega_\pi^2}{A\beta (2\pi f_{B-eff})^2} \tag{3.16}$$

• Phase margin:

$$PM = -\left[2 \cdot tan^{-1} \left(\frac{\omega_{UG}}{2\pi f_{B-eff}}\right) + \omega_{UG} \left(T_1 + T_2\right)\right] + \pi \qquad (3.17)$$

where  $\sigma = 0$ ,  $\omega_{\pi}$  is the frequency at which the phase is equal to  $\pi$  and  $\omega_{UG}$  is the frequency at which the gain is unity.

Setting the gain equal to unity and solving for  $\omega_{UG}$ :

$$\omega_{UG} = 2\pi f_{B-eff} \sqrt{A\beta - 1} \tag{3.18}$$

Substituting (3.18) into (3.17):

$$PM = -\left[2 \cdot tan^{-1}\left(\sqrt{A\beta - 1}\right) + 2\pi f_{B-eff}\sqrt{A\beta - 1}\left(T_1 + T_2\right)\right] + \pi$$
(3.19)

Remember that in Section A.1.2, the assumption of  $A\beta \gg 1$  was made, with  $A\beta$  large enough to approximate the term  $tan^{-1}(A\beta) \approx \pi/2$ . If this assumption is applied to (3.19) then

$$\begin{aligned}
PM &= -2 \cdot \frac{\pi}{2} - 2\pi f_{B-eff} \sqrt{A\beta - 1}\,(T_1 + T_2) + \pi \\
&\approx -2\pi f_{B-eff} \sqrt{A\beta}\,(T_1 + T_2)
\end{aligned} \tag{3.20}$$

The phase margin becomes negative, giving the indication that the system is unstable. From this analysis the following conclusions are reached:

- 1. Two poles plus a delay contribute negative phase.
- 2. For a high value of  $\sqrt{A\beta - 1}$  this contribution reaches its maximum.
- 3. The system is always unstable.

If a pole contributes with negative phase, a zero does the opposite. Therefore, including a zero in the open loop transfer function will add more phase, compensating the effect of the poles in the phase margin. Moreover, limiting the product  $A\beta$  will also reduce the negative phase contribution to the phase margin.
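The dependence of the uncompensated phase margin on the loop gain, which motivates limiting  $A\beta$ , can be illustrated by evaluating (3.18) and (3.19) directly, without the large-gain approximation. The Python sketch below is an illustration under stated assumptions: the function names are hypothetical, and the total delay of 400 [ns] is used only as an example value for $T_1 + T_2$. The helper `pole_factor_for_3db` also shows where the factor 1.5569 comes from for two coincident real poles and an exact 3 [dB] drop at $f_B$.

```python
import math


def pole_factor_for_3db(order: int = 2, drop_db: float = 3.0) -> float:
    """Ratio pole / f_B such that `order` coincident real poles lose drop_db at f_B."""
    # (1 + (f_B / p)^2)^(order / 2) = 10^(drop_db / 20)
    return 1.0 / math.sqrt(10.0 ** (drop_db / (10.0 * order)) - 1.0)


def phase_margin_deg(a_beta: float, f_b_eff: float, delay: float) -> float:
    """Phase margin of the two-pole loop with delay, from (3.19), in degrees."""
    root = math.sqrt(a_beta - 1.0)          # omega_UG / (2 pi f_B_eff), see (3.18)
    pm = math.pi - (2.0 * math.atan(root)
                    + 2.0 * math.pi * f_b_eff * root * delay)
    return math.degrees(pm)


F_B = 9.6e3                                 # signal bandwidth [Hz]
F_B_EFF = pole_factor_for_3db() * F_B       # effective bandwidth [Hz]
DELAY = 400e-9                              # assumed total delay T1 + T2 [s]

print(f"pole factor = {pole_factor_for_3db():.4f}")
for a_beta in (2.0, 10.0, 100.0, 1000.0):
    print(f"A*beta = {a_beta:7.1f}: PM = {phase_margin_deg(a_beta, F_B_EFF, DELAY):7.2f} deg")
```

Under these assumptions the margin shrinks steadily as  $A\beta$  increases and becomes negative for large loop gains, in line with the trend predicted by (3.20).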

#### 3.2.4.1 Compensation with one zero

• Open-loop transfer function:

$$L(s) = \frac{A\beta\left(\frac{s}{z}+1\right)}{\left(\frac{s}{p}+1\right)^2} e^{-(T_1+T_2)s}$$
(3.21)

• Magnitude (or Gain):

$$Gain(s) = |L(s)| = \frac{A\beta p^2 e^{-(T_1 + T_2)\sigma} \sqrt{(\sigma + z)^2 + \omega^2}}{z \left((\sigma + p)^2 + \omega^2\right)}$$
(3.22)

• Phase:

$$Phase(s) = \angle L(s) = \tan^{-1}\left(\frac{\omega}{\sigma+z}\right) - 2 \cdot \tan^{-1}\left(\frac{\omega}{\sigma+p}\right) - \omega \left(T_1 + T_2\right) \quad (3.23)$$

where  $s = \sigma + j\omega$ .

To check for stability the same procedure as applied in Section A.1.2 should be used. However, the calculations become complicated and tedious with the addition of a zero in the loop gain. In the present work, a simpler alternative is followed. Two main conditions have to be fulfilled to ensure stability. The first condition is to have enough phase margin to allow a fast system response without excessive overshoot. The second condition is to ensure that the system poles (the poles of the closed loop transfer function) are located in the left hand side of the S-plane, ideally in a Butterworth position. This means that each complex pole should form an angle of  $45^{\circ}$  with respect to the real axis. Because zeros attract the branches of the root locus, the insertion of a zero will cause the poles to move towards the zero, as shown in Figure 3.10(c).

Figure 3.10: Root locus for a system with two poles: (a) Root locus two poles without delay. (b) Root locus two poles with delay.  $p_1 = p_2 = p$  are the open loop poles,  $p_{s1}$  and  $p_{s2}$  are closed loop or system poles. (c) Root locus with the addition of one zero in the loop gain. In dashed line the old root locus.

The system delay plays an important role because it pushes the system poles towards the right hand side of the S-plane. Figure 3.10(a) shows the open loop poles and the system poles without delay and Figure 3.10(b) shows the consequences of the delay.

To find an optimum location of the zero the following procedure was carried out: first, the loop gain is set to a value (as high as possible). Second, the system delay is also set to a high value. Third, the zero is located in the real axis (in the left hand side of the S-plane) in such a way that the phase margin shows a value higher than 60°. Fourth, if the phase margin is lower than 60° then the loop gain or the system delay is adjusted (reduced) and the exercise is repeated from the third step.

After several iterations the loop gain was reduced to 10 and the system delay to 400[ns]. The zero was located at a frequency three times higher than  $f_{B-eff}$  to reach a phase margin higher than  $60^{\circ}$ .
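The phase margin check used in this iterative procedure can be sketched numerically from (3.22) and (3.23) with  $\sigma = 0$ . The Python code below is a simplified illustration, not the simulation used in this work: it finds the unity-gain frequency by bisection and ignores the additional pole of a practical lead compensator. All function names are hypothetical.

```python
import math


def loop_gain_mag(w: float, a_beta: float, p: float, z: float) -> float:
    """|L(jw)| from (3.22) with sigma = 0."""
    return a_beta * math.sqrt(1.0 + (w / z) ** 2) / (1.0 + (w / p) ** 2)


def loop_phase(w: float, p: float, z: float, delay: float) -> float:
    """Phase of L(jw) from (3.23) with sigma = 0, in radians."""
    return math.atan(w / z) - 2.0 * math.atan(w / p) - w * delay


def unity_gain_frequency(a_beta: float, p: float, z: float) -> float:
    """Bisection for |L(jw)| = 1; assumes a_beta > 1 and a single crossing."""
    lo, hi = 0.0, p
    while loop_gain_mag(hi, a_beta, p, z) > 1.0:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if loop_gain_mag(mid, a_beta, p, z) > 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def compensated_phase_margin_deg(a_beta: float, p: float, z: float, delay: float) -> float:
    w_ug = unity_gain_frequency(a_beta, p, z)
    return math.degrees(math.pi + loop_phase(w_ug, p, z, delay))


F_B_EFF = 1.5569 * 9.6e3          # effective bandwidth [Hz]
P = 2.0 * math.pi * F_B_EFF       # double pole [rad/s]
Z = 3.0 * P                       # zero at three times f_B_eff
pm = compensated_phase_margin_deg(a_beta=10.0, p=P, z=Z, delay=400e-9)
print(f"phase margin = {pm:.1f} deg")
```

With a loop gain of 10, a delay of 400 [ns] and the zero at three times  $f_{B-eff}$ , this simplified model yields a phase margin above 60°. Lowering `Z` towards `P` or increasing `a_beta` or `delay` in the call shows how each parameter trades against the others.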

So far it is known that at least one zero has to be inserted to compensate the system.

The question that arises is where to locate this zero. Two alternatives are investigated: inserting the zero in the forward path and inserting the zero in the feedback path.

**Inserting a zero in the forward path** will affect the frequency response of the forward path. Initially the pair of real poles, p, contributed 40 [dB/dec] to the roll-off, and the cross-over frequency  $(f_s - f_B)$  was found at 48 [dB] of attenuation, as shown in Figure 3.5.

The insertion of a zero could be done by using an active circuit such as the one shown in Figure 3.11.



Figure 3.11: Realization of an active lead compensator filter.

This active lead compensator has a transfer function given by:

$$H(s) = \frac{\frac{s}{z_f} + 1}{\frac{s}{p_f} + 1}$$
(3.24)

in which

$$z_f = -\frac{1}{2\pi (R_1 + R_2)C_1} \tag{3.25}$$

$$p_f = -\frac{1}{2\pi R_1 C_1} \tag{3.26}$$

The inserted zero,  $z_f$ , will reduce the roll-off to 20 [dB/dec] until the pole,  $p_f$ . As a consequence the cross-over frequency has been moved to higher frequencies as shown in Figure 3.12. The sampling frequency has to be increased in order to ensure the filtering of the residual images.

**Inserting a zero in the feedback path** has the advantage of not affecting the sampling frequency selection. As the active lead filter is in the feedback path, the filter in the forward path will have a roll-off of 40[dB/dec] as shown in Figure 3.5. With a low oversampling, the power consumption of the converter will be lower. The zero insertion in the feedback is also advantageous for the DSP circuit, since the clock frequency will not have to be increased, reducing the power consumption in the converter as well.



Figure 3.12: Filter behavior with the addition of one zero in the forward path.

### 3.2.5 Phase shift detection and correction

In the previous section a mixed-signal Cartesian feedback model was found and the stability and linearity were determined. However, the phase shift caused by the system delay and the non-linearities of the PA has not been taken into account so far. A method to compensate the phase misalignment will be required.

In Chapter 2 different techniques applied to Cartesian feedback were introduced. These techniques can be divided into two groups: those that apply the phase correction directly into the local oscillator (LO) of the down-converter and those that apply the phase correction to the baseband information signal. The advantage of correcting the phase misalignment at baseband is the reduction in power consumption.

The present work follows the automatic phase alignment technique. This technique was first proposed in [17], in 2000. In that work a mechanism to continuously detect and correct the phase shift is designed. Although that design is in the analog domain, the analysis of the phase behavior is helpful in order to develop a digital implementation.



Figure 3.13: Phase alignment concept [17, 3].

Figure 3.13 shows the automatic phase alignment concept. A phase detection is performed by taking the In-phase and Quadrature signals of the input and feedback (I, Q and I', Q', respectively) and forming the difference of cross products  $QI' - IQ'$ :

$$\begin{aligned}
I &= \kappa_1 \cos (\omega t + \theta) \\
Q &= \kappa_1 \sin (\omega t + \theta) \\
I' &= \kappa_2 \cos (\omega t + \theta') \\
Q' &= \kappa_2 \sin (\omega t + \theta')
\end{aligned} \tag{3.27}$$

$$QI' - IQ' = \kappa_1 \kappa_2 \sin(\theta - \theta') \tag{3.28}$$

Equation (3.28) shows that the detector output is proportional to the sine of the phase difference,  $\Delta \theta = \theta - \theta'$ , scaled by the product  $\kappa_1 \kappa_2$  of the magnitudes of the input and feedback signals. Its unit is radians square volt,  $[rad \cdot V^2]$ , so the detector output scales with the square of the signal voltage. When the phase difference is small, as will be the case when the phase difference is corrected, (3.28) becomes

$$QI' - IQ' = \kappa_1 \kappa_2 \Delta \theta \tag{3.29}$$

Equation (3.29) gives the instantaneous phase difference (the phase error between the input signals and the feedback signals).
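The detector of (3.28) can be checked with a short Python sketch. The helper names `iq` and `phase_detector` and the numerical values of the magnitudes and phases are assumptions made for illustration; they are not taken from the VHDL design.

```python
import math


def iq(kappa: float, wt: float, theta: float):
    """Return the (I, Q) pair of (3.27) for magnitude kappa and phase theta."""
    return kappa * math.cos(wt + theta), kappa * math.sin(wt + theta)


def phase_detector(i: float, q: float, i_fb: float, q_fb: float) -> float:
    """Cross-product phase detector of (3.28): Q*I' - I*Q'."""
    return q * i_fb - i * q_fb


# Hypothetical magnitudes and phases [rad].
kappa1, kappa2 = 1.0, 0.5
theta, theta_fb = 0.30, 0.25

for wt in (0.0, 0.7, 2.0):
    i, q = iq(kappa1, wt, theta)
    i_fb, q_fb = iq(kappa2, wt, theta_fb)
    detected = phase_detector(i, q, i_fb, q_fb)
    small_angle = kappa1 * kappa2 * (theta - theta_fb)   # approximation (3.29)
    print(f"wt = {wt:3.1f}: detector = {detected:.6f}, small-angle = {small_angle:.6f}")
```

The detector output does not depend on the instantaneous carrier phase `wt`, only on the phase difference and on the product of the magnitudes, and for small differences it is close to the linear approximation of (3.29).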

A mechanism is required to keep track of the phase difference. The simplest method is to accumulate the instant phase difference in such a way that, in the steady state, the error signal is reduced. Feedback is implicitly involved. An error signal has to be controlled in such a way that it is reduced.

This closely resembles the Phase-Locked Loop (PLL), in which the output of a phase detector drives a Voltage Controlled Oscillator (VCO) in a direction that reduces the phase difference. Figure 3.14 shows a model for the phase loop. It is similar to a linearized PLL model in which a phase detector drives a VCO. In the linearized model the VCO is modeled as an integrator, since phase is the integral of frequency [22].



Figure 3.14: Phase regulation diagram in the frequency domain.

The block H(s) is included to take into account any gain or filtering that may be required in the loop. The system dynamics block, consisting of the dynamics of the phase correction circuit, the PA and the up and down converters, is modeled as a delay. The gain of the integrator,  $C_0$ , describes what change in output frequency results from a specified change in control voltage. The phase feedback is given by:

$$\theta_{fb}(s) = H(s)\frac{C_0 e^{-Ts}}{s}\theta_{error} + PhaseDistortion(s) + Drift(s)$$
(3.30)

The error transfer function is given by:

$$\theta_{error}(s) = \frac{s \cdot (\theta(s) + PhaseDistortion(s) + Drift(s))}{(s + C_0 e^{-Ts} H(s))}$$
(3.31)

Assume H(s) = 1 (a constant gain in the loop) and that the input signal is a sinusoid with frequency  $\omega_{in}$ . Its phase is then a ramp signal with a slope equal to  $\omega_{in}$ , whose Laplace representation is:

$$\theta(s) = \frac{\omega_{in}}{s^2} \tag{3.32}$$

If the same assumption is made for the phase distortion and the drift, then:

$$PhaseDistortion(s) = \frac{\omega_{PD}}{s^2}$$
(3.33)

and

$$Drift(s) = \frac{\omega_{Dr}}{s^2} \tag{3.34}$$

Equation (3.31) becomes

$$\theta_{error}(s) = \frac{\omega_{in} + \omega_{PD} + \omega_{Dr}}{\kappa_1 \kappa_2 s \left(s + C_0 e^{-Ts}\right)} \tag{3.35}$$

The steady-state phase error is found applying the final value theorem [23] as follows:

$$\begin{aligned}
\lim_{t \to \infty} \theta_{error}(t) &= \lim_{s \to 0} s \cdot \theta_{error}(s) \\
&= \frac{\omega_{in} + \omega_{PD} + \omega_{Dr}}{\kappa_1 \kappa_2 C_0}
\end{aligned} \tag{3.36}$$

Then, to reduce the phase error, the product  $\kappa_1 \kappa_2 C_0$  must be much higher than the sum of the input, phase distortion and drift signal frequencies:

$$C_0 \gg 2\pi \frac{f_{in} + f_{PD} + f_{Dr}}{\kappa_1 \kappa_2} \tag{3.37}$$
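To give a feeling for the sizes involved in (3.36) and (3.37), the following Python sketch evaluates the steady-state phase error for a few integrator gains. All numbers are hypothetical: unit signal magnitudes, an input frequency equal to the 9.6 [kHz] signal bandwidth, and no phase distortion or drift. The function name is also an assumption.

```python
import math


def steady_state_phase_error(f_in: float, f_pd: float, f_dr: float,
                             kappa1: float, kappa2: float, c0: float) -> float:
    """Steady-state phase error of (3.36), in radians (frequencies in Hz)."""
    return 2.0 * math.pi * (f_in + f_pd + f_dr) / (kappa1 * kappa2 * c0)


F_IN = 9.6e3                       # assumed input frequency [Hz]
omega_in = 2.0 * math.pi * F_IN    # same frequency in [rad/s]

# Integrator gain expressed as a multiple of omega_in (hypothetical values).
for ratio in (1.0, 10.0, 100.0):
    err = steady_state_phase_error(F_IN, 0.0, 0.0, 1.0, 1.0, ratio * omega_in)
    print(f"C0 = {ratio:5.0f} * omega_in: error = {err:.4f} rad "
          f"({math.degrees(err):.2f} deg)")
```

With unit magnitudes the residual error is simply the inverse of the ratio between  $C_0$  and the total angular frequency, so a ratio of 100 leaves an error of about 0.01 [rad].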

## 3.3 Mixed-signal model simulation

In this section simulations are carried out to validate the model found in previous sections. Firstly, the system stability and frequency behavior is evaluated. Secondly, a two-tone test is carried out to validate the linearity of the system. A third order polynomial is used as the PA transfer function. Finally, the phase shift detection is simulated to validate the correct phase difference calculation.

### 3.3.1 System stability validation

The system stability validation was performed for the worst case (compensation in the forward path) with an input signal bandwidth of 9.6 [kHz]. The phase margin is set to 60°. For those conditions the loop gain and system delay are maximized. Table 3.5 shows the results for the input conditions.

From the simulation results, it was possible to determine a relationship between the dominant poles and the zero. Also, as the compensation system is a lead network, the pole must be at a higher frequency than the zero. A reasonable frequency for $p_f$ is one order of magnitude higher than $z_f$.

$$z_f = 3 \cdot f_{B-eff} \tag{3.38}$$

$$p_f = 10 \cdot z_f \tag{3.39}$$

where the effective bandwidth, $f_{B-eff}$, derived in Section 3.2.4, is the bandwidth required so that the response of a second order low pass filter at the cutoff frequency, $f_B$, is 3 [dB] below the flat band. The constant in (3.38) was obtained from the procedure explained in Section 3.2.4.1, and the constant in (3.39) is the value assumed for the non-dominant pole in a practical implementation of a lead compensator.

The most important result from this analysis is the determination of the system delay. A system delay of 400[ns] was obtained for a loop gain of 10. For this system, the sampling frequency is close to 500[kHz] due to the correction that has to be applied when the compensation is in the forward path.

The time between samples is $2[\mu s]$, which is five times larger than the system delay. The digital system is therefore busy for at most one fifth of the sampling period.

Figure 3.15 shows the simulation results for the stability analysis. Figure 3.15 (a) shows the open loop frequency response. It is possible to see a phase margin of 60° and

| Input conditions  |              | Output conditions |              |
|-------------------|--------------|-------------------|--------------|
| Bandwidth         | Phase Margin | Loop Gain         | System Delay |
| 9.6[kHz]          | 60°          | 10                | 400[ns]      |
| **Initial poles** |              | **Compensation**  |              |
| $p_1$             | $p_2$        | $z_f$             | $p_f$        |
| 14.945[kHz]       | 14.945[kHz]  | 44.839[kHz]       | 448.39[kHz]  |

Table 3.5: Cartesian feedback system setup for stability.



Figure 3.15: Cartesian feedback stability plots: (a) Bode plot of the open loop. (b) Bode plot of the closed loop. (c) Bode plot of the attenuation. (d) Step response.

a gain margin of 15 [dB]. Figure 3.15(b) shows the closed loop frequency response for the two cases studied: compensation in the forward path (solid bold line) and compensation in the feedback path (solid line). The flat band has a magnitude close to 20 [dB]. The small overshoot in the frequency response (solid bold line) is produced by the compensation filter when it is located in the forward path. It is also reflected in the step response of the closed loop system, Figure 3.15(d), where a noticeable overshoot of 13.7% can be seen. The step response is faster than that of the system with the compensation located in the feedback path, at the expense of that large overshoot.

Figure 3.15(c) shows the frequency response of $\frac{1}{1+L(s)}$, the distortion attenuation component. For frequencies below 9.6 [kHz] the magnitude stays below -20 [dB]; in that band the distortion is attenuated by a factor of around 10.
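As a quick numerical check of the attenuation figure, the sketch below evaluates $\frac{1}{1+L}$ at low frequency for the loop gain of 10 from Table 3.5. It treats the loop gain as a real constant, which is an assumption that only holds well inside the flat band; the function name is illustrative.

```python
import math

def distortion_attenuation_db(loop_gain):
    """Magnitude of 1 / (1 + L) in dB for a real, positive loop gain L."""
    return 20.0 * math.log10(1.0 / (1.0 + loop_gain))

# Loop gain of 10, as in Table 3.5 (assumed constant over the band).
attenuation = 1.0 / (1.0 + 10.0)
attenuation_db = distortion_attenuation_db(10.0)
print(f"1/(1+L) = {attenuation:.4f} ({attenuation_db:.2f} dB)")
```

The linear factor is 1/11, or about 0.0909, which is slightly below -20 [dB].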

### 3.3.2 The phase shift effect on stability

As described in Chapter 2 a Class E PA produces AM-PM distortion, a change in the phase as a consequence of the voltage supply. The lower the voltage supply, the higher the phase shift.

Fortunately, when a constant envelope modulation such as GMSK or $\pi/4$-DQPSK is used, the changes in the phase will be small for a high voltage supply, as shown in Figure 2.8. Therefore, ensuring a phase margin for the highest phase shift with constant envelope will be sufficient to keep the system stable.

On the other hand, if a modulation scheme in which both the phase and the magnitude are modulated is used, AM-PM distortion will be present, threatening the system stability. The best way to cope with this problem is to use a modulation scheme in which the trajectories do not cross the origin of the I-Q plane and stay away from it as much as possible.

### 3.3.3 Checking the linearity - Two-tones test

To check the linearity of the system a mathematical model was used to model nonlinearities in the power amplifier. A third order polynomial was used with function:

$$PA(v) = K_1 \cdot v + K_2 \cdot v^2 + K_3 \cdot v^3 \tag{3.40}$$

where

$$K_1 = 100$$
$$K_2 = 5$$
$$K_3 = -15$$

An input signal with two tones, one at  $f_1 = 6[\text{kHz}]$  and the other at  $f_2 = 8[\text{kHz}]$  was used. Only one path was simulated at baseband. The goal of the simulation was to show that the closed loop system is capable of reducing distortion within the conditions



Figure 3.16: Two-tones test Open and Closed loop comparison: (a) Output signal in the time domain. (b) Output signal in the frequency domain.

shown in Table 3.5.

Figure 3.16(b) shows the frequency response for the closed loop transfer function (blue line) and the open loop transfer function (red line). For the closed loop transfer function the unwanted frequency components ($2 \cdot f_1 - f_2$, $2 \cdot f_2 - f_1$ and others) are significantly attenuated.
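To illustrate where the unwanted components come from, the following sketch passes a two-tone signal through the polynomial (3.40) in open loop only and measures the third order intermodulation products at $2 f_1 - f_2$ = 4 [kHz] and $2 f_2 - f_1$ = 10 [kHz]. The per-tone amplitude, the sampling rate and the window length are assumptions chosen for convenience, not values from the simulation above, and the feedback loop is not modeled.

```python
import cmath
import math

K1, K2, K3 = 100.0, 5.0, -15.0   # coefficients of (3.40)
F1, F2 = 6_000.0, 8_000.0        # tone frequencies [Hz]
FS = 64_000.0                    # assumed sampling rate [Hz]
N = 64                           # 1 ms window, so every tone fits a whole number of cycles
AMPLITUDE = 0.1                  # assumed per-tone amplitude [V]

def pa(v):
    """Third order polynomial PA model of (3.40)."""
    return K1 * v + K2 * v ** 2 + K3 * v ** 3

def tone_amplitude(samples, freq):
    """Amplitude of the component at freq, by correlation with a complex exponential."""
    acc = sum(x * cmath.exp(-2j * math.pi * freq * n / FS)
              for n, x in enumerate(samples))
    return 2.0 * abs(acc) / len(samples)

two_tone = [AMPLITUDE * (math.cos(2 * math.pi * F1 * n / FS)
                         + math.cos(2 * math.pi * F2 * n / FS))
            for n in range(N)]
output = [pa(v) for v in two_tone]

im3_low = tone_amplitude(output, 2 * F1 - F2)    # component at 4 kHz
im3_high = tone_amplitude(output, 2 * F2 - F1)   # component at 10 kHz
fundamental = tone_amplitude(output, F1)         # component at 6 kHz
print(im3_low, im3_high, fundamental)
```

For equal tone amplitudes $a$, the cubic term contributes $\frac{3}{4} |K_3| a^3$ at each of the two intermodulation frequencies, which is what the correlation measures here.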

### 3.3.4 Checking the phase shift detection and correction

A simulation was performed using a one volt sinusoidal input signal with a frequency of 4.8 [kHz] for the Quadrature component, while the In-phase component was set to zero volts. A constant envelope is assumed, therefore the phase distortion frequency, $f_{PD}$, can be neglected for a high supply voltage in the PA. The drift frequency, $f_{Dr}$, can be considered much smaller than the input frequency, because drift is primarily related to thermal behavior, which involves very low frequencies.

The frequency of 4.8 [kHz] was chosen so that a half cycle of the sinusoid corresponds to one bit period. A phase offset of $30^{\circ}$ was set into the LO of the downconverter. Given the input signal, the condition for $C_0$ follows from Equation (3.37):

$$C_0 \gg 30159 \tag{3.41}$$

 $C_0$  is chosen to be 40000, which agrees with (3.41).
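The bound in (3.41) can be reproduced from (3.37) with a few lines. The sketch assumes $\kappa_1 \kappa_2 = 1$ and neglects $f_{PD}$ and $f_{Dr}$, as argued in the text; the function name and default arguments are illustrative only.

```python
import math

def c0_lower_bound(f_in, f_pd=0.0, f_dr=0.0, kappa1=1.0, kappa2=1.0):
    """Right-hand side of (3.37): 2*pi*(f_in + f_PD + f_Dr) / (kappa1*kappa2)."""
    return 2.0 * math.pi * (f_in + f_pd + f_dr) / (kappa1 * kappa2)

bound = c0_lower_bound(4_800.0)   # input tone at 4.8 kHz
margin = 40_000.0 / bound         # ratio of the chosen C_0 to the bound
print(f"bound = {bound:.0f}, chosen C_0 / bound = {margin:.2f}")
```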

Figure 3.17 shows the response in time for the phase shift,  $\theta_{shift}$ . It is possible to see that in steady-state the phase shift tends to -0.5236 [rad] which is equivalent to



Figure 3.17: Phase shift versus time

 $-30^{\circ}$. The sign indicates the direction of the correction: the phase offset was added to the LO, and therefore the phase shift is in the opposite direction.

Increasing the value of $C_0$ above 40000 will produce a faster response of the phase shift, at the cost of a larger overshoot.

The phase correction is straightforward as depicted in Figure 3.13. The phase shift is applied to the error signals  $I_{error}$  and  $Q_{error}$ . The block shows a counterclockwise matrix rotation operation. Methods to implement a rotation will be introduced in the next chapter.

# 3.4 Behavior of the system for higher bandwidths

So far the stability of the Cartesian feedback has been analyzed; however, the input signal bandwidth used was low. The following question arises: how much bandwidth can a system like the one studied above support?

Assuming a loop gain of 10 and a phase margin of 60° the following table shows the maximum system delay attainable and the system parameters.

| Bandwidth [kHz] | Filter poles [kHz] | $z_f$ [kHz] | $p_f$ [kHz] | System delay [ns] | Sampling frequency<sup>10</sup> [MHz] |
|-----------------|--------------------|-------------|-------------|-------------------|---------------------------------------|
| 9.6             | 14.945             | 44.839      | 448.39      | 400               | 0.5                                   |
| 20              | 31.138             | 93.414      | 934.14      | 200               | 1.0                                   |
| 40              | 62.276             | 186.828     | 1868.28     | 100               | 2.0                                   |
| 80              | 124.552            | 373.656     | 3736.56     | 50                | 4.0                                   |

Table 3.6: Cartesian feedback system setup for different bandwidths. The loop gain is fixed to 10 and the phase margin is  $60^{\circ}$ .

<sup>10</sup> The sampling frequency is obtained for a mixed-signal Cartesian feedback with an 8 bit ADC/DAC. For a larger bit word ADC/DAC the sampling frequency will increase.

From Table 3.6 it is possible to see that as the bandwidth doubles the sampling frequency also doubles, while the system delay is halved.

These observations show the limitations of a digital implementation. A digital implementation requires time to convert and process the signals. As the bandwidth increases, the system delay has to be reduced to meet the phase margin ( $\geq 60^{\circ}$ ) for a fixed loop gain of 10. This limits the available time for digital processing. If more bandwidth is desired, then phase margin and/or loop gain have to be sacrificed. Reducing the phase margin will jeopardize the system stability. Reducing the loop gain decreases the linearity of the system.
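The time budget implied by Table 3.6 can be tabulated with a short sketch. It takes the delay and sampling frequency columns directly from the table and computes the fraction of each sampling period taken up by the system delay; treating that whole delay as time available to the digital part is a simplifying assumption.

```python
# Rows of Table 3.6: (bandwidth [Hz], system delay [s], sampling frequency [Hz])
TABLE_3_6 = [
    (9.6e3, 400e-9, 0.5e6),
    (20e3, 200e-9, 1.0e6),
    (40e3, 100e-9, 2.0e6),
    (80e3, 50e-9, 4.0e6),
]

def busy_fraction(delay_s, sampling_hz):
    """Fraction of one sampling period taken up by the system delay."""
    return delay_s * sampling_hz

for bandwidth, delay, fs in TABLE_3_6:
    period_us = 1e6 / fs
    print(f"{bandwidth / 1e3:5.1f} kHz: period = {period_us:.2f} us, "
          f"busy fraction = {busy_fraction(delay, fs):.2f}")
```

For every row the delay is one fifth of the sampling period, so the relative time budget stays the same while the absolute time available shrinks with bandwidth.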

# 3.5 Summary

In this chapter the Cartesian feedback stability has been examined. A model for the open loop transfer function was proposed, consisting of a number of dominant poles, a zero and a delay modeled by the exponential function. As the system is in the digital domain, DACs and ADCs are required. In order to eliminate aliasing and imaging, filters are required in the forward and feedback paths. The inclusion of these filters adds poles to the open loop transfer function. Only the poles of the forward path are dominant. Poles in the feedback path are not dominant and can be discarded for stability analysis purposes. The stability of the closed loop system is compromised when a delay is present, so compensation is required to improve the phase and gain margins of the system. Adding one zero improves the stability of the system. The zero can be inserted in the forward or in the feedback path. Inserting it in the forward path affects the closed loop transfer function, and the sampling frequency has to be increased. Inserting it in the feedback path does not affect the sampling frequency, which reduces the power consumption of the converters.

Non-linear elements will distort the information signal. To suppress distortion, sufficient loop gain has to be applied in the frequency range of the information signal. Delays introduced by the elements in the path can bring the system to instability. The delay produces a phase shift of the signal in the loop. To correct the phase shift, it first has to be detected and then corrected. A model for the phase shift detection as well as for the correction was developed and evaluated.

In previous chapters a complete analysis of the Cartesian feedback system was presented. The result of that analysis made it possible to find a system that compensates stability and phase shift. It was shown that to compensate the phase shift, arithmetic and trigonometric operations must be performed.

The domain selected to implement the compensation is digital because in this way the power consumption is lowered. The digital compensator's main operations are vector rotation and vector magnitude. These operations involve sine, cosine, squaring and square root operations; their basic arithmetic operations are implemented using adders/subtractors and multipliers. There are many ways to implement a digital compensator, and each of them can involve different types of digital elements that produce the same result. What differentiates them from each other is the trade-off between area, delay and power consumption. These metrics will also help to decide how many elements to use and how to organize them in time in order to meet the requirements in the most efficient way.

# 4.1 Mixed-signal Cartesian feedback system description

Figure 4.1 shows (in grey) a block diagram of the digital signal processing part. A phase detector computes the instantaneous phase difference. An integrator computes the phase shift. The rotation block performs the phase correction on the error signals. A



Figure 4.1: Mixed-signal Cartesian feedback model.



Figure 4.2: Error step response for both types of compensation: zero in the forward path (solid line) and zero in the feedback path (dashed line).

magnitude block computes the magnitude of the error signal. The error signals are obtained by an addition (subtraction) of the input signals and the feedback signals.

# 4.2 Determining the bit resolution

Before starting with the implementation, it is important to determine the bit resolution for the digital signal processing part. It is expected that the error signals will be small under a stable Cartesian feedback system with enough DC loop gain. As a consequence only the least significant bits (LSBs) will contain valuable information. A reduced word length will limit the resolution of the information obtained during the process. Figure 4.2 shows that the signal settles, at steady state, to 0.0909 [V]<sup>1</sup>.

For an eight-bit word, only the four least significant bits carry information, with an error of 5.46%. For a twelve-bit word, however, the eight LSBs carry relevant information and the error percentage is reduced significantly.

| Word length | Binary representation of 0.0909 | Truncated value  | Error (%) |
|-------------|---------------------------------|------------------|-----------|
| 8 bits      | 0.0001011                       | 0.08593750000000 | 5.46      |
| 10 bits     | 0.000101110                     | 0.08984375000000 | 1.16      |
| 12 bits     | 0.00010111010                   | 0.09082031250000 | 0.09      |
| 14 bits     | 0.0001011101000                 | 0.09082031250000 | 0.09      |
| 16 bits     | 0.000101110100010               | 0.09088134765625 | 0.02      |

Table 4.1: Bit resolution length comparison for a steady state value of the error.
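The entries of Table 4.1 can be reproduced with the sketch below. It assumes that an n-bit word keeps n - 1 bits to the right of the binary point and that the value is truncated (not rounded), which matches the binary strings listed in the table; the helper names are illustrative.

```python
import math

def truncate_fixed_point(value, word_length):
    """Truncate a non-negative value to word_length - 1 fractional bits."""
    frac_bits = word_length - 1          # assumption: one bit left of the binary point
    scale = 1 << frac_bits
    return math.floor(value * scale) / scale

def error_percent(value, word_length):
    """Relative truncation error in percent."""
    return 100.0 * (value - truncate_fixed_point(value, word_length)) / value

STEADY_STATE = 0.0909                    # steady state error from Figure 4.2 [V]
for bits in (8, 10, 12, 14, 16):
    print(bits, truncate_fixed_point(STEADY_STATE, bits),
          round(error_percent(STEADY_STATE, bits), 2))
```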

<sup>&</sup>lt;sup>1</sup>System with 9.6[Kbps] bandwidth, loop gain of 10, phase margin of 60° and system delay of 400[ns]

To demonstrate the effect of the word bit length and its resolution, an HDL cosimulation model was implemented in ADS<sup>®</sup>. It consists of an analog model of the Up and Down converters, the power amplifier and the filter in the forward and feedback loop. A behavioral description of the digital design in which the length of the input and output signals can be varied is modeled. See Appendix B.2 for a detailed description of the HDL co-simulation environment.

### 4.2.1 Behavioral HDL co-simulation

The behavioral HDL co-simulation helps to validate the required bit resolution for the digital compensator. A pseudo random bitstream with a bit rate of 19.2 [Kbps] is modulated in $\pi/4$-DQPSK and used as input. The signal is first passed through the open loop system, demodulated to baseband at the load and plotted. The Power Amplifier is modeled by the Amplifier2 system model, provided in the ADS® libraries [24]. To model distortion, the third order intercept (TOI) parameter was set to 40 [dBm]. Then the process is repeated for the system in closed loop configuration. The output signals of both simulations are compared in the frequency and time domains. The effect of the bit resolution is shown in Figures 4.3 and 4.4.

The result of this simulation shows that increasing the bit length improves the frequency spectrum. Also, the signal presents a smaller ripple when the resolution is higher.



Figure 4.3: The effect of the bit length resolution in the frequency domain. (a) Frequency spectrum for an eight bits word length. (b) Frequency spectrum for a twelve bits word length.



Figure 4.4: The effect of the bit length resolution in the time domain. (a) Eight bits word input, feedback and rotated signals of the In-phase component. (b) Twelve bits word input, feedback and rotated signals of the In-phase component.

# 4.3 Designing for low power consumption

Power consumption in an integrated circuit (IC) is derived from [25]

$$P_{total} = P_{dynamic} + P_{static}$$

where

$$P_{static} = (I_{DC} + I_{Leak}) \cdot V_{DD} \tag{4.1}$$

and

$$P_{dynamic} = \alpha \cdot (C_L + C_{SC}) \cdot V_{swing} \cdot V_{DD} \cdot f \tag{4.2}$$

The static power in (4.1) is related to biasing and technology: $V_{DD}$ is the supply voltage, $I_{DC}$ the static current and $I_{Leak}$ the leakage current.

Sub-threshold conduction is a consequence of biasing. It is caused by the gradual reduction of the threshold voltage forced by the reduction in the supply voltage. The leakage current grows exponentially for a linear reduction of the threshold voltage [25], [26].

Causes of static power dissipation that depend on the technology are the gate leakage and the junction leakage currents. The gate leakage is caused when a thin layer of  $SiO_2$ , separating the gate from the silicon substrate, causes a reduction in the gate resistance,



Figure 4.5: Scheme of the digital block

allowing the current to leak through the dielectric. The gate leakage also presents an exponential behavior as a function of the supply voltage and the dielectric thickness.

The junction leakage current is caused by the reverse-bias currents in  $n^+$ -p substrate or  $p^+$ -n well (for decreased depletion regions thickness), diffusion of minority carriers through junction and highly doped p-n junctions. It is a strong function of the temperature, but is much smaller than the other two leakages described above.

The dynamic power in (4.2) is related to the activity of the IC: $\alpha$ is the switching activity factor, which accounts for the charging and discharging of load capacitors (empirically determined to be close to 0.1); $C_L$ is the load capacitance (usually referred to a fanout of four (FO4) [26]); $C_{SC}$ is the short circuit capacitance, which models the short-circuit currents (pMOS and nMOS "on" at the same time); $V_{swing}$ is the voltage swing (usually equal to $V_{DD}$); $V_{DD}$ is the supply voltage; and f is the clock frequency.

Dynamic hazards, also known as glitches, contribute to the dynamic power. Glitches are caused by differing signal arrival times at the inputs of a logic gate. They cause unwanted charging and discharging of the load capacitance seen by the logic gate, and therefore a flow of current.

Actions to take into account to reduce the dynamic power:

- Reduce the activity factor. Example: Clock gating to stop portions of the circuit that are idle.
- Reduce capacitance. Use small transistors and careful floorplanning. This is only possible in full custom ASIC. In predefined cell libraries or FPGAs the transistor size is fixed for each technology.
- Reduce supply voltage.
- Reduce clock frequency (fewer clock cycles, more parallelism).
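The effect of these actions follows directly from (4.2), as the sketch below shows. All numerical values are hypothetical and chosen only to make the scaling visible; they are not measurements of the design in this work.

```python
def dynamic_power(alpha, c_load, c_sc, v_swing, v_dd, f_clk):
    """Dynamic power according to (4.2), in watts."""
    return alpha * (c_load + c_sc) * v_swing * v_dd * f_clk

# Hypothetical operating point: 10 pF load, 1 pF short circuit capacitance,
# full-swing 1.2 V logic clocked at 10 MHz with an activity factor of 0.1.
baseline = dynamic_power(0.1, 10e-12, 1e-12, 1.2, 1.2, 10e6)
half_clock = dynamic_power(0.1, 10e-12, 1e-12, 1.2, 1.2, 5e6)
half_supply = dynamic_power(0.1, 10e-12, 1e-12, 0.6, 0.6, 10e6)

print(f"baseline    : {baseline * 1e6:.2f} uW")
print(f"half clock  : {half_clock / baseline:.2f} x baseline")
print(f"half supply : {half_supply / baseline:.2f} x baseline")
```

Halving the clock frequency halves the dynamic power, while halving the supply voltage (with $V_{swing} = V_{DD}$) reduces it to one quarter, which is why supply reduction is usually the most effective of the listed actions.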

# 4.4 Digital block design alternatives

Figure 4.5 shows a scheme of the digital block. Six digital structures are required to perform the phase detection and correction: two comparators, a phase detection circuit, an integrator filter, a rotation algorithm circuit and a magnitude algorithm circuit. The function of each structure and alternative architecture implementations are explained below.

### 4.4.1 Comparator

The comparator performs a subtraction between the input signal and the feedback signal. The subtraction is implemented with an adder in which one of the inputs is inverted to perform the subtraction.

The Class E PA inverts the signal due to the nMOS transistor; the output signal has a phase shift of 180°. Taking that into consideration, the comparator is just an addition. This simplifies the hardware implementation when no sign inversion is required.

The bit word length is twelve bits. The operation could be serial or parallel. It will be shown later on that parallel operation is preferred over serial due to power consumption reduction.



Figure 4.6: Comparator realization

### 4.4.2 Phase detector

In Section 3.2.5 the phase detection concept was introduced. Equation (3.28) shows the operations required to obtain the instantaneous phase error: two multiplications and one subtraction. Figure 4.7 shows the realization of the phase detector. The input signals I, Q, I' and Q' are Q1.11<sup>2</sup> fixed point numbers within the range [-1, 0.99951171875].

The products $I \cdot Q'$ and $I' \cdot Q$ generate Q2.22 fixed point numbers within the range [-2, 1.999999761581421]. The range $[-\pi/2, \pi/2]$, which is the range of interest for the phase shift, is contained in [-2, 1.999999761581421]. The resulting number, $\theta_{error}$, is a twenty-four-bit word. As with the adder, there are many multiplier architectures to select from. The selection of the multiplier will be treated in the next chapter.

<sup>2</sup> Q is a fixed point format in which the number of integer and fractional bits is specified. In Qm.n, m is the number of bits used to designate the two's complement integer portion of the number, exclusive of the sign bit, and n is the number of bits used to designate the two's complement fractional portion of the number.



Figure 4.7: Phase detection realization
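A minimal fixed point sketch of the phase detector is given below. It assumes that the detector output is $I \cdot Q' - I' \cdot Q$ (the order of the subtraction is an assumption here, since Equation (3.28) is not repeated in this section) and that inputs are quantized to twelve-bit words with eleven fractional bits; function names are illustrative.

```python
import math

FRAC_BITS = 11                                   # fractional bits of each input word

def to_fixed(x):
    """Quantize x to a twelve-bit two's complement integer, saturating at the range limits."""
    raw = round(x * (1 << FRAC_BITS))
    return max(-(1 << FRAC_BITS), min((1 << FRAC_BITS) - 1, raw))

def phase_detector(i_in, q_in, i_fb, q_fb):
    """Return I*Q' - I'*Q computed on integers; the result has 22 fractional bits."""
    i, q, i_p, q_p = (to_fixed(v) for v in (i_in, q_in, i_fb, q_fb))
    raw = i * q_p - i_p * q                      # fits in a 24-bit signed word
    return raw / float(1 << (2 * FRAC_BITS))

# Input vector at 20 degrees, feedback vector at 50 degrees, both with magnitude 0.5.
a, b = math.radians(20.0), math.radians(50.0)
theta_error = phase_detector(0.5 * math.cos(a), 0.5 * math.sin(a),
                             0.5 * math.cos(b), 0.5 * math.sin(b))
print(theta_error)   # close to 0.5 * 0.5 * sin(30 degrees) = 0.125
```

The output is proportional to the product of the two magnitudes and to the sine of the phase difference, so it approximates the phase difference itself only for small angles and normalized signals.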

### 4.4.2.1 Integrator filter

In Section 3.2.5, Figure 3.14 shows that an integrator is required to obtain the phase shift. The most practical way to design a digital filter from its analog counterpart is to use the bilinear transformation [27]. This transformation is a mapping from the s-plane to the z-plane using the following relation:

$$s = \frac{2}{T_s} \left( \frac{1 - z^{-1}}{1 + z^{-1}} \right) \tag{4.3}$$

The transfer function of the discrete filter will be given by:

$$H(z) = H(s)\big|_{s = \frac{2}{T_s} \left(\frac{1-z^{-1}}{1+z^{-1}}\right)} \tag{4.4}$$

For an integrator of the form

$$H(s) = \frac{C_0}{s} \tag{4.5}$$

the discrete transfer function H(z) will be:

$$H(z) = \frac{C_0 T_s}{2} \left( \frac{1+z^{-1}}{1-z^{-1}} \right) \tag{4.6}$$

The direct form of an Infinite Impulse Response (IIR) filter of order N is given by:

$$H_{DF}^{N}(z) = \frac{P^{N}(z)}{D^{N}(z)} = \frac{p_{0} + p_{1}z^{-1} + p_{2}z^{-2} + \dots + p_{N}z^{-N}}{1 + d_{1}z^{-1} + d_{2}z^{-2} + \dots + d_{N}z^{-N}} \tag{4.7}$$

From (4.7) with N = 1, the coefficients $p_0$, $p_1$ and $d_1$ are found by setting $H_{DF}^1(z) = H(z)$ from (4.6):

$$p_0 = p_1 = \frac{C_0 T_s}{2}, \qquad d_1 = -1$$

where $C_0 = 60318$ for a bandwidth of 9.6[kHz] (see Section 3.2.5) and $T_s = 1.58$ [µs] is the sampling period.

The realization of the filter, in transposed direct form II, is shown in Figure 4.8.



Figure 4.8: IIR in transposed direct form II. (a) General realization. (b) Simplified realization due to  $p_0 = p_1$  and  $d_1 = -1$ .

Table 4.2: IIR coefficients values.

| Coef. | Relation    | Value  |
|-------|-------------|--------|
| $p_0$ | $C_0 T_s/2$ | 0.0477 |
| $p_1$ | $C_0 T_s/2$ | 0.0477 |
| $d_1$ | -1          | -1     |
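The coefficients of Table 4.2 and the resulting recursion can be checked with a short sketch. It implements the difference equation that follows from (4.6), $y[n] = y[n-1] + p_0 (x[n] + x[n-1])$, in floating point rather than as the transposed direct form II structure of Figure 4.8; the function name is illustrative.

```python
C0 = 60318.0        # integrator gain for a 9.6 kHz bandwidth
TS = 1.58e-6        # sampling period [s]
P0 = C0 * TS / 2.0  # p_0 = p_1 in Table 4.2

def integrate(samples):
    """Bilinear (trapezoidal) integrator: y[n] = y[n-1] + P0 * (x[n] + x[n-1])."""
    y_prev = 0.0
    x_prev = 0.0
    out = []
    for x in samples:
        y = y_prev + P0 * (x + x_prev)
        out.append(y)
        y_prev, x_prev = y, x
    return out

print(f"p0 = p1 = {P0:.4f}")
print(integrate([1.0] * 10)[-1])   # response to a constant input after ten samples
```

For a constant unit input the output grows by $2 p_0 = C_0 T_s$ per sample, which is the discrete counterpart of the ramp $C_0 t$ produced by the analog integrator.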

### 4.4.3 Phase error correction (Rotation)

The phase error correction is carried out by a counterclockwise rotation of the quadrature signals.

$$\begin{pmatrix} I_{rot} \\ Q_{rot} \end{pmatrix} = \begin{pmatrix} \cos(\theta_{shift}) & \sin(\theta_{shift}) \\ -\sin(\theta_{shift}) & \cos(\theta_{shift}) \end{pmatrix} \cdot \begin{pmatrix} I_{error} \\ Q_{error} \end{pmatrix} \tag{4.8}$$

This rotation can be implemented in several ways which are explained below.

#### 4.4.3.1 Multipliers and LUTs

To obtain the rotated signals, the values $\cos(\theta_{shift})$ and $\sin(\theta_{shift})$ are needed first. The simplest way is to use look-up tables (LUTs) that map the input binary value of $\theta_{shift}$ to tables with the sine and cosine functions preloaded. The size of the LUT depends on the word length. For a twelve-bit word the table contains $2^{12} = 4096$ entries. The next step is to multiply these values by the input signals $I_{error}$ and $Q_{error}$, and to finish with an addition/subtraction, as shown in (4.9) and (4.10):

$$I_{rot} = I_{error} \cdot \cos(\theta_{shift}) + Q_{error} \cdot \sin(\theta_{shift}) \tag{4.9}$$

$$Q_{rot} = -I_{error} \cdot \sin(\theta_{shift}) + Q_{error} \cdot \cos(\theta_{shift}) \tag{4.10}$$

Figure 4.9 shows the realization of the phase correction using LUTs. This architecture will prove to be fast, but area consuming due to the requirement of four multipliers and large size LUTs.



Figure 4.9: Rotation realization with real multipliers and LUTs.
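A floating point sketch of (4.9) and (4.10) is shown below. The calls to `math.cos` and `math.sin` stand in for the two LUTs, and the four products correspond to the four multipliers of Figure 4.9; the function name is illustrative.

```python
import math

def rotate(i_error, q_error, theta_shift):
    """Apply (4.9) and (4.10): four multiplications, one addition, one subtraction."""
    c = math.cos(theta_shift)   # stands in for the cosine LUT
    s = math.sin(theta_shift)   # stands in for the sine LUT
    i_rot = i_error * c + q_error * s
    q_rot = -i_error * s + q_error * c
    return i_rot, q_rot

i_rot, q_rot = rotate(1.0, 0.0, math.pi / 2)
print(i_rot, q_rot)
```

With the sign convention of (4.8), a positive $\theta_{shift}$ maps the vector (1, 0) to (0, -1); a negative $\theta_{shift}$, such as the -0.5236 [rad] found in Section 3.3.4, turns the vector the other way. The magnitude of the vector is unchanged by the operation.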

### 4.4.3.2 Complex multiplier

An alternative to using multipliers and LUTs is the use of a complex multiplier (CM). The advantage of a CM is its design in which distributed arithmetic (DA) is applied on bit-level operations, reducing the area and increasing the speed [28]. Figure 4.10 shows the realization of a complex multiplier with distributed arithmetics.



Figure 4.10: Complex multiplier realization. (a) LSR (left), MSC (right). (b) MID (left), ADD32 (right). (c) MSR (left), COR1 (right). (d) Global structure of bit parallel complex multiplier using DA.[28]

| Supply voltage | Maximum clock frequency | Power consumption | Attainable signal bandwidth |
|----------------|-------------------------|-------------------|-----------------------------|
| 1.5 V          | 10 MHz                  | 35 mW             | 0.2-0.4 MHz                 |
| 1.8 V          | 20 MHz                  | 86 mW             | 0.5-0.8 MHz                 |
| 2.7 V          | 50 MHz                  | 410 mW            | 1.2-2.0 MHz                 |
| 3.3 V          | 60 MHz                  | 825 mW            | 1.5-2.4 MHz                 |
| 5.0 V          | 60 MHz                  | 2 W               | 1.5-2.4 MHz                 |

Table 4.3: Maximum speed and power consumption for various supply voltages [29].

In [29] a chip for the linearization of an RF PA using pre-distortion based on a bit-parallel CM was implemented. Table 4.3 shows the power consumption, the maximum clock frequency and the respective bandwidth ranges. I and Q were 14-bit two's complement numbers.

#### 4.4.3.3 CORDIC algorithm - circular rotation mode

An alternative to using multipliers with LUTs and complex multipliers is CORDIC. CORDIC stands for COordinate Rotation DIgital Computer. It was first introduced in [30], in 1959. The simplicity of this technique makes it attractive to compute rotations and other operations such as magnitude and arctangent.

The basic concept is to decompose the rotation angle into smaller partial rotations of predefined angles, in such a way that the rotation through each predefined angle can be accomplished with shift-and-add operations.

$$I_{i+1} = I_i - \sigma_i \cdot Q_i \cdot 2^{-i} \tag{4.11}$$

$$Q_{i+1} = Q_i + \sigma_i \cdot I_i \cdot 2^{-i} \tag{4.12}$$

$$\theta_{i+1} = \theta_i - \sigma_i \cdot \tan^{-1}(2^{-i}) \tag{4.13}$$

in which  $I_0 = I_{error}$ ,  $Q_0 = Q_{error}$ ,  $\theta_0 = \theta_{shift}$  and  $\sigma_i$  is computed in such a way that  $\theta_{i+1}$  tends to zero.

In a folded implementation of the CORDIC each partial rotation takes one clock cycle, and the total number of cycles depends on the number of precision bits required. A twelve-bit word takes twelve clock cycles to compute the rotation.

$$I_{rot} = I_{scaling} \cdot I_{N-1} \tag{4.14}$$

$$Q_{rot} = Q_{scaling} \cdot Q_{N-1} \tag{4.15}$$

$$\theta_{rot} = 0 \tag{4.16}$$



Figure 4.11: Realization of a CORDIC algorithm in circular rotation mode

where

$$I_{scaling} = \frac{1}{\prod_{i=0}^{N-1} \sqrt{1 + \sigma_i^2 \cdot 2^{-2i}}} \tag{4.17}$$

$$Q_{scaling} = \frac{1}{\prod_{i=0}^{N-1} \sqrt{1 + \sigma_i^2 \cdot 2^{-2i}}} \tag{4.18}$$

where N is the number of steps or micro rotations. For a twelve-bit word N = 12.

When the CORDIC is implemented in radix-2, $\sigma_i$ can take only two values, -1 or 1, so $\sigma_i^2 = 1$ and the scaling factors are constant. The product in the denominator of (4.17) and (4.18) converges to approximately 1.646760258, which gives $I_{scaling} = Q_{scaling} \approx 0.607252935$.

Figure 4.11 shows the folded CORDIC realization. Three adders, two barrel shift registers and one small LUT, used to store the predefined angles, are required. Compared to a Multiplier and LUTs realization the CORDIC is cheaper in hardware and area costs, but much slower.
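The rotation mode can be sketched in floating point as follows. The sketch uses the standard cross-coupled CORDIC update, in which the I update uses $Q_i$ and the Q update uses $I_i$, chooses $\sigma_i$ as the sign of the residual angle, and applies the constant scaling once at the end. It rotates counterclockwise for a positive angle; matching the sign convention of (4.8) would require negating the input angle. The multiplications by $2^{-k}$ stand in for the barrel shifters of a hardware implementation.

```python
import math

N_ITER = 12  # one micro-rotation per bit of a twelve-bit word

# Elementary angles atan(2^-i): the contents of the small LUT.
ANGLES = [math.atan(2.0 ** -i) for i in range(N_ITER)]

# CORDIC gain: product of sqrt(1 + 2^-2i); its inverse is the scaling factor.
GAIN = 1.0
for i in range(N_ITER):
    GAIN *= math.sqrt(1.0 + 2.0 ** (-2 * i))

def cordic_rotate(i_in, q_in, theta):
    """Rotate (i_in, q_in) counterclockwise by theta [rad], for |theta| up to about 1.74."""
    i_val, q_val, angle = i_in, q_in, theta
    for k in range(N_ITER):
        sigma = 1.0 if angle >= 0.0 else -1.0   # drive the residual angle to zero
        shift = 2.0 ** -k                       # a right shift by k bits in hardware
        i_val, q_val = (i_val - sigma * q_val * shift,
                        q_val + sigma * i_val * shift)
        angle -= sigma * ANGLES[k]
    return i_val / GAIN, q_val / GAIN

i_rot, q_rot = cordic_rotate(1.0, 0.0, math.pi / 6)
print(i_rot, q_rot)   # close to cos(30 degrees), sin(30 degrees)
```

After twelve iterations the residual angle is bounded by $\tan^{-1}(2^{-11})$, so the result agrees with the exact rotation to roughly three decimal places.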

### 4.4.4 Magnitude

Recall that the magnitude of the information signal is required to drive the EER in order to recover the envelope. As the signals I and Q are perpendicular, Pythagoras' theorem can be applied to compute the magnitude.

$$Magnitude = \sqrt{I_e \times I_e + Q_e \times Q_e} \tag{4.19}$$

As (4.19) shows, two multipliers (or two squarers) and one square rooter are required.

| Design     | MOS num. | Technology | Tp (ns) | PD/F (µW/MHz) |
|------------|----------|------------|---------|---------------|
| Boot       | 4408     | 0.6 µm     | 13.1    | 454           |
| Parallel   | 5426     | 0.6 µm     | 13.9    | 541           |
| Parallel   | -        | 0.35 µm    | 2.8     | -             |
| Redundant  | 4410     | 0.35 µm    | 6.4     | 50            |
| Our design | 2832     | 0.35 µm    | 1.3     | 62.5          |

Table 4.4: Comparison of the Proposed Squarer [31].



Figure 4.12: A new square-root circuit (N-SR) for four bit input [33]

### 4.4.4.1 Squarer and square rooting

Using a multiplier to square the $I_{error}$ and $Q_{error}$ signals proves to be useful when resources are shared. On the other hand, it would be cheaper in area to use a dedicated squarer because of the savings in hardware: since both operands are the same, the number of partial products is halved. In [31], a low power systolic squarer was presented. In that work, five groups of cells are carefully designed to reduce the gate count of full adders and D flip flops. This allowed the authors to shorten the clock cycle while also reducing area and power. Table 4.4 shows their results in comparison to other proposed squarers.

There are many methods to compute the square root. Starting from the pencil and paper algorithm it is possible to evolve to more sophisticated algorithms such as restoring shift/subtract, binary non-restoring and square rooting by convergence [32]. A parallel square rooter with low dynamic power was proposed in [33]. Figure 4.12 shows the circuit realization. Table 4.5 shows the results of this square rooter compared with a conventional square rooter (C-SR).

| SR circuits                                 | C-SR | N-SR |
|---------------------------------------------|------|------|
| No. of logic gates G                        | 189  | 95   |
| No. of logic gates critical path            | 60   | 30   |
| Supply voltage $V_{SS}$ [V]                 | 1.0  | 0.77 |
| Dynamic power $[\mu W]$ , $f_{clk}$ 570 MHz | 484  | 131  |
| Leakage power [nW]                          | 1147 | 276  |

Table 4.5: Characteristics of conventional square rooter and the new square rooter [33].
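The pencil and paper method mentioned above can be sketched as a binary digit-by-digit (restoring shift/subtract) integer square root. This is a generic illustration of that class of algorithm, not the N-SR circuit of [33]; the 24-bit default width is an assumption chosen to fit the sum of two squared twelve-bit values.

```python
def isqrt_shift_subtract(value, bits=24):
    """Integer square root of a non-negative value below 2**bits (bits must be even)."""
    if value < 0 or value >= (1 << bits) or bits % 2:
        raise ValueError("value out of range or odd bit width")
    root = 0
    remainder = 0
    # Consume the radicand two bits at a time, most significant pair first.
    for shift in range(bits - 2, -2, -2):
        remainder = (remainder << 2) | ((value >> shift) & 0b11)
        trial = (root << 2) | 1          # subtrahend if the next root bit is 1
        root <<= 1
        if remainder >= trial:           # the subtraction succeeds: root bit is 1
            remainder -= trial
            root |= 1
    return root

print(isqrt_shift_subtract(144))
print(isqrt_shift_subtract(3 * 3 + 4 * 4))
```

Each iteration produces one bit of the root using only a shift, a comparison and a subtraction, so a 24-bit radicand takes twelve iterations.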

### 4.4.4.2 CORDIC algorithm - circular vectoring mode

Another alternative is to use the CORDIC in circular vectoring mode. In this mode there is no input for the rotation angle and the algorithm computes

$$Magnitude = Q_{scaling}\sqrt{I_{error}^2 + Q_{error}^2} \tag{4.20}$$

$$I_{i+1} = I_i - \sigma_i \cdot Q_i \cdot 2^{-i} \tag{4.21}$$

$$Q_{i+1} = Q_i + \sigma_i \cdot I_i \cdot 2^{-i} \tag{4.22}$$

in which  $I_0 = I_{error}$ ,  $Q_0 = Q_{error}$  and  $\sigma_i$  is computed in such a way that  $I_{i+1}$  tends to zero.

$$Magnitude = Q_{mag} = Q_{scaling} \cdot Q_{N-1} \tag{4.23}$$

$$I_{mag} = 0 \tag{4.24}$$

where

$$Q_{scaling} = \frac{1}{\prod_{i=0}^{N-1} \sqrt{1 + \sigma_i^2 \cdot 2^{-2i}}} \tag{4.25}$$

For a twelve-bit word the computation of the magnitude requires twelve clock cycles. The CORDIC requires much less area than a square rooter, at the cost of a longer latency.

Figure 4.13 shows the realization of the CORDIC to compute the magnitude. It is similar to the rotation CORDIC with the only difference that no angle input is required.

The range of the magnitude CORDIC is limited to $[0, \pi]$. For the third and fourth quadrants the magnitude can still be computed, but the result is negative. By adding an adder with sign detection at the output, it is possible to obtain the absolute value of the magnitude computed by the CORDIC. Figure 4.14 shows the realization of the absolute value.
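The vectoring iteration and the sign behaviour described above can be sketched in Python as follows. This is a floating-point behavioural model, not the VHDL datapath: the micro-rotations cross-couple $I$ and $Q$, and the direction rule $\sigma_i = \mathrm{sign}(I_i \cdot Q_i)$ is an assumption chosen so that $I$ is driven to zero. The function name is hypothetical.

```python
def cordic_magnitude(i_err, q_err, n_iter=12):
    """Circular vectoring CORDIC that drives I to zero (floating-point sketch).

    Returns the unscaled result Q_N. Its absolute value is the CORDIC gain
    times sqrt(i_err**2 + q_err**2); its sign follows the sign of q_err.
    """
    i_val, q_val = float(i_err), float(q_err)
    for k in range(n_iter):
        # Assumed direction rule: rotate towards the nearest Q half-axis.
        sigma = 1.0 if i_val * q_val >= 0.0 else -1.0
        shift = 2.0 ** -k            # a k-bit right shift in hardware
        i_val, q_val = (i_val - sigma * q_val * shift,
                        q_val + sigma * i_val * shift)
    return q_val


if __name__ == "__main__":
    print(cordic_magnitude(3, 4), cordic_magnitude(3, -4))
```

With this rule the result is positive for $Q_{error} > 0$ and negative for $Q_{error} < 0$, which is why an absolute-value stage follows the CORDIC. Dividing the absolute value by the CORDIC gain (about 1.6468 for twelve iterations) recovers $\sqrt{I_{error}^2 + Q_{error}^2}$.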



Figure 4.13: Realization of a CORDIC algorithm in circular vectoring mode



Figure 4.14: Absolute value realization.

## 4.5 Architecture comparisons

In this section four different architectures for the rotation and four for the magnitude computation are compared. Only one architecture for the phase detection proved to be a good solution (see Section 3.2.5). It can supply the input angle to any of the four rotation architectures presented in this section.

Many combinations of the rotation and magnitude architectures presented here could be selected for the final digital system. However, there are design restrictions that have to be fulfilled. First of all, the digital implementation has to consume little power; therefore a minimum of resources has to be used.

Timing is also an important restriction. It was shown in Chapter 3 that a stable system requires a compromise between the system delay, the open-loop gain and the signal bandwidth. As the bandwidth of the information signal is on the order of ten kilohertz, with an open-loop gain of 10 the system delay is 400[ns] for a stable system with a phase margin of 60°. The demand for fast processing is therefore not high, and using fast parallel multipliers to perform a rotation or a magnitude computation may be over-dimensioned for the requirements.

Table 4.6 compares different implementation alternatives for the blocks composing the digital compensator.

| Operation       | Case | Architecture                  | No. | No. of basic units |    |                    |                     | Transistors         |
|-----------------|------|-------------------------------|-----|--------------------|----|--------------------|---------------------|---------------------|
|                 |      |                               |     | FA                 | HA | PP                 | PG cells            |                     |
| Rotation        | 1    | Real Multipliers <sup>7</sup> | 4   | 396                | 44 | 576 <sup>8</sup>   | -                   | 14984 <sup>9</sup>  |
|                 |      | Adders (HCA)                  | 2   | -                  | -  | -                  | 44 <sup>10</sup>    | 792                 |
|                 |      | LUT ($2^{12} \times 12$)      | 2   | -                  | -  | -                  | -                   | 98304               |
|                 | 2    | Complex Multiplier            | 1   | 262                | -  | -                  | -                   | 9628                |
|                 |      | LUT ($2^{12} \times 12$)      | 2   | -                  | -  | -                  | -                   | 98304               |
|                 | 3    | uCORDIC-p <sup>11</sup>       | 1   | 16                 | -  | -                  | 560                 | 10528               |
|                 | 4    | CORDIC                        | 1   | -                  | -  | -                  | 76                  | 2568 <sup>12</sup>  |
| Magnitude       | 5    | Real Multipliers              | 2   | 198                | 22 | 288                | -                   | 7492                |
|                 |      | Adder (HCA)                   | 1   | -                  | -  | -                  | 22                  | 396                 |
|                 |      | Square rooter                 | 1   | 156                | -  | -                  | -                   | 4368                |
|                 | 6    | Squarer                       | 2   | 72                 | -  | -                  | -                   | 2016                |
|                 |      | Adder (HCA)                   | 1   | -                  | -  | -                  | 22                  | 396                 |
|                 |      | Square rooter                 | 1   | 156                | -  | -                  | -                   | 4368                |
|                 | 7    | uCORDIC                       | 1   | -                  | -  | -                  | 528                 | 9504                |
|                 | 8    | CORDIC                        | 1   | -                  | -  | -                  | 44                  | 1308 <sup>12</sup>  |
| Phase detection | 9    | Real Multipliers              | 2   | 198                | 22 | 288                | -                   | 7492                |
|                 |      | Adder (HCA)                   | 1   | -                  | -  | -                  | 22                  | 396                 |
| Integrator      | 10   | Adder (HCA)                   | 2   | -                  | -  | -                  | 44                  | 792                 |

Table 4.6: Estimation of the area for a parallel architecture with a 12-bit data word length.

The basic-unit columns indicate the required number of full adders (FA), half adders (HA), partial products (PP) and propagate-generate (PG) cells. The last column indicates the number of transistors required. Using multipliers and LUTs for the rotation (case 1) is the most resource-demanding alternative, because the LUT required to store the sine and cosine translation is large. The unfolded CORDIC with precomputed vector rotation (uCORDIC-p, case 3) is considerably smaller in area than case 1, and the folded CORDIC (case 4) is even smaller. The same holds for the magnitude: the folded CORDIC requires the fewest transistors.
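The transistor estimates of Table 4.6 can be reproduced from the basic-unit counts with a few lines of Python. This is a sketch under the per-unit figures quoted in the table footnotes (28 transistors per FA, 10 per HA, 18 per PG cell, and the 6-transistor AND gate per partial product); the function and dictionary names are hypothetical. The folded CORDIC entries are not covered, since they also include barrel shifters and a LUT.

```python
# Transistors per basic unit, as quoted in the footnotes of Table 4.6
# (PP is taken as a 6-transistor AND gate).
TRANSISTORS_PER_UNIT = {"FA": 28, "HA": 10, "PP": 6, "PG": 18}


def transistor_count(fa=0, ha=0, pp=0, pg=0):
    """Estimate transistors from full adders, half adders, partial products and PG cells."""
    units = {"FA": fa, "HA": ha, "PP": pp, "PG": pg}
    return sum(TRANSISTORS_PER_UNIT[name] * n for name, n in units.items())


if __name__ == "__main__":
    rows = {
        "4 real multipliers (case 1)": dict(fa=396, ha=44, pp=576),
        "2 adders, HCA (case 1)": dict(pg=44),
        "uCORDIC-p (case 3)": dict(fa=16, pg=560),
        "uCORDIC (case 7)": dict(pg=528),
    }
    for name, units in rows.items():
        print(name, transistor_count(**units))
```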

Table 4.7 shows the delay in terms of levels of logic. The square rooter (cases 5 and 6) introduces a long critical path and is therefore not a good alternative when latency is critical. Case 1 is the fastest architecture given its parallelism; however, it is the most demanding in area. Cases 3 and 7 (the unfolded CORDIC, uCORDIC) and cases 4 and 8 (the folded CORDIC) are the preferred alternatives because of their small area usage. The uCORDIC has 60 logic levels because twelve Han-Carlson adders are connected in series. The folded CORDIC has five logic levels plus some delay resulting from the barrel shifters. The folded CORDIC has to operate twelve times; therefore its total latency is slightly larger than that of the uCORDIC.

<sup>7</sup>Dadda reduction tree; 3:2 counters = $N^2 - 4 \cdot N + 3$, 2:2 counters = $N - 1$. Final stage Han-Carlson Adder (HCA); $2 \cdot N - 2$

<sup>8</sup>Partial Product, logic AND: 6 transistors

<sup>9</sup>FA: 28 transistors, HA: 10 transistors, PP: 6 transistors (CMOS)

<sup>10</sup>PG cell: 18 transistors.

<sup>11</sup>Unfolded CORDIC based on Han-Carlson Adder (HCA) with precomputation of the rotation direction

<sup>12</sup>Includes the transistors of the barrel shifters and one LUT

| Operation       | Case | Architecture             | Logic levels |
|-----------------|------|--------------------------|--------------|
| Rotation        | 1    | Real Multipliers         | 21           |
|                 |      | Adders (HCA)             | 5            |
|                 |      | LUT ($2^{12} \times 12$) | -            |
|                 | 2    | Complex Multiplier       | 21           |
|                 |      | LUT ($2^{12} \times 12$) | -            |
|                 | 3    | uCORDIC-p                | 60           |
|                 | 4    | CORDIC                   | 5+           |
| Magnitude       | 5    | Real Multipliers         | 21           |
|                 |      | Adder (HCA)              | 5            |
|                 |      | Square rooter            | 156          |
|                 | 6    | Squarer                  | 23           |
|                 |      | Adder (HCA)              | 5            |
|                 |      | Square rooter            | 156          |
|                 | 7    | uCORDIC                  | 60           |
|                 | 8    | CORDIC                   | 5+           |
| Phase detection | 9    | Real Multipliers         | 21           |
|                 |      | Adder (HCA)              | 5            |
| Integrator      | 10   | Adder (HCA)              | 5            |

Table 4.7: Estimation of the latency for a parallel architecture with a 12-bit data word length.

With the tables shown above and the restrictions imposed on the design, it is possible to determine the best architecture for the digital system. From the area perspective the folded CORDIC is the best selection and, even though its latency is high, it is sufficient to fulfill the system requirements.

## 4.6 Architecture realization

Figure 4.15 shows the digital block architecture diagram. It consists of two main blocks: the datapath and the control block.

### 4.6.1 Datapath

The datapath is actually a double datapath: one datapath computes the rotation and the other computes the magnitude. Both share a common input signal provided by the comparators. The rotation datapath is more complex because it requires the result of the phase detection and the integrator filter to provide the angle value for the rotation algorithm. The magnitude datapath is straightforward: once the comparators have computed their results, the magnitude algorithm can be executed.



Figure 4.15: Architecture realization of the digital block

#### 4.6.1.1 Optimization

Besides finding the optimum adder and multiplier, other optimizations are carried out. First, the integrator filter presented in Section 4.4.2.1 requires a multiplication by the coefficient p. This coefficient is a function of $C_0$, the integration constant. It was shown in Section 3.2.5 that $C_0$ has to be much higher than the information signal bandwidth in order to reduce the phase shift error. The coefficient p has to be higher than 0.048; choosing $p = 2^{-4} = 0.0625$ ensures this condition. The multiplication can then be replaced by a wired four-bit right shift that implements the division by 16. This optimization saves one multiplier and reduces area, latency and power.
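The equivalence between the multiplication by $p = 2^{-4}$ and a wired shift can be checked with a small Python sketch. It assumes a two's complement operand and a sign-extending (arithmetic) shift, so the result is the product truncated towards minus infinity; the function name is hypothetical and the structure of the integrator itself is not modelled.

```python
def scale_by_p(value):
    """Multiply a two's-complement integer by p = 2**-4 using an arithmetic right shift."""
    # Python's >> on ints is arithmetic, like a sign-extending wired shift.
    return value >> 4


if __name__ == "__main__":
    print(scale_by_p(1000), scale_by_p(-1000))
```

In hardware the shift costs no logic at all: the register outputs are simply wired four positions lower and the sign bit is replicated.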

The second optimization is the elimination of the scaling factor computation in the CORDIC algorithm. It was shown in Section 4.4.3.3 and Section 4.4.4.2 that the CORDIC scales the output values. Usually that scaling factor has to be removed (by means of a division or multiplication) in order to obtain the exact values of the components. However, because the digital block is part of a loop, it can contribute gain. If the scaling factor is considered as a gain provided by the CORDIC block, it contributes a factor of 1.646760258 to the loop gain, and the three multipliers required to adjust the components for the magnitude and rotation are saved.
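The quoted gain can be checked numerically. The sketch below evaluates the product of the micro-rotation gains $\sqrt{1 + 2^{-2i}}$ for twelve iterations; the function name is an assumption, and the value 1.646760258 quoted above is the limit for a large number of iterations, which the twelve-iteration product matches to better than $10^{-6}$.

```python
import math


def cordic_gain(n_iter=12):
    """Gain accumulated by n_iter CORDIC micro-rotations: prod sqrt(1 + 2^(-2i))."""
    return math.prod(math.sqrt(1.0 + 4.0 ** -i) for i in range(n_iter))


if __name__ == "__main__":
    k = cordic_gain()
    print(f"gain = {k:.7f}, 1/gain = {1.0 / k:.7f}")
```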

Pipelining is not possible in this implementation because input data arrives every sampling period $T_s$ and the output data must be ready before new input samples arrive. This means that the system is latency-constrained.

Gating the clock of the registers makes it possible to perform one operation and then keep that part of the system idle while the rest continues operating. This saves power because the internal signals do not toggle, which reduces the activity factor and avoids dynamic hazards.

### 4.6.2 Control blocks

Two finite state machines (FSMs) are implemented. One generates the control signals of the rotation CORDIC and the other generates the control signals of the magnitude CORDIC. Two four-bit binary counters keep track of the iteration steps of the CORDICs. Once a counter reaches "1011" (11 in decimal), the CORDIC algorithm has finished its operation and the result is ready. The ready (RDY) signal is triggered by a comparator circuit (not shown in Figure 4.15).

The FSMs operate on the falling edge of the clock (CLK); the rest of the sequential logic operates on the rising edge. The output changes only when the register is enabled and a rising edge of the clock occurs.

Figure 4.16 shows the timing diagram of the rotation computation. When new data arrives (ND) and the system is ready (RDY high), the input registers are enabled ($En_0$). During that cycle the registers store the input data. The phase detection is performed and the result is captured by the register in the next cycle, when $En_1$ is high. Starting from cycle two, the next twelve cycles are dedicated to computing the CORDIC algorithm. Signal $Sel_2$ goes high after the first cycle, allowing the multiplexers to feed back the output of the CORDIC. Once the counter has reached "1011", the RDY signal is set high and the counter is disabled ($En\_Cnt_2$ low). In the next cycle $En_5$ is enabled and the output registers load the results.

Figure 4.17 shows the timing diagram of the magnitude computation. When new data arrives (ND) and the system is ready (RDY high), the input registers are enabled ($En_0$). This is controlled by the rotation FSM. The first cycle is used to compute the comparison between the input and feedback values. Starting from cycle two, the next twelve cycles are dedicated to computing the CORDIC algorithm. Signal $Sel_1$ goes high after the first cycle, allowing the multiplexers to feed back the output of the CORDIC. Once the counter has reached "1011", the RDY signal is set high and the counter is disabled ($En\_Cnt_1$ low). In the next cycle $En_4$ is enabled and the output register loads the result. The absolute value is computed and the result is loaded into the output register. This is also controlled by the rotation FSM.

Figure 4.16: Rotation computation timing diagram.

Figure 4.17: Magnitude computation timing diagram
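The cycle sequence of the rotation computation can be summarised with a small behavioural Python sketch. It follows the description of the timing diagram only; the function name, the zero-based cycle numbering and the label `CORDIC` for the iteration cycles are assumptions, and the FSM itself (falling-edge operation, multiplexer selects) is not modelled.

```python
def rotation_control_sequence(n_iter=12):
    """Yield (cycle, active_signal, counter) for one input sample."""
    cycle = 0
    yield cycle, "En_0", None        # input registers capture the new data
    cycle += 1
    yield cycle, "En_1", None        # phase-detection result is registered
    for counter in range(n_iter):    # 4-bit counter runs from 0b0000 to 0b1011
        cycle += 1
        yield cycle, "CORDIC", counter
    cycle += 1
    yield cycle, "En_5", None        # output registers load the results


if __name__ == "__main__":
    for cycle, signal, counter in rotation_control_sequence():
        count = "" if counter is None else format(counter, "04b")
        print(cycle, signal, count)
```

Under these assumptions one sample occupies fifteen clock cycles, twelve of which are CORDIC iterations.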

## 4.7 System specifications for twelve bits

In Chapter 3 the stability analysis was made assuming an eight-bit data word and an SNR of 48.16 [dB]. In Section 4.2 it was shown that twelve bits are necessary to improve the resolution. Twelve bits contribute 72.24 [dB] of SNR. Assuming that the quantization error is equal to the signal noise, the sampling frequency has to be increased.
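Both SNR figures are consistent with the rule of thumb of about 6.02 dB per bit for an ideal $N$-bit quantizer. This is stated here as an assumption, since the expression used in Chapter 3 is not repeated in this section:

$$SNR \approx 6.02 \cdot N \ [\mathrm{dB}] \quad \Rightarrow \quad 6.02 \cdot 8 = 48.16 \ [\mathrm{dB}], \qquad 6.02 \cdot 12 = 72.24 \ [\mathrm{dB}]$$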

Table 4.8 shows the system specifications for a twelve bits implementation when the compensation filter is in the forward path and when it is in the feedback path.

| Input conditions          |              | Output conditions          |              |
|---------------------------|--------------|----------------------------|--------------|
| Bandwidth                 | Phase Margin | Loop Gain                  | System Delay |
| 9.6[kHz]                  | 60°          | 10                         | 400[ns]      |
| Initial poles             |              | Compensation               |              |
| $p_1$                     | $p_2$        | $z_f$                      | $p_f$        |
| 14.945[kHz]               | 14.945[kHz]  | 44.839[kHz]                | 448.39[kHz]  |
| Sampling frequency        |              |                            |              |
| Forward path compensation |              | Feedback path compensation |              |
| 6333.35 [kHz]             |              | 3.0628 [MHz]               |              |

Table 4.8: Cartesian feedback system specifications for a twelve bits architecture.

Depending on the location of the compensator filter, the required sampling frequency differs considerably. The suggestion is to locate the compensator filter in the feedback path. This leads to a simpler system with relaxed timing restrictions in the digital block and less oversampling in the converters, which reduces their power consumption.

## 4.8 Component requirements

Table 4.9 lists the component requirements for the digital design. In the next chapter the optimum components to implement the design will be found.

| Component         | Qty | Description                                                  |
|-------------------|-----|--------------------------------------------------------------|
| Comparator        |     |                                                              |
| Adder             | 2   | Two's complement 12 bit input                                |
| Binary Counter    | 2   | 4 bits binary counter                                        |
| Phase Detector    |     |                                                              |
| Multipliers       | 2   | Two's complement 12 bit input                                |
| Subtractor        | 1   | Two's complement 12 bit input                                |
| Register          | 1   | 16 bits type D with asynchronous reset and rising edge clock |
| Integrator        |     |                                                              |
| Adder             | 1   | Two's complement 16 bit input                                |
| Registers         | 2   | 16 bits type D with asynchronous reset and rising edge clock |
| Wired Right Shift | 1   | 4 bit right shift in exchange of a multiplier                |
| CORDIC rotation   |     |                                                              |
| Registers         | 2   | 12 bits type D with asynchronous reset and rising edge clock |
| Register          | 1   | 16 bits type D with asynchronous reset and rising edge clock |
| 2:1 Mux           | 2   | 12 bits multiplexers                                         |
| 2:1 Mux           | 1   | 16 bits multiplexers                                         |
| Barrel shifter    | 2   | 12 bits, 4 bits selector                                     |
| Adder/Subtractor  | 2   | Two's complement 12 bit input with carry input               |
| Adder/Subtractor  | 1   | Two's complement 16 bit input with carry input               |
| CORDIC magnitude  |     |                                                              |
| Register          | 2   | 12 bits type D with asynchronous reset and rising edge clock |
| 2:1 Mux           | 2   | 12 bits multiplexers                                         |
| Barrel shifter    | 2   | 12 bits, 4 bits selector                                     |
| Adder/Subtractor  | 2   | Two's complement 12 bit input with carry input               |
| Absolute value    |     |                                                              |
| Adder/Subtractor  | 1   | Two's complement 12 bit input with carry input               |
| Registers         | 8   | 12 bits type D with asynchronous reset and rising edge clock |

Table 4.9: Component requirements for a twelve bits architecture.

## 4.9 Summary

In this chapter the realization of the digital block is carried out. Based on the analysis results obtained in previous chapters a digital signal processing architecture is created.

Initially the selected word length was 8 bits. However, simulations show that 12 bits is a better selection, as it improves the distortion reduction at the small cost of larger multipliers and adders.

A review of the power consumption in digital systems was made. The sources of power leakage and dissipation were shown and solutions to cope with them were proposed. The goal is to find an architecture that consumes as little power as possible. To that end, different architectures were evaluated and the one that requires the fewest resources, and therefore the fewest transistors, was selected for implementation.

The architecture selected to perform the rotation and the magnitude computation is the folded CORDIC. To obtain the phase shift, a digital version of the model proposed in [17] was developed. Some optimizations were applied to the architecture in order to increase the processing speed and reduce the power consumption. In the phase shift detector, the integrator filter required a multiplier, which was replaced by a wired right shift. The CORDICs required, in total, three multipliers to adjust the scaling factor. The scaling factor was not adjusted and was left as a contribution to the loop gain, saving three multiplications, which speeds up the system and reduces area and power.

In the previous chapter the architecture for the digital block shown in Figure 4.5 was defined. The present chapter is devoted to implementing this architecture, targeting FPGA and ASIC. Because power is the most important restriction, the basic arithmetic operations required will have to consume little power.

Finding the optimum adder and multiplier becomes a challenge. In order to simplify the search in the design space, a group of adders and multipliers was selected. Among them, those that present the optimum power-delay product and area-delay product will be selected to implement the digital block.

From the idea to the implementation, the design process follows a well-defined flow, called the design flow. The design flow followed in this work is top-down [34]. The digital block of Figure 4.5 will be coded in VHDL, in a register transfer level (RTL) description. The design will be simulated and synthesized for FPGA and ASIC. The Place and Route (P&R) will be carried out only for an FPGA. The generation of the ASIC layout is left for future work.

Design simulation is an important process in the design flow. It helps to verify the design at different stages of the design flow. Performing a reliable simulation for this design is not an easy task, because the digital block is part of a mixed-signal system with feedback. The feedback signals are required to make a comparison with the input signals and perform the adequate correction. The complete system has to be modeled to obtain the required test vectors for the feedback signals.

A mixed-signal model is implemented. The use of ADS® HDL co-simulation makes it possible to simulate the complete system and generate test vectors for later simulation.

Targeting an FPGA is different from targeting an ASIC. In an FPGA the main building blocks are already built into the chip, but they have to be programmed. Therefore, the synthesis has to translate the VHDL code and adapt it to those blocks. Some FPGAs include dedicated hardware for fast computing, like built-in multipliers or fast carry logic to implement adders. This improves the resource usage and reduces power consumption. However, as the building blocks are fixed in a regular pattern on the die, the routing becomes complicated due to the fixed amount of routing resources. In ASICs, instead, nothing is predefined and there is total freedom to implement the design. It is possible to select the adequate technology and fabrication process. Smaller dimensions and lower power consumption are reached in contrast to an FPGA.

## 5.1 Component selection

The basic units required in the design are adders and multipliers. There is a vast variety of adder and multiplier architectures that can be chosen for the implementation [32]. Those that are fast and have low power consumption are preferred. Serial architectures are discarded because of the latency increase. In a serial architecture one bit is computed per clock cycle; therefore an increase in the clock frequency is required to meet the time constraint. This affects the dynamic power due to its direct dependence on the clock frequency in (4.2).

Parallel architectures are faster than serial ones because more bits are processed per clock cycle. The trade-off here is area: parallelism requires more transistors. However, among parallel architectures it is possible to find an adder and a multiplier architecture that require less area and consume less power.

The adder and multiplier architectures evaluated in this work were generated with ARITH [35]. The synthesis was done with Synopsys Design Compiler B-2008.09-SP1-1.

### 5.1.1 Adders

Architectures for two-operand adders are: Ripple carry, Carry look-ahead, Ripple-block carry look-ahead, Block carry look-ahead, Ladner-Fischer, Kogge-Stone, Brent-Kung, Han-Carlson, Conditional sum, Carry select and Carry-skip adder. Those that can perform 12-bit addition were selected.



Figure 5.1: Pareto points for parallel adder architectures <sup>3</sup>. (a) Area versus delay. (b) Power versus delay.

<sup>3</sup>The values obtained for the power, area and delay only take into account logic cells. The synthesis was done with Synopsys Design Compiler.

Figure 5.1(a) shows the Pareto points in the area versus delay design space. The Brent-Kung parallel prefix adder is the optimum in area-delay product. However, when the design space is power versus delay, the Han-Carlson parallel prefix adder is the optimum, as seen in Figure 5.1(b). The adder is selected on the basis of the lowest power-delay product; therefore the Han-Carlson parallel prefix adder is the architecture selected in this work.
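The selection procedure can be sketched in Python as follows: first the non-dominated (Pareto) designs are identified, then the one with the smallest product is chosen. The adder names and the (delay, power) pairs below are hypothetical placeholders, not the synthesis results of Figure 5.1, and the function names are assumptions.

```python
def pareto_front(designs):
    """Return names of designs not dominated in (delay, cost); lower is better for both."""
    front = []
    for name, (delay, cost) in designs.items():
        dominated = any(
            (d <= delay and c <= cost) and (d < delay or c < cost)
            for other, (d, c) in designs.items() if other != name
        )
        if not dominated:
            front.append(name)
    return front


def best_product(designs):
    """Return the design with the smallest delay * cost product."""
    return min(designs, key=lambda name: designs[name][0] * designs[name][1])


if __name__ == "__main__":
    # Hypothetical (delay [ns], power [uW]) pairs, NOT the results of Figure 5.1.
    adders = {
        "adder_A": (1.0, 90.0),
        "adder_B": (1.4, 50.0),
        "adder_C": (2.0, 45.0),
        "adder_D": (2.2, 60.0),
    }
    print(pareto_front(adders))
    print(best_product(adders))
```

The same two functions apply to the area versus delay space by passing area values as the second element of each pair.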

### 5.1.2 Multipliers

Multipliers are more complex structures than adders. Parallel multipliers consist of Partial Product Generator (PPG), Partial Product Accumulator (PPA), and Final Stage Adder (FSA). The PPG stage generates partial products from the multiplicand and multiplier in parallel. The PPA stage then performs multi-operand addition for all the generated partial products and produces their sum in carry-save form. Finally, the carry-save form is converted into the corresponding binary output at FSA.

The partial product generator can be a simple AND gate or a Radix-4 modified Booth recoding. The Radix-4 Booth recoding reduces the number of partial products, which reduces the amount of hardware and the execution time. The partial product accumulator can be implemented using a 4:2 reduction tree, Dadda or Wallace tree, a simple array or a balanced array. The final stage adder uses any of the adders described above.
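How the Radix-4 modified Booth recoding reduces the number of partial products can be seen in the following Python sketch, which recodes a 12-bit two's complement multiplier into six signed digits. It is a behavioural illustration rather than the generator produced by ARITH; the function name and the even word length are assumptions.

```python
def booth_radix4_digits(value, n_bits=12):
    """Recode an n_bits two's-complement integer into radix-4 Booth digits.

    Returns n_bits // 2 digits in {-2, -1, 0, 1, 2}, least significant first,
    such that sum(d * 4**k) == value. n_bits is assumed to be even.
    """
    pattern = value & ((1 << n_bits) - 1)            # two's-complement bit pattern
    bits = [(pattern >> i) & 1 for i in range(n_bits)]
    digits = []
    previous = 0                                     # implicit bit y_(-1) = 0
    for k in range(0, n_bits, 2):
        # Each digit inspects the overlapping triplet (y_(k+1), y_k, y_(k-1)).
        digits.append(bits[k] + previous - 2 * bits[k + 1])
        previous = bits[k + 1]
    return digits


if __name__ == "__main__":
    digits = booth_radix4_digits(-1365)
    print(digits, sum(d * 4 ** k for k, d in enumerate(digits)))
```

Each digit selects 0, ±1 or ±2 times the multiplicand, so a 12-bit multiplier yields six partial products instead of the twelve rows produced by simple AND gates.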

Combining different PPGs, PPAs and FSAs makes it possible to find an optimum parallel multiplier. However, the number of combinations is large and finding the optimum becomes tedious work. Choosing the Han-Carlson adder as the FSA reduces the problem. The selection of the PPG and PPA is done via a Pareto optimum in the area-delay and power-delay spaces.



Figure 5.2: Pareto points for parallel multipliers architectures <sup>3</sup>. (a) Area versus delay. (b) Power versus delay.

Figure 5.2 shows the design space for the parallel multiplier with the Han-Carlson adder as the FSA. The smallest area-delay product corresponds to the Radix-4 PPG with a 4:2 reduction tree for the PPA. The same architecture is found when searching for the optimum power-delay product. Therefore, the selected parallel multiplier is composed of a Radix-4 PPG with a 4:2 reduction tree as the PPA and a Han-Carlson adder.

### 5.1.3 Adders and multipliers in FPGA

When the implementation target is an FPGA, the arithmetic units described above are not the optimum. An FPGA is an array of configurable blocks that can be programmed. This means that the logic is fixed and the routing is limited. The logic functions are implemented via look-up tables (LUTs), as in the case of Xilinx® and Altera® devices. Implementing adders and multipliers with LUTs alone results in slow signal processing. Most FPGA vendors have therefore included dedicated logic for fast arithmetic as part of the configurable logic blocks.

The design is implemented in a Xilinx® Spartan-3 FPGA. The Spartan X3s50-4tq144 [36] is a 50K-gate FPGA in a Thin Quad Flat Package. It is built in a 90[nm] process technology, with a 1.2[V] power supply for the core, 1.2 to 3.3 [V] for the I/Os and 2.5[V] for auxiliary purposes. This device is selected because the design does not require many resources and it is the smallest FPGA in that family.

The X3s50 has 192 configurable logic blocks (CLBs), arranged in 16 rows and 12 columns. Each CLB contains four slices, and each slice contains two four-input look-up tables and two dedicated storage elements that can be used as flip-flops or as latches. The slice also contains dedicated hardware for fast carry logic and a PPG (AND gate) to speed up multiplication. Moreover, the Spartan-3 family has four dedicated 18x18 two's complement multipliers. The use of these multipliers reduces the usage of general-purpose resources and therefore the power consumption.

Taking advantage of that logic, there is no need to implement adders and multipliers other than the ones that are optimized for the FPGA. With the help of the Xilinx® CORE Generator™, the implementation of adders and multipliers is quite simple [37],[38].

## 5.2 Design flow

The design flow consists of three main processes: design entry, synthesis, and place & route (P&R). A fourth process is simulation, which can take place after any of the main processes to verify the results. Figures 5.3(a) and 5.3(b) show the Xilinx and ASIC design flows respectively. Both flows are similar and have the same main processes (light blue boxes). Slight differences arise from the fact that Xilinx is a specific brand with proprietary software tools, whereas the ASIC flow uses industry-standard tools. Some clear differences are the following. The vendor-specific core generator (Xilinx® CORE Generator™) allows components to be implemented that are optimized for a particular FPGA. BitGen translates the output native circuit description (NCD) file into a bitstream file that contains the configuration information to program the FPGA. Back-annotation extracts the timing information from the NCD file and translates it into a VHDL/Verilog model for post-P&R timing simulation.

Figure 5.3: Design flow: (a) Xilinx ISE Design suite 11.1. (b) ASIC.

The ASIC flow, on the other hand, uses standard file formats as outputs from each process (VHDL/Verilog, SDF, etc.). The standard cell libraries are provided by the foundries (TSMC, UMC, FARADAY, etc.). Depending on the requirements a particular technology is selected (90[nm], 65[nm], etc.). The library is part of a design kit that contains a large group of logic cells (AND, OR, flip-flop, etc.) and I/O cells. The cells are selected by the synthesis tool depending on the constraints given by the designer. Synopsys Design Compiler (DC) is used for logic synthesis.

The design is coded in VHDL in a hierarchical RTL description. Timing and area constraint commands guide DC to reach the optimum synthesized design. Timing information in SDF and a VHDL/Verilog netlist are generated for post-synthesis simulation.

ModelSim is used to simulate the RTL model and the post-synthesis VHDL gate-level netlist. The verification is carried out by means of the test benches described in the next section.

Cadence SoC Encounter is used for P&R. Here, the netlist model obtained from synthesis is translated into a geometric realization of the gates and nets.

The result of P&R is a file with the layout information in hierarchical form (Graphic Data System II, GDS II).



Figure 5.4: Test vectors generation diagram.

## 5.3 Verification

#### 5.3.1 Test bench

Simulation is an important process in the design flow. It allows to validate the design at different stages (pre-synthesis, post-synthesis, post-P&R). For that it is important to have good and reliable data that will be used as input for the design.

The test bench is a virtual environment created to put the design into test. In a test bench, usually described in VHDL, several components and signals can be created to facilitate the testing of the design. It is also possible to import data from files and convert them into the desired data type, or export the results to a file to generate reports or graphs.

Because the system that is designed is part of a feedback system, the generation of the test vectors requires a more complex environment. A behavioral model of the Cartesian feedback system was created and simulated in Agilent ADS 2009. The use of the HDL co-simulation feature allowed to simulate hardware description language with analog models. A description of the HDL co-simulation environment is presented in appendix B.2. In this environment the feedback signals are generated and the test vectors can be created.

Figure 5.4 shows a diagram of the test bench used with ADS HDL co-simulation. I\_in and Q\_in are the Cartesian components of a pseudo random serial bit stream modulated in $\pi/4$-DQPSK and pulse shaped with a raised cosine filter (see Section B.2). I\_fb and Q\_fb are the baseband samples of the output signal. The analog to digital converter is modeled in VHDL. The output signals I\_rot, Q\_rot and Mag are converted to analog signals by means of a digital to analog converter, also modeled in VHDL. The input and output digital signals are stored in files for later use. Every rising edge of ND triggers the saving of these signals. These test vectors are used to validate the post-synthesis and post-P&R models and to estimate the dynamic and static power in the Spartan-3 FPGA. They were also used to validate the post-synthesis model of the ASIC implementation.

### 5.3.2 Simulation

Post-synthesis and post-P&R simulations use a test bench like the one shown in Figure 5.5. The input test vectors (created with ADS HDL co-simulation) are read from a file and passed to the unit under test (the design). The signals Clk, Rst and ND are created in the test bench. The output signals are stored in a file to generate graphics. When the test bench is used for power analysis, another output file is created: a switching activity interchange format (SAIF) file, which contains information about the activity of all of the signals in the design. This file is used as input for the Xilinx XPower analyzer to estimate the power consumption of the design in the FPGA.

Figure 5.5: ModelSim test bench.

Figures 5.6(a) and 5.6(b) show the analog representation of the input and feedback test vectors created with the test bench shown in Figure 5.4. Figure 5.7 shows the analog representation of the output signals I\_rot, Q\_rot and Mag after post-P&R for the FPGA implementation. These signals are 12 bits long in two's complement fixed-point representation, with one sign bit and eleven fractional bits. The binary range for these signals is [1.00000000000, 0.11111111111]. In decimal this is a range of $[-1 \times V_{ref}, 0.9995 \times V_{ref}]$, where $V_{ref}$ is the reference voltage of the converter.
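The fixed-point format can be checked with a few lines of Python. The sketch below assumes the layout used in Table 5.1 (one sign bit followed by eleven fractional bits, written as `s.fffffffffff`); the helper name `q11_to_float` is hypothetical and is not part of the VHDL design.

```python
def q11_to_float(bits: str) -> float:
    """Decode a 12-bit two's complement fraction written as 's.fffffffffff'."""
    sign, frac = bits.split(".")
    if len(sign) != 1 or len(frac) != 11:
        raise ValueError("expected 1 sign bit and 11 fractional bits")
    value = -float(int(sign))  # the sign bit carries a weight of -1
    for k, bit in enumerate(frac, start=1):
        value += int(bit) * 2.0 ** -k  # fractional bit k carries a weight of 2^-k
    return value


V_REF = 1.98  # [V], reference voltage listed for I_in in Table 5.1

full_scale_min = q11_to_float("1.00000000000")
full_scale_max = q11_to_float("0.11111111111")
i_in_max_volts = q11_to_float("0.10001110111") * V_REF  # I_in maximum, Table 5.1

print(full_scale_min, full_scale_max, i_in_max_volts)
```

Multiplying the decoded fraction by the reference voltage gives the decimal columns of Table 5.1, up to rounding.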

Table 5.1 shows the maximum and minimum values reached by the signals. Looking at the range actually used by the output signals, only the eight least significant bits (LSBs) contribute to the resolution. It was shown in Figure 4.3 and in Figure 4.4 that a low resolution decreases the capability of the digital Cartesian feedback system to reduce distortion.



Figure 5.6: Test vector signals: (a) Analog representation of the in-phase input and feedback test vectors. (b) Analog representation of the quadrature input and feedback test vectors.



Figure 5.7: Output test vector signals: analog representation of the output test vectors.

The sixth column of Table 5.1 shows the reference voltage used for each signal in the simulation. The rotated signals have a lower reference voltage because of the scaling factor in (4.17) and (4.18). The scaling factor of a CORDIC algorithm is approximately 1.64. The output signals are divided by this value by means of the converters' reference voltage. It is important to take this value into consideration because it contributes to the loop gain: a miscalculation can drive the system into instability.
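As a rough cross-check of the scaling factor, the standard gain of a circular CORDIC (the product of the magnitudes of its micro-rotations) can be evaluated numerically and compared with the ratio of the two reference voltages in Table 5.1. This is a sketch only: the iteration count of 12 is an assumption chosen to match the word length, and equations (4.17) and (4.18) are not reproduced here.

```python
import math


def cordic_gain(iterations: int) -> float:
    """Product of sqrt(1 + 2**(-2*i)) over the CORDIC micro-rotations."""
    gain = 1.0
    for i in range(iterations):
        gain *= math.sqrt(1.0 + 2.0 ** (-2 * i))
    return gain


N_ITER = 12        # assumed iteration count (not stated in this section)
V_REF_IN = 1.98    # [V], reference of the input/feedback signals, Table 5.1
V_REF_ROT = 1.21   # [V], reference of the rotated signals, Table 5.1

K = cordic_gain(N_ITER)
ratio = V_REF_IN / V_REF_ROT
print(f"CORDIC gain: {K:.4f}, reference voltage ratio: {ratio:.4f}")
```

The two numbers agree to within about one percent, which is consistent with the scaling being absorbed into the converter reference.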

Figure 5.8(a) shows the frequency spectrum of the input signal before amplification and linearization. Figure 5.8(b) shows the frequency spectrum of the output signals for the open loop and closed loop cases. Comparing both figures, some conclusions can be drawn. First, within the bandwidth of interest (9.6[kHz]) both the open loop and the closed loop systems amplify the input signal. However, the open loop frequency spectrum (red line) shows side bands with increased magnitude. This is caused by the third order intermodulation distortion (IMD) component explained in Section B.1. The closed loop frequency spectrum (blue line), on the other hand, shows reduced side bands, which means that the linearity of the system has improved. The slow decrease of the magnitude at higher frequencies can be explained by the quantization error introduced by the converters and the digital circuit itself. This error is further reduced when the bit resolution is increased, as shown in Section 4.2.

| Signal | Binary max.   | Binary min.   | Dec. max. | Dec. min. | $V_{ref}$ [V] |
|--------|---------------|---------------|-----------|-----------|---------------|
| I_in   | 0.10001110111 | 1.01100101101 | 1.1054 | -1.1934   | 1.98 |
| I_fb   | 0.1000001011  | 1.01110101101 | 1.0010 | -1.0704   | 1.98 |
| I_rot  | 0.00010111001 | 1.11100101101 | 0.1094 | -0.1247   | 1.21 |
| Q_in   | 0.10001100011 | 1.01110110100 | 1.0861 | -1.0641   | 1.98 |
| Q_fb   | 0.01111111100 | 1.1000001001  | 0.9870 | -0.9815   | 1.98 |
| Q_rot  | 0.00001110010 | 1.11101001101 | 0.1105 | -0.1058   | 1.21 |
| Mag    | 0.00011011001 | 0.00000000000 | 0.1283 | 0.0       | 1.21 |

Table 5.1: Maximum and minimum signal values.



Figure 5.8: Baseband frequency spectrum of the information signal: (a) At the input of the system. (b) At the output after amplification. In red: the open loop system, in blue: the closed loop system.

## 5.4 Implementation

In the present section the results of the implementation are presented. First, the FPGA implementation and the results of logic synthesis, P&R, timing analysis and power consumption are shown. Second, the results from ASIC implementation in UMC 90[nm] process technology are shown.

### 5.4.1 FPGA

In the following sections the results from the analysis are shown. The static timing analysis reports the minimum clock period. The power consumption is estimated with the Xilinx XPower Analyzer. Finally the resource usage in the FPGA is shown hierarchically, indicating the amount of Slices and LUTs utilized.

The input timing constraints set for the FPGA are: a clock period of 12[ns] with 50% duty cycle, and a global OFFSET IN with 3.5[ns] of offset before the rising clock edge and 3.5[ns] of valid data duration after it. The output timing constraint was set to an 8[ns] offset after a rising clock edge.



Figure 5.9: FPGA critical path description, shown by the red line.

#### 5.4.1.1 Timing analysis

Table 5.2 shows the summary of the static timing analysis. For a minimum period of 11.988[ns], the maximum clock frequency is 83.417[MHz]. The maximum path delay (critical path) is found between the in-phase register in the circular rotation mode CORDIC and the output register $Q_{out}$, as shown in Figure 5.9.

Table 5.2: Timing summary  $^{5}$ .

| Design statistics:                        |            |
|-------------------------------------------|------------|
| Minimum period:                           | 11.988[ns] |
| Maximum path delay from/to any node:      | 10.982[ns] |
| Minimum input required time before clock: | 3.459[ns]  |
| Minimum output required time after clock: | 7.126[ns]  |

#### 5.4.1.2 Power analysis

The power consumption is found by running the Xilinx XPower analyzer. After Place & Route, a simulation model is created in VHDL, which is used to validate the final stage. A timing simulation netlist is also created in Standard Delay Format (SDF). Running a simulation of the VHDL model with the SDF file yields the detailed switching activity of all the signals. The resulting Switching Activity Interchange Format (SAIF) file is then used as input to the XPower analyzer.

Table 5.3 shows a summary of the XPower analyzer results. For this analysis a simulation of 15[ms] was carried out with a simulation interval of 1[ps].

The total power dissipated by the digital design in a Spartan-3 X3s50-4tq144 FPGA is 33.31[mW]. The estimation is made with a clock running at 83.3[MHz].

 $<sup>^5\</sup>mathrm{Constraints}$  cover 34215 paths, 0 nets, and 1963 connections

Table 5.3: Digital design power consumption.

| Item                                      | Value | Units |
|-------------------------------------------|-------|-------|
| Total estimated power consumption         | 33.31 | [mW]  |
| Share of the 1.7[W] total power budget    | 1.96  | %     |
| Share of the 0.7[W] budget without the PA | 4.75  | %     |

Considering that the power budget for the transceiver is 1.7[W], of which 1[W] is dedicated to the power amplifier supply, 4.75% of the remaining power budget has to be allocated to the digital design.
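The two percentages follow directly from the figures quoted above, as the short calculation below illustrates. The variable names are chosen here for illustration only.

```python
P_DIGITAL_MW = 33.31   # XPower estimate for the digital design [mW]
P_TOTAL_MW = 1700.0    # transceiver power budget [mW]
P_PA_MW = 1000.0       # share dedicated to the power amplifier supply [mW]

share_total = 100.0 * P_DIGITAL_MW / P_TOTAL_MW
share_without_pa = 100.0 * P_DIGITAL_MW / (P_TOTAL_MW - P_PA_MW)

print(f"{share_total:.2f}% of the total budget")
print(f"{share_without_pa:.2f}% of the budget without the PA")
```

The second ratio evaluates to roughly 4.76%, which the text quotes as 4.75%.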

#### 5.4.1.3 Resource usage analysis

Table 5.4 shows the results of the synthesis. The table indicates the usage of the FPGA in terms of Slices and LUTs as well as the dedicated arithmetic logic. All of the input and output registers use flip flops located in the I/O blocks instead of using Slices. This improves the area usage.

| Logic Utilization                              | Used | Available | Percentage |
|------------------------------------------------|------|-----------|------------|
| Number of Slice Flip Flops                     | 126  | 1,536     | 8%         |
| Number of 4 input LUTs                         | 428  | 1,536     | 27%        |
| Logic Distribution                             | Used | Available | Percentage |
| Number of occupied Slices                      | 340  | 768       | 44%        |
| Number of Slices containing only related logic | 340  | 340       | 100%       |
| Number of Slices containing unrelated logic    | 0    | 340       | 0%         |
| Total Number of 4 input LUTs                   | 439  | 1,536     | 28%        |
| Number used as logic                           | 428  |           |            |
| Number used as a route-thru                    | 11   |           |            |
| Number of bonded IOBs                          | 87   | 97        | 89%        |
| IOB Flip Flops                                 | 84   |           |            |
| Number of MULT18x18s                           | 2    | 4         | 50%        |
| Number of BUFGMUXs                             | 1    | 8         | 12%        |
| Average Fanout of Non-Clock Nets               | 3.09 |           |            |

Table 5.4: Resource usage summary.

In total, eighty-four flip-flops, corresponding to seven 12-bit registers (I\_IN, I\_FB, I\_ROT, Q\_IN, Q\_FB, Q\_ROT and MAG), use I/O storage elements. A hierarchical resource usage summary is shown in Section C.1.2.

#### 5.4.1.4 Place & Route

Figure 5.10 shows the resulting floor plan of the design after P&R. The components are well organized in the floor plan area with respect to the input and output registers. In the upper left side the 12 bits of Q\_in are located, in the top left side the 12 bits of Q\_fb. Close to them the comparator for the quadrature component is found. The same occurs for the in-phase component: the comparator is located close to the I\_in and I\_fb input registers. On the right side is the circular rotation CORDIC, located close to the I\_rot and Q\_rot registers. For the magnitude computation, the vectoring CORDIC is close to the comparators and extends towards the top of the floor plan area, where the magnitude register is located. The adder used to compute the absolute value is also located close to the output magnitude register. The phase detection component is found in the middle of the floor plan area, close to the 18x18 built-in multipliers on the left and to the circular rotation CORDIC on the right. As expected, the finite state machine of each CORDIC is found close to that component. Figure 5.11 shows the resulting routing.

Figure 5.10: Floor plan after Place & Route.



Figure 5.11: Routing after Place & Route.

### 5.4.2 ASIC 90 nm CMOS technology

The ASIC follows the design flow of Figure 5.3(b). The synthesis tool used in this work is Synopsys Design Compiler and the DesignKit is provided by Faraday Technology Corporation. The library used is the standard cells FSD0A\_A. The FSD0A\_A library is a 90 nm standard cell library for UMC's 90 nm logic SP-RTV (Low K) process. This library has been optimized for applications requiring low operation power consumption and ultra high density.

Table 5.5 shows the general characteristics of Faraday FSD0A\_A cells.

Table 5.5: Faraday FSD0A\_A library characteristics [39].

| Operating conditions |                                | Min. | Typ. | Max. | Unit |
|----------------------|--------------------------------|------|------|------|------|
| VCC                  | Core cells                     | 0.9  | 1.0  | 1.1  | V    |
|                      | 2.5 V I/O                      | 2.25 | 2.5  | 2.75 | V    |
| $T_J$                | Junction operation temperature | -40  | 25   | 125  | °C   |

| Characteristic    | Description                                                              |
|-------------------|--------------------------------------------------------------------------|
| Supply voltage    | For the core cells: 0.9 V ~ 1.1 V. For 2.5 V I/O cells: 2.25 V ~ 2.75 V |
| Performance       | $T_d = 18.2$ ps/stage                                                    |
| Gate density      | 400k gate/mm$^2$                                                         |
| Power consumption | 5.0 nW/MHz/gate                                                          |

The design, implemented for FPGA, was modified in order to cope with the requirements of the ASIC design flow. The adder and multiplier that best fit the power and area requirements (found in Sections 5.1.1 and 5.1.2, respectively) are included in the hierarchy. The I/O pads are included in the top level, and timing and area constraints are given to the synthesis tool. The following tables summarize the results.

Table 5.6: Critical path summary.

| Start point                                | Q_FB_REG_U8/DO_reg[5]                   |
|--------------------------------------------|-----------------------------------------|
| Endpoint                                   | PHASE_DET0/PHASE_DET_REG_U11/DO_reg[15] |
| **Point**                                  | **Path**                                |
| clock CLK (rise edge)                      |                                         |
| clock network delay (ideal)                | 0.00                                    |
| Q_FB_REG_U8/DO_reg[5]/CK                   | 0.00 r                                  |
| PHASE_DET0/PHASE_DET_REG_U11/DO_reg[15]/D  | 6.02 f                                  |
| data arrival time                          |                                         |
| clock CLK (rise edge)                      |                                         |
| clock network delay (ideal)                | 7.00                                    |
| PHASE_DET0/PHASE_DET_REG_U11/DO_reg[15]/CK |                                         |
| library setup time                         | 6.83                                    |
| data required time                         |                                         |
| data required time                         |                                         |
| data arrival time                          |                                         |
| slack (MET)                                | 0.81                                    |

#### 5.4.2.1 Timing analysis

The timing constraints for the ASIC implementation are: clock period of 7[ns], input delay of 2[ns], output delay of 3[ns] and clock uncertainty of 0.4[ns]. Table 5.6 shows the critical path summary. The longest path runs from the feedback register of the Q component to the output of the difference, passing through a multiplier and a subtractor, as shown in Figure 5.12.



Figure 5.12: ASIC critical path description, shown by the red line.
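The slack reported for the critical path is the difference between the data required time and the data arrival time. A minimal sketch of that check, using the path values from the critical path report and the 7[ns] clock constraint (variable names are illustrative):

```python
CLOCK_PERIOD_NS = 7.0     # clock period constraint
DATA_REQUIRED_NS = 6.83   # path value after the library setup time
DATA_ARRIVAL_NS = 6.02    # path value at the endpoint D pin

slack_ns = DATA_REQUIRED_NS - DATA_ARRIVAL_NS
f_clk_mhz = 1e3 / CLOCK_PERIOD_NS   # clock frequency implied by the constraint

print(f"slack = {slack_ns:.2f} ns (constraint met: {slack_ns >= 0.0})")
print(f"clock frequency = {f_clk_mhz:.2f} MHz")
```

A positive slack means that the constraint is met. The 7[ns] period corresponds to about 142.86[MHz].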

#### 5.4.2.2 Power analysis

Table 5.7 shows the summary of the power consumption. It is important to mention that the power estimated here considers only the logic cells. In order to obtain a reliable value of the power consumption, P&R has to be carried out, so that the power estimation includes the power consumption of the networks.

The core, that is, the digital system without the I/O pads, consumes only 0.451[mW].

Table 5.7: Power summary.

| Parameter                        | Value        |
|----------------------------------|--------------|
| Global operating voltage         | 0.9          |
| Power-specific unit information: |              |
| Voltage units                    | 1[V]         |
| Capacitance units                | 1.000000[pF] |
| Time units                       | 1[ns]        |

| Hierarchy | Switch power [mW] | Int power [mW] | Leak power [$\mu$W] | Total power [mW] | %     |
|-----------|-------------------|----------------|---------------------|------------------|-------|
| Top Level | 0.984             | 1.170          | 61.6                | 2.216            | 100.0 |
| I/O Pads  | 0.0593            | 0.83           | 10.9                | 0.9              | 40.7  |
| CFB_Core  | 0.0857            | 0.325          | 40.5                | 0.451            | 20.4  |
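Note that the leakage column is given in microwatts while the other columns are in milliwatts. The following sketch, with illustrative names, sums the three components of each row and compares the result with the reported total.

```python
# (switching [mW], internal [mW], leakage [uW], reported total [mW])
ROWS = {
    "Top Level": (0.984, 1.170, 61.6, 2.216),
    "I/O Pads": (0.0593, 0.83, 10.9, 0.9),
    "CFB_Core": (0.0857, 0.325, 40.5, 0.451),
}


def total_mw(switch_mw: float, internal_mw: float, leak_uw: float) -> float:
    """Sum the three power components, converting leakage from uW to mW."""
    return switch_mw + internal_mw + leak_uw / 1000.0


totals = {name: total_mw(*row[:3]) for name, row in ROWS.items()}
for name, row in ROWS.items():
    print(f"{name:10s} computed {totals[name]:.4f} mW, reported {row[3]} mW")
```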

#### 5.4.2.3 Area analysis

The area figure only counts the logic cells; therefore, to obtain a reliable estimation of the area, P&R has to be carried out. Considering only the logic cells, the area of the core is only 0.019699 square millimeters. The I/O pads are the elements that require the most area.

| Item                   | Value                   |
|------------------------|-------------------------|
| Number of ports        | 87                      |
| Number of nets         | 274                     |
| Number of cells        | 111                     |
| Number of references   | 16                      |
| Combinational area     | 456966.74 [$\mu m^2$]   |
| Noncombinational area  | 315895.58 [$\mu m^2$]   |
| Total cell area        | 772862.32 [$\mu m^2$]   |

Hierarchical area distribution:

| Hierarchical cell | Global cell area, absolute total [$\mu m^2$] | Global cell area, percent total | Local cell area, combinational [$\mu m^2$] | Local cell area, noncombinational [$\mu m^2$] | Local cell area, black boxes [$\mu m^2$] |
|-------------------|----------------------------------------------|---------------------------------|--------------------------------------------|-----------------------------------------------|------------------------------------------|
| Top level         | 772862.5                                     | 100                             | 1262                                       | 0                                             | 0                                        |
| CFB_Core          | 19699                                        | 2.5                             | 3                                          | 0                                             | 0                                        |

Table 5.8: Core area summary.

## 5.5 Summary

In this chapter the implementation in FPGA and the synthesis in ASIC were carried out. To implement the digital system, the adder and multiplier that perform best in an FPGA and in an ASIC were found first. It was shown that designing for an FPGA differs from designing for an ASIC, especially in the arithmetic units. For an FPGA it is better to use the proprietary cores that are optimized for the device, because they take advantage of the built-in logic created for arithmetic operations.

For an FPGA implementation, the smallest FPGA of the Xilinx Spartan-3 family was chosen to implement the digital design. For ASIC the technology chosen was 90[nm] and the logic cell library from Faraday Technology Corporation.

To implement the design several processes were executed: design entry, synthesis, place and route, and simulation. Special emphasis was put on simulation due to the characteristics of the system: as it is a feedback system, reliable feedback signals are required to verify the design.

An HDL co-simulation environment was created to simulate the mixed-signal Cartesian feedback. With it, test vectors of the inputs and outputs were created and used in the test bench for pre-synthesis, post-synthesis and post-place and route verification.

The results for a Xilinx Spartan-3 FPGA (X3s50-4tq144) implementation are:

- Clock frequency: 83.33[MHz].
- Power consumption: 33.31[mW].
- Latency: 180[ns]

The system delay obtained in Section 4.7 is 400[ns]; therefore, 220[ns] remain available for the latency introduced by the converters.
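The delay budget can be written out explicitly. The sketch below uses the 12[ns] clock period constraint of the FPGA; expressing the latency in clock cycles is an interpretation added here, not a figure stated in the text.

```python
SYSTEM_DELAY_NS = 400.0    # total loop delay from Section 4.7
FPGA_LATENCY_NS = 180.0    # latency of the digital design in the FPGA
CLOCK_PERIOD_NS = 12.0     # FPGA clock period constraint (83.33 MHz)

converter_budget_ns = SYSTEM_DELAY_NS - FPGA_LATENCY_NS
latency_cycles = FPGA_LATENCY_NS / CLOCK_PERIOD_NS

print(f"budget left for the converters: {converter_budget_ns:.0f} ns")
print(f"digital latency: {latency_cycles:.0f} clock periods")
```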

The results for a 90[nm] ASIC implementation are:

- Clock frequency: 142.8 [MHz].
- Power consumption: 2.216[mW].
- Latency: 105[ns]

However, these are not the definitive values. One process still has to be completed in order to finish the ASIC implementation and obtain the final results: the layout.

A Radio Frequency (RF) system is composed of several electronic circuit blocks, each with a specific function. One of them is the Power Amplifier (PA), whose function is to amplify the signal to be transmitted. As a consequence, the PA is the main consumer of the battery power budget in a transmitter device.

Power consumption is of the utmost importance in mobile and satellite applications. In today's appliances, low weight and reduced dimensions impose limitations on batteries and solar panels. In order to cope with a limited power budget, the electronics must be as efficient as possible. For a PA to be efficient, all of the available power has to be delivered to the output with no losses on the way. The most power efficient types of RF PAs are generally non-linear, and the amplified signal is distorted.

Distortion of the envelope and phase generates intermodulation components outside the frequency band of interest and therefore pollutes adjacent frequency bands. Moreover, the characteristics of an amplifier do not remain static: operating conditions such as temperature, both internal and external, and aging affect the amplifier's characteristics.

To solve this non-linear behavior, a linearization mechanism must be applied to reduce the distortion, and it has to be robust enough to cope with these changes.

In this thesis work the study, modeling and implementation of a mixed-signal Cartesian feedback was carried out. Cartesian feedback is a powerful means to achieve linearity; however, it is not free of complications. The biggest threat to a feedback system is instability; therefore, to have a robust linearization with feedback, stability has to be ensured for the bandwidth of interest. Finding adequate values for the three most important parameters in the system (system delay, loop gain and bandwidth) makes it possible to find the condition for stability. It was shown that these variables trade off against each other: an increase in one requires a decrease in the others to maintain stability.

Having the system stable is not the only problem to solve. The system delay and the cross-coupling of the Cartesian components cause a phase shift that also threatens the stability. A method to detect and correct the phase shift at baseband was analyzed and a digital solution was implemented.

## 6.1 Conclusions

A model for the mixed-signal Cartesian feedback was developed and implemented. The requirement of a second order anti-imaging filter in the forward path made the system unstable. The stability analysis proved that compensation was required. In order to compensate the system, a zero had to be included in the open loop transfer function. In a practical implementation a zero is always accompanied by at least one pole, so a lead compensator was proposed as a solution. From the point of view of the open loop transfer function, the zero can be located in the forward or in the feedback path; therefore, two compensation alternatives were analyzed: compensation in the forward path and compensation in the feedback path. The compensation in the feedback path was selected since it produces a better response of the system (in time and in frequency). It was also proved that compensation in the forward path forces an increase of the sampling frequency due to the interaction of the zero in the system; as a consequence, the power consumption increases.

For a signal bandwidth on the order of tens of kilohertz, it was found that achieving a phase margin of 60° requires a compromise between the loop gain and the system delay. The primary contributor to the system delay was the digital implementation of the phase detection and correction, and the secondary contributor was the ADC and DAC converters. To leave enough system delay, the compromise found was a loop gain of 10 and a system delay of 400[ns].
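The nature of this trade-off can be illustrated with a deliberately simplified loop: a single dominant pole combined with a pure time delay, $L(s) = K e^{-s\tau}/(1 + s/\omega_p)$. This is an assumption made here for illustration. The loop analyzed in this thesis also contains the anti-imaging filter and the lead compensator, so the numbers produced by this sketch are not expected to reproduce the 60° result. The pole location at 9.6[kHz] and the function name are also hypothetical.

```python
import math


def phase_margin_deg(loop_gain: float, pole_hz: float, delay_s: float) -> float:
    """Phase margin of L(s) = K * exp(-s*tau) / (1 + s/w_p), valid for K > 1."""
    w_p = 2.0 * math.pi * pole_hz
    w_c = w_p * math.sqrt(loop_gain ** 2 - 1.0)  # frequency where |L(jw)| = 1
    lag_pole = math.atan(w_c / w_p)              # phase lag of the pole [rad]
    lag_delay = w_c * delay_s                    # phase lag of the delay [rad]
    return 180.0 - math.degrees(lag_pole + lag_delay)


K = 10.0         # loop gain
F_POLE = 9.6e3   # assumed dominant pole frequency [Hz]

for tau in (0.0, 400e-9, 2e-6):
    pm = phase_margin_deg(K, F_POLE, tau)
    print(f"delay = {tau * 1e9:6.0f} ns -> phase margin = {pm:.1f} deg")
```

In this model the delay contributes a phase lag proportional to the crossover frequency, which itself grows with the loop gain. A larger gain or a wider bandwidth therefore leaves less room for delay.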

The architecture selected to perform the rotation and the magnitude computation was the folded CORDIC. The phase shift detection was obtained by applying $I \cdot Q' - Q \cdot I'$ followed by an integration. Optimizations made to the architecture allowed four multipliers to be discarded: two in the rotation CORDIC, one in the magnitude CORDIC and one in the integrator. As a consequence, the use of resources is reduced, and hence the power consumption.
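The detector expression can be checked numerically. If the feedback vector $(I', Q')$ is the input vector $(I, Q)$ rotated by an angle $\varphi$, then $I \cdot Q' - Q \cdot I' = (I^2 + Q^2)\sin\varphi$, so the sign of the result gives the direction of the correction. The sketch below is a floating-point illustration with hypothetical names and an arbitrary integrator gain; it does not model the fixed-point CORDIC implementation.

```python
import math


def phase_detector(i_ref, q_ref, i_fb, q_fb):
    """Cross product I*Q' - Q*I' of the input and feedback vectors."""
    return i_ref * q_fb - q_ref * i_fb


def rotate(i, q, phi):
    """Rotate the vector (i, q) by phi radians."""
    return (i * math.cos(phi) - q * math.sin(phi),
            i * math.sin(phi) + q * math.cos(phi))


i_ref, q_ref = 0.6, -0.3
phi = math.radians(20.0)                 # hypothetical loop phase shift
i_fb, q_fb = rotate(i_ref, q_ref, phi)   # feedback seen by the detector

error = phase_detector(i_ref, q_ref, i_fb, q_fb)
expected = (i_ref ** 2 + q_ref ** 2) * math.sin(phi)

# Integrate the detector output to estimate the phase shift.
theta = 0.0   # integrator state: current phase estimate [rad]
MU = 0.5      # hypothetical integrator gain
for _ in range(200):
    i_c, q_c = rotate(i_fb, q_fb, -theta)   # de-rotate the feedback
    theta += MU * phase_detector(i_ref, q_ref, i_c, q_c)

print(error, expected, math.degrees(theta))
```

With a constant input vector the integrator settles at the applied phase shift, at which point the detector output is zero.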

The results are summarized in the following table:

| Item            | FPGA <sup>1</sup> | ASIC <sup>2</sup>   | Units |
|-----------------|-------------------|---------------------|-------|
| Bandwidth       | 9.6               |                     | kHz   |
| Loop gain       | 10                |                     |       |
| System delay    | 400               |                     | ns    |
| Bit word        | 12                |                     | bit   |
| Clock frequency | 83.33             | 142.8               | MHz   |
| Latency         | 180               | 105                 | ns    |
| Power           | 33.31             | 2.216 <sup>3</sup>  | mW    |

Table 6.1: Implementation results.

It is important to mention that the FPGA implementation consumes only 1.96% of the power budget. The ASIC implementation looks promising, but the layout still has to be performed to obtain a complete estimation of the power consumption.

<sup>&</sup>lt;sup>3</sup>Xilinx Spartan-3, X3s50-4tq144.

 $<sup>^3\</sup>mathrm{UMC}$  90nm process technology.

 $<sup>^{3}</sup>$ The power only takes into account the logic cells and I/O pads.

## 6.2 Recommendations

The implementation using 12-bit words requires 84 I/O pads. A pad takes up a considerable portion of the die area, making the design more expensive. A simple improvement is to replace the digital to analog converter of the magnitude signal with a pulse width modulator (PWM) to drive a LP-FA [5]. This solution increases the hardware in the FPGA/ASIC but removes 11 pins. Also, the power consumed by the converter is reduced to that consumed by the PWM.

In this work the layout of the ASIC implementation was left out. Therefore, it is not yet clear how much power will be consumed by the ASIC. The estimation obtained so far only includes the I/O pads and the logic cells. The next step would be to include the PWM, discarding 11 I/O pads. This accounts for 0.235[mW] (10.6% of the total power) less power consumed and 0.0951[mm<sup>2</sup>] (12.3% of the total area) less area in the die.

The library used was the standard cell library FSD0A\_A from Faraday Technology Corporation, for UMC's 90nm logic SP-RTV (Low K) process (Standard Performance, Regular $V_{th}$). However, Faraday also provides another library that can be used to reduce the power consumption further: the one for UMC's 90nm logic LL-RTV (Low K) process (Low Leakage, Regular $V_{th}$).

# A.1 Study of a typical Cartesian feedback

Figure A.1 introduced in chapter 2 and repeated here for convenience shows a typical configuration of a Cartesian feedback system. The input signal has been transformed in its In-phase (I) and Quadrature (Q) components (Cartesian components). These signals are compared to its respective feedback components (I') and (Q') which were obtained from an attenuated sample of the output signal and down-converted to baseband.

The error obtained by the comparison is then fed to an up-converter in which the components are modulated by a carrier frequency and then combined prior to entering the PA. As the error tends to zero the output signal tends to be equal to the input signal. Linearity is achieved by a high loop gain as was shown in previous sections.

The loop compensation block is included to make clear that in some cases the system becomes unstable when the loop is closed, and some compensation mechanism is therefore required to ensure stability.

In order to come up with a simple model for the Cartesian feedback system, in which analysis of stability can be done easily with enough accuracy, the following assumptions are made:

### Assumptions

- 1. "The modulation bandwidths are narrowband (10's to 100's kHz) relative to the RF components bandwidth in the loop (100's to 1000's MHz). It is therefore reasonable to assume that for low frequencies the loop response will be dominated by the compensation filter." [4]
- 2. The Cartesian feedback components are wideband, linear and no cross-coupling exists between the I and Q paths.



Figure A.1: Diagram of a typical Cartesian feedback system.

- 3. The asymptotic-gain model holds: the gain of the feedback transfer function, $\beta(s)$ ($\beta(s) < 1$), in the frequency range of interest determines the gain of the closed loop transfer function when the gain in the forward path is large (ideally infinite).

The above assumptions tell us that only frequencies around baseband are dominant; the contribution of poles at high frequencies can therefore be neglected. Usually the loop compensation and the low pass filters that limit the frequencies to the bandwidth of interest contribute the dominant poles. Assuming that no cross-coupling exists between the I and Q paths allows each path to be treated independently. This assumption simplifies the analysis to a single input single output (SISO) system.

Even though high frequencies are neglected, a more accurate model of the system will have to consider the dynamics of the forward and feedback path components (up/down converters and PA).

"The simplest way to reproduce low frequency requirements and high frequency characteristics is to model the loop compensation directly combined with a time delay" [4]. Modeling the dynamics of the system with a time delay in this way makes the representation of the system more accurate. It will be shown later that including a delay also helps to model the mixed-signal system more accurately.

The open loop transfer function can be expressed as follows:

$$L(s) = \frac{Ke^{-Ts} \prod_{i=1}^{n} (s/z_i + 1)}{\prod_{i=1}^{n} (s/p_i + 1)},$$
(A.1)

where  $K = G(0)\beta(0)$  represents the DC loop gain which includes the gain of the compensation and baseband filters, the gain of the RF stages (up and down converters and PA) and the gain of the feedback path. The delay  $(e^{-Ts})$  models the phase shift introduced by the dynamics of the open loop. See appendix A.1.1. The poles  $p_i$  and zeros  $z_i$  represent the dominant absolute poles and zeros.

### A.1.1 Mathematical representation of the delay in the frequency domain

Before continuing with the Cartesian feedback study, a mathematical representation of the exponential  $e^{-Ts}$  is required. The motivation is to find a suitable expression for a quantitative analysis. Three equivalent representations for the exponential function are presented:

• The Euler formula [40]:  $e^{j\omega} = cos(\omega) + j \times sin(\omega)$ . This representation is useful when calculating the phase and magnitude (Section A.1.2).

With $s = \sigma + j\omega$:

$$e^{-Ts} = e^{-T\sigma} e^{-jT\omega} = e^{-T\sigma} \left(\cos(\omega T) - j \sin(\omega T)\right) \tag{A.2}$$

• The Maclaurin series [40]: The Maclaurin series is the Taylor series centered at zero:

$$e^{-Ts} = 1 - Ts + \frac{T^2}{2!}s^2 - \frac{T^3}{3!}s^3 + \dots + \frac{(-T)^n}{n!}s^n + \dots \tag{A.3}$$

The Maclaurin series of the exponential only adds zeros to the open loop transfer function. Using an adequate order in the series, the order of the characteristic polynomial is preserved.

• The Padé approximant [41]: is an approximation of a function by a rational function:

$$e^{-Ts} \approx \frac{1 - \frac{Ts}{2} + \frac{(Ts)^2}{10} - \frac{(Ts)^3}{120}}{1 + \frac{Ts}{2} + \frac{(Ts)^2}{10} + \frac{(Ts)^3}{120}} \tag{A.4}$$

The Padé approximant gives a much better approximation than the Maclaurin series, because it adds the same order of zeros and poles to the loop gain transfer function.

Figure A.2 shows the root locus for the Padé approximant and the Maclaurin series for order four. In both plots each pole will be pushed to the right side of the S-plane by a zero.

The conclusion that can be obtained from Figure A.2 is that the presence of a delay in a system will tend to make the system less stable.

The Padé approximant of the exponential function is modeled as n poles in the left side of the S-plane and n zeros in the right side of the S-plane, whereas the Maclaurin series is modeled as n zeros only, in the right side of the S-plane, with n the order of the approximation. As the order increases, the approximation approaches the exponential function $e^x$.

For the Maclaurin series or the Padé approximant a question arises: what should be the minimum value of n to achieve a good approximation? A large value of n will increase the complexity of the loop gain transfer function when it is represented as a polynomial or rational polynomial.

Figure A.3 shows a comparison of the exponential function  $e^{-Ts}$  with four different Padé approximations (first, second, third and sixth order) and four Maclaurin series (first, second, third and sixth order). The x-axis is normalized to s/T. For second order and higher, the Padé approximation shows a small difference from the  $e^{-Ts}$  function.



Figure A.2: Root locus of the delay approximation: (a) Fourth order Padé approximant. (b) Fourth order Maclaurin series.



Figure A.3: Comparison of the representation of $e^{-Ts}$ with four approximations: (a) Maclaurin series: first, second, third and sixth order. (b) Padé approximant: first, second, third and sixth order.

The Maclaurin series, on the other hand, only becomes similar to the $e^{-Ts}$ function at the sixth order.

The system delay represented by the exponential function involves an infinite number of poles and zeros, as can be seen with the Padé approximant. When using software tools such as MATLAB®, the third order Padé approximant is used.
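The comparison between the two approximations can be reproduced numerically. The following is a minimal sketch, not the code used for Figure A.3: it evaluates the third order Maclaurin series (A.3) and the third order Padé approximant (A.4) on the imaginary axis at the normalized frequency $T\omega = 1$, a value chosen here only for illustration.

```python
import cmath
import math


def maclaurin_delay(x: complex, order: int) -> complex:
    """Truncated Maclaurin series of exp(-x), with x = T*s, as in (A.3)."""
    return sum((-x) ** k / math.factorial(k) for k in range(order + 1))


def pade3_delay(x: complex) -> complex:
    """Third order Pade approximant of exp(-x), with x = T*s, as in (A.4)."""
    num = 1 - x / 2 + x ** 2 / 10 - x ** 3 / 120
    den = 1 + x / 2 + x ** 2 / 10 + x ** 3 / 120
    return num / den


# Normalized frequency T*omega = 1 rad (illustrative assumption).
x = 1j * 1.0
exact = cmath.exp(-x)

err_maclaurin = abs(maclaurin_delay(x, 3) - exact)
err_pade = abs(pade3_delay(x) - exact)
pade_magnitude = abs(pade3_delay(x))

print(f"|Maclaurin(3) - exact| = {err_maclaurin:.2e}")
print(f"|Pade(3)      - exact| = {err_pade:.2e}")
print(f"|Pade(3)| on the jw axis = {pade_magnitude:.6f}")
```

On the imaginary axis the numerator of the Padé approximant is the complex conjugate of its denominator, so, like the ideal delay, it changes only the phase and not the magnitude. The truncated Maclaurin series does not have this property.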

### A.1.2 Stability analysis

In (A.1) a general representation of the open loop transfer function was determined based on the mentioned assumptions. In order to determine the effective number of poles and zeros a more detailed inspection of the system must be carried out.

There are two alternatives to determine the open loop transfer function of the system. The first is to check the distortion behavior of the system with a spectrum analyzer, then find the distortion transfer function and include its reciprocal in the open loop transfer function in order to cancel the distortion [11]. This method is a customized compensation, but it does not take into account the stability of the loop.

The second alternative takes stability as the main restriction: it ensures a sufficient amount of loop gain to reduce the distortion without bringing the system to instability. In the previous section, it was shown that a delay in the system loop could lead the system towards instability; therefore compensation may be required. The following analysis assumes the simplest and most general case of compensation, a single dominant pole. It will also allow us to understand the relationship between the variables that play a major role in the stability of the system.

Figure A.4 shows an equivalent diagram of the Cartesian feedback system of Figure A.1 after applying the assumptions established in section A.1. Here, A is the gain provided by the PA and $\beta$ is the attenuation in the feedback path. It is assumed that the loop compensation does not provide any gain, only a dominant pole at a frequency given by $p_1$. The system delay, represented by $e^{-Ts}$, is expressed in the frequency domain and T represents the delay of all the components in the loop. Only one channel (I or Q) is analyzed, under the assumption that no cross-coupling exists between I and Q.

Figure A.4: Equivalent diagram of a first-order Cartesian loop (one channel only) [11].

The open loop and closed loop transfer function are as follows:

• Open-loop transfer function:

$$L(s) = \frac{A\beta}{\frac{s}{p_1} + 1} e^{-Ts} \tag{A.5}$$

• Closed-loop transfer function:

$$H(s) = \frac{y(s)}{r(s)} = \frac{Ap_1 e^{-Ts}}{p_1 + s + A\beta p_1 e^{-Ts}}$$
(A.6)

The characteristic polynomial CP(s) = 1 + L(s) determines the condition for stability of the closed loop system. Equating CP(s) = 0 the following equations for the magnitude and phase are obtained:

• Magnitude (or Gain):

$$Gain(s) = |L(s)| = \frac{A\beta p_1 e^{-T\sigma}}{\sqrt{(\sigma + p_1)^2 + \omega^2}}$$
 (A.7)

• Phase:

$$Phase(s) = \angle L(s) = -\left[tan^{-1}\left(\frac{\omega}{\sigma+p_1}\right) + \omega T\right]$$
(A.8)

where $s = \sigma + j\omega$.

The gain margin is the reciprocal of the magnitude at the frequency $\omega_{\pi}$ at which the phase reaches $-\pi$, evaluated on the imaginary axis ($\sigma = 0$).

• Gain Margin:

$$GM = \left| \frac{1}{L(\omega_{\pi})} \right|$$
$$= \frac{\sqrt{p_1^2 + \omega_{\pi}^2}}{A\beta p_1}$$
(A.9)

The phase margin is the amount of phase available at gain crossover frequency. The crossover frequency ( $\omega_{UG}$ ) is obtained from the gain when it reaches unity.

• Phase Margin:

$$PM = \angle L(\omega_{UG}) + \pi$$
$$= -\left[tan^{-1}\left(\frac{\omega_{UG}}{p_1}\right) + T\omega_{UG}\right] + \pi$$
(A.10)

Calculating the gain and phase margin is done as follows:

• **GM:** Setting the phase to $-\pi$, $\omega_{\pi}$ is found. For small values of $p_1T$ such that $p_1T \ll \pi/2$, $\omega_{\pi} \gg p_1$ and the pole contributes a phase of approximately $\pi/2$, so $\omega_{\pi} \approx \frac{\pi}{2T}$. Evaluating in (A.9),

$$GM \approx \frac{\pi}{2A\beta p_1 T}$$
 (A.11)

• **PM:** Setting the magnitude of L(s) to one and assuming $A\beta \gg 1$, $\omega_{UG} = p_1 \sqrt{A^2 \beta^2 - 1} \approx p_1 A\beta$ and $\tan^{-1}(A\beta) \approx \pi/2$, so that

$$PM \approx \frac{\pi}{2} - A\beta p_1 T \tag{A.12}$$

• For a stable loop the phase margin must be positive, then,

$$A\beta p_1 T < \frac{\pi}{2} \tag{A.13}$$

A relationship between the DC loop gain $(A\beta)$, the delay (T) and the dominant pole frequency $(p_1)$ has been found. Since (A.13) bounds their product, these parameters have to be traded off against each other. For instance, an increment in the delay has to be followed by a reduction in the DC loop gain or in the dominant pole frequency.
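The approximations (A.11) and (A.12) can be compared against the exact margins of the first-order loop with delay. The sketch below is an illustration only: the values $A\beta = 10$, $p_1 = 1$ (normalized) and $T = \pi/60$ are assumptions chosen so that $A\beta p_1 T = \pi/6$, and the function names are hypothetical.

```python
import math


def loop_phase(omega, p1, T):
    """Phase of L(jw) from (A.8) with sigma = 0."""
    return -(math.atan(omega / p1) + omega * T)


def loop_gain(omega, K, p1):
    """Magnitude of L(jw) from (A.7) with sigma = 0; K is the DC loop gain A*beta."""
    return K * p1 / math.sqrt(p1 ** 2 + omega ** 2)


def margins(K, p1, T):
    """Exact gain margin (linear) and phase margin (rad) of (A.5); requires K > 1."""
    # Phase crossover: the phase decreases monotonically, so bisect on phase = -pi.
    lo, hi = 0.0, math.pi / T
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if loop_phase(mid, p1, T) > -math.pi:
            lo = mid
        else:
            hi = mid
    w_pi = 0.5 * (lo + hi)
    gm = 1.0 / loop_gain(w_pi, K, p1)
    # Gain crossover in closed form, from |L(jw)| = 1.
    w_ug = p1 * math.sqrt(K ** 2 - 1.0)
    pm = loop_phase(w_ug, p1, T) + math.pi
    return gm, pm


# Illustrative, normalized values (assumptions, not taken from the design).
K, p1, T = 10.0, 1.0, math.pi / 60.0

gm, pm = margins(K, p1, T)
gm_approx = math.pi / (2.0 * K * p1 * T)   # (A.11)
pm_approx = math.pi / 2.0 - K * p1 * T     # (A.12)

print(f"GM exact = {gm:.3f}, GM from (A.11) = {gm_approx:.3f}")
print(f"PM exact = {math.degrees(pm):.1f} deg, PM from (A.12) = {math.degrees(pm_approx):.1f} deg")
```

For these values both approximations stay within a few percent (gain margin) or a few degrees (phase margin) of the exact result, which supports their use for a first estimate.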

### A.2 The problem of instability in a Cartesian feedback system

In the previous section it was determined that for a typical system with feedback to be stable, (A.13) must hold. Often it is desired to have at least 60° of phase margin to ensure a small overshoot and sufficient speed in the response of the system. With this condition, (A.12) becomes,

$$A\beta p_1 T \le \frac{\pi}{6} \tag{A.14}$$
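As a worked example of (A.14), consider the loop gain of 10 quoted in the abstract. Placing the dominant pole at 9.6[kHz] is an assumption made here purely for illustration; the implemented loop contains additional poles and a lead compensator, so the resulting delay budget is only indicative.

$$
A\beta = 10 \;\Rightarrow\; p_1 T \le \frac{\pi}{60} \approx 0.052, \qquad
p_1 = 2\pi \cdot 9.6\,[\mathrm{kHz}] \;\Rightarrow\; T \le \frac{1}{120 \cdot 9600\,[\mathrm{Hz}]} \approx 0.87\,[\mu\mathrm{s}]
$$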

As was shown in chapter 2 the Class E PA introduces phase distortion for low values of the supply voltage. This distortion can be seen as a shift in time of the transmitted signal, therefore, besides the delay generated by the components in the loop, an extra delay is added by the PA. This extra delay will change the condition established in (A.14) and therefore, threaten the stability of the system.


Figure A.5: Typical Cartesian feedback: (a) Phase misalignment in the feedback LO. (b) Symbol rotation due to phase shift.

The impact of phase misalignment was described in [3] and is repeated here for convenience.

In the previous section I and Q were assumed to be not coupled. This assumption is not valid when sudden changes in the phase of the RF signal are present (for instance, AM-PM distortion originated in the PA). The phase variation in the PA plus the delay in the system and other effects such as temperature and aging contribute to the phase misalignment.

The phase misalignment can be seen as a phase mismatch of the local oscillator in the down-conversion, as shown in Figure A.5:

$$I' = (I\cos\omega t + Q\sin\omega t)\cos(\omega t + \varphi) \tag{A.15}$$

$$Q' = (I\cos\omega t + Q\sin\omega t)\sin(\omega t + \varphi) \tag{A.16}$$

where $\omega$ is the carrier frequency and $\varphi$ is the phase misalignment. Using trigonometric identities and filtering out the frequency components above the carrier frequency (at $2\omega$), equations (A.15) and (A.16) reduce to

$$I' = \frac{1}{2} \left( I \cos \varphi - Q \sin \varphi \right) \tag{A.17}$$

$$Q' = \frac{1}{2} \left( I \sin \varphi + Q \cos \varphi \right) \tag{A.18}$$
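The step from (A.15) and (A.16) to (A.17) and (A.18) can be checked numerically. In the sketch below, averaging over one carrier period stands in for the low pass filter that removes the $2\omega$ components; the sample values of I, Q and $\varphi$ are arbitrary assumptions.

```python
import math


def downconvert(i, q, phi, n=1000):
    """Average (A.15)/(A.16) over one carrier period.

    The average removes the components at twice the carrier frequency,
    which is the role of the low pass filter in the text.
    """
    acc_i = 0.0
    acc_q = 0.0
    for k in range(n):
        wt = 2.0 * math.pi * k / n
        rf = i * math.cos(wt) + q * math.sin(wt)
        acc_i += rf * math.cos(wt + phi)
        acc_q += rf * math.sin(wt + phi)
    return acc_i / n, acc_q / n


# Arbitrary test values (assumptions for illustration).
I, Q, PHI = 0.7, -0.3, math.radians(25.0)

i_fb, q_fb = downconvert(I, Q, PHI)
i_expected = 0.5 * (I * math.cos(PHI) - Q * math.sin(PHI))   # (A.17)
q_expected = 0.5 * (I * math.sin(PHI) + Q * math.cos(PHI))   # (A.18)

print(f"I' = {i_fb:.6f} (A.17 gives {i_expected:.6f})")
print(f"Q' = {q_fb:.6f} (A.18 gives {q_expected:.6f})")
```

For $\varphi \neq 0$ each feedback component contains a contribution from both I and Q, which is the cross-coupling discussed in this section.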

The errors $e_I(s)$ and $e_Q(s)$ are:

$$e_I(s) = \frac{I_{in}}{1 + L(s)} \tag{A.19}$$

$$e_Q(s) = \frac{Q_{in}}{1 + L(s)} \tag{A.20}$$

where,

$$L(s) = AB \cdot G(s)\beta(s) \tag{A.21}$$

The cross-coupling effect in the error signal of one path (for instance the In-phase path) is obtained by injecting a signal only in $I_{in}$ and keeping $Q_{in}$ at zero. The error is calculated as follows (the same result can be obtained for $e_Q(s)$):

$$e_I(s) = \frac{I_{in}}{1 + L(s)\cos\varphi + \frac{(L(s)\sin\varphi)^2}{1 + L(s)\cos\varphi}}$$
(A.22)

where

$$L_{eff}(s) = L(s)\cos\varphi + \frac{(L(s)\sin\varphi)^2}{1 + L(s)\cos\varphi}$$
(A.23)

is the effective loop gain as a function of the frequency and the phase misalignment [3].

For $\varphi = \pi/2$, $\cos\varphi = 0$ and (A.23) reduces to $L_{eff}(s) = [L(s)]^2$: as the phase misalignment increases, the effective loop gain tends towards the square of the nominal loop gain, which doubles both its gain in decibels and its phase lag. It is therefore important to consider a compensation mechanism that keeps the system stable over the whole range $[0, \pi/2]$.
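The dependence of the effective loop gain on the misalignment can be illustrated by evaluating (A.23) directly. This is a minimal sketch; the nominal loop gain of 10 and the function name are assumptions for illustration.

```python
import math


def effective_loop_gain(loop_gain: complex, phi: float) -> complex:
    """Effective loop gain L_eff of (A.23) for a phase misalignment phi (rad)."""
    c = loop_gain * math.cos(phi)
    s = loop_gain * math.sin(phi)
    return c + s ** 2 / (1 + c)


L_NOMINAL = 10.0  # illustrative nominal loop gain (assumption)

l_eff_aligned = effective_loop_gain(L_NOMINAL, 0.0)
l_eff_quadrature = effective_loop_gain(L_NOMINAL, math.pi / 2.0)

for degrees in (0, 30, 60, 90):
    value = effective_loop_gain(L_NOMINAL, math.radians(degrees))
    print(f"phi = {degrees:2d} deg -> L_eff = {abs(value):.2f}")
```

With perfect alignment the effective loop gain equals the nominal one, and at $\varphi = \pi/2$ it equals its square, in agreement with the two limiting cases of (A.23).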

It will be shown, in Section 3.2.5, that the phase misalignment can be corrected by performing a rotation of the Cartesian components. For that to work, a mechanism to detect the phase will be required in order to provide the phase value to the rotation.

# B

## B.1 MATLAB®/Simulink® model simulation environment



Figure B.1: Simulink model of the Cartesian feedback system. The compensation filter is in the forward path.

Figure B.1 shows the Simulink model used to mathematically test the behavior of the Cartesian feedback system. The model assumes a constant envelope modulation scheme; schemes such as BPSK, GMSK or $\pi/4$-DQPSK are suitable for this system. The model also assumes that the compensation filter is in the forward path. The system can also be modeled with the compensation filter in the feedback path with one simple adjustment.

The model consists of the following main blocks:

**Rotation.** As its name indicates, this block computes the rotation of the Cartesian components. The In-phase and Quadrature error signals are scaled in such a way that the output vector forms a new angle. The new angle results from adding the input signal *angle* to, or subtracting it from, the angle formed by the input error vector. This block computes:

$$I_{rot} = I \cdot cos(angle) + Q \cdot sin(angle)$$
(B.1)

$$Q_{rot} = Q \cdot cos(angle) - I \cdot sin(angle) \tag{B.2}$$

The difference with respect to the CORDIC in circular rotation mode is that no scaling factor is present here. The result is the correct value of the rotation.
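The rotation of (B.1) and (B.2) can be sketched in a few lines. The example is an illustration rather than the Simulink block itself: it applies the rotation to a feedback vector misaligned according to (A.17) and (A.18) and shows that rotating by the misalignment angle recovers the original components, scaled by the factor 1/2 of the down-conversion. The test values are assumptions.

```python
import math


def rotate(i, q, angle):
    """Rotation of the Cartesian components as in (B.1) and (B.2)."""
    i_rot = i * math.cos(angle) + q * math.sin(angle)
    q_rot = q * math.cos(angle) - i * math.sin(angle)
    return i_rot, q_rot


# Arbitrary test values (assumptions for illustration).
I, Q, PHI = 0.6, 0.8, math.radians(40.0)

# Feedback components with phase misalignment PHI, from (A.17) and (A.18).
i_fb = 0.5 * (I * math.cos(PHI) - Q * math.sin(PHI))
q_fb = 0.5 * (I * math.sin(PHI) + Q * math.cos(PHI))

# Rotating by the misalignment angle undoes the coupling between I and Q.
i_cor, q_cor = rotate(i_fb, q_fb, PHI)

print(f"corrected I' = {i_cor:.6f}, expected {0.5 * I:.6f}")
print(f"corrected Q' = {q_cor:.6f}, expected {0.5 * Q:.6f}")
print(f"magnitude before = {math.hypot(i_fb, q_fb):.6f}, after = {math.hypot(i_cor, q_cor):.6f}")
```

The magnitude of the vector is unchanged by the rotation, consistent with the absence of a scaling factor noted above.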

**Time delay.** Allows a delay to be inserted in each loop of the system. It models a Padé approximant in the frequency domain. For a more complete model, the time delay was also added to the feedback path.

**Anti-imaging filter (AIF).** The order of this filter was calculated in Section 3.2.3.1. Two poles are modeled in the frequency domain as:

$$AIF(s) = \frac{1}{\frac{s}{p_1} + 1} \cdot \frac{1}{\frac{s}{p_2} + 1} = \frac{1}{\frac{1}{p_1 p_2} s^2 + \frac{p_1 + p_2}{p_1 p_2} s + 1} \tag{B.3}$$

where  $p_1 = p_2$  to ensure maximum pass band in the bandwidth of interest.

**Lead compensation filter (LCF).** One zero is required to compensate the system. This zero adds phase to the system, improving the phase margin. In practical implementations a zero cannot be implemented unless it is followed by a pole. A lead compensator filter has the transfer function

$$LCF(s) = \frac{\frac{s}{z_1} + 1}{\frac{s}{p_4} + 1}$$
(B.4)

where  $p_4 \gg z_1$ . With  $p_4$  one order of magnitude higher than  $z_1$ , it is considered a non-dominant pole. For purposes of analysis this pole is left out.
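The phase added by the lead compensator of (B.4) can be evaluated numerically. The sketch below uses a normalized zero at $z_1 = 1$ and a pole one decade above it, as stated in the text; the evaluation frequency $\sqrt{z_1 p_4}$ is the geometric mean of the two, where the phase lead of such a network is largest.

```python
import cmath
import math


def lcf(s: complex, z1: float, p4: float) -> complex:
    """Lead compensation filter of (B.4)."""
    return (s / z1 + 1) / (s / p4 + 1)


# Normalized zero and a pole one decade higher (normalization is an assumption).
Z1, P4 = 1.0, 10.0

w_max = math.sqrt(Z1 * P4)                    # frequency of maximum phase lead
max_lead = cmath.phase(lcf(1j * w_max, Z1, P4))
closed_form = math.asin((P4 - Z1) / (P4 + Z1))

print(f"maximum phase lead = {math.degrees(max_lead):.1f} deg at w = {w_max:.3f}")
print(f"closed form asin((p4 - z1)/(p4 + z1)) = {math.degrees(closed_form):.1f} deg")
```

The pole therefore limits the phase that the zero can contribute, which is why treating $p_4$ as non-dominant is an approximation that improves as $p_4$ moves further away from $z_1$.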

**Power amplifier (PA).** This block models a non-linear amplifier by means of a third order polynomial

$$PA(v) = K_1 \cdot v + K_2 \cdot v^2 + K_3 \cdot v^3$$
(B.5)

where $K_1$ is the linear gain. This term is always present; when modeling a linear amplifier, $K_1$ is the only coefficient and the output signal is an amplified copy of the input. $K_2$ is the second order coefficient. When $K_2 \neq 0$, the second order term of the polynomial contributes the second harmonic $(2\omega_{BW})$. $K_3$, the third order coefficient, is directly related to the third harmonic (when $K_3 \neq 0$). Modeling a power amplifier with a polynomial is particularly useful when intermodulation distortion (IMD) has to be analyzed. IMD is present when two or more tones, close to each other in frequency, are superimposed to form the input signal. Second order intermodulation products are not of importance because they are far from the bandwidth of interest [11]. Third order components, on the other hand, play an important role in the distortion of the output signal. For a two-tone input signal, for example, they create frequency components at $2 \cdot f_1 - f_2$ and $2 \cdot f_2 - f_1$, where $f_1$ and $f_2$ are the tone frequencies of the input signal. These components are close to the bandwidth of interest, and it is desired to eliminate them in order to obtain an amplified copy of the input signal without distortion.
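The origin of the third order intermodulation products can be illustrated with a two-tone test on the polynomial (B.5). The following sketch is not the Simulink model: the coefficients, the tone frequencies (expressed as DFT bin numbers) and the helper names are assumptions chosen for illustration.

```python
import cmath
import math


def pa(v, k1, k2, k3):
    """Third order polynomial amplifier model of (B.5)."""
    return k1 * v + k2 * v ** 2 + k3 * v ** 3


def tone_amplitude(x, k):
    """Amplitude of the component at DFT bin k (single-bin DFT, 0 < k < N/2)."""
    n = len(x)
    acc = sum(x[m] * cmath.exp(-2j * math.pi * k * m / n) for m in range(n))
    return 2.0 * abs(acc) / n


# Illustrative values (assumptions): N samples, two tones at bins F1 and F2.
N = 256
F1, F2 = 10, 11
K1, K2, K3 = 10.0, 0.5, -0.2
A = 1.0

v_in = [A * (math.cos(2 * math.pi * F1 * m / N) + math.cos(2 * math.pi * F2 * m / N))
        for m in range(N)]
v_out = [pa(v, K1, K2, K3) for v in v_in]

imd3_low = tone_amplitude(v_out, 2 * F1 - F2)    # component at 2*f1 - f2
imd3_high = tone_amplitude(v_out, 2 * F2 - F1)   # component at 2*f2 - f1
fundamental = tone_amplitude(v_out, F1)

print(f"fundamental at f1   : {fundamental:.4f}")
print(f"IMD3 at 2*f1 - f2   : {imd3_low:.4f}")
print(f"IMD3 at 2*f2 - f1   : {imd3_high:.4f}")
print(f"(3/4)*|K3|*A^3      : {0.75 * abs(K3) * A ** 3:.4f}")
```

Expanding the cubic term for two equal-amplitude tones gives an amplitude of $\frac{3}{4} K_3 A^3$ for each third order product, which the sketch reproduces. The second order term does not contribute at these frequencies.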

**Att.** The attenuation is required to reduce the magnitude of the output signal before down-conversion. The attenuation is also a component of the loop gain; therefore, it has to be as linear as possible. To obtain a linear attenuation, passive components that do not depend on frequency should be used. This ensures a linear behavior of the loop gain over the whole range of frequencies. Resistors are therefore the best solution, and a voltage divider with accurate resistors is recommended.

**Anti-aliasing filter.** This filter is required to eliminate aliasing. As was explained in Section 3.2.3.3, this filter can have a high cutoff frequency. Its order can also be higher than two in order to obtain a steeper rolloff, as long as the additional components do not affect the power consumption. Because of its high cutoff frequency, the poles of this filter are non-dominant. A second order Butterworth filter is a good alternative to implement this filter. The second order Butterworth filter (BWF) is modeled in frequency as:

$$BWF(s) = \frac{1}{\frac{1}{p_3^2}s^2 + \frac{1.414}{p_3}s + 1}$$
(B.6)
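The two filter models (B.3) and (B.6) can be compared at their pole frequency. The sketch below uses normalized pole frequencies (an assumption) and shows the attenuation of each filter at that frequency.

```python
import math


def aif(s: complex, p1: float, p2: float) -> complex:
    """Anti-imaging filter of (B.3): two real poles."""
    return 1 / ((s / p1 + 1) * (s / p2 + 1))


def bwf(s: complex, p3: float) -> complex:
    """Second order Butterworth filter of (B.6)."""
    return 1 / (s ** 2 / p3 ** 2 + 1.414 * s / p3 + 1)


def to_db(h: complex) -> float:
    """Magnitude in decibels."""
    return 20.0 * math.log10(abs(h))


# Normalized pole frequencies (assumption for illustration).
P = 1.0

aif_db = to_db(aif(1j * P, P, P))   # AIF with p1 = p2, evaluated at the pole frequency
bwf_db = to_db(bwf(1j * P, P))      # Butterworth evaluated at p3

print(f"AIF (p1 = p2) at the pole frequency: {aif_db:.2f} dB")
print(f"Butterworth at p3:                   {bwf_db:.2f} dB")
```

With two coincident real poles the attenuation at the pole frequency is about 6 dB, whereas the Butterworth response is about 3 dB down at $p_3$, so $p_3$ is the usual -3 dB cutoff frequency of that filter.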

**Phase detection.** The phase detection block computes (3.28), repeated here for convenience:

$$QI' - IQ' = \kappa_1 \kappa_2 \sin(\theta - \theta') \tag{B.7}$$

Equation (B.7) computes, in real time, the instantaneous phase shift. The instantaneous phase shift is proportional to the voltage. To obtain the effective phase, the signal is integrated in time as was explained in Section 3.2.5.

**Integrator.** The integrator and the integration constant $C_0$ are applied to the phase detection output. This action is understood as mechanizing the equation [3]

$$\frac{d\theta}{dt} = C_0(QI' - IQ') = C_0\kappa_1\kappa_2 \sin(\theta - \theta') \tag{B.8}$$

This model was used to simulate two-tone tests and to verify the behavior of the phase detection block and the integrator, as shown in Figures 3.16 and 3.17.
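The interaction of the phase detection, the integrator and the rotation can be sketched as a small discrete-time loop. This is an illustration under stated assumptions, not the Simulink model: the integrator is a forward Euler step, the feedback vector is simply the input vector rotated by the misalignment, and the sign of the update is chosen here so that the estimate converges. The constants and names are hypothetical.

```python
import math


def phase_detector(i, q, i_fb, q_fb):
    """Phase detection of (B.7): Q*I' - I*Q'."""
    return q * i_fb - i * q_fb


def rotate(i, q, angle):
    """Rotation of (B.1) and (B.2)."""
    return (i * math.cos(angle) + q * math.sin(angle),
            q * math.cos(angle) - i * math.sin(angle))


def track_phase(phi, c0=2.0, dt=0.01, steps=2000, k1=1.0, k2=0.5, theta=0.3):
    """Integrate the detector output until the rotation angle matches phi.

    Assumed sign convention: the estimate is decreased when the detector
    output is positive, because the corrected feedback angle is
    theta + phi - phi_hat.
    """
    phi_hat = 0.0
    i_in, q_in = k1 * math.cos(theta), k1 * math.sin(theta)
    for _ in range(steps):
        # Feedback vector: input angle shifted by the misalignment phi.
        i_fb = k2 * math.cos(theta + phi)
        q_fb = k2 * math.sin(theta + phi)
        # Correct the feedback with the current estimate, then detect.
        i_cor, q_cor = rotate(i_fb, q_fb, phi_hat)
        detector = phase_detector(i_in, q_in, i_cor, q_cor)
        phi_hat -= c0 * dt * detector   # forward Euler integration
    return phi_hat


PHI = math.radians(45.0)   # illustrative misalignment (assumption)
estimate = track_phase(PHI)
print(f"true misalignment = {math.degrees(PHI):.2f} deg, estimate = {math.degrees(estimate):.2f} deg")
```

Once the estimate matches the misalignment the detector output is zero and the integrator holds its value, which is the steady state of (B.8).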

## B.2 ADS<sup>®</sup> HDL co-simulation environment

The ADS HDL co-simulation environment allows the system to be modeled at component level. It is a co-simulation environment because components can be described in HDL (VHDL or Verilog) and by analog models such as amplifiers, modulators, demodulators, etc. It also provides the means to simulate at transistor level, given the right design and transistor model libraries.

Figure B.2 shows the schematic of the Cartesian feedback system for ADS®.

- **Block**<sub>1</sub> consists of the input signals: a data file that contains the In-phase and Quadrature signals of a $\pi/4$-DQPSK signal, generated by means of the data generator shown in Figure B.7.



Figure B.2: ADS model of the Cartesian feedback system.

- **Block**<sub>2</sub> consists of ADS modules that convert the signals from the time domain to the discrete fixed point domain <sup>1</sup>. Different reference voltages are provided for the DACs and ADCs: $V_{ref\_DAC}$ for the forward digital to analog converter and $V_{ref\_ADC}$ for the feedback analog to digital converter.
- **Block**<sub>3</sub> consists of a merge block that concatenates the input signals into one single input stream <sup>2</sup>, the HdlCosim module containing the digital design implemented in VHDL, and a split block that separates the output signals from the single output stream.
- **Block**<sub>4</sub> converts the discrete fixed point signals into time domain signals.
- **Block**<sub>5</sub> is the analog model of the remaining part of the Cartesian feedback system.

<sup>&</sup>lt;sup>1</sup>Required for the ADS Ptolemy simulator to map signal in HDL.

 $<sup>^2\</sup>mathrm{Required}$  for the HDlCosim module to enter the input signals.

#### B.2.1 Analog part (Block<sub>5</sub>)

The analog part is shown in Figure B.3. The advantage of using ADS is that the implementation is closer to the transistor level.



Figure B.3: ADS model of the analog block.

Each component can be implemented by using the adequate transistors, which makes a more realistic implementation possible. However, this work only aims to test the digital block; therefore the analog components are higher level models of the real components. ADS also provides a model for the delay and a model for the phase shift. The delay model is useful to model the delay of the digital system. The phase shift model is used to test the phase detection: adding positive or negative phase and then checking the output of the integrator helps to verify the correct behavior of that part.

The up and down converters are implemented with the IQ\_ModTuned and IQ\_DemodTuned components respectively [42]. The forward\_filter (anti-imaging filter) is implemented as Figure B.4 shows.

The second\_order\_BW\_filter models a second order Butterworth filter (anti-aliasing filter) and is implemented as Figure B.5 shows.



Figure B.4: Forward filter

#### B.2.1.1 Forward filter

The forward filter consists of the anti-imaging filter and the lead compensator, as shown in Figure B.4. The anti-imaging filter is implemented with ideal operational amplifiers as two active first order low pass filters connected in cascade. $R_2$ and $R_3$ set the gain of each filter. In order to have a unit gain $R_2 \gg R_3$. $R_{cf}$ and C determine the cutoff frequency.

$$R_{2} = 100 \ k\Omega$$

$$R_{3} = 1 \ k\Omega$$

$$R_{cf} = 100 \ k\Omega$$

$$C = \frac{1}{R_{cf}CutFreq_{1}}$$

$$CutFreq_{1} = 2\pi f_{B_{eff}}$$

#### B.2.1.2 Feedback filter

The anti-aliasing filter consists of a second order Butterworth filter. It is implemented with an ideal operational amplifier as shown in Figure B.5. $R_{g3}$ and $R_{g4}$ set the gain of the filter. In order to have a unit gain $R_{g3} \gg R_{g4}$. $R_{cf}$ and $C_{cf}$ determine the cutoff frequency.



Figure B.5: Feedback filter

$$R_{g3} = 100 \ k\Omega$$

$$R_{g4} = 1 \ k\Omega$$

$$C_{cf} = \frac{1}{2\pi R_{cf} f_{carrier}}$$

$$R_{cf} = 100 \ k\Omega$$

### B.2.2 Lead compensator

The compensation is implemented with an ideal operational amplifier as shown in Figure B.6. This active circuit generates a zero and a pole. It adds phase to the system until it reaches the lead pole, which is located one decade higher.



Figure B.6: Lead compensator

$$R_4 = 90 \ k\Omega$$

$$R_5 = 10 \ k\Omega$$

$$CutFreq_3 = 2\pi f_z$$

where  $f_z$  is the frequency of the zero.

#### B.2.3 Data generator

The data generator is used to generate a quadrature signal from a pseudo random bitstream. For that, the PI4DQPSK module takes the serial bitstream and generates a complex signal. The raised cosine filter is used to shape the complex signal. Finally, the IQ\_DEMOD module separates the I and Q components.



Figure B.7: Data generator schematic

$$BitRate = 19.2\,[kbps], \qquad RFFreq = 144\,[MHz], \qquad \alpha = 0.65 \tag{B.9}$$

where $\alpha$ is the rolloff factor defining the filter's excess bandwidth.

Figure B.8 shows the constellation and the frequency spectrum of the generated signal. The bitrate is 19.2[kbps] and the bandwidth of the signal is 9.6[kHz], as can be seen in Figure B.8(b). $\pi/4$-DQPSK transmits two bits per symbol.
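The relation between the bitrate and the quoted bandwidth follows from the number of bits per symbol. As a short worked step, with $R_b$ and $R_s$ introduced here as symbols for the bit rate and the symbol rate:

$$
R_s = \frac{R_b}{\log_2 4} = \frac{19.2\,[\mathrm{kbps}]}{2} = 9.6\,[\mathrm{ksymbols/s}]
$$

The symbol rate therefore coincides numerically with the 9.6[kHz] bandwidth stated above.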



Figure B.8: Input data. (a) Constellation. (b) Frequency spectrum.

## C.1 FPGA implementation summaries

## C.1.1 FPGA power

| Power summary                     | I(mA) | P(mW)   |
|-----------------------------------|-------|---------|
| Total estimated power consumption |       | 33.31   |
| Total Vccint 1.20V                | 9.86  | 11.83   |
| Total Vccaux 2.50V                | 7.00  | 17.51   |
| Total Vcco25 2.50V                | 1.59  | 3.97    |
| Clocks                            |       | 3.76    |
| IO                                |       | 0.40    |
| Logic                             |       | 0.36    |
| MULT                              |       | 0.01    |
| Signals                           |       | 1.42    |
| Quiescent Vccint 1.20V            | 5.09  | 6.12    |
| Quiescent Vccaux 2.50V            | 7.00  | 17.50   |
| Quiescent Vcco25 2.50V            | 1.50  | 3.75    |
| Package power limits, ambient 25C |       | 1587.30 |
| 250  LFM                          |       | 2040.82 |
| 500  LFM                          |       | 2390.44 |
| 750 LFM                           |       | 2542.37 |

Table C.1: Power summary.
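The entries of Table C.1 can be cross-checked against each other. The sketch below recomputes the rail powers from the listed voltages and currents and sums the individual contributions. Grouping Clocks, IO, Logic, MULT and Signals as the dynamic part is an interpretation made here, and the 1.7[W] budget is the figure quoted in the abstract.

```python
# Supply rails from Table C.1: (voltage in V, current in mA).
rails = {
    "Vccint": (1.20, 9.86),
    "Vccaux": (2.50, 7.00),
    "Vcco25": (2.50, 1.59),
}

# Power per rail in mW (V * mA = mW).
rail_power = {name: volts * milliamps for name, (volts, milliamps) in rails.items()}
total_from_rails = sum(rail_power.values())

# Contributions listed in Table C.1, in mW.
dynamic = 3.76 + 0.40 + 0.36 + 0.01 + 1.42   # Clocks, IO, Logic, MULT, Signals
quiescent = 6.12 + 17.50 + 3.75              # Quiescent Vccint, Vccaux, Vcco25
total_from_breakdown = dynamic + quiescent

REPORTED_TOTAL_MW = 33.31
BUDGET_MW = 1700.0                           # total budget of 1.7 W from the abstract
budget_share = 100.0 * REPORTED_TOTAL_MW / BUDGET_MW

print(f"sum of rail powers      = {total_from_rails:.2f} mW")
print(f"dynamic + quiescent     = {total_from_breakdown:.2f} mW")
print(f"share of the 1.7 W budget = {budget_share:.2f} %")
```

Both sums agree with the reported total to within the rounding of the table entries. Most of the power is quiescent, so the design itself accounts for only a small part of the consumption on this device.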

## C.1.2 FPGA area summary

Registers I\_IN\_REG\_U5, I\_FB\_REG\_U7, IROT\_REG\_U3, Q\_IN\_REG\_U6, Q\_FB\_REG\_U8, QROT\_REG\_U4 and MAG\_REG\_U2 are located in the I/O pad logic; therefore they are not considered part of the CLBs.

Route thru indicates the amount of LUTs that are used to access internal Slice points when direct access is not available or less efficient. No logic is performed in the LUT and the net passes straight through into the CLB.

| Module Name | Slices (route thru) | Slices | Slice Reg (route thru) | Slice Reg | LUTs (route thru) | LUTs | BUFG (route thru) | BUFG |
|-------------------|---|-----|---|-----|----|-----|---|---|
| CFB_Core          | 6 | 323 | 0 | 126 | 11 | 439 | 1 | 1 |
| **Adders**        |   |     |   |     |    |     |   |   |
| ABS_VAL           | 9 | 15  | 0 | 0   | 12 | 12  | 0 | 0 |
| I_COMP            | 0 | 6   | 0 | 0   | 0  | 12  | 0 | 0 |
| Q_COMP            | 0 | 6   | 0 | 0   | 0  | 12  | 0 | 0 |
| **CORDIC**        |   |     |   |     |    |     |   |   |
| MAG_U0            | 1 | 92  | 0 | 24  | 1  | 139 | 0 | 0 |
| ROT_U0            | 0 | 126 | 0 | 40  | 0  | 183 | 0 | 0 |
| **Phase detection/correction** | | | | | | | | |
| PHASE_DET0        | 0 | 41  | 0 | 28  | 0  | 48  | 0 | 0 |
| **Finite state machine** | | | | | | | | |
| FSM_U0            | 7 | 7   | 6 | 6   | 6  | 6   | 0 | 0 |
| FSM_U1            | 7 | 7   | 8 | 8   | 6  | 6   | 0 | 0 |
| **Registers**     |   |     |   |     |    |     |   |   |
| MAG_REG_U1        | 8 | 8   | 12 | 12 | 0  | 0   | 0 | 0 |
| PHASE_DET_REG_U11 | 8 | 8   | 12 | 12 | 0  | 0   | 0 | 0 |
| **Other logic**   |   |     |   |     |    |     |   |   |
| COMP_U0           | 1 | 1   | 0 | 0   | 1  | 1   | 0 | 0 |
| COMP_U1           | 1 | 1   | 0 | 0   | 1  | 1   | 0 | 0 |
| COUNTER_U0        | 4 | 4   | 4 | 4   | 4  | 4   | 0 | 0 |
| COUNTER_U1        | 3 | 3   | 4 | 4   | 4  | 4   | 0 | 0 |

Table C.2: Resource usage by components

# C.2 ASIC implementation summaries

## C.2.1 ASIC timing

| Des/Clust/Port                                                  | Wire Load Model | Library                  |
|-----------------------------------------------------------------|-----------------|--------------------------|
| CED ACIC 101 N10 M04 L1C CNE4                                   | Clok            |                          |
| OF D_A610_12D_N12_W124_L10_ON 14                                | enG10K          | isuUa_a_generic_core_wc  |
| PHASE_DETECT_N12_M24_L16                                        | enG5K           | fsd0a_a_generic_core_wc  |
| TCR4BPPG_11_0_11_000_1                                          | enG5K           | fsd0a_a_generic_core_wc  |
| C42TR_13_0_15_0_1000_1                                          | enG5K           | fsd0a_a_generic_core_wc  |
| UBHCA_23_2_22_0_1                                               | enG5K           | fsd0a_a_generic_core_wc  |
| DIFFERENCE_N16                                                  | enG5K           | fsd0a_a_generic_core_wc  |
| **Point**                                                       | **Incr**        | **Path**                 |
| clock CLK (rise edge)                                           | 0.00            | 0.00                     |
| clock network delay (ideal)                                     | 0.00            | 0.00                     |
| Q_FB_REG_U8/DO_reg[5]/CK (QDFERBX1)                             | 0.00            | 0.00 r                   |
| Q_FB_REG_U8/DO_reg[5]/Q (QDFERBX1)                              | 0.26            | 0.26 r                   |
| Q_FB_REG_U8/DO[5] (REGISTER_N_N12_3)                            | 0.00            | 0.26 r                   |
| PHASE_DET0/Q_FB[5] (PHASE_DETECT_N12_M24_L16)                   | 0.00            | 0.26 r                   |
| PHASE_DET0/MULT2/MULT_U/U0/U0/IN2[5] (TCR4BPPG_11_0_11_000_1)   | 0.00            | 0.26 r                   |
| DUASE DETO/MULTO/MULT U/UO/UO/U10/O (BUEXI)                     | 0.24            | 0.60                     |
| PHASE_DE10/MOL12/MOL1-0/00/00/012/O (BOFAI)                     | 0.54            | 0.00 F                   |
| PHASE_DET0/MULT2/MULT_U/U0/U0/U4/O (INVX1)                      | 0.30            | 0.89 f                   |
| PHASE_DET0/MULT2/MULT_U/U0/U0/U181/O (OAI23X1)                  | 0.70            | 1.60 r                   |
| PHASE_DET0/MULT2/MULT_U/U0/U0/U92/O (AOI22XLP)                  | 0.18            | 1.78 f                   |
| PHASE_DET0/MULT2/MULT_U/U0/U0/U91/OB (MXL2XLP)                  | 0.17            | 1.94 r                   |
| PHASE_DET0/MULT2/MULT_U/U0/U0/PP3[11] (TCR4BPPG_11_0_11_000_1)  | 0.00            | 1.94 r                   |
| PHASE_DET0/MULT2/MULT_U/U0/U1/PP3[11] (C42TR_13_0_15_0_1000_1)  | 0.00            | 1.94 r                   |
| PHASE_DET0/MULT2/MULT_U/U0/U1/U70/O (INVX1)                     | 0.07            | 2.01 f                   |
| PHASE_DET0/MULT2/MULT_U/U0/U1/U256/OB (MXL2XLP)                 | 0.12            | 2.13 r                   |
| PHASE_DET0/MULT2/MULT_U/U0/U1/U255/OB (MXL2XLP)                 | 0.23            | 2.36 r                   |
| PHASE_DET0/MULT2/MULT_U/U0/U1/U69/O (INVXLP)                    | 0.06            | 2.42 f                   |
| PHASE_DET0/MULT2/MULT_U/U0/U1/U252/OB (MXL2XLP)                 | 0.12            | 2.54 r                   |
| PHASE_DET0/MULT2/MULT_U/U0/U1/U251/OB (MXL2XLP)                 | 0.23            | 2.78 r                   |
| PHASE_DET0/MULT2/MULT_U/U0/U1/U68/O (INVXLP)                    | 0.06            | 2.84 f                   |
| PHASE_DET0/MULT2/MULT_U/U0/U1/U248/OB (MXL2XLP)                 | 0.19            | 3.03 r                   |
| PHASE_DET0/MULT2/MULT_U/U0/U1/U5/O (INVXLP)                     | 0.06            | 3.10 f                   |
| PHASE_DET0/MULT2/MULT_U/U0/U1/U127/OB (MXL2XLP)                 | 0.18            | 3.28 r                   |
| PHASE_DET0/MULT2/MULT_U/U0/U1/S2[11] (C42TR_13_0_15_0_1000_1)   | 0.00            | 3.28 r                   |
| PHASE_DET0/MULT2/MULT_U/U0/U2/Y[11] (UBHCA_23_2_22_0_1)         | 0.00            | 3.28 r                   |
| PHASE_DET0/MULT2/MULT_U/U0/U2/U37/O (INVXLP)                    | 0.06            | 3.34 f                   |
| PHASE_DET0/MULT2/MULT_U/U0/U2/U127/OB (MXL2XLP)                 | 0.18            | 3.52 r                   |
| PHASE_DET0/MULT2/MULT_U/U0/U2/U123/O (ND2X1)                    | 0.10            | 3.62 f                   |
| PHASE_DET0/MULT2/MULT_U/U0/U2/U118/O (OAI23X1)                  | 0.13            | 3.75 r                   |
| PHASE_DET0/MULT2/MULT_U/U0/U2/U116/O (OR2B1XLP)                 | 0.17            | 3.92 r                   |
| PHASE_DET0/MULT2/MULT_U/U0/U2/U15/O (INVX1)                     | 0.06            | 3.97 f                   |
| PHASE_DET0/MULT2/MULT_U/U0/U2/U114/OB (MXL2XLP)                 | 0.10            | 4.07 r                   |
| PHASE_DET0/MULT2/MULT_U/U0/U2/S[12] (UBHCA_23_2_22_0_1)         | 0.00            | 4.07 r                   |
| PHASE_DET0/MULT2/MULT_U/U0/U3/I[12] (UBTCCONV23_24_0_1)         | 0.00            | 4.07 r                   |
| PHASE_DET0/MULT2/MULT_U/U0/U3/U2/O (BUFX1)                      | 0.10            | 4.17 r                   |
| PHASE_DET0/MULT2/MULT_U/U0/U3/O[12] (UBTCCONV23_24_0_1)         | 0.00            | 4.17 r                   |
| PHASE_DET0/DIFF1/IN1[4] (DIFFERENCE_N16)                        | 0.00            | 4.17 r                   |
| PHASE_DET0/DIFF1/U32/O (INVX1)                                  | 0.04            | 4.21 f                   |
| PHASE_DET0/DIFF1/U87/OB (MXL2XLP)                               | 0.18            | 4.39 r                   |
| PHASE_DET0/DIFF1/U16/O (INVX1)                                  | 0.07            | 4.47 f                   |
| PHASE_DET0/DIFF1/U85/O (ND2X1)                                  | 0.06            | 4.53 r                   |
| PHASE_DET0/DIFF1/U84/O (NR2X1)                                  | 0.04            | 4.57 f                   |
| PHASE_DET0/DIFF1/U83/O (AOI222X1)                               | 0.19            | 4.76 r                   |
| PHASE_DET0/DIFF1/U77/O (ND2X1)                                  | 0.08            | 4.84 f                   |
| PHASE_DET0/DIFF1/U76/O (MAOI222X1)                              | 0.17            | 5.01 r                   |
| PHASE_DET0/DIFF1/U75/O (NR3X1)                                  | 0.07            | 5.08 f                   |
| PHASE_DET0/DIFF1/U64/O (OAI23X1)                                | 0.20            | 5.28 r                   |
| PHASE_DET0/DIFF1/U5/O (INVX1)                                   | 0.08            | 5.36 f                   |
| PHASE_DET0/DIFF1/U61/O (MAOI222X1)                              | 0.14            | 5.50 r                   |
| PHASE_DET0/DIFF1/U3/O (INVX1)                                   | 0.07            | 5.58 f                   |
| PHASE_DET0/DIFF1/U58/O (MAOI222X1)                              | 0.14            | 5.71 r                   |
| PHASE_DET0/DIFF1/U2/O (INVX1)                                   | 0.07            | 5.78 f                   |
| PHASE_DET0/DIFF1/U55/O (MAO222X1)                               | 0.13            | 5.92 f                   |
| PHASE_DET0/DIFF1/U54/O (XNR3X1)                                 | 0.11            | 6.02 f                   |
| PHASE_DET0/DIFF1/DIFF[15] (DIFFERENCE_N16)                      | 0.00            | 6.02 f                   |
| PHASE_DET0/PHASE_DET_REG_U11/DI[15] (REGISTER_N_N16_1)          | 0.00            | 6.02 f                   |
| PHASE_DET0/PHASE_DET_REG_U11/DO_reg[15]/D (QDFERBX1)            | 0.00            | 6.02 f                   |
| data arrival time                                               |                 | 6.02                     |
| clock CLK (rise edge)                                           | 7.00            | 7.00                     |
| clock network delay (ideal)                                     | 0.00            | 7.00                     |
| PHASE_DET0/PHASE_DET_REG_U11/DO_reg[15]/CK (QDFERBX1)           | 0.00            | 7.00 r                   |
| library setup time                                              | -0.17           | 6.83                     |
| data required time                                              |                 | 6.83                     |
| data required time                                              |                 | 6.83                     |
| data arrival time                                               |                 | -6.02                    |
| slack (MET)                                                     |                 | 0.81                     |

Table C.3: ASIC critical path summary.
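
The slack in the last row of Table C.3 follows from the rows just above it: the required time is the clock edge minus the library setup time, and the slack is the required time minus the data arrival time. A minimal sketch of that arithmetic in Python is given below. The variable names are illustrative, and the minimum period and clock frequency are derived here only as an estimate that ignores clock skew and uncertainty; they are not part of the report.

```python
# Values copied from the last rows of Table C.3 (all in ns).
clock_period_ns = 7.00   # clock CLK (rise edge) at the capturing register
setup_time_ns = 0.17     # library setup time of the capturing flip-flop
data_arrival_ns = 6.02   # data arrival time at DO_reg[15]/D

# Required time and slack, as computed at the end of the timing report.
data_required_ns = clock_period_ns - setup_time_ns
slack_ns = data_required_ns - data_arrival_ns

# Illustrative estimate (assumption: ideal clock, no skew or uncertainty):
# the shortest clock period for which this path would still meet setup.
min_period_ns = data_arrival_ns + setup_time_ns
max_clock_mhz = 1000.0 / min_period_ns

print(f"data required time = {data_required_ns:.2f} ns")
print(f"slack              = {slack_ns:.2f} ns")
print(f"minimum period     = {min_period_ns:.2f} ns ({max_clock_mhz:.1f} MHz)")
```

A positive slack means the path meets the 7 ns clock constraint with margin.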

## C.2.2 ASIC power

| Global Operating Voltage          | 0.9                               |
|-----------------------------------|-----------------------------------|
| Power-specific unit information:  |                                   |
| Voltage Units                     | 1 [V]                             |
| Capacitance Units                 | 1.000000 [pF]                     |
| Time Units                        | 1 [ns]                            |
| Dynamic Power Units               | 1 [mW] (derived from V,C,T units) |
| Leakage Power Units               | 1 [pW]                            |

| Hierarchy   | Switch Power [mW] | Int Power [mW] | Leak Power [pW] | Total Power [mW] | %     |
|-------------|-------------------|----------------|-----------------|------------------|-------|
| Top Level   | 0.984             | 1.170          | 6.16e+07        | 2.216            | 100.0 |
| CFB_Core    | 8.57e-02          | 0.325          | 4.05e+07        | 0.451            | 20.4  |
| ABS_VAL     | 1.39e-04          | 1.16e-04       | 5.72e+05        | 8.28e-04         | 0.0   |
| I_COMP      | 7.60e-04          | 6.77e-04       | 7.52e+05        | 2.19e-03         | 0.1   |
| Q_COMP      | 7.44e-04          | 6.71e-04       | 7.52e+05        | 2.17e-03         | 0.1   |
| MAG_U0      | 1.84e-02          | 5.06e-02       | 5.44e+06        | 7.45e-02         | 3.4   |
| ROT_U0      | 2.45e-02          | 7.30e-02       | 8.09e+06        | 1.06e-01         | 4.8   |
| PHASE_DET0  | 2.32e-02          | 5.00e-02       | 1.93e+07        | 9.25e-02         | 4.2   |
| FSM_U0      | 6.78e-04          | 5.18e-03       | 3.05e+05        | 6.17e-03         | 0.3   |
| FSM_U1      | 1.53e-03          | 5.37e-03       | 3.44e+05        | 7.24e-03         | 0.3   |
| COMP_U0     | 3.34e-05          | 6.34e-05       | 3.10e+04        | 1.28e-04         | 0.0   |
| COMP_U1     | 6.89e-05          | 6.07e-05       | 3.11e+04        | 1.61e-04         | 0.0   |
| COUNTER_U0  | 4.51e-03          | 6.99e-03       | 3.23e+05        | 1.18e-02         | 0.5   |
| COUNTER_U1  | 4.34e-03          | 6.79e-03       | 3.22e+05        | 1.14e-02         | 0.5   |
| I_IN_REG_U5 | 1.90e-03          | 1.60e-02       | 5.30e+05        | 1.84e-02         | 0.8   |
| I_FB_REG_U7 | 9.53e-04          | 1.60e-02       | 5.30e+05        | 1.74e-02         | 0.8   |
| IROT_REG_U3 | 3.20e-04          | 1.54e-02       | 5.18e+05        | 1.63e-02         | 0.7   |
| Q_IN_REG_U6 | 1.88e-03          | 1.60e-02       | 5.30e+05        | 1.84e-02         | 0.8   |
| Q_FB_REG_U8 | 9.49e-04          | 1.60e-02       | 5.30e+05        | 1.74e-02         | 0.8   |
| QROT_REG_U4 | 3.25e-04          | 1.54e-02       | 5.37e+05        | 1.63e-02         | 0.7   |
| MAG_REG_U1  | 1.31e-04          | 1.54e-02       | 5.19e+05        | 1.61e-02         | 0.7   |
| MAG_REG_U2  | 3.28e-04          | 1.54e-02       | 5.14e+05        | 1.63e-02         | 0.7   |

| I/O         | Switch Power [mW] | Int Power [mW] | Leak Power [pW] | Total Power [mW] | %     |
|-------------|-------------------|----------------|-----------------|------------------|-------|
| io_MAG      | 1.28e-03          | 0.232          | 1.39e+06        | 0.235            | 10.6  |
| io_Q_ROT    | 1.27e-03          | 0.229          | 1.39e+06        | 0.232            | 10.5  |
| io_I_ROT    | 1.25e-03          | 0.226          | 1.39e+06        | 0.228            | 10.3  |
| io_Q_FB     | 1.44e-03          | 2.65e-02       | 1.59e+06        | 2.95e-02         | 1.3   |
| io_I_FB     | 1.43e-03          | 2.64e-02       | 1.59e+06        | 2.94e-02         | 1.3   |
| io_Q_IN     | 1.43e-03          | 2.63e-02       | 1.59e+06        | 2.93e-02         | 1.3   |
| io_I_IN     | 1.43e-03          | 2.64e-02       | 1.59e+06        | 2.94e-02         | 1.3   |
| io_nd       | 1.79e-04          | 2.20e-03       | 1.33e+05        | 2.52e-03         | 0.1   |
| io_rst      | 2.75e-03          | 2.25e-03       | 1.33e+05        | 5.14e-03         | 0.2   |
| io_clk      | 4.68e-02          | 3.63e-02       | 1.33e+05        | 8.33e-02         | 3.8   |

Table C.4: Core power summary.
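
The columns of Table C.4 use different units: switching and internal power are in mW, while leakage power is in pW, so the leakage column has to be scaled by 1e-9 before the three are added. The short Python sketch below checks this for three rows copied from the table. The helper name `total_power_mw` and the tolerance are illustrative assumptions, not part of the synthesis flow.

```python
PW_PER_MW = 1e9  # 1 mW = 1e9 pW


def total_power_mw(switch_mw, internal_mw, leak_pw):
    """Add dynamic power (mW) and leakage power (pW); result in mW."""
    return switch_mw + internal_mw + leak_pw / PW_PER_MW


# (switch [mW], internal [mW], leakage [pW], reported total [mW]),
# copied from Table C.4.
rows = {
    "Top Level": (0.984, 1.170, 6.16e7, 2.216),
    "CFB_Core": (8.57e-2, 0.325, 4.05e7, 0.451),
    "PHASE_DET0": (2.32e-2, 5.00e-2, 1.93e7, 9.25e-2),
}

for name, (switch, internal, leak, reported) in rows.items():
    computed = total_power_mw(switch, internal, leak)
    leak_share = 100.0 * (leak / PW_PER_MW) / computed
    print(f"{name:10s} computed {computed:.4f} mW, "
          f"reported {reported:.4f} mW, leakage share {leak_share:.1f} %")
```

The computed totals agree with the reported ones to within the rounding of the table entries.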

## C.2.3 ASIC area

| Number of ports:               | 87                       |
|--------------------------------|--------------------------|
| Number of nets:                | 274                      |
| Number of cells:               | 111                      |
| Number of references:          | 16                       |
| Combinational area:            | 456966.74 [$\mu m^2$]    |
| Noncombinational area:         | 315895.58 [$\mu m^2$]    |
| Total cell area:               | 772862.32 [$\mu m^2$]    |

Hierarchical area distribution. The first two data columns give the global cell area, the last three the local cell area.

| Hierarchical cell | Global absolute total [$\mu m^2$] | Global percent total [%] | Local combinational [$\mu m^2$] | Local noncombinational [$\mu m^2$] | Local black boxes [$\mu m^2$] |
|-------------------|-----------|------|-----------|------|-----|
| Top level         | 772862.5  | 100  | 1262      | 0    | 0   |
| CFB_Core          | 19699     | 2.5  | 3         | 0    | 0   |
| ABS_VAL           | 252       | 0    | 252       | 0    | 0   |
| I_COMP            | 381       | 0    | 381       | 0    | 0   |
| Q_COMP            | 381       | 0    | 381       | 0    | 0   |
| MAG_U0            | 2598      | 0.3  | 1974      | 624  | 0   |
| ROT_U0            | 4016      | 0.5  | 3016      | 1000 | 0   |
| PHASE_DET0        | 9067      | 1.2  | 8391      | 376  | 300 |
| FSM_U0            | 121       | 0    | 61        | 60   | 0   |
| FSM_U1            | 134       | 0    | 74        | 60   | 0   |
| COMP_U0           | 11        | 0    | 11        | 0    | 0   |
| COMP_U1           | 11        | 0    | 11        | 0    | 0   |
| COUNTER_U0        | 150       | 0    | 70        | 80   | 0   |
| COUNTER_U1        | 150       | 0    | 70        | 80   | 0   |
| I_IN_REG_U5       | 303       | 0    | 3         | 300  | 0   |
| I_FB_REG_U7       | 303       | 0    | 3         | 300  | 0   |
| IROT_REG_U3       | 303       | 0    | 3         | 300  | 0   |
| Q_IN_REG_U6       | 303       | 0    | 3         | 300  | 0   |
| Q_FB_REG_U8       | 303       | 0    | 3         | 300  | 0   |
| QROT_REG_U4       | 303       | 0    | 3         | 300  | 0   |
| MAG_REG_U1        | 303       | 0    | 3         | 300  | 0   |
| MAG_REG_U2        | 303       | 0    | 3         | 300  | 0   |
| io_I_FB           | 103710.54 | 13.4 | 103710.54 | 0    | 0   |
| io_I_IN           | 103710.54 | 13.4 | 103710.54 | 0    | 0   |
| io_I_ROT          | 103710.54 | 13.4 | 103710.54 | 0    | 0   |
| io_MAG            | 103710.54 | 13.4 | 103710.54 | 0    | 0   |
| io_Q_FB           | 103710.54 | 13.4 | 103710.54 | 0    | 0   |
| io_Q_IN           | 103710.54 | 13.4 | 103710.54 | 0    | 0   |
| io_Q_ROT          | 103710.54 | 13.4 | 103710.54 | 0    | 0   |
| io_clk            | 8642.54   | 1.1  | 8642.54   | 0    | 0   |
| io_nd             | 8642.54   | 1.1  | 8642.54   | 0    | 0   |
| io_rst            | 8642.54   | 1.1  | 8642.54   | 0    | 0   |

Table C.5: Core area summary.
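
Two relations hold between the columns of Table C.5 for a leaf cell: its absolute total is the sum of its local combinational, noncombinational and black-box areas, and its percentage is the absolute total divided by the top-level area. The Python sketch below checks both for three rows copied from the table. The function names are illustrative assumptions.

```python
TOP_LEVEL_AREA_UM2 = 772862.5  # "Top level" absolute total from Table C.5

# name: (absolute total, combinational, noncombinational, black boxes),
# all in um^2, copied from Table C.5.
cells = {
    "MAG_U0": (2598, 1974, 624, 0),
    "ROT_U0": (4016, 3016, 1000, 0),
    "PHASE_DET0": (9067, 8391, 376, 300),
}

# Percent-of-total values as reported in Table C.5.
reported_percent = {"MAG_U0": 0.3, "ROT_U0": 0.5, "PHASE_DET0": 1.2}


def local_sum(comb, noncomb, black_box):
    """Area of a leaf cell as the sum of its local contributions."""
    return comb + noncomb + black_box


def percent_of_top(area_um2):
    """Share of the top-level area, rounded to one decimal as in the table."""
    return round(100.0 * area_um2 / TOP_LEVEL_AREA_UM2, 1)


for name, (total, comb, noncomb, black_box) in cells.items():
    print(f"{name:10s} local sum {local_sum(comb, noncomb, black_box)} um^2, "
          f"reported {total} um^2, share {percent_of_top(total)} %")
```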

- V. Petrovic and W. Gosling, "Polar-loop transmitter", *Electronics Letters*, vol. 15, no. 10, pp. 286 –288, may 1979.
- [2] V. Petrovic, "Reduction of spurious emission from radio transmitters by means of modulation feedback", *IEE Conf. on Radio Spectrum Conservation Techniques*, pp. 44–49, 1983.
- [3] J.L. Dawson, Thomas H. Lee, and Joel L. Dawson, *Feedback Linearization of RF Power Amplifiers*, Springer, 1st edition, 6 2004.
- [4] M. A. Briffa, *Linearization of RF Power Amplifiers*, PhD dissertation, Victoria University of Technology, Dep. of Electrical and Electronic, December 1996.
- [5] Patrick Reynaert and Michiel Steyaert, RF Power Amplifiers for Mobile Communications (Analog Circuits and Signal Processing), Springer, 1 edition, 9 2006.
- [6] C. Tassin, P. Garcia, J.-B. Begueret, R. Toup, Y. Deval, and D. Belot, "A mixedsignal cartesian feedback linearization system for a zero-if wcdma transmitter handset ic", in *Research in Microelectronics and Electronics*, 2005 PhD, July 2005, vol. 2, pp. 59–62.
- [7] C. Tassin, P. Garcia, J.-P. Begueret, Y. Deval, and D. Belot, "A cartesian feedback feasibility study for a zero-if wcdma transmitter handset ic", in *IEEE-NEWCAS Conference*, 2005. The 3rd International, 19-22 2005, pp. 243 – 246.
- [8] N.O. Sokal and A.D. Sokal, "Class E A new class of high-efficiency tuned singleended switching power amplifiers", *Solid-State Circuits*, *IEEE Journal of*, vol. 10, no. 3, pp. 168–176, Jun 1975.
- [9] Frank Stelwagen, "A highly efficient space qualified class e power amplifier for  $\mu$ -satellite Delfi- $C^{3}$ ", MSc dissertation, Delft University of Technology, August 2007.
- [10] L.R. Kahn, "Single-sideband transmission by envelope elimination and restoration", Proceedings of the IRE, vol. 40, no. 7, pp. 803–806, July 1952.
- [11] Peter B. Kenington, *High Linearity RF Amplifier Design*, Artech House Publishers, illustrated edition edition, 9 2000.
- [12] D. Cox, "Linear amplification by sampling techniques: A new application for delta coders", *Communications, IEEE Transactions on*, vol. 23, no. 8, pp. 793–798, Aug 1975.
- [13] M. Johansson, T. Mattsson, L. Sundstrom, and M. Faulkner, "Linearization of multi-carrier power amplifiers", in *Vehicular Technology Conference*, 1993 IEEE 43rd, May 1993, pp. 684–687.

- [14] Colin R. Smithers, "Bipolar transistor rf power amplifier", U.S Patent No. 4 631 491, December 1986.
- [15] A.N. Brown and V. Petrovic, "Phase delay compensation in hf cartesian-loop transmitters", in *HF Radio Systems and Techniques*, 1988., Fourth International Conference on, Apr 1988, pp. 200–204.
- [16] Y. Ohishi, M. Minowa, E. Fukuda, and T. Takano, "Cartesian feedback amplifier with soft landing", in *Personal, Indoor and Mobile Radio Communications*, 1992. *Proceedings, PIMRC '92., Third IEEE International Symposium on*, Oct 1992, pp. 402–406.
- [17] J.L. Dawson and T.H. Lee, "Automatic phase alignment for a fully integrated cartesian feedback power amplifier system", *Solid-State Circuits*, *IEEE Journal* of, vol. 38, no. 12, pp. 2269–2279, Dec. 2003.
- [18] J.L. Dawson and T.H. Lee, "Automatic phase alignment for high bandwidth cartesian feedback power amplifiers", in *Radio and Wireless Conference*, 2000. *RAWCON 2000. 2000 IEEE*, 2000, pp. 71–74.
- [19] Chris J.M. Verhoeven, Arie van Staveren, G.L.E. Monna, M.H.L. Kouwenhoven, and E. Yildiz, *Structured Electronic Design: Negative-Feedback Amplifiers*, Springer, 1st edition, 10 2003.
- [20] Franco Maloberti, Data Converters, Springer, 1 edition, 2 2007.
- [21] Milton Abramowitz and Irene Stegun, Eds., Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables (National Bureau of Standards Applied Mathematics Series #55), U.S. Department Of Commerce, 1972.
- [22] Thomas H. Lee, The Design of CMOS Radio-Frequency Integrated Circuits, Second Edition, Cambridge University Press, 2nd edition, 12 2003.
- [23] Katsuhiko Ogata, Modern Control Engineering (5th Edition), Prentice Hall, 5 edition, 9 2009.
- [24] Advanced Design System 2009, "Amplifier2 (rf system amplifier)", http://edocs. soco.agilent.com/display/ads2009/Amplifier2+(RF+System+Amplifier), 2009.
- [25] Jan Rabaey, Low Power Design Essentials (Integrated Circuits and Systems), Springer, 1 edition, 4 2009.
- [26] Jan M. Rabaey, Anantha P. Chandrakasan, and Borivoje Nikolić, Assistant Professor, *Digital integrated circuits: a design perspective*, Prentice Hall electronics and VLSI series. Pearson Education, pub-PEARSON-EDUCATION:adr, second edition, 2003.
- [27] Sanjit Mitra, Digital Signal Processing, McGraw-Hill Science/Engineering/Math, 3 edition, 1 2005.

- [28] Shousheng He and M. Torkelson, "A complex array multiplier using distributed arithmetic", in *Custom Integrated Circuits Conference*, 1996., Proceedings of the *IEEE 1996*, May 1996, pp. 71–74.
- [29] Pietro Andreani and Lars Sundström, "A chip for linearization of rf power amplifiers using predistortion based on a bit-parallel complex multiplier", Analog Integr. Circuits Signal Process., vol. 22, no. 1, pp. 25–30, 2000.
- [30] Jack E. Volder, "The cordic trigonometric computing technique", *Electronic Computers, IEEE Transactions on*, vol. EC-8, no. 3, pp. 330–334, Sept. 1959.
- [31] Yuan-Long Jeang, Liang-Bi Chen, Jiun-Hau Tu, and Ing-Jer Huang, "An efficient and low power systolic squarer", in VLSI Design, Automation and Test, 2005. (VLSI-TSA-DAT). 2005 IEEE VLSI-TSA International Symposium on, April 2005, pp. 37–40.
- [32] Behrooz Parhami, Computer Arithmetic: Algorithms and Hardware Designs (The Oxford Series in Electrical and Computer Engineering), Oxford University Press, USA, 2 edition, 10 2009.
- [33] T. Enomoto and N. Kobayashi, "A low dynamic power and low leakage power 90-nm cmos square-root circuit", in *Design Automation*, 2006. Asia and South Pacific Conference on, Jan. 2006, pp. 2 pp.–.
- [34] Alain Vachoux, "Top-down digital design flow", http://lsm.epfl.ch/webdav/ site/lsm/shared/Resourcesdocuments/Topdown\_DF\_3.4.pdf, October 2008.
- [35] Aoki Laboratory, "Graduate school of Information Sciences, Tohoku University", http://www.aoki.ecei.tohoku.ac.jp/arith/, 2009.
- [36] Xilinx, "Spartan-3 Generation FPGA User Guide", http://www.xilinx.com/ support/documentation/user\_guides/ug331.pdf, December 2009.
- [37] Xilinx, "Xilinx<sup>®</sup> LogiCORE<sup>™</sup> Multiplier v11.2", http://www.xilinx.com/ support/documentation/ip\_documentation/mult\_gen\_ds255.pdf, December 2009.
- [38] Xilinx, "Xilinx<sup>®</sup> LogiCORE<sup>™</sup> Adder/Subtracter v11.0", http://www.xilinx. com/support/documentation/ip\_documentation/addsub\_ds214.pdf, April 2009.
- [39] Faraday Technology Corporation, "FSD0A\_A 90 nm Logic SP-RVT (Low K) Process", http://www.faraday-tech.com, 2006.
- [40] Michael R. Zill Dennis G.;Cullen, Advanced Engineering Mathematics, 2nd Ed., Jones Bartlett Pub, Sudbury, Massachusetts, U.S.A., 1999.
- [41] W. J. Jones, William B.; Thron, Continued Fractions: Analytic Theory and Applications, vol. 11, Addison-Wesley Publishing Co., Reading, Massachusetts, 1980.
- [42] Agilent Technologies, "IQ\_ModTuned (I/Q Modulator, Tuned)", http://cp. literature.agilent.com/litweb/pdf/ads15/ccsys/ccsys037.html, 2009.