## A wide-band frequency-tracking loop based on multiple sub-Nyquist samplers for a wide acquisition range and a fast locking

by

### Praneetha Sannidhanam

to obtain the degree of Master of Science at the Delft University of Technology, to be defended publicly on Thursday October 14, 2021 at 3:00 PM.

- Student number: 5144213
- Project duration: August 1, 2020 – October 14, 2021
- Thesis committee:
  - Dr. M. Babaie, TU Delft, supervisor
  - Prof. dr. C. S. Vaucher, TU Delft, NXP Semiconductors
  - Dr. F. Sebastiano, TU Delft

An electronic version of this thesis is available at http://repository.tudelft.nl/.



## Preface

The master's thesis that I have worked on over the past year was a real test of my capabilities. Although it was filled with ups and downs, it helped me understand my potential. Over the course of this year, I was fortunate to work with some knowledgeable and supportive individuals whom I would like to acknowledge in this section.

I would like to profusely thank my academic supervisor Dr. Masoud Babaie for his constant support and encouragement throughout this project. He always appreciated serious efforts and pushed for innovative problem solving. Next, I would like to extend my utmost respect and gratitude to Dr. Cicero Vaucher for giving me this amazing opportunity. It was a pleasure working with him. I thank Dr. Fabio Sebastiano for being a part of my defense committee and providing helpful inputs during the cool-group meetings. I would also like to thank Jiang Gong, my daily supervisor and a very resourceful individual, for his help during my project. I am glad to have had the company of Lennart, Aishwarya, and Niels, as we all travelled in the same boat through the ebbs and flows of the thesis.

I would like to acknowledge my wonderful friends at Delft, Harshini, Martijn, Niveditha, and Snigdha, for making me feel at home away from home, for proofreading my thesis, and for making sure that I was well fed and rested. Special thanks to Gayathri and Gopi for always believing in me and lifting me up in my low times. I would particularly like to thank my late uncle M. A. S. Bhadram for instilling in me a love for science and for his financial help. You will be deeply missed. Further, I extend my gratitude to my uncle A. S. N. Murthy and family for their kind support. Finally, I would like to express my love and gratitude to my mother, father, and sister for making me the person I am and for always having my back. Last but not least, I would like to thank Brooklyn 99 and Modern Family for keeping me sane during the tough times of my project.

Praneetha Sannidhanam Delft, October 2021

## Contents

- 1 Introduction
  - 1.1 Phase-locked loops
  - 1.2 Conventional Frequency tracking loops
    - 1.2.1 Frequency sweeping
    - 1.2.2 Frequency Tracking loops based on divider
    - 1.2.3 Cascaded PLL structures
  - 1.3 Need for a low settling time
  - 1.4 Objective
  - 1.5 Thesis Organisation
  - 1.6 Research Contributions
- 2 Sub-Nyquist Sampling for frequency estimation
    - 2.2.1 Chinese Remainder Theorem
    - 2.2.2 Symmetrical Number Systems
    - 2.2.3 SNS with non-co-prime moduli and shifted Dynamic Range
    - 2.3.1 Limitations of using sub-Nyquist sampling and SNS
    - 2.3.2 Choosing Sub-Nyquist frequency Combinations
- 3 Block Level Implementation
  - 3.1 Frequency Estimation
  - 3.2 Top level design
  - 3.3 Specifications
    - 3.3.1 Sampling clock jitter
    - 3.3.2 Deterministic Jitter of Sampling clock
    - 3.3.3 VCO spur performance
    - 3.3.4 Amplifier specifications
    - 3.3.5 Frequency Multiplier range
- 4 Analog block Design
    - 4.1.1 Ring oscillator based type-I PLL
    - 4.1.2 Post-layout Simulations
  - 4.6 FTL Top Level Layout
- 5 RTL Implementation
  - 5.3 Speed and Accuracy Optimisation
  - 5.4 Lock and Unlock Detection circuit
- 6 Post Layout simulation Results
  - 6.1 Frequency locking
  - 6.2 Impact of FTL on PLL performance
  - 6.3 FTL Power consumption
  - 6.4 Area and power comparison with State-of-the-art FTLs
- 7 Conclusion
  - 7.1 Future Work
    - 7.1.1 Acquisition range of Ring-oscillator based frequency multipliers in FTL
    - 7.1.2 Power Optimization
    - 7.1.3 Scaling the FTL to higher frequencies

## Abstract

With the emergence of new communication standards like Fifth-Generation New Radio (5G NR), technologies are being developed to exploit millimeter-wave (mm-wave) frequency bands from 30-300 GHz for their high available bandwidth. Generating carrier frequencies for these mm-wave applications imposes a challenging spectral-purity specification on the frequency synthesizers. Sub-sampling PLLs (SSPLLs) show remarkable performance in terms of low power and good spectral purity, which are critical for such high-speed applications. However, due to their low lock-in range, SSPLLs require the assistance of an additional frequency-tracking loop (FTL) for improved locking performance. These FTLs conventionally employ either power-hungry frequency dividers or reference frequency multipliers. In this thesis, a novel implementation of an FTL that avoids the use of high-frequency dividers is proposed.

The proposed FTL uses three sub-Nyquist sampling rates, which are derived from three mutually co-prime integers, for an unambiguous VCO frequency estimation that drives the frequency error correction. Consequently, the proposed FTL eliminates the need for sampling rates higher than the Nyquist rate and the circuit limitations posed by such high sampling frequencies. The FTL employs a simple amplifier, counter, and look-up table based VCO frequency estimation procedure, which avoids high-complexity frequency estimation algorithms like the Fast Fourier Transform (FFT). The FTL also features a speed optimization algorithm that helps in achieving low frequency-locking times.

The proposed FTL is designed in a 40-nm CMOS technology, targeting an output frequency locking range of 9.8-12.2 GHz. The post-layout simulations show that the FTL is able to coarsely lock to any desired frequency in the wide-band locking range within an error of 3 MHz, in less than 3 $\mu$s at startup. Error injections as high as 1.5 GHz are also detected and corrected in less than 3 $\mu$s. The area consumed by the FTL is 0.35 mm$^2$ and the active area of the total chip is 1.09 mm$^2$ including the decoupling capacitors. The FTL consumes a maximum power of 1.56 mW when the PLL is in a locked state. A comparison with other state-of-the-art frequency-tracking loops demonstrates its clear advantage of wide-band frequency locking and low locking time, while consuming a similar amount of power. Analytically, the proposed FTL is also shown to be capable of scaling to mm-wave frequencies.

## Introduction

The Internet of Things (IoT) is all set to take over the world as it slowly steps into our day-to-day lives in the form of applications ranging from self-driving cars and smart health care systems to smart kitchen appliances. It does so by connecting various physical devices to a vast network and efficiently collecting and communicating data between them. With billions of devices predicted to be connected to such networks in the next couple of years, IoT is pushing the existing communication technologies to their maximum capacity [1]. The advent of the fifth generation wireless technology (5G) presents a promising future for IoT owing to its increased data rates, increased capacity, and reduced latencies. This is achieved by moving into the millimetre-wave (mm-Wave) frequency spectrum and exploiting frequency bands between 30-300 GHz.

Phase-Locked Loops (PLLs) are at the heart of any wireless communication system. They are used for frequency modulation and demodulation in transceivers, clock and data recovery for noisy communication channels, and frequency synthesis for carrier frequency generation. The performance of these PLLs plays a major role in realising the goals of the 5G standard. The phase noise requirement of the PLLs has become quite stringent in order to meet the target error vector magnitude (EVM). To support reliable low-latency communications, the switching time, i.e., the settling time of the PLLs, needs to be fairly low. IoT and similar battery-driven applications have a high demand for low-power System-on-Chip (SoC) devices [2], which pushes the PLLs to also be more power efficient. This chapter starts with an introduction to PLLs and then presents the trends of current PLLs in these aspects, which lead to the motivation and objective of this project.

#### 1.1. Phase-locked loops

A PLL is a ubiquitous and critical circuit block in many applications like transceivers and data converters. It is fundamentally a negative feedback circuit that locks the output phase of an oscillator, which is usually prone to PVT variations and high phase noise, to that of a precise and stable reference input, such as a crystal oscillator, to achieve a stable and low-noise output clock. Second-order type-II PLLs have always been technologically important and most applications employ this configuration. A typical second-order type-II frequency synthesizer consists of a Voltage Controlled Oscillator (VCO), a phase detector (PD), a charge pump (CP) or an operational transconductance amplifier (OTA), a low pass filter (LPF), and a frequency divider in the feedback path. This is shown in Figure 1.1.



Figure 1.1: Block diagram of a conventional second order PLL with a feedback divider.

The open-loop transfer function of this circuit in the phase-domain can be given by

$$G(s) = \frac{K_d K_{vco} Z(s)}{s},\tag{1.1}$$

where $K_d$ is the PD+CP gain, $K_{vco}$ is the VCO gain, and $Z(s)$ is the loop filter gain. Then the closed loop phase-domain transfer function can be given by

$$H(s) = \frac{\phi_{out}(s)}{\phi_{in}(s)} = \frac{G(s)}{1 + G(s)/N} = \frac{K_d K_{vco} Z(s)}{s + K_d K_{vco} Z(s)/N},\tag{1.2}$$

where $\phi_{out}$ is the output phase, $\phi_{in}$ is the input phase, and $N$ is the integer division factor. Most high-gain loops take the form below [3],

$$H(s) = \frac{2\zeta\omega_n s + \omega_n^2}{s^2 + 2\zeta\omega_n s + \omega_n^2},\tag{1.3}$$

where  $\omega_n$  is the natural frequency and  $\zeta$  is the damping factor of the system, given by

$$\omega_n = \sqrt{\frac{K_d K_{vco}}{C_p}}; \quad \zeta = \frac{R_p}{2} \sqrt{K_d K_{vco} C_p}, \tag{1.4}$$

when the LPF gain is of the form $Z(s) = \frac{1+sC_pR_p}{sC_p}$.
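As a quick numerical illustration of Eq. (1.4), the following minimal sketch computes $\omega_n$ and $\zeta$ from the loop parameters. The function name and all component values are hypothetical and chosen only for illustration; they are not taken from the design in this thesis.

```python
import math


def second_order_params(k_d, k_vco, c_p, r_p):
    """Return (omega_n, zeta) of the second-order loop, following Eq. (1.4).

    k_d   : PD+CP gain (A/rad)
    k_vco : VCO gain (rad/s/V)
    c_p   : loop filter capacitance (F)
    r_p   : loop filter resistance (ohm)
    """
    loop_gain = k_d * k_vco
    omega_n = math.sqrt(loop_gain / c_p)
    zeta = (r_p / 2.0) * math.sqrt(loop_gain * c_p)
    return omega_n, zeta


# Hypothetical example values (assumptions, not thesis data).
K_D = 100e-6                    # A/rad
K_VCO = 2 * math.pi * 100e6     # rad/s/V
C_P = 100e-12                   # F
R_P = 1e3                       # ohm

omega_n, zeta = second_order_params(K_D, K_VCO, C_P, R_P)
print(f"natural frequency: {omega_n / (2 * math.pi) / 1e6:.2f} MHz")
print(f"damping factor:    {zeta:.2f}")
```

Note that the product $2\zeta\omega_n$ reduces to $R_p K_d K_{vco}$, so in this model the damping term is set by the filter resistor while $\omega_n$ is set by the capacitor.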

Some of the important attributes of a PLL are its frequency range, spectral purity, power consumption, locking time, and area. The area is usually dominated by the large inductors used in the VCO, while the power consumption is dominated by circuit blocks operating at high frequency, like the VCO and the feedback divider. Design choices such as the loop bandwidth affect the settling time, phase noise, and spur performance of the PLL. A higher bandwidth is desirable for faster settling or VCO noise suppression, whereas a smaller bandwidth is desired for lower spurs and in-band noise suppression. However, the maximum bandwidth is limited to one-tenth of the reference frequency due to stability limitations [4].

Different aspects of a PLL are important for different applications. With a surge in high-speed applications, good phase noise performance of PLLs has become critical. In recent times, sub-sampling PLLs (SSPLLs) have gained popularity for their high phase detector (PD) gain, which helps in suppressing the phase noise of the PD and its following stages. In this architecture, the oscillator output is directly sampled by the reference clock in the phase detector, as shown in Figure 1.2. Therefore, the high-frequency divider in the feedback path is avoided, which in turn avoids its noise and high power consumption.



Figure 1.2: Type-II sub-sampling PLL.

While SSPLLs offer a clear advantage in low in-band phase noise, they suffer from a degraded reference spur due to direct sampling of the VCO output. Existing architectures mitigate this problem with solutions such as an output buffer that isolates the VCO, or a dummy sampler [5] that compensates for the VCO capacitance modulation. In [6], a very low power charge-sampling PD (CSPD) is proposed, which achieves good reference spur performance without the use of any high-frequency buffers.

Although sub-sampling PLLs lead in spectral purity at low power consumption, they have a few drawbacks that hinder their suitability for mm-Wave applications [7]. The biggest limitation is their narrow acquisition and lock-in ranges. The acquisition (or capture) range is the maximum deviation of the free-running oscillator frequency from the desired locking frequency from which the loop can still acquire lock, starting from an unlocked state. The lock-in range is the maximum such deviation for which lock is achieved without cycle slipping. As explained extensively in [3], the lock-in and acquisition ranges depend largely on the PD and the LPF characteristics.

#### Lock-in Range in SSPLL:

The phase detector used in SSPLL is usually a sampling circuit where the VCO output of frequency  $f_{vco}$  is sub-sampled by the reference input of frequency  $f_{ref}$ . When both the signals are considered to be sinusoidal in nature, the PD output in an unlocked state can be written as

$$V_s = K_d \sin(\Delta \omega \cdot t) + \text{high-frequency terms}, \tag{1.5}$$

where $\Delta \omega = 2\pi |f_{vco} - N \cdot f_{ref}|$ is the angular frequency deviation (the corresponding frequency $|f_{vco} - N \cdot f_{ref}|$ can also be called the alias frequency). The high-frequency terms are usually high enough to be filtered by the LPF. The filter output can then be given by

$$V_f = K_d \sin(\Delta \omega \cdot t) \cdot Z(\Delta \omega), \tag{1.6}$$

where  $Z(\Delta\omega)$  is the LPF gain at frequency  $\Delta\omega$ . As  $V_f$  is the control signal to the VCO, the change in VCO frequency can be given by

$$\Delta\omega_{vco} = K_{vco}V_f = K_{vco}K_d Z(\Delta\omega)\sin(\Delta\omega \cdot t).\tag{1.7}$$

As can be seen from the above equation, the VCO frequency is modulated between  $[-K_{vco}K_dZ(\Delta\omega), K_{vco}K_dZ(\Delta\omega)]$  around its free-running frequency. Therefore, the VCO can successfully lock to any desired frequency that lies within this error range. Since the lock-in range  $\Delta\omega_L$  is the maximum frequency error that can be corrected without cycle slipping, it can be equated to the maximum achievable frequency change in Eq. (1.7), which can be given as

$$\Delta\omega_L = K_{vco} K_d Z(\Delta\omega). \tag{1.8}$$

It is assumed that $\Delta \omega_L$ is always greater than the poles and zeros of the LPF. Under this assumption, for any kind of loop filter $Z(s)$, the lock-in range for a PD with a sinusoidal characteristic (shown in Eq. (1.6)) is approximated in [3] as

$$\Delta\omega_L = 2\zeta\omega_n. \tag{1.9}$$

The lock-in range can be extended by employing a PD with a larger linear range (phase error vs output voltage characteristics). For example, an XOR gate based PD, with a linear range of $[-\pi/2, \pi/2]$, has a lock-in range of $\pi\zeta\omega_n$, whereas a Phase Frequency Detector (PFD), with a linear range of $[-2\pi, 2\pi]$, provides a lock-in range of $4\pi\zeta\omega_n$ [3]. Since the PD, CP, and VCO gains are limited by the loop bandwidth and stability, the lock-in ranges cannot be arbitrarily high. Hence, SSPLLs suffer from a low lock-in range. A low lock-in range results in the PLL losing lock easily if there are any frequency disturbances larger than the lock-in range. SSPLLs also risk locking to a different harmonic than the desired one, as the sampling circuit cannot differentiate between harmonics.
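The lock-in range expressions quoted above can be compared with a minimal sketch. The detector labels, the function name, and the example values of $\zeta$ and $\omega_n$ are assumptions made for illustration; the multiplying factors are the ones stated in the text ($2$, $\pi$, and $4\pi$).

```python
import math

# Lock-in range factor k in  delta_omega_L = k * zeta * omega_n,
# as quoted in the text for three phase detector types.
LOCK_IN_FACTOR = {
    "sinusoidal": 2.0,          # sampling PD, Eq. (1.9)
    "xor": math.pi,             # XOR gate based PD
    "pfd": 4.0 * math.pi,       # phase frequency detector
}


def lock_in_range(zeta, omega_n, pd_type):
    """Return the approximate lock-in range in rad/s for the given PD type."""
    return LOCK_IN_FACTOR[pd_type] * zeta * omega_n


# Hypothetical loop: zeta = 0.7, natural frequency of 2 MHz.
ZETA = 0.7
OMEGA_N = 2 * math.pi * 2e6

for pd in LOCK_IN_FACTOR:
    delta_f = lock_in_range(ZETA, OMEGA_N, pd) / (2 * math.pi)
    print(f"{pd:>10}: lock-in range = {delta_f / 1e6:.2f} MHz")
```

Under these approximations, a PFD extends the lock-in range by a factor of $2\pi$ over a sinusoidal (sampling) PD with the same $\zeta$ and $\omega_n$.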

Therefore, for a robust locking, an additional frequency tracking loop (FTL) is needed. Conventional FTLs (explained in the next section) make use of either high-frequency dividers or an additional cascade input stage to increase the reference frequency. These solutions are usually power-hungry either due to the components working at high frequencies or due to stringent phase noise requirements. A new FTL approach that can perform the same task without the use of such high frequency blocks is necessary and is the research interest of this thesis.

The general functions of an FTL are to estimate the VCO frequency, calculate the error from the desired frequency, and tune the VCO to reach the target frequency. Of these functions, accurately estimating the frequency is of prime importance. Sinusoidal signal frequency estimation conventionally involves sampling the signal and then performing a Fast Fourier Transform (FFT) on the samples to obtain the frequency information. This method requires very high sampling rates to satisfy the Nyquist criterion, which poses performance constraints on circuit blocks, like Analog-to-Digital converters (ADCs), that work at such high frequencies. Additionally, performing an FFT involves high computational complexity. Frequency estimation based on multiple sub-Nyquist samplers is a widely researched topic in the fields of wide-band spectrum sensing and cognitive radio [8]. However, this method of frequency sensing has not been implemented in PLLs yet. In this thesis, an FTL based on multiple sub-Nyquist samplers is proposed and its performance is compared with state-of-the-art FTLs to analyse its efficiency.
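The idea behind multiple sub-Nyquist samplers can be sketched with a brute-force search. This is only an illustration under stated assumptions, not the estimation procedure of this thesis: the sampling rates (70, 90, and 110 MHz, i.e. the co-prime integers 7, 9, and 11 times an assumed 10 MHz base), the 1 MHz search grid, and the test frequency are all hypothetical. Each sampler only sees the alias of the tone folded into $[0, f_s/2]$, and the sketch lists which frequencies in the band remain consistent with the observed aliases.

```python
def folded_alias(f, fs):
    """Alias of a real tone at f sampled at rate fs, folded into [0, fs/2].

    Integer units (MHz here) keep the arithmetic exact.
    """
    r = f % fs
    return min(r, fs - r)


def candidates(aliases, rates, band):
    """All grid frequencies in band whose folded aliases match every sampler."""
    lo, hi = band
    return [
        f for f in range(lo, hi + 1)
        if all(folded_alias(f, fs) == a for fs, a in zip(rates, aliases))
    ]


# Hypothetical setup (assumptions for illustration only), all in MHz.
BAND = (9800, 12200)        # search band, matching the targeted locking range
RATES = (70, 90, 110)       # 7, 9, 11 times an assumed 10 MHz base
F_TRUE = 10437              # unknown VCO frequency to be estimated

observed = [folded_alias(F_TRUE, fs) for fs in RATES]

one_sampler = candidates(observed[:1], RATES[:1], BAND)
three_samplers = candidates(observed, RATES, BAND)

print("observed aliases (MHz):", observed)
print("candidates with one sampler:   ", len(one_sampler))
print("candidates with three samplers:", len(three_samplers))
```

A single sampler leaves many candidate frequencies in the band, since every frequency with the same folded alias looks identical. Adding samplers at other rates can only remove candidates. Whether exactly one candidate survives depends on the chosen rates and band, which is the question the dynamic range analysis in Chapter 2 addresses.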

#### 1.2. Conventional Frequency tracking loops

This section describes a few conventional implementations of an FTL that are used as an aid to SSPLLs, for faster and efficient frequency acquisition.

#### 1.2.1. Frequency sweeping

One practical and common aided frequency acquisition technique is to sweep the VCO control voltage [4],[9]. By applying a slow ramp to the VCO control voltage, the VCO output frequency is swept and the loop locks when the VCO comes close to the desired frequency, i.e., within the lock-in range. To ensure that the frequency ramp is terminated after the PLL is locked, a closed feedback loop with lock detection is necessary. The slope of the voltage/frequency ramp has an upper limit for reliable locking, so that the ramp does not sweep past the locking point. This method is mostly useful for initial acquisition. If the loop unlocks due to an error injection, however, the entire frequency tuning range needs to be swept again before the loop relocks. Furthermore, the acquisition time is linearly proportional to the search range.

#### 1.2.2. Frequency Tracking loops based on divider

The most typical solution to increase the lock-in range is to use closed-loop solutions [10] for frequency tracking. Most conventional structures that use a frequency tracking loop for a wider locking/acquisition range adopt a frequency divider/counter in combination with a phase frequency detector (PFD), as shown in Figures 1.3a and 1.3b, where they are used either in the feedback of the main PLL loop itself or in the feedback of an auxiliary frequency tracking loop. As explained in [3], since a PFD has a tristate output, any loop filter that is used with a PFD acts as a real integrator. Since a real integrator has infinite gain at DC (frequency = 0), the acquisition range theoretically becomes infinite. This means that the PLL will always reach a locked state for arbitrarily large initial frequency deviations. In practice, however, the acquisition range is limited to the VCO tuning range.



Figure 1.3: (a) Type-II PLL with feedback divider, (b) Type-II Sub-sampling PLL with divider based FTL.

Although an FTL using a divider and a PFD provides robust wide-band locking, it has a circuit block (the divider) operating at the high VCO frequency, which makes the loop very power consuming. The PLL in [11] uses a frequency divider in the feedback circuit and has a structure similar to Figure 1.3a, where the power consumption of the divider for a VCO frequency of 6.3 GHz is 1.1 mW. Zhao Zhang, *et al.*, [12] implemented a 12-16 GHz PLL with a divider-based FTL similar to Figure 1.3b. The power consumption of the divider circuit alone comes to 1.5 mW.

Furthermore, as PLLs move towards mm-wave applications, digital frequency dividers are limited by circuit speed and consume large power. Alternatively, injection-locked frequency dividers (ILFDs) are used, which are essentially oscillators that are locked to the VCO frequency and output a divided frequency. However, conventional ILFDs (1) have a limited locking range, (2) are susceptible to PVT variations [13], and (3) require a high injection strength from the VCO [14]. Current Mode Logic (CML) dividers are another type of divider commonly used at high frequencies, but they consume a high static current. Consequently, these blocks that need to operate at high frequencies lead to high power consumption, reducing their suitability for low-power applications.

#### 1.2.3. Cascaded PLL structures

Cascaded PLL structures, shown in Figure 1.4, are another form of PLLs that aim to increase the bandwidth of the main PLL stage such that their lock-in range can be extended. They employ a PLL in the first stage to lock to an intermediate frequency (IF), which then serves as the reference frequency to the subsequent stage. Since the reference frequency of the second stage is higher, the loop can be stable for much higher bandwidths, leading to extended lock-in ranges. Since the high-frequency dividers are avoided in the second stage and the dividers are only used in the first stage, which has a relatively low frequency of operation, the power consumption is also reduced.



Figure 1.4: Cascaded PLL structure with a reference multiplier to achieve a high bandwidth.

However, this method has a few limitations. Firstly, the IF should be chosen high enough that the extended bandwidth covers the entire tuning range of the main PLL. Secondly, the output frequency resolution of an integer-N PLL is equal to its reference frequency, so the frequency resolution becomes coarser as the IF increases. Thirdly, there is a trade-off between the in-band phase noise and the acquisition range, because the phase noise of the first stage is amplified by a factor of $N_2^2$ and larger bandwidths lead to higher in-band phase noise. As the phase noise constraint on the first stage becomes more stringent to meet the specifications, its power consumption also increases.
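The second and third limitations can be put into numbers with a small sketch. It assumes that $N_2$ is the multiplication factor of the second stage, so that $f_{out} = N_2 \cdot f_{IF}$, and that an $N_2^2$ power gain corresponds to $20\log_{10}(N_2)$ dB. The function name and the example frequencies are hypothetical.

```python
import math


def cascaded_tradeoff(f_out_hz, f_if_hz):
    """Summarise the integer-N second stage of a cascaded PLL.

    Assumes f_out = N2 * f_if, so the output resolution equals f_if and the
    first-stage phase noise is raised by N2^2, i.e. 20*log10(N2) dB.
    """
    n2 = f_out_hz / f_if_hz
    return {
        "n2": n2,
        "resolution_hz": f_if_hz,
        "pn_gain_db": 20.0 * math.log10(n2),
    }


# Hypothetical example: 10 GHz output for three candidate IF choices.
F_OUT = 10e9
for f_if in (100e6, 500e6, 1e9):
    t = cascaded_tradeoff(F_OUT, f_if)
    print(f"IF = {f_if / 1e6:6.0f} MHz: N2 = {t['n2']:5.0f}, "
          f"resolution = {t['resolution_hz'] / 1e6:6.0f} MHz, "
          f"first-stage noise gain = {t['pn_gain_db']:.1f} dB")
```

In this simple model, raising the IF lowers $N_2$ and hence the noise gain of the second stage, but the output frequency step grows by the same factor.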

Some structures use a gear shifting mechanism wherein, during the locking period, they switch to a higher reference frequency or to a different loop so that the loop parameters can be altered to achieve a larger lock-in range and faster settling [14]. However, the lock-in range is still limited to the higher reference frequency and locking to only one frequency is possible.

In conclusion, most conventional FTLs suffer from high power consumption because of either high-frequency circuit blocks or a strict phase noise constraint. As the PLL output frequency is pushed further into the mm-Wave bands, the FTL power consumption grows sharply, which calls for a new FTL architecture that avoids any high-frequency, power-hungry blocks.

#### 1.3. Need for a low settling time

Applications like 5G New Radio (NR) and Long-Term Evolution (LTE) dynamically allocate resources to users to improve spectral efficiency and increase the system capacity. Additionally, in applications like Bluetooth Low Energy (BLE) and Global System for Mobile Communications (GSM), an effective technique called frequency hopping is employed to avoid interference [15]. The frequency hopping technique switches the carrier frequency of the signal to different channels in consecutive time intervals in a pseudo-random order, as shown in Figure 1.5. Consequently, the local oscillator should switch to the desired frequency as soon as possible so as not to degrade the latency. The time available for the local oscillator (LO) to switch from one centre frequency to another must be much less than the cyclic prefix time in the LTE standard (i.e., <5.2 $\mu$s) [16].



Figure 1.5: Frequency hopping.

Furthermore, in many communication systems, resources are shared between users by multiplexing them. In the case of time-division multiplexing (TDM), different user signals are transmitted over a single channel by multiplexing them over time. To reduce interference between users, there is a limit on the spurious emissions that users who are not transmitting (the OFF region) may impose on the user who is transmitting (the ON region) in a given time duration. To ensure low spurious emissions, the users in the OFF region are turned off or put into a low-power mode. When they start transmitting again, all components, such as power amplifiers and transceivers, are powered on and the PLLs must settle to the correct frequency [17]. This transition duration (ON to OFF or OFF to ON) is called the transient period. In the 3rd Generation Partnership Project (3GPP) 5G NR standard [18], this transient period is specified as 10 $\mu$s, as shown in Figure 1.6.



Figure 1.6: Transmit ON/OFF transient time specification for 5G NR [18]

It is not easy to switch the PLL to a different frequency in one step due to non-linearity in the oscillator gain. Since the oscillator frequency is usually controlled by switched capacitors and the frequency is inversely proportional to the square root of the tank capacitance, the gain which is dependent on the capacitance is not constant in the entire frequency range. Literature shows some methods where the locking time is reduced by improving the linearity of VCO tank capacitances [7], or by externally controlling the VCO free-running frequency to be as close to frequency code word (FCW) as possible so that it locks immediately. However, a frequency tracking loop that can operate in the background is necessary for a fast and efficient frequency switching. This project investigates the use of a speed optimization algorithm for a fast settling performance in PLLs.
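The non-constant oscillator gain mentioned above follows directly from $f = 1/(2\pi\sqrt{LC})$. The sketch below steps a switched-capacitor bank through its codes and reports the frequency change per code. The inductance, fixed capacitance, unit capacitor, and number of codes are hypothetical values picked to land roughly in a 10-12 GHz band; they are not the values of the VCO used in this thesis.

```python
import math


def lc_frequency(l_h, c_f):
    """Resonance frequency of an ideal LC tank in Hz."""
    return 1.0 / (2.0 * math.pi * math.sqrt(l_h * c_f))


def frequency_steps(l_h, c_fixed, c_unit, n_codes):
    """Frequency drop (Hz) between consecutive capacitor-bank codes.

    Code k adds k unit capacitors to the fixed tank capacitance.
    """
    freqs = [lc_frequency(l_h, c_fixed + k * c_unit) for k in range(n_codes)]
    return [freqs[k] - freqs[k + 1] for k in range(n_codes - 1)]


# Hypothetical tank (assumptions for illustration only).
L_TANK = 200e-12     # H
C_FIXED = 800e-15    # F
C_UNIT = 10e-15      # F
N_CODES = 32

steps = frequency_steps(L_TANK, C_FIXED, C_UNIT, N_CODES)
print(f"step at lowest code:  {steps[0] / 1e6:.1f} MHz per code")
print(f"step at highest code: {steps[-1] / 1e6:.1f} MHz per code")
```

Because the step per code shrinks as the total capacitance grows, a code change computed with a single fixed gain value would overshoot at one end of the range or undershoot at the other, which is one reason a one-step frequency switch is difficult.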

#### 1.4. Objective

The motivation of this project is the need for a low-power frequency tracking loop with a wide locking/acquisition range and a low acquisition time. The obvious way to reduce power consumption is to avoid blocks, like frequency dividers, that operate at the high VCO frequency. This thesis explores the possibility of using multiple sub-Nyquist samplers to accurately estimate the frequency error, of any magnitude, between the VCO frequency and the desired frequency. It targets an acquisition range of 2 GHz and a locking time of less than 5 $\mu$s to have sufficient margin from the 5G transient requirements.

#### 1.5. Thesis Organisation

An overview of this thesis report is given in this section. As an introduction to PLLs and their conventional implementation is already given in the current chapter, the following chapters delve into the concept and implementation of the proposed FTL. First, the possibility of using multiple sub-Nyquist samplers for signal frequency estimation is studied with a mathematical approach in Chapter 2. Additionally, some supporting formulae are derived to estimate the frequency bandwidth in which an unambiguous frequency estimation/reconstruction is possible for a given system of sampling frequencies. Furthermore, depending on the possible circuit limitations, the optimum number of frequency samplers and the limits on the sampling frequencies are obtained.

In Chapter 3, the block level implementation of the proposed FTL is presented and various specifications required for each analog block for a robust frequency locking are derived. Next, the circuit level implementation of each of the analog sub-blocks in the FTL is explained in detail and their post-layout performance is presented in Chapter 4. In Chapter 5, a detailed description of the RTL implementation of the FTL digital block, which houses the frequency estimation and the speed optimization algorithms, is given. Finally, the locking and power performance results of the FTL are discussed in Chapter 6 and the report ends with a conclusion drawn upon these results.

#### 1.6. Research Contributions

My research contributions through this thesis are summarized below.

- Proposed a new sub-Nyquist samplers based FTL, which can support wide-band acquisition range and small locking time requirements.
- Studied the theory of symmetric number systems to mathematically support the use of multiple sub-Nyquist frequencies for frequency estimation and also proposed a further extension of this theory to non-pairwise coprime frequencies, in Chapter 2.
- Derived the necessary mathematical formulae to calculate the dynamic range offered by a system of sampling frequencies at a frequency band of interest, in Chapter 2.
- Implemented a frequency estimation technique, with dynamic speed and accuracy optimization, explained in Chapters 3 and 5, which helps in minimizing the locking time.
- Analysed the possibility of extending this FTL to mm-Wave frequencies, in Chapter 7.

# 2

## Sub-Nyquist Sampling for frequency estimation

The process of locking a PLL to the desired frequency involves the estimation of the frequency error between the desired and the VCO frequencies, for which estimating the VCO frequency first is necessary. Sinusoidal signal frequency estimation finds importance in numerous applications like communication, radar, image analysis, power grid stability, and many more. Most of these applications expect an accurate estimation of frequency from a finite number of noisy samples. Furthermore, some applications require estimation of multiple sinusoidal signals in a spectrum, e.g. cognitive radio [8]. Most of these applications use Fast Fourier Transform (FFT) based methods for frequency estimation, which require the sampling rates to be higher than the Nyquist rate. Recent research suggests use of multiple sub-Nyquist sampling frequencies to overcome the challenges posed by high sampling frequencies [19].

This chapter first introduces the conventional method of frequency estimation using sampling and FFT. Second, the concept of modulo arithmetic is introduced and then the Chinese Remainder Theorem (CRT) which lays a foundation to frequency estimation from sub-Nyquist frequencies is presented. Thirdly, the existing literature (Symmetrical Number Systems (SNS) [20]) which proves that an unambiguous frequency estimation from sub-Nyquist frequencies is possible under some conditions, is discussed. Then the theory of SNS is further extended to apply for the scenarios of PLLs with finite tuning range and with a frequency-tracking loop. Finally, these estimations are used to choose a sampling frequency combination to be used in the frequency tracking loop for frequency estimation.

#### 2.1. Sampling and FFT

Periodogram is a process conventionally used to calculate the power spectral density of a signal, which is then analysed to estimate its frequency [21]. Consider a sinusoidal signal represented by x(t) as shown in equation 2.1.

$$x(t) = A \cdot \sin(2\pi f_0 t + \theta). \tag{2.1}$$

where  $f_0$  is the unknown signal frequency, A is the complex amplitude,  $\theta$  is the initial phase shift and t is time. Figure 2.1 represents a procedure to estimate the input signal frequency using Discrete Fourier Transform (DFT). The signal is first sampled by a sampling frequency of  $f_s$  to discretize it, and then converted to digital form by an Analog to Digital converter (ADC). Consider that N digital samples are taken. These sampled values can be given by a vector  $X_k = \{X_0, X_1, X_2, ..., X_{N-1}\}$ , whose values are

$$x[k] = X_k = A \cdot \sin\left(\frac{2\pi k f_0}{f_s} + \theta\right), \tag{2.2}$$

where x[k] is the discrete representation of the signal x(t),  $k \in \{0, 1, ...N - 1\}$ ,  $X_k$  is the real sampled value at instant  $k/f_s$ , and  $f_0$  is the down-converted frequency. By Nyquist-Shannon theorem,  $f_0$  lies between  $[0, f_s/2]$ . As explained in [22], the frequency of the signal in equation 2.1 can be uniquely identified and reconstructed by sampling it and performing DFT on it.



Figure 2.1: General block diagram of discrete frequency estimation method

Discrete Fourier Transform (DFT) converts the time domain sequence into a periodic sequence in frequency domain. By taking an N-point DFT, the frequency range from  $0$  to  $f_s$  (or, equivalently, from  $-f_s/2$  to  $f_s/2$ ) is divided into N bins. The complex coefficients for an N-point DFT are given by

$$X[n] = \sum_{k=0}^{N-1} X_k \cdot e^{-j\left(\frac{2\pi kn}{N}\right)}, \tag{2.3}$$

where  $n \in \{0, 1, ...N - 1\}$ , and n/N is the normalized representation of the frequency bin that the coefficient belongs to, and  $N^{th}$  bin is equal to the sampling frequency  $f_s$ . To find the power spectral densities, the square of the complex coefficients magnitude is calculated by taking their conjugate multiplication as shown below.  $n^{th}$  bin's power is given by

$$P[n] = \left| \frac{X[n] \cdot X^*[n]}{N^2} \right|. \tag{2.4}$$

If the signal period is an integer multiple of the sampling interval ( $T_0 = mT_s$ ), then it coincides with the centre of one of the bins and there is one maximum tone visible in the power spectral density between  $[0, f_s/2]$  that corresponds to  $f_0$ . Thus by finding the peak in half of the PSD vector P[n], the sinusoidal signal frequency can be estimated.

The limitations of this method are:

- 1. Calculation of DFT requires  $N^2$  complex multiplications which result in high computational and storage complexity.
- 2. For an unambiguous estimation of a frequency  $f_0$ , the minimum sampling frequency required is  $f_s \ge 2 f_0$ , as stated by the Nyquist theorem. This requires the ADC to operate at  $f_s$ , leading to high power consumption and reduced effective resolution. It also constrains the digital block speed and the anti-aliasing filters.
- 3. A lower sampling frequency  $f_0 < f_s \le 2 f_0$  can be used for frequency estimation. However, it then requires the generation of I (in-phase) and Q (quadrature) phases to estimate the signal's complex value  $Z_k$  instead of the real value  $X_k$ , for a single-sideband down-conversion. A range of  $[0, f_s]$  can then be utilized.
- 4. The frequency resolution with which the frequency can be estimated depends on the number of samples taken (N), which directly affects the DFT computation complexity given by  $O(N^2)$ . The Fast Fourier Transform (FFT), which is an improved algorithm, needs  $O(N \cdot \log N)$ .
- 5. If the sampled frequency's time period is not an integer multiple of sampling time period, then there will be spectral leakage in PSD leading to reduced magnitude of the tone at  $f_0$  and increased tones at other frequencies.
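
The procedure of Figure 2.1 and Eqs. (2.2) to (2.4) can be illustrated with a minimal sketch. It uses a direct O(N²) DFT from the Python standard library rather than an FFT, and the tone frequency, sampling frequency, phase and sample count are hypothetical values chosen so that the tone falls exactly on a bin (no spectral leakage). The function name `periodogram_peak` is an assumption made for this illustration only.

```python
import cmath
import math


def periodogram_peak(samples, fs):
    """Estimate a tone frequency from the peak of a direct N-point DFT."""
    n_samples = len(samples)
    powers = []
    # Only bins 0..N/2 are searched: a real signal has a mirrored spectrum.
    for n in range(n_samples // 2 + 1):
        # Eq. (2.3): X[n] = sum_k X_k * exp(-j * 2*pi*k*n / N)
        coeff = sum(
            x_k * cmath.exp(-2j * math.pi * k * n / n_samples)
            for k, x_k in enumerate(samples)
        )
        # Eq. (2.4): P[n] = |X[n] * conj(X[n])| / N^2
        powers.append(abs(coeff * coeff.conjugate()) / n_samples**2)
    peak_bin = max(range(len(powers)), key=powers.__getitem__)
    # Bin n corresponds to the frequency n * fs / N.
    return peak_bin * fs / n_samples


# Hypothetical values: a 5 Hz tone sampled at 64 Hz for exactly 64 samples.
fs, f0, n_samples = 64.0, 5.0, 64
samples = [math.sin(2 * math.pi * k * f0 / fs + 0.3) for k in range(n_samples)]
estimate = periodogram_peak(samples, fs)
print(estimate)
```

The nested loop makes limitation 1 visible: every output bin needs N complex multiplications, so the cost grows with N².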

#### 2.2. Sub-Nyquist Sampling and Frequency Estimation

According to the Nyquist-Shannon sampling theorem, to unambiguously reconstruct a signal with maximum frequency  $f_{max}$ , the sampling frequency should be at least twice the maximum frequency ( $f_s \ge 2f_{max}$ ). When a sampling frequency  $f_s$  does not meet the Nyquist criterion, a phenomenon called aliasing occurs. When an analog signal is sampled at uniform time intervals ( $T_s = \frac{1}{f_s}$ ) to make it discrete, the output is a discrete spectrum that is symmetrical about  $f_s/2$  and periodic with period  $f_s$ . This can be seen in figure 2.2. Let a real valued signal of frequency f from Eq. (2.1) be sampled by a sampling clock at time instants  $nT_s$  where n = {0,1,2,..}. The output in time domain can be given by,

$$x(nT_s) = A \cdot \sin(2\pi f \cdot nT_s) = A \cdot \sin\left(2n\pi \frac{f}{f_s}\right),$$

$$x(nT_s) = A \cdot \sin\left(2n\pi \frac{mf_s + \Delta f}{f_s}\right) = A \cdot \sin\left(2n\pi \frac{\Delta f}{f_s}\right), \tag{2.5}$$

where *m* is an integer ( $mf_s$  is the integer multiple of  $f_s$  closest to *f*, since  $f > f_s$ ) and  $\Delta f$  is the frequency difference between *f* and  $mf_s$ . Figure 2.2 shows the aliasing frequency  $f_a = \Delta f$  as a function of signal frequency  $f = f_{in}$  as obtained from Eq. (2.5). This frequency response shows that for a given output aliasing frequency  $f_a$  there are numerous possibilities of  $f_{in}$  that give the same response.
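
As a small illustration of this ambiguity, the sketch below folds an input frequency onto its alias for a single sampler. The helper name `alias_frequency` and the 500 MHz sampling frequency are assumptions made for this example; frequencies are kept as integers (in MHz) to avoid rounding effects.

```python
def alias_frequency(f_in, f_s):
    """Alias (folded) frequency of a real tone f_in sampled at f_s."""
    delta = f_in % f_s  # offset above the nearest lower multiple of f_s
    # The response is symmetrical about f_s / 2, so fold the upper half back.
    return min(delta, f_s - delta)


f_s = 500  # MHz, hypothetical sampling frequency
inputs = [70, 430, 570, 930, 1070]  # MHz, all map onto the same alias
aliases = [alias_frequency(f, f_s) for f in inputs]
print(aliases)
```

All five inputs produce the same alias, which is why a single sub-Nyquist sampler cannot identify the input frequency on its own.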



Figure 2.2: Sampling output vs input signal ( $f_s$  = sampling frequency)

In the past, several articles have been published proposing the use of multiple such sub-Nyquist frequencies whose low frequency operation is exploited to reduce the limitations on high speed digital circuits, filters and ADCs. The following sub-sections describe the mathematical theorems that support this concept.

#### 2.2.1. Chinese Remainder Theorem

Chinese Remainder Theorem (CRT) is a well known concept in cryptography [19]. Applications like sensor networks use CRT to estimate frequency when there are multiple under-sampled signal waveforms. A necessary condition is that the input signal x(t) is complex-valued rather than real-valued, so that its spectrum is single-sided and the range from  $0$  to  $f_s$  can be utilized for a sampling frequency of  $f_s$ .

To understand the CRT, a mathematical function is introduced here.

#### Modulo function:

This function returns the remainder of an integer division. It is written as,

$$a_r = L \mod m, \tag{2.6}$$

such that  $0 \le a_r < m$ , where *L* is the integer dividend, *m* is the divisor (also called modulo) and  $a_r$  is the remainder. For any integers *L*, *m* there always exists a unique pair of integers "*k*", "*a<sub>r</sub>*" which can be written as  $a_r = L \mod m$  and satisfy  $L = km + a_r$ . "*a<sub>r</sub>*" is also called a "residue".

If there are two integers b, c which are called "congruent modulo m", it is represented as,

$$b \equiv c \mod m. \tag{2.7}$$

It means that both *b*, *c* have the same remainder when divided by *m* and (*c-b*) is divisible by *m* [19]. Thus the integer L is always congruent to its remainder:  $a_r \equiv L \mod m$  or  $L \equiv a_r \mod m$ . The transfer function of a modulus function can be seen in figure 2.3, which is periodic with a period *m*.



Figure 2.3: Mod function transfer curve

#### 3.2.2.1 Modulo Arithmetic Identities

Some modulo arithmetic identities are given below which help in understanding some derivations made in further sections. For all positive integers a,b,c,d,k,m:

1. If  $a \equiv b \mod m$ , and  $c \equiv d \mod m$ , where  $0 \le b, d < m$ , then

$$a + c \equiv (b + d) \mod m. \tag{2.8}$$

$$a - c \equiv (b - d) \mod m. \tag{2.9}$$

$$a \cdot c \equiv (b \cdot d) \mod m. \tag{2.10}$$

2. If  $a \equiv b \mod m$ , and for any integer k

$$a + k \equiv (b + k) \mod m. \tag{2.11}$$

$$a - k \equiv (b - k) \mod m. \tag{2.12}$$

$$a \cdot k \equiv (b \cdot k) \mod m. \tag{2.13}$$

$$a^k \equiv (b^k) \mod m. \tag{2.14}$$

3. For any integer a,

$$(a \mod m) + ((-a) \mod m) \equiv 0 \mod m. \tag{2.15}$$
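
The identities above can be spot-checked numerically. The sketch below does so for one arbitrarily chosen set of integers (the values of `a`, `c`, `k` and `m` are hypothetical); it is a sanity check, not a proof. Python's `%` operator returns a non-negative remainder for a positive modulus, which matches the definition in Eq. (2.6). For identity 2.15 the two remainders add up to either 0 or m, so the check is made modulo m.

```python
m = 7
a, c = 23, 40
b, d = a % m, c % m  # b is congruent to a, d is congruent to c (mod m)
k = 5

checks = {
    "2.8": (a + c) % m == (b + d) % m,
    "2.9": (a - c) % m == (b - d) % m,
    "2.10": (a * c) % m == (b * d) % m,
    "2.11": (a + k) % m == (b + k) % m,
    "2.12": (a - k) % m == (b - k) % m,
    "2.13": (a * k) % m == (b * k) % m,
    "2.14": pow(a, k, m) == pow(b, k, m),
    # The two remainders sum to 0 or m, i.e. to zero modulo m.
    "2.15": (a % m + (-a) % m) % m == 0,
}
for name, ok in checks.items():
    print(name, ok)
```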

Let  $m_r = \{m_1, m_2, ...m_p\}$  be p integers (p moduli) such that all are pairwise co-prime, which means that each pair has no common factors and a greatest common divisor (GCD) of 1. Let L be any integer such that  $L < L_{max}$  where  $L_{max} = LCM(m_1, m_2, m_3, ...m_p)$ . The least common multiple (LCM) in case of coprime numbers is just their product  $(L_{max} = m_1 \cdot m_2 \cdot m_3 ... \cdot m_p)$  since they have no common factors. The integer array of remainders can be given by  $n_r = \{a_1, a_2, ...a_p\}$  where  $a_r = L \mod m_r \ \forall r \in \{1, 2...p\}$ .

#### Theorem 1:

The Chinese Remainder theorem states that there is one and only one unique value of L in  $[0, L_{max})$  such that:

$$L \mod L_{max} \to \{a_1, a_2, ... a_p\}. \tag{2.16}$$

That is, given the values of vectors  $n_r$  and  $m_r$ , there is only one value of L in  $[0, L_{max})$  that maps to the residue vector  $n_r$ , and the value of L can be unambiguously determined. If there are two values  $L_1$  and  $L_2$  that satisfy the above condition, then  $L_2 \equiv L_1 \mod L_{max}$ . Theorem 1 can also be explained in terms of congruence (equivalence or mapping): there is only one integer  $L < L_{max}$  which satisfies the simultaneous linear congruence equations.

$$L \equiv a_1 \pmod{m_1},$$
  

$$L \equiv a_2 \pmod{m_2},$$
  

$$\vdots$$
  

$$L \equiv a_p \pmod{m_p}.$$

 $[0, L_{max})$  is called the dynamic range of the system of moduli {  $m_1, m_2, ..., m_p$  }.
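
A minimal sketch of how the simultaneous congruences can be solved is given below. It uses the standard incremental construction, folding in one modulus at a time, and relies on Python's built-in `pow(x, -1, m)` (Python 3.8 or later) for the modular inverse. The function name `crt` and the example moduli {3, 5, 7} are assumptions chosen for illustration; the moduli are assumed to be pairwise co-prime and this is not checked.

```python
def crt(residues, moduli):
    """Return (L, L_max) with L in [0, L_max) and L % m_i == a_i for all i."""
    solution, product = 0, 1
    for a_i, m_i in zip(residues, moduli):
        # Find t such that solution + product * t is congruent to a_i (mod m_i).
        t = ((a_i - solution) * pow(product, -1, m_i)) % m_i
        solution += product * t
        product *= m_i  # LCM of the pairwise co-prime moduli handled so far
    return solution, product


moduli = [3, 5, 7]
value = 23
residues = [value % m for m in moduli]  # [2, 3, 2]
recovered, dynamic_range = crt(residues, moduli)
print(recovered, dynamic_range)
```

Every integer in [0, 105) is recovered uniquely from its three residues, which is the dynamic range stated by Theorem 1 for this set of moduli.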

However, this theorem cannot be directly applied to our case since, as explained in Figure 2.2, the sampling transfer curve of a real-valued signal is triangular in nature and symmetrical around  $f_s/2$  unlike the modulus function which is saw-tooth in nature. Some applications use both I and Q phases to convert the real values into complex values in which case the CRT works. However, it would involve two sampling phases per sampling frequency, ADCs in each phase and then DFT to estimate the alias frequency tones before the CRT can be applied.

#### 2.2.2. Symmetrical Number Systems

Although CRT cannot be directly applied to alias frequency transfer curves, it still establishes the concept that there exists a range of integers that can be unambiguously reconstructed given a set of residues that are periodic in nature. To alleviate the drawbacks of the CRT method, studies have been done using Symmetrical Number Systems (SNS) to reconstruct frequencies using under-sampled signals. As seen in Figure 2.2, if a DFT is taken on an under-sampled real valued sinusoidal signal, it gives out a symmetrical response which fits into a symmetrical number system [20]. The triangular response in Figure 2.2 can be expressed mathematically by Eq. (2.17), where *m* is the integer modulo equivalent to the sampling frequency and *h* is any integer equivalent to the signal frequency. Consider that  $0 \le h < m$ . The aliasing frequency or the symmetrical number system (SNS) response can be given by

$$a_{h} = \min\{h, m-h\} = \begin{cases} h \mod m, & 0 \le h \le \left\lfloor \frac{m}{2} \right\rfloor \\ -h \mod m = (m-h) \mod m, & \left\lfloor \frac{m}{2} \right\rfloor < h < m \end{cases} \tag{2.17}$$

where  $\lfloor x \rfloor$  represents the floor function. Here  $a_h$  is called a *symmetric residue* and this function is periodic with a period *m*, which is represented as below for any integer value of *n*.


$$a_{h+nm} = a_h \tag{2.18}$$

Eq. (2.17),(2.18) can also be written in congruence form as

$$a_h \equiv \begin{cases} (h+nm) \mod m, & 0 \le h \le \left\lfloor \frac{m}{2} \right\rfloor, \\ -(h+nm) \mod m, & \left\lfloor \frac{m}{2} \right\rfloor < h < m. \end{cases} \tag{2.19}$$

Intuitively, it can be seen that when compared to the modulo transfer function (Figure 2.3), the SNS transfer function (Figure 2.2) has more redundancy (since the negative frequency errors are also folded back to the positive frequency errors). This means that, when *p* co-prime moduli  $\{m_1, m_2, ..., m_p\}$  are used to resolve an integer N uniquely, the dynamic range within which this is possible may be much less compared to that found using CRT, which is  $\prod_{r=1}^{p} m_r$ . Reference [20] provides a derivation of this dynamic range based on SNS, which is explained in this sub-section.

Consider p pairwise co-prime moduli  $m_r = \{m_1, m_2, ..m_p\}$  and let  $A_h$  be a column vector such that for an integer h,

$$A_h = \begin{bmatrix} a_h^1 \\ a_h^2 \\ \vdots \\ a_h^p \end{bmatrix},$$

where  $a_h^1$  is the symmetrical residue of  $h \mod m_1$  and so on. Then the dynamic range of the moduli set  $m_r$ , is the highest integer (K-1) that can be unambiguously reconstructed since all the column vectors  $A_0, A_1, ..., A_{K-1}$  are unique. Let K be the first integer that causes ambiguity because  $A_K = A_h$  where  $0 \le h < K$ . This is possible only if  $a_K = a_h$  for all moduli  $m_r$ , which in turn is possible if and only if  $h \equiv \pm(K) \mod m_r \forall m_r \in \{m_1, m_2, ..., m_p\}$ . It can also be represented as,

$$A_h = \begin{bmatrix} a_h^1 \\ a_h^2 \\ \vdots \\ a_h^p \end{bmatrix} = A_K = \begin{bmatrix} a_K^1 \\ a_K^2 \\ \vdots \\ a_K^p \end{bmatrix}$$

**Theorem 2:** Let  $\{m_1, m_2, ..m_p\}$  be p pairwise co-prime moduli, then the dynamic range  $\hat{M}$  of this system can be given by:

1. If one of the elements in vector  $m_r$ , say  $m_1$ , is an even number, then

$$\hat{M} = \min\left\{\frac{m_1}{2}\prod_{i=2}^{j} m_i + \prod_{i=j+1}^{p} m_i\right\} \tag{2.20}$$

for all values of  $j \in [1,p-1]$  and all permutations and combinations of  $m_2, m_3, ..., m_p$ .

2. If all moduli are odd numbers, then

$$\hat{M} = \min\left\{\frac{1}{2}\prod_{i=1}^{j}m_{i} + \frac{1}{2}\prod_{i=j+1}^{p}m_{i}\right\} \tag{2.21}$$

for all values of  $j \in [1,p-1]$  and all permutations and combinations of  $m_2, m_3, ..., m_p$ .
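
The two closed-form expressions of Theorem 2 can be evaluated mechanically by enumerating every way of splitting the moduli into the two products. The sketch below is one possible reading of "all values of j and all permutations"; the function name `sns_dynamic_range` is hypothetical, and the moduli are assumed (not checked) to be pairwise co-prime.

```python
from itertools import combinations
from math import prod


def sns_dynamic_range(moduli):
    """Evaluate Eq. (2.20) or (2.21) over every split of the moduli."""
    moduli = list(moduli)
    if len(moduli) < 2:
        raise ValueError("at least two moduli are required")
    evens = [m for m in moduli if m % 2 == 0]
    if len(evens) > 1:
        raise ValueError("pairwise co-prime moduli contain at most one even value")
    candidates = []
    if evens:
        # Eq. (2.20): the even modulus m1 always stays in the first product.
        m1 = evens[0]
        rest = [m for m in moduli if m != m1]
        # At least one modulus must remain in the second product.
        for size in range(len(rest)):
            for group in combinations(rest, size):
                second = prod(rest) // prod(group)
                candidates.append((m1 // 2) * prod(group) + second)
    else:
        # Eq. (2.21): both products are odd, so their sum is even.
        total = prod(moduli)
        for size in range(1, len(moduli)):
            for group in combinations(moduli, size):
                first = prod(group)
                candidates.append((first + total // first) // 2)
    return min(candidates)


print(sns_dynamic_range([11, 15, 17]))  # all-odd case
print(sns_dynamic_range([10, 11, 13]))  # one even modulus
```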

To explain this with an example, let  $\{m_1, m_2, m_3\} = \{11, 15, 17\}$ . Since all moduli are odd, Eq. (2.21) applies, and the smallest integer that will cause ambiguity in this system is given by

$$\hat{M} = \min\left\{\frac{m_1 + m_2 \cdot m_3}{2}, \frac{m_2 + m_1 \cdot m_3}{2}, \frac{m_3 + m_2 \cdot m_1}{2}\right\} = \min\{133, 101, 91\} = 91$$

#### Proof:

Consider that the ambiguity occurs at integer K since  $A_K = A_h$  for  $0 \le h < K$ . Ambiguity occurs because each vector element  $a_K^r = a_h^r$ . This means, for each modulo  $m_r$ , (K-h) is either an integer multiple of  $m_r$  such that  $K \equiv h + n_r m_r \equiv h \equiv a_h \pmod{m_r}$ , or K folds back onto (-h mod  $m_r$ ). Let K=h+k, where  $h \ge 0$ , k > 0, then the above two conditions can be represented as below,

$$h \equiv (h+k) \pmod{m_i}, \quad 1 \le i \le j, \forall j \le p, \tag{2.22}$$

$$h \equiv -(h+k) \pmod{m_i}, \quad j+1 \le i \le p. \tag{2.23}$$

For *j* moduli the same symmetric residue  $a_h$  is observed due to periodic occurrence and for the rest p - j moduli, same  $a_h$  is seen due to folding back onto  $-a_h$ . The proof here forth can be divided into two cases, (1) where one of the moduli is even, (2) All moduli are odd. Only the derivation of one even modulo is explained since the second case also follows the same steps of derivation.

**Proof of Theorem 2.1:** Consider that  $m_1$  is an even number. For the equation (2.22) to be satisfied, *k* should be a multiple of each modulo  $m_1, m_2..m_j$  for any  $j \le p$ . Since each of these moduli are pairwise co-prime, the only way *k* is an integer multiple of all of them is when  $k \equiv 0 \mod \prod_{i=1}^{j} m_i$  i.e.,  $k = a \prod_{i=1}^{j} m_i$  where *a* is a positive integer,  $\forall a \in \{1,2,3..\}$ . Since  $m_1$  is even, *k* can be written as,

$$\frac{k}{2} = \frac{am_1}{2} \prod_{i=2}^{j} m_i. \tag{2.24}$$

For the remaining p-j moduli, considering equation 2.23, adding h on both sides and applying the modulo identity 2.15 followed by identity 2.13, equation 2.23 can be brought to the form

$$2h \equiv -k \pmod{m_i} \Longrightarrow h \equiv a_h \equiv \frac{-k}{2} \pmod{m_i}, \quad j+1 \le i \le p. \tag{2.25}$$

Since we considered that the p-j moduli are pairwise co-prime, from the CRT equation 2.16 solving the set of linear congruences, there are multiple solutions of h,k for equation 2.25, such that for any positive integer b,

$$h - \frac{-k}{2} = h + \frac{k}{2} = b \prod_{i=j+1}^{p} m_i. \tag{2.26}$$

By adding k/2 on both sides and then substituting Eq. (2.24) in (2.26), we get the below solution.

$$h + k = \frac{am_1}{2} \prod_{i=2}^{j} m_i + b \prod_{i=j+1}^{p} m_i. \tag{2.27}$$

To obtain the least integer value of h + k, make a=1,b=1 and we get,

$$h + k = \frac{m_1}{2} \prod_{i=2}^{j} m_i + \prod_{i=j+1}^{p} m_i. \tag{2.28}$$

**Proof of Theorem 2.2:** In the case of all odd moduli, the derivation follows the same steps as the previous case. The obtained estimation for ambiguity can be given by the below equation.

$$h + k = \frac{a}{2} \prod_{i=1}^{j} m_i + \frac{b}{2} \prod_{i=j+1}^{p} m_i. \tag{2.29}$$

For k to be even, both a and b should be even. Similarly, for the case of an odd k, the same solution as equation 2.29 is obtained but for odd positive integers a and b.

To obtain the least integer value of h + k, make a=1, b=1 and we get,

$$h + k = \frac{1}{2} \prod_{i=1}^{j} m_i + \frac{1}{2} \prod_{i=j+1}^{p} m_i. \tag{2.30}$$

The solution for every  $j \in \{2, ..., p\}$  and different permutations of  $m_r$  in the order 2, 3, ..., p in the equation 2.27 and 2.29 causes an ambiguity in the system, but the minimum of all solutions is the dynamic range of this system.

To summarize this sub-section, derivations of formulae to calculate the dynamic range of a system of moduli, for which an unambiguous reconstruction is possible, are studied. The two assumptions made in this derivation are that all the moduli are pairwise co-prime and that the dynamic range always starts from zero.
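
The dynamic range can also be found by exhaustive search, which gives an independent check on the closed-form expressions. The sketch below computes the symmetric-residue vector of Eq. (2.17) for h = 0, 1, 2, ... and stops at the first integer whose vector has already been seen. The function names and the moduli sets are assumptions chosen for illustration.

```python
def symmetric_residue(h, m):
    """Symmetric residue a_h of Eq. (2.17), extended periodically in h."""
    r = h % m
    return min(r, m - r)


def sns_dynamic_range_bruteforce(moduli):
    """Smallest integer whose residue vector repeats that of a smaller one."""
    seen = set()
    h = 0
    while True:
        vector = tuple(symmetric_residue(h, m) for m in moduli)
        if vector in seen:
            return h  # first ambiguous integer; [0, h) is unambiguous
        seen.add(vector)
        h += 1


for moduli in [(3, 5), (4, 5), (11, 15, 17)]:
    print(moduli, sns_dynamic_range_bruteforce(moduli))
```

The search always terminates, because the number of distinct residue vectors is finite.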

#### 2.2.3. SNS with non-co-prime moduli and shifted Dynamic Range

The derivation of dynamic range provided in [20] is very advantageous in choosing the moduli in order to cover the required frequency range. Yet, it does not answer two questions:

- 1. Can the frequency be unambiguously estimated if GCD of all the moduli is not 1, i.e., if they are not co-prime?
- 2. What if the dynamic range does not start at zero but at an arbitrary integer S? Would a similar dynamic range still hold true?

Finding answers to these questions is necessary to make a few practical decisions such as using integer-N PLLs for sampling frequency generation, which will be explained in Section 2.3.2.

#### Question 1: If $GCD(m_1, m_2, ..., m_p) > 1$ (moduli are not pairwise co-prime)

From tests conducted in MATLAB, it is observed that, if the *p* moduli  $m_r \in m_1, m_2, m_3...m_p$  have a GCD of *d* such that  $\mu_r = \frac{m_r}{d}$ , and given that  $\mu_1, \mu_2...\mu_p$  are pairwise co-prime in nature, the dynamic range of system  $m_r$  can be given by

$$\hat{M}_d = d \cdot \hat{M} \tag{2.31}$$

where  $\hat{M}$  is the dynamic range derived from equations (2.27) and (2.29) for the system with moduli  $\mu_1, \mu_2...\mu_p$ . The formula for dynamic range estimation may not work when the moduli have a GCD 1 but not all of them are pairwise co-prime.

#### Question 2: If 0 is not the starting of the dynamic range

In the previous section, the dynamic range is estimated assuming that the starting point is 0. However, in the case of an available frequency bandwidth  $[f_{min}, f_{max}]$  which does not start at DC, the dynamic range needs to be much higher than  $f_{max}$  in order to cover the bandwidth. This means choosing higher, or a larger number of, sub-Nyquist frequencies to achieve a high dynamic range. This section considers that the dynamic range starts from an integer S that is not 0 and an approximate estimate of the upper limit of dynamic range (DR) is made. This way, more moduli are not required to extend the DR but it can just be shifted from  $[0, \hat{M})$  to  $[S, \hat{M}_1)$  such that  $[f_{min}, f_{max}] \subset [S, \hat{M}_1)$ .

Consider the SNS definition in equation 2.19. Let S be an integer such that  $[S, \hat{M}-1]$  be the dynamic range of the system where  $\hat{M}$  is the first integer that has ambiguity with an integer S + h. Consider that  $\hat{M} = S + h + k$  such that  $h \ge 0, k > 0$ .

Let  $m_1, m_2, ..., m_p$  be pairwise co-prime integers and  $a_h$  be their symmetrical residue whose definition is given by equation 2.17. Then the column vectors  $A_{S+h} = A_{S+h+k}$  when there is an ambiguity and they are given by,

$$A_{S+h} = \begin{bmatrix} a_{S+h}^{1} \\ a_{S+h}^{2} \\ \vdots \\ a_{S+h}^{p} \end{bmatrix} = A_{S+h+k} = \begin{bmatrix} a_{S+h+k}^{1} \\ a_{S+h+k}^{2} \\ \vdots \\ a_{S+h+k}^{p} \end{bmatrix}$$

**Theorem 3:** Let  $m_1, m_2, ..., m_p$  be p pairwise co-prime moduli, then the upper limit of the dynamic range  $[S, \hat{M})$  of this system can be given by,

1. If one of the elements in vector  $m_r$ , say  $m_1$  is an even number

$$\hat{M} = \min\left\{\frac{m_1}{2}\prod_{i=2}^{j}m_i + \left\lceil\frac{S + \frac{1}{2}\prod_{i=1}^{j}m_i}{\prod_{i=j+1}^{p}m_i}\right\rceil\prod_{i=j+1}^{p}m_i\right\}, \tag{2.32}$$

for all values of  $j \in [1,p-1]$  and all permutations and combinations of  $m_2, m_3, ..., m_p$ .

2. If all moduli are odd numbers,

$$\hat{M} = \min\left\{\left\{\frac{1}{2}\prod_{i=1}^{j}m_{i} + \frac{1}{2}\left\lceil\frac{2S + \prod_{i=1}^{j}m_{i}}{\prod_{i=j+1}^{p}m_i}\right\rceil\prod_{i=j+1}^{p}m_{i}\right\}, \left\{\prod_{i=1}^{j}m_{i} + \left\lceil\frac{S + \frac{1}{2}\prod_{i=1}^{j}m_{i}}{\prod_{i=j+1}^{p}m_i}\right\rceil\prod_{i=j+1}^{p}m_{i}\right\}\right\}, \tag{2.33}$$

for all values of  $j \in [1,p-1]$  and all permutations and combinations of  $m_2, m_3, ..., m_p$  and odd or even value of ceil function.

Here  $\lceil x \rceil$  denotes the ceil function, which rounds a fractional number x up to the nearest integer.

#### **Proof:** Theorem 3

The derivation of this theorem is done similar to that of Theorem 2 with the only exception that the starting value of the dynamic range is *S* instead of 0. Hence here the derivation is done for only the case of all odd moduli which can similarly be extended to even moduli. To reiterate, for the response of the system to have an ambiguity at an integer S + h + k with an integer S + h, it should satisfy the condition,

$$(S+h) \equiv \pm (S+h+k) \mod m_i, \quad i \in [1,p]. \tag{2.34}$$

Consider that there are *j* moduli that get the same symmetric residue due to a periodic response and the rest of the moduli p - j see a negative symmetric residue as expressed in the below equations.

$$S+h \equiv (S+h+k) \pmod{m_i}, \quad 1 \le i \le j, \forall j \le p, \tag{2.35}$$

$$S + h \equiv -(S + h + k) \pmod{m_i}, \quad j + 1 \le i \le p. \tag{2.36}$$

For the *j* moduli in equation 2.35, the congruence condition is satisfied if and only if *k* is an integer multiple of  $\prod_{i=1}^{j} m_i \forall 1 \le j \le p - 1$ . Then the integer *k* can be given by,

$$k = a \prod_{i=1}^{j} m_i, \quad a \in \{1, 2, ...\}. \tag{2.37}$$

Equation 2.36 can be solved using the modulo arithmetic identities such that it is brought to the form,

$$(2(S+h)+k) \equiv 0 \mod m_i, \quad j+1 \le i \le p. \tag{2.38}$$

Since  $m_{j+1}, ..., m_p$  are pairwise coprime, the above equation has one and only one solution in every  $\prod_{i=j+1}^{p} m_i$ . To solve this equation, two cases can be considered.

**Case I:** k is an even number. Eq. (2.38) can be simplified as,

$$(S+h) \equiv -\frac{k}{2} \mod m_i, \quad j+1 \le i \le p. \tag{2.39}$$

Using the Chinese Remainder Theorem, the above equation has one unique solution in the range  $[0, \prod_{i=j+1}^{p} m_i)$  and for every integer multiple of  $\prod_{i=j+1}^{p} m_i$ . Let b' be the least integer that satisfies,

$$(S+h) = -\frac{k}{2} + b' \prod_{i=j+1}^{p} m_i. \tag{2.40}$$

For the condition that k > 0 to be true, a > 0 and b' should satisfy the below condition for  $h \ge 0$

$$b'\prod_{i=j+1}^{p} m_i - S - \frac{k}{2} > 0 \Longrightarrow b' > \frac{S + \frac{k}{2}}{\prod_{i=j+1}^{p} m_i}. \tag{2.41}$$

$$b' = \left\lceil\frac{S + \frac{\prod_{i=1}^{j} m_{i}}{2}}{\prod_{i=j+1}^{p} m_{i}}\right\rceil. \tag{2.42}$$

When k is an even number, let b be an even integer such that

$$S + h + k = \frac{k}{2} + \frac{b}{2} \prod_{i=j+1}^{p} m_i. \tag{2.43}$$

Substituting 2.37,

$$S + h + k = \frac{a}{2} \prod_{i=1}^{j} m_i + \frac{b}{2} \prod_{i=j+1}^{p} m_i. \tag{2.44}$$

For *k* to be even, *a* should be an even integer and  $b = 2(b' + \tilde{b})$  where  $\tilde{b} \in \{0, 1, ...\}$ .

**Case II:** k is an odd number. Let b' be the least integer that can satisfy eq. 2.38

$$(2(S+h)+k) = b' \prod_{i=j+1}^{p} m_i. \tag{2.45}$$

For the condition that k > 0 to be true, b' should satisfy the below condition (obtained by setting h = 0),

$$b' \prod_{i=j+1}^{p} m_i - 2S - k > 0 \Longrightarrow b' > \frac{2S + k}{\prod_{i=j+1}^{p} m_i}. \tag{2.46}$$

$$b' = \left\lceil\frac{2S + \prod_{i=1}^{j} m_i}{\prod_{i=j+1}^{p} m_i}\right\rceil. \tag{2.47}$$

When *k* is an odd number, let *b* be an odd positive integer such that

$$2(S+h) + 2k = k + b \prod_{i=j+1}^{p} m_i. \tag{2.48}$$

$$S + h + k = \frac{k}{2} + \frac{b}{2} \prod_{i=j+1}^{p} m_i. \tag{2.49}$$

Substituting 2.37,

$$S + h + k = \frac{a}{2} \prod_{i=1}^{j} m_i + \frac{b}{2} \prod_{i=j+1}^{p} m_i. \tag{2.50}$$

For *k* to be odd, *a* should be an odd positive integer and  $b = (b' + \tilde{b})$  should be an odd positive integer where  $\tilde{b} \in \{0, 1, ...\}$ . The Dynamic range  $\hat{M} = S + h + k$  can be given by the minimum values of the equations 2.50, 2.44 for all values of  $1 \le j \le p$  and all permutations of  $m_1, m_2, ..., m_p$ .

#### Example:

Let  $m_1 = 11$ ,  $m_2 = 15$ ,  $m_3 = 17$ . For a starting value of 0, the dynamic range is  $\hat{M} = \min\{91, 101, 133\} = 91$ , as obtained from Theorem 2. But if the range starts from an integer S = 180, i.e.,  $[180, \hat{M})$ , then equations (2.50) and (2.44) are used, with  $a_{min} = 2$  for even k and  $a_{min} = 1$  for odd k. In the odd-k terms the ceil value is raised by one whenever it is even, so that b stays odd.

$$\begin{aligned} \hat{M} = \min\Big\{ &\Big\{11 + \left\lceil\frac{180 + 11/2}{15 \times 17}\right\rceil 15 \times 17,\ 15 + \left\lceil\frac{180 + 15/2}{11 \times 17}\right\rceil 11 \times 17,\ 17 + \left\lceil\frac{180 + 17/2}{15 \times 11}\right\rceil 15 \times 11\Big\}, \\ &\Big\{\frac{11}{2} + \frac{1}{2}\left(\left\lceil\frac{2 \times 180 + 11}{15 \times 17}\right\rceil + 1\right)15 \times 17,\ \frac{15}{2} + \frac{1}{2}\left\lceil\frac{2 \times 180 + 15}{11 \times 17}\right\rceil 11 \times 17,\ \frac{17}{2} + \frac{1}{2}\left\lceil\frac{2 \times 180 + 17}{15 \times 11}\right\rceil 15 \times 11\Big\}\Big\} \\ = \min\{&\{266, 389, 347\}, \{388, 288, 256\}\} = 256 \end{aligned}$$

The minimum value 256 is the first integer that causes ambiguity, and hence the dynamic range obtained is [180, 255].

To summarize this sub-section, first, it was experimentally verified that if the moduli are derived from pairwise co-prime integers, and all the moduli together have a GCD of d, then their dynamic range is also d times higher. Secondly, an approximate formula has been derived to calculate the upper limit of the dynamic range of a system when its lower limit is not zero. It is observed that in such cases, the dynamic range is very much dependent on the moduli combination and the starting limit. When d = 1 and S = 0 are substituted, these formulae reduce to the equations in the previous subsection. These derivations will be useful when adopting these methods for frequency reconstruction in a frequency band.
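
Both observations can be reproduced with an exhaustive search that accepts an arbitrary starting integer. The sketch below is an illustration under assumed values: the function name `first_ambiguity` is hypothetical, the moduli {11, 15, 17} and the start S = 180 are example values, and {22, 30, 34} is the same set scaled by d = 2.

```python
def first_ambiguity(moduli, start=0):
    """First integer above `start` whose symmetric-residue vector repeats
    the vector of an earlier integer in [start, result)."""
    seen = set()
    value = start
    while True:
        vector = tuple(min(value % m, m - value % m) for m in moduli)
        if vector in seen:
            return value
        seen.add(vector)
        value += 1


base = first_ambiguity((11, 15, 17))               # range starting at 0
scaled = first_ambiguity((22, 30, 34))             # same moduli with GCD d = 2
shifted = first_ambiguity((11, 15, 17), start=180) # range starting at S = 180
print(base, scaled, shifted)
```

For these values the scaled set gives twice the base dynamic range, in line with Eq. (2.31), and the shifted search returns the first ambiguous integer above S rather than the range measured from zero.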

#### 2.3. SNS for Frequency estimation in PLLs

The discussion thus far was purely mathematical, explained in terms of integers, residues and moduli. However, in this section, an equivalence is drawn to the sampling system for a clear understanding. Consider that there are p samplers whose frequencies are given by a vector  $f_{si} = \{f_{s1}, f_{s2}, ..f_{sp}\}$ . Let all these sampling frequencies be derived from a reference frequency  $f_{ref}$ , by integer multiplication to an arbitrary pairwise co-prime vector  $\mu_i = \{\mu_1, \mu_2, ..., \mu_p\}$  i.e.,  $f_{si} = f_{ref} \cdot \mu_i$ ,  $\forall i \in [1, p]$ . Then,  $f_{ref}$  is the GCD of the sampling system  $f_{si}$ . Consider a frequency band of interest  $[f_{min}, f_{max}]$ , in which we desire an unambiguous frequency reconstruction. Then, the equivalence to SNS is as follows.

- Sampling frequencies  $f_{si}$  are equivalent to moduli  $m_i$ .
- $f_{min}$  is equivalent to the starting of a dynamic range *S*.
- $f_{max}$ is the last integer which can be unambiguously reconstructed, i.e., $\hat{M} - 1$.
- $f_{ref}$  is the GCD *d* of the system.

From this equivalence, and combining the equations (2.31),(2.32),(2.33), the below formula can be obtained, which gives the minimum frequency value that causes ambiguity when sampled by the system  $f_{si}$ .

$$f_{max} + 1 = \begin{cases} f_{ref} \cdot \min\left\{\frac{\mu_1}{2} \prod_{i=2}^{j} \mu_i + \left[\frac{S + 0.5 \prod_{i=1}^{j} \mu_i}{\prod_{i=j+1}^{p} \mu_i}\right] \prod_{i=j+1}^{p} \mu_i\right\}, & \mu_1 = even,\ 1 \le j \le p-1, \\ f_{ref} \cdot \min\left\{\left\{\frac{1}{2} \prod_{i=1}^{j} \mu_i + \frac{1}{2}\left[\frac{2S + \prod_{i=1}^{j} \mu_i}{\prod_{i=j+1}^{p} \mu_i}\right] \prod_{i=j+1}^{p} \mu_i\right\}, \left\{\prod_{i=1}^{j} \mu_i + \left[\frac{S + 0.5 \prod_{i=1}^{j} \mu_i}{\prod_{i=j+1}^{p} \mu_i}\right] \prod_{i=j+1}^{p} \mu_i\right\}\right\}, & \forall \mu_i = odd,\ 1 \le j \le p-1. \end{cases} \tag{2.51}$$

Therefore, by choosing proper values for  $f_{ref}$  and  $\mu_i$ , the desired frequency dynamic range  $[f_{min}, f_{max}]$  can be obtained.

The advantage of using sub-Nyquist samplers for frequency estimation is that, theoretically, the same or a similar set of samplers can be used to estimate frequencies of similar dynamic range (bandwidth), but at different absolute frequencies. For example, consider three co-prime moduli $\mu_1 = 10$, $\mu_2 = 11$, $\mu_3 = 13$, and a reference frequency of $f_{ref} = 50 MHz$, which result in the sampling frequency system of $f_{s1} = 500 MHz$, $f_{s2} = 550 MHz$, $f_{s3} = 650 MHz$. The dynamic range for different starting frequencies $f_{min}$ is shown in Table 2.1. The table shows that a similar bandwidth (dynamic range) can be obtained with the same set of sampling frequencies, even at very high frequency bands. Therefore, it is theoretically possible to estimate frequencies as high as, and much higher than, 68 GHz with the same set of sampling frequencies. Note that the possible frequency ranges depend rigidly on the sampling frequencies, and any frequency slightly lower or higher than the calculated dynamic range might result in ambiguity.

| f <sub>min</sub> | f <sub>max</sub> | Bandwidth |
|------------------|------------------|-----------|
| 0                | 3400 MHz         | 3.4 GHz   |
| 10.5 GHz         | 13.55 GHz        | 3.05 GHz  |
| 30 GHz           | 33.05 GHz        | 3.05 GHz  |
| 65.5 GHz         | 68.8 GHz         | 3.3 GHz   |

Table 2.1: Calculated dynamic range of the sampling system (500,550,650 MHz) in different frequency bands.
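
The entries of Table 2.1 can be cross-checked by brute force. The following Python sketch is an illustration and not part of the original design flow. It assumes that the alias of an integer x for a modulus m is its symmetric residue, i.e., the distance to the nearest multiple of m, it works in integer units of $f_{ref}$, and it searches upward from $f_{min}$ for the first value whose residue vector has already occurred. The function names are illustrative.

```python
def symmetric_residue(x, m):
    """Distance from x to the nearest integer multiple of m (the alias)."""
    r = x % m
    return min(r, m - r)


def first_ambiguity(moduli, start=0):
    """Smallest integer >= start whose residue vector was already seen."""
    seen = set()
    x = start
    while True:
        key = tuple(symmetric_residue(x, m) for m in moduli)
        if key in seen:
            return x
        seen.add(key)
        x += 1


F_REF_MHZ = 50
MODULI = (10, 11, 13)  # f_s = 500, 550, 650 MHz

# Starting frequencies of Table 2.1 in MHz, mapped to the first ambiguous
# frequency in MHz.
first_ambiguous_mhz = {
    f_min: F_REF_MHZ * first_ambiguity(MODULI, f_min // F_REF_MHZ)
    for f_min in (0, 10_500, 30_000, 65_500)
}
print(first_ambiguous_mhz)
```

Under these assumptions the search returns 3400 MHz, 13.55 GHz, 33.05 GHz and 68.8 GHz, which coincide with the $f_{max}$ column of Table 2.1, so the tabulated values appear to mark the first ambiguous frequency of each band.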

| Frequency range (GHz) | Sampling frequencies (MHz) |
|-----------------------|----------------------------|
| 5-10                  | 650, 725, 775              |
| 10-20                 | 263, 271, 289              |
| 18-40                 | 263, 321, 389              |

Table 2.2: Arbitrary sampling frequency combinations for common output frequency ranges.

#### 2.3.1. Limitations of using sub-Nyquist sampling and SNS

The biggest limitation of frequency reconstruction using SNS, i.e., using symmetric residues of moduli to reconstruct an integer, is that a small error in the estimation of a single residue can cause a huge error in the reconstructed integer. This calls for the highest possible accuracy in residue (alias frequency) generation, which means a perfectly monotonic and linear SNS curve similar to Figure 2.2. In reality, however, as will be explained in detail in Section 4.6, non-linearities in the circuit blocks introduce errors in the residue. This can be worked around by allowing an error margin in the digital processing where the VCO frequency is reconstructed from the residues.

#### 2.3.2. Choosing Sub-Nyquist frequency Combinations

As shown in the previous sections, an unambiguous reconstruction of the VCO frequency is possible when more than one sub-Nyquist frequency is used for sampling, thus alleviating the limitations faced by high sampling rates. The choice to be made here is the number of sub-sampling frequencies required and the combination of these frequencies. This choice depends on the following factors.

- (1) Frequency range  $[f_{min}, f_{max}]$ .
- (2) Low complexity in sampling frequency generation.
- (3) Robustness of the system.

#### Number of samplers

As the number of sampling channels increases, the area and design overhead increase. However, as the number of samplers increases, the sampling frequencies reduce, which leads to lower power consumption. As the area overhead and the design complexity weigh more heavily than the power consumption, the number of samplers needs to be as low as possible. With a single channel, the sampling frequency needs to satisfy the Nyquist criterion. Hence, starting from two samplers, to obtain a dynamic range of $f_{max} - f_{min}$, the sum of the two frequencies should be greater than the dynamic range, as seen in Eq. (2.51). For example, consider a dynamic range requirement of [0, M] and two co-prime sampling frequencies $f_{s1}, f_{s2}$. Then the first ambiguity should be greater than M, where M can be given by

$$M < \begin{cases} \frac{f_{s1}}{2} + f_{s2}, & f_{s1} = even, \\ \frac{f_{s1}}{2} + \frac{f_{s2}}{2}, & f_{s1}, f_{s2} = odd. \end{cases} \tag{2.52}$$

In case M = 2000, then the best case sampling frequencies can be 1000,1501 such that (1000/2+1501 = 2001 > 2000). But by adding another sampler, the ambiguity equation becomes,

$$M < \begin{cases} \frac{f_{s1}}{2} f_{s2} + f_{s3}, & f_{s1}, f_{s2} < f_{s3};\ f_{s1} = even, \\ \frac{f_{s1}}{2} f_{s2} + \frac{f_{s3}}{2}, & f_{s1}, f_{s2} < f_{s3};\ f_{si} = odd. \end{cases} \tag{2.53}$$

If the same case of M = 2000 is considered, $f_{s1} = 8$, $f_{s2} = 125$, $f_{s3} = 1501$ would be sufficient to achieve unambiguous reconstruction. It is evident that adding another sampler reduces at least two of the sampling frequencies by an order of magnitude in the best case. Hence, using three samplers is preferable to using two. The sampling frequencies can be further reduced by using four or more samplers, but the area overhead would outweigh the power saved. Three is believed to be an optimum point in the power-area trade-off, and hence three sampling frequencies are used in this project.
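
The two numerical examples above can be reproduced by evaluating Eq. (2.52) and Eq. (2.53) directly. The short Python sketch below only restates those two bounds; the function names are illustrative, and the three-sampler function assumes its arguments are ordered so that the first two frequencies are smaller than the third.

```python
def first_ambiguity_two(fs1, fs2):
    """Bound of Eq. (2.52) for two co-prime sampling frequencies."""
    if fs1 % 2 == 0:
        return fs1 // 2 + fs2
    return (fs1 + fs2) / 2  # both frequencies odd


def first_ambiguity_three(fs1, fs2, fs3):
    """Bound of Eq. (2.53), assuming fs1, fs2 < fs3."""
    if fs1 % 2 == 0:
        return (fs1 // 2) * fs2 + fs3
    return (fs1 * fs2 + fs3) / 2  # all frequencies odd


print(first_ambiguity_two(1000, 1501))      # 2001
print(first_ambiguity_three(8, 125, 1501))  # 2001
```

Both configurations place the first ambiguity at 2001, just above M = 2000, while the three-sampler case needs only one sampling frequency above 1000.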

#### Combination of sampling frequencies

As seen from Eq. (2.51), the combination of frequencies depends on the starting frequency $f_{min}$. Secondly, these sampling frequencies should be equal to or be derived from a set of pairwise co-prime integers.

When the sampling frequencies are generated, it is preferable to derive them from a reference clock using a PLL for a stable and low noise performance, rather than a free-running oscillator. When all the frequencies are derived from a reference clock using an integer-N PLL, they automatically have a  $GCD = f_{ref}$ , where  $f_{ref}$  is the reference clock frequency. As explained in the previous section, the dynamic range depends on the pairwise co-prime moduli, and if GCD > 1, the total dynamic range is just a multiple of the dynamic range from co-prime integers. Alternatively, a GCD of 1 can also be achieved by using a fractional-N PLL. In this section, certain limits are set on the sampling frequencies and the GCD for a robust frequency estimation.

#### Maximum sampling frequency:

The maximum sampling frequency is limited by three factors. First, as will be discussed in Section 3.3, the tolerable sampling clock jitter decreases as the sampling frequency increases, so the jitter constraint on the clock generation circuit becomes very stringent. Second, the non-linearities caused by jitter, spurs, and amplifier gain distort the aliasing transfer curve, which leads to an incorrect frequency estimation. The impact of these non-linearities, which increase with the frequency of operation, is discussed in Section 4.3. Third, as the sampling clock frequency increases, the power consumption of all the blocks operating at that frequency also increases. Keeping all these factors in mind, the maximum sampling frequency is limited to 1 GHz.

#### Minimum sampling frequency:

As will be discussed in Chapter 3, the required dynamic range in the current application is [10 GHz, 12 GHz]. To have an extra margin, a requirement of [9.3 GHz, 12.7 GHz] is assumed. To achieve such bandwidths, the sampling frequencies can be derived from very small co-prime numbers using a high GCD, or from high co-prime numbers using a small GCD. For example, the given range can be obtained by using,

**Case-I:** Co-prime numbers {11, 15, 17} with $f_{ref} = 50 MHz \Rightarrow f_{si} = \{550, 750, 850\} MHz$,

**Case-II:** Co-prime numbers {263, 271, 289} with a GCD of 1 MHz $\Rightarrow f_{si} = \{263, 271, 289\} MHz$.

The following comparisons between Case-I and Case-II are used to decide a limit on the minimum sampling frequency.

#### Integer-N vs Fractional-N PLLs:

Firstly, Case-I sampling frequencies are almost double the frequencies of Case-II. The reference frequency used is same, for a fair comparison. Having a high reference frequency is important for better oscillator noise suppression.

Since the sampling frequencies in Case-II are not integer multiples of 50 MHz, a fractional-N PLL needs to be used. Compared to integer-N PLLs, fractional-N PLLs have a higher power and area overhead because of the extra blocks that implement the fractional-N operation, such as a Multi-Modulus Divider (MMDIV), a Delta-Sigma modulator (DSM), or a Digital-to-time converter (DTC). This also increases the design complexity, which needs to be taken into consideration due to the limited time availability. Additionally, the quantization noise produced during fractional-N operation degrades the output phase noise. Furthermore, fractional-N PLLs give rise to fractional spurious tones in addition to reference spurs, which might be detrimental to a robust frequency estimation. Although Section 3.3 shows that lower sampling frequencies are allowed to have slightly higher jitter, the area, power, and design complexity penalties weigh more heavily.

**Linearity and Robustness:** For the frequencies in Case-I, due to periodicity, the same alias frequency may occur 10-20 times in the given dynamic range. In Case-II, however, it might occur 20-40 times, because of the higher periodicity. This means that less error can be tolerated in alias frequency estimation in Case-II than in Case-I. In other words, smaller errors in alias frequencies result in larger errors in VCO frequency estimation in Case-II than in Case-I.

Furthermore, as will be explained in Chapter 4, due to non-linearities in the sampling operation, the monotonic and linear region of the sampling curve is also much smaller than expected.

**Speed of Locking:** In Case-I, the alias frequencies lie within 0 to 250-500 MHz, whereas in Case-II they lie within 0-150 MHz. Hence, Case-II needs more samples than Case-I to determine all the aliasing frequencies accurately. This results in longer locking times in Case-II than in Case-I.

**Reference Frequency:** In integer-N PLLs, a higher reference frequency is desired as it allows a higher bandwidth. A higher bandwidth helps in better oscillator noise suppression, and hence a lower output jitter. The reference frequency available to this design is a 100 MHz crystal input (explained in Chapter 3). As explained in Eq. (2.51), for a given dynamic range, a higher reference frequency leads to lower co-prime factors. By trial and error, it is found that with a 100 MHz reference there are no combinations of co-prime integers such that all the derived sampling frequencies are below 1 GHz. Hence, the second-highest reference frequency that can be derived from 100 MHz, i.e., 50 MHz, is used in this project.

#### Summary

Since the robustness of the system is extremely important for a proper locking operation, lower power consumption is traded off for higher linearity, and sampling frequencies higher than 500 MHz are chosen. However, due to the sampling jitter limitation, the sampling frequencies are limited to 1 GHz. The GCD is set equal to the reference clock frequency of 50 MHz, which gives the lowest possible co-prime integers. The co-prime moduli can then be calculated using Eq. (2.50) and (2.44), such that the DR [$S_1$, $S_2$] = [9.3, 12.7] GHz / 50 MHz. Alternatively, a Perl script was used to compute the combinations of sampling frequencies for which the combinations of symmetric residues, i.e., the aliasing frequencies, are unique in the given DR.
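
The kind of search described above can be illustrated as follows. This is a minimal Python sketch and not the Perl script used in this work: it assumes that only integer multiples of $f_{ref}$ inside the DR need to be distinguished, that the multipliers lie between 10 and 20 (sampling frequencies of 0.5-1 GHz), and that the alias is the symmetric residue. All function names are hypothetical.

```python
from itertools import combinations
from math import gcd


def symmetric_residue(x, m):
    """Distance from x to the nearest integer multiple of m (the alias)."""
    r = x % m
    return min(r, m - r)


def is_unambiguous(moduli, lo, hi):
    """True if every integer in [lo, hi] has a unique residue vector."""
    keys = {tuple(symmetric_residue(x, m) for m in moduli)
            for x in range(lo, hi + 1)}
    return len(keys) == hi - lo + 1


def find_combinations(mu_min, mu_max, lo, hi, count=3):
    """Pairwise co-prime multiplier sets that are unambiguous on [lo, hi]."""
    found = []
    for combo in combinations(range(mu_min, mu_max + 1), count):
        coprime = all(gcd(a, b) == 1 for a, b in combinations(combo, 2))
        if coprime and is_unambiguous(combo, lo, hi):
            found.append(combo)
    return found


F_REF_MHZ = 50
# DR [9.3, 12.7] GHz expressed in units of f_ref.
candidates = find_combinations(10, 20, 9300 // F_REF_MHZ, 12700 // F_REF_MHZ)
for combo in candidates:
    print([F_REF_MHZ * mu for mu in combo])  # sampling frequencies in MHz
```

Under these assumptions the multiplier set {11, 15, 17}, i.e., {550, 750, 850} MHz, is among the combinations returned.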

## **3. Block Level Implementation**

The main functions of an FTL can be divided into three parts - (1) Estimation of the VCO frequency  $(f_{vco})$ , (2) computation of the error  $F_{err}$  between the VCO frequency  $f_{vco}$  and the desired frequency input code word  $F_{desired}$ , and finally (3) application of an appropriate control signal to update the VCO frequency to reduce this error. VCO frequency estimation, which is the most crucial function of an FTL, is implemented using a combination of analog and digital sub-blocks. The functions (2) and (3) can be easily performed in the digital domain. This chapter briefly discusses the overall block-level implementation of the FTL, starting with the VCO frequency estimation step in the first section. Further in the chapter, specifications for each of the analog sub-blocks are derived, which ensure the robust functioning of the FTL.

#### 3.1. Frequency Estimation

This thesis explores the idea of using multiple sub-Nyquist samplers for frequency estimation. These sub-Nyquist samplers down-sample the VCO output signal to produce aliased signals. The first step of VCO clock frequency estimation is the estimation of these different alias frequencies. The estimation of an alias frequency can be divided into two steps.

- (1) Generation of the aliasing signal.
- (2) Processing the aliased signal in the analog and digital domain to estimate the frequency.

The block-level implementation of these operations is shown in Figures 3.1 and 3.2. The aliased signal can be generated by performing a sampling operation on the VCO output of frequency $f_{vco}$ using a sampling signal of frequency $f_s$. The sampling clock $f_s$ is derived from a reference clock by multiplying it with an integer N using a frequency multiplier, i.e., $f_s = N f_{ref}$. An integer multiplication factor is used because an integer-N multiplier is simple to implement. A fractional multiplication factor can also be used, but it would introduce excessive phase noise due to the additional quantization noise and fractional spurious tones caused by the fractional operation. Besides, the design complexity also increases.

Conventionally, frequency estimation of such alias signals is done using an FFT-based approach [22]. These methods typically involve discretization of the aliasing signal, converting it into digital form using an ADC, and then performing a DFT/FFT or a similar algorithm to estimate the frequency. In this project, instead of discretization, the continuous signal is processed to obtain the alias frequency estimate in a simple manner, using rail-to-rail amplifiers and counters. As Figure 3.1 suggests, the continuous-time alias signal is amplified close to rail-to-rail by the amplifier, and then converted into digital pulses using a Schmitt trigger. This digital signal ($V_{dig}$) is then sent to the digital block for processing. The Schmitt trigger is mainly used to suppress any false pulses caused by the noise in the system.



Figure 3.1: Alias signal generation and conversion into digital pulses.

 $V_{dig}$  toggles at an average frequency equal to the alias frequency. In the digital domain, this signal is processed to estimate the absolute value of the alias frequency. Figure 3.2 shows a basic digital implementation of a circuit that takes a single alias signal information and estimates the VCO frequency. It is a three-step process.

#### Step 1:

Counting the number of pulses *N* (or rising edges) of a digital signal of frequency  $f_{dig}$  in a time period  $T_p$  (observation time). This is done by using a combination of a counter and a differentiator as shown in Figure 3.2. The counter up-counts at every posedge of  $V_{dig}$ , thereby storing the number of pulses that have occurred starting from time 0. The differentiator calculates the difference between the counter output at the 1<sup>st</sup> and the n<sup>th</sup> clock edges of FTL clock  $clk_{ftl}$ , which results in the effective number of pulses counted in  $T_p$ . One observation time period  $T_p$  consists of n clock periods of  $clk_{ftl}$ .

#### Step 2:

Calculating the absolute value of the alias frequency $f_a$ by scaling the count N by $\frac{1}{T_p}$ (i.e., $f_a = \frac{N}{T_p}$).

#### Step 3:

Calculating the VCO frequency  $f_{vco}$  from  $f_a$  using the equation

$$f_{vco} = m \cdot f_s + s \cdot f_a, \tag{3.1}$$

where *m* is an integer such that  $m \cdot f_s$  is the integer multiple of  $f_s$  closest to  $f_{vco}$ , and *s* is the sign of the alias frequency given by  $s = sign(f_{vco} - m \cdot f_s)$ .

The last step of the process explained applies only for the case of a single sampling frequency based frequency estimation and it requires the knowledge of m and s beforehand. The estimation of frequency using three aliasing frequencies is explained thoroughly in Chapter 5.



Figure 3.2: Alias signal processing in the digital domain to estimate VCO frequency for a single sub-Nyquist sampler.
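
The three steps for a single sampler can be sketched in a few lines of Python. This is only an illustration under idealized assumptions: the pulse count is taken as the exact number of alias periods in $T_p$, m and s are assumed to be known beforehand, and the operating point (a 10.03 GHz VCO, a 550 MHz sampling clock, a 50 MHz $clk_{ftl}$ and n = 50) is hypothetical.

```python
def alias_parameters(f_vco_hz, f_s_hz):
    """Return (m, s, f_a) such that f_vco = m*f_s + s*f_a, as in Eq. (3.1)."""
    m = (2 * f_vco_hz + f_s_hz) // (2 * f_s_hz)  # nearest multiple of f_s
    diff = f_vco_hz - m * f_s_hz
    s = 1 if diff >= 0 else -1
    return m, s, abs(diff)


def estimate_vco(count, n_periods, f_clk_hz, f_s_hz, m, s):
    """Steps 2 and 3: scale the pulse count by 1/T_p, then apply Eq. (3.1)."""
    f_a = count * f_clk_hz / n_periods  # T_p = n_periods / f_clk
    return m * f_s_hz + s * f_a


# Hypothetical operating point, all frequencies in Hz.
f_vco, f_s = 10_030_000_000, 550_000_000
f_clk, n = 50_000_000, 50  # T_p = 1 us
m, s, f_a = alias_parameters(f_vco, f_s)
count = f_a * n // f_clk   # Step 1: ideal number of pulses within T_p
f_est = estimate_vco(count, n, f_clk, f_s, m, s)
print(m, s, count, f_est)
```

Because the count is an integer, the alias frequency is resolved in steps of $1/T_p$ (1 MHz in this example), so a longer observation time gives a finer estimate at the cost of a slower update.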

#### 3.2. Top level design

Figure 3.3 shows the complete top-level block diagram of the design. The figure shows a sub-sampling PLL assisted by the proposed frequency tracking loop for frequency acquisition. The main PLL consists of a high gain Dynamic-Amplifier-based Phase Detector (DAPD) which directly samples the VCO output, an operational transconductance amplifier (OTA) which converts the voltage from the PD to current, a low pass filter to convert the OTA output into the VCO control voltage ($V_c$), and an L-C VCO. This PLL takes input from a 100 MHz crystal oscillator. A reference buffer is used to convert the sinusoidal reference clock into a square wave. Then, a pulse generator is used to generate a PD reference clock with the required duty cycle $D_{ref}$, which is used for VCO sampling in the PD. The VCO output $V_{vco}$ (which is a differential signal in reality) is then amplified by an output buffer and given to a divide-by-4 block to divide the frequency to a lower value that can be observed by a phase noise analyzer. This PLL, designed by Jiang Gong and presented in [23], has been taken as a basis for the design of the proposed frequency-tracking loop. The expected frequency range of the VCO is 9.8-12.2 GHz. The PLL has a reference spur of -78 dBc and an in-band phase noise of -129 dBc/Hz at an offset of 1 MHz. The obtained lock-in range of the main PLL is 7 MHz, i.e., up to 3.5 MHz on each side of the desired frequency. The proposed FTL is expected to extend this lock-in range to >2.4 GHz so that it can cover the entire tuning range of the PLL.



Figure 3.3: Top level block diagram of an SSPLL with proposed FTL for a wider locking range.

The frequency-tracking loop is highlighted in blue in Figure 3.3. The functions of the FTL can be divided into the following tasks:

- 1. Detecting if the PLL is unlocked.
- 2. Generating the alias pulses  $\{V_{dig1}, V_{dig2}, V_{dig3}\}$ .
- 3. Estimation of current VCO frequency  $f_{vco}$  from the alias pulses.
- 4. Finding the frequency error between the VCO frequency and the desired frequency  $f_{desired}$ . Here  $f_{desired}$  is a digital input to the PLL.
- 5. Apply an equivalent control code to the VCO to change its frequency to the desired value.

The unlock detection circuit is used to detect any disturbances caused in the loop because of frequency unlock. The output of this circuit is zero when no disturbance is present. However, when the PLL is unlocked, the frequency modulation occurring in the VCO manifests as ripples on the VCO control voltage, which are amplified into digital pulses. The presence of these pulses indicates unlock, and a signal is sent to the digital block to switch control to the FTL.

As decided in the previous chapter, there are three sampling frequencies which are all multiples of 50 MHz. Hence, there will be three slices of alias signal generation blocks. Three digital signals $V_{dig1}$, $V_{dig2}$, $V_{dig3}$ representing the three aliasing frequencies are given as input to the digital blocks for processing. These analog blocks are explained in detail in Chapter 4.

The frequency estimation and the following tasks are performed in the digital domain, represented by the digital block in Figure 3.3. The digital processing is divided into alias frequency estimation $(f_{a1}, f_{a2}, f_{a3})$ followed by VCO frequency estimation, and frequency error calculation. Then, depending on the frequency error, a control code is sent to the VCO to update its frequency. The FTL is enabled when a "**PLL Unlock**" signal is received from the unlock-detect block in the main PLL. Additionally, there is a coarse lock detection digital sub-block whose function is to observe $f_{err}$ and turn off the FTL when the frequency error falls below 3 MHz (main PLL lock-in range) so that the main PLL can take over. It can be treated as a dead-zone which ensures that two different loops do not control the VCO at the same time. Furthermore, a speed and accuracy optimization block is introduced in the feedback so that the accuracy of the $f_{vco}$ estimation can be dynamically changed for an optimal locking time. All of these blocks are discussed in detail in Chapter 5.



Figure 3.4: Block level representation of tasks performed in the FTL digital block.

Figure 3.4 shows that there are 6 inputs to the digital block. To be precise, the FTL analog block has 6 output signals, 2 corresponding to each sampling path. This is done as a workaround to address the non-linearities in the analog blocks, as will be explained in Section 4.3.
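
The hand-over between the FTL and the main PLL described in this section can be pictured with a small behavioural model. The Python sketch below is purely illustrative and does not represent the implemented digital block: it assumes an exact frequency estimate, an ideal VCO whose frequency moves by a fixed, hypothetical 1 MHz per control LSB, and the 3 MHz coarse-lock threshold quoted above.

```python
LOCK_IN_HZ = 3e6  # coarse-lock threshold quoted in the text


def run_ftl(f_vco_hz, f_desired_hz, hz_per_lsb=1e6, max_steps=20):
    """Iterate estimate -> error -> control update until coarse lock.

    Assumptions: the frequency estimate is exact and each control LSB
    shifts the VCO by hz_per_lsb (a hypothetical, perfectly linear VCO).
    """
    for step in range(max_steps):
        f_err = f_desired_hz - f_vco_hz         # frequency error
        if abs(f_err) < LOCK_IN_HZ:             # coarse lock: FTL hands over
            return f_vco_hz, step
        code_delta = round(f_err / hz_per_lsb)  # control code update
        f_vco_hz += code_delta * hz_per_lsb
    return f_vco_hz, max_steps


f_final, steps = run_ftl(f_vco_hz=9.8e9, f_desired_hz=11.2345e9)
print(abs(11.2345e9 - f_final) < LOCK_IN_HZ, steps)
```

With an ideal estimate and a linear VCO the loop settles inside the dead-zone after a single update; estimation errors and VCO non-linearity, which this model ignores, are what make several iterations necessary in practice.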

#### 3.3. Specifications

As explained in Chapter 2, the property that allows an unambiguous reconstruction of the VCO frequency is that the combination of all the alias frequencies is unique for each frequency in the bandwidth of interest. However, even a slight error in any one of the alias frequencies may result in large errors in the estimated frequency. Hence, the digital algorithm used to estimate the VCO frequency requires a very high accuracy in the alias frequency generation, as it needs a precise count value to correctly calculate the VCO frequency. Thus, it poses stringent constraints on the performance of each of the analog blocks used in alias frequency generation. To successfully implement the frequency estimation block, the following attributes of analog blocks play a significant role.

- 1. RMS jitter of the sampling clock.
- 2. Amplifier gain and bandwidth.

In this section, the impact of these attributes on FTL functioning is estimated and their specifications are derived. A few other performance attributes, such as the main VCO spur performance under the additional sampling operations, are also constrained so that the existing PLL performance is not degraded.

#### 3.3.1. Sampling clock jitter

The impact of sampling clock jitter on frequency estimation can be simplified and understood from Figure 3.5. Consider that the VCO signal is being sampled at a frequency $f_s$ and the output is quantized to either 0 or 1 depending on the sign of $V_s$, where $V_s$ is the sampled value. In an ideal case, at instant $T_s$, the sampled value is positive leading to a quantized value of 1. If the sampling clock has jitter (represented by gray dotted lines), there is a limit on the amount of jitter the sampling clock can have before sampling results in a wrong decision, i.e., a negative value or 0 quantized output. This limit on jitter can be derived as below.



Figure 3.5: Impact of clock jitter in a sampling operation.

$$V_s = \sin(2\pi (f_{vco} + \Delta f)T_s) = \sin\left(2\pi \frac{f_{vco}}{f_s} + 2\pi \frac{\Delta f}{f_s}\right) \xrightarrow{(f_{vco}/f_s = integer)} \sin\left(2\pi \frac{\Delta f}{f_s}\right), \tag{3.2}$$

where  $V_s$  is the sampled voltage at instant  $T_s$ ,  $T_s = 1/f_s$ ,  $\Delta f$  is the frequency error or the aliasing frequency ( $|f_{vco} - k * f_s|$ ), and  $f_{vco}$  is the VCO frequency. As seen from this equation,  $V_s$  will have a positive value if  $\Delta f$  is positive and vice-versa.

The above equation can be modified as shown in equation (3.3) to include sampling clock jitter. The sampled voltage can then be given by

$$\begin{aligned} V_{s} &= \sin(2\pi(f_{vco} + \Delta f)(T_{s} - t_{j})) = \sin\left(2\pi \frac{f_{vco}}{f_{s}} + 2\pi \frac{\Delta f}{f_{s}} - 2\pi(f_{vco} + \Delta f)t_{j}\right) \\ &\xrightarrow{(f_{vco}/f_{s} = integer)} \sin\left(2\pi \frac{\Delta f}{f_{s}} - 2\pi(f_{vco} + \Delta f)t_{j}\right), \end{aligned} \tag{3.3}$$

where $t_j$ is the sampling clock jitter. There is a maximum jitter that can be allowed in Eq. (3.3), such that $V_s$ still stays positive. Hence, for $\Delta f \ll f_{vco}$, we have

$$t_j < \frac{\Delta f}{f_s} \frac{1}{f_{vco}}.\tag{3.4}$$

From Eq. (3.4), it can be deduced that the allowed jitter is directly proportional to the error/aliasing frequency, and hence low aliasing frequency signals are the most affected by jitter. Jitter tolerance is also inversely proportional to the signal and the sampling clock frequencies, and hence less jitter can be tolerated at a high sampling frequency. The jitter tolerances for different $\Delta f$ and $f_s$ values are summarised in Table 3.1. Assuming that the maximum allowed jitter calculated using Eq. (3.4) is the $5\sigma$ value, the target integrated (RMS) jitter of the sampling clock is the corresponding $1\sigma$ value, i.e., one fifth of that limit. At low $\Delta f$, the jitter tolerance is as low as 100-200 fs, which is not a trivial specification to achieve. Interestingly, however, the jitter tolerance at higher $\Delta f$, close to $f_s/4$, is much higher, showing that these frequencies are much less affected by jitter. To relieve the multiplier of a very strict jitter constraint, a jitter of about 2 ps at a sampling frequency of $f_s = 1 GHz$ is targeted, which is an achievable performance. The incorrect pulses caused by jitter at $\Delta f$ close to 0 or $f_s/2$ should be taken care of in the digital domain, which will be discussed in Section 5.1.

| Aliasing frequency ( $\Delta f$ ) | Sampling frequency ( $f_s$ ) | Signal frequency ( $f_{vco}$ ) | $t_j(5\sigma)$ | $t_j(1\sigma)$ |
|-----------------------------------|------------------------------|--------------------------------|----------------|----------------|
| 5 MHz                             | 1 GHz                        | 10 GHz                         | 500 fs         | 100 fs         |
| 5 MHz                             | 0.5 GHz                      | 10 GHz                         | 1 ps           | 200 fs         |
| 5 MHz                             | 0.25 GHz                     | 10 GHz                         | 2 ps           | 400 fs         |
| 100 MHz                           | 1 GHz                        | 10 GHz                         | 10 ps          | 2 ps           |
| 100 MHz                           | 0.5 GHz                      | 10 GHz                         | 20 ps          | 4 ps           |
| 70 MHz                            | 0.25 GHz                     | 10 GHz                         | 28 ps          | 6 ps           |

Table 3.1: Jitter tolerance at different  $\Delta f$  and  $f_s$  values.
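
The entries of Table 3.1 follow directly from Eq. (3.4). As a quick check, the Python sketch below evaluates the limit for each row and divides it by five to obtain the $1\sigma$ target; the function name is illustrative.

```python
def jitter_limit(delta_f_hz, f_s_hz, f_vco_hz):
    """Maximum tolerable timing error from Eq. (3.4), in seconds."""
    return delta_f_hz / (f_s_hz * f_vco_hz)


# (aliasing frequency, sampling frequency) pairs of Table 3.1, in Hz.
rows = [(5e6, 1e9), (5e6, 0.5e9), (5e6, 0.25e9),
        (100e6, 1e9), (100e6, 0.5e9), (70e6, 0.25e9)]
for delta_f, f_s in rows:
    t_5sigma = jitter_limit(delta_f, f_s, 10e9)
    print(f"{delta_f / 1e6:5.0f} MHz  {f_s / 1e9:4.2f} GHz  "
          f"{t_5sigma * 1e12:6.2f} ps  {t_5sigma / 5 * 1e12:5.2f} ps")
```

The last row evaluates to 28 ps and 5.6 ps, which Table 3.1 rounds to 6 ps.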

#### 3.3.2. Deterministic Jitter of Sampling clock

Similar to integrated jitter, excessive deterministic jitter results in a wrong pulse count. Deterministic jitter is a result of the reference spur in frequency synthesizers. It can be given by

$$j_{spur,peak} = \frac{10^{\frac{spur}{20}}}{\pi f_s},\tag{3.5}$$

where *spur* is the reference spurious tone in dBc and $f_s$ is the output frequency [24]. This spur is observed at an offset frequency of $f_{ref}$, which is the reference frequency of the low-frequency PLLs in the FTL. From the above equation, it is evident that for a given spur level, a lower output frequency gives a higher deterministic jitter. To limit the deterministic jitter to the same level as the integrated jitter, the deterministic jitter is targeted to be below 2 ps, as calculated in the previous section. The reference spur should then be better than -50 dBc for a 500 MHz sampling frequency.
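
The -50 dBc figure can be checked by evaluating Eq. (3.5) in both directions. The Python sketch below is a plain restatement of that equation; the function names are illustrative.

```python
import math


def deterministic_jitter(spur_dbc, f_s_hz):
    """Peak deterministic jitter from Eq. (3.5), in seconds."""
    return 10 ** (spur_dbc / 20) / (math.pi * f_s_hz)


def required_spur_dbc(jitter_s, f_s_hz):
    """Eq. (3.5) solved for the spur level that gives a target peak jitter."""
    return 20 * math.log10(jitter_s * math.pi * f_s_hz)


print(deterministic_jitter(-50, 500e6))  # about 2.0e-12 s
print(required_spur_dbc(2e-12, 500e6))   # about -50.1 dBc
```

At 500 MHz, a -50 dBc spur corresponds to roughly 2 ps of peak deterministic jitter, in line with the target stated above.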

#### 3.3.3. VCO spur performance

Sampling operation can lead to a modulation of the load capacitance seen by the VCO tank. Figure 3.6 shows the act of sampling which connects and disconnects a sampling capacitance to the VCO tank periodically. The frequency of oscillation of a VCO in normal conditions is given by

$$\frac{1}{2\pi\sqrt{L_{tank}C_{tank}}}.$$

However, during the ON time of the switch, an extra capacitance  $C_{mod}$  is added to the tank and then the frequency of oscillation changes to

$$\frac{1}{2\pi\sqrt{L_{tank}(C_{tank}+C_{mod})}},$$

for the duration of  $D \cdot T_s$ , where D is the duty cycle of ON time of the sampling clock. This modulation of capacitance causes a modulation of frequency between the ON and OFF times of the sampling switch which manifests as a reference/sampling spur. If a transmission gate or CMOS switch is used for sampling, the sampling capacitance acts as the modulation capacitance. Additionally, charge injection from the sampling switch and charge sharing between  $C_{tank}$  and  $C_{mod}$  can also lead to spurs. In case a gated buffer is used between the switch and the VCO for isolation, because of its huge size, the difference between its ON and OFF parasitic capacitance can be the cause of this spur.



Figure 3.6: Capacitance modulation causing spur on the VCO.

The spur caused due to this capacitance modulation can be given by

$$Spur = \sin(\pi D) \frac{N}{2\pi} \frac{C_{mod}}{C_{tank}}, \tag{3.6}$$

where  $N = f_{vco} \cdot T_s$  [5]. The main PLL that is taken as a base in this thesis has a good reference spur performance of -78 dBc. To avoid any degradation of the PLL performance, the spur due to the additional sampling circuits in the proposed FTL is expected to be <-80 dBc.
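
To get a feeling for what the -80 dBc target implies, Eq. (3.6) can be solved for the capacitance ratio. The Python sketch below is an illustration only: it assumes that Eq. (3.6) gives the spur as a linear amplitude ratio, so that a level in dBc converts as $10^{spur/20}$, and it uses a hypothetical operating point of a 10 GHz VCO sampled at 1 GHz with a 50% duty cycle.

```python
import math


def max_cap_ratio(spur_dbc, f_vco_hz, f_s_hz, duty):
    """Largest C_mod/C_tank allowed by Eq. (3.6) for a given spur level."""
    n = f_vco_hz / f_s_hz                # N = f_vco * T_s
    spur_linear = 10 ** (spur_dbc / 20)  # assumption: amplitude ratio
    return spur_linear * 2 * math.pi / (n * math.sin(math.pi * duty))


ratio = max_cap_ratio(-80, f_vco_hz=10e9, f_s_hz=1e9, duty=0.5)
print(ratio)
```

Under these assumptions the modulated capacitance has to stay below roughly 0.006% of the tank capacitance, which indicates how well the sampler must be isolated from the VCO tank.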

#### 3.3.4. Amplifier specifications

The amplifier is used to convert the analog aliasing signal to digital pulses so that it carries the accurate frequency information. An amplifier with band-pass characteristics is suitable for this application as it avoids any low-frequency noise or signal distortion. The bandwidth requirements are dependent on two factors.

- 1. The minimum frequency error between the desired and the VCO frequency that needs to be accurately estimated sets the lower limit of the amplifier bandwidth. This depends on the main PLL bandwidth: the lock-in range to be precise. The FTL needs to bring the frequency error under the lock-in range limit such that the main PLL is able to lock instantaneously. The lock-in range of the PLL considered in this case is 7 MHz i.e., 3.5 MHz on both sides of the desired frequency. Therefore, the amplifier needs to amplify alias frequencies as low as 3 MHz for sufficient error resolution.
- 2. The maximum alias frequency that is produced by the sampling operation, which is  $\frac{f_s}{2}$ , sets the upper limit of the amplifier bandwidth, where  $f_s$  is the sampling frequency. To have a flexibility in choosing different sampling frequencies, the amplifier is required to have a tunable upper limit of the bandwidth.

#### 3.3.5. Frequency Multiplier range

As explained in Chapter 2, the desired sampling frequency combinations, that allow an unambiguous frequency reconstruction, are all expected to lie within the range of 500 MHz-1000 MHz. The frequency multiplier should have a wide frequency tuning range, in order to have flexibility in choosing the sampling frequencies. To be able to cover this frequency range in all PVT conditions, the frequency multiplier should be able to support a frequency range of 400 MHz to 1200 MHz.

## 4. Analog Block Design

This chapter discusses in detail the choice of architecture, the circuit design and post layout performances of the reference divider, frequency multiplier, sampler, and the amplifier which constitute the analog section of the FTL. Figure 4.1 represents a single slice of an alias signal generation circuit.



Figure 4.1: Block-level representation of analog alias signal generation in a single sampling path.

#### 4.1. Frequency Multiplier

As explained in Chapter 3, the first step of frequency estimation is alias frequency generation, in which the VCO signal is sampled using three chosen sampling frequencies. The sampling clocks are derived from a 50 MHz clock using an integer-N frequency synthesizer. This 50 MHz reference clock is obtained by dividing the crystal input frequency (which also serves as the reference clock of the main PLL) by a factor of 2. As derived in Section 3.3, there is a constraint on the amount of jitter that can be tolerated on the sampling clock. It is also explained in Section 2.3.2 that the sampling frequencies should lie within 0.5-1 GHz, leading to a multiplication factor of 10-20. A frequency synthesizer is a preferable solution to comply with the jitter requirement and the required multiplication factor.

#### 4.1.1. Ring oscillator based type-I PLL

LC oscillators and ring oscillators (ROs) are the two choices of oscillators available for frequency synthesis. Ring oscillators are known to have a phase noise performance which is up to 20 dB worse than that of LC oscillators. Since the jitter requirement on the sampling clock is in the order of picoseconds, which is not very stringent, a ring oscillator can be chosen over an LC-based VCO because it has a clear advantage in area [25]. An RO is also advantageous because it contains no inductor, which avoids any unwanted magnetic coupling with the main VCO inductor.

As specified in section 3.3, the chosen sampling frequencies of this FTL should lie within a 0.5-1 GHz range, making the frequency tuning range a wide-band requirement. A ring oscillator allows a wider tuning range than LC oscillators, making it a perfect candidate for this application. In this subsection, the circuit design choices and post layout performances of a ring-oscillator-based frequency multiplier are explained.



Figure 4.2: Basic Single Ended N-stage RO structure.

#### **Ring Oscillator Core**

A ring oscillator core can be made up of N stages of single-ended (SE), fully differential (FD), or pseudo-differential inverting delay cells. Single-ended ring oscillators have a good trade-off between power and phase noise, and can consume less power since they have lower static power losses (unlike differential structures that use current mode logic (CML)), but they are susceptible to supply pushing and to degradation of phase noise due to parasitic coupling. Fully differential structures are immune to supply voltage noise, but have a poor trade-off between power and phase noise, i.e., they have worse phase noise than SE ROs for the same power consumption [26]. Pseudo-differential structures have similarities with both single-ended and fully differential structures. Like FD structures, they reject common-mode interference from other blocks and avoid capacitive coupling between different nodes, resulting in a better phase noise. However, they are still prone to supply pushing like SE structures. Since in this application the phase noise requirement is quite relaxed and low power consumption is one of the main criteria of this design, a single-ended ring oscillator core is chosen.

The free running frequency of an SE RO shown in Figure 4.2 can be given by

$$f_{free} = \frac{1}{2Nt_d} = \frac{\mu_{eff} W_{eff} C_{ox} (V_{DD}/2 - V_T)}{8\eta V_{DD} C_{node} NL},$$
(4.1)

where  $t_d$  is the inverter stage propagation delay,  $\mu_{eff}$ ,  $C_{ox}$ ,  $W_{eff}$ , L are the electron/hole mobility, gate oxide capacitance, effective Width (sum of PMOS and NMOS widths) and length of the inverter transistors,  $V_{DD}$  is the supply voltage, N is the number of inverter stages,  $\eta$  is a constant and  $C_{node}$  is the node capacitance on each stage [26].  $f_{free}$  can usually be controlled by the number of stages, supply voltage, node capacitance, and size of the inverters. The device ratio (W/L) and the number of stages N are usually used to set the centre frequency.
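The first equality of Eq. (4.1) already gives a feel for the stage delays involved. The sketch below evaluates $f_{free} = 1/(2Nt_d)$ for a three-stage ring (the stage count adopted later in this section) at the edges of the 0.4-1.2 GHz range from Section 3.3.5. It is a back-of-the-envelope calculation only and ignores the device-level expression on the right-hand side of Eq. (4.1).

```python
def free_running_frequency(n_stages, t_d):
    """First equality of Eq. (4.1): f_free = 1 / (2 * N * t_d)."""
    return 1.0 / (2 * n_stages * t_d)

def required_stage_delay(n_stages, f_target):
    """Per-stage propagation delay needed to oscillate at f_target."""
    return 1.0 / (2 * n_stages * f_target)

for f_target in (0.4e9, 1.0e9, 1.2e9):
    t_d = required_stage_delay(3, f_target)
    print(f"{f_target / 1e9:.1f} GHz -> t_d = {t_d * 1e12:.1f} ps per stage")
```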

Phase noise (PN) of a free running SE RO, which is an important performance metric, can be given by the Eq. (4.2), when seen at a frequency offset of  $\Delta f$  from the oscillation frequency of  $f_{osc}$ .

$$L(\Delta f) = \frac{16\gamma}{3\eta} \cdot \frac{kT}{P} \cdot (\frac{f_{osc}}{\Delta f})^2, \qquad (4.2)$$

where $\gamma$ is the transistor noise factor, kT is the thermal energy and P is the DC power consumption [26]. The power consumption of an N-stage SE RO is given by [26]

$$P = NC_{node}V_{DD}^2 f_{osc}.$$
 (4.3)

The number of stages is kept at the minimum of 3 in this design to keep the power consumption as low as possible; additionally, a lower N means fewer noise sources. As seen in Eq. (4.2), there is a direct trade-off between power and PN for a given frequency of oscillation. Hence, the ring oscillator core inverters are sized to achieve the required jitter of 2 ps @ 1 GHz, as per the jitter specifications, at the cost of increased power consumption. It can also be noted that the phase noise increases quadratically with the frequency of oscillation for a given power consumption.
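The scaling stated by Eq. (4.2) can be checked numerically. In the sketch below, $\gamma$ and $\eta$ are set to placeholder values of 1 and the power level is hypothetical, so only the differences between the printed numbers are meaningful, not their absolute values.

```python
import math

K_BOLTZMANN = 1.380649e-23  # J/K

def ro_phase_noise_dbc_hz(power_w, f_osc, f_offset, gamma=1.0, eta=1.0, temp_k=300.0):
    """Eq. (4.2) in dBc/Hz; gamma and eta defaults are placeholders, not design values."""
    l_lin = (16 * gamma / (3 * eta)) * (K_BOLTZMANN * temp_k / power_w) \
        * (f_osc / f_offset) ** 2
    return 10 * math.log10(l_lin)

base = ro_phase_noise_dbc_hz(1e-3, 500e6, 1e6)        # hypothetical 1 mW core
double_power = ro_phase_noise_dbc_hz(2e-3, 500e6, 1e6)
double_freq = ro_phase_noise_dbc_hz(1e-3, 1e9, 1e6)

print(f"doubling P:     {double_power - base:+.2f} dB")
print(f"doubling f_osc: {double_freq - base:+.2f} dB")
```

Doubling the power buys about 3 dB of phase noise, while doubling the oscillation frequency at constant power costs about 6 dB, which is the quadratic dependence noted above.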

Wide-band frequency tuning for this SE RO can be done by modulating the node capacitance or the supply voltage by using switched capacitors or supply switches, respectively. Using discrete digital inputs for frequency tuning might result in low frequency resolution, but allows a wide tuning range. On the contrary, a voltage controlled MOS varactor can be used for continuous tuning to avoid any quantization error, but it is limited to a narrow tuning range. A combination of both is used in this design for a good coarse and fine tuning, as will be explained further in this section.

#### **Frequency locking**

Frequency locking to a reference is necessary to suppress the up-converted flicker noise of the oscillator with a high-pass transfer function and to cancel the accumulated phase noise caused by frequency drifts. This can be done by an open-loop method like injection locking, or by using a negative feedback loop with a phase detector and a low-pass filter. Although injection-locking PLLs are known to have a very wide lock-in range and bandwidth with good in-band noise suppression, they suffer from a direct trade-off between deterministic jitter (reference spur) and phase noise [27]. A wide injection-locking pulse, which has a higher injection strength, reduces the phase noise but increases the reference spur, whereas a narrower pulse results in a lower spur but increases the phase noise [27]. The reference spur also depends on how close the free-running frequency is to the desired frequency, and it can be given by

$$\text{Spur} = 20\log_{10}\left(\frac{|f_{err}|}{f_{ref}}\right) = 20\log_{10}\left(\frac{|f_{des} - f_{free}|}{f_{ref}}\right), \tag{4.4}$$

where  $f_{free}$  is the free running oscillator frequency,  $f_{ref}$  is the reference frequency,  $f_{des}$  is the desired frequency and  $f_{err}$  is the frequency error [27]. So for the specification of  $-50 \, dBc @ 500 \, MHz$  and  $f_{ref} = 50 \, MHz$ , a high frequency resolution (or maximum frequency error  $f_{err} \le 150 \, kHz$ ) is required. Additionally, any temperature based frequency drifts may increase the spur further.
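The frequency-error limit quoted above follows from inverting Eq. (4.4). A minimal check, using only the -50 dBc specification and the 50 MHz reference stated in the text:

```python
def max_frequency_error(spur_dbc, f_ref):
    """Invert Eq. (4.4): largest |f_err| that keeps the spur at or below spur_dbc."""
    return f_ref * 10 ** (spur_dbc / 20)

f_err_max = max_frequency_error(-50.0, 50e6)
print(f"maximum frequency error ~ {f_err_max / 1e3:.0f} kHz")
```

The result is slightly above 150 kHz, so the 150 kHz figure used in the text is a mildly conservative rounding.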

Alternatively, if a simple type-I sub-sampling PLL with a voltage-controlled ring oscillator core is used, very fine frequency control resolution is not required, as the voltage tuning takes care of fine frequency tuning and hence considerably reduces the spur levels. As a downside, type-I PLLs have a lower bandwidth and consequently a smaller lock-in range than injection-locking PLLs. Considering the trade-offs, since good noise and spur performance is necessary for proper functioning of the FTL, a type-I PLL is chosen over the injection-locking structures. Additionally, a type-I PLL is chosen over a type-II PLL for its wider loop bandwidth, locking range, stability [28] and ease of design.



Figure 4.3: (a) Dynamic amplifier based charge sampling phase detector (DAPD), (b) Circuit implementation of DAPD.

Figure 4.3a shows the general idea of a dynamic amplifier-based phase detector (DAPD) and Figure 4.3b shows its circuit implementation, where the sampling capacitance is isolated from the oscillator output by a $G_m$ stage ($M_{IN}$), hence avoiding any spur induced in the VCO by sampling capacitance modulation. Figure 4.4 shows the complete type-I PLL structure with fine and coarse voltage control, and an output buffer. As seen in Figure 4.4, this type-I PLL does not include an explicit low-pass filter. However, the charge sampling phase detector exhibits a *sinc*-type transfer function because of its windowed integration operation, and additionally the sample and hold capacitors $C_s$, $C_h$ form a discrete-time low-pass filter, which introduces a pole in the transfer function. A detailed explanation of the working of this circuit, and its advantages along with the different phases of operation, is given in Section 4.2, where a similar circuit is used, but as a mixer. The output of this structure is a voltage, which is applied to the varactors as the VCO control voltage, as shown in Figure 4.4, and modulates the free-running frequency of the oscillator.

Figure 4.4: Ring oscillator based Type-I PLL with Dynamic amplifier based phase detector.

There are three phases of operation for the DAPD. In the sampling phase ($\phi_s$ is ON), the input voltage is converted to current by the transistor $M_{IN}$, which is used to discharge $C_s$, creating a voltage on it. This voltage is then re-sampled by $C_{rs}$ in the re-sample phase $\phi_{rs}$. The third phase of operation is the reset phase, when the voltage of $C_s$ is reset to VDD. The voltage of the sampling capacitor $C_s$ at node $V_s$ during the ON time of the reference sampling clock $\phi_s$, considering for simplicity that the VCO has a sinusoidal response, can be given by

$$V_{s} = \int_{-0.5T_{on}}^{0.5T_{on}} \frac{G_{m}A_{ro}}{C_{s}} \sin(\omega_{ro}t + \phi)dt \Rightarrow \frac{2G_{m}A_{ro}}{\omega_{ro}C_{s}} \cdot \sin(0.5\omega_{ro}T_{on}) \cdot \sin(\phi), \tag{4.5}$$

where  $\omega_{ro}$  is the RO frequency,  $T_{on}$  is the ON time of  $\phi_s$ ,  $G_m$  is large signal gain of  $M_{IN}$ ,  $A_{ro}$  is the amplitude of RO output, and  $\phi$  is the phase difference between the reference clock and the RO output [6]. The PD gain can then be given by

$$K_{pd} = \frac{V_s}{\phi} = \frac{2G_m A_{ro}}{\omega_{ro} C_s} \cdot \sin(0.5\omega_{ro} T_{on}) \cdot \frac{\sin(\phi)}{\phi}.$$
(4.6)

The factor $\sin(\phi)/\phi$ approaches 1 for small values of $\phi$. It is seen in Eq. (4.6) that $K_{pd}$ depends on $T_{on}$ sinusoidally, and has a maximum value at $T_{on} = \pi/\omega_{ro}$, i.e., half of the RO period. If the RO output is a square wave, then $K_{pd}$ depends on $T_{on}$ linearly, but has the same maximum value condition. The value of $K_{pd}$ as observed in simulations lies within 0.1-0.3 V/rad. The RO core with varactor exhibits a VCO gain $K_{VCO} = 30 - 70$ MHz/V. Since this is a type-I PLL, and it lacks an integrator, the phase error between the reference and oscillator does not go to zero, but settles at a constant value proportional to the control voltage.
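The dependence of $K_{pd}$ on $T_{on}$ in Eq. (4.6) can be explored with a short sweep in the small-phase limit ($\sin(\phi)/\phi \approx 1$). The sketch uses the 480 fF sampling capacitor mentioned for the DAPD in this chapter, while the transconductance and RO amplitude are hypothetical placeholders, so the absolute gain value should not be read as a design result.

```python
import math

def pd_gain(t_on, f_ro, g_m, a_ro, c_s):
    """Eq. (4.6) for small phase error: K_pd = 2*Gm*A/(w*Cs) * sin(0.5*w*T_on)."""
    w = 2 * math.pi * f_ro
    return 2 * g_m * a_ro / (w * c_s) * math.sin(0.5 * w * t_on)

f_ro = 1e9                      # RO frequency under test
period = 1 / f_ro
g_m, a_ro = 0.5e-3, 0.5         # hypothetical transconductance (S) and amplitude (V)
c_s = 480e-15                   # sampling capacitor of the DAPD

# Sweep T_on over one RO period and locate the maximum gain
sweep = [k * period / 200 for k in range(1, 200)]
best_t_on = max(sweep, key=lambda t: pd_gain(t, f_ro, g_m, a_ro, c_s))
print(f"K_pd peaks at T_on = {best_t_on / period:.2f} of the RO period")
```

The sweep places the maximum at half of the RO period, where the integration window covers exactly one half-cycle of the sinusoid.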

#### **Frequency Tuning**

The RO free-running frequency can be coarsely tuned by controlling the COARSE and the MED banks digitally, as shown in Figure 4.4. As mentioned in the previous section, the frequency can be controlled by changing the node capacitance or the supply voltage. From Eqs. (4.1) and (4.3), it can be seen that increasing the node capacitance reduces the frequency but does not reduce the power consumption. On the other hand, reducing the core supply rail voltage reduces both the frequency and the power consumption. On the downside, because of the lower $V_{DD}$, the voltage swing reduces, and the jitter increases according to Eq. (4.2) at lower frequencies. However, that is less of a disadvantage because, as calculated in Eq. (3.4), higher jitter can be tolerated at lower sampling frequencies. For these reasons, coarse frequency control using supply voltage control is employed.

In this design, the 4-bit COARSE and 3-bit MED control signals control the PMOS switches to tune the internal supply voltage rail that supplies the RO core inverters. The internal supply control can vary the RO frequency in the range of 400 MHz to 1.2 GHz, to allow sufficient margin for process, voltage and temperature (PVT) variations. The MED bank has a resolution of 7 MHz, which is not sufficient for locking as the lock-in range is quite low for this type-I PLL.

For a finer resolution, in order to bring the RO close to the locking frequency and improve the spur performance, another PMOS switch with tunable input voltage $V_{dac}$ is used. This voltage $V_{dac}$ is controlled by a 5-bit R-2R ladder voltage digital-to-analog converter (DAC) circuit. In this design, the resistive DAC (R-DAC) is only optimised for low power consumption, since it is not the dominant source of noise. The linearity of the R-DAC is not of prime importance as the R-DAC code is manually controlled, and its targeted resolution of around 300-400 kHz is sufficiently fine to bring the RO within the acquisition range of the PLL.

As the first priority of this design is to verify the concept of the proposed FTL, the coarse tuning of the RO is left as a manual tuning process. However, having a low lock-in range makes the synthesizer susceptible to loss of lock, due to frequency and temperature drifts. A divider based background frequency tuning needs to be adopted in the future versions for robust locking to desired frequencies. Using a feedback divider in these frequency synthesizers does not add a heavy penalty of power because of their low (sub-GHz) frequency of operation.

Figure 4.5 shows the layout of the designed RO-based PLL along with its reference pulse generator. The area of this layout is mostly dominated by decoupling capacitors placed on the internal core supply rail, which are used to suppress the reference spur due to supply ripple. The second major area consuming block is the R-DAC.



Figure 4.5: Ring oscillator based PLL top level layout.

#### 4.1.2. Post-layout Simulations

#### **Frequency Tuning Range**

Table 4.1 shows the total tuning range of the ring oscillator, as obtained in the post-layout simulations at different process corners and temperature limits. The COARSE bank has a resolution of 50 MHz and the MED bank has a resolution of 7 MHz. The R-DAC has a range of 9 MHz and achieves a resolution as low as 300 kHz. Since the COARSE bank alone cannot cover the entire tuning range, two extra bits of control are added to select the required frequency range. During measurements, this frequency tuning needs to be done manually at start-up.

| Process Corner | Temperature (°C) | Tuning Range (GHz) |
|----------------|------------------|--------------------|
| TT             | 27               | 0.3 - 1.15         |
| TT             | -45              | 0.35 - 1.35        |
| TT             | 125              | 0.25 - 1           |
| FF             | 27               | 0.35 - 1.3         |
| SS             | 27               | 0.25 - 1.05        |

Table 4.1: RO-based PLL tuning range at different temperatures and process corners.
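The quoted R-DAC resolution is consistent with spreading its 9 MHz range over a 5-bit code. The sketch below assumes uniform steps across the $2^5 - 1$ code intervals, which is an idealisation since the linearity of the R-DAC is not guaranteed, and compares the step with the 7 MHz MED resolution stated above.

```python
def dac_step(range_hz, n_bits):
    """Ideal uniform step of an n-bit DAC spanning range_hz (assumes linear tuning)."""
    return range_hz / (2 ** n_bits - 1)

rdac_range = 9e6      # R-DAC tuning range from the text
med_step = 7e6        # MED bank resolution from the text
step = dac_step(rdac_range, 5)

print(f"ideal R-DAC step ~ {step / 1e3:.0f} kHz")
print(f"R-DAC range covers one MED step: {rdac_range >= med_step}")
```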

#### Phase noise performance

Figure 4.6a shows the output phase noise spectrum at 500 MHz output frequency. This plot is the result of a transient noise simulation on the post-layout RO, which gives a similar response to the pnoise simulation. The integrated jitter is calculated by integrating the PN plot from 1 MHz to 250 MHz, which are the minimum and maximum alias frequencies when 500 MHz is the sampling frequency, and hence define the bandwidth of interest. The minimum integration limit is also decided by the maximum observation time of the digital block during frequency estimation. The phase noise spectrum in Figure 4.6a results in an integrated jitter of 4 ps, which satisfies the required performance derived in Section 3.3. The plot exhibits a noise bandwidth of 10 MHz. Figure 4.6b shows the phase noise performance at 1 GHz, which gives a 2 ps integrated jitter when integrated from 1 MHz to 500 MHz. The integrated jitter calculated does not include the reference spur.
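For readers who want to reproduce this kind of number from a phase noise plot, the usual conversion is $\sigma_{rms} = \sqrt{2\int L(f)\,df}\,/\,(2\pi f_{osc})$. The sketch below applies it with a trapezoidal rule. The phase noise profile is a hypothetical shape (flat up to a 10 MHz loop bandwidth, then -20 dB/decade, with an assumed in-band level) and is not the simulated spectrum of Figure 4.6.

```python
import math

def rms_jitter(f_osc, freqs, l_dbc_hz):
    """RMS jitter from single-sideband phase noise using trapezoidal integration."""
    lin = [10 ** (v / 10) for v in l_dbc_hz]
    area = sum(0.5 * (lin[i] + lin[i + 1]) * (freqs[i + 1] - freqs[i])
               for i in range(len(freqs) - 1))
    return math.sqrt(2 * area) / (2 * math.pi * f_osc)

def example_profile(f, in_band_dbc_hz=-114.0, loop_bw=10e6):
    """Hypothetical PLL-like profile: flat in band, -20 dB/decade beyond loop_bw."""
    if f <= loop_bw:
        return in_band_dbc_hz
    return in_band_dbc_hz - 20 * math.log10(f / loop_bw)

# Log-spaced grid over the 1 MHz to 250 MHz band used for the 500 MHz case
points, f_lo, f_hi = 2000, 1e6, 250e6
freqs = [f_lo * (f_hi / f_lo) ** (i / (points - 1)) for i in range(points)]
jitter = rms_jitter(500e6, freqs, [example_profile(f) for f in freqs])
print(f"integrated jitter for the assumed profile: {jitter * 1e12:.2f} ps")
```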



Figure 4.6: (a) Phase noise spectrum  $@f_{osc} = 500 MHz$  (b) Phase noise spectrum  $@f_{osc} = 1000 MHz$ .

#### **Reference Spur**

The reference spur at 50 MHz offset leads to a deterministic jitter at the RO output. A specification for this spur is derived to be -50 dBc for an output frequency of 500 MHz. As can be seen in Figure 4.7, the spur performance at 500 MHz is -56 dBc, which is 6 dB better than the specification.



Figure 4.7: Reference spur  $@f_{osc} = 500$ MHz.

Some of the causes of this reference spur are mentioned below.

1. Charge injection and re-sample switch leakage into $C_h$ during the reset phase $\phi_{rst}$.
2. Coupling capacitance between the re-sample switch clock and the hold capacitor $C_h$, causing clock feed-through.
3. Supply ripple caused by charging and discharging of the large sampling capacitor $C_s$ of 480 fF in the DAPD.

The total RMS jitter can be calculated by

$$j_{rms} = \sqrt{j_{integrated}^2 + j_{deterministic}^2}. \tag{4.7}$$

The RMS jitter calculated is 2.8 ps @ 1 GHz and 4.5 ps @ 500 MHz.
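Rearranging Eq. (4.7) with the reported figures gives a feel for the size of the deterministic contribution. The inputs below are the rounded values quoted in this section (4 ps and 2 ps integrated, 4.5 ps and 2.8 ps total), so the results are only indicative.

```python
import math

def total_rms_jitter(j_integrated, j_deterministic):
    """Eq. (4.7): root-sum-square of random and deterministic jitter."""
    return math.sqrt(j_integrated ** 2 + j_deterministic ** 2)

def implied_deterministic_jitter(j_total, j_integrated):
    """Eq. (4.7) solved for the deterministic term."""
    return math.sqrt(j_total ** 2 - j_integrated ** 2)

dj_500m = implied_deterministic_jitter(4.5e-12, 4.0e-12)
dj_1g = implied_deterministic_jitter(2.8e-12, 2.0e-12)
print(f"implied deterministic jitter @ 500 MHz: {dj_500m * 1e12:.2f} ps")
print(f"implied deterministic jitter @ 1 GHz:   {dj_1g * 1e12:.2f} ps")
```

Both cases point to a deterministic term of roughly 2 ps, which suggests that the spur-induced jitter is similar at the two output frequencies.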

It is to be noted that the spur plots are the outputs of simulations run with all three ROs running in parallel and locked to the same reference clock.

#### Lock-in Range

The lock-in range of the RO based PLL is estimated by sweeping the FINE and MED bank codes close to a reference harmonic, and observing the range of free running frequencies that result in a lock to the reference harmonic. The estimated lock-in range of the designed type-I sub-sampling PLL is limited to a total of 5 MHz. This makes the frequency synthesizer very susceptible to frequency drifts and temperature variation during operation. The RO also has a risk of locking to an incorrect reference harmonic due to the sub-sampling operation. Hence, it is desirable for future updates to include a frequency divider in the feedback to dynamically tune the frequency of operation. Since the operating frequencies are less than 1 GHz, these dividers can still be power efficient.
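The risk of locking to the wrong harmonic can be made concrete with a simple classification of the free-running frequency. The sketch assumes that the 5 MHz lock-in range is symmetric (2.5 MHz on each side of a reference harmonic), which is an assumption and not a simulated result, and the function name is hypothetical.

```python
def classify_lock(f_free, f_target, f_ref=50e6, lock_in_total=5e6):
    """Return 'locks', 'wrong harmonic', or 'no lock' for a free-running frequency."""
    harmonic = round(f_free / f_ref) * f_ref          # nearest reference harmonic
    if abs(f_free - harmonic) > lock_in_total / 2:    # assumed symmetric window
        return "no lock"
    return "locks" if harmonic == f_target else "wrong harmonic"

for f_free in (501e6, 510e6, 549e6):
    print(f"f_free = {f_free / 1e6:.0f} MHz -> {classify_lock(f_free, 500e6)}")
```

With a 50 MHz harmonic spacing and a 5 MHz window, only about 10 % of the possible free-running frequencies lead to any lock at all, which illustrates why manual coarse tuning is fragile against drift.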

#### 4.2. Sampling circuit

Sampling is the second step of alias frequency generation. This circuit sub-samples the VCO signal directly by the RO output clocks to produce an alias signal which lies within  $0-f_s/2$ , where  $f_s$  is the sampling frequency. When the sampling clock directly samples the VCO signal, it can introduce high spurious tones at the VCO output. These spurs occur at multiples of sampling clock frequency around the locked VCO frequency. In conventional pass gate or CMOS switch-based sampling structures, this can be caused due to

1. Charge injection into the VCO tank from the sampling switch,
2. Capacitance modulation of the VCO tank due to addition and removal of the sampling capacitance depending on the sampling clock,
3. Sampling clock feed-through [6].

One of the conventional ways to reduce these spurs is by using a buffer to isolate VCO signals from the sampling capacitance to avoid capacitance modulation, but this is a power and area consuming solution. Alternatively the spur can also be reduced by reducing the sampling clock duty cycle  $(D_{ref})$  such that the capacitance modulation is quite small, as can be gathered from Eq. (3.6). However, this requires the sampling transient response to be very fast (high bandwidth) to capture the correct VCO signal value, which means the sampling switch needs to be quite large (for low ON resistance), which in turn increases the clock feed-through and charge injection.

To avoid the above-mentioned issues, a charge-sampling-based phase detector (CSPD) is proposed in [6], where instead of sampling voltage, charge is sampled onto the sampling capacitance. Since the VCO is isolated from the sampling capacitor by a small  $G_m$  cell, the spurs due to sampling at the output of VCO are greatly suppressed. On the downside, the continuous charging and discharging of the sampling capacitance, during OFF and ON times of sampling clock, led to sampling spurs at the PD output. This was resolved in their subsequent design [23] based on a dynamic-amplifier operation, where the sampling capacitor voltage is re-sampled and held on a re-sampling capacitance while the sampling capacitance is reset to VDD. Since the re-sampled voltage does not get reset, there is no periodic charging/discharging operation on this node, so the PD output spur is quite suppressed. The dynamic-amplifier-based structure is selected for the sampling operation in this design for the reasons summarized below.

1. The VCO signal is isolated from the sampling capacitance and hence does not see much modulation capacitance, leading to a low spur [6].
2. Low sampler power consumption, as the dynamic charging and discharging operation is limited to the small sampling capacitance.
3. Low spur at the output of the structure due to re-sampling.

The dynamic-amplifier-based phase detector (DAPD), whose structure is shown in Figure 4.8a, is chosen to perform the sampling operation. This DAPD acts as a charge sampling sub-sampling mixer rather than a PD in this block. Although a single phase of VCO was sufficient to find the alias frequency, this differential structure is employed to have a load balance on both phases of differential VCO output. However, only one of the output phases is used for further processing.



Figure 4.8: (a) Dynamic Amplifier based Sampler, (b) Phases of Sampling operation.

This structure has three phases of operation, which are explained below, and their clock phases are shown in Figure 4.8b.

**Phase 1:** Sampling phase  $\phi_s$ : VCO voltage is converted to current by  $M_1, M_2$  transistors, which act as  $G_m$  stages, and this current discharges  $C_s$  for  $T_{on}$  time, creating a voltage  $V_s$  on  $C_s$ .

**Phase 2:** Re-sampling phase  $\phi_{rs}$ : Switch  $S_1$  is turned off and switches  $S_2$  are turned ON. During the ON phase of  $\phi_{rs}$ ,  $C_s$  voltage is re-sampled onto  $C_{rs}$ .  $C_{rs}$  voltage is held until next re-sample cycle.

**Phase 3:** Reset phase  $\phi_{rst}$ : Only switches  $S_3$  are ON,  $C_s$  voltage is reset to VDD, so that  $M_1, M_2$  can start up instantaneously in the next sampling operation. The reset operation and the  $T_{on}$  duration make sure that the common-mode voltage is maintained at around 800 mV.

The transfer function of this DA based sampler can be derived by observing its equivalence to the charge-sampling-based mixer followed by a voltage sampler shown in Figure 4.9.



Figure 4.9: DA based sampler divided into charge sampling sinc function (blue) and voltage sampling discrete low-pass function (red).

In the above figure, the voltage input  $V_{in}$  is converted into current  $I_{in}$  by the  $G_m$  stage. This current charges the capacitance  $C_s$  during phase  $\phi_s$ . The voltage on  $C_s$  can be given by,

$$V_{s} = \frac{1}{C_{s}} \int_{nT_{s}}^{nT_{s}+T_{on}} I_{in} dt = \frac{G_{m}}{C_{s}} \int_{nT_{s}}^{nT_{s}+T_{on}} V_{in} dt,$$
(4.8)

where $G_m$ is the transconductance of $M_1, M_2$, $T_s$ is the time period of the sampling clock and n is an integer. If $C_h$ is ignored for the moment, during $\phi_{rs}$, $V_{out} = V_s$ and later $V_s$ is reset to 0 (or VDD in our case, which can be treated equivalently). Then the transfer function from $V_{in}$ to $V_{out}$ considering all three phases can be given by

$$h(t) = \frac{V_{out}[n]}{V_{in}(t)} = \frac{G_m T_{on}}{C_s} rect(\frac{t - T_{on}/2}{T_{on}}),$$
(4.9)

where rect(t/T) is the function of a square pulse in the time domain. This can be expressed in frequency domain and the *rect* function translates into a *sinc* function.

$$|H(f)| = \frac{G_m T_{on}}{C_s} \Big| \frac{\sin(\pi f T_{on})}{\pi f T_{on}} \Big|.$$
(4.10)
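Eq. (4.10) can be evaluated directly to see where the charge sampler has its nulls and how much gain remains at the input frequency. All component values and the 10 GHz input frequency in the sketch below are hypothetical and serve only to exercise the formula.

```python
import math

def charge_sampler_gain(f, t_on, g_m, c_s):
    """Eq. (4.10): |H(f)| = (Gm*T_on/Cs) * |sin(pi*f*T_on) / (pi*f*T_on)|."""
    x = math.pi * f * t_on
    sinc = 1.0 if x == 0 else math.sin(x) / x
    return g_m * t_on / c_s * abs(sinc)

g_m, c_s = 1e-3, 100e-15        # hypothetical transconductance (S) and capacitor (F)
f_in = 10e9                     # hypothetical input (VCO) frequency
t_on = 0.5 / f_in               # integration window of half an input period

print(f"DC gain:          {charge_sampler_gain(0.0, t_on, g_m, c_s):.3f}")
print(f"gain at f_in:     {charge_sampler_gain(f_in, t_on, g_m, c_s):.3f}")
print(f"gain at 1/T_on:   {charge_sampler_gain(1 / t_on, t_on, g_m, c_s):.1e}")
```

The first null sits at $f = 1/T_{on}$, so a window of half an input period keeps the input frequency well inside the main lobe of the sinc response.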

It is to be noted that at this point the frequency of the output is equal to  $f_{out} = |f_{in} - m \cdot f_s|$  because of the conversion from continuous to discrete-time, where *m* is an integer. Now considering both  $C_s$  and  $C_h$ , the voltage sampling circuit presents a discrete-time low pass transfer as below.

$$H_2(z) = \frac{V_{out}[z]}{V_s[z]} = \frac{1 - \frac{C_h}{C_h + C_s}}{1 - \frac{C_h}{C_h + C_s} Z^{-1}} Z^{-\frac{T_{on}}{T_s}},$$
(4.11)

where  $Z^{-\frac{T_{on}}{T_s}}$  represents the delay in the system. The total sampler output is then given by

$$V_{out}[z] = \frac{G_m T_{on}}{C_s} \frac{\sin(\pi f T_{on})}{\pi f T_{on}} \frac{1 - \frac{C_h}{C_h + C_s}}{1 - \frac{C_h}{C_h + C_s} Z^{-1}} Z^{-\frac{T_{on}}{T_s}} V_{in}. \tag{4.12}$$

A low value of  $C_s$  is desirable for a higher gain, but a higher  $C_s$  is desirable for lower noise and spur.  $T_{on}$  less than or equal to half of VCO signal period allows a higher gain.
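The discrete-time low-pass term of Eq. (4.11) can be read as charge sharing between $C_s$ and $C_h$ once per sampling cycle, i.e., $V_{out}[n] = a\,V_{out}[n-1] + (1-a)\,V_s[n]$ with $a = C_h/(C_h + C_s)$. The sketch below simulates this recursion while ignoring the delay term $Z^{-T_{on}/T_s}$. The capacitor values are hypothetical.

```python
def resample(v_s_samples, c_s, c_h, v_out0=0.0):
    """First-order IIR implied by Eq. (4.11), delay term omitted for simplicity."""
    a = c_h / (c_h + c_s)          # fraction of the previous output that is retained
    out, v_out = [], v_out0
    for v_s in v_s_samples:
        v_out = a * v_out + (1 - a) * v_s   # charge sharing between C_s and C_h
        out.append(v_out)
    return out

# Step response with equal capacitors (hypothetical 100 fF each)
step_response = resample([1.0] * 6, c_s=100e-15, c_h=100e-15)
print([round(v, 4) for v in step_response])

# A larger hold capacitor settles more slowly, i.e. a lower cut-off frequency
slow = resample([1.0] * 6, c_s=100e-15, c_h=400e-15)
print([round(v, 4) for v in slow])
```

A larger $C_h$ relative to $C_s$ moves the pole closer to the unit circle, which filters more strongly but also slows the settling of the sampler output.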

The sampler output common-mode voltage is maintained at 0.8 V such that the $f_s$ spur is limited. As the sampler output voltage swing increases, the spur amplitude also increases. The spur has two causes. The first is clock feed-through of the re-sample clock to $C_h$ through the coupling capacitance of the switch $S_2$. The second is leakage of the sample node voltage into the re-sample node through the re-sample switch during the reset phase. Therefore, if the magnitude of the voltage reset on $V_s$ is large, the spur is also large. The spur manifests as a periodic voltage bump on the output signal whose width is equal to that of the reset phase clock. There is a trade-off between the linearity and the spur caused by the re-sampling switch. The non-linear ON resistance of the re-sampling switch may cause distortion in the output voltage of the sampler. To reduce the distortion, the switch needs to be large enough to reduce its ON resistance. However, a larger switch leads to a larger clock feed-through due to increased coupling capacitance. The sampler voltage gain is observed to be around 0.3.

#### 4.3. Amplification circuit

As derived in Section 3.3, the amplifier should be able to amplify the alias frequency signals between 3 MHz (maximum lock-in range of the main PLL) and $f_s/2$, where $f_s$ is the sampling frequency. The signal needs to be amplified by at least a factor of 100 to convert it into rail-to-rail digital pulses, making the necessary gain >40 dB. Considering the sampler gain (0.3), the required gain increases by a few more dB. The necessary gain is obtained from simulations by observing the minimum amplitude of alias signals.
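The gain figures above can be put into a small budget. The sketch takes one possible reading of the text, namely that the factor of 100 is referred to the sampler input, so that the amplifier must additionally make up for the sampler gain of 0.3. Under a different reference point the extra term would not apply.

```python
import math

def to_db(voltage_ratio):
    """Voltage ratio expressed in dB."""
    return 20 * math.log10(voltage_ratio)

overall_gain_db = to_db(100)      # factor of 100 needed for rail-to-rail pulses
sampler_gain_db = to_db(0.3)      # observed sampler voltage gain
amplifier_gain_db = overall_gain_db - sampler_gain_db

print(f"overall gain:   {overall_gain_db:.1f} dB")
print(f"sampler gain:   {sampler_gain_db:.1f} dB")
print(f"amplifier gain: {amplifier_gain_db:.1f} dB (assumed input-referred budget)")
```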

The architecture chosen for the amplifier is a multi-stage self-biased inverter-based amplifier. A differential amplifier is not used here for two reasons. First, one of the important non-idealities that needs to be removed is the sampling spur, which is not a common-mode phenomenon, so a differential amplifier does not reject it. On the contrary, a differential amplifier can be detrimental, as explained next. The voltage bump at the output of the sampler (mentioned at the end of the previous section), which leads to a spur at the sampling frequency, depends on the sampled voltage. For example, if in one sampling cycle the voltages of the differential sampling nodes are $V_{sp}$ and $V_{sn}$ such that $V_{sp} > V_{sn}$, then in the reset phase the voltage reset magnitudes satisfy $VDD - V_{sp} < VDD - V_{sn}$. Since the "n" phase reset is of larger magnitude, it induces a higher spur (due to leakage) into the re-sampling node than the "p" phase. Since the spur on the two phases is unequal, it is treated as a differential signal and amplified by the differential amplifier, causing unwanted pulses at the amplifier output. Second, a differential amplifier is also more power-consuming than a single-ended amplifier because of the operating frequencies of around 500 MHz.

A few other nonlinearities in the sampling system need to be taken care of while designing the amplifier. They are explained below.

#### Non-linearities at low alias frequencies



Figure 4.10: Sampler output wave and its rail-to-rail digital output signal at low alias frequencies for the cases of (a) no non-linearities, (b) jitter on sampling clock, (c) spurs at sampling frequency, at  $f_s = 1 GHz$ ,  $f_{alias} = 5 MHz$ .

The two major non-linearities observed at low alias frequencies are caused by (1) jitter and (2) spurs at the sampling frequency. Both of these effects cause extra unwanted pulses in the maximum-slope regions of the alias signal. Figure 4.10a shows the expected output when no non-linearities are present. It was already established in Chapter 3 that jitter affects the low-frequency alias signals the most.

It can be seen in Figure 4.10b that the presence of jitter adds additional pulses at the threshold crossing of the signal. Since the jitter performance of the ring oscillators is limited by the power performance, these jitter-caused pulses need to be taken care of either in the digital block or in the amplifier itself. Additionally, as seen in Figure 4.10c, the  $f_s$  spurs caused by the clock feed-through and resample switch leakage during the sampling operation are also amplified around the inverter threshold crossing and lead to extra output transitions.

#### Non-linearities at high alias frequencies

In contrast to low alias frequencies, non-idealities at high alias frequencies result in missing output pulses, which lead to incorrect frequency estimation. These non-idealities are most impactful when the alias frequencies get close to $f_s/2$. One of the effects is due to jitter, as shown in Figure 4.11a, because the signal amplitude at the sampling instant becomes very small. The other effect is caused by insufficient gain at high frequencies, where the low-amplitude pulses of the signal are not amplified, leading to periodically missing pulse counts. Apart from these, the high-frequency alias signals are also impacted by the spurs at the sampling frequency.



Figure 4.11: Sampler output wave and its rail-to-rail digital output signal at high alias frequencies for the cases of (a) jitter on sampling clock, (b) Insufficient gain at  $f_s/2$ ,  $@f_s = 1 GHz$ ,  $f_{alias} = 490 MHz$ .

#### **Proposed Solution**

By having high gain at high alias frequencies, the missing pulses due to insufficient gain can be corrected to an extent. Insufficient gain at $f_s/2$ cannot be fully compensated, because the alias signal amplitude becomes extremely low due to the amplitude modulation seen at the sampler output in Figure 4.11a and becomes comparable to noise; increasing the gain further also increases the impact of jitter. Additionally, this amplitude envelope contains information about the alias frequency, which can be lost because of excessive gain. Furthermore, by increasing the gain at high frequencies, the risk of amplifying spurs at $f_s$ increases too. To avoid these spurs, a notch filter tuned to the sampling frequency is needed. Nevertheless, the missing pulses due to jitter cannot be compensated in the analog domain and have to be compensated digitally, as will be explained in Section 5.1.

In contrast to the high alias frequencies, the low alias frequencies suffer from non-idealities caused by high gain at high frequencies. High-frequency jitter and sampling spurs (due to clock feed-through and switch leakage) cause extra pulses and hence need to be suppressed by having a fairly low gain at high frequencies. In summary, low gain at high frequencies causes non-linearities at high alias frequencies, and high gain at high frequencies causes non-linearities at low alias frequencies.

This can be resolved by having two amplifiers in parallel where one amplifier has high gain at high frequencies and the other has sufficient gain at low frequencies and low gain at high frequencies. This solution creates only a small power overhead because the low frequency amplifier power consumption will be much lower than that of the high-frequency amplifier.

The final design structure is shown in Figure 4.12, where the sampler output is given to two amplifiers. Figure 4.12b shows the single-stage low-frequency (LF) amplifier which suppresses the high-frequency spurs and jitter and hence gives an accurate pulse train at the output,  $F_{dig,low}$ , for low frequencies. Figure 4.12c shows the structure of high-frequency (HF) amplifier. It is a multi-stage amplifier to satisfy the high gain requirement. The second stage of this amplifier has a tunable notch filter for suppression of the spur at  $F_s$ . Both the high and low-frequency amplifiers have tunable load capacitances at the output of each stage for bandwidth control.




Figure 4.12: Amplifier implementation with (a) sampler and amplifier block level representation, (b) Low-frequency amplifier, (c) High-frequency amplifier.

#### Notch filter:

Bandstop filters are generally formed by connecting a low pass filter and a high pass filter in parallel. A twin T-network notch filter is employed in the high gain amplifier to suppress spurs caused by sampling because of its simplicity in design and usage of only passive components. The R-C-R path creates the low pass filter and the C-R-C path creates the high pass filter, which when combined gives a notch at the center frequency.

The centre frequency of the notch filter in Figure 4.12c can be given by the below equations,

$$\begin{cases} f_0 = \frac{1}{2\pi} \sqrt{\frac{2}{C_1 C_2 R_2^2}}, \\ f_0 = \frac{1}{2\pi} \sqrt{\frac{1}{C_1^2 R_1 (2R_2)}}. \end{cases}$$



Figure 4.13: Amplifier alias frequency generation transfer curves (a) with only one amplifier, (b) with two amplifiers.

The resistor  $R_1$  and capacitor  $C_1$  are made tunable to cover notches at different sampling frequencies and also support the PVT variations and mismatches in the passive components.
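As a rough numerical check of the notch conditions, the sketch below evaluates both expressions for one component sizing. It assumes the conditions read $f_0 = \frac{1}{2\pi}\sqrt{2/(C_1 C_2 R_2^2)}$ and $f_0 = \frac{1}{2\pi}\sqrt{1/(2 C_1^2 R_1 R_2)}$, with $R_2$, $C_2$ in the R-C-R path and $C_1$, $R_1$ in the C-R-C path (inferred from the form of the expressions). The component values and the function name `notch_frequencies` are hypothetical and are not taken from the design.

```python
import math

def notch_frequencies(r1, r2, c1, c2):
    """Evaluate the two twin-T null conditions (in Hz).

    Assumed topology: series R2-R2 with shunt C2, and series C1-C1 with shunt R1.
    """
    f_a = math.sqrt(2.0 / (c1 * c2 * r2 ** 2)) / (2 * math.pi)
    f_b = math.sqrt(1.0 / (c1 ** 2 * r1 * 2 * r2)) / (2 * math.pi)
    return f_a, f_b

# Hypothetical symmetric sizing: C2 = 2*C1 and R1 = R2/2.
R2 = 1.0e3                                  # ohms (illustrative)
C1 = 1.0 / (2 * math.pi * R2 * 1.0e9)       # chosen so that 1/(2*pi*R2*C1) = 1 GHz
f_a, f_b = notch_frequencies(r1=R2 / 2, r2=R2, c1=C1, c2=2 * C1)
print(f"{f_a / 1e9:.3f} GHz, {f_b / 1e9:.3f} GHz")
```

With this symmetric sizing both conditions coincide at $1/(2\pi R_2 C_1)$, so tuning $R_1$ and $C_1$ moves the two conditions together only if the ratios are preserved; any mismatch shows up as a difference between the two returned values.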

Figure 4.13 shows the expected single-amplifier and two-amplifier frequency responses for a given sampling frequency $f_s$. The X-axis is the VCO frequency ($f_{vco}$) and the Y-axis is the number of pulses counted in a long time period $T_p$, which should be proportional to the alias frequency $|f_{vco} - N \cdot f_s|$. The blue lines show the expected response, which depends linearly and monotonically on $f_{vco}$. The orange line in Figure 4.13a shows the expected performance of a single-amplifier circuit, including all the low and high alias frequency non-linearities, which result in non-monotonicities. Figure 4.13b shows the proposed performance, where the low alias frequency non-linearities are completely suppressed by the LF amplifier and the high alias frequency non-linearities are mostly suppressed by the HF amplifier.



# Post Layout Simulation Results

Figure 4.14: Amplifier alias frequency generation transfer curve.

Figure 4.14 shows the layout of a single sampling frequency slice, which has the sampling and amplification circuits along with the sampler's pulse generator. As can be seen, the sampler, the low-frequency amplifier, and the high-frequency amplifier are placed in series so that the routing parasitic load on the sampler output stays low.

The post-layout AC simulations in Figure 4.15 show the gain plots of the high-frequency and low-frequency amplifiers. The high-frequency amplifier has a high gain of up to 50 dB and amplifies signals in the band from 10 MHz to $f_s/2$, which is 400 MHz in this case. It does not need to amplify signals as low as 1-3 MHz since the low-frequency amplifier covers that range. The plot also shows a notch near $f_s$, whose attenuation is observed to be sufficient to suppress the spurs caused by sampling.

On the right is the low-frequency amplifier. The maximum gain of this amplifier is 20-25 dB and it amplifies signals between 1 MHz and 100 MHz. Its gain at higher alias frequencies is low enough to suppress the high-frequency jitter and spurious tones. Such low gain is sufficient for low alias signals, as they already have an amplitude of 300-400 mV, unlike the HF signals, which have amplitudes as low as tens of mV.



Figure 4.15: Amplifier AC response for high-frequency (left) and low-frequency (right) amplifiers.

# 4.4. Reference Divider

Since the chosen sub-Nyquist sampling frequencies for the FTL are not integer multiples of the given reference frequency of 100 MHz, a reference divider is employed. A simple frequency divide-by-two operation can be implemented by a D flip-flop with negative feedback, as given in Figure 4.16a. This frequency divider operates at 50 MHz, which is a very low frequency, so its power consumption is not significant. The only relevant performance metric of this divider is its jitter. Since the phase noise of the reference clock of the frequency synthesizer PLL will be amplified by $N^2$ at the output of the multiplier, the phase noise requirement of the divider is stringent. From the ring oscillator performance in Section 4.1.2, its in-band phase noise is -114 dBc/Hz, as seen in Figure 4.6b. For the reference divider not to degrade this in-band phase noise, the targeted phase noise of the divider is at least 10 dB below that; also considering the $N^2$ factor for a maximum RO frequency of 1 GHz (N = 20), it should be < -151 dBc/Hz. A True Single-Phase Clock (TSPC) flip-flop (Figure 4.16b) based divide-by-2 reference divider is employed. These types of digital flip-flops are usually employed in high-frequency dividers, but here they are employed due to their simplicity and because a differential signal is not required.



Figure 4.16: (a) Divide-by-2 implementation with a D flip-flop, (b) TSPC D flip-flop.

Figure 4.17 shows the post-layout phase noise performance of the divider, which is much better than the expected phase noise. The power consumption is around 10 $\mu$W, which is low because of the low-frequency operation.



Figure 4.17: Phase noise performance of Reference divider.

# 4.5. Pulse Generators

There are two sets of pulse generators needed in the proposed FTL architecture. One set is used to generate the reference pulses of the ring oscillator PLL PD and the other is for the generation of sampling clock pulses. Both pulse generators have pulse width and noise constraints.

For the ring oscillator's reference pulse generator, since it lies in the reference path, its noise contribution should be quite low, as it will be amplified by a factor of $N^2$ by the PLL, where N is the frequency multiplication factor. Since it operates at 50 MHz, this is an easy specification to achieve. For the DAPD in the RO's PLL to have the maximum gain, the sampling pulse width needs to be $1/(2f_s)$, where $f_s$ is the RO output frequency.

For the VCO signal sampler, the thermal noise of the pulse generator adds to the sampling clock jitter, so it has to be less than the ring oscillator phase noise to have a negligible impact. These pulse generators are among the most power-hungry blocks since they need low jitter and a high operating frequency of up to 1 GHz.

# 4.6. FTL Top Level Layout

Figure 4.18 shows the top-level layout of all the analog blocks in the frequency tracking loop - frequency divider, ring oscillator based frequency multipliers, pulse generators, samplers and the amplification blocks. The total area of just the analog blocks of FTL is  $560 \,\mu m \times 460 \,\mu m$ . Ring oscillators are the blocks that consume most area, since they employ decoupling capacitors on the RO core's internal supply rail to suppress the reference spur caused by internal supply ripples. The rest of the decoupling capacitors are for the main supply voltage. There is also an output multiplexer, which allows observation of all the RO and amplifier outputs during measurements, for calibration purposes.

## **Post Layout Simulation Results**

As explained in chapter 3, the function of the analog blocks in this FTL is the generation of an alias signal and converting it into a pulse train which carries the information of the alias frequency. Hence, the final output of this block is three digital signals toggling at their respective alias frequencies. The robustness of frequency estimation depends on the linearity of the output of these analog blocks.

This section presents the post-layout simulation results of one sampling frequency slice, which consists of a frequency multiplier, a sampler and an amplifier block. For a constant sampling frequency of  $f_s$ , the input VCO frequency is swept (within the main PLL tuning range) such that the output alias frequency ranges from 0 to  $f_s/2$ . The output pulses are then counted for a long period of  $1 \mu s$ . The output count is a direct indication of the average alias frequency.
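The relation between the swept VCO frequency, the alias frequency, and the expected pulse count can be sketched as follows. This is only an idealised model (no jitter, spurs, or gain limits); the function names and the VCO frequencies in the sweep are illustrative assumptions, while the 1 GHz sampling frequency and the $1\,\mu s$ counting window come from the text.

```python
def alias_frequency(f_vco_mhz, f_s_mhz):
    """Ideal alias frequency |f_vco - N*f_s|, with N*f_s the nearest multiple of f_s."""
    remainder = f_vco_mhz % f_s_mhz
    return min(remainder, f_s_mhz - remainder)

def expected_count(f_alias_mhz, window_us=1.0):
    """Ideal number of alias pulses in the counting window (MHz * us = pulses)."""
    return f_alias_mhz * window_us

f_s = 1000  # MHz
for f_vco in (12000, 12100, 12250, 12490, 12500):  # hypothetical sweep points, MHz
    f_a = alias_frequency(f_vco, f_s)
    print(f_vco, f_a, expected_count(f_a))
```

In this ideal model the count in $1\,\mu s$ equals the alias frequency in MHz, so any deviation of the simulated count from this line is a direct measure of the non-linearity discussed in this section.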

Figure 4.19a shows the average alias frequencies of the HF and LF amplifier outputs when the sampling frequency is 1 GHz, and Figure 4.19b shows the same for 500 MHz. These transfer functions are the result of transient noise simulations, which include the effects of all non-idealities like jitter and insufficient gain. The 1 GHz sampling clock has an RMS jitter of 2.8 ps and the 500 MHz clock has 4.5 ps, which were estimated from Figure 4.6. The low-frequency alias signal has an accurate response up to 100-150 MHz, and the higher alias frequencies are suppressed along with the high-frequency jitter and spur, which is seen at both sampling frequencies. On the other hand, at both sampling frequencies, the high-frequency alias signal has an accurate response above a minimum of 25 MHz, below which the high-frequency jitter causes unwanted pulse counts.

Figure 4.19: Aliasing signal transfer function from 0 to $f_s/2$ for (a) $f_s = 1$ GHz, (b) $f_s = 500$ MHz.

At high alias frequencies close to $f_s/2 = 500$ MHz, the 1 GHz simulation shows some missing pulses, which lead to a decrease in the average frequency. The observed frequency error at $f_s/2$ is almost 50 MHz, which is high enough to cause wrong VCO frequency estimation. However, increasing the gain of the amplifier further to reduce these missing pulses trades off against extra jitter-induced pulses at low frequencies. The characteristics of these missing pulses over shorter durations are observed, and an error correction algorithm is employed in the digital block to compensate for this error (explained in Section 5.1). Interestingly, for lower sampling frequencies, the non-linearities due to missing pulses are non-existent or at least not very prominent, making the lower sampling frequencies a better choice. The noise of the amplifier seems to have a negligible effect compared to the sampling clock jitter.

# 5

# **RTL Implementation**

The frequency tracking loop is a digitally intensive circuit since the alias signal processing and frequency estimation are done in the digital domain. This chapter gives an overview of the Register-Transfer Level (RTL) implementation of different functions of the digital block. The digital block is designed keeping in mind a low locking time, low complexity, and high robustness.



Figure 5.1: Top-level block diagram of digital block in FTL.

The digital block receives a high and a low-frequency aliasing signal per sampling slice, in the form of digital pulses, from their respective amplifier blocks. It then processes these signals to estimate the current VCO frequency and apply an appropriate control signal to the VCO to update the frequency. Figure 5.1 shows the complete top-level block diagram of the FTL digital module. The digital block has five functions and each of the sub-blocks performs one task. These tasks are enumerated below.

- 1. Alias frequency estimation: The alias signals from the analog environment, which are in the form of a digital pulse train, are converted into digital codes equal to the absolute value of their respective alias frequencies ( $f_{ai} \forall i = 1, 2, 3$ ).
- 2. VCO frequency estimation: By processing the three alias frequency values, a digital word  $f_{vco}$  corresponding to the VCO frequency is generated.
- 3. VCO control: The difference between the input control word  $F_{desired}$  and  $f_{vco}$  is calculated and an appropriate control code is sent to the VCO, if the FTL is ON.
- 4. Speed and accuracy optimization: Depending on the frequency error  $F_{err}$ , the speed and the accuracy of  $f_{vco}$  estimation are dynamically updated.
- 5. Lock-Unlock Detection: Depending on $F_{err}$ and the external signal **PLL Unlock**, the FTL is turned ON/OFF.

All the digital sub-blocks work synchronously with a global clock of 100 MHz, which is the same as the PLL reference clock. Each of these blocks is explained in detail in the following sections.

# 5.1. Alias Frequency estimation

As explained in Chapter 3, the aliasing frequency generation generally involves two steps of operation: (1) Counting the number of rising edges of the aliasing signal, (2) Scaling the count to full scale to calculate the aliasing frequency. As described in section 4.3, there are a few non-idealities in the sampling and amplification blocks that are difficult to fix in the analog domain, and hence, lead to errors in the count values. However, these non-idealities can be addressed in the digital domain by adding an extra sub-block for count error correction.

#### Phase 1: Pulse Counting

The counter is a 9-bit synchronous counter that counts up by 1 on each rising edge of the aliasing signal, which acts as the clock of the counter. Since the counter belongs to the aliasing signal clock domain and the rest of the digital block belongs to the FTL clock domain, the counter is asynchronous with respect to $clk_{ftl}$. This asynchronous behaviour can lead to metastability if the counter output does not meet the setup and hold time requirements of the following block. To avoid this problem, the output of this counter is sampled at every FTL clock edge using two serial registers to make it synchronous with the FTL's 100 MHz clock. Two registers are used so that, even if the first register becomes metastable, the second register still captures a stable value and the operation of the following blocks is not affected. This registered value is then differentiated to find the pulse count *N* in a 10 ns observation time, which is the clock period of $clk_{ftl}$. These operations are done in the "Counter and Differentiator" block shown in Figure 5.2.



Figure 5.2: Alias signal pulse counter and differentiator.
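The counting, re-timing, and differentiation steps can be illustrated with a small behavioural model. This is a sketch under simplifying assumptions rather than the RTL itself: the counter snapshots are taken as already-stable integers, the two serial registers are modelled as a pure delay with an assumed reset value of 0, and the differentiator is assumed to subtract modulo $2^9$ so that counter wrap-around is handled. All function names are hypothetical.

```python
COUNTER_BITS = 9
MODULUS = 1 << COUNTER_BITS  # 512 states for a 9-bit counter

def two_register_sync(raw_samples, reset_value=0):
    """Model two serial registers clocked by clk_ftl (a pure delay line here)."""
    stage1 = stage2 = reset_value
    synced = []
    for value in raw_samples:
        stage2 = stage1   # second register takes the previous first-stage value
        stage1 = value    # first register captures the counter output
        synced.append(stage2)
    return synced

def differentiate(samples):
    """Pulse count per clk_ftl period, using modulo arithmetic for wrap-around."""
    return [(cur - prev) % MODULUS for prev, cur in zip(samples, samples[1:])]

# Counter snapshots at successive clk_ftl edges; the counter wraps after 511.
raw = [500, 505, 510, 3, 8]
print(two_register_sync(raw))
print(differentiate(raw))
```

In the example the counter advances by 5 pulses per 10 ns and wraps from 510 to 3; the modulo subtraction still returns 5 for that interval. The first differentiated value after the synchroniser only reflects the assumed reset state and would be discarded.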

There are two counter/differentiator structures for each sampling frequency path, as shown in Figure 5.3. One counter corresponds to the high-frequency amplifier output and the other to the low-frequency amplifier output. Their outputs are collected in the error correction block. The functions of the error correction block are two-fold.

- 1. To choose the correct count value between the high-frequency and low-frequency alias pulse counts $N_{high}$ and $N_{low}$.
- 2. In the case of a high-frequency alias signal, as explained in Section 4.3, there will be missing pulses in the signal due to limited gain and jitter. The missing pulse count needs to be compensated by this block.

The output of the error correction block is  $N_{80ns}$  which is the effective pulse count of the alias signal in 80 ns (significance of this number is explained in the next sub-section). Detailed functioning of this block is explained further in this chapter.

## Phase 2: Frequency Scaling

Figure 5.3: Alias frequency estimation using high and low amplifier signal pulses.

The frequency information carried by the alias signal pulse train can be extracted by observing the number of rising edges occurring in a certain time period $T_p$. If the frequency of the aliasing signal is $f_a$, its time period on average is $T_a = 1/f_a$. Therefore, in a period $T_p$, if there are *N* periods of $T_a$, i.e., $T_p = N \cdot T_a$, then there should be *N* pulses of the aliasing signal in $T_p$. So the aliasing signal frequency can be calculated as

$$f_a = \frac{N}{T_p} \Longrightarrow N = T_p \cdot f_a. \tag{5.1}$$

The accuracy with which  $f_a$  can be calculated depends on the observation time  $T_p$ . For example, if there is 1 rising edge of the aliasing signal in 10 ns, it could correspond to a minimum of 1 Hz or a maximum of 200 MHz depending on the phase of the signal. However, the estimated frequency is always 100 MHz from Eq. (5.1). Hence, there will be an estimation error of  $\pm 100 MHz = \pm 1/T_p$ .

As explained in Chapter 2, the three aliasing frequencies  $f_{a1}, f_{a2}, f_{a3}$  should be three integers whose combination must be unique so that the actual value of  $f_{vco}$  is estimated unambiguously. Literature suggests that the amount of error in alias frequency estimation that can be tolerated is dependent on the GCD [19]. As the chosen GCD of sampling frequencies is 50 MHz in this project, when the value of  $f_a$  is rounded off to a multiple of  $1/T_p$ , from simulations it is observed that a  $T_p$  of minimum 40 ns is needed for an unambiguous reconstruction of  $f_{vco}$ . 40 ns is equivalent to  $4 \cdot T_{ftl}$ , where  $T_{ftl}$  is the FTL clock period. Additionally, as will be explained in the next section of the error correction block, to compensate for the non-linearities in the sampling block, the minimum value of  $T_p$  required is  $8 \cdot T_{ftl}$ . There are two factors that play a role in choosing the value of  $T_p$ .

- 1. **Accuracy of $F_{err}$ estimation:** If the frequency error between the VCO frequency $f_{vco}$ and the desired frequency $F_{desired}$, $F_{err} = |F_{desired} - f_{vco}|$, satisfies $F_{err} \gg 1/T_p$, then the magnitude of the estimation error in $F_{err}$ will be negligible compared to $F_{err}$. In such a case, this accuracy is sufficient. As $F_{err}$ reduces, there is a need to estimate $f_a$ more accurately for proper locking. The minimum frequency error that needs to be estimated by the FTL is 3 MHz, so the maximum observation time needed is 32 $clk_{ftl}$ cycles, i.e., $32 \cdot T_{ftl} = 320$ ns.
- 2. **Speed of Locking:** The objective of this FTL is to acquire a frequency lock in the least amount of time. However, the value of  $T_p$  cannot be arbitrarily low as it is limited by robust frequency reconstruction and circuit non-linearities. As the minimum  $T_p$  is limited by robust estimation of  $f_{vco}$ , the minimum is  $8T_{ftl} = 80 ns$ .
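As a quick check of these limits, evaluating the $\pm 1/T_p$ resolution bound for the observation times mentioned in this section gives approximately:

$$\frac{1}{40\,\text{ns}} = 25\ \text{MHz}, \qquad \frac{1}{80\,\text{ns}} = 12.5\ \text{MHz}, \qquad \frac{1}{320\,\text{ns}} = 3.125\ \text{MHz}.$$

The last value is close to the 3 MHz minimum frequency error quoted above, which is consistent with 32 clock cycles being taken as the longest observation time.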

To observe the number of pulses in  $T_p$  duration an accumulator is used, which adds all the pulse counts for *n* consecutive values of error correction block output  $N_{80ns}$ , as shown in Figure 5.3, where  $T_p = n \cdot 8 \cdot T_{ftl}$ . This value of  $T_p$  can be modulated by varying the value of *n* as required. The value *n* is controlled by the speed and accuracy optimization block. After every successful VCO frequency estimation, the accumulator is reset to 0.
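The accumulation and scaling step can be sketched as below, using Eq. (5.1) with $T_p = n \cdot 8 \cdot T_{ftl}$. The function name and the example count values are hypothetical; only the 10 ns clock period and the 8-cycle block length are taken from the text.

```python
T_FTL_NS = 10.0        # clk_ftl period in ns
CYCLES_PER_BLOCK = 8   # each N_80ns value covers 8 clk_ftl cycles

def accumulate_and_scale(n80_values):
    """Sum n consecutive N_80ns values and convert to a frequency via f_a = N / T_p."""
    n = len(n80_values)
    total = sum(n80_values)
    t_p_ns = n * CYCLES_PER_BLOCK * T_FTL_NS
    f_a_mhz = total / t_p_ns * 1e3   # pulses per ns -> MHz
    return total, t_p_ns, f_a_mhz

# Hypothetical error-corrected counts for an alias near 490 MHz, with n = 4.
total, t_p_ns, f_a_mhz = accumulate_and_scale([39, 39, 40, 39])
print(total, t_p_ns, f_a_mhz)
```

With `n = 4` the observation time is 320 ns, so each additional pulse changes the estimate by about 3.1 MHz, compared with 12.5 MHz for a single 80 ns block.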

#### **Error correction:**

The first function of the error correction block, which is choosing between $N_{high}$ and $N_{low}$, is a simple operation done by observing both the high and low count values and comparing them to a threshold $N_{th}$. The high count $N_{high}$ is selected if $N_{high} \gg N_{th}$, and the low count $N_{low}$ is selected if $0 \ll N_{low} \le N_{th}$ and $N_{high} \le N_{th}$.
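A minimal sketch of this selection rule is given below. The "much greater than" conditions are simplified to plain comparisons, which is an assumption; the threshold value used in the example and the `"none"` outcome for the undecided case are also illustrative and not taken from the design.

```python
def select_count(n_high, n_low, n_th):
    """Pick the HF or LF pulse count by comparing both against the threshold n_th."""
    if n_high > n_th:
        # High alias frequency: only the HF amplifier output is trusted.
        return "high", n_high
    if 0 < n_low <= n_th:
        # Low alias frequency (n_high <= n_th is already known at this point).
        return "low", n_low
    # Neither condition holds; how this case is handled is an assumption here.
    return "none", 0

print(select_count(n_high=39, n_low=2, n_th=20))
print(select_count(n_high=8, n_low=7, n_th=20))
```

In practice some margin around $N_{th}$ may be needed so that counts close to the threshold do not toggle between the two paths.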

The second function, which is the error correction, is used only in the case of high alias frequency signals which are very close to $f_s/2$. As the low-frequency amplifier does not amplify these frequencies, $N_{high}$, which is $\gg N_{th}$, is always picked. This count may have errors due to missing pulses, as observed in Section 4.6. These missing pulses are analyzed more carefully to provide a workaround. As seen in Figure 5.4, during certain intervals, due to the amplitude envelope of the alias signal, the peaks are not amplified to rail-to-rail pulses. The workaround implemented here is to divide the observation period into windows, as in Figure 5.4, and add extra counts for windows where the pulses are missing. It is explained with an example below.



Figure 5.4: Counting pulses in the windows and compensating for the missing window.

For example, consider the case when  $f_s = 1 GHz$ ,  $f_a = 490 MHz$  and the observation period is 40 ns, which is equivalent to 4 cycles of  $clk_{ftl} (= 100 MHz)$ . Ideally, the pulse count for this aliasing frequency in a 40 ns window should be 19-20, which would be approximately 5 pulses per 10 ns. By dividing the observation period into 4 parts, as shown in Figure 5.4, it can be seen that there would be zones where the pulse count requirement is not met, but in all other zones, it is met. If the algorithm can understand this discrepancy, the missing pulse error can be corrected.



Figure 5.5: RTL implementation of error correction block.

Figure 5.5 shows the RTL implementation of the error correction block. First, the high-counter pulse counts are compared with the maximum pulse count per 10 ns, for 8 consecutive  $clk_{ftl}$  cycles. For a given sampling frequency  $f_s$ , the maximum pulse count per 10 ns can be given by ROUND( $f_s*10*10^{-9}$ ). If the number of times the maximum count is observed ( $C_{max}$ ) is greater than a threshold value  $C_{th}$ , then the alias frequency is automatically rounded off to  $f_s/2$ . This way, the error in  $f_a$  estimation will lie within a 25 MHz error range. The value of  $C_{max}$  and  $C_{th}$  should be decided depending on  $f_s$  and by observing the amplifier outputs.
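The max-count rule described above can be sketched as follows. The per-cycle maximum count is passed in as a parameter rather than derived inside the function; the value 5 in the example follows the roughly 5 pulses per 10 ns quoted for the 490 MHz case at $f_s = 1$ GHz, and the threshold $C_{th} = 4$ is a hypothetical choice. The fallback of scaling the raw count when the rule does not trigger is also an assumption.

```python
T_FTL_NS = 10.0  # clk_ftl period in ns

def estimate_high_alias(counts_per_cycle, max_count, c_th, f_s_mhz):
    """Round the alias estimate to f_s/2 when the maximum count is seen often enough."""
    c_max = sum(1 for c in counts_per_cycle if c == max_count)
    if c_max > c_th:
        return f_s_mhz / 2.0
    # Otherwise fall back to plain scaling of the raw count (f_a = N / T_p).
    t_p_ns = len(counts_per_cycle) * T_FTL_NS
    return sum(counts_per_cycle) / t_p_ns * 1e3

# Eight 10 ns windows; one window lost its pulses to the amplitude envelope.
with_gap = [5, 5, 5, 0, 5, 5, 5, 4]
print(estimate_high_alias(with_gap, max_count=5, c_th=4, f_s_mhz=1000))
```

Without the correction, the 34 pulses counted in this example would scale to 425 MHz, which illustrates how a single missing window pulls the estimate away from $f_s/2$.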

# 5.2. VCO Frequency estimation

This is the most important part of the frequency sensing algorithm and it needs to be robust. Some direct ways to do this are solving many simultaneous congruence equations to estimate the frequency or using large look-up tables that have a one-to-one mapping of the alias frequency combination to the corresponding signal frequency. The second method has two shortcomings:

(1) It requires a high amount of memory to store all the combinations,

(2) It does not account for any noise or non-linearities in the system, which may lead to errors in the alias frequency estimation. This consequently leads to large errors in VCO frequency estimations.

Solving the simultaneous congruences involves solving simultaneous equations of the form:

$$\begin{aligned} f_{vco} &= m_1 \cdot f_{s1} + s_1 \cdot f_{a1}, \\ f_{vco} &= m_2 \cdot f_{s2} + s_2 \cdot f_{a2}, \\ f_{vco} &= m_3 \cdot f_{s3} + s_3 \cdot f_{a3}, \end{aligned}$$

where,  $f_{vco}$  is the VCO signal frequency,  $f_{si} = \{f_{s1}, f_{s2}, f_{s3}\}$  are the three sampling frequencies,  $f_{ai} = \{f_{a1}, f_{a2}, f_{a3}\}$  are the observed alias frequencies,  $m_1, m_2, m_3$  are the integers (where  $m_i \cdot f_{si}$  is the integer multiple of sampling frequency  $f_{si}$  which is closest to the VCO frequency), and  $s_1, s_2, s_3$  give the sign of the alias frequencies and their value lies in  $\{1,-1\}$ . The limitation of these equations is that there are 6 known values -  $f_{si}$  and  $f_{ai}$ , and 7 unknown values -  $f_{vco}, m_i, s_i$ . Fortunately, for a given VCO frequency range  $[f_{min}, f_{max}]$ , there are only a limited number of possibilities for  $m_i$ , for each i. However, if there are x, y, z integer multiples possible for  $f_{s1}, f_{s2}, f_{s3}$  respectively, in  $[f_{min}, f_{max}]$ , then there will be 2x + 2y + 2z equations to be solved, and  $x \cdot y \cdot z$  comparisons to be made, which is a high number of multiplications, additions and comparisons to be performed. In the end, only one set of three equations unambiguously estimates the correct VCO frequency.

To reduce the number of equations to be solved, the VCO frequency can be divided into zones as shown in Figure 5.6. First, the frequency zones are divided based on the frequency integer multiple and then by the alias frequency sign. Furthermore, these zones can be divided into sub-zones which have unique combinations of the vector $\{m_1, m_2, m_3, s_1, s_2, s_3\}$. In Figure 5.6, one such zone is highlighted in green. A look-up table can be created with these unique combinations of m's and s's along with the information of the maximum and minimum possible frequency values of each alias signal in that zone. By just comparing the alias frequency estimated in the previous section with the minimum and maximum limits of all the zones, the current VCO frequency can be narrowed down to 3-4 zones (1 in the best case). That leaves us with 3\*(no. of zones shortlisted) equations to be solved. While solving these few sets, one of the sets will give an unambiguous result to the above simultaneous equations and hence the VCO frequency is estimated.



Figure 5.6: Division of VCO frequency into zones depending on unique combinations of sampling frequency multiples, and alias frequency signs.
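One way to build such a zone table is sketched below: the VCO range is split at every multiple of $f_{si}/2$, and for each resulting zone the vector $\{m_i, s_i\}$ and the alias range of each path are tabulated. This is an illustrative construction, not necessarily the one used in the RTL. The sampling frequencies are the 550/750/850 MHz combination used in the Verilog simulations of Section 5.5, while the 10-14 GHz range, the function names, and the dictionary layout are assumptions.

```python
import math

def build_zones(f_s_list, f_min, f_max):
    """Split [f_min, f_max] at multiples of f_s/2 and tabulate (m_i, s_i) per zone."""
    edges = {f_min, f_max}
    for f_s in f_s_list:
        k = math.floor(2 * f_min / f_s) + 1      # first half-multiple above f_min
        while k * f_s / 2 < f_max:
            edges.add(k * f_s / 2)
            k += 1
    edges = sorted(edges)
    zones = []
    for lo, hi in zip(edges, edges[1:]):
        mid = (lo + hi) / 2
        m = [round(mid / f_s) for f_s in f_s_list]
        s = [1 if mid >= mi * f_s else -1 for mi, f_s in zip(m, f_s_list)]
        # The alias is monotonic inside a zone, so its limits sit at the zone edges.
        fa_range = [tuple(sorted((abs(lo - mi * f_s), abs(hi - mi * f_s))))
                    for mi, f_s in zip(m, f_s_list)]
        zones.append({"lo": lo, "hi": hi, "m": m, "s": s, "fa_range": fa_range})
    return zones

def shortlist(zones, fa_meas, tol=0.0):
    """Keep zones whose alias limits contain all measured alias frequencies."""
    return [z for z in zones
            if all(lo - tol <= fa <= hi + tol
                   for fa, (lo, hi) in zip(fa_meas, z["fa_range"]))]

zones = build_zones([550, 750, 850], 10000, 14000)      # MHz, range is hypothetical
hits = shortlist(zones, [90, 10, 110])                  # aliases of 12010 MHz
print(len(zones), [(z["lo"], z["hi"], z["m"], z["s"]) for z in hits])
```

Only the shortlisted zones then need their three equations evaluated, which is where the reduction in arithmetic comes from.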

The next priority is to robustly estimate the frequency. In ideal conditions, when the three simultaneous equations are solved, the output of all three equations should be exactly equal to  $f_{vco}$ . However, due to various non-linearities like voltage noise, jitter, and insufficient amplifier gain, the pulse count may be inaccurate which leads to an error in  $f_a$  estimation. Moreover, the estimated alias frequencies may have a rounding error depending on the observation time which controls the frequency accuracy. Taking these into account, an error margin of 25 MHz between these estimated frequencies is allowed. Having a high GCD between  $f_{s1}, f_{s2}, f_{s3}$ , allows a higher error margin [19]. Finally, the condition that needs to be satisfied for successful estimation of  $f_{vco}$  is given as

$$\begin{aligned} f_{vco,1} &= m_1 \cdot f_{s1} + s_1 \cdot f_{a1}, \\ f_{vco,2} &= m_2 \cdot f_{s2} + s_2 \cdot f_{a2}, \\ f_{vco,3} &= m_3 \cdot f_{s3} + s_3 \cdot f_{a3}, \\ \{|f_{vco,1} - f_{vco,2}|, |f_{vco,2} - f_{vco,3}|, |f_{vco,3} - f_{vco,1}|\} &\leq 25 \text{ MHz}. \end{aligned}$$
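The acceptance condition can be illustrated with a brute-force search over all $(m_i, s_i)$ candidates, which is the exhaustive approach that the zone table is meant to shorten. The sketch assumes the 550/750/850 MHz sampling frequencies from Section 5.5 and a hypothetical 10-14 GHz VCO range; the function names are not from the thesis.

```python
from itertools import product

def candidates(f_a, f_s, f_min, f_max):
    """All m*f_s + s*f_a values inside [f_min, f_max] for one sampling path."""
    found = set()
    for m in range(int(f_min // f_s), int(f_max // f_s) + 2):
        for s in (1, -1):
            f = m * f_s + s * f_a
            if f_min <= f <= f_max:
                found.add(f)
    return sorted(found)

def estimate_vco(f_a_list, f_s_list, f_min, f_max, tol=25.0):
    """Return the mean of every candidate triple that agrees within tol (MHz)."""
    per_path = [candidates(fa, fs, f_min, f_max)
                for fa, fs in zip(f_a_list, f_s_list)]
    solutions = []
    for f1, f2, f3 in product(*per_path):
        if max(abs(f1 - f2), abs(f2 - f3), abs(f3 - f1)) <= tol:
            solutions.append((f1 + f2 + f3) / 3.0)
    return solutions

f_s_list = [550, 750, 850]                                      # MHz
exact = estimate_vco([90, 10, 110], f_s_list, 10000, 14000)     # aliases of 12010 MHz
noisy = estimate_vco([95, 10, 105], f_s_list, 10000, 14000)     # +/-5 MHz count errors
print(exact, noisy)
```

If more than one triple passes the test, the estimate is ambiguous, which corresponds to the case where the observation time has to be increased.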

The next obstacle in this process is when the VCO approaches the desired frequency, i.e., the error  $f_{vco} - F_{desired}$  becomes much less than 25 MHz. The observation time is increased for increased accuracy. Eventually, when the VCO is locked to  $F_{desired}$ , the aliasing frequencies  $f_{ai}$  do not go to zero, since  $F_{desired}$  may not be an integer multiple of  $f_{si}$ , for any *i*. At the locking point, if any one of the aliasing frequencies lies close to its Nyquist rate, there may be a rounding error of 25 MHz introduced by the error correction algorithm. In that case, the accuracy requirement for small  $F_{err}$  is not met. This may result in an incorrect estimation of  $F_{err}$  which leads to difficulty in locking.

To avoid these rounding errors, $f_{vco}$ estimation for $F_{err} < 25$ MHz will be done by considering only one of the sampling frequency paths, chosen such that its alias frequency at the VCO locking point lies close to $f_s/4$, where the impact of non-linearities is very low. This ensures that the aliasing signals are accurate and therefore the estimated $f_{vco}$ and $F_{err}$ are also accurate. The frequency estimation also becomes easy when there is prior knowledge of which single frequency $f_{si}$ is to be used during locking and of the zone in which the desired frequency lies (i.e., knowing the values of $m_i$, $s_i$ to solve $f_{vco} = m_i \cdot f_{si} + s_i \cdot f_{ai}$).

# 5.3. Speed and Accuracy Optimisation

As explained in Section 5.1, the observation time can be modulated to optimize the frequency estimation accuracy or the locking time. The dynamic modulation of observation time  $T_p$  for increased accuracy can be simply explained by the flow chart in Figure 5.7.



Figure 5.7: One frequency estimation cycle with dynamic  $T_p$  modulation.

The flow chart represents the operation flow from the start to the end of one frequency estimation cycle. Every frequency estimation cycle starts with a minimum observation time, i.e., n = 1. This setting continues to be used as long as the VCO frequency is found and the accuracy of  $F_{err}$  estimation is sufficient. Sometimes using the minimum observation time setting, because of insufficient accuracy, either no correct estimate of VCO frequency is found or more than one frequency is found, causing ambiguity. In such cases, the observation time is increased for more accuracy (n is incremented). On the other hand, the observation time is also increased if the frequency is found, but the accuracy of  $F_{err}$  is not sufficient. This goes on until either the frequency is found successfully or maximum observation time is reached. In both cases, the observation time is set to the minimum again. Thus, the speed and accuracy optimization algorithm ensures fast locking by dynamically updating the minimum time required to accurately estimate a frequency error.
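The control flow of one estimation cycle can be sketched as a small loop. The callback `try_estimate` is a hypothetical stand-in for the VCO frequency estimation and accuracy check, and `n_max = 4` is assumed from the 80 ns and 320 ns limits on $T_p$; incrementing `n` by one follows the description above.

```python
def run_estimation_cycle(try_estimate, n_max=4):
    """One estimation cycle: start at n = 1 and lengthen T_p = n * 80 ns as needed.

    try_estimate(n) must return (f_vco or None, accurate), where f_vco is None
    when no estimate or an ambiguous estimate was found.
    """
    n = 1
    while True:
        f_vco, accurate = try_estimate(n)
        if f_vco is not None and accurate:
            return f_vco, n          # success: the next cycle restarts at n = 1
        if n >= n_max:
            return f_vco, n          # maximum observation time reached
        n += 1                       # observe longer for more accuracy

# Hypothetical estimator that only succeeds once T_p reaches 3 * 80 ns.
def stub_estimator(n):
    return (12010.0, True) if n >= 3 else (None, False)

print(run_estimation_cycle(stub_estimator))
```

Because every call starts again from `n = 1`, long observation times are only spent in the cycles that actually need them.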

# 5.4. Lock and Unlock Detection circuit

The functions of this block are (1) to detect lock at the end of coarse locking procedure, and (2) to detect the loss of lock by observing the presence of large errors between  $f_{vco}$  and  $f_{desired}$  and the external input **PLL Unlock** from the main PLL unlock-detect block.

#### Lock Detection:

The FTL during the locking process brings the PLL close to the desired frequency within a frequency error of 3 MHz. This settling error is sufficient since the main PLL itself has a lock-in range >= 3 MHz and any errors less than that can be corrected by the main PLL. When the frequency error  $F_{err}$  detected by the FTL falls below 3 MHz, the FTL waits for an additional 320 ns (equivalent to 32 FTL clock cycles) to confirm that the error is below 3 MHz, and stops updating the VCO control codes by declaring that the PLL is coarsely locked. The main PLL takes over the fine locking after the FTL is locked. The main PLL is ON all the time. Since the lock-in range of the PLL loop is much smaller than FTL resolution, any frequency modulation caused by the main PLL during coarse locking has no effect on the FTL operation.
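The confirmation step can be modelled as a run-length check on the per-cycle frequency error. This is a simplified sketch: it assumes one $F_{err}$ value per FTL clock cycle and declares lock after 32 consecutive values below 3 MHz; the function name and the example sequence are hypothetical.

```python
def detect_lock(f_err_mhz_per_cycle, threshold_mhz=3.0, confirm_cycles=32):
    """Return the cycle index at which coarse lock is declared, or None."""
    run = 0
    for i, err in enumerate(f_err_mhz_per_cycle):
        run = run + 1 if abs(err) < threshold_mhz else 0
        if run >= confirm_cycles:
            return i
    return None

# The error falls below 3 MHz at index 2 and stays there.
errors = [50.0, 10.0, 2.0] + [1.0] * 31
print(detect_lock(errors))
```

A single excursion above the threshold resets the run, so lock is only declared after an uninterrupted 320 ns below 3 MHz.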

#### **Unlock Detection:**

Unlock is usually triggered because of two conditions. First is when the unlock detection circuit in the main PLL sends an unlock signal **PLL Unlock**. During an unlocked state, the phase detector in the main PLL acts as a mixer whose output is an aliasing signal with a frequency  $f_{vco} - m \cdot f_{ref}$ , where  $m \cdot f_{ref}$  is the integer multiple of the reference frequency closest to the VCO frequency. Consequently, the LPF output also oscillates at the same frequency. In a locked state, the frequency error in the main loop reduces to zero. Hence, there are no oscillations present at the LPF output. Therefore, by amplifying these oscillations into digital pulses and counting these pulses, the presence of a frequency error can be detected. If there are pulses observed in 5 consecutive observation times of 320 ns each, then the FTL assumes the PLL is unlocked and starts the locking process. This additional unlock detection circuit is necessary because the three sampling loops do not reach a zero aliasing signal after locking, and they may easily trigger an error/unlock detection with low frequency errors due to noise and non-linearities.

The 5 consecutive  $T_p$  cycles can be considered similar to a dead-zone between the FTL and main PLL operation.
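The unlock criterion can be illustrated with a short sketch. It assumes the 100 MHz reference of the main PLL and a hypothetical list of pulse counts, one per 320 ns observation window; the names `beat_frequency_hz` and `unlock_window` are illustrative and do not come from the design.

```python
F_REF_HZ = 100_000_000     # main PLL reference frequency (100 MHz)
WINDOW_NS = 320            # one observation window
REQUIRED_WINDOWS = 5       # consecutive windows with pulses before unlock


def beat_frequency_hz(f_vco_hz, f_ref_hz=F_REF_HZ):
    """Frequency of the aliasing signal |f_vco - m * f_ref| seen at the LPF,
    with m * f_ref the reference harmonic closest to the VCO frequency."""
    m = (f_vco_hz + f_ref_hz // 2) // f_ref_hz
    return abs(f_vco_hz - m * f_ref_hz)


def unlock_window(pulse_counts):
    """Return the index of the window in which unlock is declared, or None.

    pulse_counts[i] is the (hypothetical) number of digital pulses counted
    in observation window i.
    """
    consecutive = 0
    for index, count in enumerate(pulse_counts):
        consecutive = consecutive + 1 if count > 0 else 0
        if consecutive == REQUIRED_WINDOWS:
            return index
    return None


if __name__ == "__main__":
    print(beat_frequency_hz(10_005_000_000))           # 5 MHz beat note
    print(unlock_window([0, 1, 1, 0, 1, 1, 1, 1, 1]))  # isolated pulses ignored
    print(REQUIRED_WINDOWS * WINDOW_NS, "ns minimum detection latency")
```

The five-window requirement sets a minimum detection latency of 1600 ns, which is the dead-zone-like behaviour mentioned above.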

# 5.5. Verilog Simulation Results

The following simulations are run to test the functionality of the RTL. For simplicity, these simulations use an ideal VerilogA VCO, ideal sampling clock signals, and ideal VerilogA models of the sampling and amplification blocks. Figure 5.8 shows the FTL locking to different desired frequencies in the VCO tuning range when an arbitrary sampling frequency combination of 550 MHz, 750 MHz, and 850 MHz is chosen. Although this simulation does not include any non-linearities of the analog blocks, it shows that locking remains efficient in the presence of the rounding errors that occur in the alias frequency estimation blocks. The VCO locks to the desired frequency within  $2.5 \,\mu s$ . In the 12 GHz locking path, there is an incorrect frequency estimation in one cycle, which occasionally happens in coarse locking ( $F_{err} > 100$  MHz), when the observation time is very short. However, the loop quickly recovers after such an error since it is still in fast locking mode. Such incorrect frequency estimations rarely happen once the observation time is increased for finer locking.



Figure 5.8: Frequency locking with ideal VCO and ideal FTL.



Figure 5.9: Modulation of the observation time  $(T_p)$  for more accuracy in VCO frequency estimation.

Figure 5.9 shows how the observation time is modulated depending on the frequency error  $F_{err}$ . It is seen that the observation time  $T_p$  starts from a minimum of 80 ns. When the frequency error drops below  $1/T_p$ , then  $T_p$  is doubled to increase the accuracy.  $T_p$  is modulated from fast to slow in every frequency estimation cycle. The 40 ns waiting time represented in Figure 5.9 is usually the time allocated for the VCO and the FTL analog blocks to settle, once the VCO frequency is updated.
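The doubling rule for the observation time can be written as a one-step update. This is a sketch under stated assumptions: the upper limit of 320 ns is assumed here from the 320 ns observation windows mentioned for lock and unlock detection, and the function name `next_observation_time_ns` is hypothetical.

```python
T_P_MIN_NS = 80    # minimum observation time from the text
T_P_MAX_NS = 320   # assumed upper limit (not stated explicitly in this section)


def next_observation_time_ns(t_p_ns, f_err_mhz):
    """Double T_p when the frequency error drops below 1/T_p."""
    resolution_mhz = 1e3 / t_p_ns  # 1/T_p in MHz when T_p is in ns
    if abs(f_err_mhz) < resolution_mhz and t_p_ns < T_P_MAX_NS:
        return t_p_ns * 2
    return t_p_ns


if __name__ == "__main__":
    t_p = T_P_MIN_NS
    for f_err in (400.0, 60.0, 10.0, 5.0, 2.0):  # hypothetical error sequence
        t_p = next_observation_time_ns(t_p, f_err)
        print(f"F_err = {f_err:6.1f} MHz -> next T_p = {t_p} ns")
```

With these numbers, the frequency resolution 1/T_p is 12.5 MHz at 80 ns, 6.25 MHz at 160 ns, and 3.125 MHz at 320 ns.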



# Post-Layout Simulation Results

This chapter presents the post-layout performance of the proposed FTL designed in 40-nm CMOS technology. The simulation results of the locking process are presented in the first section. The next section discusses the breakdown of the power consumption of each block in the frequency tracking loop. The final section compares this design with state-of-the-art structures.



Figure 6.1: Complete chip top level layout.

The proposed sub-Nyquist sampling-based FTL is designed in TSMC LP 40-nm CMOS process.

Figure 6.1 shows the complete top-level layout of the chip. The other blocks in the main PLL like VCO, OTA, loop filter, reference buffer, 100 MHz pulse generators, test buffer, and the charge sampling phase detector are all reused from an existing design made at TU Delft to save time. A buffer is added at the VCO output in order to have sufficient drive strength to drive the FTL sampler inputs and the load posed by the routing parasitics.

Separate supplies are provided for the VCO, the test buffer, the FTL analog block, the digital block, the OTA, and the reference buffer combined with the PD. The OTA, the reference path, and the FTL analog block share the same on-chip ground and IO ground. The digital block has its own chip ground and IO ground. The VCO and the test buffer share the same IO ground. The total chip area, excluding the IO ring and including the decoupling capacitors, is 980 µm × 1115 µm. It is dominated by the decoupling capacitors, which suppress the supply ripple caused by the bondwire inductances. The FTL alone occupies  $0.35 \text{ mm}^2$ :  $560 \mu m \times 460 \mu m$  for the analog block and  $330 \mu m \times 280 \mu m$  for the digital block. The digital block is quite large because of the large look-up table needed for the VCO frequency estimation.



# 6.1. Frequency locking

Figure 6.2: FTL coarse locking process for initial acquisition and error injection.

The performance of the designed FTL is tested at the TT (typical-typical) process corner and a nominal temperature of 27 °C, by coarsely locking the PLL to different frequencies in the VCO frequency range. The FTL analog blocks used in these simulations are extracted from their layouts to include all the layout and routing parasitics. A Verilog model of the digital block is used instead of its large schematic model to reduce the load on the simulator. For simplicity, a VerilogA model of the main PLL VCO is used, which replicates the L-C characteristics and the non-linear VCO gain. The FTL and the main PLL use a 1.1 V supply. The effects of bondwire inductances are also included in the simulation setup. The sampling clock jitter, as measured in Section 4.1.2, is also included in these simulations.

Figure 6.2 shows the locking performance of the proposed FTL, aiding a PLL, when the digital input for the desired frequency is 10 GHz. An arbitrary sampling frequency combination of 550 MHz, 750 MHz, and 850 MHz is used for the frequency estimation. These frequencies are derived from a set of pairwise co-prime integers 11, 15, 17, and a GCD of 50 MHz. As they all lie within the 0.5-1 GHz range, they satisfy all the conditions mentioned in Section 2.3.2.
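The choice of sampling frequencies can be checked with a few lines of code. The sketch below only verifies the two properties stated here (pairwise co-prime factors and frequencies within 0.5-1 GHz); the remaining conditions of Section 2.3.2 are not reproduced, and the helper names are illustrative.

```python
from itertools import combinations
from math import gcd

GCD_MHZ = 50             # common factor of the sampling frequencies
FACTORS = (11, 15, 17)   # pairwise co-prime integers from the text


def pairwise_coprime(values):
    """True if every pair of integers in values has a GCD of 1."""
    return all(gcd(a, b) == 1 for a, b in combinations(values, 2))


def sampling_frequencies_mhz(factors=FACTORS, base_mhz=GCD_MHZ):
    """Sampling frequencies obtained as factor * common base frequency."""
    return [k * base_mhz for k in factors]


def within_range(freqs_mhz, low_mhz=500, high_mhz=1000):
    """True if all sampling frequencies lie in the 0.5-1 GHz range."""
    return all(low_mhz <= f <= high_mhz for f in freqs_mhz)


if __name__ == "__main__":
    freqs = sampling_frequencies_mhz()
    print(freqs, pairwise_coprime(FACTORS), within_range(freqs))
```

The factors 11, 15, and 17 give 550 MHz, 750 MHz, and 850 MHz.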

The first half of the figure shows the initial acquisition of the PLL at start-up. The FTL locks to the desired frequency of 10 GHz in less than  $3 \mu s$ , owing to its robust frequency estimation and its dynamic speed and accuracy optimization technique. The zoomed-in view shows that the FTL coarsely locks the PLL to within a 3 MHz error and that the fine locking is handled by the main PLL. The main PLL loop is not turned off during the FTL locking process, so that it can acquire lock immediately once the VCO frequency is within its lock-in range. Conversely, leaving the FTL active during the locking process of the main PLL might lead to a loss of lock; hence the FTL's output is frozen and the FTL is put in standby mode when the main PLL takes over. Thanks to the high- and low-frequency amplifiers and the error correction algorithm, the digital block estimates the VCO frequency accurately in the presence of sampling clock jitter and other non-linearities.

After  $3.5 \mu s$ , an error greater than 1 GHz is injected by switching the VCO settings to a distant value. The FTL, which is still sampling the VCO frequency in the background, detects the large error quickly and starts the locking process. The lock is re-acquired within  $3 \mu s$ . One weakness of this digital block is that, even for error injections as low as 5 MHz, the time to re-lock is still as high as  $2.5-3 \mu s$ . This delay arises because the FTL turns ON only if the error is consistently present for 5 consecutive observation times, which avoids false unlock detections due to noise. Since the observation time for small errors is long to achieve better accuracy, the relocking process is slow for small error injections.

# 6.2. Impact of FTL on PLL performance

Since the FTL controls the VCO only during the coarse locking phase, and is in standby mode during the locked phase, it does not contribute to the in-band phase noise of the PLL. Additionally, since a buffer was added between the VCO and the FTL because of the large routing loads, the VCO is isolated from the FTL. Hence the spurious tones that could have been caused by sampling are also avoided. The phase noise and spur performance of the PLL are shown below.



Figure 6.3: Frequency spectrum at main PLL output of 10 GHz.

Figure 6.3 shows the frequency spectrum of the VCO output when the PLL is locked to 10 GHz. The spectrum shows reference spurs of -81.7 dBc at 100 MHz offset from 10 GHz. It can be seen that the reference spur is the worst spur observed in the spectrum. Any spurs caused by the sampling frequencies should appear at their integer multiples around 10 GHz. Since the chosen frequencies are 550, 750 and 850 MHz, the spurs should occur at 9.9 GHz, 9.75 GHz and 10.2 GHz respectively. As seen from the spectrum, any spurs at these frequencies are much lower than the reference spur.
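The expected spur locations follow from the harmonic of each sampling frequency that lies closest to the 10 GHz carrier. The short sketch below reproduces that arithmetic; the function name `nearest_harmonic_mhz` is illustrative.

```python
F_CARRIER_MHZ = 10_000
SAMPLING_MHZ = (550, 750, 850)


def nearest_harmonic_mhz(f_carrier_mhz, f_s_mhz):
    """Integer multiple of f_s closest to the carrier frequency."""
    m = (f_carrier_mhz + f_s_mhz // 2) // f_s_mhz
    return m * f_s_mhz


if __name__ == "__main__":
    for f_s in SAMPLING_MHZ:
        spur = nearest_harmonic_mhz(F_CARRIER_MHZ, f_s)
        print(f"f_s = {f_s} MHz: harmonic at {spur} MHz, "
              f"offset {spur - F_CARRIER_MHZ:+d} MHz")
```

This gives harmonics at 9900 MHz, 9750 MHz, and 10200 MHz, that is, offsets of -100 MHz, -250 MHz, and +200 MHz from the carrier.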

# 6.3. FTL Power consumption

In this section, the worst-case power consumption of the FTL at steady state (after locking) is analyzed. The contribution of each analog and digital block in the FTL to its total power consumption is listed in Table 6.1. In this measurement, all the sampling frequencies are set to 1 GHz, which is the maximum allowed sampling frequency in this project. This frequency is chosen because it leads to the maximum power consumption in most of the analog circuit blocks and allows a critical analysis of the FTL's power efficiency. During the actual functioning of the FTL, depending on the choice of sampling frequencies, the total power consumption may reduce by  $300-400 \,\mu$ W.

| Block                            | Power per block at 1 GHz (µW) | No. of blocks | Total power, 3 slices ON (µW) | Total power, 1 slice ON (µW) |
|----------------------------------|---------------------------------|--------------|----------------------------|-----------------------------------|
| Ring Oscillator                  | 200                             | 3x           | 600                        | 600                               |
| RO pulse generators              | 30                              | 3x           | 90                         | 90                                |
| Sampler                          | 20                              | 3x           | 60                         | 20                                |
| Amplifier                        | 100                             | 3x           | 300                        | 100                               |
| SA Pulse generators              | 100                             | 3x           | 300                        | 100                               |
| Reference divider                | 10                              | 1x           | 10                         | 10                                |
| Digital block<br>(after locking) | <= 200                          | 1x           | 200                        | 200                               |
| Total FTL                        |                                 |              | 1560                       | 1120                              |

Table 6.1: Power consumption of individual blocks in the FTL.

Table 6.1 summarizes the power breakdown of the individual FTL blocks during the steady-state after the locking process is complete. During the locking process, the analog circuit blocks consume a similar power but the digital block power consumption increases to a few mW.



Figure 6.4: FTL Power consumption split up.

As can be gathered from Figure 6.4, and from Table 6.1, the major contributors of high power consumption are the ring oscillators, due to their high frequency and low phase noise operation. The blocks that consume the next highest amount of power are the amplifier chain and the sampling pulse generator circuits. The pulse generators are sized to add very little jitter to the high-frequency sampling clock and hence consume more power. Similarly, the high-gain requirement and high frequency of operation lead to high power consumption in the amplifiers.

Table 6.1 also shows a scenario in which two of the three sampling slices are turned off after the locking process is complete, to save power. The remaining slice may not be sufficient for frequency locking, but it can be used to detect a loss of lock and turn on the other slices for relocking. However, only the samplers, pulse generators, and amplifiers can be turned off, as the ring oscillators need to be calibrated each time they are turned off and on. Since this design does not support automatic calibration of the ROs, they cannot be turned off after locking. Nevertheless, turning off two loops reduces the power consumption of the FTL by about 30%.

The digital block consumes only 200  $\mu$W after the locking operation is complete, which corresponds to the clock switching. During the locking process, it consumes much higher power (around 2 mW) because of the large look-up table comparisons, which are avoided after locking.

The total PLL has a power consumption of 6.4 mW, with the VCO and oscillator output buffer together consuming 4 mW. The reference buffer, reference pulse generators and CSPD of the main PLL consume up to 0.7 mW and the OTA uses 0.07 mW. The power of the test buffer is not included in the total power. The FTL consumes almost 25% of the total power consumption, making it the second most power-consuming block after the VCO.
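The worst-case FTL power and its share of the chip power can be cross-checked from the per-block figures of Table 6.1 with all three slices ON. The dictionary layout and function name below are illustrative; the numbers are taken from the table and from the 6.4 mW chip power quoted above.

```python
# (power per block in uW at 1 GHz, number of blocks), all three slices ON
FTL_BLOCKS_UW = {
    "ring oscillator": (200, 3),
    "RO pulse generators": (30, 3),
    "sampler": (20, 3),
    "amplifier": (100, 3),
    "SA pulse generators": (100, 3),
    "reference divider": (10, 1),
    "digital block (after locking)": (200, 1),
}
CHIP_POWER_MW = 6.4


def total_power_uw(blocks):
    """Sum of per-block power multiplied by the number of instances."""
    return sum(power * count for power, count in blocks.values())


if __name__ == "__main__":
    ftl_uw = total_power_uw(FTL_BLOCKS_UW)
    share = ftl_uw / 1000 / CHIP_POWER_MW
    print(f"FTL power: {ftl_uw} uW, {share:.1%} of the chip power")
```

The sum is 1560 µW, or about 24% of the 6.4 mW chip power, in line with the "almost 25%" stated above.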

# 6.4. Area and power comparison with State-of-the-art FTLs

The performance summary of the proposed FTL based on multiple sub-Nyquist samplers is presented in Table 6.2. Additionally, a performance comparison with other state-of-the-art PLLs and their frequency-locking aids is also reported. Some of the structures summarised in the table lie within a similar frequency range as this project, while the others function at a mm-Wave frequency. These high frequency architectures are considered in order to compare the impact of frequency scaling. The performance is compared in the aspects of acquisition range, power consumption, area and locking time.

|                              | JSSC'20 [29]      | JSSC'20 [12] | ISSCC'20 [30]        | Yonc Chen [31]       | JSSC'20 [10]         | TMTT'21 [14]         | This project             |
|------------------------------|-------------------|--------------|----------------------|----------------------|----------------------|----------------------|--------------------------|
| Technology (CMOS)            | 28 nm             | 40 nm        | 65 nm                | 65 nm                | 65 nm                | 65 nm                | 40 nm                    |
| Reference frequency (MHz)    | 500               | 200          | 50                   | 103                  | 100                  | 100                  | 100                      |
| Frequency range (GHz)        | 12.8-15.2         | 12-16.0      | 12-14.5              | 24.6-29.2            | 20-25.6 <sup>a</sup> | 40.5                 | 9.8-12.8                 |
| FTL type                     | Divider + BBPD    | Div + PFD    | Freq correction loop | Div + dynamic PFD-CP | Div + PFD            | Reference multiplier | Multiple sub-Nyquist FTL |
| Frequency step               | 1 GHz             | NA           | 90 MHz               | NA                   | NA                   | <450 MHz             | >1 GHz                   |
| Settling time (µs)           | 18.5              | NA           | 0.7                  | NA                   | NA                   | NA                   | 3                        |
| Chip power (mW)              | 19.8              | 7.2          | 6.7                  | 10.6                 | 49.5 <sup>b</sup>    | 8.8                  | 6.4                      |
| FTL power (mW)               | 1 <sup>c</sup>    | 1.55         | 0.15                 | 4.14 <sup>d</sup>    | 5.6 <sup>e</sup>     | 2.76 <sup>f</sup>    | 1.56 <sup>g</sup>        |
| Chip area <sup>h</sup> (mm²) | 0.17              | 0.234        | 0.23                 | 0.26                 | 0.58                 | 0.6                  | 1.09                     |

Table 6.2: Performance comparison of State-of-the-Art PLLs.

<sup>a</sup> Frequency range of the first stage.

<sup>b</sup> Total power including other ILFMs.

<sup>c</sup> Calculated from power of divider and BBPD.

<sup>d</sup> Calculated using power of divider and PD-FD + V/I.

<sup>e</sup> Calculated using power of divider and PFD+CP.

<sup>f</sup> Power of reference multiplier.

<sup>g</sup> Worst case power consumption.

<sup>h</sup> Total active area.

#### Acquisition Range:

The acquisition range of the proposed FTL covers the entire tuning range of the main PLL, which is 2.4 GHz. However, it is strictly limited to the dynamic range of the chosen sampling frequencies. Nevertheless, there is flexibility in choosing these frequencies, and hence the locking range can be increased at the cost of other trade-offs. The FTL in [30] exhibits a very low power consumption, but its locking range is limited to a total of 170 MHz. The FTLs in [12], [31], and [10] use a divider and a PFD for frequency acquisition. As described in Chapter 1, the advantage of having dividers in the feedback along with a PFD is that they give an unlimited acquisition range and a robust locking process. In [29], an FTL based on a feedback divider is employed, but a bang-bang PD is used instead of a PFD. Their measurement results show an efficient acquisition for a frequency hop of up to 1 GHz. Wang and Momeni [14] used a reference multiplier to increase the speed and range of locking. Although its power consumption is slightly better than that of dividers at mm-Wave frequencies, this architecture only locks to a fixed frequency and its lock-in range is limited to a maximum of 450 MHz on each side.

#### **Power Consumption:**

Zhao Zhang, *et al.*, [12] employ a conventional FTL based on a divider and a phase-frequency detector (PFD), and its power consumption is similar to that of the proposed FTL. The most power-consuming blocks of their FTL are the injection-locked dividers, which consume 1.3 mW. *Santiccioli, et al.*, [29] also used an FTL based on a divider, and their power consumption is the best among the compared FTLs, including the proposed FTL. However, their chip is designed in a smaller technology node than the rest, which gives them the added advantage of lower power consumption.

As the frequency of operation increases to mm-Wave range, the power consumption of dividers scales more than linearly. [31] and [10] use high frequency dividers for frequency acquisition and the total power consumption of the FTL is dominated by these dividers. The power consumption of the reference multiplier in [14] is better than the dividers at mmWave frequencies. However, the design has no flexibility in the locking range.

#### Area Overhead:

The proposed FTL has the highest area overhead of all the compared FTLs, owing to the large look-up tables used in the digital block and the use of multiple PLLs. Even when the area of the decoupling capacitors is ignored, the area of the FTL is limited by the number of sampling paths. As PLLs move to much higher frequencies, the size of the VCO keeps reducing but the area of the proposed FTL scales only with the number of sampling paths, making it less amenable to area scaling. In some FTLs that use an ILFD, the area consumed by the ILFDs may be higher than that of the VCO itself because of a larger inductor area [32].

#### Locking Time:

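The proposed FTL achieves the best locking time for large frequency steps among the compared FTLs, because it uses a speed optimization algorithm to quickly reduce large errors: it settles within 3 µs for steps above 1 GHz, whereas [29] reports 18.5 µs for a 1 GHz step and the 0.7 µs of [30] applies to a 90 MHz step (Table 6.2). The achieved locking time also satisfies the transient time limits of 5G NR as mentioned in Chapter 1.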

In summary, the proposed FTL achieves a better locking time, a higher area overhead, and a moderate power consumption compared to FTLs of a similar frequency range. The prospect of scaling it to higher frequency ranges is discussed in Section 7.1.3.

# Conclusion

Sub-sampling PLLs have been the top performers in the regime of high speed applications owing to their low-power and low phase noise performance. However, they are hindered by their low lock-in range, making them susceptible to loss of frequency lock due to PVT variations, and error injections. The objective of this thesis is to provide a solution that can improve the acquisition performance of the PLLs, while avoiding the high power-consuming blocks like high-frequency dividers. This thesis explores a novel idea of a frequency tracking loop, that uses multiple sub-Nyquist samplers for frequency estimation, which has not been used before in the domain of PLL. The possibility of an unambiguous frequency estimation using three sub-Nyquist frequencies has been mathematically proven in Chapter 2. Additionally, formulae have been derived to calculate the maximum frequency bandwidth in which an unambiguous frequency estimation is possible for a given system of sampling frequencies. In Chapter 5, a speed and accuracy optimization algorithm has also been proposed, which improves the locking time of the FTL.

As a proof of concept, an FTL targeting a frequency tuning range of 9.8-12.2 GHz is designed in 40-nm CMOS technology. A system of three sampling frequencies (0.55, 0.75 and 0.85 GHz) is chosen such that they are derived from a set of pairwise co-prime integers and a reference frequency of 50 MHz. This system of samplers can support a bandwidth of 9 GHz-12.8 GHz. The designed FTL shows promising results in post-layout simulations, locking to the desired frequency within 3  $\mu$s while consuming a maximum power of 1.56 mW. It covers a large locking range of 2 GHz and corrects frequency error injections as high as 1.5 GHz within 3  $\mu$s. The area overhead caused by the FTL is 0.35 mm<sup>2</sup>. While having a power consumption similar to that of conventional FTLs that employ high-frequency dividers, the proposed FTL uses a comparatively larger area because of the large digital block and the large number of analog blocks. However, it is worth investigating the use of this FTL at much higher frequency bands to exploit its advantage of not using high-frequency dividers.

# 7.1. Future Work

## 7.1.1. Acquisition range of Ring-oscillator based frequency multipliers in FTL

As the current version of the proposed FTL is meant to be a proof of concept, the RO-based PLLs used to generate the sampling frequencies in this FTL, are designed to have the required phase-noise performance for a robust VCO frequency estimation. However, they are limited to a manual coarse frequency tuning. These RO PLLs, which do not have an additional frequency acquisition aid, have a risk of locking to any harmonic of the reference 50 MHz rather than the desired frequency and always need a manual tuning at the start up. The lock-in range of these PLLs is also quite low, making them very susceptible to frequency drifts caused by small changes in temperature.

Therefore, in the future versions it is desirable to use a feedback divider based frequency aid for coarse tuning and frequency tracking of these low-frequency PLLs, for a robust and autonomous functioning of the FTL. The power and area overhead of these dividers can be quite low because of their sub-GHz frequency of operation.

## 7.1.2. Power Optimization

This project has focused on simplicity of design for sampling frequency generation and hence integer-N PLLs were preferred for low design complexity as well as to avoid extra spur and phase noise that might be introduced by a fractional-N operation. However, Section 3.3 suggests that the jitter can be relaxed at lower sampling rates. Additionally, the simulation results in Section 4.6 show that the non-linearities at high aliasing frequencies are less dominant at lower sampling rates. Therefore, if sampling rates are reduced to below 500 MHz, the power consumption of the FTL may be reduced to half of the current consumption. However, the increase in the design complexity of the digital block and the power and area overhead caused by circuit blocks for fractional-N operation need to be taken into consideration. Furthermore, the low locking time may need to be sacrificed.

## 7.1.3. Scaling the FTL to higher frequencies

The true potential of this FTL concept lies in scaling it to higher frequency ranges. As mentioned in Section 2.3.2, it is theoretically possible to use three sampling frequencies under 300 MHz to unambiguously estimate frequencies > 25 GHz over large dynamic ranges. To analyze the practicality of this FTL at mm-Wave frequencies, consider that the same sampling frequencies as in the current FTL (550 MHz, 750 MHz, 850 MHz) are used for a VCO output frequency of 30 GHz. Using Eq. (3.4), it can be calculated that the sampling clock jitter requirement then becomes three times tighter than for the current PLL (running at 10 GHz). From Eq. (4.2), achieving a 3 times lower jitter requires the ring oscillator power to be increased by 9 times. However, the power consumption of the remaining blocks, such as the sampler, the amplifier, and the digital block, will remain similar.

Alternatively, in the case of high VCO frequencies, the sampling frequencies can be scaled down to keep the jitter requirement constant (according to Eq. (3.4)). For example, a jitter of < 2 ps is necessary for both  $f_{vco} = 10\,\text{GHz}$ ,  $f_s = 1\,\text{GHz}$  and  $f_{vco} = 30\,\text{GHz}$ ,  $f_s = 0.33\,\text{GHz}$ . When an RO-based PLL is used for sampling frequency generation, the same RMS jitter performance can be achieved at both sampling frequencies, given that the power consumption remains constant. As the ring oscillator core consumes 200  $\mu$W at 1 GHz output to produce 2 ps integrated jitter, producing the same jitter for the same power consumption at 330 MHz requires the PLL to have a Figure of Merit (FoM) of -241 dB, where the FoM is given by

$$\text{FoM} = 20 \log_{10}\left(\frac{\text{jitter}}{1\,\text{s}}\right) + 10 \log_{10}\left(\frac{\text{power}}{1\,\text{mW}}\right). \tag{7.1}$$
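Eq. (7.1) can be evaluated directly for the numbers quoted above. The sketch below also inverts the formula at constant FoM, which is an illustrative use of Eq. (7.1) rather than a reproduction of Eq. (4.2); the function names are hypothetical.

```python
import math


def pll_fom_db(jitter_s, power_mw):
    """Jitter-power figure of merit of Eq. (7.1), in dB."""
    return 20 * math.log10(jitter_s / 1.0) + 10 * math.log10(power_mw / 1.0)


def power_mw_at_fom(fom_db, jitter_s):
    """Power needed to reach a given jitter at a fixed FoM (Eq. (7.1) inverted)."""
    return 10 ** ((fom_db - 20 * math.log10(jitter_s)) / 10)


if __name__ == "__main__":
    fom = pll_fom_db(jitter_s=2e-12, power_mw=0.2)  # 2 ps at 200 uW
    print(f"FoM = {fom:.1f} dB")
    # At the same FoM, a three times lower jitter costs nine times the power.
    print(f"{power_mw_at_fom(fom, 2e-12 / 3):.2f} mW")
```

A 2 ps jitter at 200 µW corresponds to about -241 dB. At constant FoM, the power scales with the inverse square of the jitter, which is consistent with the ninefold power increase quoted for a threefold jitter reduction.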

Therefore, to avoid a large increase in power consumption at higher VCO frequencies, the only solution is to explore sampling frequencies with higher co-prime factors and a smaller GCD, which leads to the use of a fractional-N PLL. This has the following consequences.

- A fractional-N PLL in the place of an integer-N PLL will add additional jitter caused by Delta-Sigma Modulator (DSM) quantization noise and fractional spur. The power and noise performance of ring oscillator based fractional-N PLL at such low frequencies needs to be studied.
- The locking time may be degraded because there is a minimum observation time requirement for an unambiguous frequency reconstruction. Furthermore, if all the alias frequencies lie under 150 MHz (because of lowered sampling rates), then the observation time increases drastically, for accurate estimation of frequency.
- Lower sampling frequencies also have higher computation complexity because of high periodicity in their alias frequency spectrum, leading to higher number of equations to be solved. A more robust algorithm needs to be created which can perform accurate frequency estimation with less complexity.
- Moving to higher frequencies and new technologies, as the VCOs become very small, this FTL will become the dominant source of area consumption.

Nevertheless, with a slight sacrifice of power consumption and increased locking times, the proposed FTL shows potential in scaling to higher frequencies of operation.

# Bibliography

- [1] Fatimah Al-Ogaili and Raed M. Shubair. "Millimeter-wave mobile communications for 5G: Challenges and opportunities". In: 2016 IEEE International Symposium on Antennas and Propagation (APSURSI). 2016, pp. 1003–1004. DOI: 10.1109/APS.2016.7696210.
- [2] Hanli Liu et al. "A 265-μW Fractional-N Digital PLL With Seamless Automatic Switching Sub-Sampling/Sampling Feedback Path and Duty-Cycled Frequency-Locked Loop in 65-nm CMOS". In: *IEEE Journal of Solid-State Circuits* PP (Sept. 2019), pp. 1–15. DOI: 10.1109/JSSC.2019.2936967.
- [3] Roland E. Best. "Mixed-Signal PLL Analysis". In: Phase-Locked Loops: Design, Simulation, and Applications, Sixth Edition. 6th ed. New York: McGraw-Hill Education, 2007. ISBN: 9780071493758. URL: https://www.accessengineeringlibrary.com/content/book/9780071493758/chapter/chapter3.
- [4] "Acquisition of Phaselock". In: Phaselock Techniques. John Wiley & Sons, Ltd, 2005. Chap. 8, pp. 183–208. ISBN: 9780471732693. DOI: https://doi.org/10.1002/0471732699.ch8.
- [5] Xiang Gao et al. "Spur Reduction Techniques for Phase-Locked Loops Exploiting A Sub-Sampling Phase Detector". In: *IEEE Journal of Solid-State Circuits* 45.9 (2010), pp. 1809–1821. DOI: 10.1109/JSSC.2010.2053094.
- [6] Jiang Gong et al. "A Low-Jitter and Low-Spur Charge-Sampling PLL". In: *IEEE Journal of Solid-State Circuits* (2021), pp. 1–1. DOI: 10.1109/JSSC.2021.3105335.
- [7] Jieyang Li, Ting Yi, and Zhiliang Hong. "A 23-30.8GHz All-digital Phase-Locked Loop for 5G Communication System". In: 2020 IEEE 15th International Conference on Solid-State Integrated Circuit Technology (ICSICT). 2020, pp. 1–3. DOI: 10.1109/ICSICT49897.2020.9278314.
- [8] Chia-Pang Yen, Yingming Tsai, and Xiaodong Wang. "Wideband Spectrum Sensing Based on Sub-Nyquist Sampling". In: *IEEE Transactions on Signal Processing* 61.12 (2013), pp. 3028–3040. DOI: 10.1109/TSP.2013.2251342.
- [9] Paul T. Lanza II, John L. Stensby, and Yuri B. Shtessel. "An exact formula for the maximum VCO sweep rate of a PLL". In: *Journal of the Franklin Institute* 351.9 (2014), pp. 4495–4513. ISSN: 0016-0032. DOI: https://doi.org/10.1016/j.jfranklin.2014.05.015.
- [10] Xiaolong Liu and Howard C. Luong. "A Fully Integrated 0.27-THz Injection-Locked Frequency Synthesizer With Frequency-Tracking Loop in 65-nm CMOS". In: *IEEE Journal of Solid-State Circuits* 55.4 (2020), pp. 1051–1063. DOI: 10.1109/JSSC.2019.2954232.
- [11] Wanghua Wu et al. "A 28-nm 75-fs<sub>rms</sub> Analog Fractional-N Sampling PLL With a Highly Linear DTC Incorporating Background DTC Gain Calibration and Reference Clock Duty Cycle Correction". In: *IEEE Journal of Solid-State Circuits* 54.5 (2019), pp. 1254–1265. DOI: 10.1109/JSSC.2019.2899726.
- [12] Zhao Zhang, Guang Zhu, and C. Patrick Yue. "A 0.65-V 12–16-GHz Sub-Sampling PLL With 56.4-fs<sub>rms</sub> Integrated Jitter and -256.4-dB FoM". In: *IEEE Journal of Solid-State Circuits* 55.6 (2020), pp. 1665–1683. DOI: 10.1109/JSSC.2020.2967562.
- [13] Teerachot Siriburanon et al. "A 13.2% locking-range divide-by-6, 3.1mW, ILFD using even-harmonic-enhanced direct injection technique for millimeter-wave PLLs". In: 2013 Proceedings of the ESSCIRC (ESSCIRC). 2013, pp. 403–406. DOI: 10.1109/ESSCIRC.2013.6649158.
- [14] Hao Wang and Omeed Momeni. "Low-Power and Low-Noise Millimeter-Wave SSPLL With Subsampling Lock Detector for Automatic Dividerless Frequency Acquisition". In: *IEEE Transactions on Microwave Theory and Techniques* 69.1 (2021), pp. 469–481. DOI: 10.1109/TMTT.2020.3039549.

- [15] G.W. Tunnicliffe, A. Sathyendran, and A.R. Murch. "Performance improvement in GSM networks due to slow frequency hopping". In: 1997 IEEE 47th Vehicular Technology Conference. Technology in Motion. Vol. 3. 1997, 1857–1861 vol.3. DOI: 10.1109/VETEC.1997.605880.
- [16] Jeffrey Prinzie et al. "A Fast Locking 5.8–7.2-GHz Fractional-N Synthesizer With Sub-2-us Settling in 22-nm FDSOI". In: IEEE Solid-State Circuits Letters 3 (2020), pp. 546–549. DOI: 10. 1109/LSSC.2020.3036122.
- [17] Marten Sundberg et al. "On the Impact of Transient Period for Short Transmission Duration". In: 2017 IEEE Globecom Workshops (GC Wkshps). 2017, pp. 1–6. DOI: 10.1109/GLOCOMW. 2017.8269168.
- [18] 3GPP. User Equipment (UE) radio transmission and reception; Part 1: Range 1 Standalone. TS 38.101-1. 3rd Generation Partnership Project (3GPP). URL: https://portal.3gpp.org/ desktopmodules/Specifications/SpecificationDetails.aspx?specificationId= 3283.
- [19] Li Xiao and Xiang-Gen Xia. "Frequency determination from truly sub-Nyquist samplers based on robust Chinese remainder theorem". In: *Signal Processing* 150 (2018), pp. 248–258. ISSN: 0165-1684. DOI: 10.1016/j.sigpro.2018.04.022. URL: https://www.sciencedirect.com/science/article/pii/S0165168418301488.
- [20] P.E. Pace, R.E. Leino, and D. Styer. "Use of the symmetrical number system in resolving single-frequency undersampling aliases". In: *IEEE Transactions on Signal Processing* 45.5 (1997), pp. 1153–1160. DOI: 10.1109/78.575690.
- [21] Y.T. Chan et al. "Evaluation of various FFT methods for single tone detection and frequency estimation". In: CCECE '97. Canadian Conference on Electrical and Computer Engineering. Engineering Innovation: Voyage of Discovery. Conference Proceedings. Vol. 1. 1997, pp. 211–214. DOI: 10.1109/CCECE.1997.614827.
- [22] L. Palmer. "Coarse frequency estimation using the discrete Fourier transform (Corresp.)". In: *IEEE Transactions on Information Theory* 20.1 (1974), pp. 104–109. DOI: 10.1109/TIT.1974.1055156.
- [23] Jiang Gong et al. "A 2.7mW 45fs<sub>rms</sub>-Jitter Cryogenic Dynamic-Amplifier-Based PLL for Quantum Computing Applications". In: 2021 IEEE Custom Integrated Circuits Conference (CICC). 2021, pp. 1–2. DOI: 10.1109/CICC51472.2021.9431541.
- [24] M.P. Li. Jitter, Noise, and Signal Integrity at High-Speed. Prentice Hall Modern Semiconductor Design Series. Prentice Hall, 2008. ISBN: 9780132429610. URL: https://books.google.nl/books?id=JRAfAQAAIAAJ.
- [25] T. Miyazaki, M. Hashimoto, and H. Onodera. "A performance comparison of PLLs for clock generation using ring oscillator VCO and LC oscillator in a digital CMOS process". In: ASP-DAC 2004: Asia and South Pacific Design Automation Conference 2004 (IEEE Cat. No.04EX753). 2004, pp. 545–546. DOI: 10.1109/ASPDAC.2004.1337641.
- [26] A. Hajimiri, S. Limotyrakis, and T.H. Lee. "Jitter and phase noise in ring oscillators". In: *IEEE Journal of Solid-State Circuits* 34.6 (1999), pp. 790–804. DOI: 10.1109/4.766813.
- [27] Belal M. Helal et al. "A Low Jitter Programmable Clock Multiplier Based on a Pulse Injection-Locked Oscillator With a Highly-Digital Tuning Loop". In: *IEEE Journal of Solid-State Circuits* 44.5 (2009), pp. 1391–1400. DOI: 10.1109/JSSC.2009.2015816.
- [28] Long Kong and Behzad Razavi. "A 2.4 GHz 4 mW Integer-N Inductorless RF Synthesizer". In: *IEEE Journal of Solid-State Circuits* 51.3 (2016), pp. 626–635. DOI: 10.1109/JSSC.2015.2511157.
- [29] Alessio Santiccioli et al. "A 66-fs-rms Jitter 12.8-to-15.2-GHz Fractional-N Bang–Bang PLL With Digital Frequency-Error Recovery for Fast Locking". In: *IEEE Journal of Solid-State Circuits* 55.12 (2020), pp. 3349–3361. DOI: 10.1109/JSSC.2020.3019344.
- [30] Younghyun Lim et al. "17.8 A 170MHz-Lock-In-Range and -253dB-FoM<sub>jitter</sub> 12-to-14.5GHz Subsampling PLL with a 150µW Frequency-Disturbance-Correcting Loop Using a Low-Power Unevenly Spaced Edge Generator". In: 2020 IEEE International Solid-State Circuits Conference - (ISSCC). 2020, pp. 280–282. DOI: 10.1109/ISSCC19947.2020.9062921.

- [31] Zunsong Yang et al. "A 10.6-mW 26.4-GHz Dual-Loop Type-II Phase-Locked Loop Using Dynamic Frequency Detector and Phase Detector". In: *IEEE Access* 8 (2020), pp. 2222–2232. DOI: 10.1109/ACCESS.2019.2962060.
- [32] Luca Bertulessi et al. "A 30-GHz Digital Sub-Sampling Fractional-*N* PLL With -238.6-dB Jitter-Power Figure of Merit in 65-nm LP CMOS". In: *IEEE Journal of Solid-State Circuits* 54.12 (2019), pp. 3493–3502. DOI: 10.1109/JSSC.2019.2940332.