A Digital-Intensive Wakeup Timer based on an RC Frequency-Locked Loop for Internet-of-Things Applications

Zhihao Zhou







# A Digital-Intensive Wakeup Timer based on an RC Frequency-Locked Loop for Internet-of-Things Applications

by

#### Zhihao Zhou

in partial fulfillment of the requirements for the degree of

#### **Master of Science**

in Microelectronics

at the Delft University of Technology, to be defended publicly on Monday, November 27, 2017.

Supervisors: Prof. dr. Fabio Sebastiano, TU Delft

Dr. Yao-Hong Liu, IMEC/Holst Centre

Ir. Ming Ding, IMEC/Holst Centre

Thesis committee: Prof. dr. Michiel Pertijs, TU Delft

Prof. dr. Masoud Babaie, TU Delft

This thesis is confidential and cannot be made public until December 1, 2018.

An electronic version of this thesis is available at http://repository.tudelft.nl/.







# **Abstract**

This thesis presents an ultra-low-power wakeup timer locked to an RC time constant that can meet the stringent power requirements of nodes for Internet-of-Things (IoT) applications. The wakeup timer, fabricated in a 40-nm CMOS process, employs a bang-bang digital-intensive frequency-locked loop (DFLL). A self-biased $\Sigma\Delta$ digitally controlled oscillator (DCO) is locked to an RC time constant via a chopped dynamic comparator and a digital loop filter, enabling operation down to 0.65 V and a small area of 0.07 mm<sup>2</sup>. The digital-intensive design consumes 181 nW at an output frequency of 417 kHz. It thus achieves the best power efficiency (0.43 pJ/cycle) at the lowest supply voltage (0.7 V nominal) among state-of-the-art ultra-low-power timers, while maintaining comparable long-term stability (Allan deviation floor below 10 ppm) and temperature stability (106 ppm/°C).

# Contents

- List of Figures
- List of Tables
- 1 Introduction
  - 1.1 Internet-of-Things and wireless sensor networks
  - 1.2 Time reference for IoT node
  - 1.3 Motivation and objectives
  - 1.4 Thesis organization
- 2 Fully Integrated Time References
  - 2.1 RC-based references
  - 2.2 LC-based references
  - 2.3 TD-based references
  - 2.4 MOS-based references
  - 2.5 MEMS-based references
  - 2.6 State-of-the-art low-power references
    - 2.6.1 Relaxation oscillators
    - 2.6.2 FLL oscillators
  - 2.7 Benchmark and conclusion
- 3 A Digital-Intensive FLL based RC Oscillator
  - 3.1 Specifications
  - 3.2 Proposed architecture
    - 3.2.1 Small signal model
  - 3.3 System-level considerations
    - 3.3.1 RC network
    - 3.3.2 Comparator
    - 3.3.3 Digital loop filter
    - 3.3.4 Multi-phase clock generator
    - 3.3.5 DCO
  - 3.4 Conclusion
- 4
  - 4.1 System overview
  - 4.2 RC network
    - 4.2.1 Characteristics of integrated resistors and capacitors
    - 4.2.2 Trimming network
    - 4.2.3 Switches
  - 4.3 Dynamic comparator
  - 4.4 DCO
    - 4.4.1 Leakage based delay cell
    - 4.4.2 Subthreshold PTAT bias circuit
    - 4.4.3 Sigma-Delta modulator
  - 4.5 Digital Loop Filter
  - 4.6 Clock divider
  - 4.7 Layout overview
- 5 Measurement Results
  - 5.1 Chip micrograph
  - 5.2 Measurement setup
  - 5.3 Measurement results
    - 5.3.1 Frequency accuracy vs. temperature variation
    - 5.3.2 Frequency accuracy vs. supply variation
    - 5.3.3 Allan deviation
    - 5.3.4 Power consumption
    - 5.3.5 Other results
  - 5.4 Conclusion and summary
- 6 Conclusion and Future Work
  - 6.1 Conclusion
  - 6.2 Future work
- A Phase Noise, Period Jitter and Allan Deviation Conversions
  - A.1 Phase noise and Allan deviation conversion
  - A.2 Phase noise and period jitter conversion
- B MATLAB behavior model of the DFLL

# List of Figures

| Figure | Caption |
|--------|---------|
| 1.1 | Conceptual framework of future IoT. |
| 1.2 | (a) A fully integrated IoT node. (b) The smallest crystal. |
| 1.3 | (a) Generalized block diagram of a duty-cycled IoT node. (b) Example timing profile of a duty-cycled IoT node that includes guard bands to compensate for timing uncertainty. |
| 1.4 | Power consumption in a duty-cycled wireless system. |
| 1.5 | Illustration of a long-term stable noisy frequency signal. |
| 1.6 | Lifetime voltage of Duracell CR1220 button battery. |
| 2.1 | Generalized block diagram of a Wien-bridge oscillator. |
| 2.2 | Generalized block diagram of an LC oscillator. |
| 2.3 | Block diagram of the TD-based frequency reference. |
| 2.4 | (a) Block diagram of the MEMS-based frequency reference. (b) Die micrograph of the MEMS-based frequency reference. |
| 2.5 | Block diagram of a conventional relaxation oscillator. |
| 2.6 | Generalized block diagram of an FLL RC oscillator. |
| 3.1 | Block diagram of the proposed DFLL based RC oscillator. |
| 3.2 | Waveforms of DCO output frequency, RC network timing, and $C_{ref}$ voltages of the DFLL under ideal steady state operation. |
| 3.3 | Small signal model of the proposed oscillator. |
| 3.4 | Gain plots of the noise transfer functions. |
| 3.5 | Typical spectra of $v_n$ and $\phi_{DCO}$, and their responses on $S_y$ with the DFLL operation. |
| 3.6 | Allan deviation simulations with different comparator flicker noise and the same DCO flicker FM. |
| 3.7 | Allan deviation simulations with different DCO flicker FM and the same comparator flicker noise. |
| 3.8 | Allan deviation simulations with different DCO resolutions and comparator flicker noise of 1 $\mu$V/$\sqrt{\rm Hz}$ at 10 Hz. |
| 3.9 | Illustration of the RC trimming strategy using a 12-bit coarse-fine overlapped network. |
| 3.10 | Topology of a resistor using first-order temperature compensation and illustration of the TCs of the resistors before and after compensation. |
| 3.11 | |
| 3.12 | Illustration of the frequency offset due to the DCO finite resolution. |
| | Simulation showing reduced frequency offset due to the DCO finite resolution thanks to noise. |
| 4.1 | Architecture of the proposed oscillator; the clocks driving each block are marked in purple. |
| 4.2 | Temperature dependencies of the compensated and uncompensated resistors. |
| 4.3 | Monte Carlo simulations showing (a) first-order TC variation, and (b) resistance variation of the compensated resistor. |
| 4.4 | (a) Process variation of a 1 pF fringe MOM capacitor. (b) Temperature dependency of the MOM capacitor. |
| 4.5 | RC network with trimming capabilities. |
| 4.6 | Schematic of the switches used in the RC network. |
| 4.7 | Schematic of the dynamic comparator. |
| 4.8 | (a) Comparator output voltages showing the decision moment at around 2.5 ns. (b) Input referred noise of the comparator. |
| 4.9 | (a) Illustration of the comparator offset simulation testbench. (b) Statistical simulation result of the comparator offset voltage. |
| 4.10 | Block diagram of the DCO with $\Sigma\Delta$ dithered input. |
| 4.11 | DCO tuning range under temperature and process corners. |
| 4.12 | DNL of the DCO integer bits. |
| 4.13 | DCO phase noise performance at the center frequency of 512 kHz. |
| 4.14 | Schematic of the leakage based delay cell. |
| 4.15 | Transient voltages of the internal nodes in one delay cell. |
| 4.16 | Schematic of the subthreshold PTAT bias circuit. |
| 4.17 | (a) Temperature dependency of the PTAT bias current. (b) Supply voltage dependency of the PTAT bias current. |
| 4.18 | Illustration of the DCO high rate dithering. |
| 4.19 | Block diagram of the third-order MASH $\Sigma\Delta$ modulator. |
| 4.20 | Phase noise plot showing the effect of the first-order and third-order $\Sigma\Delta$ modulators. |
| 4.21 | Allan deviation plot showing the effect of the third-order $\Sigma\Delta$ modulator. |
| 4.22 | (a) Original combiner logic with signed summation. (b) Improved combiner with D flip-flops only. |
| 4.23 | Block diagram of the DLF and its connection with the DCO. |
| 4.24 | Block diagram of multi-phase divider. |
| 4.25 | Block diagram of the non-overlap clock generator. |
| 4.26 | Waveforms of the timing of the multi-phase divider and the non-overlap clock generator. |
| 4.27 | Layout of the chip. |
| 5.1 | Die micrograph of the fabricated chip in the 40-nm process. |
| 5.2 | Block diagram of the measurement setup. |
| 5.3 | PCBs used for measurement. |
| 5.4 | Frequency accuracy vs. temperature variation. |
| 5.5 | Frequency accuracy vs. supply variation. |
| 5.6 | Allan deviation measurement results. |
| 5.7 | Oscillator power breakdown. |
| 5.8 | Period jitter and start-up time with different DLF gain. |
| 5.9 | Output spectrum and settling behavior. |
| 5.10 | |
| A.1 | Characteristics of the noise processes in spectral density of phase fluctuations. |
| A.2 | Characteristics of the noise processes in spectral density of fractional frequency fluctuations. |
| A.3 | |
| B.1 | Pseudo code of the discrete time MATLAB behavior model. |

# List of Tables

| Table | Caption |
|-------|---------|
| 2.1 | Benchmark table of the integrated oscillators. |
| 3.1 | Target specifications of the proposed oscillator. |
| 3.2 | Design specifications for the major building blocks. |
| 4.1 | Characteristics of the available non-silicide poly resistors in the 40-nm process. |
| 4.2 | Worst-case $R_{on}$ and $I_{leak}$ of the switches at 25 °C and tt corner. |
| 4.3 | Start-up time of the current reference in milliseconds across corners. |
| 4.4 | Summary of the configurable functions in the DFLL. |
| 5.1 | Performance summary and comparison with state-of-the-art. |
| A.1 | Translation of frequency instability measures from spectral densities in frequency domain to variances in time domain. |

# 1. Introduction

## 1.1. Internet-of-Things and wireless sensor networks

Communication and networking technologies have enabled people to connect with each other anywhere at any time. Meanwhile, there is a growing demand for networks that facilitate communication between humans and the environment, between humans and machines, and even between machines. The key technology fulfilling this demand is the *Internet-of-Things (IoT)*, a framework comprising physical objects equipped with limited hardware that provides computation and networking support.

The first IoT device (a coke machine) was implemented in 1982<sup>1</sup>. Internet-connected devices have therefore been around for decades. However, only recently have these devices permeated our lives and become popularly conceptualized as the IoT. The IoT is now regarded as the next revolutionary technological and economic development worldwide after the Internet. In this paradigm, many objects will be connected to networks and embedded into our surroundings ubiquitously and seamlessly. A conceptual framework of the future IoT is shown in figure 1.1.

As depicted in the figure, the foundation of the IoT is formed by *wireless sensor networks (WSNs)*. In WSNs, distributed autonomous sensor nodes monitor physical conditions such as temperature, humidity, and pressure. The acquired raw data is collected and forwarded to the cloud for processing. After processing, the raw data becomes useful information that is made available to users in different kinds of applications.

Typical WSN applications, e.g., environmental monitoring, require hundreds or even thousands of nodes in the network. In addition to the sensor, each node must also be equipped with other circuitry for wireless communication, such as a micro-controller and a radio transceiver. Considering the functionalities of the node, there are several constraints it should satisfy [1].

- The cost of each node should be low (cheaper than 1 €), so that the entire WSN comprising thousands of nodes is economically viable;
- A small form factor is preferred since the nodes must be seamlessly integrated into the environment;
- Low power consumption is required for energy-autonomous operation and long lifetime in battery-powered systems.

<sup>1</sup>https://www.cs.cmu.edu/~coke/history_long.txt

Figure 1.1: Conceptual framework of future IoT (reproduced from [2]).

## 1.2. Time reference for IoT node

As mentioned in the previous section, the IoT node in WSNs must be cheap, small, and power efficient. Conventional solutions for IoT nodes may need many discrete ICs and a custom-designed printed circuit board (PCB) for assembly. However, this approach is efficient in neither cost nor size. One way to solve these problems is to integrate the components. If all of the discrete components are combined into one chip, the cost of PCB fabrication and chip packaging is reduced, and the resulting size is also much smaller.

Modern wireless communication protocols require either time or frequency information for synchronization. Thus, the IoT node in WSNs should contain at least one time/frequency reference<sup>1</sup>. Crystal oscillators are the most widely used frequency generators in wireless applications. However, they are bulky and cannot be integrated into an integrated circuit (IC) using standard CMOS technology. In figure 1.2, the smallest crystal is shown next to a fully functional IoT node for size comparison, which clearly illustrates the size limitation.

<sup>1</sup>The expressions "oscillator" and "time/frequency reference" are used interchangeably.



Figure 1.2: (a) A fully integrated IoT node. (b) The smallest crystal. (both reproduced from [3])



Figure 1.3: (a) Generalized block diagram of a duty-cycled IoT node. (b) Example timing profile of a duty-cycled IoT node that includes guard bands to compensate for timing uncertainty (reproduced from [4]).

Moreover, a time reference is also needed to meet the very tight power requirements of IoT applications. Taking advantage of the low average data rate required by WSNs for IoT, the wireless radio module can be duty-cycled, so that power is only dissipated during data transmission. One of the most widely accepted duty-cycled node architectures is shown in figure 1.3a, consisting of a radio module for communication and a real-time clock (RTC)<sup>2</sup> for time synchronization [5]. The main radio is not active (i.e., it is in low-power sleep mode) until the RTC wakeup call arrives. Power consumption in these systems is thus averaged down. Following the timing profile in figure 1.3b, the average system power consumption can be expressed as

$$P_{avg} = \frac{P_{SLP}t_{SLP} + P_{ON}(t_{ON} + t_{GB})}{t_{SLP} + t_{ON} + t_{GB}}, \tag{1.1}$$

where  $t_{SLP}$  is the sleep time,  $P_{SLP}$  is the sleep power,  $t_{ON}$  is the active TX/RX time,  $P_{ON}$  is the active power,  $t_{GB}$  is the guard band time, and  $P_{avg}$  is the average IoT node total power over time.

<sup>&</sup>lt;sup>2</sup>Expressions "RTC" and "wakeup timer" are interchangeable

4 1. Introduction

The guard band time is defined as

$$t_{GB} = C_U \cdot t_{SLP} \tag{1.2}$$

where $C_U$ is the RTC inaccuracy. When the system is aggressively duty-cycled, i.e., $t_{SLP} \gg (t_{GB} + t_{ON})$, $P_{SLP} \cdot t_{SLP}$ becomes comparable to $P_{ON}(t_{ON} + t_{GB})$. Thus, the total power is dominated not by the active-mode power of the radio front-end and baseband alone, but by a mix of the duty-cycled active-mode power and the static power of always-on modules, such as bias generators and the RTC.
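As a brief worked step (a sketch that follows directly from equations 1.1 and 1.2, with no additional assumptions beyond $C_U \ll 1$ in the last approximation), substituting $t_{GB} = C_U \cdot t_{SLP}$ and letting the sleep time grow shows where the average power saturates:

$$P_{avg} = \frac{P_{SLP}t_{SLP} + P_{ON}(t_{ON} + C_U t_{SLP})}{(1 + C_U)\,t_{SLP} + t_{ON}} \;\xrightarrow{\;t_{SLP} \to \infty\;}\; \frac{P_{SLP} + C_U P_{ON}}{1 + C_U} \approx P_{SLP} + C_U P_{ON}.$$

In this limit the contribution of $t_{ON}$ vanishes, and the floor is set by the always-on sleep power plus the active power spent in the guard band, which scales with the timer inaccuracy.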

In the duty-cycled node architecture shown in figure 1.3a, there are three types of frequency references needed [5]:

- RTC: it generates the wakeup calls for all other modules;
- Reference for baseband: clock for digital circuits, ADC, sensors, etc.;
- Reference for wireless front-end: it generates the carrier for radio frequency (RF).

Depending on the application, the baseband reference may have different specifications. Digital circuits can tolerate inaccurate clocks, while the performance of analog circuits such as ADCs and DACs may suffer from low-quality clocks. However, the baseband reference can be combined with the wireless front-end frequency reference, which must also show high accuracy and low noise to meet the RF modulation specifications. Although stringent performance is required of the frequency references for the baseband and wireless transmission, they do not have a very tight power budget since they are not continuously required and can thus be duty-cycled.

The requirement for the RTC, which is the object of this thesis, will be discussed in the following section.

## 1.3. Motivation and objectives

Recently, several attempts have been made to integrate oscillators on-chip, with the focus on RC-based, LC-based, MEMS-based (microelectromechanical systems), and thermal-diffusivity-based (TD) oscillators. The ultimate goal of integrated timing circuits is to replace crystal oscillators with integrated alternatives showing the same accuracy. However, since there is always a power-accuracy trade-off in circuit design, matching crystal accuracy may not be the right target for the RTC in duty-cycled IoT applications. Using equation 1.1 and assuming $P_{ON} = 1$ mW, $P_{SLP} = 100$ nW, and $t_{ON} = 100$ $\mu$s, figure 1.4 plots the average IoT node power versus wakeup timer inaccuracy and duty cycle. As depicted in the plot, sufficient timer accuracy is required to reduce the average power. However, for aggressive duty-cycling, the average system power is limited by the timer power. Therefore, it is important to find a low-power, standard-CMOS-compatible RTC solution with an accuracy high enough for duty-cycling synchronization.
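The trade-off described above can be explored numerically with a minimal Python sketch of equations 1.1 and 1.2. The power and on-time values are the ones assumed in the text; the function name `average_power`, the swept sleep times, and the swept inaccuracy values are illustrative assumptions and are not taken from figure 1.4.

```python
def average_power(p_on, p_slp, t_on, t_slp, c_u):
    """Average node power per equation 1.1, with the guard band of equation 1.2."""
    t_gb = c_u * t_slp  # guard band time grows with sleep time and timer inaccuracy
    return (p_slp * t_slp + p_on * (t_on + t_gb)) / (t_slp + t_on + t_gb)


# Values assumed in the text: P_ON = 1 mW, P_SLP = 100 nW, t_ON = 100 us.
P_ON, P_SLP, T_ON = 1e-3, 100e-9, 100e-6

if __name__ == "__main__":
    # Hypothetical sweep points, chosen only for illustration.
    inaccuracies = (1e-6, 1e-4, 1e-2)      # timer inaccuracy C_U (fractional)
    sleep_times = (0.1, 1.0, 10.0, 100.0)  # sleep time t_SLP in seconds
    for t_slp in sleep_times:
        powers = [average_power(P_ON, P_SLP, T_ON, t_slp, c_u) for c_u in inaccuracies]
        row = ", ".join(f"C_U={c_u:g}: {p * 1e9:.1f} nW" for c_u, p in zip(inaccuracies, powers))
        print(f"t_SLP={t_slp:g} s -> {row}")
```

For long sleep times the printed values flatten out near $P_{SLP} + C_U P_{ON}$, which is why both the timer power and the timer accuracy matter under aggressive duty-cycling.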

High timing accuracy of an RTC includes two aspects: low sensitivity to internal (i.e., process variation) and external (i.e., temperature and supply variations) changes, and high long-term stability. The long-term stability describes how noise affects the oscillator frequency over time. As an example, the frequency signal of an oscillator plotted in figure 1.5 may be considered unstable in the short term because of the noise. However, when it is used as a wakeup timer, the noise is averaged and filtered, and the resulting average value is very close to the target value. This means that it shows good long-term stability. Since the wakeup signal required by the duty-cycled node occurs with a very long period (from several minutes up to hours or even days), the long-term stability is a more important performance metric than the short-term one.

Figure 1.4: Average power of a duty-cycled IoT node under varying timer inaccuracy and duty cycles (reproduced from [4]).

Figure 1.5: Illustration of a long-term stable noisy frequency signal.

There are many figures of merit (FOMs) that quantify noise in oscillators; the most popular ones are phase noise, period jitter, and Allan deviation. Proposed in [6], the Allan deviation is a time-domain measure of frequency instability, and its mathematical representation is given by

$$\sigma_{y}(\tau) = \sqrt{\frac{1}{2} \langle (\overline{y}_{n+1} - \overline{y}_{n})^{2} \rangle}, \tag{1.3}$$

where $\langle \rangle$ denotes the expectation operator, $\tau$ is the sample period, and $\overline{y}_n$ is the *n*th fractional frequency average over $\tau$. An Allan deviation of $\alpha$ at $\tau = \beta$ means that the frequency instability between two samples taken $\beta$ apart in time has a relative root-mean-square (rms) value of $\alpha$. For a good oscillator, $\sigma_y(\tau)$ converges to a minimum value (floor) for sufficiently long $\tau$, and this floor indicates the long-term stability of the oscillator. Period jitter, in contrast, shows the deviation of a single clock period (i.e., the short-term behavior). Both period jitter and Allan deviation can be derived from phase noise, and the derivation is elaborated in appendix A. However, only the short-term stability can be evaluated from typical phase noise measurements. Since the long-term stability is important to an RTC, it is reasonable to use the Allan deviation floor as a performance metric.
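To make equation 1.3 concrete, the following minimal Python sketch estimates the non-overlapping Allan deviation from a list of equally spaced fractional-frequency samples. The function name `allan_deviation`, the window length parameter `m` (so that $\tau$ equals `m` times the sampling interval), and the use of the sample mean in place of the expectation are illustrative assumptions, not the measurement procedure used in this thesis.

```python
import math


def allan_deviation(y, m):
    """Non-overlapping Allan deviation of fractional-frequency samples y.

    m is the number of samples averaged per window, so tau = m * sampling interval.
    """
    n_windows = len(y) // m
    if n_windows < 2:
        raise ValueError("need at least two averaging windows")
    # y_bar[n]: average fractional frequency over the n-th window of length tau
    y_bar = [sum(y[i * m:(i + 1) * m]) / m for i in range(n_windows)]
    # Estimate <(y_bar[n+1] - y_bar[n])^2> with the sample mean of adjacent differences
    sq_diffs = [(b - a) ** 2 for a, b in zip(y_bar, y_bar[1:])]
    return math.sqrt(0.5 * sum(sq_diffs) / len(sq_diffs))


if __name__ == "__main__":
    # Toy input: frequency error alternating between +1000 ppm and -1000 ppm.
    y = [1e-3, -1e-3] * 8
    print(allan_deviation(y, 1))  # large: adjacent samples differ strongly
    print(allan_deviation(y, 2))  # zero: every two-sample window averages to the same value
```

The toy input mirrors the argument above: a signal that looks unstable sample by sample can still have a very small Allan deviation once the averaging window is long enough.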

Apart from the power and accuracy perspectives, RTCs with small size are preferred for cost




Figure 1.6: Lifetime voltage of Duracell CR1220 button battery<sup>1</sup>.

consideration. Low-voltage operation is needed for interfacing with various types of energy harvesters/scavengers. In addition, as the example in figure 1.6 shows, a low supply voltage also helps prolong the lifetime of battery-powered IoT nodes.

In conclusion, the RTC of the IoT node should have the following general characteristics:

- Compatible with standard CMOS processes;
- Low-power;
- High timing accuracy;
- Able to handle low supply voltage;
- Small in chip area.

Several state-of-the-art oscillator designs will be discussed and compared in chapter 2, and the quantities of these specifications will be given in chapter 3 based on the comparison.

## **1.4.** Thesis organization

This thesis consists of six chapters. The organization of the following chapters is:

**Chapter 2** reviews various fully integrated oscillator designs. Their principles and accuracies are discussed in detail, and their suitability as an RTC is analyzed.

**Chapter 3** proposes a novel digital-intensive oscillator, which is suitable as an RTC for IoT applications. The sources of frequency inaccuracy are modeled, and methods are investigated to reduce them. With the help of the analysis, specifications of basic building blocks are derived. Circuit implementation of the proposed structure in an advanced CMOS process (TSMC 40-nm) is presented in **chapter 4**. The layout of this design is also shown at the end of the chapter.

**Chapter 5** describes the PCB of the test chip and the measurement setup. Voltage and temperature stabilities along with long-term stability are measured under different settings. The results are compared with state-of-the-art designs.

<sup>1</sup>The data used in the plot is retrieved from https://www.duracell.com/en-us/techlibrary/product-technical-data-sheets?region=262&type=270.

**Chapter 6** presents the conclusions of the thesis and recommendations for future work.

# Fully Integrated Time References

As mentioned in chapter 1, a fully integrated time reference required by an IoT node needs to be low-power and moderately accurate. A repetitive and regular physical phenomenon is needed in building any time reference. In the following sections, different types of integrated time references will be discussed based on their physical reference principles. Furthermore, their advantages and disadvantages for the targeted IoT application will be analyzed. In addition, several state-of-the-art low-power sleep timer designs will be reviewed.

## **2.1.** RC-based references

RC oscillators can be implemented as fully integrated time references. The frequency generated by such references is proportional to 1/RC and can be as low as a few kilohertz, which makes low power consumption possible. However, the frequency accuracy is limited by both the passive components and the active circuit implementation. Usually, integrated resistors and capacitors have large process variations of over 10 %, so trimming is required to achieve the target frequency accuracy. Regarding the temperature stability, integrated capacitors have negligible temperature dependencies compared to those of integrated resistors. Therefore, the final temperature coefficient (TC) of the generated frequency is determined by the resistor. Fortunately, resistors with both positive and negative TCs are available in most CMOS processes, which makes it possible to build first-order temperature-compensated resistors by properly combining them.
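The first-order compensation can be sketched numerically. The snippet below assumes a series combination of one positive-TC and one negative-TC resistor with a purely linear temperature model; the function names and all TC values are hypothetical and do not describe a particular process.

```python
def compensated_resistor(r_total, tc_pos, tc_neg):
    """Split r_total into a positive-TC part and a negative-TC part so that
    the first-order TC of their series combination is zero.

    tc_pos > 0 and tc_neg < 0 are first-order TCs in 1/degC (assumed values).
    """
    if not (tc_pos > 0 > tc_neg):
        raise ValueError("need one positive and one negative TC")
    # Zero first-order TC requires r_pos * tc_pos + r_neg * tc_neg = 0.
    r_pos = r_total * (-tc_neg) / (tc_pos - tc_neg)
    r_neg = r_total - r_pos
    return r_pos, r_neg

def series_resistance(r_pos, r_neg, tc_pos, tc_neg, delta_t):
    """Series resistance at a temperature offset delta_t (first-order model)."""
    return r_pos * (1 + tc_pos * delta_t) + r_neg * (1 + tc_neg * delta_t)
```

With, for example, `tc_pos = 1e-3` and `tc_neg = -2e-3`, two thirds of the total resistance is assigned to the positive-TC resistor, and the series value stays constant over temperature in this linear model. Higher-order temperature terms are not captured by this sketch.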

Both linear and nonlinear oscillators can be implemented as RC-based references. Linear oscillators fulfill the Barkhausen criteria for oscillation. One of the popular linear RC oscillators is the Wien-bridge oscillator shown in figure 2.1. By calculating the phase shift of the feedback network, its output frequency  $f_{out}$  can be derived as

$$f_{out} = \frac{1}{2\pi RC},\tag{2.1}$$

where R and C are the values of the resistors and capacitors in the figure. According to equation 2.1,  $f_{out}$  is determined only by the passive components. However, the amplifier usually consumes high power. In [7], a Wien-bridge oscillator is reported with a TC of 86 ppm/°C from 0 °C to 100 °C, but it also has a power consumption of 66  $\mu$ W.



Figure 2.1: Generalized block diagram of a Wien-bridge oscillator.

Recently, many nonlinear RC oscillators, such as frequency-locked loop (FLL) based and relaxation oscillators, with ultra-low power consumption (smaller than 1  $\mu$ W) have been introduced [3, 4, 8–12]. Unlike their linear counterparts, nonlinear oscillators do not have to meet specific gain and phase requirements, which allows low-power design. Even though the non-sinusoidal output may limit their usage, they are still suitable as RTCs in IoT nodes. Because of these attractive characteristics, low-power designs of nonlinear RC-based oscillators will be described in detail in section 2.6.

## **2.2.** LC-based references

LC oscillators are widely used in RF circuits like phase-locked loops (PLLs). A block diagram of a typical CMOS LC oscillator is shown in figure 2.2. It consists of an LC resonator tank and an active circuit, where  $R_L$  and  $R_C$  are the resistive losses of the inductor and capacitor respectively. By calculating the phase shift of the LC tank, the output frequency can be derived as

$$f = f_0 \sqrt{\frac{1 - \frac{CR_L^2}{L}}{1 - \frac{CR_C^2}{L}}} \approx f_0 \sqrt{\frac{1 - \frac{1}{Q_L^2}}{1 - \frac{1}{Q_C^2}}}.\tag{2.2}$$

In equation 2.2,  $f_0 = 1/(2\pi\sqrt{LC})$  is the natural frequency of the LC tank. In addition,  $Q_L = 2\pi f L/R_L$  and  $Q_C = 1/(2\pi f CR_C)$  are the quality factors of the inductor and capacitor respectively. Therefore, f only relies on the properties of passive components.

Integrated inductors have much smaller process variations compared to those of integrated capacitors, due to their large dimensions on silicon. As a result, trimming banks are usually implemented along with the capacitors. Regarding the temperature stability, both of the passive components have small TCs. However, the resistive losses depend strongly on temperature [1]. For most LC oscillators, the condition  $Q_C \gg Q_L$  holds due to their low-enough oscillation frequencies. Thus, the loss of the inductor,  $R_L$, which determines  $Q_L$, becomes the major source of temperature-dependent frequency drift.

In order to compensate for the poor quality factor of the inductor,  $Q_L$, a large  $g_m$  is required. Consequently, a high power consumption is needed for the active circuit. Furthermore, the temperature dependency of  $R_L$  can be compensated using various circuit techniques. In [13], an LC oscillator is proposed with constant-biased varactors to nullify the overall TC, which is reported to be



Figure 2.2: Generalized block diagram of an LC oscillator.

1.5 ppm/°C from -20 °C to 120 °C. In [14], complex trimming methods are employed to achieve a total inaccuracy of ±400 ppm over temperature (-10 °C to 85 °C) and supply changes (5 V, ±10 %). In [15], a phase shift technique is proposed, which sets the phase of the LC tank to a specific temperature-null phase to minimize the TC of the frequency. With this technique, the oscillator shows a total frequency inaccuracy of ±100 ppm across temperature (-40 °C to 85 °C), supply (3.0 V to 3.6 V) and load (0 to 15 pF) changes. Despite the high accuracy, the power consumption of LC oscillators remains at the milliwatt level, both because of the active circuit required to compensate for the lossy tank and because the GHz-range oscillation frequency leads to a high power consumption in the cascaded circuits. This high power consumption makes LC-based references unsuitable as the RTC for IoT applications.

## **2.3.** TD-based references

The time references mentioned earlier are all based on passive electrical components. However, the well-defined thermal diffusivity (TD) of IC-grade silicon can also be used for frequency generation. Figure 2.3 shows a TD-based frequency reference, in which an FLL locks the frequency of a digitally controlled oscillator (DCO) to the process-insensitive phase shift of an electrothermal filter (ETF). The ETF consists of a heater and a temperature sensor, which are placed close to each other (s ≈ 20  $\mu$m) in the same silicon substrate [16]. Driven by the DCO output signal,  $V_{heater}$, the heater generates an AC temperature variation that propagates through the substrate. The temperature sensor converts the resulting temperature gradients back to an electrical signal,  $V_{ETF}$. The digital phase detector processes the delay between  $\phi_{ETF}$  and  $\phi_{drive}$  and generates a phase delay  $\phi_{diff}$. This delay is then compared to a phase reference  $\phi_{ref}$, and the resulting error signal is integrated to drive the DCO.

The FLL forces the DCO to oscillate at a frequency  $f_{DCO}$ , where  $\phi_{diff} = \phi_{ref}$ . The DCO output frequency depends on the phase-frequency characteristic of the ETF, which can be written as

$$\phi_{diff} \propto s \sqrt{\frac{f_{DCO}}{D}},\tag{2.3}$$

where D in the equation is the temperature-dependent thermal diffusivity of the bulk silicon. Since D is process-independent at typical substrate doping levels, the accuracy of  $f_{DCO}$  is mainly



Figure 2.3: Block diagram of the TD-based frequency reference (reproduced from [16]).

determined by the accuracy of lithography, which improves with scaling. The temperature dependency of  $f_{DCO}$  is set by that of D, which is

$$D \propto \frac{1}{T^{1.8}}.\tag{2.4}$$

By substituting equation 2.4 into equation 2.3, the following equation can be derived

$$f_{DCO} \propto \frac{\phi_{ref}^2}{T^{1.8}}.\tag{2.5}$$

Thus, for fixed values of  $\phi_{ref}$ , the resulting output frequencies are highly temperature dependent. However, since  $\phi_{ref}$  can be programmed, compensation on the output frequency can be performed by using a temperature sensor.
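The size of this temperature dependence, and the correction that  $\phi_{ref}$  has to provide, can be estimated directly from equation 2.5. The sketch below assumes that T is the absolute temperature and uses 27 °C as an arbitrary reference point; the function names are illustrative.

```python
def td_frequency_ratio(t_celsius, t0_celsius=27.0):
    """f_DCO(T) / f_DCO(T0) for a fixed phi_ref, from equation 2.5."""
    t, t0 = t_celsius + 273.15, t0_celsius + 273.15
    return (t0 / t) ** 1.8

def phi_ref_scaling(t_celsius, t0_celsius=27.0):
    """Factor by which phi_ref must change to keep f_DCO constant.

    From equation 2.5, f_DCO is proportional to phi_ref**2 / T**1.8,
    so phi_ref has to follow T**0.9.
    """
    t, t0 = t_celsius + 273.15, t0_celsius + 273.15
    return (t / t0) ** 0.9
```

Evaluating `td_frequency_ratio` at -55 °C and 125 °C shows that the uncompensated frequency changes by tens of percent over the military temperature range, which is why a temperature-dependent  $\phi_{ref}$  is needed.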

In [16], a temperature-compensated TD-based frequency reference is proposed. It uses a temperature sensor to generate a temperature-dependent  $\phi_{ref}$. An absolute inaccuracy of  $\pm 0.1$  % is reported for 16 samples over the military temperature range (-55 °C to 125 °C), which is good enough for demanding applications like USB. However, it draws 7.8 mW from a 5 V supply. Even though the power can be scaled down in more advanced processes by reducing the ETF size, a considerable amount of power is still needed for heating up the ETF. Thus, this type of reference is not suitable as an RTC for IoT applications.

## **2.4.** MOS-based references

It is possible to build frequency references using only MOS transistors and capacitors, such as stabilized ring oscillators [17] and mobility-based references [18, 19]. The periodic phenomenon of these references is the process of charging and discharging a capacitor between two voltages, which is similar to that of relaxation oscillators. Thus, the frequency is set by the reference voltages, the MOS current, and the capacitor value.

In [17], by adapting the bias voltages of a ring oscillator as a function of temperature, an inaccuracy of  $\pm 0.84$  % is achieved from -55 °C to 125 °C. Additional stabilization circuitry is implemented to handle process and supply voltage variations. A high power consumption of 1.5 mW is reported due to the complex biasing. The uncompensated mobility-based relaxation oscillator [18] has an output frequency proportional to  $T^{-1.6}$  with a spread on the order of 1 % after a single-point trim. By adopting temperature compensation, an inaccuracy of  $\pm 0.5$  % over a temperature range from -55 °C to 125 °C is achieved after a two-point trim, while consuming a power of 51  $\mu$ W. Although they are low-power and show moderate accuracy, MOS-based references suffer from large temperature variations that require complex and accurate temperature compensation circuitry. Therefore, they are not suitable for ultra-low-power applications.

## **2.5.** MEMS-based references

Mechanical properties can also be used as a frequency reference thanks to MEMS technology, which enables the fabrication of on-chip microscale mechanical structures. Conventional high-Q passives, like film bulk acoustic resonator (FBAR) filters, are not typically used due to their large size, but can now be integrated on-chip by micromachining. These high-Q components allow the timing and frequency generation circuit to achieve both good performance and low power consumption at the same time.

In [20], a MEMS oscillator is proposed with an H-style capacitively-transduced tuning-fork resonator and a sustaining circuit ( $G_M$ ). The MEMS resonator has a nominal Q of 52,000, which is comparable to those of crystals. Figure 2.4a shows the system level block diagram of the MEMS oscillator. A PLL is employed in the design to compensate for the process variations. The MEMS oscillator itself is temperature stable thanks to the stable elastic property of the silicon. Additional temperature compensation is applied using a temperature sensor to achieve better performance at the cost of higher power. The uncompensated oscillator shows an inaccuracy of  $\pm 3$  ppm over a temperature range from -40 °C to 85 °C, consuming 1  $\mu$ A in a supply voltage range from 1.4 V to 4.5 V.

Even though MEMS oscillators have overall the best accuracy and the lowest power consumption among integrated time references, they can only be manufactured in dedicated processes or require



Figure 2.4: (a) Block diagram of the MEMS-based frequency reference. (b) Die micrograph of the MEMS-based frequency reference. (both reproduced from [20])

special packaging techniques. Figure 2.4b shows the die micrograph of the previously mentioned MEMS oscillator. The MEMS die is flip-chip bonded to the CMOS die, and this packaging, which is more complex than that of traditional ICs, increases both the cost and the production time of the final product. Considering that a standard CMOS compatible solution is the target of this thesis, this type of reference is considered out of the scope of this work.

## **2.6.** State-of-the-art low-power references

Most of the previously discussed frequency references either consume high power (LC and TD) or are not compatible with standard CMOS processes (MEMS). Among the remaining references, MOS-based references suffer from large temperature variations. For those reasons, the focus in the following will be on RC-based references. In this section, several standard CMOS compatible state-of-the-art low-power (<1  $\mu$W) RC-based references will be described.

#### 2.6.1. Relaxation oscillators



Figure 2.5: Block diagram of a conventional relaxation oscillator.

RC-based relaxation oscillators are nonlinear oscillators. Figure 2.5 shows the block diagram of a conventional relaxation oscillator. A current source constantly charges a capacitor, and a continuous comparator resets the capacitor when its voltage exceeds the reference voltage  $R \cdot I_{REF}$. The repeated charging and resetting of the capacitor generates a periodic signal whose frequency,  $f_{out}$, can be expressed as

$$f_{out} = \frac{1}{RC + 2t_{delay}},\tag{2.6}$$

where  $t_{delay}$  is the delay of the comparator. In most cases, the comparator also has an input offset voltage  $V_{os}$ . Assuming  $V_{os}$  is at the negative input, the output frequency now becomes

$$f_{out} = \frac{1}{RC + V_{os}C/I + 2t_{delay}}.\tag{2.7}$$

Besides the flaws common to all RC oscillators (described in section 2.1), the existence of the continuous comparator gives two additional problems. First, its offset voltage and delay are vulnerable to PVT changes, which may affect the frequency accuracy. Second, the flicker noise of the comparator translates into worse long-term stability [8].
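To give a feeling for the size of these errors, equation 2.7 can be evaluated for a few values. The sketch below assumes that I in equation 2.7 is the charging current  $I_{REF}$; the function names and all component values in the usage note are hypothetical.

```python
def relaxation_frequency(r, c, i_ref, v_os=0.0, t_delay=0.0):
    """Output frequency from equation 2.7 (reduces to equation 2.6 for v_os = 0)."""
    return 1.0 / (r * c + v_os * c / i_ref + 2.0 * t_delay)

def relative_frequency_error(r, c, i_ref, v_os=0.0, t_delay=0.0):
    """Fractional deviation from the ideal frequency 1/(RC)."""
    ideal = 1.0 / (r * c)
    return relaxation_frequency(r, c, i_ref, v_os, t_delay) / ideal - 1.0
```

For example, with R = 10 MΩ, C = 1 pF and  $I_{REF}$  = 50 nA, the ideal frequency is 100 kHz. A comparator delay of 50 ns or an offset of 5 mV each lengthens the 10 µs period by 0.1 µs, which lowers the frequency by about 1 %. Any PVT dependence of the delay or the offset therefore translates directly into frequency error.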

To address the offset/flicker noise of the comparator, an RC relaxation oscillator using a chopping technique is reported in [8]. Since the offset is sensitive to voltage and temperature changes, chopping effectively improves the TC of the output frequency. Measurement results also show a lower Allan deviation floor (better long-term stability) than without chopping. However, the delay problem cannot be solved by this technique. The conventional way to reduce the effect of the delay is to make it small enough by increasing the power consumption of the comparator; an example can be found in the oversampled comparator presented in [1]. A feed-forward frequency control scheme [9] has been introduced to address this problem. However, this architecture requires approximately two times the power and area of the original uncompensated oscillator.

The two current sources of the relaxation oscillator require extra bias generation circuitry, which increases the power and area overhead. Wang and Mercier [4] proposed a reference-free oscillator that uses the capacitive discharging process as a relative voltage reference. However, the comparator delay problem remains unsolved.

Typically, the continuous comparator is the most power-hungry component in the relaxation oscillator. A duty-cycling technique is introduced in [11] to reduce the comparator power. It achieves an efficiency of 0.68 pJ/Cycle, while keeping good temperature and voltage stabilities. However, since its core circuit is similar to the conventional ones, it also suffers from the disadvantages caused by the comparator offset/flicker noise and delay.

#### 2.6.2. FLL oscillators

Since the accuracy of the relaxation oscillator is limited by the comparator, it is interesting to look for alternative architectures that do not suffer from this limitation. In [3, 12], low-power FLL oscillators based on the RC time constant are introduced. A general block diagram of such references is shown in figure 2.6. Instead of using a comparator to generate the periodic signal, the FLL locks the output frequency of a voltage-controlled oscillator (VCO),  $f_{out}$, to a target frequency,  $f_{ref}$, set by the RC time constant inside a frequency-to-voltage converter (FVC). In the locking process, the analog loop filter integrates the error voltage  $V_0$  generated by the FVC and provides the regulation voltage  $V_{reg}$  to bias the VCO. The VCO changes  $f_{out}$  according to  $V_{reg}$  until  $V_0$  eventually becomes zero.



Figure 2.6: Generalized block diagram of an FLL RC oscillator.

Theoretically, the  $f_{out}$  depends solely on the RC time constant  $(1/f_{ref})$ . However, any offset of the amplifier within the analog loop filter will cause  $f_{out}$  to deviate from  $f_{ref}$ . In addition to the offset, the large flicker noise of the amplifier will degrade the long-term stability. In [3], an FLL oscillator is designed with the chopping technique, which reduces the offset/flicker noise of the amplifier and improves the frequency stability consequently.

Compared to relaxation oscillators, FLL oscillators need a larger chip area due to the loop structure. In [12], a switched resistor scheme is implemented in the FVC to boost the resistance, which in turn reduces the overall power consumption and chip area.

## **2.7.** Benchmark and conclusion

The benchmark of the oscillators based on different principles [3, 4, 7, 13, 16, 19, 20] is given in table 2.1.

According to the table, the MEMS oscillator has the best accuracy. However, it is not compatible with standard CMOS processes. The oscillators with low power consumption ( $<1~\mu W$ ) are RC-based. Although their accuracies are not as good as those of the LC-based and the TD-based ones, they are still at an acceptable level [1].

Thanks to the elimination of comparator errors, the FLL-based RC oscillator [3] shows better performance than the relaxation one [4]. However, its analog-intensive circuitry requires a high supply voltage. Moreover, it is not friendly to process scaling in terms of both area and required supply voltage. In the following chapter, an RC oscillator based on a digital-intensive FLL, which is able to handle low-voltage operation and fully exploit the advantages of advanced CMOS processes, will be introduced.

Table 2.1: Benchmark table of the integrated oscillators

| Ref. No.                                 | [3]                 | [4]                 | [7]               | [13]               | [16]                | [19]                 | $[20]^a$                           |
|------------------------------------------|---------------------|---------------------|-------------------|--------------------|---------------------|----------------------|------------------------------------|
| Principle                                | RC                  | RC                  | RC                | LC                 | TD                  | MOS                  | MEMS                               |
| Process [nm]                             | 180                 | 250                 | 65                | 180                | 700                 | 65                   | -                                  |
| Frequency [Hz]                           | 70.4 k              | 6.4 k               | 6 M               | 2.09 G             | 1.6 M               | 20                   | 32.768 k                           |
| VDD [V]                                  | 1.3                 | 0.8                 | 1.2               | 1.4                | 5                   | 1.2                  | 1.5 - 4.5                          |
| Power [W]                                | 110 n               | 75.6 n              | 66 µ              | 10.9 m             | 7.8 m               | 51 µ                 | 1.5 - 4.5 µ                        |
| Energy/Cycle [pJ/Cycle]                  | 1.56                | 11.81               | 11                | 5.22               | 4875                | $2.55 \times 10^{6}$ | 137.33 <sup>b</sup>                |
| Variation with VDD [%]                   | ±0.23 (1.2 - 1.8 V) | ±0.27 (0.6 - 0.9 V) | -                 | -                  | -                   | -                    | $7.5 \times 10^{-5}$ (1.5 - 4.5 V) |
| TC [ppm/°C]                              | 34.3 (-40 - 80 °C)  | 148 (-20 - 80 °C)   | 86 (0 - 100 °C)   | 1.5 (-20 - 120 °C) | 11.2 (-55 - 125 °C) | 55.6 (-55 - 125 °C)  | 0.05 (-55 - 125 °C)                |
| Phase Noise [dBc/Hz]                     | -                   | -                   | -94.6 @ 100 kHz   | -119.4 @ 1 MHz     | -                   | -                    | -                                  |
| Period Jitter [s]                        | -                   | -                   | -                 | -                  | 312 p               | -                    | -                                  |
| Allan Deviation <sup>c</sup> Floor [ppm] | 7 (>12 s)           | 60 (>100 s)         | -                 | -                  | -                   | ~1000                | -                                  |
| Area [mm<sup>2</sup>]                    | 0.26                | 1.08                | 0.03              | 0.158              | 6.75                | 0.2                  | -                                  |

<sup>a</sup>The performance of TCXO mode is chosen.

<sup>b</sup>Worst-case value is calculated.

<sup>c</sup>A smaller Allan deviation floor indicates a better long-term stability.
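The Energy/Cycle row of table 2.1 is simply the power divided by the output frequency. The short sketch below recomputes it for the entries whose frequency and power are unambiguous in the table; the dictionary `entries` and the function name are illustrative.

```python
# (frequency [Hz], power [W]) pairs copied from table 2.1.
entries = {
    "[3]": (70.4e3, 110e-9),
    "[4]": (6.4e3, 75.6e-9),
    "[13]": (2.09e9, 10.9e-3),
    "[16]": (1.6e6, 7.8e-3),
}

def energy_per_cycle_pj(frequency_hz, power_w):
    """Energy per output cycle in pJ, i.e. power divided by frequency."""
    return power_w / frequency_hz * 1e12

for ref, (frequency, power) in entries.items():
    print(ref, round(energy_per_cycle_pj(frequency, power), 2), "pJ/Cycle")
```

The same calculation applied to the figures quoted in the abstract (181 nW at 417 kHz) gives approximately 0.43 pJ/Cycle.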

# A Digital-Intensive FLL based RC Oscillator

In this chapter, the specifications on the RTC required by IoT applications will be given, based on the discussion in the previous chapters. A new digital-intensive FLL based RC oscillator which fulfills such specifications will be proposed. Its operating principle will be described, and the analysis of its frequency accuracy will be given.

## 3.1. Specifications

The target specifications of the wakeup timer are shown in table 3.1. These specifications are derived for the following reasons.

A 40-nm CMOS process is chosen for the timer because the other parts of the IoT node are implemented in the same technology. Although the wakeup signal usually occurs at a low frequency, the target is set to be higher than 100 kHz because it could also be used for some other blocks in the node. In addition, a supply voltage (0.8 V) lower than the nominal value (1.1 V) in the

| Specification               | Value      |
|-----------------------------|------------|
| Process [nm]                | 40         |
| Frequency [kHz]             | >100       |
| Nominal VDD [V]             | 1.1        |
| Target VDD [V]              | 0.8        |
| Line Regulation [%/V]       | ±2.5       |
| Supply Range [V]            | 0.7 - 0.9  |
| Power [µW]                  | <1         |
| TC [ppm/°C]                 | <100       |
| Temp. range [°C]            | -40 - 125  |
| Allan deviation Floor [ppm] | 20 (>100s) |
| Area [mm <sup>2</sup> ]     | < 0.1      |

Table 3.1: Target specifications of the proposed oscillator

chosen process is targeted to simplify the power management. As a low-power design, the timer should consume less than 1  $\mu$ W power. After comparing the references with different principles in the previous chapter, RC is selected as the frequency defining element to fulfill the power budget. Finally, based on the discussion of the state-of-the-art low-power designs, which is also in the previous chapter, the accuracy and area requirements are derived.

## **3.2.** Proposed architecture

With reference to the discussion in the previous chapter about the state-of-the-art low-power oscillators, it is clear that

- RC-based oscillators are preferred due to their low power consumption and compatibility with standard CMOS processes;
- The performance of relaxation oscillators is limited by the comparator offset and by the variations of the comparator delay due to PVT;
- FLL-based references solve the comparator issues, but they typically require analog-intensive circuitry, which is not friendly to scaling and requires high supply voltages.

A new digital-intensive FLL based RC oscillator is proposed in figure 3.1 to handle low supply voltages and fully exploit the advantages of advanced CMOS processes. It comprises a bang-bang frequency detector (FD), a digital loop filter (DLF), a DCO and a multi-phase clock generator. Similar to the conventional FLL based RC oscillators [3, 12], the output frequency of the DCO,  $f_{out}$, is locked to a reference frequency,  $f_{ref}$, defined by the RC network in the FD. However, after the comparison with  $f_{ref}$, the error ( $f_{ref} - f_{out}/N$ ) is directly converted into the digital domain, whereas in traditional FLL-based designs an analog loop filter processes this error. The resulting 1-bit error signal is then filtered by the low-pass function in the DLF, which generates a multi-bit frequency control word (FCW). Finally, the DCO changes  $f_{out}$  according to the FCW until the FLL reaches its steady state.



Figure 3.1: Block diagram of the proposed DFLL based RC oscillator.



Figure 3.2: Waveforms of DCO output frequency, RC network timing, and  $C_{ref}$  voltages of the DFLL under ideal steady state operation.

As illustrated in figure 3.2, under ideal DFLL steady state operation,  $f_{out}$  toggles between two least significant bits (LSBs) of the DCO ( $f_{res}$  indicates its resolution) with a 50 % duty cycle, where its average value,  $f_{nom}$, satisfies  $f_{nom} = N \cdot f_{ref}$. The RC network in figure 3.1 is based on the one proposed by Lee in [21]. It consists of a floating capacitor,  $C_{ref}$, two resistors,  $R_{ref}$, and three switch pairs. In order to explain its operation principle, the non-overlapping clocks  $\Phi_{1-3}$  and the voltages on  $C_{ref}$,  $V_{ref+}$  and  $V_{ref-}$, are also plotted in figure 3.2. When  $\Phi_1 = 1$,  $C_{ref}$  is reset with  $V_{ref+} - V_{ref-} = -V_{DD}$. During  $\Phi_2 = 1$,  $C_{ref}$  is discharged towards  $V_{ref+} - V_{ref-} = V_{DD}$  through the two resistors  $R_{ref}$. The final voltages on  $C_{ref}$  can be expressed as

$$V_{ref+} = V_{DD} \cdot (1 - e^{\frac{-t_{\Phi_2}}{2R_{ref}C_{ref}}}),\tag{3.1}$$

$$V_{ref-} = V_{DD} \cdot e^{\frac{-t_{\Phi_2}}{2R_{ref}C_{ref}}},\tag{3.2}$$

where  $t_{\Phi_2}$  is the time during which  $\Phi_2 = 1$. When  $\Phi_3 = 1$,  $C_{ref}$  is connected to the comparator, and the voltage difference between  $V_{ref+}$  and  $V_{ref-}$  is quantized:

$$\begin{aligned} V_{ref} &= V_{ref+} - V_{ref-} \\ &= V_{DD} \cdot (1 - 2 \cdot e^{\frac{-t_{\Phi_2}}{2R_{ref}C_{ref}}}). \end{aligned}\tag{3.3}$$

With the help of the DLF, the average frequency,  $f_{nom}$ , ensures  $V_{ref} = 0$ , which means that the average value of  $t_{\Phi_2}$  is the zero-crossing time of the differential voltage  $V_{ref}$ . By solving equation 3.3 with  $V_{ref} = 0$ , this average value is given by

$$\overline{t_{\Phi_2}} = 2\ln(2)R_{ref}C_{ref}. \tag{3.4}$$

Since the condition  $t_{\Phi_2} = N/f_{out}$  is ensured by the clock generator, the average output frequency  $f_{nom}$  of the DCO will be

$$f_{nom} = \overline{f_{out}} = \frac{N}{\overline{t_{\Phi_2}}} = \frac{N}{2\ln(2)R_{ref}C_{ref}},\tag{3.5}$$

and thus the reference frequency defined by the network is considered to be

$$f_{ref} = \frac{1}{\overline{t_{\Phi_2}}} = \frac{1}{2\ln(2)R_{ref}C_{ref}}.\tag{3.6}$$
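The locking behavior described by equations 3.3 to 3.6 can be reproduced with a strongly idealized behavioral sketch. It assumes a noiseless comparator, a DLF reduced to a plain up/down counter, and a perfectly linear DCO with step  $f_{res}$; the function names, the comparator polarity and all numerical values are hypothetical and are not taken from the implemented design.

```python
import math

def v_ref(t_phi2, r_ref, c_ref, v_dd):
    """Differential capacitor voltage after a phase of length t_phi2 (equation 3.3)."""
    return v_dd * (1.0 - 2.0 * math.exp(-t_phi2 / (2.0 * r_ref * c_ref)))

def f_ref(r_ref, c_ref):
    """Reference frequency defined by the RC network (equation 3.6)."""
    return 1.0 / (2.0 * math.log(2.0) * r_ref * c_ref)

def simulate_dfll(r_ref, c_ref, n, f_start, f_res, steps, v_dd=0.8):
    """Idealized bang-bang loop with one comparison per step.

    The DLF is modeled as an up/down counter (an assumption), so each
    comparator decision moves the DCO by one step f_res.
    Returns the list of DCO frequencies visited.
    """
    f_out = f_start
    history = []
    for _ in range(steps):
        t_phi2 = n / f_out  # set by the clock generator
        # V_ref > 0 means the phase lasted too long, i.e. f_out is too low.
        f_out += f_res if v_ref(t_phi2, r_ref, c_ref, v_dd) > 0 else -f_res
        history.append(f_out)
    return history
```

With, for example,  $R_{ref}$  = 10 MΩ,  $C_{ref}$  = 1 pF, N = 4 and a start frequency of 250 kHz, the loop ramps up and then toggles between two adjacent DCO frequencies around  $N \cdot f_{ref}$, so that the average output frequency lies within half a DCO step of the target.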

There are three main advantages of the proposed architecture over the conventional ones. First, by replacing the traditional analog loop filter with the dynamic comparator and DLF, it can handle low supply voltages. Second, implementing a large part of the system in the digital domain, this architecture can exploit the energy efficiency of digital circuits in nanometer CMOS processes. Last but not least, being a highly digital-intensive architecture, this design is intrinsically amenable to the CMOS process scaling in terms of both chip area and supply voltage.

#### 3.2.1. Small signal model

The ideal output frequency of the oscillator is described in equation 3.5. However, non-idealities such as PVT variation and noise will make the oscillation deviate from this frequency. In order to investigate the effect of noise on the long-term stability (Allan deviation floor), an s-domain linearized small signal model is described in figure 3.3 for steady state FLL operation.

In the model, the RC network is treated as a linear gain,  $K_{RC}$ . Considering the feedback frequency to the RC network,  $f = f_{out}/N$ , is in the vicinity of  $f_{ref}$  in the DFLL steady state, by calculating



Figure 3.3: Small signal model of the proposed oscillator.

the derivative of  $V_{ref}$  in equation 3.3 with respect to f,  $K_{RC}$  is given by

$$\begin{aligned} K_{RC} &= \frac{dV_{ref}}{df} \bigg|_{f = f_{ref}} \\ &= \frac{d(V_{DD} \cdot (1 - 2 \cdot e^{\frac{-t_{\Phi_2}}{2R_{ref}C_{ref}}}))}{df} \bigg|_{f = f_{ref}} \\ &= \frac{d(V_{DD} \cdot (1 - 2 \cdot e^{\frac{-1}{2fR_{ref}C_{ref}}}))}{df} \bigg|_{f = f_{ref}} \\ &= -\frac{\ln(2)V_{DD}}{f_{ref}}. \end{aligned}\tag{3.7}$$
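The closed-form result of equation 3.7 can be cross-checked numerically. The sketch below compares it with a central-difference derivative of equation 3.3, using  $t_{\Phi_2} = 1/f$; the function names and the step size are illustrative choices.

```python
import math

def v_ref_of_f(f, r_ref, c_ref, v_dd):
    """Equation 3.3 with t_phi2 = 1/f."""
    return v_dd * (1.0 - 2.0 * math.exp(-1.0 / (2.0 * f * r_ref * c_ref)))

def k_rc_analytic(r_ref, c_ref, v_dd):
    """Closed-form gain of the RC network (equation 3.7)."""
    f_ref = 1.0 / (2.0 * math.log(2.0) * r_ref * c_ref)
    return -math.log(2.0) * v_dd / f_ref

def k_rc_numeric(r_ref, c_ref, v_dd, rel_step=1e-6):
    """Central-difference estimate of dV_ref/df at f = f_ref."""
    f_ref = 1.0 / (2.0 * math.log(2.0) * r_ref * c_ref)
    h = f_ref * rel_step
    return (v_ref_of_f(f_ref + h, r_ref, c_ref, v_dd)
            - v_ref_of_f(f_ref - h, r_ref, c_ref, v_dd)) / (2.0 * h)
```

The two values agree closely for any positive  $R_{ref}$,  $C_{ref}$  and  $V_{DD}$. The negative sign reflects that a higher feedback frequency shortens  $t_{\Phi_2}$  and therefore lowers  $V_{ref}$.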

The comparator is replaced by a gain factor, g. For simplicity, the low-pass DLF is assumed to be an integrator, whose s-domain model is given as  $K_{DLF} \cdot f_{INT}/s$ , where  $K_{DLF}$  is the gain factor of the DLF and  $f_{INT}$  is the integration frequency. The DCO is characterized as a gain factor,  $K_{DCO}$ , and the factors  $2\pi/s$  and  $s/(2\pi)$  are used for adding the phase noise  $\phi_{DCO}$  and related frequency-to-phase and phase-to-frequency domain transfers. In addition, due to the quantization effect of the comparator, an error Q is added, and it is also related to the quantization effect of the DCO [22].

There are three main noise sources in the proposed oscillator:

- $v_{n,RC}$ , the noise of the RC network;
- $v_{n,CMP}$ , the noise of the comparator;
- $\phi_{DCO}$ , the phase noise of the DCO.

Using the model, one can calculate the transfer functions from the spectral densities of these sources to the spectral density of the output fractional frequency fluctuation, $S_y$, which are given by

$$v_{n,RC} \text{ and } v_{n,CMP}: \quad \frac{S_y}{S_{v_n}} = \frac{NK_{DLF}K_{DCO}f_{INT}}{Ns + gK_{RC}K_{DLF}K_{DCO}f_{INT}}; \tag{3.8}$$

$$\phi_{DCO}: \quad \frac{S_y}{S_{\phi_{DCO}}} = \frac{Ns^2}{2\pi(Ns + gK_{RC}K_{DLF}K_{DCO}f_{INT})}. \tag{3.9}$$



Figure 3.4: Gain plots of the noise transfer functions.



Figure 3.5: Typical spectrums of  $v_n$  and  $\phi_{DCO}$ , and their responses on  $S_y$  with the DFLL operation.

Plotting the gains of the transfer functions in figure 3.4 makes it clear that $v_{n,RC}$ and $v_{n,CMP}$ are first-order low-pass filtered, whereas $\phi_{DCO}$ is second-order high-pass filtered. Typical spectra of $v_n$ (including $v_{n,RC}$ and $v_{n,CMP}$) and $\phi_{DCO}$, and the resulting $S_y$ with these noise spectra under DFLL operation, are plotted in figure 3.5 using the transfer functions.
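The filtering behaviour of equations 3.8 and 3.9 can be explored numerically with a minimal sketch such as the following. The values of `G_KRC` (the magnitude of $g \cdot K_{RC}$, with the sign assumed to be absorbed so that the loop has negative feedback) and `F_INT` are hypothetical placeholders, since $g$ is hard to estimate; $N$, $K_{DLF}$ and $K_{DCO}$ follow the settings used in this section.

```python
import math

def noise_gains(freq_hz, n, g_krc, k_dlf, k_dco, f_int):
    """Return (|S_y/S_vn|, |S_y/S_phi|) by evaluating eq. 3.8 and 3.9 at s = j*2*pi*f.

    g_krc is the magnitude of g * K_RC (assumption: loop polarity gives
    negative feedback, so the pole lies in the left half-plane).
    """
    s = 2j * math.pi * freq_hz
    loop = g_krc * k_dlf * k_dco * f_int
    h_vn = n * k_dlf * k_dco * f_int / (n * s + loop)
    h_phi = n * s ** 2 / (2 * math.pi * (n * s + loop))
    return abs(h_vn), abs(h_phi)

# N, K_DLF and K_DCO follow section 3.2.1; G_KRC and F_INT are hypothetical.
N, K_DLF, K_DCO = 16, 1 / 8, 250.0
G_KRC = 0.017
F_INT = 16e3

for f in (0.1, 1.0, 10.0, 100.0, 1e3, 1e4):
    h_vn, h_phi = noise_gains(f, N, G_KRC, K_DLF, K_DCO, F_INT)
    print(f"{f:8.1f} Hz  |H_vn| = {h_vn:10.3e}  |H_phi| = {h_phi:10.3e}")
```

At low frequencies the voltage-noise gain settles to $N/(gK_{RC})$, which is why a larger comparator gain $g$ lowers the flicker noise transferred to the output.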

According to appendix a, only the flicker frequency modulation (FM) ($f^{-1}$ noise process) in $S_y$ contributes to the Allan deviation floor. In figure 3.5, this flicker FM is caused by the flicker noise ($f^{-1}$ noise process) in $S_{vn}$ and the flicker FM ($f^{-3}$ noise process) in $S_{\phi DCO}$. Since the differential voltage seen by the comparator is the voltage, $V_{ref}$, on the floating capacitor, $C_{ref}$, $v_{n,RC}$ is given by

$$v_{n,RC} = \sqrt{\frac{kT}{C_{ref}}},\tag{3.10}$$

and it does not contribute to the flicker noise of $S_{vn}$. As a result, the Allan deviation floor is determined by a mix of the flicker noise of the comparator and the flicker noise of the DCO. However, this conclusion assumes that the DFLL bandwidth, $f_p$, equals both the corner frequency of $S_{vn}$, $f_{c1}$, and the $f^{-3}$ corner of $S_{\phi DCO}$, $f_{c2}$, which may not hold in practice. Moreover, the comparator gain factor in the transfer functions, g, depends on its input and output power and is thus hard to estimate [23], which complicates the estimation of $f_p$. In order to investigate the contributions of the comparator and the DCO to the Allan deviation floor quantitatively, simulations are performed using a z-domain MATLAB behavior model of the DFLL (a brief description of this model can be found in appendix b).

Although, in theory, the DCO flicker FM contributes to the Allan deviation floor, the MATLAB simulations in figure 3.6 and figure 3.7 show that the floor is dominated by the comparator flicker noise. Except for the flicker noise and flicker FM values, the two simulations use the same settings: $V_{DD} = 0.8 \text{ V}$, $R_{ref} = 5.5 \text{ M}\Omega$, $C_{ref} = 4 \text{ pF}$, $K_{DLF} = 1/8$, $K_{DCO} = 250 \text{ Hz}$, and N = 16. The noise values of the comparator and DCO are based



Figure 3.6: Allan deviation simulations with different comparator flicker noise and the same DCO flicker FM.



Figure 3.7: Allan deviation simulations with different DCO flicker FM and the same comparator flicker noise.

on the simulated value, which will be covered in the next chapter. According to figure 3.6, the designed comparator should have a flicker noise with a value smaller than 1  $\mu$ V/ $\sqrt{\rm Hz}$  at 10 Hz to achieve the long-term stability specification (Allan deviation floor better than 20 ppm).

The quantization error, Q, may also affect the Allan deviation floor. Since Q is related to the DCO resolution,  $f_{res}$ , Allan deviation is simulated with varying  $f_{res}$  using the MATLAB model to show the effect of Q. Figure 3.8 shows the result, where smaller Allan deviations are achieved with finer  $f_{res}$ s. When fully randomized, the quantization energy is similar to thermal noise, and thus a finer  $f_{res}$  will result in less thermal noise energy in the output frequency. Consequently, the  $\tau^{-1/2}$  thermal noise process Allan deviation is reduced. Due to the reduced thermal noise energy,



Figure 3.8: Allan deviation simulations with different DCO resolutions and comparator flicker noise of 1  $\mu$ V/ $\sqrt{\rm Hz}$  at 10 Hz.

the input power of the comparator is also reduced. With DFLL locking, the output power of the comparator is expected to be the same, and thus g becomes larger with a smaller $f_{res}$. A larger g makes the low-frequency gain smaller in equation 3.8 and equation 3.9, and consequently, less flicker noise is transferred from the comparator and DCO to the output. In this way, the Allan deviation floor also improves with a smaller $f_{res}$. In order to achieve the Allan deviation floor specification (better than 20 ppm), an $f_{res}$ of 250 Hz is required.
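For readers who want to post-process their own frequency data, the following is a minimal sketch of a non-overlapping Allan deviation estimator using the standard two-sample definition. It is not the MATLAB model used for figures 3.6 to 3.8; the function name and the sample data are illustrative.

```python
import math

def allan_deviation(y, m):
    """Non-overlapping Allan deviation of fractional-frequency samples y.

    m is the averaging factor, so the averaging time is tau = m * tau0,
    where tau0 is the spacing of the samples in y.
    """
    n_blocks = len(y) // m
    if n_blocks < 2:
        raise ValueError("need at least two averaging blocks")
    # Average the samples in consecutive blocks of length m.
    means = [sum(y[i * m:(i + 1) * m]) / m for i in range(n_blocks)]
    # Two-sample variance: half the mean squared difference of adjacent blocks.
    diffs = [(means[i + 1] - means[i]) ** 2 for i in range(n_blocks - 1)]
    return math.sqrt(0.5 * sum(diffs) / len(diffs))

# Illustrative data: an alternating pattern averages out completely for m = 2.
samples = [0.0, 1.0, 0.0, 1.0]
print(allan_deviation(samples, 1))
print(allan_deviation(samples, 2))
```

Evaluating the estimator over a range of `m` yields the Allan deviation versus averaging time curve, whose flat region at long $\tau$ is the floor discussed above.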

## **3.3.** System-level considerations

In this section, the system-level considerations for each building block are described. Specifically, the impact of the inaccuracy sources (i.e., PVT variations and noise) within each building block on the output frequency accuracy is analyzed, and solutions are proposed.

### **3.3.1.** RC network

As mentioned in the previous chapter, both integrated resistors and integrated capacitors have large process variations. At least a one-point trim is needed at room temperature to ensure the same oscillation frequency among different samples. The range of the trimming network should be large enough to cover the variations of the RC. Meanwhile, the accuracy of the frequency after trimming depends on the resolution of the trimming network, and thus the trimming step should be small enough to achieve a high frequency accuracy.

Assuming that a trimming range of $\pm 40$ % with a resolution of better than $\pm 0.05$ % with respect to the target reference frequency is needed, building a single monotonic DAC to realize such a trimming network is non-trivial. Therefore, as shown in figure 3.9, a 12-bit coarse-fine overlapped network can be adopted, where 10 bits are in the fine bank and 2 bits are in the coarse bank.
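The bit count can be motivated with a short calculation. The sketch below assumes that the full $\pm 40$ % span (80 % in total) must be covered with a step of 0.05 %; this interpretation of the resolution figure and the helper name are assumptions for illustration.

```python
import math

def min_trim_bits(total_range, step):
    """Smallest number of binary bits whose codes cover total_range with the given step."""
    return math.ceil(math.log2(total_range / step))

# +/-40 % span (0.80 in total) with a 0.05 % step, both as fractions.
bits = min_trim_bits(0.80, 0.0005)
print(bits)
```

Under these assumptions about 1600 steps are needed, which requires 11 bits in a single binary network. The 12-bit coarse-fine network therefore leaves roughly one bit of margin, which is consistent with overlapping the coarse and fine ranges.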

After trimming, the frequency inaccuracy due to the RC network depends only on the TC. Usually,



Figure 3.9: Illustration of the RC trimming strategy using a 12-bit coarse-fine overlapped network.



Figure 3.10: Topology of a resistor using first-order temperature compensation and illustration of the TCs of the resistors before and after compensation.

the TC of the integrated resistor is the major contributor [1]. A popular first-order temperature compensated resistor topology can be used to cancel such TC, as shown in figure 3.10. By modeling the resistance as

$$R = R_0(\alpha(T - T_0) + 1), \tag{3.11}$$

where T is the absolute temperature, $R_0$ is the resistance at $T_0$, and $\alpha$ is the TC of R, the compensated resistor can be expressed as

$$\begin{split} R_{comp} &= R_{pos} + R_{neg} \\ &= R_{pos0} (\alpha_{pos} (T - T_0) + 1) + R_{neg0} (\alpha_{neg} (T - T_0) + 1) \\ &= (R_{pos0} + R_{neg0}) (\alpha_{comp} (T - T_0) + 1). \end{split} \tag{3.12}$$

Ideally, $\alpha_{comp}$ should be 0. However, integrated resistors have higher-order temperature dependencies, which add to the final compensated TC. Moreover, the process variations of $R_{pos0}$ and $\alpha_{pos}$ also make the compensation inaccurate (the same conditions apply to $R_{neg0}$ and $\alpha_{neg}$) [1]:

$$|\Delta\alpha_{comp,\Delta R_{pos0}}| < \frac{|\alpha_{pos}| + |\alpha_{neg}|}{2} \left| \frac{\Delta R_{pos0}}{R_{pos0}} \right|, \tag{3.13}$$

$$|\Delta\alpha_{comp,\Delta\alpha_{pos}}| < \frac{|\alpha_{pos}| + |\alpha_{neg}|}{2} \left| \frac{\Delta\alpha_{pos}}{\alpha_{pos}} \right|, \tag{3.14}$$

where  $\Delta \alpha_{comp}$  is the residual TC,  $\Delta R$  is the spread in the resistance, and  $\Delta \alpha_{pos}$  the spread in the TC. Taking into account the effect of the higher-order dependencies and the process variations, a residual TC of approximately 100 ppm/°C is expected [1].
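Equation 3.12 implies $\alpha_{comp} = (R_{pos0}\alpha_{pos} + R_{neg0}\alpha_{neg})/(R_{pos0} + R_{neg0})$, which the following sketch evaluates together with the bound of equation 3.13. The TC values are the first-order poly-resistor TCs reported in table 4.1; the 10 % resistance spread applied to one resistor only is a hypothetical number chosen for illustration.

```python
def compensated_tc(r_pos0, alpha_pos, r_neg0, alpha_neg):
    """First-order TC of the series combination R_pos + R_neg (from eq. 3.12)."""
    return (r_pos0 * alpha_pos + r_neg0 * alpha_neg) / (r_pos0 + r_neg0)

def tc_bound_from_resistance_spread(alpha_pos, alpha_neg, rel_spread):
    """Upper bound on the residual TC caused by a spread in R_pos0 (eq. 3.13)."""
    return (abs(alpha_pos) + abs(alpha_neg)) / 2.0 * abs(rel_spread)

ALPHA_POS = 187e-6   # 1/degC, N-poly first-order TC from table 4.1
ALPHA_NEG = -69e-6   # 1/degC, P-poly first-order TC from table 4.1

# Ratio that nulls the first-order TC: R_pos0 / R_neg0 = -alpha_neg / alpha_pos.
R_NEG0 = 1.0
R_POS0 = -ALPHA_NEG / ALPHA_POS * R_NEG0

nominal_tc = compensated_tc(R_POS0, ALPHA_POS, R_NEG0, ALPHA_NEG)
SPREAD = 0.10        # hypothetical spread applied to R_pos0 only
shifted_tc = compensated_tc(R_POS0 * (1 + SPREAD), ALPHA_POS, R_NEG0, ALPHA_NEG)
bound = tc_bound_from_resistance_spread(ALPHA_POS, ALPHA_NEG, SPREAD)

print("ratio R_pos0/R_neg0:", R_POS0 / R_NEG0)
print("residual TC [ppm/degC]:", shifted_tc * 1e6, "bound:", bound * 1e6)
```

The nulling ratio of about 0.37 matches the $R_n/R_p$ ratio used for the compensated resistor in the next chapter.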

Regarding the switches, they should have small on-state resistance and small off-state leakage, because the resistance and the leakage current are vulnerable to PVT changes. Since these two terms depend on the circuit implementation, they will be covered in detail in the next chapter. On the other hand, since the value of $C_{ref}$ is relatively large for kilohertz-range oscillators, the charge injection of those switches is less important than their resistance and leakage.

### 3.3.2. Comparator

The error sources of the comparator are its offset voltage,  $V_{os}$ , and noise,  $v_{n,CMP}$ .

The model of the comparator with an offset voltage, $V_{os}$, during $\Phi_2$ of the RC network is shown in figure 3.11a. Since $V_{os}$ is directly added to $V_{ref}$, the original zero-crossing time of the differential voltage $V_{ref}$ will deviate. As a result, a DC frequency offset will be present on the nominal output frequency $f_{nom}$. The deviation and resulting frequency offset are given by

$$\overline{t_{\Phi_2, V_{os}}} = 2(\ln(2) - \ln(1 + \frac{V_{os}}{V_{DD}}))R_{ref}C_{ref}, \tag{3.15}$$

$$f_{nom,V_{os}} = \frac{N}{2(\ln(2) - \ln(1 + \frac{V_{os}}{V_{DD}}))R_{ref}C_{ref}}. \tag{3.16}$$



Figure 3.11: (a) Model of the comparator with an offset voltage, $V_{os}$, during $\Phi_2$ of the RC network. (b) Frequency offset with varying $V_{os}$ under ideal DFLL steady state.

Using equation 3.5 and equation 3.16, the fractional frequency offset due to $V_{os}$ can be calculated as

$$\begin{split} y_{os} &= \frac{f_{nom,V_{os}} - f_{nom}}{f_{nom}} \\ &= \frac{\ln(1 + \frac{V_{os}}{V_{DD}})}{\ln(2) - \ln(1 + \frac{V_{os}}{V_{DD}})}. \end{split} \tag{3.17}$$

With $V_{DD} = 0.8$ V, a $V_{os}$ of 2 mV causes a fractional frequency offset of 0.36 % on $f_{nom}$. This agrees with the result generated by the MATLAB behavior model shown in figure 3.11b.

To ensure that the frequency offset due to $V_{os}$ stays below 0.1 %, a $V_{os}$ below 0.5 mV is required, which may not be easy to achieve. Therefore, a chopping technique can be used for the comparator. With chopping, $V_{os}$ and the low-frequency flicker noise are modulated to the chopping frequency and then suppressed by the low-pass dynamics of the DFLL. As a result, the frequency error due to $V_{os}$ and the flicker noise is reduced, and the specifications on $V_{os}$ and the flicker noise can be relaxed.
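The offset figures quoted above follow directly from equation 3.17, as the short sketch below illustrates for $V_{DD} = 0.8$ V. The function name is illustrative.

```python
import math

def fractional_offset(v_os, v_dd):
    """Fractional frequency offset y_os caused by a comparator offset (eq. 3.17)."""
    x = math.log(1.0 + v_os / v_dd)
    return x / (math.log(2.0) - x)

V_DD = 0.8
for v_os in (0.5e-3, 2e-3):
    y_os = fractional_offset(v_os, V_DD)
    print(f"V_os = {v_os * 1e3:.1f} mV -> y_os = {y_os * 100:.3f} %")
```

For small offsets the expression reduces to approximately $V_{os}/(\ln(2)V_{DD})$, so the frequency error scales almost linearly with $V_{os}$ and inversely with the supply voltage.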

### 3.3.3. Digital loop filter

As a digital circuit, the DLF provides a low-pass function, which does not affect the frequency accuracy against PVT variations. However, the long-term stability changes with the gain factor, $K_{DLF}$. From equation 3.8 and equation 3.9, it can be derived that a larger $K_{DLF}$ means that more noise from the FD and less noise from the DCO is transferred to the output, and vice versa. The $K_{DLF} = 1/8$ used in the MATLAB simulations in section 3.2.1 ensures the best Allan deviation floor with the given flicker noise sources (comparator and DCO). However, FLL-based oscillators generally have a longer start-up time than relaxation oscillators. This start-up time can be reduced with a larger $K_{DLF}$. Therefore, it is useful to have a configurable $K_{DLF}$, which can be set to a large value for fast start-up and then tuned for better long-term stability once the DFLL is locked.

### **3.3.4.** Multi-phase clock generator

The clock generator divides $f_{out}$ by a factor of N and feeds this frequency, $f_{out}/N$, back to the FD. In addition, it provides the clocks that drive the other blocks, such as $f_{INT}$.

According to equation 3.5, any instability in *N* will make the output frequency change proportionally. Therefore, a stable *N* is required to keep the output frequency accurate against PVT changes. In practice, the clock generator can be implemented with fast digital cells, and hence it has negligible influence on the frequency accuracy.

#### 3.3.5. DCO

The frequency of a free running oscillator will deviate from its steady-state value due to PVT variations. The DFLL counteracts this effect by properly adapting the FCW of the DCO. In order to ensure locking, the frequency range of the DCO should be large enough to cover its own PVT changes.



Figure 3.12: Illustration of the frequency offset due to the DCO finite resolution.



Figure 3.13: Simulation showing reduced frequency offset due to the DCO finite resolution thanks to noise.

As shown in figure 3.2, the DCO toggles between its two LSBs with an average value that gives no locking error in the ideal DFLL locking case. However, in reality, the DCO may lock to a point where its average output frequency, $f_{avg}$, does not equal the target nominal value, $f_{nom}$. As explained in figure 3.12, the frequency offset between the average frequency and the nominal frequency is given by

$$\begin{split} f_{os} &= f_{nom} - f_{avg} \\ &= Nf_{ref} - \frac{f_1 + f_2}{2} \\ &= Nf_{ref} - f_2 - \frac{f_{res}}{2}, \end{split} \tag{3.18}$$

where $f_1$ and $f_2$ are the frequencies of two consecutive LSBs, and $f_{res}$ is the resolution of the DCO. According to equation 3.18, the worst case $f_{os}$ is given by $f_{res}/2$, when $f_2$ is very close to $N \cdot f_{ref}$. As $f_{res}$ may be vulnerable to PVT variations, it should be small enough to ensure a high locking accuracy. However, thanks to the random noise at the input of the comparator, $f_{os}$ is reduced, as shown in figure 3.13, where the DCO changes its frequency in several noisy levels instead of two. This is very similar to dithering, but the noise sources here are intrinsic to the DFLL.
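The dithering effect described above can be illustrated with a heavily simplified bang-bang loop. The sketch below is not the z-domain MATLAB model of the thesis: it steps the DCO code by one LSB per comparison and models all comparator-input noise as an equivalent Gaussian frequency error, which is an assumption made for illustration, as are the chosen target and noise level.

```python
import random

def average_lock_offset(target_hz, f_base_hz, f_res_hz, noise_rms_hz, steps, seed=0):
    """Return f_os = target - average DCO frequency for a +/-1 LSB bang-bang loop."""
    rng = random.Random(seed)
    code = 0
    total = 0.0
    for _ in range(steps):
        f_dco = f_base_hz + code * f_res_hz
        total += f_dco
        # Equivalent frequency noise at the comparator input (assumed model).
        noise = rng.gauss(0.0, noise_rms_hz) if noise_rms_hz > 0 else 0.0
        # Bang-bang decision: step down when too fast, up otherwise.
        code += -1 if f_dco + noise > target_hz else 1
    return target_hz - total / steps

F_RES = 250.0                 # DCO resolution in Hz
TARGET = 512e3                # N * f_ref in Hz
F_BASE = TARGET - 0.1 * F_RES # lower LSB sits just below the target

clean = average_lock_offset(TARGET, F_BASE, F_RES, 0.0, 200_000)
noisy = average_lock_offset(TARGET, F_BASE, F_RES, F_RES, 200_000)
print("offset without noise [Hz]:", clean)
print("offset with noise    [Hz]:", noisy)
```

Without noise the loop toggles between two codes and settles at the offset predicted by equation 3.18; with noise of about one LSB the code spreads over several levels and the average moves towards the target.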

Although the requirement on DCO resolution is relaxed thanks to the DFLL intrinsic noise,  $f_{res}$  still needs to be small enough (e.g., 250 Hz) to achieve the Allan deviation requirement, as discussed in section 3.2.1.

### 3.4. Conclusion

In this chapter, a DFLL-based RC oscillator is proposed, which is suitable for low-supply-voltage and low-power operation. After describing the operation principle, its frequency accuracy against PVT is discussed based on the analysis of the error sources within each building block. An s-domain model is proposed to predict the effect of noise on the long-term stability, and simulations are performed to calculate the resulting Allan deviation using a z-domain MATLAB behavior model. Table 3.2 summarizes the specifications of each block based on the analysis in this chapter.

| Item                     | Value                          | Frequency Error         |
|--------------------------|--------------------------------|-------------------------|
| RC trimming resolution   | 0.05 %                         | 0.05 %                  |
| TC of $R_{ref}$          | 100 ppm/°C                     | 100 ppm/°C              |
| Comparator offset        | 0.5 mV                         | 0.1 %                   |
| Comparator flicker noise | $1\ \mu V/\sqrt{Hz}$ at 10 Hz  | 20 ppm $\sigma_y$ floor |
| DCO resolution           | 250 Hz                         | 20 ppm $\sigma_y$ floor |
|                          |                                | 0.15 %                  |
| Total <sup>1</sup>       | -                              | 100 ppm/°C              |
|                          |                                | 20 ppm $\sigma_y$ floor |

Table 3.2: Design specifications for the major building blocks.

<sup>1</sup>The frequency error due to the Allan deviation floor should not be added up because its contributors are correlated.

# Circuit Implementation

In chapter 3, a new DFLL-based RC oscillator was proposed, and high-level design considerations were given based on accuracy and stability analysis. The circuit implementation of the proposed oscillator is introduced in this chapter.

# **4.1.** System overview

A detailed block diagram of the proposed oscillator is shown in figure 4.1. Since one of the objectives is to relax the requirements of the power management unit, the oscillator is fully functional with only one supply.

Figure 4.1: Architecture of the proposed oscillator, the clocks driving each block are marked in purple.

The clock division ratio N is set to 16, and the time constant is chosen to provide a reference frequency of 32 kHz. Consequently, the nominal output frequency, $f_{nom} = \overline{f_{out}}$, should be 512 kHz. In addition to the simplified block diagram shown in figure 3.1, the dynamic comparator is chopped to cancel its own offset and flicker noise. Moreover, the DCO has a $\Sigma\Delta$ dithered input to enhance its resolution. Last but not least, the divider is implemented with two separate blocks to provide the essential clocks which drive the DFLL internal blocks.

The blocks inside are clocked at different frequencies to achieve a high energy efficiency. The FD and DLF work at a frequency of  $f_{out}/32$ , except the choppers which are clocked at  $f_{out}/256$ . The  $\Sigma\Delta$  modulator has the highest frequency of  $f_{out}/2$ . Since the amplifiers in traditional analog FLLs are substituted with a dynamic comparator, the oscillator operates at low supply voltages with low power consumption. The details about each block are given in the following.

### 4.2. RC network

### **4.2.1.** Characteristics of integrated resistors and capacitors

In the 40-nm process used, there are three types of resistors available with different resistivity: metal, diffusion and poly resistors. To save chip area and eliminate voltage dependency in the time constant, the non-silicide poly resistors are chosen. Specifically, the process and temperature characteristics of the N-poly resistor and the P-poly resistor over the -40 °C to 125 °C temperature range are summarized in table 4.1.

As mentioned in the previous chapter, the TC of the resistor can be first-order compensated by combining two resistors in series with opposite temperature dependencies. The temperature dependency of the compensated resistor is plotted in figure 4.2 along with those of the poly resistors, where the ratio of the N-poly and P-poly resistors used is $R_n/R_p = |\alpha_p/\alpha_n| \approx 0.37$.

As observed in the figure, apart from the first-order TC, $\alpha$, the compensated resistor also has a strong second-order temperature dependency, $\beta$. This $\beta$ mainly comes from that of the P-poly resistor, $\beta_p$, and as a result, the overall TC is degraded from near 0 to 26.29 ppm/°C. Regarding the accuracy of $\alpha$, according to equation 3.13 and equation 3.14, it deteriorates due to process variations. Process variations have been analyzed using Monte Carlo simulations, and the results are shown in figure 4.3. $\alpha$ is extracted using the least-squares method in figure 4.3a, where its mean value, $\mu_{\alpha}$, is -2.16 ppm/°C with 3-$\sigma_{\alpha} = \pm 0.42$ ppm/°C. The non-zero $\mu_{\alpha}$ is caused by the imperfect resistance ratio, $R_n/R_p$. Taking all these effects into account, simulations show that the process variation of the overall TC is 3-$\sigma = \pm 0.18$ ppm/°C, which is smaller than the 3-$\sigma_{\alpha}$. In addition, as shown in figure 4.3b, the resistance variation of the compensated resistor is $\pm 19$ % (3-$\sigma/\mu$).

There are only two types of integrated capacitors available in this process: MOS capacitors (MOSCAPs), which are voltage controlled capacitors (varactors) and metal-oxide-metal (MOM) capacitors. The MOSCAP has high temperature and process dependencies, so it is not preferred here. The MOM

|                                                                                     | N-poly | P-poly  |
|-------------------------------------------------------------------------------------|--------|---------|
| Resistance Process Variation (3-σ) [%]                                              | ±18%   | ±19%    |
| First-order TC α [ppm/°C]                                                           | 187    | -69     |
| First-order TC Process Variation $\Delta \alpha$ (3- $\sigma$ ) [ppm/ $^{\circ}$ C] | ±1.04  | ±0.0403 |

Table 4.1: Characteristics of the available non-silicide poly resistors in the 40-nm process.



Figure 4.2: Temperature dependencies of the compensated and uncompensated resistors



Figure 4.3: Monte Carlo simulations showing (a) first-order TC variation, and (b) resistance variation of the compensated resistor.



Figure 4.4: (a) Process variation of a 1 pF fringe MOM capacitor. (b) Temperature dependency of the MOM capacitor.

capacitors use multiple metal layers and the oxide between the layers as the insulator, while the widely used metal-insulator-metal (MIM) capacitors [3, 7] only use one layer of metal. Since there are 8 layers of metal available in this process, the density of MOM capacitors can be much higher than that of the MIM capacitors. As the network structure is differential, fringe MOM capacitors are used for their symmetry. The process variation of a 1 pF fringe MOM capacitor is simulated and shown in figure 4.4a, where $3-\sigma/\mu=\pm19$ %. As plotted in figure 4.4b, the TC of the capacitor is 6.46 ppm/°C, which is much smaller than those of the resistors and hence negligible.

### **4.2.2.** Trimming network

In order to provide a time constant that satisfies  $1/(2 \ln(2)R_{ref}C_{ref}) = 32 \text{ kHz}$ , the nominal value of  $R_{ref}$  was originally chosen to be 5.6 M $\Omega$ , and consequently  $C_{ref}$  was 4 pF. They were tunable to ensure the 32 kHz reference frequency against process variations as illustrated in figure 3.9, and the boundaries for the tuning values are given by

$$\begin{split} (1 - 3\sigma_R)R_{max} \cdot (1 - 3\sigma_C)C_{max} &\ge R_{ref}C_{ref} \\ (1 + 3\sigma_R)R_{min} \cdot (1 + 3\sigma_C)C_{min} &\le R_{ref}C_{ref} \end{split} \tag{4.1}$$

where $\sigma$ is given in the previous subsection, and max and min denote the maximum and minimum values of the corresponding components. A compensated resistor bank with 2 binary bits and a coarse step size of 1.05 M$\Omega$ was put in series with a compensated resistor of 4.5 M$\Omega$, and thus $R_{ref}$ was tunable from 4.5 M$\Omega$ to 7.6 M$\Omega$. Meanwhile, a custom-designed metal capacitor bank with 10 binary bits and a fine step size of 1.5 fF was put in parallel with a MOM capacitor of 3.22 pF, and thus $C_{ref}$ was tunable from 3.22 pF to 4.75 pF. In addition, the 3.22 pF MOM capacitor was split into two MOM capacitors of 1.61 pF for symmetry. With this original setup, the boundary conditions 4.1 were satisfied. The frequency was trimmable from 62 % to 154 % with respect to the nominal value, and the fine capacitor tuning step ensured a trimming resolution better than 0.05 % at the tt corner.
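The boundary conditions 4.1 and the quoted trimming range can be checked with a few lines of arithmetic. The sketch below uses the component values of the original setup and assumes 3-$\sigma$ spreads of 19 % for both the compensated resistor and the MOM capacitor, as reported in the previous subsection; the variable names are illustrative.

```python
import math

def f_ref_hz(r_ohm, c_farad):
    """Reference frequency set by the time constant: 1 / (2 ln(2) R C)."""
    return 1.0 / (2.0 * math.log(2.0) * r_ohm * c_farad)

R_NOM, C_NOM = 5.6e6, 4e-12          # nominal R_ref and C_ref
R_MIN, R_MAX = 4.5e6, 7.6e6          # resistor tuning range
C_MIN, C_MAX = 3.22e-12, 4.75e-12    # capacitor tuning range
SPREAD_R = SPREAD_C = 0.19           # assumed 3-sigma process spreads

# Boundary conditions of equation 4.1.
upper_ok = (1 - SPREAD_R) * R_MAX * (1 - SPREAD_C) * C_MAX >= R_NOM * C_NOM
lower_ok = (1 + SPREAD_R) * R_MIN * (1 + SPREAD_C) * C_MIN <= R_NOM * C_NOM

# Trimming range relative to the nominal reference frequency.
f_nominal = f_ref_hz(R_NOM, C_NOM)
trim_low = f_ref_hz(R_MAX, C_MAX) / f_nominal
trim_high = f_ref_hz(R_MIN, C_MIN) / f_nominal

print("nominal f_ref [Hz]:", f_nominal)
print("boundary conditions met:", upper_ok and lower_ok)
print("trim range:", trim_low, "to", trim_high)
```

With these numbers both inequalities hold, and the relative trimming range comes out close to the 62 % to 154 % stated in the text.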



Figure 4.5: RC network with trimming capabilities.

However, since the TC of $f_{nom}$ should be mainly determined by $R_{ref}$, any error in the SPICE model of the TC of $R_{ref}$ would degrade the TC of the $f_{nom}$ generated by the fabricated chip. To avoid this degradation, a last-minute decision was made to add TC tunability to the resistor bank just before tape-out. Due to time constraints, the original trimming setup was changed to the one shown in figure 4.5. The $C_{ref}$ setup is preserved, while the new $R_{ref}$ setup can only change the TC but not the nominal value of the compensated resistor. As a result, the trimming range of the RC network is reduced. The new $R_{ref}$ has a nominal value of 6.6 M$\Omega$ and consists of 3 parts connected in series: a compensated resistor of 4.5 M$\Omega$, a P-poly resistor of 1.55 M$\Omega$, and an N-poly resistor bank with 4 binary bits and a step size of 68.8 k$\Omega$. Thus, by calculation, the $R_n/R_p$ ratio can be changed from 0.24 to 0.45 with a step of 0.014, and the corresponding $\alpha$ of $R_{ref}$ changes from -24 ppm/°C to 15 ppm/°C with a step of 2.6 ppm/°C. However, also due to time constraints, the tuning range and step size of the overall TC (including the higher-order effects) were not checked by simulations.

Due to the change in $R_{ref}$, the nominal $f_{ref}$ changes from 32 kHz to 27 kHz, and the corresponding nominal $f_{nom}$ changes from 512 kHz to 433 kHz. The change in nominal frequency has no negative impact on the DFLL itself as long as the DCO range is large enough, which will be covered in section 4.4. However, because the trimming range is also reduced by the change in $R_{ref}$, the boundary conditions 4.1 are now violated, and thus $f_{nom}$ cannot be trimmed to the same value under extreme process variations. Fortunately, since the IoT node wakeup signal is usually generated by an oscillator plus a counter, the value in the counter can be changed to counteract the process variation. Nevertheless, $R_{ref}$ should be improved in the next design.
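The violation of the boundary conditions can be made concrete with the same arithmetic as for the original setup. The sketch below assumes that $R_{ref}$ is fixed at its 6.6 M$\Omega$ nominal value (ignoring the small resistance change of the TC-tuning bank), keeps the 3.22 pF to 4.75 pF capacitor range, and again assumes 19 % 3-$\sigma$ spreads; these simplifications are assumptions for illustration.

```python
import math

R_FIXED = 6.6e6                      # new R_ref, assumed not trimmable
C_NOM = 4e-12                        # nominal C_ref
C_MIN, C_MAX = 3.22e-12, 4.75e-12    # preserved capacitor tuning range
SPREAD_R = SPREAD_C = 0.19           # assumed 3-sigma process spreads

target_rc = R_FIXED * C_NOM

# Equation 4.1 with R_max = R_min = R_FIXED.
upper_ok = (1 - SPREAD_R) * R_FIXED * (1 - SPREAD_C) * C_MAX >= target_rc
lower_ok = (1 + SPREAD_R) * R_FIXED * (1 + SPREAD_C) * C_MIN <= target_rc

f_ref_new = 1.0 / (2.0 * math.log(2.0) * target_rc)

print("new nominal f_ref [Hz]:", f_ref_new)
print("upper bound satisfied:", upper_ok)
print("lower bound satisfied:", lower_ok)
```

Under these assumptions neither inequality holds, so the capacitor bank alone cannot absorb simultaneous 3-$\sigma$ deviations of both components, and the computed reference frequency is close to the 27 kHz quoted in the text.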

#### **4.2.3.** Switches

There are four different types of switches used in the network, namely SW<sub>1-4</sub> in figure 4.5. The switches control the reset and discharging processes in the RC network, and they are implemented as transmission gates, as shown in figure 4.6.



Figure 4.6: Schematic of the switches used in the RC network.

Between two terminals, an ideal switch provides zero resistance when it is closed and zero leakage current when it is open. However, a CMOS transmission gate has a nonzero on-state resistance, $R_{on}$, and a nonzero off-state leakage current, $I_{leak}$. For simplicity, assume that the carrier mobility, the dimensions, and the threshold voltages are the same for $M_1$ and $M_2$ in figure 4.6; the $R_{on}$ and $I_{leak}$ between terminals A and B are then approximately given by

$$R_{on} = \frac{1}{\mu C_{ox} \frac{W}{L} (V_{gs} - V_{th})},\tag{4.2}$$

$$I_{leak} = I_0 e^{\frac{q(V_{gs} - V_{th})}{nkT}},\tag{4.3}$$

where $\mu$ is the carrier mobility, $C_{ox}$ is the oxide capacitance per unit area, W is the gate width, L is the gate length, $V_{gs}$ is the gate-to-source bias, $V_{th}$ is the threshold voltage, $I_0$ is the current at $V_{gs} = V_{th}$, and the slope factor n is given by $n = 1 + C_D/C_{OX}$, with $C_D$ the capacitance of the depletion layer and $C_{OX}$ the capacitance of the oxide layer. Because $R_{on}$ and $I_{leak}$ are temperature and supply dependent<sup>1</sup>, small values are preferred to minimize any possible instabilities in $f_{ref}$ and thus $f_{nom}$. A large W/L should be chosen to minimize $R_{on}$, while a small W/L is preferred for a small $I_{leak}$, since $I_0$ is proportional to W/L. Therefore, $R_{on}$ and $I_{leak}$ trade off with each other in transistor sizing.
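The sizing trade-off expressed by equations 4.2 and 4.3 can be illustrated numerically. In the sketch below, all device parameters (`MU_COX`, `I0_UNIT`, the threshold voltage and the slope factor) are hypothetical placeholders rather than values of the 40-nm process, and $I_0$ is modeled as proportional to W/L as stated in the text.

```python
import math

K_B = 1.380649e-23      # Boltzmann constant in J/K
Q_E = 1.602176634e-19   # elementary charge in C

def r_on(mu_cox, w_over_l, v_gs, v_th):
    """On-state resistance of one transistor (eq. 4.2)."""
    return 1.0 / (mu_cox * w_over_l * (v_gs - v_th))

def i_leak(i0_unit, w_over_l, v_gs, v_th, n, temp_k):
    """Subthreshold leakage (eq. 4.3), with I_0 = i0_unit * W/L."""
    return i0_unit * w_over_l * math.exp(Q_E * (v_gs - v_th) / (n * K_B * temp_k))

# Hypothetical device parameters, for illustration only.
MU_COX = 200e-6    # A/V^2
I0_UNIT = 100e-9   # A per unit W/L at V_gs = V_th
V_TH, N_SLOPE, TEMP = 0.45, 1.3, 300.0

for ratio in (1.0, 2.0, 4.0):
    on_res = r_on(MU_COX, ratio, 0.8, V_TH)                          # switch closed
    leak = i_leak(I0_UNIT, ratio, 0.0, V_TH, N_SLOPE, TEMP)          # switch open
    print(f"W/L = {ratio}: R_on = {on_res:.3e} ohm, I_leak = {leak:.3e} A")
```

Doubling W/L halves $R_{on}$ and doubles $I_{leak}$, so their product is independent of the sizing in this first-order model. This is why a switch can only be optimized for one of the two quantities.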

Some switches can be optimized for either a small $R_{on}$ or a small $I_{leak}$ based on their functions to avoid the sizing trade-off. $SW_1$ is sized for a small $I_{leak}$ since it is connected to the rails and always open after the capacitor reset. SW<sub>2</sub> is designed for a small $R_{on}$ because it is connected in series with $R_{ref}$. However, the capacitor bank trimming switches, SW<sub>3</sub>, should simultaneously have a small $R_{on}$ for the connected capacitors and a small $I_{leak}$ for the disconnected capacitors. Considering the sizing trade-off, simulations should be done to find an optimal point where the total instability in $f_{ref}$ due to $R_{on}$ and $I_{leak}$ of SW<sub>3</sub> is minimized. However, due to time constraints, SW<sub>3</sub> is sized for low leakage, which may not be optimal. For $SW_4$, $R_{on}$ and $I_{leak}$ are not important because it is not connected to either the rails or $R_{ref}$.

Regarding the charge injection of the switches, because the parasitic capacitances of the switches

|                 | $R_{on}\left[ k\Omega \right]$ | I <sub>leak</sub> [pA] |
|-----------------|--------------------------------|------------------------|
| $SW_1$          | 224                            | 38.7                   |
| $SW_2$          | 58.1                           | 129                    |
| SW <sub>3</sub> | 311                            | 32.9                   |

Table 4.2: Worst-case  $R_{on}$  and  $I_{leak}$  of the switches at 25 °C and tt corner.

 $<sup>^{1}\</sup>mu$  in  $R_{on}$ ,  $I_{0}$ , n, and T in  $I_{leak}$  have temperature dependencies;  $V_{as}$  in both terms are determined by the supply voltage.

are much smaller (a few fF) than the  $C_{ref}$  (nominal value 4 pF), they are neglected. Finally, worst-case values<sup>2</sup> of the  $R_{on}$  and the  $I_{leak}$  of SW<sub>1-3</sub> are shown in table 4.2.

### **4.3.** Dynamic comparator

The DFLL uses a two-stage dynamic comparator, which is shown in figure 4.7. The first stage is a voltage amplification stage with differential input and output. The second stage contains both a simple voltage amplifier and a positive-feedback amplifier which acts as a latch. The differential input pair is pre-charged while the clock is low (CMP = 0). A rising edge of CMP stops the pre-charging and starts the amplification in the first stage. The common-mode output voltage decreases during the amplification. When the common-mode output approaches the threshold voltage of the input pair of the second stage, the voltage amplification in the second stage takes over. In the second stage, the common-mode output voltage increases gradually during the amplification, and finally, the latch takes over and provides a rail-to-rail output.

Because the operation is fully dynamic, a low power consumption of 10 nW is simulated with $f_{CMP} = 16$ kHz. Since the gain of the first stage suppresses the noise contribution from the second stage, the noise is determined by the input differential pair. To achieve better long-term frequency stability, the flicker noise of the comparator should be optimized. Therefore, the input differential pair is implemented with large transistors. To check the input-referred noise spectrum of the dynamic comparator, a periodic steady state (PSS) testbench based on the simulation method described in [24] is adopted. By simulating the noise voltage at the decision moment shown in figure 4.8a, the input-referred noise plot of figure 4.8b is generated. As shown in the noise plot, the spot noise due to the flicker noise is around $2.4 \ \mu V/\sqrt{Hz}$ at 10 Hz.

In order to meet the Allan deviation floor specification of 20 ppm, the flicker spot noise at 10 Hz needs to be below 1  $\mu$ V/ $\sqrt{\rm Hz}$ , according to figure 3.6. Thus, two choppers are implemented at both input and output of the comparator to reduce this flicker noise.

Figure 4.7: Schematic of the dynamic comparator.

<sup>&lt;sup>2</sup>Because the $R_{on}$ and the $I_{leak}$ also depend on the voltages of A and B, the largest values are shown.

Figure 4.8: (a) Comparator output voltages showing the decision moment at around 2.5 ns. (b) Input referred noise of the comparator.

Figure 4.9: (a) Illustration of the comparator offset simulation testbench. (b) Statistical simulation result of the comparator offset voltage.

The offset voltage of the comparator is evaluated with Monte Carlo simulations as shown in figure 4.9a. The comparator is biased with a common-mode voltage $V_{CM} = V_{DD}/2$, and a stepped ramp differential voltage, $V_{diff}$, is applied to the non-inverting input. For each voltage step, a 700-run Monte Carlo simulation is done. Since the dimensions of the transistors are randomized, the distribution of $V_{os}$ is Gaussian. Thus, the probability of positive outputs, $P(V_{out} = 1)$, of each Monte Carlo simulation can be plotted versus $V_{diff}$ as in figure 4.9b, and it should be interpreted as the integral of the probability density function of the Gaussian distribution [25]. Using the plot, the 3-$\sigma$ value of the offset is calculated as 3.11 mV, which gives a PVT-vulnerable fractional frequency offset of 0.56% according to equation 3.17. This frequency offset necessitates the implementation of the choppers.
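The extraction of the offset spread from a $P(V_{out} = 1)$ curve can be sketched as follows. This is a minimal illustration and not the testbench used in this work: it assumes a Gaussian offset, transforms each probability with the inverse normal CDF, and fits a straight line against $V_{diff}$. The function name `fit_offset_sigma` and the synthetic data (generated from a hypothetical standard deviation chosen so that $3\sigma = 3.11$ mV) are assumptions made only for the example.

```python
from statistics import NormalDist


def fit_offset_sigma(v_diff, p_high):
    """Fit P(V_out = 1) = Phi((v_diff - mu) / sigma) with a straight line
    through the probit-transformed probabilities. Returns (mu, sigma)."""
    std = NormalDist()
    # Probabilities of exactly 0 or 1 carry no slope information; skip them.
    pts = [(v, std.inv_cdf(p)) for v, p in zip(v_diff, p_high) if 0.0 < p < 1.0]
    n = len(pts)
    mean_v = sum(v for v, _ in pts) / n
    mean_z = sum(z for _, z in pts) / n
    slope = (sum((v - mean_v) * (z - mean_z) for v, z in pts)
             / sum((v - mean_v) ** 2 for v, _ in pts))
    sigma = 1.0 / slope                 # z = (v - mu) / sigma
    mu = mean_v - mean_z * sigma
    return mu, sigma


# Synthetic example (assumed values, not simulation data from this work).
TRUE_SIGMA = 3.11e-3 / 3.0                      # volts
steps = [i * 0.5e-3 for i in range(-6, 7)]      # -3 mV ... +3 mV
probs = [NormalDist(0.0, TRUE_SIGMA).cdf(v) for v in steps]

mu_hat, sigma_hat = fit_offset_sigma(steps, probs)
print(f"3-sigma offset = {3 * sigma_hat * 1e3:.2f} mV")
```

With real Monte Carlo data the probabilities are estimates from a finite number of runs, so the fitted value carries a statistical uncertainty that this sketch does not quantify.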

Since the comparator output is in the digital domain, its chopper is implemented as a multiplexer (MUX), as shown in figure 4.1. When CHOP = 1, the input is connected normally and the output MUX selects $V_{outp}$. When CHOP = 0, the input is inverted, and the output MUX selects $V_{outn}$. In this way, the input offset and flicker noise are modulated from low frequencies to the chopping frequency, $f_{out}/256$, while most of the input signal energy stays at DC. The low-pass DLF suppresses the modulated offset and flicker noise. Consequently, the overall frequency accuracy and long-term stability improve.
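The effect of the two choppers can be illustrated with a linearized sketch. It is a simplification under stated assumptions: the comparator is replaced by a unity-gain stage with an additive offset (the real comparator makes a bang-bang decision), the signal and offset values are arbitrary, and 8 samples per chopping period are used, which corresponds to the ratio between an $f_{out}/32$ decision rate and the $f_{out}/256$ chopping rate. The function name `chopped_error` is hypothetical.

```python
def chopped_error(v_in, v_os, chop_period, n_samples):
    """Linearized chopper model: the input is multiplied by +/-1 before the
    offset is added, and the result is multiplied by the same +/-1 afterwards."""
    out = []
    for n in range(n_samples):
        chop = 1 if (n // (chop_period // 2)) % 2 == 0 else -1
        sensed = chop * v_in + v_os     # input chopper, then comparator offset
        out.append(chop * sensed)       # output chopper (MUX) restores polarity
    return out


V_IN, V_OS = 0.2e-3, 3.0e-3             # assumed values in volts
samples = chopped_error(V_IN, V_OS, chop_period=8, n_samples=64)
average = sum(samples) / len(samples)   # low-pass filtering over full periods
print(average)
```

Each sample equals $V_{in} \pm V_{os}$, so the offset appears as a square wave at the chopping frequency and averages out over whole chopping periods, while the input term is unaffected.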

### 4.4. DCO

The block diagram of the DCO is shown in figure 4.10. It consists of a current-mode $\Sigma\Delta$ modulated DAC with a subthreshold PTAT biasing, and a four-stage leakage-based ring oscillator. The leakage-based ring oscillator is chosen for its ultra-low power consumption. The frequency of the ring oscillator can be digitally controlled by changing its leakage current using the current-mode DAC. Because the resolution of the DCO limits the Allan deviation floor of the DFLL output, as covered in the previous chapter, a $\Sigma\Delta$ modulator is used to enhance this resolution.

The nominal frequency of the ring oscillator is 512 kHz, and its tuning range is large enough to counteract its own variations. Under the nominal condition (tt corner and 25 °C), the DCO can be tuned from 226 kHz to 908 kHz. Figure 4.11 shows the DCO frequency versus input code across process corners and temperature changes. As observed in the plot, except for one extreme condition (i.e., ff corner and 125 °C), the DCO can be locked back to the targets, before and after



Figure 4.10: Block diagram of the DCO with  $\Sigma\Delta$  dithered input.



Figure 4.11: DCO tuning range under temperature and process corners.




Figure 4.12: DNL of the DCO integer bits.



Figure 4.13: DCO phase noise performance at the center frequency of 512 kHz.

the nominal value change of the RC network, by the DFLL. While the out-of-range condition is caused by both the variation of the ring oscillator and the PTAT bias, it can be compensated by the trimming bits within the subthreshold biasing of the DAC.

The DCO has a digital input range of 11 binary bits, which are divided into 8 integer bits and 3 fractional bits. The integer bits are decoded to control 255 unary-weighted current cells, and the resulting frequency resolution is 2 kHz. The fractional bits are dithered by a third-order $\Sigma\Delta$ modulator with a clock frequency, $f_{SDM}$, $16\times$ higher than the update rate of the integer bits, and the resulting frequency resolution is better than 250 Hz. Since the DFLL always locks the DCO frequency back to the RC network, the integral nonlinearity (INL) is not important. However, the deviation of the DCO resolution could change the Allan deviation floor, and thus it is checked by simulation as shown in figure 4.12, where a result of better than $\pm 0.14$ LSB is observed. Since the current cell is implemented with thick-oxide transistors in the layout, the current provided by each cell is not the same due to the well proximity effect. Moreover, since the cells are laid out in a $16\times 16$ array, the notch in the plot is caused by the row switching of the current cells.

Excluding the decoder and the $\Sigma\Delta$ modulator, simulation shows that the DCO draws 60 nW from the 0.8 V supply to generate a 512 kHz output frequency. The ring oscillator core consumes 48 nW, and the rest is dissipated in the DAC. The power of the decoder and $\Sigma\Delta$ modulator is excluded because they are synthesized together with the DLF. The phase noise at this output frequency is simulated, and the result is shown in figure 4.13. As observed in the plot, the spectrum is dominated by the flicker noise. If one calculates the corresponding Allan deviation using the noise data and the conversion table in appendix A, a floor of 9000 ppm can be derived, which is far worse than the specification. However, thanks to the DFLL operation, the DCO phase noise contribution is suppressed and the Allan deviation floor is set by the comparator flicker noise to an acceptable level.

The detailed design of the delay cell, subthreshold biasing, and the  $\Sigma\Delta$  modulator is given in the following.

### 4.4.1. Leakage based delay cell

In order to achieve a low oscillation frequency (i.e., several hundred kHz), the conventional current-starved ring oscillator can be used. However, its power consumption cannot be scaled down along with the frequency due to the leakage current in its output buffer [26]. Figure 4.14 shows the



Figure 4.14: Schematic of the leakage based delay cell.




Figure 4.15: Transient voltages of the internal nodes in one delay cell.

schematic of the delay cells used in the ring oscillator, which is proposed by Lee [26] as a solution for low-frequency oscillators with frequency-independent energy efficiency. The delay cell includes input transistors ( $M_{1A}$ ,  $M_{1B}$ ,  $M_{4A}$ , and  $M_{4B}$ ) and back-to-back inverters ( $M_{2A}$ ,  $M_{2B}$ ,  $M_{3A}$ , and  $M_{3B}$ ) which act as a latch. The internal signals of the delay cell D1 in figure 4.10 are shown in figure 4.15.

The input and output transistors are differentially configured. For simplicity, only the left half of the delay will be analyzed for the input low-to-high transition and assuming that the threshold voltages are equal  $(V_{thn} = V_{thp})$ . When the input voltage IN+ goes above  $V_{thn}$ , the switch transistor  $M_{4A}$  closes and  $M_{1A}$  opens. With OUT+ also lower than  $V_{thn}$  at the same moment,  $M_{2A}$  is closed and  $M_{3A}$  is open. Therefore, node OUT- goes up slowly with a charging current of  $I_{biasn}$ , which is controlled by the bias voltage  $V_{biasn}$  on  $M_{NB}$ . After OUT+ rises above  $V_{thn}$ ,  $M_{2A}$  is open and  $M_{3A}$  is closed. Immediate discharging on node OUT- and charging on node OUT+ happen at the same time.

The discharged output on OUT- increases the strength of $M_{2B}$ and raises the speed at which the complementary output voltage OUT+ is pulled up. The high OUT+ enhances the conductance of $M_{3A}$ and increases the speed at which OUT- is pulled down. The positive feedback provided by the back-to-back inverters enables a rapid voltage transition. It effectively reduces the time spent in the input and output voltage ranges between $V_{thn}$ and $V_{DD} - V_{thp}$. As a result, the short-circuit current in the oscillator itself and in the succeeding buffers, from which conventional low-frequency current-starved ring oscillators suffer, is minimized [26].

Due to the sharp transition enabled by the latch, the period of the oscillation is determined by the bias currents $I_{biasn}$ and $I_{biasp}$. Therefore, a DCO can be built from the ring oscillator using these delay cells by driving the bias voltages $V_{biasn}$ and $V_{biasp}$ with a voltage-mode DAC. However, designing such a circuit with nanowatt-level power consumption could result in a huge chip area, which contradicts the design specification. Therefore, the ring oscillator is biased with a current-mode DAC with approximately 50 pA resolution. The bias circuit for the DAC is introduced in the following.

### 4.4.2. Subthreshold PTAT bias circuit

The schematic of the subthreshold current bias circuit is shown in figure 4.16. All of the transistors in the circuit work in the deep subthreshold region. It provides a 4 nA PTAT current with a start-up



Figure 4.16: Schematic of the subthreshold PTAT bias circuit.



Figure 4.17: (a) Temperature dependency of the PTAT bias current. (b) Supply voltage dependency of the PTAT bias current.

circuit ensuring appropriate operation under all conditions. By equating $V_{GS,M_1} + R \cdot I_{ref}$ and $V_{GS,M_2}$, the current can be derived and it is given by

$$I_{ref} = \frac{V_t}{R} \ln(\frac{W_1/L_1}{W_2/L_2}),\tag{4.4}$$

where  $V_t$  is the thermal voltage. A large P-poly resistor with 8 M $\Omega$  resistance is used to make sure the current is small. The resistor has a 3-bit binary tuning bank and thus can be changed from 6.5 M $\Omega$  to 10 M $\Omega$  with a step of 0.5 M $\Omega$  to counter its own process variations. This tunability is proven to be necessary by the DCO range simulations shown in figure 4.11.

The simulated bias current with $R=8~\mathrm{M}\Omega$ against supply and temperature changes is shown in figure 4.17. As observed from figure 4.17a, the current changes from 4.03 nA to 4.80 nA over a


| Temp. [°C] | ss corner | tt corner | ff corner |
|------------|-----------|-----------|-----------|
| -40        | 4.6       | 3.8       | 3.1       |
| 25         | 2.7       | 2.1       | 1.2       |
| 125        | 0.4       | 0.2       | 0.1       |

Table 4.3: Start-up time of the current reference in milliseconds across corners and temperatures.

temperature range of -40 °C to 125 °C with a nominal value of 4.07 nA at 25 °C. Its instability is within  $\pm 0.67$  % with a supply variation from 0.6 V to 0.8 V.

The start-up time is summarized in table 4.3, which proves that the reference works under all conditions. In addition,  $M_3$  and  $M_4$  are implemented with 80 parallel transistors each, so that a unit current of 50 pA can be mirrored to the DAC with a mirroring ratio of 80:1.
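Equation 4.4 and the 80:1 mirroring can be checked numerically with a short sketch. It assumes an ideal, temperature-independent resistor, so it does not reproduce the simulated values of figure 4.17. The size ratio `SIZE_RATIO` is a hypothetical value picked only so that the result lands near 4 nA, and it is not the ratio used in the design.

```python
import math

K_B = 1.380649e-23       # Boltzmann constant, J/K
Q_E = 1.602176634e-19    # elementary charge, C


def thermal_voltage(temp_c):
    """Thermal voltage V_t = kT/q in volts."""
    return K_B * (temp_c + 273.15) / Q_E


def ptat_current(temp_c, r_ohm, size_ratio):
    """Equation 4.4 evaluated with an ideal resistor."""
    return thermal_voltage(temp_c) / r_ohm * math.log(size_ratio)


R_BIAS = 8e6             # bias resistor from the text, ohms
SIZE_RATIO = 3.5         # hypothetical (W1/L1)/(W2/L2), not a design value
MIRROR_RATIO = 80        # mirroring ratio from the text

i_ref = ptat_current(25.0, R_BIAS, SIZE_RATIO)
i_unit = i_ref / MIRROR_RATIO
print(f"I_ref = {i_ref * 1e9:.2f} nA, unit current = {i_unit * 1e12:.1f} pA")
```

In this idealized model the current scales with absolute temperature. The simulated range quoted above is narrower, which suggests that other temperature dependencies (for example that of the resistor) also play a role.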

### **4.4.3.** Sigma-Delta modulator

Even with the DAC current resolution of 50 pA, the DCO integer bits can only achieve a coarse resolution of 2 kHz. According to figure 3.8, the Allan deviation floor with such a coarse DCO resolution and a comparator flicker noise budget of 1 $\mu$V/$\sqrt{\rm Hz}$ at 10 Hz cannot meet the specification. The plot also indicates that an 8× finer resolution of 250 Hz (i.e., 3 additional fractional bits) keeps the Allan deviation floor low enough. Further improving the DCO integer bit resolution requires a finer DAC current resolution, which in turn requires a smaller bias current. To make the bias current smaller, at least an $8\times$ larger bias resistor (64 M$\Omega$) is needed according to equation 4.4. Such a large resistor would easily dominate the chip area. Moreover, building a monotonic DAC with 11-bit resolution would be at the limit of feasibility and would in any case occupy a large area.

Figure 4.18: Illustration of the DCO high rate dithering.

Figure 4.19: Block diagram of the third-order MASH $\Sigma\Delta$ modulator.

In order to improve the resolution without sacrificing chip area, a time-domain high-rate dithering technique is adopted, which is illustrated in figure 4.18. Instead of applying a constant input that would select current $I_1$ or $I_2$, where $I_2 = I_1 + \Delta I$, with $\Delta I = 50$ pA being the LSB step of the DAC, during an integer bit update cycle, the selection alternates between $I_1$ and $I_2$ several times during the cycle. In the example of figure 4.18, $I_2$ is chosen 1/8 of the time and $I_1$ is chosen for the remaining 7/8 of the time. Consequently, the averaged current value will be $I_1 + \Delta I/8$. It should be mentioned that the resolution of the time-averaged value is related to the dithering speed. The dithering rate should be higher than the integer bit update rate multiplied by the resolution enhancement factor (8 in this case) [27].
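The time-averaging argument can be reproduced with a few lines of code. The sketch below uses a simple accumulator to decide in which of the 8 slots $I_2$ is selected. That placement rule is an illustrative assumption, since the averaging only requires that $I_2$ be selected for the right fraction of the time. The baseline current `I1` is an arbitrary value.

```python
def dither_pattern(frac, denom=8):
    """Return `denom` selections (1 = I2, 0 = I1) with `frac` ones,
    generated by the carry-out of a modulo-`denom` accumulator."""
    acc, pattern = 0, []
    for _ in range(denom):
        acc += frac
        carry = acc >= denom
        if carry:
            acc -= denom
        pattern.append(1 if carry else 0)
    return pattern


I1 = 2.00e-9             # assumed baseline current, A
DELTA_I = 50e-12         # DAC LSB step from the text, A

pattern = dither_pattern(frac=1)
avg_current = sum(I1 + DELTA_I * sel for sel in pattern) / len(pattern)
print(pattern, avg_current)
```

For `frac=1` exactly one of the eight slots selects $I_2$, so the average is $I_1 + \Delta I/8$, matching the example of figure 4.18.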

The frequency pattern shown in figure 4.18 is not randomized, and consequently a spur would be created in the frequency spectrum. This spur is not critical to the DFLL because it is averaged out when the DFLL is used to generate a wakeup signal with a long period. However, since the DFLL could also be used for other blocks in the IoT node, a spurious-free spectrum is desirable. In order to reduce this spur, higher-order $\Sigma\Delta$ modulation can be employed. The modulator used in this design is a third-order multi-stage noise shaping (MASH) one proposed in [28], and it is shown in figure 4.19. It consists of three accumulators, three delay logics, and one combiner logic. The function it performs is given by

$$D_{out}(z) = f_{in}(z) + (1 - z^{-1})^3 E_{q3}(z), \tag{4.5}$$

where the  $D_{out}(z)$  is the dithered output, the  $f_{in}$  is the fractional input and the  $E_{q3}$  is the quantization error of the third accumulator.

The accumulator data width is 3 bits, which can be concatenated with the integer bit to represent a fractional resolution of 1/8. The modulator is clocked at  $f_{out}/2$ , which is  $16 \times$  the integer bit update frequency (or the DLF clock frequency since the decoder is not clocked). The clock speed is  $2 \times$  higher than the lowest possible dithering clock, which ensures the quality of resolution enhancement.

Figure 4.20 shows the comparison of the DFLL output phase noise with different SDM configurations. As observed in the figure, the first-order $\Sigma\Delta$ modulator almost removes the original quantization spur. However, it also introduces spurs at higher frequencies due to its deterministic pattern in the frequency domain. The third-order $\Sigma\Delta$ modulator removes these high-frequency spurs by further randomizing the pattern with respect to the first-order one. The effect of the third-order $\Sigma\Delta$ modulator in resolution enhancement is also checked with Allan deviation simulation as shown in figure 4.21. Thanks to the noise shaping, the quantization energy is moved to higher frequencies, resulting in an Allan deviation much lower than the one of $f_{res} = 2$ kHz. With the $\Sigma\Delta$ modulator, the Allan deviation shares the same floor as that of $f_{res} = 250$ Hz, but it is higher for $\tau < 0.1$ s because the quantization energy is moved to higher frequencies. This plot is simulated with the phase noise shown in figure 4.13, so the DCO intrinsic phase noise is small enough to meet the Allan deviation specifications.

The combiner logic in [28] is plotted in figure 4.22a. The three one-bit carry-out data streams are combined such that the resulting dithered output fulfills the $\Sigma\Delta$ characteristics. Its mathematical




Figure 4.20: Phase noise plot showing the effect of the first-order and third-order  $\Sigma\Delta$  modulators.



Figure 4.21: Allan deviation plot showing the effect of the third-order  $\Sigma\Delta$  modulator.



Figure 4.22: (a) Original combiner logic with signed summation. (b) Improved combiner with D flip-flops only.

representation is given by

$$D_{out}(z) = ovf_1 + ovf_2(1 - z^{-1}) + ovf_3(1 - z^{-1})^2, \tag{4.6}$$

where $ovf_1$, $ovf_2$, $ovf_3$ are the carry-outs of the first, second, and third accumulator respectively. The arithmetic operation requires logic which is able to handle signed summation. The signed output must be further added to the DLF output and then sent into the DAC. This may result in a non-trivial digital circuit with significant power overhead. Therefore, an improved version of the combiner, proposed in [27], is used. As shown in figure 4.22b, only D flip-flops (DFFs) with complementary outputs are now required. It should be noted that, due to the lack of signed operation, a DC offset of 3 fractional bits is now introduced. However, this DC offset will be canceled by the FLL operation. The outputs of the improved combiner can directly control the switches of the unary DAC current cells, making the DAC design much simpler.
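A behavioral sketch of a third-order MASH modulator with the signed combiner of equation 4.6 is given below. It is not the synthesized implementation and does not model the DFF-only combiner. It assumes 3-bit accumulators that start from zero and that each stage accumulates the residue of the previous stage within the same clock cycle. The function name `mash111` is hypothetical.

```python
def mash111(frac, n_steps, width=3):
    """Third-order MASH with `width`-bit accumulators.
    Returns the signed combiner output (equation 4.6) and the residue of
    the third accumulator for every clock cycle."""
    mod = 1 << width
    s1 = s2 = s3 = 0
    c2_d1 = c3_d1 = c3_d2 = 0           # delayed carry-outs
    out, res3 = [], []
    for _ in range(n_steps):
        s1 += frac
        c1, s1 = s1 // mod, s1 % mod    # carry-out and residue, stage 1
        s2 += s1
        c2, s2 = s2 // mod, s2 % mod    # stage 2 accumulates stage-1 residue
        s3 += s2
        c3, s3 = s3 // mod, s3 % mod    # stage 3 accumulates stage-2 residue
        out.append(c1 + (c2 - c2_d1) + (c3 - 2 * c3_d1 + c3_d2))
        res3.append(s3)
        c2_d1, c3_d2, c3_d1 = c2, c3_d1, c3
    return out, res3


y, s3 = mash111(frac=3, n_steps=4096)
mean_out = sum(y) / len(y)
print(mean_out, min(y), max(y))
```

In this model the long-run mean of the output approaches the fractional input divided by 8, and the residual error is the third-order difference of the last accumulator residue, in line with equation 4.5. The signed output spans more than one bit, which is the property that motivates the offset-based combiner described above.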

# **4.5.** Digital Loop Filter

The DFLL requires a low pass filter to generate a stable output frequency. Therefore, the DLF is implemented as an up/down counter, which provides a first order low pass function. The DLF is




Figure 4.23: Block diagram of the DLF and its connection with the DCO.

synthesized with the Verilog-HDL language, and its block diagram and the connection with the DCO are shown in figure 4.23.

The DLF takes the comparator output as the input of an up/down counter, up/dn. The 18-bit adder/subtractor logic is configured as an accumulator with an 18-bit configurable input var[17:0]. Using the most significant bit (MSB) of the accumulator output, o[18], and the up/dn signal, the control logic detects any possible overflow or underflow in the low 18-bit output of the accumulator, o[17:0], and clips it if necessary. Finally, the 18-bit clipped data, d[17:0], is synchronized with an array of DFFs, whose output becomes the DLF output dout[17:0].

The high 8 bits of the DLF output, dout[17:10], are taken by the binary-to-thermometer (B2T) decoder in the DCO input front-end, and become the 8 DCO integer bits. The following 3 bits, dout[9:7], are taken by the $\Sigma\Delta$ modulator, and become the 3 fractional bits. The low 7 bits, dout[6:0], are not used. By changing the configurable input var[17:0], the DLF gain has a configurable range from $2^{-11}$ to $(2^8-2^{-11})$, and a resolution of $2^{-11}$, in the unit of the DCO integer LSB. The DCO front-end is synthesized together with the DLF using Verilog-HDL, and the decoder is implemented in a row-column fashion to save power and area.
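The accumulate, clip, and slice behavior described above can be summarized in a behavioral sketch. It is written in Python for illustration only and is not the Verilog source of the design. The polarity of up/dn (1 means increment), the mid-scale start value, the step `var=128`, and the function names are assumptions.

```python
WIDTH = 18
MAX_CODE = (1 << WIDTH) - 1


def dlf_step(acc, up, var):
    """One clock of a saturating up/down accumulator (behavioral model)."""
    o = acc + var if up else acc - var  # intermediate result, may leave range
    return max(0, min(MAX_CODE, o))     # clip on overflow or underflow


def split_dout(dout):
    """Slice dout[17:0] into the integer, fractional, and unused fields."""
    integer = (dout >> 10) & 0xFF       # dout[17:10] -> B2T decoder
    frac = (dout >> 7) & 0x7            # dout[9:7]   -> sigma-delta modulator
    unused = dout & 0x7F                # dout[6:0]
    return integer, frac, unused


acc = 1 << 17                           # assumed mid-scale start value
for decision in [1, 1, 0, 1]:           # example comparator outputs
    acc = dlf_step(acc, up=bool(decision), var=128)
print(split_dout(acc))
```

Because the accumulator saturates instead of wrapping, a long run of identical comparator decisions cannot flip the DCO code from one end of its range to the other.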

### **4.6.** Clock divider

In figure 4.1, there are 7 clocks (purple color) required by the DFLL internally. Some of them must be non-overlapping, while the others may require some delay to secure the correct DFLL operation. In order to generate the required clock signals, the divider is implemented with two separated blocks, namely the multi-phase divider and the non-overlap clock generator.

The block diagram of the multi-phase divider is shown in figure 4.24. It contains two types of DFFs that can be reset to different initial states, as indicated on the DFF symbols. All



Figure 4.24: Block diagram of multi-phase divider.



Figure 4.25: Block diagram of the non-overlap clock generator.




Figure 4.26: Waveforms of the timing of the multi-phase divider and the non-overlap clock generator.

of the DFFs are configured in a divide-by-two fashion. Thus, the clock for the $\Sigma\Delta$ modulator can be generated from the DCO output using a single 1/2 divider. By cascading four 1/2 dividers, the $f_{out}/16$ signal is generated. By changing the types of DFFs, constant phase shifts are created in the $f_{out}/16$ outputs, which will be used by the non-overlap clock generator.

Figure 4.25 shows the block diagram of the non-overlap clock generator. It processes the signals from the multi-phase divider, and generates the clocks required by the FD and DLF. Its input and resulting clocks are plotted together in figure 4.26. As indicated in the waveform, the $\Phi_1$, $\Phi_2$, and $\Phi_3$ signals are separated by one DCO clock cycle, while the positive half cycle of $\Phi_2$ is set to $16/f_{out}$. Since the $\Phi_2$ duration of $16/f_{out}$ is evaluated by the FD, the accuracy of this division directly affects the accuracy of the DFLL nominal output frequency, $f_{nom}$. The 1/16 division ratio is checked with corner and temperature simulations, and the worst-case error is 0.75 ppm, which is small enough to be negligible.
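The division ratio of four cascaded divide-by-two stages can be confirmed with a small behavioral model. The sketch assumes a ripple chain of rising-edge toggle flip-flops that all start at 0. The text does not specify whether the actual divider is rippled or synchronous, and the phase-shifted outputs obtained with the two DFF reset types are not modeled.

```python
def ripple_divider(n_input_cycles, n_stages=4):
    """Count rising edges at the output of a chain of divide-by-two DFFs."""
    state = [0] * n_stages
    rising_edges = 0
    for _ in range(n_input_cycles):
        clock_edge = True               # rising edge at the chain input
        for i in range(n_stages):
            if not clock_edge:
                break
            state[i] ^= 1               # toggle on a rising clock edge
            clock_edge = state[i] == 1  # a 0->1 toggle clocks the next stage
        if clock_edge:
            rising_edges += 1           # the last stage produced a rising edge
    return rising_edges


F_OUT = 416.7e3                         # nominal output frequency, Hz
edges = ripple_divider(16 * 25)
phi2_high = 16 / F_OUT                  # positive half cycle of Phi_2, seconds
print(edges, phi2_high)
```

Over 400 input cycles the last stage rises 25 times, which corresponds to the 1/16 ratio of the text.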

### 4.7. Layout overview

Figure 4.27 shows the chip-level layout and the zoomed-in DFLL layout, where major blocks are marked with rectangles showing their dimensions. Since the ring oscillator is the core of the DFLL, it is put at the center of the layout. The comparator is close to the RC network and far away from the ring oscillator, to limit the disturbance on $V_{ref}$ caused by coupling from other signals. Because the 262 input signals of the DAC current cells are provided by the decoder and $\Sigma\Delta$ modulator, which are synthesized together with the DLF, the DAC and DLF are next to each other. Moreover, the subthreshold biasing, DAC, and ring oscillator are put close to each other to ensure small voltage drops on $V_{bias}$, $V_{biasp}$, and $V_{biasn}$. The layout occupies approximately 0.07 mm<sup>2</sup> chip area, which fulfills the area requirement.



Figure 4.27: Layout of the chip.

Table 4.4 summarizes the configurations in the DFLL that can be changed through a Serial Peripheral Interface (SPI) bus. The measurement results with different configurations will be covered in the next chapter.

| Register      | Function                                               |
|---------------|--------------------------------------------------------|
| TC_cal[3:0]   | TC calibration of $R_{ref}$                            |
| c_cal[9:0]    | Trimming bits of $C_{ref}$                             |
| chop_sel[1:0] | Chopping frequency selection                           |
| var[17:0]     | DLF gain factor $K_{DLF}$ configuration                |
| en_sdm        | $\Sigma\Delta$ modulator enable/disable selection      |
| sdm_fctrl     | $\Sigma\Delta$ modulator dithering frequency selection |
| dco_cal       | Calibration of the DCO current biasing                 |
| mode          | DCO locking/free running mode selection                |
| decoder[7:0]  | Decoder value for DCO free running mode                |
| rst_dig       | Reset for the digital blocks (e.g., DLF)               |
| rst_osc       | Ring oscillator hard reset for start-up                |

Table 4.4: Summary of the configurable functions in the DFLL.

# Measurement Results

# **5.1.** Chip micrograph

Figure 5.1 shows the photo of the fabricated chip of the proposed oscillator in the TSMC 40-nm process, where major blocks are marked with rectangles.



Figure 5.1: Die micrograph of the fabricated chip in the 40-nm process.

# **5.2.** Measurement setup

The proposed oscillator is tested with different supply voltages at different temperatures. The block diagram of the measurement setup is shown in figure 5.2, and the function of each piece of equipment is described in the following.



Figure 5.2: Block diagram of the measurement setup.



Figure 5.3: PCBs used for measurement.

- Test PCB (shown in figure 5.3 left) with a wire-bonded chip on it. Low-dropout regulators (LDOs) provide the voltages required by the oscillator, and the power consumption of three major blocks (FD, digital, and DCO) can be measured separately. Level shifters (not drawn) are employed to deal with the high voltage I/O signals from the field-programmable gate array (FPGA) board.
- FPGA board (shown in figure 5.3 right) for communication with the PC and chip configuration. On the board, a USB232 chip receives the information from the PC and translates it into the RS232 format, which will be processed by the FPGA. The FPGA chip acts as an SPI master and sets the oscillator to different test modes.
- CTS climate chamber to test the chip over temperature. It is capable of changing the temperature from -40 °C to 125 °C with an inaccuracy better than  $\pm 0.3$  °C.
- Agilent power supply E3631 providing required voltages for the test PCB.
- Keithley source meter for power consumption measurement.
- Agilent 53230A frequency counter for Allan deviation and period jitter measurement.
- Rohde & Schwarz FSV spectrum analyzer for debugging and start-up time measurement.

In the measurement process, if not explicitly specified, the default settings for the trimming bits and chip functions are

- resistor bank, 001;
- capacitor bank, 0111111111;
- DLF gain, 1/8 of the DCO integer LSB;
- chopping function, on @  $f_{out}/256$ ;
- $\Sigma\Delta$  function, on @  $f_{out}/2$ .

#### **5.3.** Measurement results

#### **5.3.1.** Frequency accuracy vs. temperature variation

The chip is measured over a temperature range from -20 °C to 80 °C, and the results with different chopper and $\Sigma\Delta$ modulator settings are plotted in figure 5.4. The temperature range is reduced from the original range of -40 °C to 125 °C for two reasons. First, the wire-bonded chip cannot work safely at temperatures over 80 °C, as shown by two chips failing during the characterization. Second, although the claimed lower limit of the oven is -45 °C, it never settled to temperatures below -36 °C during the measurement. Given also the limited testing time available, the lower temperature limit was chosen to be -20 °C.

In figure 5.4, the nominal output frequency is 416.7 kHz. Enabling the chopper slightly improves the accuracy at low temperatures (0.1 % at -20 °C). A possible interpretation is that the comparator offset changes dramatically at low temperatures and the chopper cancels it. Due to the intrinsic noise in the circuit, the locking accuracy is not affected by the DCO resolution. Consequently, the accuracy is not improved by enabling the $\Sigma\Delta$ modulator, and the best TC is 106 ppm/°C for the results with the chopper enabled.



Figure 5.4: Frequency accuracy vs. temperature variation.

The TC of the output frequency is much larger than that of the compensated resistor in the simulation (26 ppm/°C). Changing the TC calibration bits in the resistor bank changes the TC of the frequency. However, the results shown in the plot are the ones after such calibration. The excess temperature dependency could be caused by the temperature-sensitive leakage and resistance of the switches both in the network and in the trimming banks. As supporting evidence, it is observed that changing the trimming bits of the capacitor bank also changes the TC, and the bits are set accordingly to ensure the best TC (106 ppm/°C).

#### **5.3.2.** Frequency accuracy vs. supply variation

A good wakeup timer should have high supply rejection, so that there is no need for any additional LDOs on the system level. Theoretically, the oscillation frequency is not sensitive to supply changes due to the supply-independent RC time constant. However, the active circuits, which rely on certain supply voltages, will deteriorate the frequency accuracy.

During the measurement, the oscillator shows the capability of working under lower supply voltages (0.7 V) with respect to simulation (0.8 V), which fits the purpose of this work. Therefore, the supply measurement range is changed accordingly. Figure 5.5 shows the measured frequency accuracy versus supply changes. With chopper and  $\Sigma\Delta$  modulator both active, the frequency shows an inaccuracy of  $\pm 0.6$  % with supply changes from 0.65 V to 0.8 V. Ideally, the frequency should be insensitive to supply changes because of the supply independent RC time constant. Therefore, enabling the chopper and  $\Sigma\Delta$  modulator settings will not improve the stability because the time constant is not changed, and the comparator offset voltage is not strongly supply dependent. The excess supply dependency could be caused by the supply sensitive leakage and resistance of the switches both in the network and the trimming banks. The capacitor bank switches could be a major contributor to the excess dependency due to their high threshold voltage transistors, and it is also observed in this measurement that changing capacitor bank trimming bits results in the change of the supply dependency.



Figure 5.5: Frequency accuracy vs. supply variation.

#### **5.3.3.** Allan deviation

Figure 5.6 shows the Allan deviation results of the oscillator with different settings. The measurement shows  $1.5\times$  smaller Allan deviation floor when enabling the chopper. This is due to the reduction of the flicker noise in the comparator, which is the major contributor of the flicker frequency modulation in the output frequency. The  $\Sigma\Delta$  modulator improves both the Allan deviation in the thermal noise process  $(\tau^{-1/2})$  and the floor  $(\tau^0)$  with finer DCO resolution, and this result



Figure 5.6: Allan deviation measurement results.

is consistent with the simulation in section 4.4.3, and the reason for this improvement has been discussed in section 3.2.1.

The ripples in the Allan deviation plot come from the single frequency modulation of the DFLL operation [29]. The ripple is not seen in the MATLAB simulation plots because the number of simulated data points is limited. With chopper and $\Sigma\Delta$ modulator both enabled, the Allan deviation is smaller than 20 ppm for $\tau > 10$ s, and smaller than 9 ppm for $\tau > 90$ s.
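For readers who want to reproduce such a plot from raw counter readings, the following is a minimal sketch of the non-overlapping Allan deviation estimator. It is neither the algorithm used by the frequency counter nor the MATLAB script of this work, and the linear-drift input is synthetic data chosen because its Allan deviation is known in closed form.

```python
import math


def allan_deviation(y, m):
    """Non-overlapping Allan deviation of fractional-frequency samples `y`
    for an averaging factor m (tau = m * tau0)."""
    n_bins = len(y) // m
    if n_bins < 2:
        raise ValueError("need at least two averaging bins")
    means = [sum(y[i * m:(i + 1) * m]) / m for i in range(n_bins)]
    sq_diffs = [(b - a) ** 2 for a, b in zip(means, means[1:])]
    return math.sqrt(sum(sq_diffs) / (2 * (n_bins - 1)))


# Synthetic check: a linear drift of 1 ppm per sample (assumed data).
drift = [1e-6 * i for i in range(1000)]
adev_10 = allan_deviation(drift, m=10)
print(adev_10)
```

For a pure linear drift of slope $a$ per sample, adjacent bin averages differ by $a \cdot m$, so the estimator returns $a \cdot m / \sqrt{2}$.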

#### **5.3.4.** Power consumption

The power consumption of the oscillator is 181 nW, and its breakdown is shown in figure 5.7. With the frequency of 416.7 kHz, the oscillator reaches an efficiency of 0.43 pJ/Cycle, which is the best among the state-of-the-art low-power ( $<1 \mu W$ ) RC oscillators.
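The efficiency figure follows directly from the measured power and output frequency:

$$E_{cycle} = \frac{P}{f_{out}} = \frac{181\ \mathrm{nW}}{416.7\ \mathrm{kHz}} \approx 0.43\ \mathrm{pJ/Cycle}$$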



Figure 5.7: Oscillator power breakdown.

#### **5.3.5.** Other results

Under the nominal DLF gain configuration,  $K_{DLF} = 1/8$ , the measured period jitter is 15.8 ns, which is 0.66 % of the oscillation period.
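As a check of the relative jitter, assuming the nominal output frequency of 416.7 kHz:

$$T_{out} = \frac{1}{416.7\ \mathrm{kHz}} \approx 2.40\ \mu\mathrm{s}, \qquad \frac{15.8\ \mathrm{ns}}{2.40\ \mu\mathrm{s}} \approx 0.66\ \%$$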

Compared to relaxation oscillators, one of the shortcomings of FLL-based oscillators is their long start-up time. By varying the DLF gain, the bandwidth of the DFLL can be changed. Consequently, the start-up time of the DFLL can be configured according to application requirements. Due to the changed loop dynamics, the jitter performance also changes. The start-up time and jitter measurement results with different DLF gains are shown in figure 5.8.



Figure 5.8: Period jitter and start-up time with different DLF gain.

Figure 5.9 shows the measured output spectrum with the $\Sigma\Delta$ function on and off (top), and the settling behavior of the DFLL under various gains ($K_{DLF} = 1/4$, $1/8$, and $1/16$; bottom). The spectrum plot shows the effect of the $\Sigma\Delta$ noise reduction at close-in frequencies, and the settling plot is measured using the analog demodulation feature of the spectrum analyzer.

### 5.4. Conclusion and summary

The performance of the proposed oscillator is compared with the other low-power ( $<1~\mu W$ ) RC oscillators in table 5.1. Thanks to the digital-intensive architecture, it works with the lowest supply voltage. With the digital circuit operating at a low frequency, the proposed oscillator achieves the best power efficiency. While keeping an on-par long-term stability, the chip area is also smaller than those of the other FLL based oscillators. A benchmark plot with the Allan deviation floor and energy efficiency of different oscillators is shown in figure 5.10.



Figure 5.9: Output spectrum and settling behavior.



Figure 5.10: Benchmark: Allan deviation floor versus energy efficiency.

Table 5.1: Performance summary and comparison with state-of-the-art.

| Area [mm<sup>2</sup>] | Allan deviation floor [ppm] | TC [ppm/°C]<br>(temperature range) | VDD range | Variation with VDD [%] | Energy/Cycle [pJ/Cycle] | Power [nW] | VDD [V]     | Frequency [kHz] | Process [nm] | Architecture             | Reference                 |
|-------------------------|--------------------------------|---------------------|---------------|----------------|----------------------------|------------|-------------|-----------------|--------------|--------------------------|---------------------------|
| 0.07                    | 9 (>100 s)                     | 106<br>-20 - 80 °C  | 0.65 - 0.8 V  | $\pm 0.6$      | 0.43                       | 181        | 0.7         | 417             | 40           | DFLL                     | This Work                 |
| 0.005                   | -                              | 96<br>0 - 145 °C    | 0.9 - 2 V     | $\pm 0.54$     | 0.68                       | 920        | 1.4         | 1350            | 65           | Relaxation<br>Oscillator | Savanth [11]<br>ISSCC'17  |
| 0.032                   | 20 (>100 s)                    | 85<br>-40 - 90 °C   | 0.95 - 1.05 V | $\pm 0.25$     | 6.5                        | 130        | 1           | 18.5            | 65           | Relaxation<br>Oscillator | Paidimarri [8]<br>JSSC'16 |
| 1.08                    | 60 (>100 s)                    | 148<br>-20 - 80 °C  | 0.6 - 0.9 V   | $\pm 0.27$     | 11.8                       | 75.6       | 0.8         | 6.4             | 250          | Relaxation<br>Oscillator | Wang [4]<br>JSSC'16       |
| 0.26                    | 7 (>12 s)                      | 34.3<br>-40 - 80 °C | 1.2 - 1.8 V   | $\pm 0.23$     | 1.56                       | 110        | 1.3         | 70.4            | 180          | Analog<br>FLL            | Choi [3]<br>JSSC'16       |
| 0.5                     | 63 (>100 s)                    | 13.8<br>-25 - 85 °C | 0.85 - 1.4 V  | $\pm 0.14$     | 1.6                        | 4.7        | 0.85 - 1.4  | 3               | 180          | Analog<br>FLL            | Jang [12]<br>ISSCC'16     |
| 0.015                   | 4 (>2 s)                       | 38<br>-20 - 90 °C   | 1.1           | $\pm 0.14$     | 5.8                        | 190        | 1.15 - 1.45 | 33              | 65           | Relaxation<br>Oscillator | Griffith [10]<br>ISSCC'14 |
| 0.12                    | -                              | 105<br>-40 - 90 °C  | 0.725 - 0.9 V | $\pm 0.82$     | 2.8                        | 280        | 0.8         | 100             | 90           | Relaxation<br>Oscillator | Tokairin [9]<br>VLSI'12   |
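The energy-per-cycle column follows directly from the power and frequency columns. The short sketch below shows the unit conversion for two rows of table 5.1; the function name `energy_per_cycle_pj` is only an illustrative choice and is not part of the original work.

```python
def energy_per_cycle_pj(power_nw: float, freq_khz: float) -> float:
    """Energy per output cycle in pJ.

    nW / kHz = 1e-9 W / 1e3 Hz = 1e-12 J, so the ratio is already in pJ.
    """
    return power_nw / freq_khz


# Values taken from table 5.1.
this_work = energy_per_cycle_pj(181, 417)    # proposed DFLL
tokairin = energy_per_cycle_pj(280, 100)     # Tokairin [9]

print(f"This work: {this_work:.2f} pJ/cycle")
print(f"Tokairin [9]: {tokairin:.2f} pJ/cycle")
```

Because the ratio of nW to kHz is directly in pJ, the tabulated power and frequency values can be divided without further scaling.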

## Conclusion and Future Work

#### **6.1.** Conclusion

The duty-cycled IoT node can achieve extremely low power consumption, but it requires an accurate wakeup timer for synchronization. Such a timer must be compatible with a standard CMOS process and occupy minimal area to save cost and size. Since it is continuously active, the timer must consume ultra-low power (<1 µW) while operating at the lowest possible supply voltage, both for compatibility with a wide range of energy sources (e.g., button batteries, energy scavengers) and to simplify the power management. Because of the size and power limitations, the RC-based oscillator is the preferred choice. However, the stability of conventional RC relaxation oscillators is limited by the delay of power-hungry continuous-time comparators, which are vulnerable to PVT variations and noise. FLL-based solutions circumvent such limitations, but they rely heavily on analog-intensive circuits, which require significant power, area, and high supply voltages. Hence, they do not scale well with technology in terms of either area or required supply voltage.

This thesis presents a wakeup timer employing a DFLL architecture to fully exploit the advantages of advanced CMOS processes. A self-biased  $\Sigma\Delta$  digitally controlled oscillator (DCO) is locked to an RC time constant via a chopped dynamic comparator and a digital loop filter. In measurements, the proposed timer achieves the best power efficiency (0.43 pJ/Cycle) at the lowest supply voltage (0.7 V) with respect to the state-of-the-art, while keeping on-par long-term stability (Allan deviation floor below 10 ppm) in a small area (0.07 mm² in 40-nm CMOS). Furthermore, the timer shows a TC of 106 ppm/°C over the -20 °C to 80 °C temperature range, and a frequency variation of  $\pm 0.6\%$  for supply voltages from 0.65 V to 0.8 V. The TC and the line regulation are comparable to the state-of-the-art, but they are worse than expected from the RC network. Based on the measurements, the switches in the RC network and the trimming banks are identified as the probable cause of the performance degradation; the exact reason is still under investigation and should be checked by comprehensive system-level simulations. Since the trimming bits are optimized for minimizing the TC, the frequency is not calibrated to the target value (512 kHz). This is a drawback of this design, but it can be circumvented in the IoT node by appropriately tuning the counting circuit cascaded to the proposed timer.
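As a minimal illustration of how a counting circuit could absorb the frequency offset, the sketch below computes the counter target for a given wakeup interval from the actual timer frequency. The function `counter_target` and the 1 s interval are hypothetical and only serve as an example; the thesis does not specify the counter implementation.

```python
def counter_target(interval_s: float, f_timer_hz: float) -> int:
    """Number of timer cycles to count for one wakeup interval."""
    return round(interval_s * f_timer_hz)


# Hypothetical 1 s wakeup interval.
n_target_freq = counter_target(1.0, 512e3)    # if the timer ran at the 512 kHz target
n_actual_freq = counter_target(1.0, 417e3)    # with the measured 417 kHz output

# Interval actually produced when the counter is tuned to the measured frequency.
interval_s = n_actual_freq / 417e3

print(n_target_freq, n_actual_freq, interval_s)
```

The point of the sketch is only that the wakeup interval depends on the product of count and period, so a known frequency offset can be compensated digitally.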

#### **6.2.** Future work

The most straightforward improvement is redesigning the RC network. The redesigned RC network should fully cover the process variations of the resistors and capacitors with a high trimming accuracy. The TC calibration should be combined into the RC value trimming.

The temperature dependency of the custom-designed metal capacitors should be checked. Although unlikely, if they show a high temperature dependency, an alternative capacitor trimming bank should be designed. The switches in the RC network and the trimming banks are identified as the probable cause of the high instability over temperature and supply changes. One could investigate the contribution of the leakage and the resistance of each switch to the frequency instability, and then optimize them accordingly. Since it is hard for conventional analog switches to achieve both low leakage and low resistance at low supply voltages, methods such as bootstrapping could be a solution for optimizing the switches.

The digital circuits consume 38 % of the total system power, of which 50 % is due to leakage (simulation result). In this design, they are synthesized with high-threshold-voltage transistors. The synthesis could instead be done with thick-oxide transistors, which have smaller leakage. In addition, the simulation shows that the 8-bit binary-to-thermometer decoder consumes over 50 % of the digital active power (>9 % of the system power). Segmentation of the DAC could be implemented to save power in the decoder.
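The percentages above can be chained to check the decoder's share of the system power. The sketch below does this arithmetic, assuming (for illustration only) that the simulated fractions apply to the measured total of 181 nW; the variable names are not from the original work.

```python
total_nw = 181.0                   # measured total power (assumed base for the fractions)
digital_fraction = 0.38            # digital share of total power (simulation)
leakage_fraction = 0.50            # leakage share of digital power (simulation)
decoder_fraction_of_active = 0.50  # lower bound: decoder share of digital active power

digital_nw = total_nw * digital_fraction
leakage_nw = digital_nw * leakage_fraction
active_nw = digital_nw - leakage_nw
decoder_nw = active_nw * decoder_fraction_of_active
decoder_share = decoder_nw / total_nw

print(f"digital: {digital_nw:.1f} nW, leakage: {leakage_nw:.1f} nW")
print(f"decoder: >{decoder_nw:.1f} nW ({decoder_share:.1%} of the system)")
```

The product 0.38 × 0.5 × 0.5 gives 9.5 %, consistent with the ">9 %" figure quoted in the text.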



# Phase Noise, Period Jitter and Allan Deviation Conversions

#### **A.1.** Phase noise and Allan deviation conversion

As a reasonable and accurate model of frequency and phase instabilities in the frequency domain, the phase noise,  $\mathcal{L}(f)$ , is one half of the double-sideband spectral density of phase fluctuations,

$$\mathcal{L}(f) \equiv \frac{1}{2} S_{\phi}(f). \tag{A.1}$$

Furthermore,  $S_{\phi}(f)$  can be expressed as the sum of five independent noise processes representing the random fluctuations

$$S_{\phi}(f) = \begin{cases} \sum_{\beta=-4}^{0} v_0^2 h_{\alpha} f^{\beta}, & \text{for } 0 < f \le f_h, \\ 0, & \text{for } f \ge f_h, \end{cases} \tag{A.2}$$

where  $v_0$  is the nominal frequency,  $h_{\alpha}$  is a constant (with  $\alpha \equiv \beta + 2$ ),  $\beta$  is an integer, and  $f_h$  is the high-frequency cut-off of an infinitely sharp low-pass filter. The characteristics of the five processes in  $S_{\phi}(f)$  are plotted in figure A.1.



Figure A.1: Characteristics of the noise processes in spectral density of phase fluctuations.



Figure A.2: Characteristics of the noise processes in spectral density of fractional frequency fluctuations.



Figure A.3: Characteristics of the noise processes in Allan variance.

To relate the phase noise to the Allan deviation, the spectral density of fractional frequency fluctuations,  $S_{y}(f)$ , is first calculated from the phase noise:

$$S_{y}(f) = \frac{f^{2}}{v_{0}^{2}} S_{\phi}(f). \tag{A.4}$$

 $S_{y}(f)$  can be rewritten with the five noise processes as

$$S_{y}(f) = \begin{cases} \sum_{\alpha=-2}^{2} h_{\alpha} f^{\alpha}, & \text{for } 0 < f \le f_{h}, \\ 0, & \text{for } f \ge f_{h}, \end{cases} \tag{A.5}$$

where  $\alpha \equiv \beta + 2$ . The characteristics of these noise processes in  $S_{y}(f)$  are plotted in figure A.2.

In the time domain, oscillators are used together with counters, which average the frequency over a time  $\tau$ . This averaging process can be interpreted as a filtering operation with the transfer function  $H(f)$ . As a result, the M-sample time-domain frequency instability can be expressed as

$$\sigma^2(M, T, \tau) = \int_0^\infty S_{y}(f) |H(f)|^2 \,\mathrm{d}f, \tag{A.7}$$

where 1/T is the measurement rate. In the case of the two-sample variance without dead time,  $M = 2$  and  $\tau = T$  apply, so that  $\sigma^2(M,T,\tau)$  becomes the Allan variance and  $|H(f)|^2$  is  $2(\sin^4(\pi\tau f))/(\pi\tau f)^2$ . Thus, the Allan variance can be computed from

$$\sigma_y^2(\tau) = 2\int_0^{f_h} S_y(f) \frac{\sin^4(\pi \tau f)}{(\pi \tau f)^2} \,\mathrm{d}f,\tag{A.8}$$

where  $f_h$  is the high-frequency cut-off of an infinitely sharp low-pass filter. When  $2\pi f_h \tau \gg 1$ , equation A.8 can also be written in closed form using the noise processes:

$$\sigma_y^2(\tau) = h_{-2} \frac{(2\pi)^2}{6} \tau + h_{-1} 2 \ln 2 + h_0 \frac{1}{2\tau} + h_1 \frac{1.038 + 3 \ln (2\pi f_h \tau)}{4\pi^2 \tau^2} + h_2 \frac{3f_h}{4\pi^2 \tau^2}. \tag{A.9}$$
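The conversion can be sketched numerically. The example below evaluates the closed-form sum with the coefficients as tabulated in IEEE Std 1139 [30] (with $4\pi^2\tau^2$ in the denominator of both phase-modulation terms) and cross-checks the white frequency modulation term by integrating $S_y(f)$ with the weighting $|H(f)|^2 = 2\sin^4(\pi\tau f)/(\pi\tau f)^2$ stated above. The function names, the dictionary interface and the numerical values are assumptions made for this illustration.

```python
import math


def allan_variance(tau, h, f_h):
    """Closed-form Allan variance from power-law coefficients.

    h maps alpha (-2..2) to h_alpha; missing entries count as zero.
    Intended for 2*pi*f_h*tau >> 1.
    """
    def g(alpha):
        return h.get(alpha, 0.0)

    return (g(-2) * (2 * math.pi) ** 2 * tau / 6
            + g(-1) * 2 * math.log(2)
            + g(0) / (2 * tau)
            + g(1) * (1.038 + 3 * math.log(2 * math.pi * f_h * tau))
            / (4 * math.pi ** 2 * tau ** 2)
            + g(2) * 3 * f_h / (4 * math.pi ** 2 * tau ** 2))


def allan_variance_numeric(tau, s_y, f_h, steps=200_000):
    """Midpoint-rule integration of S_y(f) * |H(f)|^2 from 0 to f_h."""
    df = f_h / steps
    total = 0.0
    for k in range(steps):
        f = (k + 0.5) * df
        x = math.pi * tau * f
        total += s_y(f) * 2 * math.sin(x) ** 4 / x ** 2
    return total * df


# White frequency modulation only (h_0 = 1), tau = 1 s, f_h = 1 kHz.
closed = allan_variance(1.0, {0: 1.0}, 1e3)
numeric = allan_variance_numeric(1.0, lambda f: 1.0, 1e3)
print(closed, numeric)
```

For white frequency modulation the two results agree up to the small truncation error of the finite integration limit, which supports the $1/(2\tau)$ coefficient.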

| Description of noise process     | $S_{y}(f)$     | $S_{\phi}(f)$       | $\sigma_y^2(\tau)$ |
|----------------------------------|----------------|---------------------|--------------------|
| Random walk frequency modulation | $h_{-2}f^{-2}$ | $h_{-2}v_0^2f^{-4}$ | $Ah_{-2}\tau^1$    |
| Flicker frequency modulation     | $h_{-1}f^{-1}$ | $h_{-1}v_0^2f^{-3}$ | $Ah_{-1}\tau^0$    |
| White frequency modulation       | $h_0 f^0$      | $h_0 v_0^2 f^{-2}$  | $Ah_0\tau^{-1}$    |
| Flicker phase modulation         | $h_1f^1$       | $h_1 v_0^2 f^{-1}$  | $Ah_1\tau^{-2}$    |
| White phase modulation           | $h_2f^2$       | $h_2 v_0^2 f^0$     | $Ah_2\tau^{-2}$    |

Table A.1: Translation of frequency instability measures from spectral densities in frequency domain to variances in time domain.

The characteristics of these noise processes in the square root of the Allan variance (i.e., the Allan deviation,  $\sigma_{y}(\tau)$ ) are plotted in figure A.3.

Finally, table A.1 summarizes the coefficients of the conversion from the frequency domain to the time domain using the noise processes [30]. In reality, the random walk frequency modulation is very difficult to measure, because it is very close to the carrier. The Allan deviation floor ( $\tau^0$  term) is what is usually used to quantify the long-term stability of an oscillator. From the table, if one takes the square root of  $\sigma_y^2(\tau)$ , it is clear that this floor is caused only by the flicker frequency modulation, which is the  $f^{-3}$  term in  $S_{\phi}(f)$  and the  $f^{-1}$  term in  $S_y(f)$ .
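As a worked example of this relation, the flicker frequency modulation coefficient can be estimated from a given floor by using the  $h_{-1}\,2\ln 2$  term of the Allan variance. Taking the 10 ppm bound quoted for this design as an illustrative value:

$$\sigma_{y,\mathrm{floor}} = \sqrt{2\ln 2 \; h_{-1}} \quad\Rightarrow\quad h_{-1} = \frac{\sigma_{y,\mathrm{floor}}^{2}}{2\ln 2} \approx \frac{(10^{-5})^{2}}{1.386} \approx 7.2\times 10^{-11}.$$

This is only an order-of-magnitude sketch; it assumes that the floor is set purely by flicker frequency modulation.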

### **A.2.** Phase noise and period jitter conversion

By definition, period jitter,  $J_{PER}$ , compares two similar instants in time of a clock source such as two successive rising edges. Since the two instants are separated in time by approximately one period, it is reasonable to expect that higher frequency jitter components will contribute more to period jitter than lower frequency jitter components.

The basic relationship between  $J_{PER}$  and  $S_{\phi}(f)$  is given by

$$J_{PER} = \frac{T_0}{2\pi} \sqrt{\Delta \varphi_{rms}^2} = \frac{T_0}{\pi} \sqrt{\int_0^\infty S_{\phi}(f) \sin^2(\pi f \tau_0) \,\mathrm{d}f}, \tag{A.10}$$

where  $\tau_0 = T_0$  is the inverse of the carrier frequency. Therefore, the period jitter is the rms value obtained by integrating the phase noise with a frequency weighting factor  $4\sin^2(\pi f \tau_0)$ .

Using the weighting factor, the integration of phase noise is not sensitive to the contributions from offset frequencies well below the half carrier frequency. Moreover, the weighting factor reaches its first maximum at the half carrier frequency and becomes periodic thereafter. It is clear that phase noise in the vicinity of this offset will make a substantial contribution to the period jitter.
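The behavior of the weighting factor can be illustrated with a few sample offsets. The sketch below evaluates $4\sin^2(\pi f \tau_0)$ for the 417 kHz carrier of this design; the function name `jitter_weight` and the chosen offsets are illustrative assumptions.

```python
import math


def jitter_weight(f_offset_hz: float, f_carrier_hz: float) -> float:
    """Weighting factor 4*sin^2(pi*f*tau_0), with tau_0 = 1/f_carrier."""
    return 4.0 * math.sin(math.pi * f_offset_hz / f_carrier_hz) ** 2


f0 = 417e3
w_low = jitter_weight(f0 / 100, f0)   # offset well below half the carrier frequency
w_half = jitter_weight(f0 / 2, f0)    # first maximum, at half the carrier frequency
w_full = jitter_weight(f0, f0)        # first null, at the carrier frequency

print(w_low, w_half, w_full)
```

The weight is strongly suppressed at low offsets, peaks at half the carrier frequency, and then repeats with a period equal to the carrier frequency.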



# MATLAB behavior model of the DFLL

To predict the long-term stability (Allan deviation floor) of the DFLL, a z-domain MATLAB behavior model is built. The model is based on the time-domain PLL model in [31].

The RC network is treated as a frequency-to-voltage converter (FVC) described by equation 3.7. The comparator quantizes the voltage with a sign function. Depending on the sign of the comparator output, the DLF increases or decreases its internal value by a step of  $K_{DLF}$ . The DCO changes its output period according to the DLF output as

$$\Delta T = \frac{K_{DCO}}{f_0(f_0 + K_{DCO})}, \tag{B.1}$$

where  $f_0$  is the initial DCO frequency and  $K_{DCO}$  is the frequency deviation due to the DLF. The period deviation in equation B.1 differs from the one reported in [31], where an approximation of equation B.1 is used. However, such an approximation is only valid for  $K_{DCO} \ll f_0$ , which is not the case for the DFLL with its coarse  $K_{DCO}$ .
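The difference between the exact expression and a small-deviation approximation can be checked numerically. The sketch below assumes that the approximation meant here is $\Delta T \approx K_{DCO}/f_0^2$, and it uses hypothetical $K_{DCO}$ values chosen only to show the trend; neither is taken from the original model.

```python
def period_deviation_exact(f0_hz: float, k_dco_hz: float) -> float:
    """Equation B.1: 1/f0 - 1/(f0 + K_DCO)."""
    return k_dco_hz / (f0_hz * (f0_hz + k_dco_hz))


def period_deviation_approx(f0_hz: float, k_dco_hz: float) -> float:
    """Small-deviation approximation, assumed to be K_DCO / f0^2."""
    return k_dco_hz / f0_hz ** 2


f0 = 417e3
errors = {}
for k_dco in (1e2, 4e4):  # hypothetical fine and coarse frequency steps in Hz
    exact = period_deviation_exact(f0, k_dco)
    approx = period_deviation_approx(f0, k_dco)
    errors[k_dco] = (approx - exact) / exact
    print(f"K_DCO = {k_dco:g} Hz: relative error {errors[k_dco]:.2%}")
```

The relative error of the approximation equals $K_{DCO}/f_0$, so it is negligible for fine steps but approaches 10 % for a step of roughly a tenth of the oscillation frequency.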

The white noise sources can be built using Gaussian random number generation. Since the bandwidth of the FLL is  $f_{ref}/2$ , the variance of the Gaussian samples is the product of this bandwidth and the wanted noise spectral density, and the standard deviation is its square root. The flicker noise is created by shaping the spectrum of white noise in the frequency domain and rebuilding it as a voltage waveform with the inverse Fourier transformation. The phase noise of the DCO is modeled as jitter in the time domain using the methods proposed in [32].
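A minimal sketch of the white noise generation is given below. It assumes a single-sided voltage noise density in V²/Hz and takes the standard deviation as the square root of the density times the bandwidth $f_{ref}/2$; the function name and the numerical values are hypothetical, and the flicker noise shaping is not covered.

```python
import math
import random


def white_noise_samples(psd_v2_per_hz, f_ref_hz, n, seed=0):
    """Return (sigma, samples) for white noise sampled at f_ref.

    sigma^2 = PSD * (f_ref / 2), assuming a single-sided PSD.
    """
    sigma = math.sqrt(psd_v2_per_hz * f_ref_hz / 2)
    rng = random.Random(seed)
    return sigma, [rng.gauss(0.0, sigma) for _ in range(n)]


# Hypothetical values: 1e-12 V^2/Hz noise density, 10 kHz reference rate.
sigma, samples = white_noise_samples(1e-12, 10e3, 20000)
measured = math.sqrt(sum(v * v for v in samples) / len(samples))
print(sigma, measured)
```

The measured rms value of the generated samples should be close to the computed standard deviation, which gives a quick sanity check of the scaling.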

Finally, a pseudo code for the algorithm is shown in figure B.1.

```
For each FLL_cycle i
    Calculate integer part of the FCW: FCW_int
    Calculate fractional part of the FCW: FCW_frac
    Calculate DCO period deviation due to FCW_int: T_dev_int
    While DCO_cycle j < 2*N*i
        j = j + 1
        Generate ΣΔ-modulated stream with FCW_frac at every 2 DCO cycles
        Calculate DCO period deviation due to ΣΔ output stream: T_dev_frac
        Accumulate DCO period deviation:
            T_dev(j) = T_dev(j-1) + T_dev_int + T_dev_frac
        Generate DCO clock transition:
            t(j) = j*T_0 + T_dev(j) + T_dev_pn
    End
    Divide the clock by N: T_avg = (t(j) - t(j - 2*N))/2
    RC network generates FVC voltage
    Comparator quantizes the FVC voltage with noise voltages
    Accumulate FCW with K_DLF
End
```

Figure B.1: Pseudo code of the discrete time MATLAB behavior model.
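The bang-bang locking behavior in the pseudo code can be illustrated with a heavily simplified Python sketch. It is not the MATLAB model of this thesis: the ΣΔ modulation, chopping and flicker noise are omitted, the FVC and comparator are replaced by a direct comparison of the divided period against a target period standing in for the RC time constant, and all names and parameter values (`simulate_dfll`, `k_dco_hz`, `n_div`, the start frequency) are assumptions made for this example.

```python
import random


def simulate_dfll(f_start_hz=380e3, f_target_hz=417e3, k_dco_hz=200.0,
                  k_dlf=1, n_div=8, cycles=2000, jitter_rms_s=0.0, seed=1):
    """Simplified bang-bang frequency-locked loop; returns the DCO frequency per cycle."""
    rng = random.Random(seed)
    fcw = 0                # frequency control word held by the digital loop filter
    history = []
    for _ in range(cycles):
        # DCO: frequency set linearly by the control word (assumed linear tuning).
        f_dco = f_start_hz + fcw * k_dco_hz
        history.append(f_dco)
        # Divider: time spanned by N DCO periods, with optional Gaussian jitter.
        t_div = n_div / f_dco + rng.gauss(0.0, jitter_rms_s)
        # Stand-in for the RC reference: the period the loop should lock to.
        t_ref = n_div / f_target_hz
        # Comparator: +1 if the DCO is too slow (period too long), else -1.
        decision = 1 if t_div > t_ref else -1
        # Digital loop filter: accumulate the decision with step K_DLF.
        fcw += decision * k_dlf
    return history


history = simulate_dfll()
print(f"start: {history[0]:.0f} Hz, settled: {history[-1]:.0f} Hz")
```

With these assumed values the loop steps toward the target by one LSB per cycle and then toggles around it within one or two frequency steps, which is the limit cycle expected from a bang-bang loop.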

## Bibliography

- [1] F. Sebastiano, L. J. Breems, and K. A. Makinwa, *Mobility-based time references for wireless sensor networks* (Springer Science & Business Media, 2013).
- [2] J. Gubbi, R. Buyya, S. Marusic, and M. Palaniswami, *Internet of Things (IoT): A vision, architectural elements, and future directions*, Future Generation Computer Systems **29**, 1645 (2013).
- [3] M. Choi, T. Jang, S. Bang, Y. Shi, D. Blaauw, and D. Sylvester, *A 110nW Resistive Frequency Locked On-Chip Oscillator with 34.3ppm/°C Temperature Stability for System-on-Chip Designs*, IEEE Journal of Solid-State Circuits **51**, 2106 (2016).
- [4] H. Wang and P. P. Mercier, *A Reference-Free Capacitive-Discharging Oscillator Architecture Consuming 44.4pW/75.6nW at 2.8Hz/6.4kHz*, IEEE Journal of Solid-State Circuits **51**, 1423 (2016).
- [5] F. Sebastiano, *Frequency references for Internet of Everything*, in *Solid-State Circuits Conference (ISSCC) short courses, 2014 IEEE International* (IEEE).
- [6] D. W. Allan, *Statistics of atomic frequency standards*, Proceedings of the IEEE **54**, 221 (1966).
- [7] V. De Smedt, P. De Wit, W. Vereecken, and M. S. Steyaert, *A 66μW 86ppm/°C Fully-Integrated 6 MHz Wienbridge Oscillator With a 172dB Phase Noise FOM*, IEEE Journal of Solid-State Circuits **44**, 1990 (2009).
- [8] A. Paidimarri, D. Griffith, A. Wang, G. Burra, and A. P. Chandrakasan, *An RC oscillator with comparator offset cancellation*, IEEE Journal of Solid-State Circuits **51**, 1866 (2016).
- [9] T. Tokairin, K. Nose, K. Takeda, K. Noguchi, T. Maeda, K. Kawai, and M. Mizuno, *A 280nW, 100kHz, 1-cycle start-up time, on-chip CMOS relaxation oscillator employing a feedforward period control scheme*, in *VLSI Circuits (VLSIC), 2012 Symposium on* (IEEE) pp. 16–17.
- [10] D. Griffith, P. T. Roine, J. Murdock, and R. Smith, *A 190nW 33kHz RC oscillator with ±0.21% temperature stability and 4ppm long-term stability*, in *Solid-State Circuits Conference Digest of Technical Papers (ISSCC), 2014 IEEE International* (IEEE) pp. 300–301.
- [11] A. Savanth, J. Myers, A. Weddell, D. Flynn, and B. Al-Hashimi, *A 0.68nW/kHz supply-independent Relaxation Oscillator with ±0.49%/V and 96ppm/°C stability*, in *Solid-State Circuits Conference (ISSCC), 2017 IEEE International* (IEEE) pp. 96–97.
- [12] T. Jang, M. Choi, S. Jeong, S. Bang, D. Sylvester, and D. Blaauw, *A 4.7nW 13.8ppm/°C self-biased wakeup timer using a switched-resistor scheme*, in *Solid-State Circuits Conference (ISSCC), 2016 IEEE International*, pp. 102–103.

- [13] Y. Wang, K. T. Chai, X. Mu, M. Je, and W. L. Goh, *A 1.5±0.39ppm/°C Temperature-Compensated LC Oscillator Using Constant-Biased Varactors*, IEEE Microwave and Wireless Components Letters **25**, 130 (2015).
- [14] M. S. McCorquodale, J. D. O'Day, S. M. Pernia, G. A. Carichner, S. Kubba, and R. B. Brown, *A Monolithic and Self-Referenced RF LC Clock Generator Compliant With USB 2.0*, IEEE Journal of Solid-State Circuits **42**, 385 (2007).
- [15] N. Sinoussi, A. Hamed, M. Essam, A. El-Kholy, A. Hassanein, M. Saeed, A. Helmy, and A. Ahmed, *A single LC tank self-compensated CMOS oscillator with frequency stability of ±100ppm from -40°C to 85°C*, in *Frequency Control Symposium (FCS), 2012 IEEE International* (IEEE) pp. 1–5.
- [16] S. M. Kashmiri, M. A. P. Pertijs, and K. A. A. Makinwa, *A Thermal-Diffusivity-Based Frequency Reference in Standard CMOS With an Absolute Inaccuracy of ±0.1% From -55°C to 125°C*, IEEE Journal of Solid-State Circuits **45**, 2510 (2010).
- [17] K. Sundaresan, P. E. Allen, and F. Ayazi, *Process and temperature compensation in a 7-MHz CMOS clock oscillator*, IEEE Journal of Solid-State Circuits **41**, 433 (2006).
- [18] F. Sebastiano, L. J. Breems, K. A. Makinwa, S. Drago, D. M. Leenaerts, and B. Nauta, *A low-voltage mobility-based frequency reference for crystal-less ULP radios*, IEEE Journal of Solid-State Circuits **44**, 2002 (2009).
- [19] F. Sebastiano, L. J. Breems, K. A. Makinwa, S. Drago, D. M. Leenaerts, and B. Nauta, *A 65-nm CMOS temperature-compensated mobility-based frequency reference for wireless sensor networks*, IEEE Journal of Solid-State Circuits **46**, 1544 (2011).
- [20] S. Zaliasl, J. C. Salvia, G. C. Hill, L. W. Chen, K. Joo, R. Palwai, N. Arumugam, M. Phadke, S. Mukherjee, and H.-C. Lee, *A 3 ppm 1.5×0.8mm<sup>2</sup> 1.0μA 32.768kHz MEMS-Based Oscillator*, IEEE Journal of Solid-State Circuits **50**, 291 (2015).
- [21] J. Lee, P. Park, S. Cho, and M. Je, *A 4.7MHz 53µW fully differential CMOS reference clock oscillator with -22dB worst-case PSNR for miniaturized SoCs*, in *Solid-State Circuits Conference-(ISSCC), 2015 IEEE International* (IEEE) pp. 1–3.
- [22] X. Wang, B. Busze, J. Romme, R. Vinella, C. Zhou, K. Philips, and H. De Groot, *A multi-GHz 130ppm accuracy FLL for duty-cycled systems*, in *Radio Frequency Integrated Circuits Symposium (RFIC), 2011 IEEE* (IEEE) pp. 1–4.
- [23] M. J. Pelgrom, *Analog-to-digital conversion*, in *Analog-to-Digital Conversion* (Springer, 2013) pp. 325–418.
- [24] J. Kim, B. S. Leibowitz, J. Ren, and C. J. Madden, *Simulation and analysis of random decision errors in clocked comparators*, IEEE Transactions on Circuits and Systems I: Regular Papers **56**, 1844 (2009).
- [25] A. Graupner, *A methodology for the offset-simulation of comparators*, The Designer's Guide Community **1**, 1 (2006).
- [26] I. Lee, D. Sylvester, and D. Blaauw, *A constant energy-per-cycle ring oscillator over a wide frequency range for wireless sensor nodes*, IEEE Journal of Solid-State Circuits **51**, 697 (2016).

- [27] R. B. Staszewski, *Digital Deep-Submicron CMOS Frequency Synthesis for RF Wireless Applications* (2002).
- [28] B. Miller and B. Conley, *A multiple modulator fractional divider*, in *Frequency Control, 1990., Proceedings of the 44th Annual Symposium on* (IEEE) pp. 559–568.
- [29] D. W. Allan, *Historicity, strengths, and weaknesses of Allan variances and their general applications*, Gyroscopy and Navigation **7**, 1 (2016).
- [30] *IEEE standard definitions of physical quantities for fundamental frequency and time metrology—random instabilities*, IEEE Std 1139-2008, c1 (2008).
- [31] I. L. Syllaios, R. B. Staszewski, and P. T. Balsara, *Time-domain modeling of an RF all-digital PLL*, IEEE Transactions on Circuits and Systems II: Express Briefs **55**, 601 (2008).
- [32] R. B. Staszewski, C. Fernando, and P. T. Balsara, *Event-driven simulation and modeling of phase noise of an RF oscillator*, IEEE Transactions on Circuits and Systems I: Regular Papers **52**, 723 (2005).

## Acknowledgement

First, I would like to give my deepest thanks to my supervisors. Professor Fabio Sebastiano was always able to give support when I needed it. This work could not have been done without his patient and responsible guidance. Ming has been my daily supervisor at IMEC-NL, and I would like to give special thanks to him for all the time he spent on this project. He was willing to share his expertise, and the hands-on instructions he gave me are priceless. Moreover, I am particularly grateful to Yao-Hong for offering this opportunity and sponsoring the project.

I am grateful to Johan Dijkhuis for his support of the top-level layout and Stefano Traferro for the help with the digital design. Words cannot express my gratitude to my dear friends Jiang Gong, Jialue Wang, Zheyi Li, Yuefeng Wu, and many others at IMEC-NL. It has always been a pleasure to discuss technical and non-technical issues with them.

Last but not least, I would like to thank my family. It is their understanding, support, and encouragement that drive me forward.

Zhihao Zhou Delft, November 2017