## **MASTER THESIS**

#### AN ENERGY-EFFICIENT, HIGH-DATA-RATE IR-UWB TRANSMITTER WITH HYBRID MODULATION FOR BRAIN-COMPUTER INTERFACES

### **Master of Science Thesis**

For the degree of Master of Science in Electrical Engineering at Delft University of Technology.

### Yu HUANG

Faculty of Electrical Engineering, Mathematics and Computer Science (EEMCS), Delft University of Technology

Supervisors:

Dr. Minyoung Song (imec)

Dr. Morteza S. Alavi (TU Delft)

Thesis Committee Members:

Prof. Marco Spirito (TU Delft)

Dr. Sijun Du (TU Delft)

Dr. Morteza S. Alavi (TU Delft)



This work was performed at imec, High Tech Campus 31, 5656 AE Eindhoven, the Netherlands.

## **ACKNOWLEDGEMENTS**

I am happy that I managed the unexpected circumstances (the tape-out deadline was moved 45 days earlier) and achieved the expected outcomes in my first tape-out. In this thesis project, I have received a great deal of support and assistance from many people. Without their continued help, this project would not have been possible.

First of all, I would like to express my sincere appreciation and gratitude to Dr. Minyoung Song for his supervision, support and encouragement during the tape-out. I have been so lucky to have a daily supervisor who answered my questions so promptly and patiently. I am also impressed by his diligence and passion for work. Sometimes, we had discussions about the project even at midnight. His working attitude always motivates me to keep moving forward. Besides, he guided me into the RF IC design world step by step, with his impressive expertise in this field.

Special thanks to my supervisor at the university, Dr. Morteza S. Alavi, for his invaluable guidance and help. In our weekly meetings, he was always enthusiastic to answer my research questions and explained them clearly. His insightful feedback pushed me to sharpen my thinking and brought my work to a higher level. Beyond the research project, he also spent a great deal of time helping me improve the write-up of my thesis report. It has been my great pleasure to work under his supervision.

I want to thank my great co-workers at imec-NL deeply. Many thanks go to Yao-Hong Liu, who offered me this opportunity and helped intensely with the technical direction as well as the measurement part of this project. I always enjoyed the technical discussions with him a lot. Particular thanks to my teammates Chengyao Shi and Yiyu Shen for their outstanding contributions and wonderful calibrations. I will never forget the days when they helped me design and debug the layout until morning to catch up with the deadline. Furthermore, I am very grateful to my other colleagues, Huib Visser, Mario Konijnenburg, Arjan Breeschoten, Jac Romme and Yuming He. My work was made easier, and life at Holst Centre was far more enjoyable, thanks to them.

I also owe my sincere gratitude to my friends and fellow classmates in Delft and Eindhoven for their friendly company in the Netherlands. The time I spent with you is the most intriguing memory of the last two years.

Last but not least, my thanks go to my beloved family for their loving consideration and encouragement through these two years. They always love and care about me unconditionally.

> Yu Huang Eindhoven, August 2021

## ABSTRACT

In recent years, implantable neural sensing and Brain-Computer Interfaces (BCIs) have manifested their potential in medical and clinical uses. However, their wide adoption is impractical without reliable wireless telemetry systems for transmitting recorded neuron activities to external devices.

The rapid growth in the number of intracortical extracellular neural sensing channels not only increases the capability and precision of BCI systems, but also creates demand for a reliable wireless telemetry system with a data rate on the order of Gbps. The implants are generally battery-operated and should operate as long as possible; therefore, the high data throughput should be achieved without increasing the power consumption. Furthermore, to improve the reliability of the wireless link against antenna misalignment, a transmission range longer than 2 cm is preferable. Impulse-Radio Ultra-Wide-Band (IR-UWB) is a promising communication technique for such a high-data-rate, short-range, and low-power implantable transmitter. In this master thesis project, a transcutaneous IR-UWB transmitter that utilizes a hybrid modulation scheme for BCI applications is designed and verified. A low-jitter 8-phase oscillator, a linear switched-capacitor power amplifier and a power-efficient delay generator are implemented to enable the energy-efficient hybrid modulation scheme, which combines phase-shift keying (PSK), pulse position modulation (PPM) and pulse amplitude modulation (PAM) for high-data-rate transmission while keeping the power below 10 mW.

Fabricated in a TSMC 28 nm CMOS process, the chip occupies an area of only 0.155 mm<sup>2</sup>. In the measurement setup, the transmitter is inserted into multi-layer porcine skin tissue with a thickness of 15 mm. The measurement results demonstrate that this transmitter achieves data rates of 1.66 and 1.43 Gbps at transmission ranges of 2 and 15 cm, respectively, for a bit error rate lower than $10^{-4}$ with only 9.6 mW DC power consumption. The IR-UWB transmitter achieves a pulse energy efficiency of 5.8 pJ/bit and a normalized energy efficiency for 1 m transmission distance of 45 pJ/bit/m, which are the best among the state-of-the-art high-data-rate transcutaneous UWB transmitters.

## **CONTENTS**

- Acknowledgements
- Abstract
- 1 Introduction
  - 1.1 BCIs system
  - 1.2 Impulse Radio Ultra-Wide-Band
    - 1.2.1 Overview of IR-UWB
    - 1.2.2 Definition of IR-UWB
  - 1.3 Objectives and targeted specifications
  - References
- 2 The UWB transmitter
  - 2.1 Path loss estimation
    - 2.1.1 The setup of path loss measurement
    - 2.1.2 Path loss measurement result
  - 2.2 The modulation scheme and link budget calculation
    - 2.2.1 The hybrid modulation scheme
    - 2.2.2 Link budget calculation
  - 2.3 The UWB transmitter with hybrid impulse modulation
  - References
- 3 The 8-phase DCO design
  - 3.1 Top schematic of the DCO
  - 3.2 The negative skewed ring oscillator
  - 3.3 The DCO core design
  - 3.4 Injection locking
  - 3.5 The load-balancing
  - 3.6 The DCO layout design and its simulation results
  - References
- 4 The power amplifier and matching network design
  - 4.1
  - 4.2 The PA design
    - 4.2.1 Top schematic of the SCPA
    - 4.2.2 Design for a linear PAM modulation
    - 4.2.3 Optimizations for low power design
  - 4.3 The on-chip matching network
    - 4.3.1 Matching strategy with ideal components
    - 4.3.2 The tunable matching network design
    - 4.3.3 Non-linearity issue of the matching network
  - 4.4 Layout design and the post-layout simulation results
  - References
- 5 Measurement results
  - 5.1 Transmitter front-end measurement
  - 5.2 The DCO measurement results
    - 5.2.1 DCO tuning range
  - 5.3 Wireless measurement for the UWB transmitter
    - 5.3.1 Measurement for the UWB transmitter
    - 5.3.2 Performance summary and benchmarks of the UWB TX
  - References
- 6 Conclusion
  - 6.1 Master thesis conclusion
  - 6.2 Recommendations for future work
  - References

# 1

## **INTRODUCTION**

Wireless healthcare monitoring systems receive biological signals from implanted or on-body sensor nodes (hubs) and transmit them to a remote location [1].

Compared to a traditional wired health monitoring system, the wireless implantable counterpart has the following advantages: 1) Flexibility: people and animals can move freely without the constraints imposed by cables. 2) Less invasive: implant devices typically have a small form factor. 3) Durability: implanted devices are powered by batteries and operate inside the human body for days or months before the next charge.

Wireless Brain-Computer Interfaces (BCIs) have drawn researchers' attention because of the aforementioned merits and the potential to enable people to control machines such as prosthetic arms or communication tools using only neuronal signals. The target for this master thesis project is to explore the possibilities, challenges and implementation of the transmitter for such a wireless implantable system.

Impulse Radio Ultra-Wideband (IR-UWB), a short-range, high-data-rate and low-power communication technique that fulfils the requirements of a wireless BCI system, is chosen as the data transmission method in this project.

This chapter provides background information about BCIs and the UWB system of this master thesis project. The basic principles, applications, requirements of wireless BCIs are presented in Section 1.1, while Section 1.2 covers the technical details of the UWB technique. The design objectives and targeted specifications are introduced in Section 1.3.

#### 1.1. BCIS SYSTEM

Over the past decades, many researchers and laboratories have started to explore the possibilities and methods to build direct connections between brain activities and the outside world. BCIs or Brain-Machine Interfaces (BMIs) aim to provide links between neurons and external machines or computers, bypassing peripheral nerves and muscles. BCIs have been explored as a radically new communication option for those with neuromuscular impairments that prevent them from using conventional augmentative communication methods [2]. Some encouraging results, manifesting the prospects of medical applications for BCI systems, have been published in [3] and [4]. It is estimated that more than 1.6 million Americans live with limb loss and 5.5 million people experience some form of paralysis; these people have a pressing need for assistive devices, such as a robotic device, a computer, or a powered wheelchair controlled by a BCI system, to perform daily routines independently [5].

However, wired BCI systems are impractical for widespread daily clinical and at-home use because they rely on external electronics, which creates the need for implantable BCIs that offer flexibility, less invasiveness and durability.

Intracortical BCIs (iBCIs) are suitable for neuroscience research because of their high spatial resolution, resistance to noise, substantial robustness over long recording periods, and signal fidelity. Fig 1.1 shows the concept of an implantable wireless iBCI system. This system comprises an electrode network, subcutaneous cables, a wireless telemetry module, and an external receiver (RX). Micro-scale electrodes are inserted into the brain cortex layer. These sensors are sensitive enough to pick up the action potentials and the summed voltage fluctuations from small to large numbers of neurons [5]. These biological signals are delivered to the wireless telemetry module via subcutaneous cables. The external RX receives and demodulates the signals radiated by the wireless telemetry module. In the final step, the demodulated signals are sent to a signal-processing device that translates them into a form that a computer can use to control other peripherals.



Figure 1.1: Conceptual diagram of intracortical neural sensing.

A major design challenge for this wireless implantable system is establishing a stable and reliable telemetry link for data transmission. Several typical requirements for such a link are listed as follows: 1) High-data-rate capability: a massive amount of information is needed for the neuron recording system to improve its temporal and spatial resolution. Accordingly, the transmitter should be capable of high data throughput [6]. In [7], a 966-electrode neural probe with 384 configurable channels has been presented. Considering a 30-kHz 10-bit analog-to-digital converter (ADC), the demand for transmitting data can go up to 1 Gbps. 2) Low power consumption: the high data rate must be achieved without sacrificing low power consumption, which is crucial for a power-limited system and for avoiding tissue damage due to the heat generated by the implants. 3) Small form factor: to diminish the invasive effect on the human body.
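The data-rate requirement above can be sanity-checked with a short sketch. It assumes the simplest possible model, in which the raw payload is the product of channel count, sampling rate and ADC resolution; the function names are illustrative, and framing or protocol overhead is not modeled.

```python
import math


def raw_data_rate_bps(channels: int, sample_rate_hz: float, bits_per_sample: int) -> float:
    """Raw payload rate of a multi-channel recording front-end (no overhead)."""
    return channels * sample_rate_hz * bits_per_sample


def channels_for_rate(target_bps: float, sample_rate_hz: float, bits_per_sample: int) -> int:
    """Smallest channel count whose raw payload reaches the target rate."""
    return math.ceil(target_bps / (sample_rate_hz * bits_per_sample))


# 384 channels sampled by a 30-kHz, 10-bit ADC (values quoted in the text).
rate_384 = raw_data_rate_bps(384, 30e3, 10)

# Channel count at which the same ADC settings reach 1 Gbps of raw payload.
channels_1gbps = channels_for_rate(1e9, 30e3, 10)

print(f"384 channels: {rate_384 / 1e6:.1f} Mbps")
print(f"Channels needed for 1 Gbps: {channels_1gbps}")
```

Under this model, 384 channels produce roughly 115 Mbps of raw payload, and a Gbps-class link corresponds to a few thousand channels at the same ADC settings, which is the scaling trend that motivates the high-data-rate target.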

#### **1.2.** IMPULSE RADIO ULTRA-WIDE-BAND

#### 1.2.1. OVERVIEW OF IR-UWB

Multiple telemetry methods have been studied and implemented for the links between implants and external devices. For example, a transcutaneous link for medical implants using inductively coupled coils is reported in [8], but its highest data rate is restricted to a few Mbps, and the maximum communication distance between the two coils is limited to a few centimeters. The Med-Radio band (401–406 MHz) has also been accepted worldwide to support the communication of diagnostic or therapeutic functions associated with medical implant devices [9], because signals in this band propagate well through the human body. However, designing an antenna with a small form factor for this band is challenging. Bluetooth Low Energy (BLE) has also been employed in implantable devices, but it cannot satisfy the severe power budget and small form factor constraints while sustaining such an ultra-high data rate [10].

IR-UWB technology has drawn a lot of attention from researchers and industry [11]. An IR-UWB transmitter (TX) directly radiates a train of short pulses (<1 ns), each typically representing one symbol. The direct transmission of impulses enables a high data rate, as the symbol period can be nearly as short as the duration of the individual impulses [12]. Besides, the UWB band (3-10 GHz) is a reasonable choice for a cm-scale antenna. Moreover, in pulse-based UWB, the transmitter only needs to operate during the pulse transmission, which gives the radio a strong duty-cycling potential to minimize the expensive baseline power consumption [13]. Therefore, the IR-UWB technique is a proper choice for this high-data-rate and low-power iBCI system.

#### **1.2.2.** DEFINITION OF IR-UWB

IR-UWB technology is popular for short-range data communication systems. Fig 1.2 shows the UWB signal design point. By definition, a radiated signal whose 10-dB spectral bandwidth is larger than 500 MHz and whose 20-dB bandwidth lies within the UWB frequency band is a UWB signal.

As shown in Fig 1.2, UWB characterizes transmission systems with a spectral occupancy in excess of 500 MHz or, equivalently, a short pulse (<2 ns) in the time domain [15]. The UWB frequency band is designated to be in the range of 3.1 GHz to 10.6 GHz, overlaying the coexisting RF systems.

At the same time, these coexisting systems must not suffer intolerable interference from the UWB radios. Regulatory considerations over such a wide bandwidth limit the radiated power of the UWB signal [16]. Regulatory bodies such as the Federal Communications Commission (FCC) set the average and peak power limits to -41.3 dBm/MHz and 0 dBm/50 MHz, respectively (Fig 1.3), for devices operating in the 3.1–10.6 GHz band, following the International Telecommunication Union (ITU) recommendation [17].
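To give a feel for how restrictive the average limit is, the following sketch integrates the -41.3 dBm/MHz density over a given bandwidth. It assumes an idealized, perfectly flat spectrum sitting exactly at the limit, which a real pulse does not achieve; the function name is illustrative.

```python
import math


def integrated_power_dbm(psd_dbm_per_mhz: float, bandwidth_mhz: float) -> float:
    """Total power of a flat spectrum with the given density and bandwidth."""
    return psd_dbm_per_mhz + 10 * math.log10(bandwidth_mhz)


# Minimum UWB bandwidth (500 MHz) filled at the average-power limit.
p_500mhz = integrated_power_dbm(-41.3, 500)

# Entire 3.1-10.6 GHz band (7500 MHz) filled at the average-power limit.
p_full_band = integrated_power_dbm(-41.3, 7500)

print(f"500 MHz at the limit: {p_500mhz:.1f} dBm")
print(f"Full band at the limit: {p_full_band:.1f} dBm")
```

Even with the whole band filled, the total average radiated power stays below 1 mW under this flat-spectrum assumption.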

Therefore, a solid design for the bandwidth controller and power amplifier is needed for spectrum mask compliance. On top of that, due to the high-frequency band, high path loss (PL) in the human body may limit the implant depth and transmission distance. As a result, a solid link budget analysis is required.

Figure 1.2: Definition of UWB signal [14].

Figure 1.3: FCC mask for UWB applications [18].


#### **1.3.** OBJECTIVES AND TARGETED SPECIFICATIONS

In this master thesis project, an IR-UWB TX for iBCI applications is proposed, designed and verified. A path loss measurement and a solid link budget analysis are required before starting the circuit design. Moreover, conventional IR-UWB TX architectures cannot fulfil the system requirements, so a new transmitter architecture and a hybrid modulation scheme are needed for high-data-rate communication (> 1 Gbps). The transmitter should operate in the high UWB band (6 to 8 GHz) in compliance with FCC regulations to avoid overlapping with coexisting RF systems. The expected transmission distance must be larger than 10 cm with a 10 to 15 mm implant depth to improve the reliability and flexibility of the link. The total power consumption should be lower than 10 mW to minimize the tissue heating caused by the thermal flux from the module's surface. These specifications are summarized in Table 1.1.

| Parameter                       | Target   |
|---------------------------------|----------|
| Supply (V)                      | 0.9      |
| Maximum data rate (Gbps)        | >1       |
| Total DC power consumption (mW) | <10      |
| Frequency range (GHz)           | 6-8      |
| Implant depth (mm)              | 10 to 15 |
| Transmission distance (cm)      | >10      |

Table 1.1: Design specifications of the TX.

The primary contributions of this project comprise the link budget analysis, system-level simulation and verification, top-level layout design, the design of the Digitally-Controlled Oscillator (DCO), Power Amplifier (PA) and matching network, as well as the measurements performed on the system. Chapter 2 discusses the link budget calculation and the TX architecture. The designs of the DCO and the PA are presented in Chapters 3 and 4, respectively. The measurement results are presented in Chapter 5.

#### REFERENCES

- [1] R. Patel, P. Patel, J. Lalwani, M. Sarkar, and S. Nagaraj, "Investigating the feasibility of multiple uwb transmitters in brain computer interface (bci) applications," in 2016 IEEE 13th International Conference on Wearable and Implantable Body Sensor Networks (BSN), 2016, pp. 236–241.
- [2] J. R. Wolpaw, N. Birbaumer, W. J. Heetderks, D. J. McFarland, P. H. Peckham, G. Schalk, E. Donchin, L. A. Quatrano, C. J. Robinson, T. M. Vaughan *et al.*, "Brain-computer interface technology: a review of the first international meeting," *IEEE transactions on rehabilitation engineering*, vol. 8, no. 2, pp. 164–173, 2000.
- [3] J. L. Collinger, B. Wodlinger, J. E. Downey, W. Wang, E. C. Tyler-Kabara, D. J. Weber, A. J. McMorland, M. Velliste, M. L. Boninger, and A. B. Schwartz, "High-performance neuroprosthetic control by an individual with tetraplegia," *The Lancet*, vol. 381, no. 9866, pp. 557–564, 2013.

- [4] L. R. Hochberg, D. Bacher, B. Jarosiewicz, N. Y. Masse, J. D. Simeral, J. Vogel, S. Haddadin, J. Liu, S. S. Cash, P. Van Der Smagt *et al.*, "Reach and grasp by people with tetraplegia using a neurally controlled robotic arm," *Nature*, vol. 485, no. 7398, pp. 372–375, 2012.
- [5] M. L. Homer, A. V. Nurmikko, J. P. Donoghue, and L. R. Hochberg, "Implants and decoding for intracortical brain computer interfaces," *Annual review of biomedical engineering*, vol. 15, no. 2, pp. 383–405, 2013.
- [6] H. Bahrami, S. A. Mirbozorgi, L. A. Rusch, and B. Gosselin, "Integrated uwb transmitter and antenna design for interfacing high-density brain microprobes," in 2015 IEEE International Conference on Ubiquitous Wireless Broadband (ICUWB), 2015, pp. 1–5.
- [7] C. M. Lopez, S. Mitra, J. Putzeys, B. Raducanu, M. Ballini, A. Andrei, S. Severi, M. Welkenhuysen, C. Van Hoof, S. Musa, and R. F. Yazicioglu, "22.7 a 966-electrode neural probe with 384 configurable channels in 0.13μm soi cmos," in 2016 IEEE International Solid-State Circuits Conference (ISSCC), 2016, pp. 392–393.
- [8] H. Ali, T. J. Ahmad, and S. A. Khan, "Inductive link design for medical implants," in 2009 IEEE Symposium on Industrial Electronics & Applications, vol. 2. IEEE, 2009, pp. 694–699.
- [9] M. N. Islam and M. R. Yuce, "Review of medical implant communication system (mics) band and network," *Ict Express*, vol. 2, no. 4, pp. 188–194, 2016.
- [10] L. Zhou, X. Chen, Y. Li, and J. Li, "Bluetooth low energy 4.0-based communication method for implants," in 2017 10th International Congress on Image and Signal Processing, BioMedical Engineering and Informatics (CISP-BMEI), 2017, pp. 1–5.
- [11] K. Ture, A. Devos, F. Maloberti, and C. Dehollain, "Area and power efficient ultra-wideband transmitter based on active inductor," *IEEE Transactions on Circuits and Systems II: Express Briefs*, vol. 65, no. 10, pp. 1325–1329, 2018.
- [12] N. Soltani, H. Kassiri, H. M. Jafari, K. Abdelhalim, and R. Genov, "0.13μm cmos 230mbps 21pj/b uwb-ir transmitter with 21.3% efficiency," in ESSCIRC Conference 2015 - 41st European Solid-State Circuits Conference (ESSCIRC), 2015, pp. 352–355.
- [13] J. Ryckaert, C. Desset, A. Fort, M. Badaroglu, V. De Heyn, P. Wambacq, G. Van der Plas, S. Donnay, B. Van Poucke, and B. Gyselinckx, "Ultra-wide-band transmitter for low-power wireless body area networks: design and evaluation," *IEEE Transactions on Circuits and Systems I: Regular Papers*, vol. 52, no. 12, pp. 2515–2525, 2005.
- [14] K. Siwiak and D. McKeown, Ultra-wideband radio technology. John Wiley & Sons, 2005.
- [15] L. Yang and G. Giannakis, "Ultra-wideband communications: an idea whose time has come," *IEEE Signal Processing Magazine*, vol. 21, no. 6, pp. 26–54, 2004.

- [16] Y. Rahayu, T. A. Rahman, R. Ngah, and P. Hall, "Ultra wideband technology and its applications," in 2008 5th IFIP International Conference on Wireless and Optical Communications Networks (WOCN'08). IEEE, 2008, pp. 1–5.
- [17] H. W. Pflug, J. Romme, K. Philips, and H. de Groot, "Method to estimate impulse-radio ultra-wideband peak power," *IEEE Transactions on Microwave Theory and Techniques*, vol. 59, no. 4, pp. 1174–1186, 2011.
- [18] B. Schleicher and H. Schumacher, "Impulse generator targeting the european uwb mask," in *2010 Topical Meeting on Silicon Monolithic Integrated Circuits in RF Systems (SiRF)*, 2010, pp. 21–24.

# 2

## **THE UWB TRANSMITTER**

In this chapter, the overall specifications and high-level structure of the UWB transmitter are presented. The path loss analysis is described in Section 2.1. The modulation scheme and link budget calculation are presented in Section 2.2, while the top-level architecture and specifications of the transmitter are discussed in Section 2.3.

#### **2.1. PATH LOSS ESTIMATION**

The high tissue loss in the human body hinders the design of a high-frequency IR-UWB (3.1-10 GHz) telemetry link. If the signal is attenuated severely, IR-UWB is no longer attractive for data transmission between implants and external devices in a BCI system, despite its merits of high data rate and low power consumption, since the communication range and implant depth are strictly limited.

Consequently, to verify the feasibility of IR-UWB communication and establish a solid link budget analysis, the path loss for the UWB signal needs to be characterized.

#### **2.1.1.** The setup of path loss measurement

Implanting electronic devices into the human body to carry out path loss analysis is usually not preferred because of safety issues and complexity. Some works are therefore based on software simulation using existing human body voxel models [1]. Nevertheless, software simulations cannot reproduce all the channel conditions, and they can be expensive and imply a high computational cost [2]. Other approaches, such as *in vivo* studies, conduct tests on living animals, organisms, or cells to obtain accurate results. However, the high surgical cost of propagation measurements and the need for an operating room make this option less attractive in our lab. Therefore, a liquid phantom with electrical properties close to those of the human body is the most reasonable choice for this measurement because it is easy to build and measure. Besides, its dielectric characteristics can be maintained for up to tens of days in our lab environment.

The liquid phantom for the test is built from the recipe reported in [3], which is composed of sucrose, salt, and water and approximates the complex permittivity of human muscle tissue well.

Two TAOGLAS flex UWB antennas with SMA connectors are used as the transmitting and receiving antennas because of their low return loss (-9 dB) in the 6.0-8.2 GHz band. The peak gain of the antennas is 4.8 dBi in the 6.0-8.2 GHz band. The dimensions of the antenna are shown in Fig 2.1, featuring a small form factor (35 × 24.5 × 0.2 mm).



Figure 2.1: The antenna used for path loss measurement.

The transmitting antenna was coated with a layer of plastic to avoid direct physical contact between the fluid and the antenna. Comparing the antenna gain before and after the coating showed that the effect of this plastic seal on the antenna gain is negligible.

The free space path loss can be modeled by:

$$P_f = D_t D_r (\frac{\lambda}{4\pi d})^2 \tag{2.1}$$

where $P_f$ is the free space path loss, $D_t$ and $D_r$ are the directivities of the transmitting and receiving antennas, respectively, $\lambda$ is the signal wavelength and $d$ is the distance between the two antennas. Measuring the free space S21 parameter between these two antennas shows that interference from the environment is negligible: when the distance is halved, S21 changes by 6 dB, as the inverse-square dependence on $d$ in (2.1) predicts.
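The 6-dB check follows directly from (2.1) and can be reproduced with a minimal sketch. The distances of 0.5 m and 1.0 m and the 0 dBi directivities are illustrative assumptions, not values from the measurement; only the difference between the two results matters here.

```python
import math

C = 299_792_458.0  # speed of light in vacuum, m/s


def friis_power_ratio_db(d_m: float, freq_hz: float,
                         dt_dbi: float = 0.0, dr_dbi: float = 0.0) -> float:
    """Equation (2.1) in dB: received-to-transmitted power ratio in free space."""
    wavelength = C / freq_hz
    return dt_dbi + dr_dbi + 20 * math.log10(wavelength / (4 * math.pi * d_m))


# Illustrative distances; the directivities cancel in the difference.
ratio_far = friis_power_ratio_db(1.0, 8e9)
ratio_near = friis_power_ratio_db(0.5, 8e9)
delta_db = ratio_near - ratio_far

print(f"Change when the distance is halved: {delta_db:.2f} dB")
```

A measured S21 step close to this value indicates that the direct path dominates over reflections from the surroundings.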

The transmitting antenna is immersed in the liquid phantom at different distances from the edge of the container, mimicking the implant depth. The receiving antenna is attached to the outside of the container wall. Both antennas are connected to a Keysight N9918A microwave analyzer to measure the S21 parameter.

#### **2.1.2.** PATH LOSS MEASUREMENT RESULT

The measured S21 from 7.5 GHz to 8.5 GHz is shown in Fig 2.2. As expected, the path loss increases with both the signal frequency and the implant depth. The measured S21 at 8 GHz for 15 mm and 50 mm implant depths is roughly -50 dB and -62 dB, respectively, i.e., path losses of about 50 dB and 62 dB.

The output peak power of the implanted transmitter is usually boosted to overcome the high path loss and improve the transmission range, as long as the power radiated at the body surface stays below the -41.3 dBm/MHz limit imposed by the regulation.



Figure 2.2: Measured S21 in the liquid phantom.

To push the transmission range to the limit, the peak output power of an implanted transmitter with a 50 mm implant depth should be larger than 10 dBm to overcome the 62 dB path loss, considering a 10 dBi antenna gain. However, the power consumption of the PA module then also increases drastically and reaches 20 mW, assuming a system efficiency of 50%. This excessive power dissipation violates the low-power design target and generates considerable heat in human tissue.

For iBCI applications, where implants are placed at a depth of 10 to 20 mm, the -41.3 dBm/MHz limit can be reached with around 0 dBm output power, given a 50 dB path loss and a 10 dB antenna gain. This power level can be realized with a 0.9 V supply and features a significantly lower power dissipation than the 50 mm implant case.
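The power figures in the two cases above can be reproduced with a short sketch. The 50% efficiency is the value quoted for the 50 mm case; applying the same efficiency to the shallow-implant case is an assumption made here purely for comparison, and the function names are illustrative.

```python
def dbm_to_mw(power_dbm: float) -> float:
    """Convert a power level from dBm to milliwatts."""
    return 10 ** (power_dbm / 10)


def dc_power_mw(output_dbm: float, efficiency: float) -> float:
    """DC power drawn by the PA for a given output power and efficiency."""
    return dbm_to_mw(output_dbm) / efficiency


# 50 mm implant: 10 dBm output power at 50% efficiency (values from the text).
dc_deep = dc_power_mw(10, 0.5)

# 10-20 mm implant: 0 dBm output power; the 50% efficiency is assumed here.
dc_shallow = dc_power_mw(0, 0.5)

print(f"Deep implant PA power:    {dc_deep:.1f} mW")
print(f"Shallow implant PA power: {dc_shallow:.1f} mW")
```

Under these assumptions the 10 dB reduction in output power translates into a tenfold reduction in PA power, which is what keeps the shallow-implant case within a 10 mW budget.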

In conclusion, the path loss in the human body is closely related to the signal frequency and implant depth, which limits the application scenarios for low-power implants. For a shallow implant (15 mm) such as the iBCI system, the 50 dB path loss is still acceptable and can be overcome by boosting the output power inside the human body without violating the regulation mask. However, for deeper implants (>50 mm), the maximum transmission range is achieved only at the expense of high DC power consumption.

The accuracy of the path loss measurement is also worth discussing. The far-field distance of an antenna is defined by [4]:

$$d = \frac{2D^2}{\lambda} \tag{2.2}$$

where *D* is the largest dimension of the antenna and $\lambda$ is the wavelength of the EM wave. Substituting a frequency of 8 GHz and a maximum dimension of 35 mm gives a far-field distance of about 60 mm. Because the antenna separations in the measurement (15 mm and 50 mm) are shorter than this, the antennas are not operating in the far-field condition, which makes it hard to separate the path loss from the antenna gains. Accordingly, in this work, the S21 parameters are used directly as the path loss estimate. Similar path loss data are expected for other antennas of similar size due to the near-field effect. The *in vitro* measurement that validates this path loss estimation is discussed in the measurement chapter.
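The far-field boundary in (2.2) can be evaluated with a minimal sketch. It assumes the free-space wavelength (the shorter wavelength inside tissue is not modeled), and the function name is illustrative.

```python
C = 299_792_458.0  # speed of light in vacuum, m/s


def far_field_distance_m(largest_dim_m: float, freq_hz: float) -> float:
    """Equation (2.2): far-field boundary d = 2 * D^2 / lambda."""
    wavelength = C / freq_hz
    return 2 * largest_dim_m ** 2 / wavelength


# Antenna with a 35 mm largest dimension, evaluated at 8 GHz.
d_far_field = far_field_distance_m(0.035, 8e9)

# The two antenna separations used in the phantom measurement.
separations_m = (0.015, 0.050)
all_near_field = all(s < d_far_field for s in separations_m)

print(f"Far-field boundary: {d_far_field * 1e3:.0f} mm")
print(f"Both separations inside the boundary: {all_near_field}")
```

With the free-space wavelength, these inputs evaluate to roughly 65 mm, close to the 60 mm quoted above; with either value, both measurement separations fall inside the boundary.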

#### **2.2.** The modulation scheme and link budget calculation

#### **2.2.1.** The hybrid modulation scheme

A single UWB pulse carries no information by itself. Digital information must be added to this analog pulse by means of modulation. Some traditional modulation methods, such as frequency modulation (FM), are hard to implement in a UWB system because of the characteristics of the pulse itself.

The most common UWB modulation methods can be categorized into two kinds: time-based techniques and shape-based techniques. Owing to its low power dissipation, pulse position modulation (PPM) is a widely used time-based technique for UWB communication, in which pulses are delayed or advanced relative to a specified time reference depending on the modulation information [5]. Well-known shape-based techniques, such as on-off keying (OOK), multiple phase-shift keying (MPSK) and pulse amplitude modulation (PAM), are also available for UWB signal modulation.

A hybrid modulation scheme comprising 4PAM, 8PSK and 4PPM is selected to achieve a high data rate. The signal-to-noise ratio (SNR) requirement of such a hybrid scheme is relaxed significantly compared with a scheme that simply increases the number of bits per symbol within a single modulation type.

PAM changes the pulse amplitude according to the digital information and can be realized by a switched-capacitor power amplifier (SCPA), whose good linearity yields a high output SNR. PAM does not consume extra power because no additional subblocks are required in the TX design.

The 8PSK modulation is accomplished by an 8-phase digitally controlled oscillator (DCO) combined with an 8-input phase mux, which selects one phase of the 8-phase clock. The PPM modulation is achieved by the delay generator, which delays the impulse based on the RF clock period and the PPM code.

Based on the 238 MHz baseband symbol rate and the hybrid modulation scheme, the maximum transmitted data rate can be calculated as follows:

$$\text{Data rate} = 0.238 \times (2 + 3 + 2) \approx 1.66\ \text{Gbps} \tag{2.3}$$
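Equation 2.3 can be restated as a small calculation that derives the bits per symbol from the modulation orders. This is only a sketch of the arithmetic; the dictionary and variable names are illustrative.

```python
from math import log2

SYMBOL_RATE_HZ = 238e6  # baseband symbol rate from the text

# Modulation orders of the three sub-schemes that make up one hybrid symbol.
orders = {"4PAM": 4, "8PSK": 8, "4PPM": 4}

# Each sub-scheme contributes log2(order) bits: 2 + 3 + 2.
bits_per_symbol = sum(int(log2(m)) for m in orders.values())
data_rate_bps = SYMBOL_RATE_HZ * bits_per_symbol

print(f"bits per symbol: {bits_per_symbol}")
print(f"data rate: {data_rate_bps / 1e9:.3f} Gbps")
```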

#### **2.2.2.** LINK BUDGET CALCULATION

The received signal quality of a digital communication system can be modeled by $E_b/N_0$, where $E_b$ is the received energy per bit and $N_0$ is the noise spectral density. To achieve a bit error rate (BER) better than $10^{-4}$, the $E_b/N_0$ for 4PAM, 8PSK and 4PPM with a step of 500 ps (assuming an 8 GHz DCO frequency) must reach 10, 13 and 17 dB, respectively [6]. In this case, the 4PPM modulation mainly limits the overall $E_b/N_0$ of the system. To calculate the required PA peak output power, $E_b/N_0$ is first converted to $E_s/N_0$ by:

$$10\log_{10}\left(\frac{E_s}{N_0}\right) = 10\log_{10}\left(\frac{E_b}{N_0}\right) + 10\log_{10}(M) = 25.45 \quad (dB) \tag{2.4}$$

Where *M* is the number of bits per symbol and  $E_s$  is the received energy per symbol that can be calculated by:

$$E_s = \frac{P_T T_T G_T G_R}{PL} \quad (J) \tag{2.5}$$

Where $G_T$ and $G_R$ are the gains of the transmitting and receiving antennas, respectively. In this calculation, both are set to 0 dBi to exclude antenna effects and keep the result general. *PL* is the path loss at the transmitted frequency (55 dB, including a 5 dB margin, at 8 GHz), and $T_T$ is the UWB pulse width, which is 1 ns according to our design target. $P_T$ is the PA output power that needs to be determined. $N_0$ is given by:

$$N_0 = NF \cdot k \cdot T = -169 \quad (dBm/Hz) \tag{2.6}$$

Where *k* is Boltzmann's constant, *T* is the room temperature, and *NF* is the noise figure (5 dB) of the UWB receiver that will be used during the measurement.

By substituting all the variables and combining Equations 2.4, 2.5 and 2.6, the only unknown parameter $P_T$ is calculated to be 1.55 dBm. Finally, the peak-to-average power ratio (PAPR) of the UWB signal with 4-level PAM modulation is given by:

$$PAPR = 10\log_{10}\left(\frac{1^2}{(1^2 + 0.75^2 + 0.5^2 + 0.25^2)/4}\right) = 3.3 \quad (dB) \tag{2.7}$$

This value is added to the calculated $P_T$, resulting in a PA peak output power of 4.85 dBm at 8 GHz. Note that with 50 dB to 55 dB of path loss, the peak emitted power remains below -41.3 dBm/MHz, which meets the FCC UWB regulation.
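The link budget of Equations 2.4 to 2.7 can be traced step by step in dB. The sketch below is not the original calculation script. It assumes a room temperature of 295 K, which the text does not state, and this value reproduces the quoted 1.55 dBm.

```python
from math import log10

K_BOLTZMANN = 1.380649e-23  # J/K
T_ROOM_K = 295.0            # assumed room temperature, not given in the text


def db(x: float) -> float:
    """Convert a linear power ratio to dB."""
    return 10.0 * log10(x)


eb_n0_db = 17.0        # worst case (4PPM) for BER better than 1e-4
bits_per_symbol = 7    # 2 (PAM) + 3 (PSK) + 2 (PPM)
nf_db = 5.0            # receiver noise figure
path_loss_db = 55.0    # 50 dB path loss plus 5 dB margin
pulse_width_s = 1e-9   # T_T
g_t_db = g_r_db = 0.0  # antenna gains excluded from the budget

es_n0_db = eb_n0_db + db(bits_per_symbol)              # Equation 2.4
n0_dbm_hz = db(K_BOLTZMANN * T_ROOM_K) + 30.0 + nf_db  # Equation 2.6
es_dbm_s = es_n0_db + n0_dbm_hz                        # required symbol energy

# Equation 2.5 solved for P_T, with every term expressed in dB.
p_t_dbm = es_dbm_s - db(pulse_width_s) + path_loss_db - g_t_db - g_r_db

levels = (1.0, 0.75, 0.5, 0.25)  # normalized 4PAM amplitudes
mean_power = sum(a * a for a in levels) / len(levels)
papr_db = db(max(levels) ** 2 / mean_power)            # peak over average
p_peak_dbm = p_t_dbm + papr_db

print(f"Es/N0 = {es_n0_db:.2f} dB, N0 = {n0_dbm_hz:.1f} dBm/Hz")
print(f"P_T = {p_t_dbm:.2f} dBm, PAPR = {papr_db:.2f} dB, peak = {p_peak_dbm:.2f} dBm")
```

The small difference between the computed peak power and the quoted 4.85 dBm comes from rounding the PAPR to 3.3 dB in the text.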

#### **2.3.** The UWB transmitter with hybrid impulse modulation

There are various TX architectures for generating IR-UWB pulses. In the edge-combining TX, the UWB pulses are generated in the time domain by a programmable pulse generator without up-conversion mixers or RF clock signals. The benefit of the edge-combining technique is that both circuit complexity and power consumption are reduced [7], [8]. However, a pulse generator with precise delay is required to set the operating frequency. Besides, simultaneous PSK and PPM modulation is hard to perform in the pulse generator since both are based on delays in the time domain. Furthermore, this method requires complicated calibration to generate FCC-compliant pulses [9]. On the other hand, the carrier-based TX is similar to conventional narrow-band transmitters: it uses a local oscillator (LO) for baseband up-conversion, thus providing easy control of the output power spectral density (PSD) and a wider frequency tuning range to cover PVT variations. In [10], an up-conversion TX that achieves a data rate of 1.8 Gbps using LO-based finite impulse response (FIR) filtering is reported. However, this method is not energy-efficient, and an extra PA is needed to increase the output power. The traditional I/Q IR-UWB transmitter is also not preferred in this design because of its power-hungry up-conversion mixer [11].

The architecture of the proposed up-conversion IR-UWB polar TX is shown in Fig 2.3. The baseband-to-RF up-conversion is accomplished by an asynchronous Pulse Shaper and a digital power amplifier without a mixer, thus enabling an energy-efficient hybrid modulation scheme for high-data-rate transmission. The transmitter is designed and fabricated in TSMC 28 nm technology with a 0.9 V nominal supply voltage. This project targets the high-frequency UWB channels (6 to 8 GHz), which allow smaller antenna designs. Besides, the stringent FCC output spectrum requirement at low frequencies from 1.6 to 3 GHz is easier to meet, so good resilience to other narrow-band wireless standards can be achieved.



Figure 2.3: High-level architecture of the proposed UWB transmitter.

The 7-bit PPM, PAM, and PSK modulation signals running at 238 MHz are synchronized with the external 476 MHz system clock. When the transmitter is enabled, the Bandwidth Adjuster generates a rectangular time-domain pulse every 4.2 ns. The pulse width can be tuned between 2 ns and 1 ns to select the 500 MHz or 1 GHz bandwidth mode, respectively.

The 2-bit PPM modulation is accomplished by the Delay Generator, which can delay the baseband pulse in steps of 600 ps (a quarter period of the DCO output) within one symbol period. The remaining 1.8 ns of the symbol period is reserved as a guard interval to prevent erroneous PPM modulation.
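The symbol timing described above can be tallied in a few lines. The sketch interprets the 1.8 ns guard interval as the symbol period minus four 600 ps delay slots; that interpretation and the helper name `ppm_delay_ps` are assumptions for illustration.

```python
SYMBOL_RATE_HZ = 238e6
PPM_STEP_PS = 600.0  # delay step quoted in the text
PPM_POSITIONS = 4    # 2-bit PPM

symbol_period_ps = 1e12 / SYMBOL_RATE_HZ
ppm_window_ps = PPM_POSITIONS * PPM_STEP_PS
guard_ps = symbol_period_ps - ppm_window_ps


def ppm_delay_ps(code: int) -> float:
    """Delay applied to the baseband pulse for a 2-bit PPM code (0 to 3)."""
    if not 0 <= code < PPM_POSITIONS:
        raise ValueError("PPM code must be in the range 0 to 3")
    return code * PPM_STEP_PS


print(f"symbol period: {symbol_period_ps:.0f} ps")
print(f"PPM window:    {ppm_window_ps:.0f} ps")
print(f"guard time:    {guard_ps:.0f} ps")
print("delays:", [ppm_delay_ps(c) for c in range(PPM_POSITIONS)])
```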

Then, this delayed rectangular pulse is shaped by an asynchronous Pulse Shaper to perform FIR filtering in the RF domain with its 7 delay cells. After shaping, the pulse is converted into an 8-bit staircase-shaped pulse sequence [12]. The calibration scheme is integrated into the pulse shaper to adjust the delay, enabling the bandwidth and pulse shape adjustment.

At the same time, one phase of the DCO outputs is selected by the phase mux based on the PSK signal. The selected clock modulates the shaped pulse sequence in the digital power amplifier (DPA), which converts it into a UWB pulse with an amplitude set by the input PAM code. After passing through a tunable on-chip switchable LC matching network, designed to boost the transmitted output power and cover antenna impedance variation, the UWB RF signal is radiated by a printed circular monopole antenna.

#### REFERENCES

- [1] H. Yamamoto, J. Zhou, and T. Kobayashi, "Ultra wideband electromagnetic phantoms for antennas and propagation studies," *IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences*, vol. 91, no. 11, pp. 3173–3182, 2008.
- [2] S. Gabriel, R. Lau, and C. Gabriel, "The dielectric properties of biological tissues: III. parametric models for the dielectric spectrum of tissues," *Physics in Medicine & Biology*, vol. 41, no. 11, p. 2271, 1996.
- [3] T. Van Nunen, E. Huismans, R. Mestrom, M. Bentum, and H. Visser, "DIY electromagnetic phantoms for biomedical wireless power transfer experiments," in *2019 IEEE Wireless Power Transfer Conference (WPTC)*. IEEE, 2019, pp. 399–404.
- [4] R. Johnson, H. Ecker, and J. Hollis, "Determination of far-field antenna patterns from near-field measurements," *Proceedings of the IEEE*, vol. 61, no. 12, pp. 1668–1694, 1973.
- [5] M. Ghavami, L. Michael, and R. Kohno, *Ultra wideband signals and systems in communication engineering*. John Wiley & Sons, 2007.
- [6] M. Viswanathan, "Performance comparison of digital modulation techniques," https://www.gaussianwaves.com/2010/04/performance-comparison-of-digitalmodulation-techniques-2/, 2014.
- [7] X. Tong and J. Li, "A sub-GHz UWB data transmitter with enhanced output amplitude for implantable bioelectronics," in *2017 IEEE Biomedical Circuits and Systems Conference (BioCAS)*, 2017, pp. 1–4.
- [8] A. Ebrazeh and P. Mohseni, "30 pJ/b, 67 Mbps, centimeter-to-meter range data telemetry with an IR-UWB wireless link," *IEEE Transactions on Biomedical Circuits and Systems*, vol. 9, no. 3, pp. 362–369, 2015.
- [9] K. Ture, A. Devos, F. Maloberti, and C. Dehollain, "Area and power efficient ultra-wideband transmitter based on active inductor," *IEEE Transactions on Circuits and Systems II: Express Briefs*, vol. 65, no. 10, pp. 1325–1329, 2018.
- [10] M. Demirkan and R. R. Spencer, "A pulse-based ultra-wideband transmitter in 90nm CMOS for WPANs," *IEEE Journal of Solid-State Circuits*, vol. 43, no. 12, pp. 2820–2828, 2008.
- [11] N.-S. Kim and J. M. Rabaey, "A high data-rate energy-efficient triple-channel UWB-based cognitive radio," *IEEE Journal of Solid-State Circuits*, vol. 51, no. 4, pp. 809–820, 2016.
- [12] E. Allebes, G. Singh, Y. He, E. Tiurin, P. Mateman, M. Ding, J. Dijkhuis, G.-J. v. Schaik, E. Bechthum, J. v. d. Heuvel, M. E. Soussi, A. Breeschoten, H. Korpela, Y.-H. Liu, and C. Bachmann, "21.2 A 3-to-10GHz 180pJ/b IEEE802.15.4z/4a IR-UWB coherent polar transmitter in 28nm CMOS with asynchronous amplitude pulse-shaping and injection-locked phase modulation," in *2021 IEEE International Solid-State Circuits Conference (ISSCC)*, vol. 64, 2021, pp. 304–306.

# 3

## **THE 8-PHASE DCO DESIGN**

There are several ways to generate an 8-phase LO clock signal for 8PSK modulation. One is to interpolate the LO directly in the PA, which simulation shows is not power-efficient. Another is to implement a 4-phase oscillator followed by four 2-to-1 clock interpolators, as shown in Fig 3.1a. Two clock signals with different phases, namely Clock1 and Clock2, are fed to identical inverters that share the same output node. Ideally, the output signal (Clock1+Clock2)/2 lies precisely halfway between the two inputs. However, if the first input finishes its low-to-high transition before the second input rises, in other words, if the time difference t2-t1 is too large, the interpolated output may exhibit a "kink" and suffer from higher jitter and a slower slope. As a result, interpolating between two 8 GHz clocks separated by 90 degrees is impractical, since the transition times can hardly fulfil this timing requirement. Other structures, such as current-starved analog interpolators, are not preferable due to their mW-level static power consumption [1]. Therefore, to achieve power-efficient 8PSK modulation, the phases in this design are generated directly by the oscillator without extra circuitry.



(a) Simple concept of a clock interpolator.

(b) Output transition of a clock interpolator if the inputs have slow rise time.

Figure 3.1: A simple clock interpolator [2].
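The timing argument behind the kink can be made concrete with a rough check: interpolation stays smooth only while the first input is still in transition when the second one starts, so the input transition time must exceed the offset t2-t1. The sketch below evaluates that offset for two 8 GHz clocks; the 20 ps and 40 ps rise times are hypothetical values chosen for illustration, not simulation results.

```python
def phase_offset_ps(phase_deg: float, freq_hz: float) -> float:
    """Time offset between two clocks separated by phase_deg."""
    return phase_deg / 360.0 / freq_hz * 1e12


def interpolates_cleanly(rise_time_ps: float, phase_deg: float, freq_hz: float) -> bool:
    """True if the first edge is still in transition when the second one starts."""
    return rise_time_ps > phase_offset_ps(phase_deg, freq_hz)


offset_ps = phase_offset_ps(90.0, 8e9)
print(f"t2 - t1 for 90 degrees at 8 GHz: {offset_ps:.2f} ps")

# Hypothetical input rise times, for illustration only.
for rise_ps in (20.0, 40.0):
    ok = interpolates_cleanly(rise_ps, 90.0, 8e9)
    print(f"rise time {rise_ps:.0f} ps -> {'smooth' if ok else 'kink expected'}")
```

In this simplified model the inputs would need edges slower than a quarter of the 125 ps clock period, which illustrates why the interpolator approach is set aside.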

Compared with the LC oscillator, the ring oscillator has a lower quality factor and worse phase noise performance. However, the ring oscillator is chosen because it occupies a smaller area and, more importantly, provides 8-phase outputs directly without a delay-locked loop (DLL), which would increase the design complexity significantly. The poor phase noise performance can be improved to a reasonable level by the injection locking technique. In a traditional ring oscillator, the oscillation frequency can be calculated by [2]:

$$f = \frac{1}{2 \times N \times T_D} \tag{3.1}$$

Where f is the oscillation frequency, N is the number of stages in the ring oscillator, and $T_D$ is the delay of each stage. The formula indicates that adding more stages for more output phases decreases the operating frequency. Therefore, the key challenge and primary goal of this ring oscillator design is to provide the required 8 output phases and a high-frequency tuning range (6 to 8 GHz) simultaneously.
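To get a feel for the delay budget implied by Equation 3.1, the sketch below inverts the formula for a four-stage ring, the stage count of the pseudo-differential core described in Section 3.3. The function names are illustrative, and the ideal formula ignores the effect of negative skewing.

```python
def ring_osc_freq_hz(n_stages: int, stage_delay_s: float) -> float:
    """Oscillation frequency of an N-stage ring oscillator (Equation 3.1)."""
    return 1.0 / (2 * n_stages * stage_delay_s)


def required_stage_delay_s(n_stages: int, freq_hz: float) -> float:
    """Stage delay needed to reach freq_hz (Equation 3.1 inverted)."""
    return 1.0 / (2 * n_stages * freq_hz)


# Band edges from the text plus the 10.4 GHz simulation target.
for f_target in (6e9, 8e9, 10.4e9):
    t_d = required_stage_delay_s(4, f_target)
    print(f"{f_target / 1e9:4.1f} GHz -> stage delay {t_d * 1e12:.2f} ps")
```

At 8 GHz each stage may contribute only about 15.6 ps, which shows how little timing headroom is left once layout parasitics are included.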

Furthermore, in such a high-frequency 8PSK modulation, the transmitted SNR depends on the DCO phase mismatch and phase noise. Severe phase mismatch and phase noise degrade the error vector magnitude (EVM) of the received signal and raise its BER significantly. The phase error should stay below 5.7° to achieve an EVM better than -20 dB for the 8PSK modulation. To guarantee reasonable phase noise performance, the injection locking technique is implemented in this design. To limit the phase mismatch, the layout is made as symmetrical as possible, and a load-balancing module is applied to avoid mismatched load capacitance on the 8-phase RF clock signals. The targeted DCO specifications are shown in Table 3.1.

| Parameter                       | Target   |
|---------------------------------|----------|
| Supply (V)                      | 0.9      |
| Total DC power consumption (mW) | 3        |
| Tuning range (GHz)              | 5.5-10.4 |
| Maximum phase mismatch (°)      | 5.7      |
| Parasitic cap (fF)              | 8.5-9.5  |
| Output duty cycle (%)           | 50       |
| FoM @ 10 MHz (dB)               | 150-160  |

Table 3.1: Design specifications of the DCO.
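The 5.7° phase mismatch limit can be related to the -20 dB EVM target with a simple geometric model. The sketch assumes that phase error is the only impairment and that the symbol has unit amplitude, so the error vector is the chord between the ideal and the rotated constellation point. This is an illustrative model, not necessarily the derivation used for the specification.

```python
from math import asin, degrees


def evm_db_to_ratio(evm_db: float) -> float:
    """Convert EVM in dB to a linear amplitude ratio."""
    return 10 ** (evm_db / 20)


def max_phase_error_deg(evm_db: float) -> float:
    """Phase error that alone produces the given EVM.

    For a unit-amplitude symbol rotated by theta, the error vector length
    is the chord 2 * sin(theta / 2).
    """
    return degrees(2 * asin(evm_db_to_ratio(evm_db) / 2))


phase_limit_deg = max_phase_error_deg(-20.0)
print(f"phase error for -20 dB EVM: {phase_limit_deg:.2f} degrees")
```

The result is far below the 22.5° decision boundary of 8PSK, so the specification leaves margin for noise and amplitude errors.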

To overcome PVT variations and the coupling capacitance that may affect the DCO output frequency, a 30% margin in the frequency tuning range is reserved. Therefore, the targeted maximum output frequency in simulation is 10.4 GHz. The parasitic cap in the table is the total assumed coupling and interconnect capacitance from the layout seen by each DCO oscillation node. In all schematic-level simulations, these capacitances are added to each node to bridge the gap between post-layout and schematic-level simulation. To make the data throughput as high as possible, the duty-cycling technique, which is widely used to reduce standby power consumption, is not used here. Therefore, fast settling of the DCO is unnecessary for the design.

This chapter is organised as follows: the top schematic of the DCO is introduced in Section 3.1, and the principle of the negative skewed ring oscillator is discussed in Section 3.2. Sections 3.3, 3.4 and 3.5 focus on the design of the DCO core, the injection locking buffers and the load-balancing buffers, respectively. The last part of this chapter, Section 3.6, presents the layout and simulations of the DCO.

#### **3.1.** TOP SCHEMATIC OF THE DCO

Fig 3.2 shows the top-level schematic of the injection-locked DCO (IR-DCO). Two separate power sources, VDD\_DCO and VDD\_DELAY, are connected to the nominal 0.9 V power supply and the current bank output, respectively. The nominal supply powers the injection locking and output buffers. The current bank provides a tunable supply current for the delay cells, which changes their delay and thereby the DCO frequency. Capacitor tuning is not used because the parasitic and off-state capacitances would lower the oscillation frequency. Compared with other tuning techniques, current bank tuning makes the oscillation frequency less sensitive to supply noise, since the high impedance of the current bank blocks the noise.

The 5-bit binary PVT bank and the 6-bit binary acquisition (ACQ) bank tune the DCO frequency coarsely, while the 32-bit tracking (TRK) bank is designed for fine-tuning. All these banks are implemented by parallel PMOS current sources. Moreover, some current sources in the PVT bank are always turned on to provide the minimum current. The current sources of the different banks are sized individually to provide different frequency tuning steps. Table 3.2 summarizes the minimum step of each current bank from the post-layout simulation.



Figure 3.2: The top-level schematic of the IR-DCO.

The injection locking pulse generator (INJ PG) is fed from an external system clock to generate pulses every 4.2 ns with adjustable pulse width. These pulses are the references for the injection locking buffers to clean up the phase noise periodically. The details of the injection locking technique are presented in Section 3.4.

|              | Current step (uA) | Maximum current (uA) |
|--------------|-------------------|----------------------|
| PVT bank     | 50                | 2376                 |
| ACQ bank     | 3                 | 183                  |
| TRK bank MSB | 0.037             | 0.529                |
| TRK bank LSB | 0.0023            | 0.037                |

Table 3.2: Simulated steps of current bank.

#### **3.2.** THE NEGATIVE SKEWED RING OSCILLATOR

Conventional CMOS ring oscillators are widely used for multi-phase clock generation because of their simple structure, smaller layout area and larger tuning range compared with the LC oscillator [3]. However, it is difficult to maintain a high operating frequency and a multi-phase output simultaneously (Equation 3.1), which is the main obstacle to high-frequency 8PSK modulation in this design. Therefore, the negative skewed technique, a promising approach for reducing the cell delay and hence increasing the maximum achievable output frequency, is employed in the DCO design.

Fig 3.3 shows the basic concept of the negative skewed delay cell, which can be represented by a CMOS inverter with negatively skewed input to PMOS so that the input to the PMOS comes earlier than the NMOS [4].



Figure 3.3: Concept of negative skewed delay cell.

Generally, due to the negatively skewed input, the PMOS is turned on prematurely, compensating for its slow low-to-high transition and therefore leading to a higher achievable oscillation frequency than in the conventional ring oscillator. Fig 3.3b shows the output waveform of the delay cell when a square-wave input is applied. Before t = T1, only the PMOS is on, so the output voltage is pulled up to the supply level. For T1 < t < T2, the output is floating and hence holds the previous node voltage. At t = T2, only the NMOS is on, so the output is pulled down afterwards. At t = T3, the NMOS and PMOS are both turned on for a duration $\tau$, pre-charging the output node to a level that depends on the relative strength of the NMOS and PMOS. After t = T4, the NMOS is turned off, and the output is pulled up to the supply by the PMOS. Based on this analysis, the pre-charge stage in T3 < t < T4 accelerates the low-to-high transition, so the overall delay of the inverter is reduced compared with a conventional delay cell. A more detailed comparison of the delay of the conventional and the negatively skewed delay cell is given in [5].

#### **3.3.** THE DCO CORE DESIGN



Figure 3.4: 4 delay cells connected in a negatively skewed ring scheme to form the DCO core.

Fig 3.4 depicts the high-level DCO core schematic, consisting of 4 delay cell units connected in a negatively skewed structure to achieve high operation frequency. This pseudo-differential negative skewed ring oscillator can provide an 8-phase clock without using other subblocks.

The direct output voltage from the DCO is not rail-to-rail because the current bank consumes voltage headroom. Besides, as shown in Fig 3.3b, the duty cycle of the waveform at the delay cell output is not 50% due to the negatively skewed scheme. Furthermore, the amplitude jump caused by the injection locking introduces amplitude noise that must be filtered out to prevent degraded PAM modulation.

Therefore, 8 output buffers with feedback duty cycle correction, shown in Fig 3.5, are implemented at the oscillation nodes to correct the duty cycle errors and guarantee the rail-to-rail swing. These buffers are chosen because they occupy only a small area and have low power consumption. The buffers are placed right beside the delay cells to avoid extra coupling capacitance caused by routing in the layout.

Figure 3.5: The DCO buffer.

One downside of the negative skewed ring oscillator is a larger short-circuit power consumption, caused by the simultaneous turn-on of the NMOS and PMOS in the delay cell. The delay cell uses this time window $\tau$ to pre-charge the output node to a level set by the P-N ratio. An excessive $\tau$ not only entails greater power consumption but also decreases the speed. This delay should therefore be as short as possible to reduce the time during which a direct path from supply to ground exists. As shown in Fig 3.4, the negative delay is obtained by driving the PMOS input one phase ahead of the NMOS input. Ideally, the direct current path from supply to ground exists for only one eighth of the oscillation period, making it a reasonable cost for the higher speed.

Fig 3.6 shows an improvement of up to 30% in operating frequency of the negative skewed oscillator of Fig 3.4 over the conventional ring oscillator when sweeping the PVT bank code index. The two ring oscillators have the same structure and identical delay cells, and share the same current bank. The only difference is that in the negative skewed ring oscillator the PMOS input is one phase ahead of the NMOS input, while in the conventional ring oscillator both the PMOS and the NMOS are driven by the same phase. The negative skewed ring oscillator can operate up to 10.6 GHz, while the conventional one only reaches 7.6 GHz and thus fails to meet our design target.

Fig 3.7 illustrates the schematic of the DCO delay cell and the injection locking buffer. The two differential outputs are connected to the output buffer for 50% duty cycle correction and level recovery. In addition to the two inverters, $M_{p3}$, $M_{p4}$, $M_{n3}$ and $M_{n4}$ comprise two CMOS cross-coupled pairs (XPCs) that provide positive feedback for the ring oscillator, making the transition even faster.

Fig 3.8 depicts, by large-signal analysis, how the XPCs accelerate the high-to-low transition in one delay cell. In region 1, the INN- node is charged by the previous stage to a level higher than $V_{tn}$. Meanwhile, $M_{n4}$ is also turned on due to its high gate voltage; together with $M_{n1}$, it enhances the pull-down strength for node OUTN in region 2. Consequently, this node finishes its transition even before $M_{n1}$ is completely turned on, lowering the cell delay. A similarly accelerated low-to-high transition is produced by the PMOS XPCs.

Figure 3.6: Comparison of the operating frequency between conventional and negative skewed ring oscillator.

Figure 3.7: Schematic of the delay cell unit and the equivalent small signal circuit.

Figure 3.8: Output and input waveforms for a delay cell in the negative skewed ring oscillator.

A small-signal model for one delay cell is shown in Fig 3.7. The small-signal drain-to-source impedance of an XPC is $\frac{-2}{g_m}$; therefore, the total single-ended signal gain is:

$$A_{v} = \frac{g_{mn1} + g_{mp1}}{g_{on1} + g_{op1} - g_{mp3} - g_{mn3}} \tag{3.2}$$

Where $g_{mn}$ and $g_{mp}$ are the transconductances of the NMOS and PMOS, and $g_{on}$ and $g_{op}$ are the channel conductances of the NMOS and PMOS, respectively. Compared with inverter-only delay cells, the negative conductances $-g_{mp3}$ and $-g_{mn3}$ increase the small-signal gain, enabling a higher oscillation frequency.
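The effect of the cross-coupled pairs on Equation 3.2 can be illustrated numerically. All conductance values in the sketch below are hypothetical round numbers chosen only to show the trend; they are not extracted from the design.

```python
def delay_cell_gain(gm_n1, gm_p1, go_n1, go_p1, gm_n3=0.0, gm_p3=0.0):
    """Single-ended small-signal gain of one delay cell (Equation 3.2).

    gm_n3 and gm_p3 are the XPC transconductances; leaving them at zero
    models an inverter-only delay cell.
    """
    g_out = go_n1 + go_p1 - gm_p3 - gm_n3  # net output conductance
    return (gm_n1 + gm_p1) / g_out


# Hypothetical values in siemens, for illustration only.
gain_plain = delay_cell_gain(1e-3, 1e-3, 0.2e-3, 0.2e-3)
gain_xpc = delay_cell_gain(1e-3, 1e-3, 0.2e-3, 0.2e-3, gm_n3=0.1e-3, gm_p3=0.1e-3)

print(f"inverter-only gain: {gain_plain:.1f}")
print(f"gain with XPC:      {gain_xpc:.1f}")
```

With these example numbers the XPC halves the net output conductance and therefore doubles the gain. As the XPC transconductance approaches the channel conductance, the denominator tends to zero and the small-signal model stops being meaningful.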

Note that adding the XPCs turns this circuit into ratioed logic. The XPC cannot be made arbitrarily large; otherwise, the output node voltage can no longer be changed against the strength of the XPC, and the oscillation fails.

Moreover, a larger XPC also adds more capacitance at the output node, since the two are directly connected. Therefore, an excessively large XPC reduces the oscillation frequency instead of increasing it. In this design, the XPC is sized $3 \times$ smaller than the delay cell inverters to achieve a good trade-off. The size ratio of $M_n$ and $M_p$ is 2 to have an equal pull-down and pull-up current.

The flicker-noise-induced phase noise of a ring oscillator can be calculated by:

$$P_f = \frac{C_{ox}}{8MI_{DSAT}} \left(\frac{u_N K_{fN}}{L_N^2} + \frac{u_P K_{fP}}{L_P^2}\right) \left(\frac{f_0^2}{f^3}\right) \tag{3.3}$$

Where $C_{ox}$ is the gate-oxide capacitance per unit area, $u_N$ and $u_P$ are the charge-carrier mobilities of the NMOS and PMOS, *M* is the number of oscillator stages, and $K_{fN}$ and $K_{fP}$ are empirical coefficients that are independent of bias, fabricator and technology. $f_0$ is the oscillation frequency, *f* is the offset frequency, $L_N$ and $L_P$ are the channel lengths of the transistors, and $I_{DSAT}$ is the pull-up and pull-down current for velocity-saturated devices, calculated by:

$$I_{DSAT} = \frac{C_{ox}u}{2} \frac{W}{L} \left[(V_{DD} - V_T) V_{DSAT} - \frac{1}{2} V_{DSAT}^2\right] \tag{3.4}$$

Where $V_{DSAT}$ and $V_T$ are the velocity-saturation voltage and the threshold voltage of the device. Equations 3.3 and 3.4 guide the phase noise optimization: they describe the dependence of flicker noise on the constant parameters of the technology, i.e., $C_{ox}$, $u_N$, $u_P$, and on the design parameters, i.e., $I_{DSAT}$, $f_0$ and L. Even though a larger L reduces the pull-up and pull-down current of the ring oscillator proportionally, the phase noise still improves because of its inverse-quadratic dependence on L. One potential drawback of a larger L is the extra node capacitance introduced by the devices, but this increase is small and does not significantly affect the oscillation frequency compared with the interconnect capacitance from the layout. In this design, all transistor lengths in the delay cell are set to 50 nm instead of the minimum 30 nm to achieve a good trade-off between phase noise performance and operating frequency.
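The length trade-off can be quantified with a first-order scaling estimate based on Equations 3.3 and 3.4. The sketch assumes that W, the supply and threshold voltages, $f_0$ and the offset frequency stay fixed and that NMOS and PMOS share the same length. Holding $f_0$ fixed is an idealization, since the lower current also affects the frequency.

```python
from math import log10


def relative_flicker_phase_noise(length_nm: float, ref_length_nm: float) -> float:
    """Ratio P_f(L) / P_f(L_ref) implied by Equations 3.3 and 3.4.

    I_DSAT scales as 1/L (Equation 3.4) and the bracketed term of
    Equation 3.3 as 1/L^2; every other factor is assumed constant.
    """
    current_ratio = ref_length_nm / length_nm         # I_DSAT(L) / I_DSAT(L_ref)
    bracket_ratio = (ref_length_nm / length_nm) ** 2  # 1/L^2 term ratio
    return bracket_ratio / current_ratio


ratio = relative_flicker_phase_noise(50.0, 30.0)
improvement_db = -10.0 * log10(ratio)
print(f"P_f(50 nm) / P_f(30 nm) = {ratio:.2f}")
print(f"estimated improvement: {improvement_db:.1f} dB")
```

Under these assumptions the net dependence is 1/L, so moving from 30 nm to 50 nm yields roughly 2 dB lower flicker-induced phase noise.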

Equation 3.4 also indicates that the charge and discharge currents are higher for low threshold voltage (LVT) devices than the normal threshold voltage (NVT) devices. As a result, a higher output frequency can be attained with LVT devices. Fig 3.9a and Fig 3.9b compare the DCO that is made of LVT devices and NVT devices in terms of the output frequency and the power efficiency when sweeping the PVT bank code.

As shown in Fig 3.9a, the maximum output frequency of the DCO built from LVT delay cells is 10.8 GHz, compared with 9 GHz for the NVT DCO. The power efficiency, defined as the ratio of the output frequency to the DC power consumption, is also higher for the LVT devices. One potential disadvantage of LVT devices is the higher leakage power expected when the DCO is disabled. In simulation, the leakage power of the DCO core stays below the microwatt level for both LVT and NVT delay cells. Since duty cycling is not applied in this design, this leakage penalty can be neglected.



Figure 3.9: Output frequency and efficiency when sweeping the PVT code index.

#### **3.4.** INJECTION LOCKING

For a noiseless ring oscillator, the zero crossings are uniformly spaced in the time domain. However, such an ideal ring oscillator does not exist because the various noise sources in the circuit give rise to errors in the zero crossings [6]. The free-running ring oscillator therefore behaves as a phase error integrator, and the jitter accumulates indefinitely from cycle to cycle. In this project, the phase noise performance is vital to the 8PSK modulation quality, because the receiver fails to decode the 8PSK signal if the RMS phase error is too high. In other words, each point in the constellation diagram spreads into a long arc, which may overlap with the neighbouring phases and thus generate a large number of decoding errors.

The injection locking technique, which suppresses the accumulated zero-crossing time errors, is implemented in this design to correct the phase error. The DCO is realigned with a clean system reference clock, and ideally, the DCO node is "pulled" toward the correct reference level once the injection locking pulse arrives. The injection locking strength $\beta$, which ranges from 0 to 1, is defined in Fig 3.10. After realignment, the DCO phase is shifted by $-\beta \times \theta_e(n)$, where $\theta_e(n)$ is the phase difference between the reference and the free-running edge. When $\beta = 1$, the free-running edge is pulled to the reference immediately once the injection reference pulse arrives.

Figure 3.10: The definition of injection locking strength [6].
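The realignment rule above can be illustrated with a toy time-domain simulation. In this sketch each oscillator cycle adds independent Gaussian jitter, and every reference period the accumulated error is shifted by $-\beta$ times itself. The 34 cycles per reference period (about 8 GHz times 4.2 ns), the jitter magnitude and the function name are assumptions for illustration, not design data.

```python
import random


def simulate_phase_error(beta, cycles_per_ref=34, n_ref=2000, jitter_rms=1.0, seed=0):
    """Return the RMS accumulated phase error of a periodically realigned oscillator.

    The error is in arbitrary units. beta = 0 models the free-running
    oscillator and beta = 1 an ideal realignment to the reference edge.
    """
    rng = random.Random(seed)
    error = 0.0
    sum_sq = 0.0
    count = 0
    for _ in range(n_ref):
        for _ in range(cycles_per_ref):
            error += rng.gauss(0.0, jitter_rms)  # jitter accumulates each cycle
            sum_sq += error * error
            count += 1
        error -= beta * error  # realignment at the reference edge
    return (sum_sq / count) ** 0.5


rms_free = simulate_phase_error(0.0)
rms_half = simulate_phase_error(0.5)
rms_full = simulate_phase_error(1.0)
print(f"beta = 0.0: RMS error {rms_free:.1f}")
print(f"beta = 0.5: RMS error {rms_half:.1f}")
print(f"beta = 1.0: RMS error {rms_full:.1f}")
```

The free-running case behaves as a random walk whose error keeps growing, while any non-zero $\beta$ bounds the accumulated error, with stronger injection giving a tighter bound.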

From [7], the power transfer function of an integrator that is realigned with a clean reference every $T_{REF}$ seconds with reset strength $\beta$ can be represented by:

$$|H(j\omega)|^2 = \frac{1}{\omega^2} + \frac{\beta^2}{(\beta - 1 + \cos(\omega T_{REF}))^2 + \sin^2(\omega T_{REF})} \cdot \frac{1}{\omega^2} \left(1 - 2 \cdot \frac{\sin(\omega T_{REF})}{\omega T_{REF}}\right) \tag{3.5}$$

When $\beta = 0$, the injection locking is disabled and Equation 3.5 reduces to:

$$|H(j\omega)|^2 = \frac{1}{\omega^2} \tag{3.6}$$

In this scenario, the free-running DCO is a perfect noise integrator. Consequently, the zero-crossing time errors are converted into phase noise in the frequency domain through a transfer function proportional to $1/\omega^2$. When $\beta = 1$, Equation 3.5 reduces to:

$$|H(j\omega)|^2 = \frac{2}{\omega^2} \cdot \left(1 - \frac{\sin(\omega T_{REF})}{\omega T_{REF}}\right) \tag{3.7}$$

Fig 3.11 plots how the injection strength influences the power spectral density (PSD) of the phase noise. The injection locking acts as a first-order high-pass filter with a cutoff frequency of around 0.3 to 0.5 $f_{ref}$, depending on $\beta$, and effectively suppresses the in-band phase noise.
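Equation 3.5 and its two limiting cases can be checked numerically. The sketch below implements the power transfer function as a plain function; the 10 MHz offset frequency is chosen only for illustration, and the 4.2 ns reference period is the injection pulse period quoted in the text.

```python
from math import cos, log10, pi, sin


def realigned_power_tf(omega: float, beta: float, t_ref: float) -> float:
    """|H(j*omega)|^2 of a periodically realigned integrator (Equation 3.5)."""
    x = omega * t_ref
    realign = beta ** 2 / ((beta - 1.0 + cos(x)) ** 2 + sin(x) ** 2)
    return (1.0 + realign * (1.0 - 2.0 * sin(x) / x)) / omega ** 2


T_REF = 4.2e-9           # injection period from the text
omega = 2 * pi * 10e6    # 10 MHz offset, illustrative choice

free_running = realigned_power_tf(omega, 0.0, T_REF)  # Equation 3.6
fully_locked = realigned_power_tf(omega, 1.0, T_REF)  # Equation 3.7

suppression_db = 10 * log10(free_running / fully_locked)
print(f"suppression at 10 MHz offset for beta = 1: {suppression_db:.1f} dB")

for beta in (0.25, 0.5, 0.75):
    rel_db = 10 * log10(realigned_power_tf(omega, beta, T_REF) / free_running)
    print(f"beta = {beta:.2f}: {rel_db:.1f} dB relative to free running")
```

For $\beta = 0$ the function returns $1/\omega^2$ and for $\beta = 1$ it matches Equation 3.7, which confirms that the three expressions are mutually consistent.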

However, it is difficult to achieve $\beta = 1$ in a real circuit. The injection locking buffer has finite strength, so the realignment transition takes a finite time. Besides, other imperfections such as PVT variations and noise also inevitably affect the injection locking behaviour. To suppress the phase noise, two oscillation nodes, OUTB\_0 and OUT\_0 in Fig 3.4, are injected by two injection locking buffers. At each rising edge of the 476 MHz system clock, the pulse generator generates two inverted pulses with adjustable pulse width for injection locking. All other nodes are connected to identical injection locking buffers whose inputs are tied to ground or supply; these act as dummy logic to guarantee that all nodes see the same load capacitance. The buffer is powered from the nominal supply for stronger pulling strength, and the resulting amplitude jump is filtered out by the output buffer.

Figure 3.11: Calculated PSD of phase deviation [7].

The injection locking strength $\beta$ is determined by the size ratio of $M_{ninj}$ and $M_{n1}$. When the injection locking pulse arrives, the injection buffer and the delay cell inverter fight against each other, so a larger $M_{ninj}$ pulls the output closer to the reference edge and corresponds to a larger $\beta$.

However, this ratio should be chosen carefully since these buffers are connected directly to the oscillating node. The maximum frequency of the oscillator is fairly sensitive to this node capacitance. Fig 3.12 shows how the ratio S affects the DCO output frequency. S is defined as the size ratio of $M_{ninj}$ and $M_{n1}$. An over-designed S slows down the



Figure 3.12: The output frequency affected by the size ratio of injection pairs and delay cells.

oscillator significantly: with a ratio of 3, the maximum frequency decreases by 10% in simulation, violating our primary goal. Therefore, S is set to 1 to balance the operating frequency against the phase noise performance. The reduction in injection locking strength caused by the small injection locking buffer can be compensated by widening the injection locking pulse, which is done by the pulse generator.

### **3.5.** The load-balancing

The TX top schematic shows that all 8 buffered DCO outputs are connected to the 8-to-1 phase mux, which selects one of them for the 8PSK modulation. On top of that, one of the DCO outputs (phase 0) is fed to the delay generator, which delays the input pulse based on the DCO clock period. Furthermore, another DCO output (phase 1) is connected to a counter in the transmitter for future design and implementation. Consequently, the load capacitance of the DCO outputs is unbalanced. Moreover, the extra wire length needed to connect these two phases to the delay generator and the counter inevitably introduces additional unwanted coupling capacitance.

For 8PSK modulation at 8 GHz, adjacent phases are separated by only about 15 ps. Therefore, even an fF-level imbalance in load capacitance is easily converted into a noticeable delay imbalance and, consequently, phase mismatch. To mitigate this issue, the



load-balancing buffers described in Fig 3.13 are employed.
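To put the timing budget in perspective, the short sketch below converts between time skew and phase error at 8 GHz. The helper names are hypothetical, and the 1 ps skew is an illustrative value rather than a simulated one.

```python
def phase_spacing_ps(f_osc_hz, n_phases=8):
    """Time separation between adjacent phases of an n-phase clock, in ps."""
    period_ps = 1e12 / f_osc_hz
    return period_ps / n_phases


def skew_to_phase_deg(skew_ps, f_osc_hz):
    """Phase error in degrees caused by a delay skew given in ps."""
    return 360.0 * f_osc_hz * skew_ps * 1e-12


print(f"adjacent-phase spacing at 8 GHz: {phase_spacing_ps(8e9):.3f} ps")
print(f"phase error for 1 ps of skew:    {skew_to_phase_deg(1.0, 8e9):.2f} deg")
```

With a spacing of roughly 15.6 ps between phases, a single picosecond of unbalanced delay already costs close to 3° of phase error, which is why the load seen by every DCO output has to be equalized.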

Figure 3.13: The load-balancing buffers.

The load-balancing buffers are composed of 32 inverters arranged in 2 stages. In the first stage, each DCO output phase is connected to 2 inverters, A and B. In the second stage, the outputs of the B inverters are connected to the phase mux, while the outputs of the A inverters are left open for phases 7 to 2 and fed to the logic for phases 1 and 0. The output loads of the B inverters are hence isolated from the extra load, so that all phases see a balanced load on the path to the phase mux.

### **3.6.** THE DCO LAYOUT DESIGN AND ITS SIMULATION RESULTS

Fig 3.14 shows the layout of the DCO core. Because of the negatively skewed scheme, each delay cell is connected to two subsequent stages to generate two time-separated inputs. To avoid unequal trace lengths, the layout is designed to be as symmetric as possible. The four delay cells are placed symmetrically, and all the connections between the inputs and outputs of the delay cells are located in the middle to reduce the wiring length and capacitance. The injection locking signal enters from the top, and the DCO output buffer is placed right next to the delay cell. Only the higher metal layers (metal three to six) are used for signal routing to reduce the coupling capacitance.

Another improvement caused by the negative skew scheme is the low phase noise. The phase noise generated by the device noise can be estimated by [8]:

$$PN = \frac{T_{ON}}{T} \cdot PN_{cell} \tag{3.8}$$

where $T_{ON}$ is the transition time, T is the clock period, and $PN_{cell}$ is the noise contributed by the devices in the delay cell. The negative skew scheme improves the phase noise performance by lowering the transition time. Fig 3.15 plots the post-layout simulation result (with extracted resistance and capacitance) of the phase noise of the free-running DCO. Its Figure of Merit (FoM) at frequency offset $\Delta f$ is defined as follows [9]:



Figure 3.14: The DCO core top layout.

$$FoM(\Delta f) = PN(\Delta f) + 20\log_{10}\left(\frac{\Delta f}{f_{osc}}\right) + 10\log_{10}\left(\frac{P_{DC}}{1\,\text{mW}}\right) \tag{3.9}$$

Where  $P_{DC}$  is the DC power consumption and  $f_{osc}$  is the oscillation frequency. The FoM of this oscillator is -154.9 dBc/Hz at 10 MHz offset. With the injection locking, the phase noise performance is expected to be better during the measurement.
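Equation 3.9 is straightforward to evaluate numerically. The following is a minimal sketch with a hypothetical function name; the input values in the example call are placeholders chosen only to make the arithmetic easy to follow and are not results from this work.

```python
import math


def oscillator_fom(pn_dbc_hz, offset_hz, f_osc_hz, p_dc_w):
    """Figure of Merit per Equation 3.9, in dBc/Hz."""
    return (pn_dbc_hz
            + 20.0 * math.log10(offset_hz / f_osc_hz)
            + 10.0 * math.log10(p_dc_w / 1e-3))


# Placeholder inputs (NOT simulation results): -100 dBc/Hz at 1 MHz offset,
# 1 GHz oscillation frequency, 1 mW DC power.
example_fom = oscillator_fom(-100.0, 1e6, 1e9, 1e-3)
print(f"example FoM = {example_fom:.1f} dBc/Hz")
```

A lower (more negative) FoM is better: the frequency term rewards a high oscillation frequency relative to the offset, while the power term penalizes DC consumption above 1 mW.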



Figure 3.15: Simulated phase noise of the free-running DCO at 9.67 GHz.

The Monte Carlo simulation result for the phase mismatch, based on 200 simulated points, is plotted in Fig 3.16. The mean of the distribution is -1.2% and the standard deviation is 5.7%, both normalized to 45°. For a normal distribution, about 95% of the simulation points fall within two standard deviations of the mean, which is equivalent to a phase mismatch of -5.67° to 4.58°. The lower bound of this range just fulfils the phase mismatch requirement of 5.7°.
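The two-sigma interval quoted above can be reproduced from the mean and standard deviation. The sketch below uses a hypothetical helper and assumes the percentages are normalized to the 45° phase step, as stated in the text.

```python
def two_sigma_range_deg(mean_pct, sigma_pct, unit_deg=45.0):
    """Return the (lower, upper) mean +/- 2*sigma bounds converted to degrees."""
    lower = (mean_pct - 2.0 * sigma_pct) / 100.0 * unit_deg
    upper = (mean_pct + 2.0 * sigma_pct) / 100.0 * unit_deg
    return lower, upper


lower_deg, upper_deg = two_sigma_range_deg(mean_pct=-1.2, sigma_pct=5.7)
print(f"two-sigma phase mismatch: {lower_deg:.2f} deg to {upper_deg:.2f} deg")
```

This gives approximately -5.67° and 4.59°; the small difference from the quoted upper bound is presumably due to rounding of the reported mean and standard deviation.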



Figure 3.16: Monte Carlo simulation result for phase mismatch.

### REFERENCES

- [1] G. Souliotis, C. Laoudias, F. Plessas, and N. Terzopoulos, "Phase interpolator with improved linearity," *Circuits, Systems, and Signal Processing*, vol. 35, no. 2, pp. 367–383, 2016.
- [2] B. Razavi, *Design of CMOS Phase-Locked Loops: From Circuit Level to Architecture Level.* Cambridge University Press, 2020.
- [3] M.-T. Hsieh and G. Sobelman, "Comparison of lc and ring vcos for plls in a 90 nm digital cmos process," 01 2006.
- [4] S.-J. Lee, B. Kim, and K. Lee, "A novel high-speed ring oscillator for multiphase clock generation using negative skewed delay scheme," *IEEE Journal of Solid-State Circuits*, vol. 32, no. 2, pp. 289–291, 1997.
- [5] J.-K. Lee, S. Yi, H.-S. Ahn, and H.-G. Jeong, "A large-signal analysis for a ring oscillator with negative skewed delay," 01 2010, pp. 105–108.
- [6] S. Ye, L. Jansson, and I. Galton, "A multiple-crystal interface pll with vco realignment to reduce phase noise," *IEEE Journal of Solid-State Circuits*, vol. 37, no. 12, pp. 1795–1803, 2002.
- [7] S. L. J. Gierkink, "Low-spur, low-phase-noise clock multiplier based on a combination of pll and recirculating dll with dual-pulse ring oscillator and self-correcting charge pump," *IEEE Journal of Solid-State Circuits*, vol. 43, no. 12, pp. 2967–2976, 2008.

- [8] M. Song, I. Jung, S. Pamarti, and C. Kim, "A 2.4 ghz 0.1-fref-bandwidth all-digital phase-locked loop with delay-cell-less tdc," *IEEE Transactions on Circuits and Systems I: Regular Papers*, vol. 60, no. 12, pp. 3145–3151, 2013.
- [9] J. P. Caram, J. Galloway, and J. S. Kenney, "Voltage-controlled ring oscillator with fom improvement by inductive loading," *IEEE Microwave and Wireless Components Letters*, vol. 29, no. 2, pp. 122–124, 2019.

# 4

### THE POWER AMPLIFIER AND MATCHING NETWORK DESIGN

The stringent spectral mask requirements imposed by FCC regulations and the high absorption loss complicate the PA design. The PA must provide high output power to overcome the high absorption loss and sufficient linearity to guarantee spectral purity. This design implements a switched-capacitor PA (SCPA) with a tunable on-chip matching network to boost the radiated output power. The specifications for the PA and matching network are listed in Table 4.1. This chapter is organized as follows: the principle of the SCPA is

| Parameter                                              | Value |
|--------------------------------------------------------|-------|
| Supply (V)                                             |       |
| DC power consumption in continuous-wave (CW) mode (mW) |       |
| Maximum output power in CW mode (dBm)                  |       |
| Operation frequency (GHz)                              |       |
| Matching network bandwidth (MHz)                       |       |
| System efficiency (%)                                  | 25    |

Table 4.1: Design specifications of the SCPA and matching network.

introduced in Section 4.1, and the PA design is described in Section 4.2. Section 4.3 focuses on the matching network design. The layout design and post-layout simulation results are presented in Section 4.4.

### 4.1. THE SWITCHED-CAPACITOR PA

Since the PA is the dominant energy consumer in a transmitter, a PA with high average efficiency extends battery life and gives users greater mobility [1]. The SCPA comprises an array of capacitors that are switched at the RF clock frequency between ground and the supply voltage [2]. Benefiting from CMOS process scaling, the SCPA offers good gain resolution for PAM modulation through an accurate capacitance ratio. Moreover, because the inputs contain amplitude and phase information, direct digital-to-RF conversion is achieved efficiently without a mixer.

In contrast, linear analog PAs (e.g., class-A, -AB) are not energy-efficient since their DC power consumption stays constant even when the output power varies, and their efficiency deteriorates further with modulation schemes that have a larger PAPR [2]. The SCPA, on the other hand, behaves like a class-D amplifier at peak power level, and its DC power consumption scales down when the output power is lower [3]. As its name suggests, the SCPA is composed of an array of precise switched capacitors and a matching network, as depicted in Fig 4.1. The top plates of the N identical capacitors are connected together, while each bottom plate is connected to an RF switch. Selected bottom plates are switched at the RF clock frequency between $V_{DD}$ and $V_{gnd}$, depending on the input amplitude code [2].



Figure 4.1: A simplified SCPA.

Fig 4.2 plots the simplified RF switch (unit cell) used in this design, which is connected to the bottom plate of the capacitor array. The switch comprises a control logic gate, which can be simplified as an "AND" gate, and an output driver. When the unit cell is selected (Amplitude Code = 1), the driver output toggles as a square wave between $V_{DD}$ and $V_{gnd}$ at the RF frequency. Conversely, the output is tied to AC ground when the unit cell is deselected (Amplitude Code = 0). Fig 4.3 illustrates the equivalent circuit seen from the matching network. The switching array can be represented by a controlled voltage source and an output impedance. The PA output voltage is therefore proportional to the normalized input code n/N, and the maximum output voltage is reached when n/N equals 1, which corresponds to the maximum input digital code. The total output capacitance of the SCPA is fixed and equal to [2]:

$$C_{total} = N \cdot C_u. \tag{4.1}$$

where *N* is the number of capacitors in the SCPA, and $C_u$ is the value of each identical unit capacitance. Therefore, the output impedance is code-independent, which facilitates the matching network design. The inductor in the matching network is used to



Figure 4.2: A single PA unit.

resonate out the output capacitors at the operational frequencies. Ideally, the output on the load resistance only contains the fundamental amplitude of the output signal while other harmonics are suppressed. Thus, the output voltage amplitude on the load resistance can be expressed by [2]:

$$V_{out} = \frac{2}{\pi} \left(\frac{n}{N}\right) V_{DD} \tag{4.2}$$

where *n* is the number of selected capacitors, *N* is the total number of capacitors, and $\frac{2}{\pi}$ is the Fourier-series coefficient of the fundamental tone of the rectangular signal. Accordingly, the output power can be calculated by [2]:

$$P_{out} = \frac{2}{\pi^2} \left(\frac{n}{N}\right)^2 \frac{V_{DD}^2}{R_{load}} \tag{4.3}$$

where $R_{load}$ is the output load resistance. The SCPA behaves similarly to dynamic logic switched between supply and ground. The dynamic power consumption of such a circuit can be expressed by:

$$P_{SC} = C_{load} \cdot V_{DD}^2 \cdot f \tag{4.4}$$

Where f is the switching frequency, and  $C_{load}$  is the total node capacitance. For an SCPA, the load capacitance seen from the matching network is the series combination of two capacitors [2]:

$$C_{load} = \frac{n(N-n)}{N^2} \cdot C_{total} \tag{4.5}$$

Combining equations 4.3, 4.4 and 4.5, the ideal power added efficiency (PAE) of the SCPA can be calculated by:

$$PAE_{ideal} = \frac{P_{out}}{P_{out} + P_{SC}} = \frac{4n^2}{4n^2 + \frac{\pi n(N-n)}{Q_{load}}} \tag{4.6}$$

and the quality factor  $Q_{load}$  is defined as [2]:

$$Q_{load} = \frac{1}{2\pi f C_{total} R_{load}} \tag{4.7}$$

Equation 4.6 indicates that the PAE reaches 100% at the maximum input code n = N. However, this conclusion holds only in the ideal case. In reality, several limitations degrade the PAE. First, the on-resistance of the switches causes extra power loss. Besides, the parasitics of the switches and the coupling capacitance from the layout increase the dynamic power. Finally, the PAE improves with the Q factor, while perfect matching is infeasible in an actual design since the on-chip Q factor is limited to 2 to 3 for CMOS implementations.
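The closed-form PAE can be cross-checked numerically against the ratio $P_{out}/(P_{out}+P_{SC})$. The sketch below assumes the standard dynamic power expression $C_{load} V_{DD}^2 f$ for the switching loss; the function names and the component values (supply, load, total capacitance, number of cells) are illustrative assumptions and not the design values of this work.

```python
import math


def pae_closed_form(n, n_total, q_load):
    """Ideal PAE from the closed-form expression (Equation 4.6)."""
    return 4.0 * n**2 / (4.0 * n**2 + math.pi * n * (n_total - n) / q_load)


def pae_from_powers(n, n_total, v_dd, r_load, c_total, f):
    """Ideal PAE computed directly as P_out / (P_out + P_SC)."""
    p_out = 2.0 / math.pi**2 * (n / n_total) ** 2 * v_dd**2 / r_load
    c_load = n * (n_total - n) / n_total**2 * c_total  # series combination
    p_sc = c_load * v_dd**2 * f                        # dynamic C*V^2*f loss
    return p_out / (p_out + p_sc)


# Illustrative values only (assumptions, not taken from the design).
N_TOTAL, V_DD, R_LOAD, C_TOTAL, F_RF = 8, 0.9, 50.0, 352e-15, 8e9
Q_LOAD = 1.0 / (2.0 * math.pi * F_RF * C_TOTAL * R_LOAD)

for n in range(1, N_TOTAL + 1):
    closed = pae_closed_form(n, N_TOTAL, Q_LOAD)
    direct = pae_from_powers(n, N_TOTAL, V_DD, R_LOAD, C_TOTAL, F_RF)
    print(f"n = {n}: closed form = {closed:.4f}, direct = {direct:.4f}")
```

Both calculations agree for every code, and the efficiency rises monotonically towards 1 at n = N, where no capacitor remains tied to AC ground and the switched series capacitance vanishes.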



Figure 4.3: The equivalent circuit for the SCPA.

### 4.2. THE PA DESIGN

### **4.2.1.** TOP SCHEMATIC OF THE SCPA

An SCPA has been implemented in this design, as shown in Fig 4.4. To suppress undesired low-frequency signals ($\omega \ll \omega_0$) and the even harmonics of $\omega_0$, two single-ended PAs are capacitively combined, leading to a push-pull SCPA configuration [4].

The out-of-band high-frequency harmonics are not an issue since the on-chip bandpass matching network attenuates them significantly.

A single-ended SCPA consists of 8 identical unit cells corresponding to the 8-bit pulse-shaped output from the pulse shaper. Furthermore, one unit cell contains 4 identical



Figure 4.4: The capacitively coupling SCPA.

unary units controlled by the baseband amplitude modulation signal to realize the 4PAM modulation. Although higher resolution can be achieved by binary coding, it is not suitable for the staircase-shaped pulse input. One RF clock signal, selected by the phase mux according to the PSK modulation signal, is fed to all unit cells of one single-ended PA, while its complementary counterpart is fed to the second single-ended PA. The digital control logic in the PA is represented by an 'AND' gate and a buffer. A unary cell is enabled only when both the AM code and the PAM modulation signal are one, which recombines the phase and amplitude information. In all other cases, the output of the inverter is tied to AC ground.

### 4.2.2. DESIGN FOR A LINEAR PAM MODULATION

Although this SCPA operates both in the minimum (at the beginning of the pulse and PAM=1) and full power mode (at the peak of the pulse and PAM=4) during modulation, the peak power of the UWB pulse is more of interest because the receiver detects only the UWB pulse's peak power for PAM de-modulation.

Ideally, in CW mode, the PA output voltage amplitude is proportional to the 2-bit PAM code. More specifically, the output power should be 6 dB lower if the input code is halved. However, some imperfections that vary with the input PAM code give rise to AM-AM distortion, which deteriorates the PAM SNR [5].

The on-resistance of the MOSFETs generates extra power loss and degrades the output efficiency. Additionally, AM-AM distortion arises from the unequal on-resistances $R_p$ and $R_n$ of the PMOS and NMOS. The source impedance $R_s$ in Fig 4.3 is the non-ideal source resistance seen from the matching network. If all other imperfections are neglected, the source conductance caused by the MOSFET on-resistance can be calculated by:

$$Y_s = \frac{2n}{R_p + R_n} + \frac{N - n}{R_n} \tag{4.8}$$

where *n* is the number of bottom plates switching at the RF frequency, while the remaining N - n bottom plates are tied to ground. Consequently, the overall source conductance is that of the n operating switches in parallel, each with its averaged on-resistance, plus that of the N - n NMOS devices that pull the idle bottom plates to AC ground. The coefficient 2 in the first term averages the on-resistance of the CMOS switch when a 50% duty-cycle clock is applied. One important observation from equation 4.8 is that $R_s$ ($Y_s$) is code-dependent, i.e. it changes with *n*. When the PA operates in the full amplitude mode (n = N), the source conductance reduces to:

$$Y_s = \frac{2N}{R_p + R_n} \tag{4.9}$$

In this case, the source resistance is the average on-resistance of N switches in parallel. Comparing equations 4.8 and 4.9 shows that the two expressions are equivalent when $R_n$ is equal to $R_p$. In this case, the source impedance becomes independent of the input AM code, which avoids the AM-AM distortion caused by the on-resistance mismatch. As a result, the aspect ratio of $M_n$ and $M_p$ for the CMOS switch in Fig 4.2 should be chosen carefully to ensure equal on-resistances.
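The code dependence of the source conductance can be illustrated with a few lines of code. This is a minimal sketch of Equation 4.8 with a hypothetical function name; the resistance values are arbitrary and only serve to contrast the matched and mismatched cases.

```python
def source_conductance(n, n_total, r_p, r_n):
    """Source conductance per Equation 4.8, in siemens."""
    switching = 2.0 * n / (r_p + r_n)   # n cells toggling with 50% duty cycle
    grounded = (n_total - n) / r_n      # N - n cells held at AC ground by NMOS
    return switching + grounded


N_TOTAL = 8
for r_p, r_n in ((100.0, 100.0), (150.0, 100.0)):  # arbitrary example values
    values = [source_conductance(n, N_TOTAL, r_p, r_n) for n in range(N_TOTAL + 1)]
    spread = max(values) - min(values)
    print(f"R_p = {r_p:.0f} ohm, R_n = {r_n:.0f} ohm: spread of Y_s over code = {spread:.4f} S")
```

With equal on-resistances the conductance stays at N/R for every code, whereas a mismatch makes it drift with n, which is the mechanism behind the AM-AM distortion described above.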

The equivalent on-resistance of a MOSFET can be calculated by averaging the resistance of the device over the interval in which the MOSFET charges or discharges the node capacitance from the starting point (0 or supply) to the midpoint [6]. Fig 4.5 depicts the simulated on-resistance of the PMOS and NMOS versus the width, with their lengths fixed in this process. Based on the simulation results, a $W_p/W_n$ ratio of 1.2 achieves equal on-resistance. Since the on-resistance is approximately inversely proportional to the width of the transistor, this ratio should be applicable for different widths.

#### **4.2.3.** OPTIMIZATIONS FOR LOW POWER DESIGN

The drain efficiency of an SCPA is degraded by several additional losses, including the switching loss caused by on-resistance, parasitic loss due to the switches' parasitics and the passive elements used in the matching network. Besides, the dynamic logic depicted in Fig 4.4 also consumes dynamic power when the PA is working, which affects the overall system efficiency substantially. The maximum PA output power is tightly correlated to the size of the PA switch. Additionally, lower switching loss is obtained by increasing the width, while on the other hand, larger gate capacitance is generated, which causes higher dynamic power loss. Our primary goal is to achieve the targeted RF peak power with the lowest power consumption in this project.

Fig 4.6 plots the simulation results of the PA's DC power consumption and the efficiency versus the transistor width, when the PA is operating in the full power mode and connected to an ideal matching network followed by a 50-ohm load resistance.



Figure 4.5: *R*<sub>on</sub> vs. transistor width.



Figure 4.6: DC power consumption and PA efficiency vs. transistor width.

As demonstrated, the DC power consumption keeps increasing with the transistor size because of the proportionally growing gate capacitance of the switches. However, the PA output power reaches a plateau once the transistor width exceeds 400 nm, because the switching loss that degrades the system efficiency also levels off.

Therefore, to balance the switching loss (caused by the on-resistance) and the dynamic loss (caused by charging the gate capacitance), the widths of $M_n$ and $M_p$ are chosen to be 640 nm and 840 nm, respectively, to maintain high output efficiency at an acceptable power consumption.

Another critical parameter is the value of  $C_u$ . A smaller size of  $C_u$  reduces the charging dissipation (equation 4.4) and improves the linearity of PAM modulation.

Fig 4.7 plots the simulated normalised PA output power as the input PAM code index changes from 1 to 4, with the PA RF output connected directly to a 50-ohm load without any matching. As an RF digital-to-analog converter (DAC), the PA can be characterized in terms of integral non-linearity (INL). Theoretically, the output power should be increased by 25% with each increment of the PAM code index. If $C_u$ is too large (48 fF in this case), noticeable INL errors are observed when the input code is small (<2). This phenomenon is similar to the aforementioned AM-AM distortion.



Figure 4.7: Normalized output power vs. input PAM code. $C_u$ = 11 fF.

To study this phenomenon further, Fig 4.8 shows the voltage waveform at the bottom plate of $C_u$ in a selected unary cell when the PAM code index is 2. In this case, only half of the PA units are enabled. As discussed in the previous subsection, the output inverter of the PA unit is kept relatively small to avoid high power dissipation, so its capability to drive a load is also limited. As shown in Fig 4.8, when $C_u$ is too large, the waveform at the bottom plate of $C_u$ does not reach full swing. As a result, the output voltage of the equivalent voltage source in Fig 4.3 is reduced, which corresponds to the INL errors in Fig 4.7. To mitigate this non-linearity, a smaller $C_u$ is usually preferred, while the matching network design constrains its lower bound, so trade-offs and design iterations are required. If the capacitance is too small, a large inductor is needed to resonate it out, which occupies a larger area and is undesirable for the on-chip matching network because of the limited quality factor. In this design, a small value of $C_u$ (11 fF) is chosen.



Figure 4.8: Comparison of the bottom plate waveform of  $C_u$  when PAM input code=2. $C_u$ =11fF.

### **4.3.** The on-chip matching network

As stated earlier, the targeted transmitted RF power for the 8 GHz signal is 4.85 dBm to overcome the high tissue loss in the human body. Therefore, a matching network that transforms the 50-ohm antenna load into the optimum PA load is employed.

From another perspective, the on-chip matching network absorbs the RF pad parasitic capacitance and accounts for the subsequent output RF bond wire, which affects the output power significantly. The differential output bond wire is illustrated in Fig 4.9. The combination of $C_{PCB}$ and $L_{BW}$ creates an unwanted L-section matching network that transforms the antenna impedance to another value.

In this model, $C_{IC}$ models the coupling capacitance between ground and the routing on the IC side, while $C_{PCB}$ models its PCB counterpart. $R_{BW}$ and $L_{BW}$ represent the resistance and inductance of the wire, and k is the magnetic coupling factor of the bond wires. All these parameters are empirical, so variations can be expected. Thus, the matching network should be tunable to cover the variation and restore the matching condition.

#### **4.3.1.** MATCHING STRATEGY WITH IDEAL COMPONENTS

Two identical on-chip $\pi$ matching networks, each formed by connecting two back-to-back L networks, are a reasonable matching strategy for this differential SCPA, because the values of the two capacitors can be controlled by connecting their bottom plates to switches, which is a step toward a tunable matching network. As discussed in the previous section, the output impedance of an SCPA is independent of the input code. In principle, a load-pull simulation is a reasonable approach to characterize the PA output impedance $Z_{in}$. The impedance seen from the SCPA with the wire-bond model



Figure 4.9: The bonding-wire model from PCB.

 $Z_{out}$ can also be obtained by the load-pull simulation. The final step is to build a matching network based on the difference between $Z_{in}$ and $Z_{out}^*$, the complex conjugate of $Z_{out}$, on the Smith chart.

However, because the bond wire model includes the ground, which is coupled to the signal paths, a load-pull simulation based on 2-terminal ports in Cadence cannot accurately determine $Z_{out}$. In this project, the values of L, $C_{in}$ and $C_{out}$ are therefore swept directly to find the maximum output power. Based on the simulations, the values of L, $C_{in}$ and $C_{out}$ in the ideal matching network are set to 1.39 nH, 100 fF and 110 fF, respectively.

As its name suggests, the UWB signal is wideband (>500 MHz by definition), so the matching network must not be the limiting block when radiating a signal with large bandwidth. The 10 dB load reflection bandwidth is a widely accepted indicator of a feasible match; it corresponds to approximately 10% of the incident power being reflected. Fig 4.10 shows the simulated bandwidth of the load reflection coefficient seen from the 50-ohm antenna side when the matching networks are applied. Note that this small-signal simulation result typically deviates from reality, since the PA output is a large signal. The 10 dB load reflection coefficient bandwidth is greater than 500 MHz, which meets the requirement for wideband matching.
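The link between a 10 dB reflection level and the reflected power fraction follows directly from the definition of the decibel. A minimal sketch, with hypothetical helper names:

```python
def reflected_power_fraction(return_loss_db):
    """Fraction of incident power reflected for a given return loss in dB."""
    return 10.0 ** (-return_loss_db / 10.0)


def reflection_magnitude(return_loss_db):
    """Magnitude of the voltage reflection coefficient for the same return loss."""
    return 10.0 ** (-return_loss_db / 20.0)


print(f"reflected power at 10 dB: {100.0 * reflected_power_fraction(10.0):.1f} %")
print(f"|Gamma| at 10 dB:         {reflection_magnitude(10.0):.3f}")
```

At the 10 dB threshold the reflection coefficient magnitude is about 0.32, so roughly 90% of the incident power is still delivered to the load across the matched band.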

### **4.3.2.** THE TUNABLE MATCHING NETWORK DESIGN

In the first step, the values of the ideal passive components are determined. Then $C_{in}$ and $C_{out}$ are each replaced by 4 smaller switched-capacitor units. Each unit includes an NMOS switch, so that the total capacitance is tuned by the 4-bit control codes connected to the gates of the NMOS switches, namely IN<3:0> and OUT<3:0>. To further extend the tuning range, the capacitor units are binary-weighted. More specifically, the size of the capacitance $C_1$ and of the switch $W_1$ in the capacitor unit connected to the LSB



Figure 4.10: Simulated bandwidth of load reflection coefficient.

control code is the smallest, while the sizes in each subsequent unit are twice those of the previous unit. Consequently, the maximum values of $C_{in}$ and $C_{out}$ are 750 fF and 850 fF, realized by LSB unit capacitances of 50 fF and 56.7 fF, respectively. When all switches are turned off, the parasitics provide the minimum (fixed) capacitance of the tunable matching network. Fig 4.11 shows the schematic of this tunable matching network.
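The binary weighting can be made concrete with a small model of the capacitor bank. The sketch below is an idealized illustration with a hypothetical function name: it ignores switch parasitics unless a fixed parasitic term is passed in, and that term is an assumption rather than an extracted value.

```python
def bank_capacitance_ff(code, c_lsb_ff, bits=4, c_par_ff=0.0):
    """Capacitance (fF) of a binary-weighted switched-capacitor bank.

    code      -- control word, 0 .. 2**bits - 1
    c_lsb_ff  -- unit capacitance switched by the LSB
    c_par_ff  -- fixed parasitic capacitance (assumed, default 0)
    """
    if not 0 <= code < 2**bits:
        raise ValueError("control code out of range")
    total = c_par_ff
    for bit in range(bits):
        if (code >> bit) & 1:
            total += (1 << bit) * c_lsb_ff  # each unit is twice the previous one
    return total


print(f"C_in  max: {bank_capacitance_ff(0b1111, 50.0):.1f} fF")
print(f"C_out max: {bank_capacitance_ff(0b1111, 56.7):.1f} fF")
```

With all four bits set, the bank provides 15 times the LSB capacitance, which reproduces the quoted maxima of about 750 fF and 850 fF.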

The MOSFET does not act as an ideal switch because of its on-resistance and parasitic capacitance. The quality factor of the switching capacitor unit can be defined by:

$$Q = \frac{1}{\omega_0 \cdot R_{on} \cdot C_1} \tag{4.10}$$

where $\omega_0$ is the angular operating frequency and $R_{on}$ is the switch on-resistance. The quality factor of each switch should be high enough (>15) to limit the power loss caused by its resistance. As discussed in the previous subsection, the on-resistance is inversely proportional to the width of the transistor. Based on the simulation, the width of the NMOS switch is set to 1.2 µm to raise Q to 15.
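Equation 4.10 can be inverted to estimate the on-resistance budget for a target quality factor. The sketch below uses hypothetical function names and assumes an 8 GHz operating frequency and a 50 fF LSB unit capacitor, both taken from the surrounding text; it is an estimate, not a substitute for the transistor-level simulation.

```python
import math


def switched_cap_q(f_hz, r_on_ohm, c_farad):
    """Quality factor of a switched-capacitor unit per Equation 4.10."""
    return 1.0 / (2.0 * math.pi * f_hz * r_on_ohm * c_farad)


def max_on_resistance(f_hz, c_farad, q_min):
    """Largest switch on-resistance that still meets the target Q."""
    return 1.0 / (2.0 * math.pi * f_hz * c_farad * q_min)


r_max = max_on_resistance(f_hz=8e9, c_farad=50e-15, q_min=15.0)
print(f"R_on must stay below about {r_max:.1f} ohm for Q >= 15")
```

Because the larger units in the bank switch proportionally larger capacitors, their switches need proportionally lower on-resistance to hold the same Q, which is consistent with scaling the switch width together with the capacitance.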

#### **4.3.3.** NON-LINEARITY ISSUE OF THE MATCHING NETWORK

In the previous section, the non-linearity issues of the SCPA have been addressed. Nevertheless, the non-linearity caused by the matching network is also worth discussing.

A relatively large switch size is chosen to improve the quality factor, leading to a larger gate-to-drain capacitance. Consequently, even when the switch is off, the RF signal couples to the gate of the transistor through $C_{dg}$ and can turn the switch on. The magnitude of the coupled signal at the gate depends on the RF signal. Therefore, the total $C_{in}$ and $C_{out}$ are modulated by the RF signal amplitude, which is problematic for the PAM modulation. To tackle this issue, high threshold voltage (HVT) transistors are used as switches instead of normal threshold voltage (NVT) transistors. Because of the increased threshold voltage, the transistor can be turned off completely, so the AM-AM distortion is mitigated. Fig 4.12 compares the linearity of the PAM when NVT and HVT transistors are used in the matching network. INL2 and INL3 are improved by 5 and 5.1 times, respectively, by using HVT transistors.

Figure 4.11: The tunable matching network.

Figure 4.12: INL caused by the matching network.



Figure 4.13: The layout of the SCPA.

## **4.4.** LAYOUTS DESIGN AND THE POST-LAYOUT SIMULATION RESULTS

Fig 4.13 shows the layout of the dual SCPA, which consists of 64 PA units. Fig 4.14 shows the layout of the matching network. The 1.39 nH inductor dominates the area of the transmitter layout. Fig 4.15 shows the post-layout simulation results at 8 GHz in CW



Figure 4.14: The layout of matching network.

mode when the matching network is applied. As can be seen, the peak system efficiency is 29.6% and is achieved in full power mode. The DC power consumption is 9.6 mW in this mode. The pre-drive stages discussed in Section 4.2 also operate at this high frequency, which degrades the system efficiency drastically. Based on the simulation result, this logic consumes 51% of the total DC power in full power mode. The peak RF output power at 8 GHz is 2.8 mW (4.47 dBm), which is 0.4 dB lower than our targeted specification, i.e., 4.85 dBm. In the schematic-level simulation, by contrast, the peak power reaches 6 dBm. The difference is caused by the large parasitics from the layout routing.
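The quoted power figures can be cross-checked with a short unit conversion. The variable and function names below are hypothetical; the input values are the rounded numbers reported in this section.

```python
import math


def mw_to_dbm(p_mw):
    """Convert a power in mW to dBm."""
    return 10.0 * math.log10(p_mw)


p_out_mw = 2.8       # simulated peak RF output power
p_dc_mw = 9.6        # simulated DC power in full power mode
target_dbm = 4.85    # targeted output power

p_out_dbm = mw_to_dbm(p_out_mw)
shortfall_db = target_dbm - p_out_dbm
efficiency_pct = 100.0 * p_out_mw / p_dc_mw

print(f"output power: {p_out_dbm:.2f} dBm")
print(f"shortfall:    {shortfall_db:.2f} dB")
print(f"P_out / P_DC: {efficiency_pct:.1f} %")
```

The conversion confirms 4.47 dBm and a shortfall of roughly 0.4 dB. The ratio of the rounded powers comes out slightly below the reported 29.6%, which is presumably a rounding effect in the quoted figures.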



Figure 4.15: Post-layout simulation results at 8 GHz.

### REFERENCES

- [1] S.-M. Yoo, J. S. Walling, E. C. Woo, B. Jann, and D. J. Allstot, "A switched-capacitor rf power amplifier," *IEEE Journal of Solid-State Circuits*, vol. 46, no. 12, pp. 2977–2987, 2011.
- [2] S.-M. Yoo, J. S. Walling, E. C. Woo, and D. J. Allstot, "A switched-capacitor power amplifier for eer/polar transmitters," in 2011 IEEE International Solid-State Circuits Conference, 2011, pp. 428–430.
- [3] L. Yang and G. Giannakis, "Ultra-wideband communications: an idea whose time has come," *IEEE Signal Processing Magazine*, vol. 21, no. 6, pp. 26–54, 2004.
- [4] P. P. Mercier, D. C. Daly, and A. P. Chandrakasan, "An energy-efficient all-digital uwb transmitter employing dual capacitively-coupled pulse-shaping drivers," *IEEE Journal of Solid-State Circuits*, vol. 44, no. 6, pp. 1679–1688, 2009.
- [5] S.-M. Yoo, J. S. Walling, O. Degani, B. Jann, R. Sadhwani, J. C. Rudell, and D. J. Allstot, "A class-g switched-capacitor rf power amplifier," *IEEE Journal of Solid-State Circuits*, vol. 48, no. 5, pp. 1212–1224, 2013.
- [6] J. M. Rabaey, A. P. Chandrakasan, and B. Nikolić, *Digital integrated circuits: a design perspective*. Pearson education Upper Saddle River, NJ, 2003, vol. 7.

# 5

### **MEASUREMENT RESULTS**

The UWB transmitter is fabricated in TSMC 28 nm technology. The micrograph of the chip's analog front-end is shown in Fig 5.1. The chip occupies an active core area of $367 \times 463\ \mu\text{m}^2$.



Figure 5.1: The micrograph of the transmitter.

In this chapter, stand-alone tests of the subblocks are presented in Section 5.1 and Section 5.2. Section 5.3 presents the measurement results for the UWB transmitter and a comparison with the state of the art.

Two kinds of PCB daughter boards are implemented, and all the chips are directly bonded to them. On the SMA board, the RF output ports are connected to two SMA connectors. On the wireless board, an off-chip balun followed by a printed circular monopole antenna is connected to the chip's RF output for the wireless test. The subblock measurements are all conducted on the SMA board.

The PCB mother board comprises a Teensy microcontroller for SPI control and low drop-out regulators (LDOs) that provide stable supplies for the DCO, PA and other subblocks (Fig 5.2). The external system clock provides the reference for injection locking and the baseband data. A configurable pseudo-random binary sequence (PRBS) generator is integrated into the digital part of the chip for data generation.



Figure 5.2: Simplified block diagrams for the PCB board.

### **5.1.** TRANSMITTER FRONT-END MEASUREMENT

Fig 5.3 shows the measured PA output power versus the input code at different frequencies. The cable losses at the various frequencies have been de-embedded.



(a) Measured differential output power vs. PAM code index.

(b) Measured differential output power vs. Frequency.

Figure 5.3: The Measurement result of the PAM stand-alone modulation.

It can be observed that the linearity requirement for 4-PAM modulation is met, since the output power decreases by approximately 2.5 dB, 6 dB and 12 dB relative to the maximum as the PAM code index is stepped down from 4 to 1. The BER measurement presented in Section 5.3 evaluates the PAM modulation quality with the decoded transmitted RF signal.
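For reference, the ideal power steps of a 4-level amplitude code follow from the output amplitude being proportional to the code. A minimal sketch with a hypothetical function name:

```python
import math


def ideal_pam_steps_db(levels=4):
    """Ideal output power of each PAM code relative to full scale, in dB.

    Assumes the output amplitude is proportional to the code, so the power
    ratio in dB is 20*log10(code / levels). Returned from the highest code
    down to code 1.
    """
    return [20.0 * math.log10(code / levels) for code in range(levels, 0, -1)]


for code, step in zip(range(4, 0, -1), ideal_pam_steps_db()):
    print(f"PAM code {code}: {step:6.2f} dB relative to full scale")
```

The ideal values of about -2.5 dB, -6.0 dB and -12.0 dB for codes 3, 2 and 1 match the measured steps quoted above.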

As shown in Fig 5.3b, although the output matching network is designed and optimized for 8 GHz, the maximum output power is achieved at 6.5 GHz. To further evaluate the matching network's characteristics, the single-ended RF output load reflection coefficient is measured and plotted in Fig 5.4. The best matching is obtained at 6.37 GHz instead of 8 GHz. This deviation is caused by inaccurate modelling of the bond wires and on-chip parasitics, whose effect is magnified at such high operating frequencies. The tunable matching network cannot bring the operating frequency back to 8 GHz because its tuning range is not wide enough.



Figure 5.4: The measured single-ended load reflection coefficient from the 50  $\Omega$  load side.

### **5.2.** THE DCO MEASUREMENT RESULTS

The stand-alone DCO measurements, performed at room temperature (25 °C), are presented in this section. The SMA board is connected to an FSV spectrum analyzer.

### **5.2.1.** DCO TUNING RANGE

The PAM code and AM code in Fig 4.4 are all set to 1, so that the PA acts as an output buffer of the oscillator. Fig 5.5a plots the DCO coarse tuning range obtained by changing the PVT and ACQ bank input code indices. Overlaps are observed even when the PVT step is set to 5, indicating that frequency-tuning dead zones are avoided. The fine-tuning frequency gain at the desired output frequency (6.7 GHz) is 1.6 kHz/bit for the LSB and 9 kHz/bit for the MSB, as displayed in Fig 5.5b.

As shown in Fig 5.5a, the output frequency characteristic becomes more non-linear as the frequency increases. The main reason is the saturation of the current bank caused by the finite output resistance of the current sources. When the total output current injected into the node VDD\_DELAY shown in Fig 3.2 increases with the tuning code, the node voltage also increases, assuming the input impedance of the DCO is fixed. As a result, the effective drain-to-source voltage of the current sources is reduced, and the output current of the current bank is no longer linear in the tuning code.

In this design, to further increase the operational frequency, some fixed current sources in the PVT bank are always turned on, which aggravates this effect since saturation is reached at a lower coarse tuning code.

Figure 5.5: The tuning range of the DCO.

Note that the maximum output frequency (8.8 GHz) is 15% lower than the post-layout simulation result, potentially because of the extra capacitance from the metal filling in the layout.



Figure 5.6: Phase noise performance.

Fig 5.6 depicts the phase noise improvement achieved by applying the injection locking at 6.7 GHz and 8 GHz DCO operating frequencies, compared with the free-running DCO.

Table 5.1 summarises the phase noise performance, together with the RMS jitter and RMS phase error obtained by integrating over the frequency offset range from 1 MHz to 1 GHz. The measured phase noise at 10 MHz offset is -108.79 dBc/Hz which, together with the 1.8 mW DCO power consumption (output buffer included) at 6.7 GHz, translates to a FoM of -165 dBc/Hz at 10 MHz offset. The RMS phase error is reduced significantly from 24° to 2.6°, thus enabling solid 8PSK modulation.

|                               | Free running | Injection locked |
|-------------------------------|--------------|------------------|
| Phase noise (dBc/Hz) @1 MHz   | -74.91       | -104.39          |
| Phase noise (dBc/Hz) @10 MHz  | -96.54       | -108.79          |
| Phase noise (dBc/Hz) @100 MHz | -119.41      | -118.92          |
| RMS jitter (ps)               | 10.11        | 1.09             |
| RMS phase error (°)           | 24.0         | 2.6              |

Table 5.1: Summary of the DCO performance.
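The relation between the RMS jitter and the RMS phase error in Table 5.1 can be sketched as follows. This is an illustration under two assumptions that are not stated explicitly in the table: the values refer to the 6.7 GHz carrier, and the phase error follows the standard conversion of 360° times carrier frequency times jitter. The 22.5° figure is the half-width of one 8PSK decision sector (180°/8).

```python
def jitter_to_phase_error_deg(jitter_s, carrier_hz):
    """Convert RMS timing jitter (s) to RMS phase error (degrees) at a carrier."""
    return 360.0 * carrier_hz * jitter_s

CARRIER_HZ = 6.7e9  # assumed carrier for the Table 5.1 entries

free_running_deg = jitter_to_phase_error_deg(10.11e-12, CARRIER_HZ)
locked_deg = jitter_to_phase_error_deg(1.09e-12, CARRIER_HZ)

# Half-width of one 8PSK decision sector, and how many RMS phase errors fit in it.
half_sector_deg = 180.0 / 8
margin_sigma = half_sector_deg / locked_deg

print(f"free running: {free_running_deg:.1f} deg, locked: {locked_deg:.2f} deg")
print(f"8PSK sector margin when locked: {margin_sigma:.1f} sigma")
```

With these assumptions the conversion gives about 24.4° and 2.6°, close to the tabulated 24.0° and 2.6°; the small difference in the free-running case may come from rounding or from how the integration was performed.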

### **5.3.** WIRELESS MEASUREMENT FOR THE UWB TRANSMITTER

In this section, the measurement setups and results for the transmitter's system-level performance, including the modulation quality, transmission range and DC power consumption, are presented.

To achieve a longer transmission range, the UWB centre frequency for the wireless test is set to 6.7 GHz, which gives lower tissue loss and power consumption together with a higher PA output power. Accordingly, the digital baseband frequency (system clock) is 238 MHz, corresponding to a symbol period of 4.2 ns.

### **5.3.1.** Measurement for the UWB transmitter



Figure 5.7: The measurement setup for the wireless test.

To determine the transmission range of this implantable UWB transmitter, the wireless *in vitro* setup illustrated in Fig 5.7 is employed. The wireless board is inserted in multi-layer porcine skin tissue with a thickness of 15 mm. The radiated signal is received by a 15 cm<sup>2</sup> antenna with 5 dBi gain, followed by an LNA with a 3 dB noise figure and 22 dB gain. A LeCroy high-speed oscilloscope is connected to the output of the LNA to capture the data.

The data collected at each distance are post-processed in Matlab to calculate the BER.



Figure 5.8: Transmission range and data rate for the wireless test.

Fig 5.8 plots the BER versus the transmission range. For a BER lower than $10^{-4}$, the maximum transmission range is 2 cm at the 1.66 Gbps data rate (4PAM, 8PSK, 4PPM) and 15 cm at the 1.43 Gbps data rate (2PAM, 8PSK, 4PPM). Because of the high tissue loss and free-space path loss, the radiated amplitude is strongly attenuated, so the amplitude levels of the 4-PAM modulation can no longer be separated from the noise and decoded. This is the main reason why the maximum data rate drops to 1.43 Gbps at a distance of 15 cm.
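The two data rates follow from the number of bits carried per symbol by each modulation dimension. The sketch below assumes one symbol per 238 MHz system clock cycle and that the PAM, PSK and PPM dimensions carry independent bits; the function and variable names are illustrative.

```python
import math

def hybrid_data_rate(symbol_rate_hz, pam_levels, psk_phases, ppm_positions):
    """Return (bits per symbol, bit rate in bit/s) for a hybrid PAM/PSK/PPM symbol."""
    bits = sum(math.log2(m) for m in (pam_levels, psk_phases, ppm_positions))
    return bits, bits * symbol_rate_hz

SYMBOL_RATE_HZ = 238e6  # system clock used for the wireless test

bits_full, rate_full = hybrid_data_rate(SYMBOL_RATE_HZ, 4, 8, 4)        # 4PAM + 8PSK + 4PPM
bits_reduced, rate_reduced = hybrid_data_rate(SYMBOL_RATE_HZ, 2, 8, 4)  # 2PAM + 8PSK + 4PPM
symbol_period_ns = 1e9 / SYMBOL_RATE_HZ

print(f"{bits_full:.0f} bit/symbol -> {rate_full / 1e9:.3f} Gbps")
print(f"{bits_reduced:.0f} bit/symbol -> {rate_reduced / 1e9:.3f} Gbps")
print(f"symbol period: {symbol_period_ns:.2f} ns")
```

Dropping from 4-PAM to 2-PAM removes one of the seven bits per symbol, which accounts for the step from about 1.66 Gbps to about 1.43 Gbps.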

A path loss measurement with the same setup has been performed to verify the proposed path loss estimation. The measured *in vitro* path loss at 8 GHz and 6.7 GHz versus the transmission range is shown in Fig 5.9. The LNA gain and cable loss have been de-embedded. At 0 mm distance, the path loss at 8 GHz is approximately 47 dB, consistent with the result (50 dB) measured with the liquid phantom. Factors such as antenna misalignment, differences in antenna gain and a different lab environment can explain this acceptable 3 dB difference.



Figure 5.9: In vitro measurement of path loss at 6.7 GHz and 8 GHz.

The time-domain waveform of the transmitted impulse with hybrid modulation is shown in Fig 5.10a. The 4PPM and 4PAM can also be verified from the eye diagram of the impulse envelope, which is shown in Fig 5.10b. The four peak amplitude levels and the four pulse positions with a step of 600 ps show that the PPM and PAM modulations work well. Since the 8PSK modulation is hard to observe in the time domain, Fig 5.10c plots the constellation diagram in the complex plane, which represents the 8PSK and 4PAM modulation. The EVM is defined by [1]:

$$EVM = \sqrt{\frac{1}{N} \sum_{n=0}^{N-1} (Meas[n] - Qref[n])^2}$$
(5.1)

where *N* is the total number of symbols, *n* is the symbol index, and Meas[n] and Qref[n] are the measured signal vector and the reference signal vector, respectively. The EVM is -21.8 dB for the PAM and PPM modulation in the wireless test, decoded by a Matlab script. The three-dimensional version of the constellation diagram with PPM modulation is shown in Fig 5.11.

(a) The time waveform of the modulated UWB impulse.

(b) The eye diagram of the hybrid impulse modulation.

(c) Constellation diagram in the complex plane.

Figure 5.10: Demodulated signal.

Fig 5.12 plots the output spectrum of the DPA measured with the SMA board. The output spectrum complies with the FCC regulation, with a -10 dB bandwidth of 1 GHz and a peak power below -42.0 dBm/MHz, even without the tissue loss.
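The EVM definition in Eq. (5.1) can be sketched in a few lines. This is not the Matlab script used for the measurement; it is a minimal illustration that assumes complex-valued symbol vectors and, so that the result can be quoted in dB, normalizes the RMS error to the RMS magnitude of the reference constellation. The "measured" symbols are hypothetical: ideal 8PSK points rotated by a constant 2.6° phase error.

```python
import cmath
import math

def evm_db(measured, reference):
    """EVM in dB: RMS error vector magnitude normalized to the RMS reference magnitude."""
    if len(measured) != len(reference) or not reference:
        raise ValueError("measured and reference must be non-empty and equal in length")
    n = len(reference)
    error_power = sum(abs(m - r) ** 2 for m, r in zip(measured, reference)) / n
    reference_power = sum(abs(r) ** 2 for r in reference) / n
    return 10 * math.log10(error_power / reference_power)

# Ideal 8PSK reference points on the unit circle.
reference = [cmath.exp(2j * math.pi * k / 8) for k in range(8)]

# Hypothetical measurement: every point rotated by a constant 2.6 degree phase error.
phase_error_rad = math.radians(2.6)
measured = [r * cmath.exp(1j * phase_error_rad) for r in reference]

evm = evm_db(measured, reference)
print(f"EVM = {evm:.1f} dB")
```

A pure phase rotation like this yields a better EVM than the measured -21.8 dB, which is expected because the wireless measurement also includes amplitude errors and noise.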

Figure 5.11: 3D constellation diagram.

Figure 5.12: Measured output spectrum of the SMA board with 6.7 GHz carrier frequency in the 1.66 Gbps data-rate mode.

Figure 5.13: Power breakdown for the UWB TX.

The power breakdown for the UWB transmitter operating at 6.7 GHz and 8 GHz with 1.66 Gbps data rate is shown in Fig 5.13. The PA and phase mux are connected to the same supply VDD\_PA and consume around 52% of the total power. Since the power contributions of these two blocks cannot be separated, the efficiency of the SCPA is not presented here. Compared with the 8 GHz mode, a power reduction of around 19% is achieved in the 6.7 GHz mode. The saving comes mainly from the DCO and PA, whose power consumption is sensitive to the operating frequency, and it brings the overall power consumption of the TX below 10 mW. The energy per bit at 6.7 GHz is 5.8 pJ/bit and 6.77 pJ/bit at data rates of 1.66 Gbps and 1.463 Gbps, respectively.
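The energy-per-bit figure is the DC power divided by the bit rate. The sketch below reproduces the 1.66 Gbps value using the 9.69 mW TX power listed in Fig 5.14, and estimates the power drawn from VDD\_PA under the assumption that the 52% share applies to that same total; the constant names are illustrative.

```python
def energy_per_bit_pj(dc_power_w, data_rate_bps):
    """Energy per transmitted bit in pJ: DC power divided by bit rate."""
    return dc_power_w / data_rate_bps * 1e12

TX_POWER_W = 9.69e-3    # TX power in the 1.66 Gbps mode (Fig 5.14)
DATA_RATE_BPS = 1.66e9  # 4PAM + 8PSK + 4PPM mode
PA_MUX_SHARE = 0.52     # approximate share of PA and phase mux (Fig 5.13)

energy_pj = energy_per_bit_pj(TX_POWER_W, DATA_RATE_BPS)
pa_mux_power_mw = PA_MUX_SHARE * TX_POWER_W * 1e3  # assumes the share refers to 9.69 mW

print(f"energy per bit: {energy_pj:.2f} pJ/bit")
print(f"PA + phase mux: about {pa_mux_power_mw:.2f} mW")
```

The same function can be applied to any other power and data-rate pair from the measurements.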

### **5.3.2.** Performance Summary and Benchmarks of the UWB TX

The performance of the IR-UWB transmitter and a comparison with the state of the art are shown in Fig 5.14.

|                                  | This work                     | TBioCAS 2016 [2]           | TBioCAS 2016 [3]   | TBioCAS 2020 [4] | ISSCC 2021 [5]    | ISSCC 2021 [6]               |
|----------------------------------|-------------------------------|----------------------------|--------------------|------------------|-------------------|------------------------------|
| Device<br>technology             | 28nm CMOS                     | 180nm<br>CMOS              | GaAs HBT           | VCSEL            | 65nm<br>CMOS      | 28nm<br>CMOS                 |
| Wireless method                  | IR-UWB                        | IR-UWB                     | IR-UWB             | Optical          | IR-UWB            | IR-UWB                       |
| Frequency                        | 6-9GHz                        | 3-7GHz                     | 8GHz               | NIR              | 4GHz              | 3-10GHz                      |
| Modulation                       | 4PPM+8PSK+<br>4PAM impulse    | BPSK<br>impulse            | OOK<br>impulse     | OOK<br>impulse   | D-MPPM<br>impulse | BPSK<br>impulse              |
| TX architecture                  | Digital polar<br>(DPA+ PHMUX) | Edge<br>combine            | Edge<br>combine    | -                | Edge<br>combine   | Digital polar<br>(DPA+ ILRO) |
| Max. data rate                   | 1.66Gbps                      | 500Mbps                    | 128Mbps            | 300Mbps          | 1.125Gbps         | 27Mbps                       |
| TX power cons.                   | 9.69mW                        | 5.4mW                      | 561mW              | 11mW             | 28mW              | 4.9mW                        |
| TX energy<br>efficiency          | 5.8pJ/b                       | 10.8pJ/b                   | 438pJ/b            | 37pJ/b           | 25pJ/b            | 180pJ/b                      |
| TX peak P <sub>OUT</sub>         | 1.5dBm                        | N.A.                       | -12.3dBm           | -                | -10dBm            | -0.7dBm                      |
| TX antenna area                  | 66mm <sup>2</sup>             | 100mm <sup>2</sup>         | N.A.               | N.A.             | N.A.              | -                            |
| TX antenna gain                  | -8.5dBi                       | N.A.                       | N.A.               | -                | 3dBi              | -                            |
| Tissue thickness                 | 15mm skin/fat                 | 2mm skin/fat<br>& 4mm bone | 15-20mm<br>phantom | 3.5mm<br>skin    | No tissue         | No tissue                    |
| Transmission<br>range (in-vitro) | 2cm@1.66Gbps<br>15cm@1.43Gbps | 1.5cm                      | 1cm                | 0.4cm            | N.A.**            | -**                          |
| Normalized<br>energy effi.*      | 45pJ/b/m<br>(@1.43Gbps)       | 720pJ/b/m                  | 43.8nJ/b/m         | 9.25nJ/b/m       | -                 | -                            |

\*TX energy efficiency normalized to 1 m transmission range. \*\*No transmission range reported with tissue or phantom.

Figure 5.14: Performance summary of the IR-UWB transmitter and comparison table.

The normalized energy efficiency per meter $E_{pJ/bit/m}$ is calculated by:

$$E_{pJ/bit/m} = \frac{P_{DC}}{S \cdot D}$$
(5.2)

where *S* is the data rate, $P_{DC}$ is the DC power consumption of the UWB transmitter, and *D* is the transmission range normalized to 1 m.

This work presents the best normalized energy efficiency of 45 pJ/bit/m, which is 16 times better than the best previously reported transcutaneous IR-UWB transmitter [2], [3], [4], [5], [6].
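The normalized values in Fig 5.14 can be reproduced by dividing each transmitter's energy per bit (DC power over data rate) by its transmission range in metres. The sketch below uses only the energy-per-bit and range entries listed in Fig 5.14 and in Section 5.3.1; the dictionary keys and function name are illustrative.

```python
def normalized_energy_pj_per_bit_per_m(energy_pj_per_bit, range_m):
    """Energy per bit divided by the transmission range expressed in metres."""
    return energy_pj_per_bit / range_m

# (energy per bit in pJ/bit, transmission range in m) taken from Fig 5.14.
entries = {
    "This work @1.43 Gbps": (6.77, 0.15),
    "[2]": (10.8, 0.015),
    "[3]": (438.0, 0.01),
    "[4]": (37.0, 0.004),
}

normalized = {
    name: normalized_energy_pj_per_bit_per_m(energy, distance)
    for name, (energy, distance) in entries.items()
}
improvement = normalized["[2]"] / normalized["This work @1.43 Gbps"]

for name, value in normalized.items():
    print(f"{name}: {value:.1f} pJ/bit/m")
print(f"improvement over [2]: about {improvement:.0f}x")
```

The result for this work is about 45 pJ/bit/m, and the ratio to the 720 pJ/bit/m of [2] is about 16, matching the improvement factor quoted above.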



Figure 5.15: Benchmark chart of the energy efficiency versus data rate of state-of-the-art IR-UWB transmitters.

Fig 5.15 shows the benchmark chart of energy efficiency versus data rate. The proposed architecture achieves an energy efficiency of 5.8 pJ/bit, which is the best among state-of-the-art high-data-rate IR-UWB transmitters.

### REFERENCES

- [1] M. Kim, A. Kiyono, K. Ichige, and H. Arai, "Experimental study of jitter effect on digital downconversion receiver with undersampling scheme," *IEICE Transactions*, vol. 88-D, pp. 1430–1436, 07 2005.
- [2] A. De Marcellis, G. D. P. Stanchieri, M. Faccio, E. Palange, and T. G. Constandinou, "A 300 Mbps 37 pJ/bit pulsed optical biotelemetry," *IEEE Transactions on Biomedical Circuits and Systems*, vol. 14, no. 3, pp. 441–451, 2020.
- [3] H. Ando, K. Takizawa, T. Yoshida, K. Matsushita, M. Hirata, and T. Suzuki, "Wireless multichannel neural recording with a 128-Mbps UWB transmitter for an implantable brain-machine interfaces," *IEEE Transactions on Biomedical Circuits and Systems*, vol. 10, no. 6, pp. 1068–1078, 2016.
- [4] A. De Marcellis, G. D. P. Stanchieri, M. Faccio, E. Palange, and T. G. Constandinou, "A 300 Mbps 37 pJ/bit pulsed optical biotelemetry," *IEEE Transactions on Biomedical Circuits and Systems*, vol. 14, no. 3, pp. 441–451, 2020.
- [5] G. Lee, S. Lee, J.-H. Kim, and T. W. Kim, "21.1 A 1.125Gb/s 28mW 2m-radio-range IR-UWB CMOS transceiver," in *2021 IEEE International Solid-State Circuits Conference (ISSCC)*, vol. 64, 2021, pp. 302–304.
- [6] E. Allebes, G. Singh, Y. He, E. Tiurin, P. Mateman, M. Ding, J. Dijkhuis, G.-J. v. Schaik, E. Bechthum, J. v. d. Heuvel, M. E. Soussi, A. Breeschoten, H. Korpela, Y.-H. Liu, and C. Bachmann, "21.2 A 3-to-10GHz 180pJ/b IEEE802.15.4z/4a IR-UWB coherent polar transmitter in 28nm CMOS with asynchronous amplitude pulse-shaping and injection-locked phase modulation," in *2021 IEEE International Solid-State Circuits Conference (ISSCC)*, vol. 64, 2021, pp. 304–306.

# 6

### CONCLUSION

### **6.1.** MASTER THESIS CONCLUSION

Based on the path loss estimation and link budget analysis, an implantable IR-UWB transmitter with a hybrid modulation scheme is designed in this master thesis project to achieve a high data rate and a long transmission range for BCI applications.

By employing the negatively skewed oscillator technique, the output frequency of the 8-phase DCO is increased by 30%, thus enabling operation over a high-frequency UWB channel (6 to 8 GHz). The applied injection locking technique suppresses the RMS jitter significantly to 1.07 ps. Together with careful layout design and load-balancing buffers, an EVM of -21.8 dB is achieved, indicating good 8PSK modulation performance. The overall LO module consumes 2.9 mW of DC power at an output frequency of 6.7 GHz.

The implementation of a linear SCPA successfully enables the 4-PAM modulation. The on-chip matching network boosts the peak output power to 1.5 dBm at 6.7 GHz, so that the transmission covers a range of up to 15 cm despite the high tissue loss.

Fabricated in TSMC 0.9 V 28 nm technology, the chip occupies an area of $0.155 \text{ mm}^2$. In the *in vitro* measurement, the chip is inserted into multi-layer porcine skin tissue with a thickness of 15 mm. With the hybrid modulation combining 4PPM, 4PAM and 8PSK, the measured maximum data rate for a BER below $10^{-4}$ is 1.66 Gbps at a 2 cm transmission range and 1.44 Gbps at 15 cm, with 9.69 mW and 9.95 mW power consumption, respectively. The transmitter achieves an energy efficiency of 5.8 pJ/bit in the 1.66 Gbps mode, the best among high-data-rate IR-UWB transmitters. Moreover, in the 1.44 Gbps mode, the resulting best-in-class energy efficiency of 45 pJ/bit/m, normalized to 1 m, shows that this transmitter is suitable for shallow-implant BCI systems that demand energy-efficient, high-data-rate wireless communication.

### **6.2.** Recommendations for future work

As stated in Section 5.1, the frequency of peak output power deviates from the design target. In the post-layout simulation, the maximum output power is reached at a carrier frequency of 8 GHz. In the measurement, however, this frequency is shifted to 6.37 GHz because of the bond wires and on-chip parasitics, and the tuning range of the tunable matching network is insufficient to cover this variation and re-tune the operating frequency. Therefore, the tunable matching network should be re-designed to cover a wider range, i.e., the VSWR = 2 circle on the Smith chart, to overcome these parasitics in a future implementation.

Chip-to-chip variation is observed during the measurement. One chip performs the 8PSK modulation poorly, with a large phase mismatch. Since the 8 phases are derived directly from the buffered DCO outputs, process variations can unbalance the load at each node, degrading the PSK signal quality. Therefore, a DLL could be integrated into a future design to make the 8PSK modulation robust against PVT variations.

### REFERENCES

- R. Patel, P. Patel, J. Lalwani, M. Sarkar, and S. Nagaraj, "Investigating the feasibility of multiple uwb transmitters in brain computer interface (bci) applications," in 2016 IEEE 13th International Conference on Wearable and Implantable Body Sensor Networks (BSN), 2016, pp. 236–241.
- [2] J. R. Wolpaw, N. Birbaumer, W. J. Heetderks, D. J. McFarland, P. H. Peckham, G. Schalk, E. Donchin, L. A. Quatrano, C. J. Robinson, T. M. Vaughan *et al.*, "Braincomputer interface technology: a review of the first international meeting," *IEEE transactions on rehabilitation engineering*, vol. 8, no. 2, pp. 164–173, 2000.
- [3] J. L. Collinger, B. Wodlinger, J. E. Downey, W. Wang, E. C. Tyler-Kabara, D. J. Weber, A. J. McMorland, M. Velliste, M. L. Boninger, and A. B. Schwartz, "High-performance neuroprosthetic control by an individual with tetraplegia," *The Lancet*, vol. 381, no. 9866, pp. 557–564, 2013.
- [4] L. R. Hochberg, D. Bacher, B. Jarosiewicz, N. Y. Masse, J. D. Simeral, J. Vogel, S. Haddadin, J. Liu, S. S. Cash, P. Van Der Smagt *et al.*, "Reach and grasp by people with tetraplegia using a neurally controlled robotic arm," *Nature*, vol. 485, no. 7398, pp. 372–375, 2012.
- [5] D. J. H. L. Homer ML, Nurmikko AV, "Implants and decoding for intracortical brain computer interfacesg," *Annual review of biomedical engineering*, vol. 15, no. 2, pp. 383–405, 2013.
- [6] H. Bahrami, S. A. Mirbozorgi, L. A. Rusch, and B. Gosselin, "Integrated uwb transmitter and antenna design for interfacing high-density brain microprobes," in 2015 IEEE International Conference on Ubiquitous Wireless Broadband (ICUWB), 2015, pp. 1–5.
- [7] C. M. Lopez, S. Mitra, J. Putzeys, B. Raducanu, M. Ballini, A. Andrei, S. Severi, M. Welkenhuysen, C. Van Hoof, S. Musa, and R. F. Yazicioglu, "22.7 a 966-electrode neural probe with 384 configurable channels in 0.13μm soi cmos," in 2016 IEEE International Solid-State Circuits Conference (ISSCC), 2016, pp. 392–393.
- [8] H. Ali, T. J. Ahmad, and S. A. Khan, "Inductive link design for medical implants," in 2009 IEEE Symposium on Industrial Electronics & Applications, vol. 2. IEEE, 2009, pp. 694–699.
- [9] M. N. Islam and M. R. Yuce, "Review of medical implant communication system (mics) band and network," *Ict Express*, vol. 2, no. 4, pp. 188–194, 2016.
- [10] L. Zhou, X. Chen, Y. Li, and J. Li, "Bluetooth low energy 4.0-based communication method for implants," in 2017 10th International Congress on Image and Signal Processing, BioMedical Engineering and Informatics (CISP-BMEI), 2017, pp. 1–5.
- [11] K. Ture, A. Devos, F. Maloberti, and C. Dehollain, "Area and power efficient ultrawideband transmitter based on active inductor," *IEEE Transactions on Circuits and Systems II: Express Briefs*, vol. 65, no. 10, pp. 1325–1329, 2018.

- [12] N. Soltani, H. Kassiri, H. M. Jafari, K. Abdelhalim, and R. Genov, "0.13m cmos 230mbps 21pj/b uwb-ir transmitter with 21.3% efficiency," in ESSCIRC Conference 2015 - 41st European Solid-State Circuits Conference (ESSCIRC), 2015, pp. 352–355.
- [13] J. Ryckaert, C. Desset, A. Fort, M. Badaroglu, V. De Heyn, P. Wambacq, G. Van der Plas, S. Donnay, B. Van Poucke, and B. Gyselinckx, "Ultra-wide-band transmitter for low-power wireless body area networks: design and evaluation," *IEEE Transactions* on Circuits and Systems I: Regular Papers, vol. 52, no. 12, pp. 2515–2525, 2005.
- [14] K. Siwiak and D. McKeown, Ultra-wideband radio technology. John Wiley & Sons, 2005.
- [15] L. Yang and G. Giannakis, "Ultra-wideband communications: an idea whose time has come," *IEEE Signal Processing Magazine*, vol. 21, no. 6, pp. 26–54, 2004.
- [16] Y. Rahayu, T. A. Rahman, R. Ngah, and P. Hall, "Ultra wideband technology and its applications," in 2008 5th IFIP International Conference on Wireless and Optical Communications Networks (WOCN'08). IEEE, 2008, pp. 1–5.
- [17] H. W. Pflug, J. Romme, K. Philips, and H. de Groot, "Method to estimate impulseradio ultra-wideband peak power," *IEEE Transactions on Microwave Theory and Techniques*, vol. 59, no. 4, pp. 1174–1186, 2011.
- [18] B. Schleicher and H. Schumacher, "Impulse generator targeting the european uwb mask," in *2010 Topical Meeting on Silicon Monolithic Integrated Circuits in RF Systems (SiRF)*, 2010, pp. 21–24.
- [19] H. Yamamoto, J. Zhou, and T. Kobayashi, "Ultra wideband electromagnetic phantoms for antennas and propagation studies," *IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences*, vol. 91, no. 11, pp. 3173– 3182, 2008.
- [20] S. Gabriel, R. Lau, and C. Gabriel, "The dielectric properties of biological tissues: Iii. parametric models for the dielectric spectrum of tissues," *Physics in medicine & biology*, vol. 41, no. 11, p. 2271, 1996.
- [21] T. Van Nunen, E. Huismans, R. Mestrom, M. Bentum, and H. Visser, "Diy electromagnetic phantoms for biomedical wireless power transfer experiments," in 2019 IEEE Wireless Power Transfer Conference (WPTC). IEEE, 2019, pp. 399–404.
- [22] R. Johnson, H. Ecker, and J. Hollis, "Determination of far-field antenna patterns from near-field measurements," *Proceedings of the IEEE*, vol. 61, no. 12, pp. 1668– 1694, 1973.
- [23] M. Ghavami, L. Michael, and R. Kohno, *Ultra wideband signals and systems in communication engineering*. John Wiley & Sons, 2007.
- [24] M. Viswanathan, "Performance comparison of digital modulation techniques," https://www.gaussianwaves.com/2010/04/performance-comparison-of-digitalmodulation-techniques-2/, 2014.

- [25] X. Tong and J. Li, "A sub-ghz uwb data transmitter with enhanced output amplitude for implantable bioelectronics," in *2017 IEEE Biomedical Circuits and Systems Conference (BioCAS)*, 2017, pp. 1–4.
- [26] A. Ebrazeh and P. Mohseni, "30 pj/b, 67 mbps, centimeter-to-meter range data telemetry with an ir-uwb wireless link," *IEEE Transactions on Biomedical Circuits and Systems*, vol. 9, no. 3, pp. 362–369, 2015.
- [27] M. Demirkan and R. R. Spencer, "A pulse-based ultra-wideband transmitter in 90nm cmos for wpans," *IEEE Journal of Solid-State Circuits*, vol. 43, no. 12, pp. 2820– 2828, 2008.
- [28] N.-S. Kim and J. M. Rabaey, "A high data-rate energy-efficient triple-channel uwbbased cognitive radio," *IEEE Journal of Solid-State Circuits*, vol. 51, no. 4, pp. 809– 820, 2016.
- [29] E. Allebes, G. Singh, Y. He, E. Tiurin, P. Mateman, M. Ding, J. Dijkhuis, G.-J. v. Schaik, E. Bechthum, J. v. d. Heuvel, M. E. Soussi, A. Breeschoten, H. Korpela, Y.-H. Liu, and C. Bachmann, "21.2 a 3-to-10ghz 180pj/b ieee802.15.4z/4a ir-uwb coherent polar transmitter in 28nm cmos with asynchronous amplitude pulse-shaping and injection-locked phase modulation," in 2021 IEEE International Solid- State Circuits Conference (ISSCC), vol. 64, 2021, pp. 304–306.
- [30] G. Souliotis, C. Laoudias, F. Plessas, and N. Terzopoulos, "Phase interpolator with improved linearity," *Circuits, Systems, and Signal Processing*, vol. 35, no. 2, pp. 367– 383, 2016.
- [31] B. Razavi, *Design of CMOS Phase-Locked Loops: From Circuit Level to Architecture Level.* Cambridge University Press, 2020.
- [32] M.-T. Hsieh and G. Sobelman, "Comparison of lc and ring vcos for plls in a 90 nm digital cmos process," 01 2006.
- [33] S.-J. Lee, B. Kim, and K. Lee, "A novel high-speed ring oscillator for multiphase clock generation using negative skewed delay scheme," *IEEE Journal of Solid-State Circuits*, vol. 32, no. 2, pp. 289–291, 1997.
- [34] J.-K. Lee, S. Yi, H.-S. Ahn, and H.-G. Jeong, "A large-signal analysis for a ring oscillator with negative skewed delay," 01 2010, pp. 105–108.
- [35] S. Ye, L. Jansson, and I. Galton, "A multiple-crystal interface pll with vco realignment to reduce phase noise," *IEEE Journal of Solid-State Circuits*, vol. 37, no. 12, pp. 1795– 1803, 2002.
- [36] S. L. J. Gierkink, "Low-spur, low-phase-noise clock multiplier based on a combination of pll and recirculating dll with dual-pulse ring oscillator and self-correcting charge pump," *IEEE Journal of Solid-State Circuits*, vol. 43, no. 12, pp. 2967–2976, 2008.

- [37] M. Song, I. Jung, S. Pamarti, and C. Kim, "A 2.4 ghz 0.1-fref-bandwidth all-digital phase-locked loop with delay-cell-less tdc," *IEEE Transactions on Circuits and Systems I: Regular Papers*, vol. 60, no. 12, pp. 3145–3151, 2013.
- [38] J. P. Caram, J. Galloway, and J. S. Kenney, "Voltage-controlled ring oscillator with fom improvement by inductive loading," *IEEE Microwave and Wireless Components Letters*, vol. 29, no. 2, pp. 122–124, 2019.
- [39] S.-M. Yoo, J. S. Walling, E. C. Woo, B. Jann, and D. J. Allstot, "A switched-capacitor rf power amplifier," *IEEE Journal of Solid-State Circuits*, vol. 46, no. 12, pp. 2977–2987, 2011.
- [40] S.-M. Yoo, J. S. Walling, E. C. Woo, and D. J. Allstot, "A switched-capacitor power amplifier for eer/polar transmitters," in 2011 IEEE International Solid-State Circuits Conference, 2011, pp. 428–430.
- [41] P. P. Mercier, D. C. Daly, and A. P. Chandrakasan, "An energy-efficient all-digital uwb transmitter employing dual capacitively-coupled pulse-shaping drivers," *IEEE Journal of Solid-State Circuits*, vol. 44, no. 6, pp. 1679–1688, 2009.
- [42] S.-M. Yoo, J. S. Walling, O. Degani, B. Jann, R. Sadhwani, J. C. Rudell, and D. J. Allstot, "A class-g switched-capacitor rf power amplifier," *IEEE Journal of Solid-State Circuits*, vol. 48, no. 5, pp. 1212–1224, 2013.
- [43] J. M. Rabaey, A. P. Chandrakasan, and B. Nikolić, *Digital integrated circuits: a design perspective*. Pearson education Upper Saddle River, NJ, 2003, vol. 7.
- [44] M. Kim, A. Kiyono, K. Ichige, and H. Arai, "Experimental study of jitter effect on digital downconversion receiver with undersampling scheme," *IEICE Transactions*, vol. 88-D, pp. 1430–1436, 07 2005.
- [45] A. De Marcellis, G. D. P. Stanchieri, M. Faccio, E. Palange, and T. G. Constandinou, "A 300 mbps 37 pj/bit pulsed optical biotelemetry," *IEEE Transactions on Biomedical Circuits and Systems*, vol. 14, no. 3, pp. 441–451, 2020.
- [46] H. Ando, K. Takizawa, T. Yoshida, K. Matsushita, M. Hirata, and T. Suzuki, "Wireless multichannel neural recording with a 128-mbps uwb transmitter for an implantable brain-machine interfaces," *IEEE Transactions on Biomedical Circuits and Systems*, vol. 10, no. 6, pp. 1068–1078, 2016.
- [47] A. De Marcellis, G. D. P. Stanchieri, M. Faccio, E. Palange, and T. G. Constandinou, "A 300 mbps 37 pj/bit pulsed optical biotelemetry," *IEEE Transactions on Biomedical Circuits and Systems*, vol. 14, no. 3, pp. 441–451, 2020.
- [48] G. Lee, S. Lee, J.-H. Kim, and T. W. Kim, "21.1 a 1.125gb/s 28mw 2m-radio-range iruwb cmos transceiver," in *2021 IEEE International Solid- State Circuits Conference* (*ISSCC*), vol. 64, 2021, pp. 302–304.
[49] E. Allebes, G. Singh, Y. He, E. Tiurin, P. Mateman, M. Ding, J. Dijkhuis, G.-J. v. Schaik, E. Bechthum, J. v. d. Heuvel, M. E. Soussi, A. Breeschoten, H. Korpela, Y.-H. Liu, and C. Bachmann, "21.2 a 3-to-10ghz 180pj/b ieee802.15.4z/4a ir-uwb coherent polar transmitter in 28nm cmos with asynchronous amplitude pulse-shaping and injection-locked phase modulation," in 2021 IEEE International Solid- State Circuits Conference (ISSCC), vol. 64, 2021, pp. 304–306.