A 1-Mega Pixels HDR and UV Sensitive Image Sensor With Interleaved 14-bit 64Ms/s SAR ADC


Thesis

submitted in partial fulfillment of the requirements for the degree of

MASTER OF SCIENCE

in

ELECTRICAL ENGINEERING

by

Ruijun Zhang
born in Lanzhou, P.R. China

This work was performed in:

Caeleste CVBA
Hendrik Consciencestraat, 1B
2800 Mechelen
Belgium
The work in this thesis was supported by Caeleste CVBA. Their cooperation is hereby gratefully acknowledged.
The undersigned hereby certify that they have read and recommend to the Faculty of Electrical Engineering, Mathematics and Computer Science for acceptance a thesis entitled “A 1-Mega Pixels HDR and UV Sensitive Image Sensor With Interleaved 14-bit 64Ms/s SAR ADC” by Ruijun Zhang in partial fulfillment of the requirements for the degree of Master of Science.

Dated: September 7, 2015

Supervisor: Prof. dr. ir. Albert J.P. Theuwissen

Industry Supervisor: Dr. ir. Philippe Coppejans

Committee Members:

Prof. dr. ir. Albert J.P. Theuwissen
Dr. ir. Andre Bossche
Dr. ir. Philippe Coppejans
Abstract

This thesis presents a 1-megapixel, high-dynamic-range (HDR), UV-sensitive image sensor in 0.18 µm technology with a 14-bit interleaved 64Ms/s SAR ADC. It achieves 64 fps and a dynamic range of 101 dB.

The pixel array contains three kinds of pixels: UV pixels, visible pixels and low blue pixels. These three kinds of pixels form a special 4×4 kernel that meets the etching requirements of the technology and optimizes the modulation transfer function (MTF). The extended floating diffusion method is introduced to achieve the high dynamic range.

An odd/even time-interleaved analog readout method is used to increase the readout speed. To read out the column outputs, the row driver block, column loads, S/H circuits, readout buffers and digital control block are all implemented in this design.

A 14-bit interleaved 64Ms/s SAR ADC is used to digitize the analog signals. The ADC block contains 16 14-bit 4Ms/s SAR ADCs. To achieve a 4Ms/s 2V peak-to-peak 14-bit ADC, a high speed comparator with offset calibration is implemented. Two 14-bit serializers are used to serialize the ADC output in this design.

Key words: UV sensitive, interleaved SAR ADC, comparator, offset calibration, high dynamic range
# Table of Contents

## Abstract

## Acknowledgements

### 1 Introduction

1.1 Motivation and challenge
1.2 Thesis project overview
1.3 Thesis organization

### 2 Background of the CMOS image sensor and SAR ADC

2.1 Basic concept of CMOS image sensor
   2.1.1 The photodiode and optical absorption
   2.1.2 Pixel structures and common methods of HDR
      2.1.2.1 Photodiode Three Transistor (3T) Pixel
      2.1.2.2 Pinned Photodiode Four Transistor (4T) Pixel
      2.1.2.3 Common methods of HDR
2.2 Electronic shutter mode
2.3 Successive approximation register (SAR) analog-to-digital converter
2.4 Conclusion

### 3 Pixel concept

3.1 Pixel overview
   3.1.1 UV sensitive pixel and visible light pixel
   3.1.2 Low blue pixel
   3.1.3 Pixel kernel
3.2 Pixel operation
3.3 Conclusion

### 4 Analog readout chain

4.1 Overview
4.2 Column load
4.3 Sample and hold circuit and readout buffers
   4.3.1 Sample and hold circuit
   4.3.2 Readout buffer
      4.3.2.1 Readout buffer design
      4.3.2.2 Readout buffer parasitics consideration
4.4 Multiplexer
   4.4.1 Clock divider
      4.4.1.1 Local synchronous start signal regenerator
      4.4.1.2 Clock divider
      4.4.1.3 Functional simulations
   4.4.2 X_scan block
4.5 Row driver
4.6 Test and peripheral circuit for analog readout chain
   4.6.1 MBS and test circuit
   4.6.2 Precharge circuit
4.7 Conclusions

### 5 Interleaved 14-bit 64Ms/s SAR ADC and digital control block

5.1 Comparator design considerations
   5.1.1 Latch
   5.1.2 Pre-amplifier and offset calibration
      5.1.2.1 First stage of the pre-amplifier
      5.1.2.2 Second and third stage of the pre-amplifier
5.2 Single 14-bit 4Ms/s SAR ADC
5.3 Overview of interleaved 14-bit 64Ms/s SAR ADC
5.4 Digital control block
5.5 Conclusion

### 6 Conclusion

6.1 My contributions
6.2 Future work

## Bibliography
# List of Figures

1.1 A CD jewel case imaged by visible light and 365-nm UV lighting
1.2 A white Toyota Prius with a new replaced fender imaged by visible light and UV light
1.3 Overview of sensor architecture
2.1 Pinned photodiode
2.2 P-n junction band diagram and inside movement of e-h pairs
2.3 Net photocurrent contribution of each area of a typical photodiode
2.4 3T APS structure
2.5 3T APS timing diagram
2.6 4T APS structure
2.7 4T APS timing diagram
2.8 Multiple capture method and one example
2.9 Conversion gain method
2.10 The nonlinear response method
2.11 The timing diagram of rolling shutter
2.12 The timing diagram of global shutter
2.13 A simplified 6-bit SAR ADC architecture and an example operation
3.1 The UV sensitive pixel architecture
3.2 Classic front-side illuminated pixel and the front-side thinned pixel
3.3 The low blue pixel architecture
3.4 The low blue pixel architecture
3.5 Layout of the low blue pixel
3.6 The estimated quantum efficiency as a function of the wavelength for the low blue, visible and UV sensitive pixel
3.7 The basic 2×2 pixel structure
3.8 The 4×4 pixel pattern
3.9 The timing diagrams of three different working models
3.10 The pixel standalone simulation
4.1 The overview of readout circuit
4.2 The schematic of the column load
4.3 The two columns’ S/H circuit and readout buffers
4.4 The S/H circuit
4.5 The readout buffer
4.6 The load of a readout buffer
4.7 The simulation circuit for settling
4.8 The bus arrangement
4.9 The synchronous start signal regenerator
4.10 The simulation result of the local synchronous start signal regenerator
4.11 The required eight time-interleaved 8 MHz clocks
4.12 The clock divider
4.13 The eight time-interleaved 8 MHz clocks simulation with 000 ROI setting bits
4.14 The eight time-interleaved 8 MHz clocks simulation with 001 ROI setting bits
4.15 The arrangement of row readout circuits and required timing
4.16 The conventional Dff-based scanner
4.17 The modified scan unit
4.18 The modified Dff-based scanner
4.19 The simulation result of the modified Dff-based scanner
4.20 An overview of the row driver block
4.21 The address code wire routing
4.22 The row driver unit
4.23 The simulation result of TG row driver
4.24 Brief schematic of the standard MBS Write cell
4.25 Brief schematic of the standard MBS Sense cell
4.26 Brief schematic of an MBS cell for writing in and sensing back
4.27 The scanner detector
4.28 The precharge unit
4.29 A simulation result of the row readout
5.1 The ADC category for image sensors
5.2 The ADC architectures, applications, resolution, and sampling rates
5.3 The latch in the comparator
5.4 The offset of the latch
5.5 The latch model
5.6 The architecture of the pre-amplifier
5.7 The first stage pre-amplifier
5.8 The input offset storage (IOS) and output offset storage (OOS)
5.9 The second and third stage pre-amplifier
5.10 The working principle of the offset calibration
5.11 The offset of the comparator
5.12 The residual effect
5.13 The single 4Ms/s SAR ADC
5.14 The overview of the ADC block
5.15 The 14-bit serializer
5.16 The functional simulation result of the serializer
5.17 The schematic of the special I/O cell
5.18 The schematic of the VASPI
5.19 The (a3d9) VASPI timing diagram and bit function for the upload mode
5.20 The functional simulation result of an a3d9 VASPI
# List of Tables

1.1 Sensor specifications

2.1 Absorption depth of photons in silicon

3.1 UV sensitive and visible light pixel specifications

3.2 UV sensitive and visible light pixel specifications

3.3 Summary about the standalone pixel and the column bus

4.1 Settling parameters

4.2 Summary of S/H circuit structures

4.3 The noise contribution of each transistor

4.4 The readout buffer design parameters

4.5 The load of a readout buffer

4.6 The function of the Mux2

4.7 The truth table of the decoder circuit

4.8 The truth table of the decoder circuit

5.1 The SPI address with dedicated LSBs for a particular block
Acknowledgements

Two years ago, I moved to Delft, the Netherlands, and started my life in Europe. I built a new network, joined one of the top research groups in the world and met a lot of interesting people. Here, I would like to give my sincere thanks to all the people who have helped me during my MSc project. Without them, it would have been impossible for me to grow in the image sensor world.

First and foremost, I would like to express my sincere thanks to Prof. dr. Albert Theuwissen, my supervisor at TU Delft. He introduced me to this colorful image sensor world through his lively and humorous teaching in the image sensor course. His responsible guidance and instructive suggestions helped me in every step of the thesis, even though I left him a tight time schedule.

Special thanks to my daily supervisor and project leader dr. ir. Philippe Coppejans at Caeleste. He gave me valuable advice and hands-on instructions. He was always willing to share his knowledge and experience with me and I have learned a lot from his expertise. He taught me what the right steps are in a real design.

Moreover, I am particularly grateful to Prof. dr. Bart Dierickx, CTO of Caeleste, and dr. ir. Peng Gao. Bart guided me step by step from the beginning to the end. Without his help, this work could not have reached its present form. I would like to express my gratitude to Peng Gao, who is my mentor in this design. He spent a lot of time guiding my design, which helped me understand it profoundly. I really want to thank him for his support.

Also, I want to thank my colleagues at Caeleste: Jiaqi Zhu, Qiang Yao, Gaozhan Cai, Wei Wang, Patrick Henckes, Bert Luyssaert, Benoit Dupont, Dirk Uwaerts and many others, who are tutors in my study and friends in my life. Special thanks go to Jiaqi Zhu, who helped me a lot in every step of this design.

Last but definitely not least, I want to express my love to my parents and friends. It is their understanding, support and encouragement that drive me forward.

Ruijun Zhang
Delft University of Technology
September 7, 2015

Chapter 1

Introduction

With the fast development of image sensor technology and the strong growth in market demand, the specifications of image sensors are becoming more and more demanding. For example, image sensors are no longer limited to sensing visible light but also target ultraviolet (UV), infrared, X-ray or even particle detection. This necessitates special pixel designs and technologies. High-speed and high dynamic range (HDR) image sensors are also in demand. Some applications need a combination of several of these and other specifications. A custom-designed image sensor is one route to fulfilling such demands.

In this thesis project, we contributed to a 1-megapixel HDR and UV sensitive image sensor with a 14-bit 64Ms/s interleaved on-chip ADC. This work was carried out at Caeleste CVBA, Mechelen, Belgium.

In this first chapter, the motivation and challenges of this project are discussed in section 1.1. An overview of the thesis project is presented in section 1.2. Finally, section 1.3 describes the thesis organization.

1.1 Motivation and challenge

Most image sensors are intended to be sensitive to the visible spectrum, i.e. wavelengths in the range of 390 to 700 nm [1], because this is the range in which the human eye is sensitive. However, the wavelength range can be extended beyond the visible spectrum to obtain more information. Ultraviolet (UV) light is electromagnetic radiation with a wavelength from 390 nm down to 100 nm, shorter than that of visible light but longer than that of X-rays [2]. In some cases, sensitivity to ultraviolet light is essential for the application, as it can provide unique information and open up new applications.

Reflected-UV imaging has begun to emerge as an inspection modality for some industrial processes. Firstly, the most common application of reflected-UV imaging is the detection of scratches on a surface. At the same angle, the shorter UV wavelengths are scattered more strongly by tiny scratches than visible light. Figure 1.1 shows a CD jewel case imaged with visible light (left) and with 365-nm UV illumination (right) [3]. According to figure 1.1, the scratches can be seen much more easily in the UV image. Secondly, reflected-UV imaging can also be used to distinguish the materials on a surface by their different absorption coefficients. For example, new paint on a car can be distinguished by its degree of oxidation, because relatively unoxidized paint has a UV-inhibiting clear coat. Figure 1.2 shows a white Toyota Prius with a newly replaced fender imaged in visible light (left) and UV light (right) [3]. The figure shows that the new fender is darker in the 320-400-nm UV band than the older paint on the rest of the car. Additionally, as with visible light image sensors, high dynamic range (HDR) may also be required for UV sensitive image sensors.

![Figure 1.1: A CD jewel case imaged by visible light (left) and 365-nm UV lighting (right).](image1)

![Figure 1.2: A white Toyota Prius with a new replaced fender imaged by visible light (left) and UV light (right).](image2)

For the most common application of UV image sensors, the detection of tiny scratches on a surface, the combination of a highly UV sensitive image sensor and a high-resolution ADC is required; otherwise, one of the two is over-designed. As the level of system integration of image sensors increases, on-chip ADCs are becoming more popular than off-chip ADCs. Furthermore, applications of UV sensitive image sensors are relatively specialized and custom-design oriented compared to those of visible light image sensors. The on-chip ADC should therefore also be portable and easy to adapt to other projects. Considering all of the above requirements, a high-resolution and portable on-chip ADC is worth the design effort.

In this project, there are three main challenges.

1. A UV sensitive and high dynamic range pixel array and its analog readout circuit must be implemented.

2. An on-chip, high-resolution, high-speed and portable ADC is required.

3. Because the ADC is on-chip, synchronization of the whole system and high testability are necessary.

1.2 Thesis project overview

Figure 1.3 shows the sensor architecture of this project.

![Figure 1.3: Overview of sensor architecture.](image)

A 0.18 \(\mu\)m process is used in this project.

In general, this project contains six main blocks, the details of which are discussed in the following chapters.

- **Pixel array block**: It contains a pixel array with a resolution of 1280×720 pixels to meet the customer’s requirement. The pixel size is 6.0 \(\mu\)m \(\times\) 6.0 \(\mu\)m.
- **Pixel row driver block**: It contains the Y-axis addressing decoder for pixel row selection and the row drivers that drive the transistors in the pixels.
- **Analog readout block**: It contains the column readout circuits, the X-axis multiplexing and the analog bus, which read out the column signals ahead of the ADC block.
- **SAR ADC block**: It contains 16 interleaved 14-bit 4Ms/s SAR ADCs and the output serializers, which perform the A/D conversion and provide the digital output.
- **Digital control block**: It contains the variable addressable serial parallel interface (VASPI) and other interface and housekeeping sub-circuits for internal control.
- **I/O block**: It contains the power supply, low-voltage differential signaling (LVDS) sender/receiver and tuner settings that provide the I/O functions of the whole sensor.
Table 1.1 is a summary of this sensor’s specifications.

<table>
<thead>
<tr>
<th>Specification</th>
<th>Value/Comment</th>
</tr>
</thead>
<tbody>
<tr>
<td>Processing technology</td>
<td>0.18 µm 1P4M CMOS</td>
</tr>
<tr>
<td>Power supply</td>
<td>3.3V</td>
</tr>
<tr>
<td>Pixel pitch</td>
<td>6.0 µm</td>
</tr>
<tr>
<td>Dynamic range</td>
<td>101dB</td>
</tr>
<tr>
<td>QFW [e]</td>
<td>High gain: 12 ke-; Medium gain: 70 ke-; Low gain: 350 ke-</td>
</tr>
<tr>
<td>Readout noise</td>
<td>High gain: 3 e-; Medium gain: 30 e-; Low gain: 50 e-</td>
</tr>
<tr>
<td>Image sensor master clock</td>
<td>64MHz</td>
</tr>
<tr>
<td>ADC resolution</td>
<td>14 bits</td>
</tr>
<tr>
<td>Elementary ADC sample rate</td>
<td>4Ms/s</td>
</tr>
<tr>
<td>ADC master clock</td>
<td>128MHz</td>
</tr>
<tr>
<td>Frame rate[Hz]</td>
<td>64</td>
</tr>
</tbody>
</table>
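
The figures in Table 1.1 and the block list above can be cross-checked with a few lines of arithmetic. The sketch below is only an illustration: it assumes that every pixel is converted exactly once per frame and ignores blanking and other readout overhead, and the function names are hypothetical.

```python
# Values taken from Table 1.1 and the block description of the sensor.
N_ADCS = 16                # interleaved elementary SAR ADCs
ADC_RATE_SPS = 4_000_000   # elementary ADC sample rate (4Ms/s)
COLUMNS, ROWS = 1280, 720  # pixel array resolution
FRAME_RATE_HZ = 64         # frame rate


def aggregate_adc_rate(n_adcs, adc_rate_sps):
    """Total conversion rate of n_adcs time-interleaved ADCs, in samples/s."""
    return n_adcs * adc_rate_sps


def required_pixel_rate(columns, rows, frame_rate_hz):
    """Pixel rate needed if each pixel is converted once per frame (assumption)."""
    return columns * rows * frame_rate_hz


adc_rate = aggregate_adc_rate(N_ADCS, ADC_RATE_SPS)
pixel_rate = required_pixel_rate(COLUMNS, ROWS, FRAME_RATE_HZ)

print(f"Aggregate ADC rate : {adc_rate / 1e6:.2f} Ms/s")
print(f"Required pixel rate: {pixel_rate / 1e6:.2f} Mpixel/s")
print(f"ADC utilisation    : {pixel_rate / adc_rate:.1%}")
```

Under these assumptions the 16 interleaved 4Ms/s converters provide 64Ms/s in total, which leaves some margin above the pixel rate of a 1280×720 array read at 64 frames per second.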

1.3 Thesis organization

This thesis consists of six chapters. The first chapter introduces the project and describes its motivation and challenges.

Chapter 2 gives the necessary background information for this project. It covers the basic concepts of CMOS image sensors, the principle of the photodiode, conventional pixel operation, common HDR methods and the working principle of the successive approximation register (SAR) analog-to-digital converter.

Chapter 3 explains the three kinds of pixels and the 4×4 kernel used in this project. The extended FD method is also explained in this chapter.

Chapter 4 shows the design and simulation results of the whole analog readout chain, which includes the S/H circuits, column loads, readout buffers and the multiplexer.

Chapter 5 focuses on the ADC block. The design and verification of the comparator, serializer and digital control block are discussed. The single 4Ms/s SAR ADC and an overview of the interleaved 14-bit 64Ms/s SAR ADC are also presented in this chapter.

Chapter 6 contains the conclusions of the thesis and the future work.
Chapter 2

Background of the CMOS image sensor and SAR ADC

This chapter gives an overview of the working principle of CMOS image sensors, the most commonly used pixel structures, HDR methods and the successive approximation register (SAR) ADC. In section 2.1, optical absorption, photodiode operation and the photon absorption depth in silicon are discussed, followed by a brief introduction to pixel structures and common HDR methods in section 2.1.2. Finally, section 2.3 presents a brief overview of the SAR ADC.

2.1 Basic concept of CMOS image sensor

In 1967, Weckler proposed the operation of charge integration on a photon-sensing p-n junction, which is regarded as the fundamental principle of the CMOS image sensor [4]. This charge integration technique is still used in CMOS image sensors today. Shortly afterwards, in 1968, Weckler and Dyck proposed the first passive pixel image sensor [5]. Also in 1968, Peter Noble described the CMOS active pixel image sensor. This invention laid the foundation for almost all modern CMOS image sensors. Yet it was not until the 1990s that the limitations of CMOS technology were overcome and active pixel image sensors could develop rapidly.

2.1.1 The photodiode and optical absorption

Almost all solid-state image sensors are based on the photoelectric effect in a p-n junction and on charge integration. The photoelectric effect is the phenomenon in which electrically charged particles are excited from or within a material when it absorbs electromagnetic radiation. In the case of silicon, if the photon energy is larger than the band gap of silicon, the absorbed electromagnetic radiation generates electron-hole pairs, which can subsequently be collected by a photodiode.

The photodiode is the light detector in a CMOS image sensor. The basic structure of a pinned photodiode is shown in figure 2.1. The photodiode contains a p-n junction with a built-in electric field. The dashed region is the depletion region. Moreover, compared to a normal photodiode, the pinned photodiode contains an extra p+ layer above the n layer to reduce part of the dark current. The p+ layer separates the surface states of the SiO₂-Si interface from the n layer (storage area). Figure 2.2 shows a typical p-n junction band diagram and the movement of generated e-h pairs under reverse bias. The free electrons and holes inside the depletion region are pushed into the n-doped and p-doped regions, respectively, by the electric field, as shown in figure 2.2.

![Figure 2.1: Pinned photodiode.](image)


![Figure 2.2: P-n junction band diagram and inside movement of e-h pairs.](image)


The generated e-h pairs can be classified according to the place of origin, in the depletion region (Region II in figure 2.3 [6]) or in the quasi-neutral region (Region I and III in figure 2.3).
E-h pairs generated inside the depletion region are effectively separated by the built-in electric field: electrons drift towards the n-doped region and holes drift towards the p-doped region. The e-h pairs generated in the quasi-neutral regions move by Brownian motion. Some of them reach the depletion region and are then swept into the n-doped or p-doped region by the electric field. Others recombine, are collected by other devices inside the pixel, or are collected by a neighboring pixel. The green area in figure 2.3 is the net photocurrent contribution of each area of a typical photodiode [6].

![Figure 2.3: Net photocurrent contribution of each area of a typical photodiode[6]](image)

Furthermore, the light absorption in a photodiode depends on the wavelength and on the depth in the silicon. Table 2.1 shows the absorption depth of photons in silicon versus wavelength.

Photons with shorter wavelengths have a shallower penetration depth. Figure 2.3 presents the light distribution in a typical photodiode. The photocurrent model is $I(z) = I_0 e^{-\alpha z}$, where $I_0$ is the value at the silicon surface, $z$ is the depth in the silicon and $\alpha$ is the absorption coefficient of silicon. Because of this property, the location of the p-n junction inside the photodiode has a large influence on the absorption of photons of different wavelengths. Another important factor for the light absorption is the structures above the pixel, such as the SiO$_2$ layers and metal wires, which can either absorb or reflect the light before it reaches the photodiode.
### Table 2.1: Absorption depth of photons in silicon.

<table>
<thead>
<tr>
<th>wavelength (nm)</th>
<th>Absorption depth (µm)</th>
</tr>
</thead>
<tbody>
<tr>
<td>400</td>
<td>0.25</td>
</tr>
<tr>
<td>500</td>
<td>1</td>
</tr>
<tr>
<td>550</td>
<td>1.54</td>
</tr>
<tr>
<td>600</td>
<td>2.5</td>
</tr>
<tr>
<td>700</td>
<td>5</td>
</tr>
<tr>
<td>800</td>
<td>11.1</td>
</tr>
<tr>
<td>900</td>
<td>30.3</td>
</tr>
<tr>
<td>1000</td>
<td>100</td>
</tr>
</tbody>
</table>
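
The exponential model above can be combined with Table 2.1 to estimate how much light is absorbed within a given depth of silicon. The following sketch is illustrative only: it assumes that the absorption depth in Table 2.1 is the depth at which the intensity has dropped to 1/e, i.e. $\alpha = 1/\text{absorption depth}$, and the function name and the 1 µm example depth are hypothetical choices.

```python
import math

# Absorption depth in silicon (µm) versus wavelength (nm), copied from Table 2.1.
ABSORPTION_DEPTH_UM = {
    400: 0.25, 500: 1.0, 550: 1.54, 600: 2.5,
    700: 5.0, 800: 11.1, 900: 30.3, 1000: 100.0,
}


def absorbed_fraction(wavelength_nm, layer_depth_um):
    """Fraction of the incident light absorbed between the surface and layer_depth_um.

    Uses I(z) = I0 * exp(-alpha * z) with alpha = 1 / absorption depth (assumption).
    """
    alpha_per_um = 1.0 / ABSORPTION_DEPTH_UM[wavelength_nm]
    return 1.0 - math.exp(-alpha_per_um * layer_depth_um)


# Example: share of the light absorbed in the first 1 µm of silicon.
for wavelength_nm in sorted(ABSORPTION_DEPTH_UM):
    fraction = absorbed_fraction(wavelength_nm, 1.0)
    print(f"{wavelength_nm:5d} nm: {fraction:6.1%} absorbed within 1 µm")
```

The output illustrates why the junction depth matters: short wavelengths are almost completely absorbed close to the surface, while long wavelengths are absorbed much deeper in the silicon.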

#### 2.1.2 Pixel structures and common methods of HDR

Today the most commonly used pixel structure is the active pixel sensor (APS), as it can handle many noise issues and offers reasonable performance at an acceptable cost. The 3T and 4T pixels are the most widely used APS structures.

##### 2.1.2.1 Photodiode Three Transistor (3T) Pixel

The 3T pixel is the first active pixel sensor. It contains a photodiode and three transistors (a reset transistor, a row select transistor and a source-follower readout transistor), which is why it is called a 3T cell. Figure 2.4 shows the typical 3T APS structure. Its operation is as follows. At the beginning, the reset transistor (RST) is on and the photodiode is set to its maximum voltage (Vdd). Then the reset transistor is switched off and the exposure phase starts. Switching off the reset transistor introduces $kT/C$ noise on the photodiode. During the exposure, the photon-generated electrons are collected by the photodiode and the photodiode voltage decreases. When the exposure ends, the row select transistor (RS) is switched on and the photodiode voltage is read out by the source follower to the column bus. At the end of the readout phase, the reset transistor (RST) resets the photodiode voltage to Vdd for the next exposure, and this reset value is also read out by the source follower. The corresponding timing diagram is shown in Figure 2.5.

The 3T pixel is the first structure to introduce an in-pixel amplifier, which isolates the photodiode capacitance from the large column bus capacitance. However, the in-pixel amplifiers show large mismatch between pixels, which introduces fixed pattern noise (FPN). Double Delta Sampling (DDS) is one solution for reducing these offset variations. Although this technique can be very efficient for offset cancellation, it can have a negative effect on the reset noise and other thermal noise sources [7].

##### 2.1.2.2 Pinned Photodiode Four Transistor (4T) Pixel

To overcome the problems of the 3T pixel, the 4T pixel was developed, and it is the mainstream CMOS image sensor pixel architecture today. Compared with the 3T structure, the 4T pixel uses a pinned photodiode instead of a normal photodiode and adds a transfer gate and a floating diffusion node, as shown in Figure 2.6.
As mentioned before, due to the shallow p+ implantation, the pinned photodiode is separated from the surface states of the SiO$_2$-Si interface, which reduces the dark current by more than an order of magnitude.

Figure 2.7 shows the timing diagram. The operation of the 4T pixel is as follows. Firstly, the floating diffusion is reset to its maximum value via the reset transistor (RST). Then the reset transistor is switched off and the row select transistor is switched on. The voltage of the floating diffusion (FD) node ($V_{\text{reset}}$) is read out to the column bus. During this phase, there is a small voltage drop on the floating diffusion due to the cross-talk of the reset pulse. Secondly, when the transfer gate (TX) is switched on, the photon-generated charge is transferred from the pinned photodiode to the floating diffusion node. When TX is switched off, the FD voltage is read out as the video signal ($V_{\text{signal}}$). Correlated double sampling (CDS) is possible for the 4T pixel because the reset signal and the video signal can be read out in the same frame. CDS takes the difference between $V_{\text{reset}}$ and $V_{\text{signal}}$ to cancel the reset noise and the offset variations from pixel to pixel.
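
The difference between the two sampling schemes can be illustrated with a small Monte Carlo sketch. It is a simplified model, not the simulation used in this work: all voltages, noise levels and names are hypothetical, only a per-pixel offset and the reset ($kT/C$) noise are modelled, and the 3T reference level is assumed to be taken after a new reset while the 4T reference level is taken from the same reset as the signal.

```python
import random
import statistics


def simulate_sampling(n_pixels=20000, signal_mv=200.0, offset_sigma_mv=10.0,
                      reset_sigma_mv=1.0, seed=1):
    """Return (cds_out, dds_out): the difference signals in mV for each pixel."""
    rng = random.Random(seed)
    cds_out, dds_out = [], []
    for _ in range(n_pixels):
        offset = rng.gauss(0.0, offset_sigma_mv)        # fixed source-follower offset
        reset_before = rng.gauss(0.0, reset_sigma_mv)   # kT/C sample stored with the signal
        reset_after = rng.gauss(0.0, reset_sigma_mv)    # kT/C sample of the next reset
        v_signal = offset + reset_before - signal_mv    # level after the signal charge
        v_reset_cds = offset + reset_before             # 4T: same reset as the signal
        v_reset_dds = offset + reset_after              # 3T: new, uncorrelated reset
        cds_out.append(v_reset_cds - v_signal)
        dds_out.append(v_reset_dds - v_signal)
    return cds_out, dds_out


cds_out, dds_out = simulate_sampling()
print(f"CDS: mean {statistics.mean(cds_out):.2f} mV, sigma {statistics.stdev(cds_out):.3f} mV")
print(f"DDS: mean {statistics.mean(dds_out):.2f} mV, sigma {statistics.stdev(dds_out):.3f} mV")
```

In this model both schemes remove the pixel offset, but only CDS removes the reset noise. With two uncorrelated reset samples, the DDS output carries roughly $\sqrt{2}$ times the reset noise, which is consistent with the drawback of DDS noted for the 3T pixel.
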

##### 2.1.2.3 Common methods of HDR

The noise sources in an image sensor can be broadly classified into three categories: dark current, fixed pattern noise and temporal noise. Noise that is fixed at a certain spatial position is called fixed pattern noise (FPN). Noise that is time-dependent and not fixed to a spatial position is called temporal noise.

Dark current is the leakage current in the diode region that is not generated by photons but by junction and transistor leakage. The dark current limits the dynamic range of image sensors. It also varies from pixel to pixel, which is called dark signal non-uniformity (DSNU), and it is hard to cancel.

Fixed-pattern noise originates from the mismatch between devices, which is fixed to a certain spatial position. The two main fixed-pattern noise components are offset mismatch and gain mismatch.

Temporal noise mainly consists of kT/C noise from the reset phase, flicker noise and photon shot noise. The kT/C noise is a soft thermodynamic limitation which can be suppressed by CDS. Flicker noise is not a physical limitation and can be reduced by design effort. Photon shot noise is a hard, fundamental physical limitation that follows from the quantum nature of light; it is unavoidable and cannot be canceled.

The dynamic range (DR) of an image sensor is the ratio between the largest and the smallest detectable signal in the same scene. It is defined as:

\[
DR = 20 \log \frac{N_{Max}}{N_{Min}} \quad (2.1)
\]

\(N_{Max}\) is the maximum number of photon-generated charges at saturation and \(N_{Min}\) is the minimum number of detectable photon-generated charges, which is equal to the noise level in a dark environment. In contrast to the DR, the signal to noise ratio (SNR) is the ratio of the signal voltage to the noise level, which is defined as:

\[
SNR = 20 \log \frac{S_{\text{input}}}{S_{\text{noise}}} \quad (2.2)
\]

\(S_{\text{input}}\) is the input signal voltage and \(S_{\text{noise}}\) is the noise level, also in volts, at the same time. The difference between the SNR and the DR is that the SNR depends on the signal and the environment, whereas the DR is a property of the device.

To increase the dynamic range, we can increase \(N_{\text{Max}}\), reduce \(N_{\text{Min}}\), or do both. The dynamic range of a CMOS image sensor is limited by the well capacity, which is usually set by the floating diffusion node, and by the noise.
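
As a minimal sketch of equations 2.1 and 2.2, the following Python functions evaluate the DR and the SNR in decibels. The logarithm is assumed to be base 10, and the electron counts are hypothetical round numbers chosen only to show how each decade in \(N_{Max}\) or \(N_{Min}\) changes the DR by 20 dB.

```python
import math


def dynamic_range_db(n_max, n_min):
    """Equation 2.1: DR in dB from saturation charge and dark noise floor (electrons)."""
    return 20 * math.log10(n_max / n_min)


def snr_db(s_input, s_noise):
    """Equation 2.2: SNR in dB from signal and noise, both in volts."""
    return 20 * math.log10(s_input / s_noise)


# Hypothetical illustration values, not taken from this design.
base = dynamic_range_db(10_000, 10)          # about 60 dB
bigger_well = dynamic_range_db(100_000, 10)  # 10x larger N_Max: about 80 dB
lower_noise = dynamic_range_db(10_000, 1)    # 10x lower N_Min: about 80 dB
print(base, bigger_well, lower_noise)
```

Raising \(N_{Max}\) and lowering \(N_{Min}\) by the same factor have the same effect on the DR, which is why both routes are considered in the HDR methods that follow.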

In general, there are four popular methods to extend the dynamic range [8]: the multiple capture method, the multiple conversion gain method, the time to saturation detection method and the nonlinear response method.

The multiple capture method captures a scene with different exposure times and combines the captures into one image. Figure 2.8 shows the principle of the multiple capture method and one example. The drawback of this method is that it introduces incorrect interpolation for fast moving objects. A longer exposure time also has larger photon shot noise under the same illumination. This method requires no circuit level design effort.

Compared to the multiple capture method, the multiple conversion gain method needs only one exposure, but it reads out the integrated charge twice: in the high gain model and in the low gain model. The low gain model detects high illumination and the high gain model is for dim scenes.

Figure 2.9 shows the principle of the multiple conversion gain method. The multiple conversion gain method does not suffer from blurred images of fast moving objects, but it needs circuit level design effort.

The idea of the time to saturation detection method is to reuse the well capacity within one exposure. When the output signal reaches a threshold voltage, the FD is reset and the reset count is incremented by one. When the exposure ends, the final output signal is the reset count times the threshold voltage plus the residual voltage. As the FD is reset several times, this method introduces large $kT/C$ noise.
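
The reconstruction rule of the time to saturation method can be sketched as follows. This is an idealized behavioral model under stated assumptions: the function names are hypothetical, the reset is assumed to be instantaneous and noiseless, and the voltages are arbitrary illustration values.

```python
def time_to_saturation(v_total, v_threshold):
    """Idealized model: count FD resets and return the residual voltage.

    v_total is the signal that an unlimited well would have accumulated.
    """
    n_resets = int(v_total // v_threshold)
    v_residual = v_total - n_resets * v_threshold
    return n_resets, v_residual


def reconstruct(n_resets, v_residual, v_threshold):
    """Final output: reset count times threshold plus residual voltage."""
    return n_resets * v_threshold + v_residual


n, residual = time_to_saturation(3.7, 1.0)  # 3 resets, residual close to 0.7 V
print(n, residual, reconstruct(n, residual, 1.0))
```

In a real pixel each of the resets adds its own $kT/C$ noise sample to the reconstructed value, which is the drawback mentioned above and is not modeled here.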

There are usually two ways to achieve a nonlinear response: the logarithmic response photodiode and the well capacity adjusting method. In logarithmic sensors, the photocurrent flows through a resistor implemented by a MOS transistor operating in weak inversion, so that the photocurrent is directly converted to a voltage to be read out. The well capacity adjusting method uses a reset transistor to change the voltage of the floating diffusion in order to increase the well capacity one or more times during integration. Both of them are shown in figure 2.10 [8] [9].

Our design employs the extended FD method to achieve HDR. This method belongs to the multiple conversion gain type. The details will be discussed in Chapter 3.
2.2 Electronic shutter mode

As the pixel output is determined by the amount of light sensed, both the light intensity and the exposure time have a large impact. In the age of film cameras, the mechanical shutter was widely used to control the amount of light. In digital cameras, the mechanical shutter is replaced by the electronic shutter.

In CMOS image sensors, there are two kinds of electronic shutter modes: rolling shutter and global shutter. Figure 2.11 illustrates the working principle of the rolling shutter.

For the rolling shutter, the pixel array is operated row by row. All pixels in one row are reset and exposed at the same time. There is a constant delay between successive row operations to reduce the peak current of the reset phase. The disadvantage of the rolling shutter is motion blur artifacts, as different rows capture the scene at different times.

To eliminate the rolling shutter’s undesired artifacts, the global shutter was developed. Figure 2.12 shows the working principle of the global shutter.

Compared to the rolling shutter, the global shutter captures the whole image at a single moment, so the motion blur issue is solved. All pixels of the sensor are reset and exposed at the same time. At the end of the exposure, the charges of each row are transferred to an additional, non-photosensitive memory element. Finally, they are read out sequentially, row by row. The limitations of the global shutter are the large reset peak current and the additional memory element, which requires extra area.

2.3 Successive approximation register (SAR) analog-to-digital converter

In the majority of CMOS image sensors, on-chip analog-to-digital converters are employed to produce digital outputs, owing to the high level of integration. In our design, the successive approximation register (SAR) ADC architecture is chosen. The SAR ADC has four main blocks: a sample and hold circuit, a comparator, a digital to analog converter and a control logic block. Figure 2.13 shows a simplified 6-bit SAR ADC architecture and an example operation [8].

As the name implies, the SAR ADC basically uses a binary search algorithm, although there are many variations in implementing a SAR ADC. Firstly, the input signal is sampled and held as one input of the comparator. The other input of the comparator is the output of the DAC, which is controlled by the successive approximation register. To implement the binary search algorithm, the output of the DAC is first set to $V_{FS}/2$, where $V_{FS}$ is the reference voltage of the ADC. Then a comparison is performed to determine whether $V_{in}$ is less than or greater than $V_{DAC}$. If $V_{in}$ is larger than $V_{DAC}$, the comparator output is a logic high and the MSB is set to logic 1. Otherwise, the comparator output is a logic low and the MSB is set to logic 0. Then, if the MSB is logic 1, the SAR control logic raises $V_{DAC}$ by half of the previous step, that is by $V_{FS}/4$ to $3V_{FS}/4$; if the MSB is logic 0, it lowers $V_{DAC}$ by $V_{FS}/4$ to $V_{FS}/4$. $V_{DAC}$ is then compared with $V_{in}$ again. The sequence continues all the way down to the LSB. In the end, the N-bit digital word is available in the register.
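
The binary search described above can be sketched behaviorally in a few lines of Python. This is an idealized model, assuming a perfect comparator and an ideal DAC with output $V_{FS} \cdot code / 2^N$; the function name `sar_convert` is hypothetical and the sketch does not represent the circuit implementation of this design.

```python
def sar_convert(v_in, v_fs, n_bits):
    """Idealized SAR conversion: one comparison per bit, MSB first."""
    code = 0
    for bit in range(n_bits - 1, -1, -1):
        trial = code | (1 << bit)                 # tentatively set this bit
        v_dac = v_fs * trial / (1 << n_bits)      # ideal DAC output for the trial code
        if v_in >= v_dac:                         # comparator decision
            code = trial                          # keep the bit
        # otherwise the bit stays 0 and the next, smaller step is tried
    return code


# 6-bit example: the DAC steps through 1.0, 0.5, 0.75, 0.625, 0.6875, 0.71875 V.
print(sar_convert(0.7, 2.0, 6))    # 22
# 14-bit example with a 2 V full scale: mid-scale input.
print(sar_convert(1.0, 2.0, 14))   # 8192
```

Each conversion needs exactly N comparisons, so the comparator delay and the DAC settling time per step directly set the achievable sample rate.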

To design a good SAR ADC, there are three main challenges. Firstly, a robust control logic block that implements the search algorithm is needed. Secondly, a comparator with low offset, short delay and high resolution is required. Finally, an accurate and fast-settling DAC is a must. Meeting these challenges takes considerable design effort.

2.4 Conclusion

In this chapter, section 2.1.1 presented the working principles of the photodiode and optical absorption. Section 2.1.2 gave an overview of the conventional 3T and 4T pixel structures and of the general HDR methods. Section 2.2 illustrated the electronic shutter modes. Section 2.3 introduced the SAR ADC.

In the next chapter, the detailed design of the pixel array will be explained.
Chapter 3

Pixel concept

In the previous chapter, we gave a basic background on pixel structures and their working principles. This chapter focuses on the design of the pixel array. Section 3.1 presents the three kinds of basic pixel units in this design and their operation timing diagram. The whole pixel array and the kernel of the pixel array, which is adapted to the technology, are given in section 3.2.

3.1 Pixel overview

The design target for this project is to detect UV light with an HDR function. To meet this requirement, three kinds of pixels, a special kernel and an appropriate readout mode are implemented in this project.

3.1.1 UV sensitive pixel and visible light pixel

The UV sensitive pixel and the visible light pixel in this design are both based on the architecture shown in figure 3.1, which is a 4T pixel structure. To achieve UV sensitivity, a special layout and front side thinning technology are employed in the UV sensitive pixel. The visible light (VIS) pixel has the same architecture as the UV pixel except for the front side thinning. The VIS pixel has two functions. The first is to obtain the visible light information. The second is to fulfill the special technology requirement of the UV pixel. The difference between the UV pixel output and the VIS pixel output provides the information on the UV light.

When an image sensor is operated with front-side-illuminated pixels, the quantum efficiency (QE) depends strongly on the metal density in the metal-oxide-metal stack above the silicon and on the thickness of the silicon oxide. Therefore the layout of the UV sensitive pixel has very few metal interconnections, which are preferably routed in the border area. The details of the pixel layout will be discussed in the next section together with the pixel kernel. Even if the pixel is almost entirely free from metal, the UV light is still absorbed in and reflected at the front-side silicon oxide. To achieve a higher sensitivity to UV light, it is useful to let the photons reach the pinned photodiode directly. Front side thinning, which creates trenches in the silicon oxide layer, is a solution for this issue. Figure 3.2 shows the classic front-side illuminated pixel and the front-side thinned pixel.

To achieve the HDR function, the UV sensitive and visible light pixels can both work in the medium gain model, which is based on the 3T model. In the medium gain model, the transfer gate is always on so that the FD is connected to the PPD directly. The full well charge ($Q_{FW}$) is then set by the sum of $C_{PPD}$, $C_{FD}$ and $C_{TG}$. The drawback of the medium gain model is that a 3T pixel cannot perform CDS, so the noise level is relatively high compared to the 4T model. However, the maximum SNR is higher than in the high gain model because of the larger $N_{Max}$. Table 3.1 summarizes the UV sensitive and visible light pixel specifications.

Table 3.1: UV sensitive and visible light pixel specifications.

<table>
<thead>
<tr>
<th>Model</th>
<th>Pixel size</th>
<th>Charge conversion ($\mu$V/e-)</th>
<th>kT/C noise</th>
<th>SNR$_{Max}$</th>
<th>Q$_{FW}$</th>
</tr>
</thead>
<tbody>
<tr>
<td>High gain</td>
<td>6.0 µm × 6.0 µm</td>
<td>80</td>
<td>20e</td>
<td>100</td>
<td>12k</td>
</tr>
<tr>
<td>Medium gain</td>
<td>6.0 µm × 6.0 µm</td>
<td>13</td>
<td>50e</td>
<td>250</td>
<td>70k</td>
</tr>
</tbody>
</table>

3.1.2 Low blue pixel

Another type of pixel in this design is the low blue pixel, which can detect the lower part of the visible spectrum. The low blue pixel partly filters out the blue light because of a poly layer on top of the photodiode. The main reason for introducing the low blue pixel is to ensure that the fabrication of the UV pixel can be adapted to a certain technology. In addition, the poly layer on top of the photodiode can realize more functions, especially for the HDR.

The schematic of the low blue pixel is shown in figure 3.3. It is based on the 4T structure with an extra variable FD capacitor.

Compared to the UV sensitive and VIS pixel, one extra capacitor is introduced to increase the full well charge. The source follower and select (SEL) transistors employ a specific implant to achieve a low $V_{th}$ which provides a larger signal range.

In this design, the extra variable FD capacitor is placed on top of the photodiode as a MOS electrode to the inversion layer in the pinning layer, as in [11]. Figure 3.4 shows the physical structure of the low blue pixel. The poly layer serves two purposes. Firstly, it filters out part of the light, especially the blue light, in accordance with the optical absorption property of silicon. Secondly, it can be used as a MOS capacitor: a positive voltage applied to the poly layer forms an inversion layer in the pinning layer, and the poly electrode together with this inversion layer constitutes the MOS capacitor.

Because of the inverted MOS capacitor, the low blue pixel can work in the high gain, medium gain and low gain models. The high gain and medium gain models are the same as for the UV sensitive pixel and the visible light pixel. Figure 3.5 shows the layout of the low blue pixel.

Figure 3.3: The low blue pixel architecture.

In the low gain model, according to the layout, if the electron node is connected to a proper positive voltage, the inverted MOS capacitor is connected to the PPD and the FD directly. The full well charge \(Q_{FW}\) is then set by the sum of \(C_{PPD}\), \(C_{FD}\), \(C_{TG}\) and \(C_{cap}\). The drawback of the low gain model is the same as that of the medium gain model for the other two pixels: a 3T pixel cannot perform CDS, so the noise level is relatively high. Table 3.2 summarizes the low blue pixel specifications.

Those three kinds of pixels are designed for different sensitivity in different spectrum. The figure 3.6 shows the quantum efficiency as a function of the wavelength for the three pixels.

### Table 3.2: Low blue pixel specifications.

<table>
<thead>
<tr>
<th>Model</th>
<th>Pixel size</th>
<th>Charge conversion ($\mu$V/e$^-$)</th>
<th>kT/C noise</th>
<th>SNR$_{Max}$</th>
<th>Q$_{FW}$</th>
</tr>
</thead>
<tbody>
<tr>
<td>High gain</td>
<td>6.0µm x 6.0µm</td>
<td>80</td>
<td>20e</td>
<td>100</td>
<td>12K</td>
</tr>
<tr>
<td>Medium gain</td>
<td>6.0µm x 6.0µm</td>
<td>13</td>
<td>50e</td>
<td>250</td>
<td>70K</td>
</tr>
<tr>
<td>Low gain</td>
<td>6.0µm x 6.0µm</td>
<td>2.5</td>
<td>100e</td>
<td>650</td>
<td>350K</td>
</tr>
</tbody>
</table>
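
The conversion gain and full well values in the table can be related through two simple identities: the effective capacitance seen by the charge is $C = q / CG$, and the output swing at full well is $CG \cdot Q_{FW}$. The following sketch evaluates both for the three gain models. It assumes the conversion gain is referred to the point where the signal swing is measured; the helper names are hypothetical.

```python
Q_E = 1.602176634e-19  # elementary charge in coulombs

# (conversion gain in V/e-, full well in electrons), values from the table above
GAIN_MODELS = {
    "high": (80e-6, 12_000),
    "medium": (13e-6, 70_000),
    "low": (2.5e-6, 350_000),
}


def effective_capacitance(conversion_gain):
    """Capacitance that would produce the given conversion gain: C = q / CG."""
    return Q_E / conversion_gain


def full_well_swing(conversion_gain, q_fw):
    """Signal swing when the full well charge is converted to voltage."""
    return conversion_gain * q_fw


for name, (cg, q_fw) in GAIN_MODELS.items():
    c_ff = effective_capacitance(cg) * 1e15
    swing = full_well_swing(cg, q_fw)
    print(f"{name:>6}: C = {c_ff:6.2f} fF, swing = {swing:.3f} V")
```

Under these assumptions the three models correspond to roughly 2 fF, 12 fF and 64 fF, and all three reach a similar swing of about 0.9 V at full well, which suggests that the capacitance steps were chosen to reuse the same output voltage range.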

We can see different QE in the short wavelength part due to the thinning of the oxide layer of the UV sensitive pixel and the poly layer of the low blue pixel. Since no filtering happens at the higher end of the visible range, we see a relatively similar QE for the three pixels.

![Figure 3.6: The estimated quantum efficiency as a function of the wavelength for the low blue, visible and UV sensitive pixel.](image)

3.1.3 Pixel kernel

In this design, we propose two kinds of basic $2 \times 2$ pixel structures to build up the pixel kernel, as shown in figure 3.7. In the left one, two low blue pixels are placed around the UV sensitive pixel to meet the etching technology requirement: the minimal opening width imposed by the fabrication process is 4µm in both the x and y directions. The right-side structure is designed to optimize the modulation transfer function (MTF).

Figure 3.7: The basic 2x2 pixel structure.

For the pixel kernel, the fabrication is inevitably subject to lithographic limitations because of the UV sensitivity requirement. The minimal opening width imposed by the fabrication process, 4µm in both the x and y directions, is not compatible with the target pixel pitch of 6µm if the 3 different pixels are placed in a traditional Bayer pattern. To meet both the design and the fabrication requirements, we arrange the 3 different pixels in a different fashion such that 2 UV pixels are put side-by-side and mirrored, as shown in figure 3.8. In figure 3.8, the dashed box delimits the repeated new $4 \times 4$ kernel. The green, blue and red highlighted boxes are the low blue, visible and UV sensitive pixels, respectively. This new 4×4 pixel kernel obeys the etching restrictions while maintaining a small pixel size.

Figure 3.8: The 4X4 pixel pattern.

Based on this kernel, a 1280×720 pixels array is implemented in this design.

3.2 Pixel operation

The timing diagrams of three different working models are illustrated in the figure 3.9(a), figure 3.9(b) and figure 3.9(c).

In the high gain model, the timing diagram is the same as that of a normal 4T pixel. The FD is first reset to $V_{dd_{pix}}$ and the select transistor is turned on to read out the reset signal. At the same time, the pixel is in the integration phase. When the integration phase ends, the TG is turned on to transfer the electrons from the PPD to the FD. Then the HG video signal is read out through the select transistor. In this design, the CDS is performed later, in the ADC stage.

In the medium gain model, the TG is always on, so the 4T pixel works as a 3T pixel. At the beginning, the FD is reset to $V_{dd_{pix}}$ and the reset signal is read out via the SF. Because there is no separate transfer step through the TG, the medium gain video signal is read out directly after the reset signal. For this reason, the reset phase should be relatively short.

The low gain model is specially designed for the low blue pixel. At the beginning, the electron signal is on and the inverted MOS capacitor is formed to increase the FD capacitance. The rest of the operation timing is the same as in the medium gain model.

Figure 3.9: The timing diagrams of three different working models.

Based on the operation timing described above, figure 3.10 shows a rough standalone pixel simulation in the high gain model. This simulation is used to determine the gain of the SF and the parasitic capacitance of the FD node. The parasitic RC of the column bus comes from two parts. The first is the drain node capacitance of the 720 parallel SEL transistors. The second is the resistance and capacitance of the column bus itself, which is a thin metal wire. The parasitic RC of the column bus is estimated from the layout. Table 3.3 shows a summary.

<table>
<thead>
<tr>
<th>Parameter</th>
<th>Value</th>
</tr>
</thead>
<tbody>
<tr>
<td>Source follower gain</td>
<td>0.816</td>
</tr>
<tr>
<td>Largest output swing</td>
<td>1.1V</td>
</tr>
<tr>
<td>Parasitic capacitance of the FD node</td>
<td>0.717 fF</td>
</tr>
<tr>
<td>Column bus parasitic (node)</td>
<td>460.8 fF</td>
</tr>
<tr>
<td>Column bus parasitic (wire)</td>
<td>1134 Ω and 0.864 pF</td>
</tr>
</tbody>
</table>
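
To give a feeling for the column bus parasitics in the table, the sketch below derives the per-row node capacitance and a lumped RC time constant. Lumping the full wire resistance in series with the total capacitance is a pessimistic simplifying assumption made here for illustration; the output impedance of the source follower, which usually dominates the settling, is ignored.

```python
N_ROWS = 720              # SEL transistors connected to one column bus
C_NODE_TOTAL = 460.8e-15  # F, total SEL drain node capacitance (table value)
R_WIRE = 1134.0           # ohm, column wire resistance (table value)
C_WIRE = 0.864e-12        # F, column wire capacitance (table value)

c_per_sel = C_NODE_TOTAL / N_ROWS  # node capacitance contributed by one row
c_bus = C_NODE_TOTAL + C_WIRE      # total capacitance hanging on the bus
tau_lumped = R_WIRE * c_bus        # lumped, worst-case style estimate

print(f"per-row node capacitance: {c_per_sel * 1e15:.2f} fF")  # 0.64 fF
print(f"total bus capacitance:    {c_bus * 1e12:.4f} pF")      # 1.3248 pF
print(f"lumped RC time constant:  {tau_lumped * 1e9:.2f} ns")  # 1.50 ns
```

The wire itself therefore contributes a time constant of only about 1.5 ns under this estimate.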
3.3 Conclusion

In this chapter, three different kinds of pixels and their features were presented. Based on those pixels, a special pixel kernel was developed that meets the etching restrictions of the technology at a small pixel size. Then, the timing diagrams of the three working models were introduced, and a standalone pixel simulation based on the high gain model was presented.

In the next chapter, the analog readout chain is described. It contains the column load, the sample and hold circuit, the readout buffer and the multiplexer.

In the previous chapter, we introduced the three pixels and the special 4x4 pixel kernel of this design. This chapter focuses on the analog readout part before the on-chip ADC. The design of the on-chip ADC will be discussed in Chapter 5.

4.1 Overview

The figure 4.1 shows an overview of the analog readout circuit in this design.

![Figure 4.1: The overview of readout circuit.](image)

In this design, the readout circuit can be divided into an analog part and a digital part. The analog readout part refers to the readout chain before the ADC block. The digital readout part refers to the 14-bit digital word after the A/D conversion, which is read out by two low-voltage differential signaling (LVDS) senders.

In this design, we arrange the readout in an odd/even interleaved way: while one row is in the S/H phase, the previous row is in the readout phase. Eight analog buses connect the column outputs to the ADC block.

Because of the noise requirement, the sample and hold capacitor is at least 2 pF. Its settling error should be less than one half LSB at 64 frames per second, which means about $60\mu V$ for a 14-bit ADC with a $2\text{V}_{pp}$ input range. Meeting this requirement in a traditional way is difficult and would cost a large power consumption and area. With the interleaved method, the sample and hold time increases to nearly half of the row time. The price paid is that four S/H capacitors and readout buffers are required for each column, which costs area. As the S/H circuit has a relatively long hold time, the leakage of the S/H capacitor should be taken into account.
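
The numbers in this paragraph can be checked with a short calculation. The sketch assumes an ideal 14-bit quantizer over the 2 V range, and it estimates the row time from 64 frames per second and the 720 rows of the pixel array while ignoring any blanking time, which is a simplification made here.

```python
V_PP = 2.0       # ADC input range in volts
N_BITS = 14      # ADC resolution
FRAME_RATE = 64  # frames per second
N_ROWS = 720     # rows of the pixel array (blanking ignored, an assumption)

lsb = V_PP / 2**N_BITS                # one LSB
half_lsb = lsb / 2                    # settling error budget
row_time = 1 / (FRAME_RATE * N_ROWS)  # time available per row
sample_window = row_time / 2          # upper bound with odd/even interleaving

print(f"LSB          = {lsb * 1e6:.2f} uV")            # 122.07 uV
print(f"half LSB     = {half_lsb * 1e6:.2f} uV")       # 61.04 uV
print(f"row time     = {row_time * 1e6:.2f} us")       # 21.70 us
print(f"half row time = {sample_window * 1e6:.2f} us")  # 10.85 us
```

The half-LSB budget of about 61 µV agrees with the rounded value quoted above.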

To achieve this readout method, the analog readout chain contains four main blocks:

1. Column load: provides the load current for each column and removes the memory effect of the column bus.

2. Sample and hold circuit and readout buffers: provide the S/H function and send the analog signal out via the analog buses to the ADC block.

3. Control logic: generates the sync signal and clock for the timing sequence of the x-direction multiplexing, and controls the readout buffers and the region of interest (ROI) function.

4. Row driver: provides the row-by-row logic control signals of the pixel array. Each row of pixels requires 3 or 4 control signals: TG, RESET, SEL, and ELECTRON.

Finally, to adapt the column outputs to the time-interleaved ADCs, 8 analog buses are arranged between the analog readout chain and the ADC block. Each bus has two signal lines: a reset line, which carries the reset signal of the pixel, and a video line, which carries the signal of the pixel after exposure. The analog buses can also be connected to bond pads to test the ADC block and the image sensor block independently.

### 4.2 Column load

To read out the column signal, the in-pixel source followers are biased by a subcircuit called the column load. Every column needs a column load, so 1280 column loads are implemented in parallel in this design. The dashed box in figure 4.2 shows the schematic of the column load used in this design.

The column load has four main functions:

1. Bias current for the in-pixel source follower: With proper biasing, the 'ntuner' transistor provides the bias current for the in-pixel source follower to read out the column output signal. The bias current value depends on the S/H capacitor size, the required settling time and the tolerable error. Normally, a large capacitance and a short settling time lead to a large bias current.

2. Column short testability: The 'test' transistor in the dashed box 'C' is designed to check whether two neighboring columns are shorted. When the test transistor is on, the voltage of the column output is forced to Vddpix. If neighboring columns are not shorted, they carry different signals. Otherwise, both of them are forced to Vddpix.

3. Precharge of the column bus: The "Precharge" transistor in the dashed box "D" is designed to speed up the column bus settling. Between the reset and video signals, the column bus needs to be precharged to cancel the memory effect. For a 4T pixel, this settling time must be of the order of the transfer time, which is relatively short. Without the precharge function, the column bus settles only through the bias current of the "ntuner" transistor. To speed up the settling, an additional discharge path "D" is introduced. The precharge transistor acts as an enable switch and the ntuner_precharge transistor provides the extra discharge current. It can work in two models. If the ntuner_precharge transistor is biased as a current source via an external ntuner, it works in the linear region. If the ntuner_precharge transistor is directly biased by Vdd, we have the maximum precharge effect, which means that it works in the exponential region.

4.3 Sample and hold circuit and readout buffers

The design of the sample and hold (S/H) circuits and readout buffers is dictated by the readout method we choose. As mentioned before, we propose an interleaved readout method, which means two rows are in operation at the same time. For each column, no matter whether the pixel is working in the 3T or 4T model, the reset signal and the video signal must be sampled and held and then read out during the row time. Considering the interleaved readout method, one

Figure 4.3: The two columns' S/H circuit and readout buffers.

The typical readout process in the 4T model is as follows. Firstly, row_n is selected by the row driver and the signal reset_n is sampled and held on the capacitors C1 and C5. When the sample phase of reset_n ends, the TG is turned on and the photon-generated charges are transferred from the PPD to the FD. In the meantime, the column bus is precharged to Vss to speed up the settling of the next signal. When the photo charges are all transferred, the signal video_n is sampled and held on the capacitors C3 and C7. When both reset_n and video_n are sampled, the readout buffers start to read out those two signals simultaneously under the control of the multiplexer. When the signals of row_n start to be read out via the readout buffers, row_n+1 starts its sample and hold phase and repeats the previous operation of row_n. The difference is that the reset_n+1 and video_n+1 signals are sampled on the capacitors C2, C4 and C6, C8. In short, in one row time, row\_n is in the readout phase and row\_n+1 is in the S/H stage. We interleave the S/H stage and the readout phase in this way.
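
The ping-pong use of the eight capacitors can be written down as a small lookup. The sketch below is only an illustration of the interleaving idea. It assumes that C1 to C4 belong to the first column of the pair and C5 to C8 to the second, that row n is an even row, and that for row n+1 the reset signal goes to C2 and C6 and the video signal to C4 and C8; the last split is inferred by analogy with row n and is not stated explicitly in the text. The function name is hypothetical.

```python
def sh_capacitor(row, signal, column_in_pair):
    """Return the S/H capacitor label used for one sample (assumed mapping)."""
    if signal not in ("reset", "video"):
        raise ValueError("signal must be 'reset' or 'video'")
    base = 1 if column_in_pair == 0 else 5      # C1..C4 or C5..C8
    signal_offset = 0 if signal == "reset" else 2
    bank = row % 2                              # even rows: bank 0, odd rows: bank 1
    return f"C{base + signal_offset + bank}"


# While row 1 is being sampled into bank 1, bank 0 still holds row 0 for readout.
for row in (0, 1):
    used = [sh_capacitor(row, s, c) for s in ("reset", "video") for c in (0, 1)]
    print(f"row {row}: {used}")
```

Because the two banks never share a capacitor, sampling of one row and readout of the previous row can overlap in time, which is what extends the available S/H time to nearly half a row time.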

### 4.3.1 Sample and hold circuit

Four key factors determine the S/H circuit structure. The first is the longest possible sampling time. The second is the largest acceptable settling error. The third is the input signal range. The last is the kT/C noise introduced by the S/H capacitor. Table 4.1 shows the requirements for this design. The settling time is nearly half of the row time thanks to the odd/even interleaved readout method. The kT/C noise is 1/3 LSB for a 2V peak-to-peak 14-bit ADC.

<table>
<thead>
<tr>
<th>Table 4.1: Settling parameters.</th>
</tr>
</thead>
<tbody>
<tr>
<td>Row time</td>
</tr>
<tr>
<td>21.7µV</td>
</tr>
</tbody>
</table>

Four kinds of S/H circuit structures can be chosen for this design. The table 4.2 is a summary
of those potential choices.

<table>
<thead>
<tr>
<th>Table 4.2: Summary of S/H circuit structures.</th>
</tr>
</thead>
<tbody>
<tr>
<td>Structure</td>
</tr>
<tr>
<td>nMOSFET switch</td>
</tr>
<tr>
<td>T-switch</td>
</tr>
<tr>
<td>Complementary switch</td>
</tr>
<tr>
<td>Bootstrap switch</td>
</tr>
</tbody>
</table>

If performance were the only consideration, the bootstrap switch would be the best choice. However, in this design each column needs four S/H circuits, so there are 5120 S/H circuits in parallel. If the bootstrap switch were employed, it would cost an unacceptably large area. Therefore, the complementary switch is chosen in this design.

For the sample capacitor, the metal insulator metal (MIM) capacitor and the MOS capacitor are two potential choices. The MIM capacitor has two advantages compared to the MOS capacitor. Firstly, the MIM capacitor can handle both positive and negative voltages, whereas the MOS capacitor only allows a positive voltage. Secondly, the MIM capacitor suffers less from mismatch than the MOS capacitor. The disadvantage of the MIM capacitor is that its density is one third to one fifth of that of the MOS capacitor, which means three to five times more area is required for the same capacitance. Because the column output is always a positive voltage and because of the area constraint, a 2 pF MOS capacitor is chosen in this design. Figure 4.4 shows the schematic of the S/H circuit. The on-resistance of the complementary switch is about 15 kΩ.

![Diagram of S/H circuit](image)

**Figure 4.4:** The S/H circuit.
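
As a rough plausibility check of these values, the sketch below treats the switch and the capacitor as a single-pole RC network and estimates the time needed to settle within one half LSB, as well as the kT/C noise of the 2 pF capacitor. The 1.1 V step (the largest source follower output swing quoted earlier) and the temperature of 300 K are assumptions made here; the finite drive strength of the in-pixel source follower and the column bus are ignored, so the real settling will be slower.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K
R_ON = 15e3         # ohm, on-resistance of the complementary switch
C_SH = 2e-12        # F, S/H capacitor
V_STEP = 1.1        # V, assumed worst-case input step
TEMPERATURE = 300   # K, assumed

half_lsb = 2.0 / 2**14 / 2                     # settling error budget in volts
tau = R_ON * C_SH                              # RC time constant
n_tau = math.log(V_STEP / half_lsb)            # time constants needed for the budget
t_settle = n_tau * tau                         # settling time of the RC alone
v_ktc = math.sqrt(K_B * TEMPERATURE / C_SH)    # rms kT/C noise on the capacitor

print(f"tau      = {tau * 1e9:.1f} ns")        # 30.0 ns
print(f"n_tau    = {n_tau:.2f}")               # about 9.8
print(f"t_settle = {t_settle * 1e9:.0f} ns")   # about 294 ns
print(f"kT/C     = {v_ktc * 1e6:.1f} uV rms")  # about 45.5 uV
```

Under these assumptions the switch needs roughly 0.3 µs to settle, far less than half of the row time, and the kT/C noise of about 45 µV is a bit more than one third of the 122 µV LSB.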

### 4.3.2 Readout buffer

An analog buffer in the column readout circuit is used to send out the sampled column output signal. There are three design considerations for this readout buffer. The first one is the readout noise introduced by the readout buffer. The second is the gain loss due to the limited closed-loop gain which can decrease the signal range and SNR. The third one is the input range of the readout buffer which should be adjusted to the column signal range.

#### 4.3.2.1 Readout buffer design

Figure 4.5 shows the schematic of the readout buffer used in this design.

![Diagram of readout buffer](image)

**Figure 4.5:** The readout buffer.

In this design, the readout buffer is implemented as a modified 5T buffer. Equation 4.1 gives the open loop gain of this buffer, where $u_P$ and $u_N$ are the hole mobility and the electron mobility.

$$A \approx \frac{g_{m5}}{g_{m7}} \approx \sqrt{\frac{u_P \, W_5 / L_5}{u_N \, W_7 / L_7}} \quad (4.1)$$

The transistor M1 is designed as a current source which sets the DC biasing and the drive current of the readout buffer. The transistor M2 provides the enable function so that the readout buffer can be turned off when it is not used. The dashed box A is the input pair of the readout buffer. In this design, a pMOS input pair is employed for two reasons. The first is that a pMOS input pair suits the column output signal range, which is around 0.4V to 1.5V. The second is that the 1/f noise of a pMOS transistor is one order of magnitude smaller than that of an nMOS transistor of the same size, so it lowers the readout noise effectively. Another noise source is the white noise introduced by the transistors M6 and M7. The white noise can be reduced by increasing the $g_m$ of the transistors M6 and M7. The most efficient way to reach a higher $g_m$ is to increase the biasing current or the $W/L$ ratio. Transistor M8 multiplexes the signal onto the analog buses.

Table 4.3 shows the noise contribution of each transistor, based on the AC noise simulation. In this design, the input referred readout noise of the readout buffer is 107 $\mu$V$_{RMS}$.

<table>
<thead>
<tr>
<th>Transistor</th>
<th>Noise contribution</th>
</tr>
</thead>
<tbody>
<tr>
<td>M6</td>
<td>27.763%</td>
</tr>
<tr>
<td>M7</td>
<td>26.812%</td>
</tr>
<tr>
<td>M5</td>
<td>17.572%</td>
</tr>
<tr>
<td>M4</td>
<td>16.992%</td>
</tr>
<tr>
<td>M8</td>
<td>9.290%</td>
</tr>
</tbody>
</table>

#### 4.3.2.2 Readout buffer parasitics consideration

For the readout buffers of image sensors, the input signals are first sampled by the S/H circuit, so the readout buffer input is always a DC voltage. Because of this property, the slew rate (SR) rather than the frequency response is important for the readout buffer. The slew rate is defined as:

$$SR = \frac{I_{max}}{C_{load}} \quad (4.2)$$

$I_{max}$ is the maximum output current and $C_{load}$ is the load capacitance of the readout buffer. According to this formula, the load capacitance is a key factor for the slew rate. Figure 4.7 illustrates the load of a readout buffer in this design.

The load of the readout buffer can be classified into three parts. The first part, load A, is the parasitic capacitance of the 320 parallel source nodes of the select transistors (M8). In this design, 8 analog buses are employed for 1280 columns and each bus has two signal lines, a reset line and a video signal line. In addition, each column has 4 readout buffers, so 320 readout buffers share one signal line. In the readout phase, only one readout buffer is active on a signal line and it has to drive the 319 other select transistors (M8), which are switched off. The parasitic capacitance of those switch nodes is positively correlated with the size of the switches. To minimize it, the switches are designed at the minimum allowed size.

The second part, load B, is the wire resistance and capacitance of the signal line. This load strongly depends on the layout routing. The third part, load C, is the input capacitance of the peripheral circuits on the signal line, which include the ADC input buffer, the precharge buffer and the ADC bypass switch. Table 4.4 presents the specific values of those loads.

**Table 4.4:** The load of a readout buffer.

<table>
<thead>
<tr>
<th></th>
<th>Capacitance</th>
<th>Resistance</th>
</tr>
</thead>
<tbody>
<tr>
<td>320 switch nodes</td>
<td>580 fF</td>
<td>/</td>
</tr>
<tr>
<td>Wire RC</td>
<td>1.6 pF</td>
<td>242 Ω</td>
</tr>
<tr>
<td>Peripheral circuits</td>
<td>136 fF</td>
<td>/</td>
</tr>
</tbody>
</table>
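As a rough cross-check of Eq. (4.2), the three capacitive loads in the table can be summed and converted into a drive current. The sketch below is only an illustration: the 1.1 V step and the 20 ns slewing window are hypothetical values chosen for the example, not figures from this design, and the wire resistance is ignored.

```python
# Load values taken from the table above (in farads).
C_SWITCH_NODES = 580e-15   # 320 select-switch nodes (load A)
C_WIRE = 1.6e-12           # signal-line wire capacitance (load B)
C_PERIPHERAL = 136e-15     # peripheral circuits (load C)


def total_load_capacitance():
    """Sum of the three capacitive loads seen by one readout buffer."""
    return C_SWITCH_NODES + C_WIRE + C_PERIPHERAL


def required_current(c_load, v_step, t_slew):
    """Rearranged SR = I_max / C_load: current needed to slew v_step in t_slew."""
    return c_load * v_step / t_slew


c_load = total_load_capacitance()
# Hypothetical example numbers (assumptions, not taken from the thesis).
i_max = required_current(c_load, v_step=1.1, t_slew=20e-9)

print(f"C_load = {c_load * 1e12:.3f} pF")
print(f"I_max  = {i_max * 1e6:.1f} uA for a 1.1 V step in 20 ns")
```

With these inputs the wire capacitance dominates the total load, which is consistent with the remark that load B strongly depends on the layout routing.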

Figure 4.8 shows the simulation circuit used to check the settling of the S/H circuit and the signal line.

In this standalone pixel output signal simulation, the wire parasitic capacitance and resistance are included, and a worst-case scenario is assumed to leave enough headroom for the layout. Peripheral circuits are also connected to the signal line. While the load is charging, the current through the resistor $R_2$ introduces a voltage drop, which makes it difficult to check whether the signal has settled. Readout buffer 2 is therefore introduced to provide a reference voltage that does not suffer from the voltage drop caused by the charging current, so that the settling of the signal can be checked.

The settling time simulation results are presented in section 4.6 combined with the multiplexer and other peripheral circuits. For this design, the readout settling error is 40 µV.

4.4 Multiplexer

To arrange the 1280 column output signals onto 8 analog buses, a multiplexer is required. 8 analog buses are employed to realize a high-speed readout while limiting the memory effect of the analog bus and the power consumption. Each bus includes two signal lines and connects to 160 column outputs, as shown in figure 4.9. To reach 64 frames/s for a 1280 × 720 pixel array, a 64 MHz master clock is employed for the image sensor part. The 8 analog buses are arranged in a time-interleaved way to match the ADC block, so each analog bus works at 64/8 = 8 MHz. Each bus also connects to two 4 Ms/s SAR ADCs, which matches the 8 MHz rate of a single analog bus. The multiplexer generates the readout buffer control signals, such as s<0> to s<1271> shown in figure 4.9, which decide which column output is being read out. The precharge circuits of the analog buses are also controlled by the multiplexer.
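The rate budget described above can be checked with a few lines of arithmetic. The sketch below assumes that one column is read per master clock cycle and that there is no per-row overhead; both are simplifications made only for this example.

```python
MASTER_CLOCK_HZ = 64e6
N_BUSES = 8
ADCS_PER_BUS = 2
ADC_RATE_HZ = 4e6
COLUMNS, ROWS, FRAME_RATE = 1280, 720, 64

# Rate of one analog bus and the ADC capacity attached to it.
bus_rate_hz = MASTER_CLOCK_HZ / N_BUSES
adc_capacity_per_bus_hz = ADCS_PER_BUS * ADC_RATE_HZ
total_adc_rate_hz = N_BUSES * adc_capacity_per_bus_hz

# Assumption: one column per master clock cycle, no row overhead.
row_time_s = COLUMNS / MASTER_CLOCK_HZ
frame_readout_s = ROWS * row_time_s
frame_period_s = 1 / FRAME_RATE

print(f"bus rate          : {bus_rate_hz / 1e6:.0f} MHz")
print(f"ADC capacity/bus  : {adc_capacity_per_bus_hz / 1e6:.0f} Ms/s")
print(f"total ADC rate    : {total_adc_rate_hz / 1e6:.0f} Ms/s")
print(f"row time          : {row_time_s * 1e6:.1f} us")
print(f"frame readout time: {frame_readout_s * 1e3:.2f} ms")
print(f"frame period      : {frame_period_s * 1e3:.3f} ms")
```

Under these assumptions the frame readout time is shorter than the frame period, so the 64 MHz master clock leaves some margin at 64 frames/s.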

To implement those functions, the multiplexer should be able to provide a reliable internal clock and the timing sequence. In this design, the multiplexer contains two main blocks: the clock divider and the X_scan block.
4.4.1 Clock divider

A 64 MHz clock is chosen as the master clock for the image sensor part in this design. A clock divider is necessary to generate eight time-interleaved 8 MHz clocks and to provide sufficient local drive capability for the readout buffer control signals. For every row readout, a synchronous start signal is required for the clock divider and the X_scan block; it is sent via a bond pad from an off-chip signal generator. Because of the wire's resistance and capacitance and the crosstalk from neighboring signals, time shift and jitter are introduced, so a local synchronous start signal regenerator is necessary. Finally, region of interest (ROI) readout must be possible in this design. A region of interest is a selected subset of samples within a dataset identified for a particular purpose[12]. In an image sensor design, it refers to reading out only part of the pixel array. To support this, the clock divider should be able to change the order of the eight time-interleaved 8 MHz clocks. In conclusion, the clock divider block should provide three main functions:

1. Divide the 64 MHz master clock into eight time-interleaved 8 MHz clocks.
2. Regenerate the synchronous start signal locally.
3. Support region of interest readout.

4.4.1.1 Local synchronous start signal regenerator

Figure 4.10 shows the schematic of the local synchronous start signal regenerator. Xsync is the synchronous start signal input via a bond pad and cleansync is the local synchronous start signal. The regenerator needs only three devices: a D flip-flop (Dff) with global reset, a power-on-zero D flip-flop and an inverter. This simple combination guarantees the robustness of the block. The inverter ensures that the local synchronous start signal is generated on the falling edge of the master clock.

The figure 4.11 shows the simulation result of the local synchronous start signal regenerator. The purple waveform is the 64 MHz master clock and the red waveform is the input synchronous start signal. The green waveform is the generated cleansync signal.

As figure 4.11 shows, the local synchronous start signal is generated on the first falling edge of the master clock after the rising edge of the input synchronous start signal, and it lasts one master clock cycle, as expected.

4.4.1.2 Clock divider

Figure 4.12 shows the required eight time-interleaved 8 MHz clocks. According to figure 4.12, clock<4>, clock<5>, clock<6> and clock<7> are the inverted versions of clock<0>, clock<1>, clock<2> and clock<3>. They can therefore be generated easily by D flip-flops with a proper data input and clock.

Based on this principle, figure 4.13 presents the clock divider block. The Xsync signal is the off-chip generated input synchronous start signal. The digital inputs (addr<0:2>) are the setting bits for the ROI function. The clock_in signal is the 64 MHz master clock. The clkdiv8<0:7> and nclkdiv8<0:7> signals are eight overlapping time-interleaved 8 MHz clocks. The clean_sync<0:7> signals are eight local synchronous start signals for the D flip-flop chain in the X_scan block. The dashed box A in figure 4.13 is designed to generate eight single-ended time-interleaved 8 MHz clocks. To make it possible to change the order of those clocks, the Dff input must be selectable, so the component Mux2 is introduced. Table 4.5 shows the function of the Mux2. When the cleansync signal is high, the data input of the Dff is the internally generated setting bit, Loadcode<n>, as shown in figure 4.13.

<table>
<thead>
<tr>
<th></th>
<th>cleansync=1</th>
<th>cleansync=0</th>
</tr>
</thead>
<tbody>
<tr>
<td>Output</td>
<td>Port B</td>
<td>Port A</td>
</tr>
</tbody>
</table>

Table 4.5: The function of the Mux2

To simplify the setting, a decoder circuit in the dashed box B of figure 4.13 converts the 3-bit digital inputs (addr<0:2>) into the 4-bit internal setting bits (Loadcode<0:3>). The truth table of the decoder circuit is shown in Table 4.6.

<table>
<thead>
<tr>
<th>addr&lt;0:2&gt;</th>
<th>Loadcode&lt;0:3&gt;</th>
<th>First generated clock</th>
</tr>
</thead>
<tbody>
<tr>
<td>000</td>
<td>0000</td>
<td>clockoutdiv8&lt;0&gt;</td>
</tr>
<tr>
<td>100</td>
<td>1000</td>
<td>clockoutdiv8&lt;1&gt;</td>
</tr>
<tr>
<td>010</td>
<td>1100</td>
<td>clockoutdiv8&lt;2&gt;</td>
</tr>
<tr>
<td>110</td>
<td>1110</td>
<td>clockoutdiv8&lt;3&gt;</td>
</tr>
<tr>
<td>001</td>
<td>1111</td>
<td>clockoutdiv8&lt;4&gt;</td>
</tr>
<tr>
<td>101</td>
<td>0111</td>
<td>clockoutdiv8&lt;5&gt;</td>
</tr>
<tr>
<td>011</td>
<td>0011</td>
<td>clockoutdiv8&lt;6&gt;</td>
</tr>
<tr>
<td>111</td>
<td>0001</td>
<td>clockoutdiv8&lt;7&gt;</td>
</tr>
</tbody>
</table>

Table 4.6: The truth table of the decoder circuit.

If the clock_in signal is on, the four D flip-flops are always free running, as they are connected in a closed loop. A convention is therefore needed to determine the order of the eight time-interleaved 8 MHz clocks. As the D flip-flop chain in the X_scan block is rising-edge triggered, the first clock is identified by the first rising edge among those clocks after the local synchronous start signals. For example, if the first rising edge after the local synchronous start signals belongs to the clockoutdiv8<3> signal, clockoutdiv8<3> is treated as the first generated clock. Table 4.6 also summarizes the relationship among the first generated clock, the 3-bit digital inputs (addr<0:2>) and the 4-bit internal setting bits (Loadcode<0:3>).
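The eight Loadcode values in the truth table follow a recognizable pattern. The sketch below treats the four free-running D flip-flops as a twisted-ring (Johnson) counter; that topology is an assumption on our side, since the text only states that the flip-flops are connected in a closed loop. Under this assumption, the counter states reproduce the Loadcode column in the same order.

```python
def johnson_states(n_stages=4):
    """Return the 2 * n_stages states of a twisted-ring (Johnson) counter.

    Each state is a string of flip-flop outputs Q0..Q(n-1). On every clock
    the bits shift by one stage and the inverted last output feeds stage 0.
    """
    state = [0] * n_stages
    states = []
    for _ in range(2 * n_stages):
        states.append("".join(str(bit) for bit in state))
        state = [1 - state[-1]] + state[:-1]
    return states


# Loadcode column of the decoder truth table, in table order.
LOADCODE_COLUMN = ["0000", "1000", "1100", "1110",
                   "1111", "0111", "0011", "0001"]

print(johnson_states())
print("matches Loadcode column:", johnson_states() == LOADCODE_COLUMN)
```

Loading one of these states at the synchronization instant therefore amounts to choosing the phase at which the ring starts, which is how the ROI setting can change which clock rises first.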

The eight OV blocks (OV<0:7>) in the dashed box C convert the eight single-ended time-interleaved 8 MHz clocks (clockoutdiv8<0:7>) into eight overlapping differential clocks, clkdiv8<0:7> and nclkdiv8<0:7>. In the dashed box C, eight cleansync components (cleansync<0:7>) are implemented to generate the eight local synchronous start signals (clean_sync<0:7>) for the Dff chain in the X_scan block.

The generator of the eight single-ended time-interleaved clocks in the dashed box A requires four clock cycles to become stable after it is synchronized by the cleansync signal. To ensure that the clean_sync<0:7> signals are generated correctly, a delay of five clock cycles is introduced by the Dff chain in the dashed box D. In the clock divider block, some inverters are also employed to provide enough drive capability.
4.4.1.3 Functional simulations

Figure 4.14 presents a simulation result of the clock divider block with the setting bits (addr<0:2>) of the ROI function set to 000.

![Simulation Result](image.png)

The clk_in signal is the 64 MHz master clock and the sync signal is the off-chip input synchronous start signal. The cleansync and cleansync2 signals are two internally generated synchronization signals. As expected, cleansync2 is generated five clock cycles later than cleansync. The clkoutdiv<0:7> signals are the eight time-interleaved 8 MHz clocks. The first rising edge among them belongs to clkoutdiv<0>, which means clkoutdiv<0> is the first generated clock when the ROI setting bits are 000. The clean_sync<0:7> signals are the eight local synchronous start signals for the D flip-flop chain in the X_scan block; they determine the starting time of the row readout. Figure 4.14 shows that clean_sync<0> is the first generated signal, as expected.

To verify the ROI function, figure 4.15 presents another simulation result of the clock divider block, with the setting bits (addr<0:2>) of the ROI function set to 001.

Compared to the previous simulation, figure 4.15 shows that the first rising edge of the eight time-interleaved 8 MHz clocks belongs to clkoutdiv<4>, which means clkoutdiv<4> is the first generated clock when the ROI setting bits are 001. Likewise, clean_sync<4> is the first generated synchronous start signal for the D flip-flop chain. This simulation verifies the ROI function.

4.4.2 X_scan block

The X_scan block is a digital block that provides the row readout timing sequence used to arrange the 1280 column outputs onto the 8 analog buses. Figure 4.16 presents the arrangement of row readout circuits and required timing.

Figure 4.15: The arrangement of row readout circuits and required timing.

As the interleaved readout method mentioned in section 4.1 is employed in this design, the row readout timing sequence is divided into an even and an odd group so that the reading of two rows can be controlled independently. In figure 4.16, the red components are controlled by the sel_n_buffer<0:1279> signals, which form the even group, and the purple components are controlled by the sel_n+1_buffer<0:1279> signals, which form the odd group. A typical required timing diagram and the corresponding working conditions of the 8 buses are presented in the lower half of the figure. The sel_n_buffer<n:n+7> signals are 8 time-interleaved pulses, and the sel_n_buffer<n+8> signal is shifted by one clock cycle compared to the sel_n_buffer<n> signal.

The sel_n_buffer<0:1279> and sel_n+1_buffer<0:1279> signals are generated by a modified Dff-based scanner. In a conventional Dff-based scanner, the X_scanner is a simple series of Dffs, as shown in figure 4.17. It can only generate one set of control signals, and it cannot produce the required time-interleaved pulses. Besides those two disadvantages, it also suffers from clock attenuation because of the clock wire's resistance and capacitance. This can cause two time-shifted pulses, such as sel_n_buffer<n> and sel_n_buffer<n+8>, to overlap, which means that two readout buffers are connected to one analog bus at the same time and large crosstalk is introduced.

In this design, we modify the scan unit and the structure of the Dff chain to handle the issues mentioned above. Figure 4.18 presents the modified scan unit.

In the modified scan unit, two control signals, odd and even, are introduced to control the signal path via two NAND gates. Two select signals, sel_n_buffer<n> and sel_n+1_buffer<n>, can thus be generated in one scan unit. To prevent two time-shifted pulses such as sel_n_buffer<n> and sel_n_buffer<n+8> in figure 4.18 from overlapping, the rising edge of a scan unit is delayed more than its falling edge. As figure 4.18 shows, the falling edge is triggered directly by the Q signal via an nSR latch, whereas the rising edge is triggered by the Q signal via an nSR latch after a NAND gate, which adds one NAND gate propagation delay. Figure 4.18 also shows a typical timing diagram of the required time-shifted pulses. Because the falling edge of sel_n_buffer<n> and the rising edge of sel_n_buffer<n+8> are triggered by the same Q signal, the falling edge is generated earlier than the rising edge.

To generate 8 time-interleaved pulses, the 1280 scan units are divided into eight groups. Each group includes 160 scan units in series, and the groups are driven by the eight overlapping differential time-interleaved 8 MHz clocks generated by the clock divider block. Figure 4.19 presents the modified Dff-based scanner.

We take chain<0> in even mode as an example to illustrate the operation of the modified Dff-based scanner. First, the even signal is set to 1 and the two complementary 8 MHz clocks, clkdiv<0> and nclkdiv<0>, are applied. Then, the 160 scan units are reset by the clean_sync<0> signal: the first scan unit is reset to 1 and the other scan units are reset to 0. At the same time, the sel_n_buffer<0> signal is generated. The digital 1 then propagates through the modified Dff-based chain, driven by the two complementary 8 MHz clocks. As clean_sync<0:7> and clkdiv<0:7> are time-interleaved, the digital 1 inside the other scan chains is also generated and propagates in a time-interleaved way. Figure 4.20 shows the simulation result of the modified Dff-based scanner.

The dff_sync<0:7> signals are the 8 time-interleaved synchronization signals for scan chain<0:7>. The select_even<0:7> signals are the first eight scan signals, corresponding to the sel_n_buffer<0:7> signals in figure 4.19. The select_even<0> and select_even<8> signals are two time-shifted pulses corresponding to the sel_n_buffer<0> and sel_n_buffer<8> signals of chain<0> in figure 4.19. The simulation shows a delay between the falling edge and the rising edge, so the overlap risk is avoided by the modified scan unit.
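The interleaving of the eight scan chains can be illustrated with an idealized timing model. The sketch below assumes that column n belongs to chain n mod 8, that chain k starts k master clock cycles after chain 0, and that each select pulse lasts exactly one 8 MHz period (8 master clock cycles). Gate delays and the non-overlap margin of the real scan unit are not modeled, and the function names are ours.

```python
N_COLUMNS = 1280
N_CHAINS = 8
UNITS_PER_CHAIN = N_COLUMNS // N_CHAINS  # 160 scan units in series


def chain_and_position(column):
    """Map a column index to (chain index, position inside that chain)."""
    return column % N_CHAINS, column // N_CHAINS


def select_window(column):
    """Master-clock ticks [start, stop) during which a column is selected."""
    chain, position = chain_and_position(column)
    start = chain + position * N_CHAINS
    return start, start + N_CHAINS


def active_columns(tick):
    """Columns whose select pulse is high at a given master-clock tick."""
    return [c for c in range(N_COLUMNS)
            if select_window(c)[0] <= tick < select_window(c)[1]]


cols = active_columns(100)
print("active columns at tick 100:", cols)
print("chains in use:", sorted(chain_and_position(c)[0] for c in cols))
```

In this model, eight columns are selected at any time and each of them sits on a different chain, so every analog bus carries at most one readout buffer. The windows of column n and column n+8 are adjacent, which is exactly the boundary where the modified scan unit has to guarantee that the pulses do not overlap.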

4.5 Row driver

To realize the Y-axis scanning of pixels and provide all the operation signals, a row driver is required. Figure 4.21 shows an overview of the row driver block.

Figure 4.18: The modified Dff-based scanner.

Figure 4.19: The simulation result of the modified Dff-based scanner.

Figure 4.20: An overview of the row driver block.

It includes 720 row driver units in parallel to control the pixel array. Each row driver unit contains a NAND10 gate and four logic circuits to achieve the right pixel operational function. The address code and control bits are sent from the variable addressable serial parallel interface (VASPI) in the digital control block, which will be discussed in chapter 6. To reduce the impact of peak current crosstalk, an isolated clean analog power supply is required, which comes from the IO block. In this design, the decoding function is implemented by the address code wire routing instead of a hardware decoder. Figure 4.22 shows the address code wire routing. The yaddr<0:7> and nyaddr<0:8> signals are provided by the VASPI. The main reason for using a NAND10 as the decoder is that its pull-up is fast; otherwise, unwanted glitches would be created.
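Decoding by wire routing can be sketched as follows. The example assumes a 10-bit row address, inferred from the 10-input NAND gate and the 720 rows, and an active-low decoder output; the function names are ours and the real routing may differ in detail.

```python
ADDRESS_BITS = 10   # assumption: one NAND10 input per address bit
N_ROWS = 720


def row_wiring(row):
    """For each address bit, pick the true line (1) or the complement line (0).

    A row taps the true line where its own binary row number has a 1 and the
    complement line where it has a 0, so all taps are high only when the
    applied address equals the row number.
    """
    return [(row >> bit) & 1 for bit in range(ADDRESS_BITS)]


def nand10_output(row, address):
    """Active-low decoder output of one row driver unit."""
    taps = []
    for bit, use_true_line in enumerate(row_wiring(row)):
        addr_bit = (address >> bit) & 1
        taps.append(addr_bit if use_true_line else 1 - addr_bit)
    return 0 if all(taps) else 1


def selected_rows(address):
    """Rows whose NAND10 output is low for the applied address."""
    return [r for r in range(N_ROWS) if nand10_output(r, address) == 0]


print(selected_rows(5))
print(selected_rows(719))
print(selected_rows(1000))
```

Each row is thus addressed purely by which of the true and complement address lines it taps, and an address beyond the last row selects nothing.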

![Figure 4.21: The address code wire routing.](image)

The four logic circuits inside the row driver unit are implemented with two kinds of block: the propagation cell and the selection cell. The propagation cell generates the TG, ELECTRON and RESET signals, which may require three different voltage levels. The selection cell generates the SEL signal, which has two different voltage levels. Figure 4.23 shows the schematic of the row driver unit.

The propagation cell requires four digital input control signals and three independent analog power supplies. The four digital input signals control the output value in the addressed and non-addressed states. To reduce the impact of peak current crosstalk, three independent analog power supply groups power the three propagation cells of the TG, ELECTRON and RESET signals. The Tg_nop signal is used as an enable. The Tg_row signal selects rolling shutter or global shutter mode. The Tg<0:1> signals are setting bits that choose the voltage level of the output signal. The selection cell is a streamlined propagation cell that requires only three digital input control signals and two independent analog power supplies. Table 4.7 gives a summary of the Tg propagation cell as an example.

Figure 4.24 presents a function simulation result of the TG propagation cell. The addrout is the output of the decoder. The load on the output of each control signal is considered in this simulation. The load includes the parasitic resistor and capacitor (RC) of the metal wire running across the pixel array and the connected pixel transistors.

From the simulation result, we can see the TG propagation cell is working as expected.
Figure 4.22: The row driver unit.

Table 4.7: A summary of the Tg propagation cell.

<table>
<thead>
<tr>
<th>Tg&lt;0:1&gt;</th>
<th>Output</th>
<th>Voltage level</th>
</tr>
</thead>
<tbody>
<tr>
<td>11</td>
<td>Vhigh</td>
<td>vddtrans</td>
</tr>
<tr>
<td>10</td>
<td>Vlow</td>
<td>vsstrans</td>
</tr>
<tr>
<td>01</td>
<td>Vmedium</td>
<td>vmetrans</td>
</tr>
<tr>
<td>00</td>
<td>Vlow</td>
<td>vsstrans</td>
</tr>
</tbody>
</table>
Figure 4.23: The simulation result of TG row driver.
4.6 Test and peripheral circuit for analog readout chain

In order to test whether the analog readout chain is working as expected, some test circuits are required. To cancel the memory effect of the analog bus between different signals, a precharge block is implemented.

4.6.1 MBS and test circuit

Mixed boundary scan (MBS) is a test and diagnostic block to either write or read data on certain nodes in the design and facilitate testing.

In 'sense' mode, MBS is used to access important nodes of the readout circuits in the sensor, especially when the readout value is not the presumed result. In 'write' mode, MBS is used to impose a value on some nodes in the sensor to check that a specified part of the sensor functions correctly.

The schematics of MBS sense and MBS write are shown in figure 4.25 and figure 4.26. Figure 4.27 presents an example that the MBS can write in and sense back the same node.

The input and output signals of the MBS all propagate along a long metal wire called MBS_BUS. Because of the large parasitic RC of the MBS_BUS and the load of the off-chip oscilloscope probe, the MBS can only sense low-frequency signals, which is sufficient for functional verification.

In this design, the capacitors C1, C2, C3 and C4 in figure 4.16 are sensed by four MBS cells to check the sampled values. The cleansync and clkdiv<0> signals in figure 4.13 are also sensed to check the locally generated synchronization signal and the divided clock.

To check whether the modified Dff-based scanner is working as expected, a scanner detector block is implemented. The schematic of this block is shown in figure 4.27.

![Figure 4.27: The scanner detector.](image)

This block contains two identical detection circuits, which detect the beginning and the end of the scan. Each detection circuit is formed by 7 XNOR2 gates. The output value is sensed by an MBS cell.

### 4.6.2 Precharge circuit

As shown in figure 4.16, 8 analog buses are employed to read out the column output and they are continuously working. The analog bus essentially is a long metal wire with parasitic RC. The parasitic RC exhibits a memory effect between two readout signals due to capacitor dielectric absorption/relaxation.

To cancel the memory effect, a precharge phase is introduced in every readout. The precharge unit can force the analog bus to a certain voltage regardless of previous signal value. Figure 4.29 shows the precharge unit of analog bus<0>.

![Figure 4.28: The precharge unit.](image)

The precharge unit contains a 5T buffer and a NOR2 gate. The NOR2 gate generates the control signal of the precharge unit. The precharge phase takes the first one-eighth of the readout period. The precharge level signal is connected to a DC voltage. As the pull-up resistance of a 5T buffer is larger than its pull-down resistance, the DC voltage should be set higher than the middle level to ease the settling of the reset signal. In this design, according to the simulation result, the DC voltage is optimized to 1.1 V.

Using the simulation circuit of figure 4.8, figure 4.30 shows a simulation result of the row readout in high gain mode, combined with the multiplexer, digital control block, peripheral circuits and the assumed load.

![Simulation result of the row readout](image)

**Figure 4.29:** A simulation result of the row readout.

The column_out signal is the input signal of the S/H circuit. Figure 4.30 shows that the column wire is precharged to 0 to speed up the settling of the next signal, and the memory effect is canceled. The N_2 node is the D node inside the readout buffer, as shown in figure 4.7. This node is charged by the standby current to follow the sampled signal in the S/H circuit. According to the simulation result, the D node settles within the required time. The videoprecharge signal is the control signal of the precharge unit, which is generated from clkdiv<1> and nclkdiv<0> of the clock divider. The simulation result shows that the precharge function works as expected and that the timing is correct. It also shows that the largest settling error in this design is 40 $\mu$V, which equals about 1/3 LSB for a 14-bit ADC.

4.7 Conclusions

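In this chapter, the whole analog signal chain between the pixel and the ADC was discussed. The validity of its functions is supported by the results of functional simulations. Several parts were optimized, including the select switch size of the readout buffers, the scan unit and the precharge level.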

In the next chapter, the on-chip ADC block will be presented.
In the previous chapter, we presented the analog readout chain of this image sensor. This chapter focuses on the ADC block.

To convert the analog output into a digital signal, an on-chip Analog to Digital Converter (ADC) block is required. For image sensors, the ADC type can be classified into three categories: chip level, column level and pixel level as shown in figure 5.1(A), figure 5.1(B), figure 5.1(C) [7].

![Figure 5.1](image.png)

**Figure 5.1:** The ADC category for image sensors [7].

A chip level ADC means that all pixel output signals are read out row by row and digitized by a single ADC. Its advantages are its simplicity and that, with only one ADC, there is no offset mismatch between ADCs. The main drawback is the difficult compromise between readout speed and power consumption: to reach the same readout speed as a column or pixel level ADC, a chip level ADC requires a relatively large power consumption.

The pixel level ADC means that an ADC is implemented in each pixel. Its biggest advantage is that it has the highest readout speed and relatively lower FPN compared to the column level ADC. The drawback is that the in-pixel ADC requires a large area, which leads to a large pixel size or a low fill factor.

Currently, the column level ADC is the most widely used ADC type for image sensors. The analog data of one pixel row is digitized by column ADCs that all work in parallel. The advantage of the column ADC is that it provides a good compromise between fill factor, power consumption and readout speed. The drawbacks are that it suffers from offset, which leads to column FPN, and that its layout size is limited by the pixel size.

After careful consideration, this design employs an interleaved ADC, which is a variant of the column level ADC. The difference between the interleaved ADC and the column level ADC is that not one column but several columns share one ADC. As the column level ADC is limited by the pixel size, it is not convenient to reuse it in other projects. As mentioned in chapter 1, applications of UV sensitive image sensors are relatively specialized and custom-design oriented compared to visible light image sensors, so the on-chip ADC should be portable and easy to adapt to other projects.

In this design, as the input of the ADC is a sampled DC voltage, the requirements on sampling bandwidth and distortion are relaxed. The choice of ADC architecture mainly depends on the conversion speed, resolution and power consumption. The image sensor ADC is a kind of data acquisition ADC, for which the successive approximation register (SAR) ADC is a suitable choice, as shown in figure 5.2[13]. The dashed line indicates the state-of-the-art performance in 2013.

![ADC Architectures, Applications, Resolution, Sampling Rates](image)

**Figure 5.2**: The ADC architectures, applications, resolution, and sampling rates[13].

For the SAR ADC, there are three main advantages:

1. Simple principle: the binary search algorithm implemented in SAR ADCs is a simple algorithm compared to other types.
2. Low power consumption: in SAR ADCs, power-hungry operational amplifiers are not required and comparators consume much less power than operational amplifiers. From an architectural standpoint, the SAR ADC is more power-efficient than pipelined and sigma-delta ADCs.

3. Easy to scale: as the analog components are simple, the SAR ADC does not suffer too much from the negative scaling impact of the analog circuit. SAR ADC is one of the better solutions for scaled-down analog technology.

For those reasons, the SAR architecture is preferred in this design.
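The binary search principle named in the first advantage can be sketched in a few lines. The example below models an ideal 14-bit SAR conversion with a 2 V range; the ideal comparator and DAC, the unipolar input range and the function name are assumptions made for illustration and do not describe the actual circuit of this design.

```python
def sar_convert(v_in, v_ref=2.0, n_bits=14):
    """Ideal SAR conversion by binary search, MSB first.

    v_in is assumed to lie in [0, v_ref). The comparator and the DAC are
    ideal, which is a simplification of a real converter.
    """
    code = 0
    for bit in reversed(range(n_bits)):
        trial = code | (1 << bit)              # tentatively set this bit
        v_dac = trial * v_ref / (1 << n_bits)  # ideal DAC output for the trial code
        if v_in >= v_dac:                      # comparator decision
            code = trial                       # keep the bit
    return code


for v in (0.0, 0.5, 1.0, 1.9999):
    print(f"{v:.4f} V -> code {sar_convert(v)}")
```

Each loop iteration corresponds to one comparator decision, so a 14-bit result needs 14 decisions regardless of the input value.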

### 5.1 Comparator design considerations

The comparator is an essential part of the SAR ADC. It is used to compare two analog signals and output a digital signal indicating which is larger. For a 2V peak-to-peak 14-bit ADC, the comparator has to be able to discriminate voltages as small as $122 \mu V$. Our comparator contains two parts: the pre-amplifier and the latch.

There are three main design considerations:

1. **Minimize the Offset error.**
   
   The offset can be classified into two parts. The first one is the offset of the latch. This offset is caused by device mismatch and it limits the resolution. The way to cancel this negative effect is to use a pre-amplifier before the latch to amplify the input signal. The second one is the offset of the pre-amplifier, which is also caused by device mismatch. In this design, we introduce an offset calibration phase to reduce the offset of the pre-amplifier effectively.

2. **Resolution.**
   
   The resolution refers to the minimum input signal that can be recognized by the comparator. The resolution is decided by the gain of comparator and the offset. For this design, the resolution should be at least one LSB which means $122 \mu V$.

3. **Speed.**
   
   The speed refers to the maximum speed at which the comparator can work within the design specifications. In this design, the speed target for the comparator is 128 MHz, given the 4 Ms/s sample rate and the 32 clock cycles required for one conversion. The main speed limitation is the bandwidth of the pre-amplifier.
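The two numerical targets in the list above follow directly from the ADC specification. The short calculation below only restates that arithmetic; the variable names are ours.

```python
V_FULL_SCALE = 2.0          # peak-to-peak input range in volts
N_BITS = 14
SAMPLE_RATE_HZ = 4e6        # sample rate of one SAR ADC
CYCLES_PER_CONVERSION = 32  # clock cycles needed for one conversion

lsb_v = V_FULL_SCALE / 2 ** N_BITS
half_lsb_v = lsb_v / 2
comparator_clock_hz = SAMPLE_RATE_HZ * CYCLES_PER_CONVERSION

print(f"1 LSB            = {lsb_v * 1e6:.2f} uV")
print(f"1/2 LSB          = {half_lsb_v * 1e6:.2f} uV")
print(f"comparator clock = {comparator_clock_hz / 1e6:.0f} MHz")
```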

#### 5.1.1 Latch

The latch is used to establish full logic levels and give this decision to the SAR logic control block. Figure 5.3 presents the schematic of the latch which is used in this design.

The transistors M1 and M2 provide a constant biasing current to reduce the peak current, and they can switch it off when the latch is not used to reduce the power consumption. M3 and M4 form the pMOS input pair, which suits the pre-amplifier's output signal. The latch includes a positive feedback loop formed by the transistors M8 and M9. The transistors M7 and M10 are used to break the positive feedback loop and reset the latch. The transistors M5 and M6 are used to reduce the kickback due to the reset phase. Two low threshold (LT) inverters drive the next stage's circuit.

The most important issue for the latch is the offset due to device mismatch. The offset can be estimated as:

\[ \sigma = \frac{A_r}{\sqrt{WL}} \] (5.1)

\(A_r\) is a technology-related parameter. W and L are the width and length of the transistor. Device mismatch influences the latch in two ways. The mismatch of M3 and M4 leads to different threshold voltages and \(g_m\) for M3 and M4, which degrades the resolution of the latch. The mismatch of M8 and M9 causes a different positive feedback strength, so that the settling times for logic 1 and logic 0 differ. Of these two effects, the mismatch of M3 and M4 is addressed first, because the positive feedback is so strong that the difference between the two settling times is negligible.

The input-referred offset of the latch required for the comparator in this design should be at most \(1/2\) LSB, which means \(61 \mu V\). There are two ways to reach this target. The first one is to decrease the offset of the latch. The other one is to provide sufficient gain from the pre-amplifier. For a pre-amplifier whose gain is A at the expected working speed, the input-referred offset is:

\[ V_{osi} = \frac{V_{osl}}{A} \] (5.2)

The \(V_{osi}\) is the input referred offset and the \(V_{osl}\) is the offset of the latch.

The most efficient way to reduce the offset is to increase the transistor size. As M3 and M4 are the output load of the pre-amplifier, the bandwidth of the pre-amplifier decreases if their size increases. In other words, the gain of the pre-amplifier is lower at a given speed. Large M3 and M4 transistors would therefore put considerable design pressure on the pre-amplifier. The size of M3 and M4 should be decided together with the design of the pre-amplifier to find the optimal point.

Figure 5.4 presents the offset (3σ) of the latch based on a Monte Carlo simulation with 100 iterations.

![Offset of the latch](image)

**Figure 5.4:** The offset of the latch.

According to Figure 5.4, the worst-case offset of the latch is 10.63 mV. Reaching a 61 µV input referred offset therefore requires a pre-amplifier gain of at least about 174 at the 128 MHz sample rate, and the design targets a gain of 200. As the Monte Carlo result is a 3σ value, which covers 99.73% of a Gaussian distribution, 200 is a safe value.
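The numbers above can be checked with a few lines of arithmetic. This is only a consistency sketch: it assumes the 2 V peak-to-peak, 14-bit full scale quoted in the abstract and takes the 10.63 mV latch offset from Figure 5.4.

```python
V_FULL_SCALE = 2.0           # V peak-to-peak, from the ADC specification
N_BITS = 14
V_LATCH_OFFSET = 10.63e-3    # V, worst-case latch offset from Figure 5.4
TARGET_GAIN = 200            # pre-amplifier gain chosen in the text

lsb = V_FULL_SCALE / 2**N_BITS        # one LSB in volts
target = lsb / 2                      # half-LSB offset target
min_gain = V_LATCH_OFFSET / target    # Eq. (5.2) solved for A
margin = TARGET_GAIN / min_gain       # headroom of the chosen gain

print(f"LSB = {lsb * 1e6:.1f} uV, offset target = {target * 1e6:.1f} uV")
print(f"minimum gain = {min_gain:.0f}, margin at A = 200: {margin:.2f}x")
```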

Due to the positive feedback, we can use an interconnected amplifier (M8, M9) model to analyze the latch. Figure 5.5 shows the model.

We assume that the two single-pole opamps are identical, i.e. they have the same transconductance (\(g_m\)), output resistance (\(R_{out}\)) and load capacitance (\(C_L\)). Based on the linear model, we get:
\[ g_m V_x + \frac{V_y}{R_{out}} = -C_L \frac{dV_y}{dt} \] (5.3)

\[ g_m V_y + \frac{V_x}{R_{out}} = -C_L \frac{dV_x}{dt} \] (5.4)

Subtracting Eq.(5.4) from Eq.(5.3) and writing \(\Delta V = V_x - V_y\):
\[
\Delta V = \frac{\tau}{A - 1} \frac{d\Delta V}{dt} \approx \frac{\tau}{A} \frac{d\Delta V}{dt} = \frac{1}{\omega_u} \frac{d\Delta V}{dt}
\] (5.5)

\[ A = g_m R_{out} \] (5.6)

\[ g_m = \sqrt{K \frac{W}{L} I_d} \] (5.7)

\[ \tau = C_L R_{out} = \frac{A}{\omega_u} \] (5.8)

A is the DC gain of the opamp and K is a technology-related parameter. W and L are the width and length of M8, M9, and \( I_d \) is their drain current. \( \tau \) is the RC time constant of the opamp and \( \omega_u \) is its unity gain bandwidth. Solving Eq.(5.5):

\[ \Delta V = \Delta V_0 e^{\omega_u t} = \Delta V_0 e^{\frac{t}{\tau_l}} \] (5.9)

\[ \tau_l = \frac{1}{\omega_u} = \frac{\tau}{A} = \frac{C_L}{g_m} \] (5.10)
$\tau_l$ is the settling time constant of the latch. $\Delta V_0$ is the initial voltage difference between nodes $N_1$ and $N_2$, which is proportional to the differential input between $V_{in+}$ and $V_{in-}$.
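The exponential regeneration can also be checked numerically. The sketch below integrates Eqs.(5.3) and (5.4) with a simple forward-Euler scheme and fits the growth time constant. The values of \(g_m\), \(R_{out}\) and \(C_L\) are hypothetical and chosen only to give a DC gain of 20; they are not taken from the design.

```python
import math

def regenerate(gm, r_out, c_l, dv0, t_end, steps=20_000):
    """Forward-Euler integration of Eqs. (5.3) and (5.4); returns Vx - Vy."""
    dt = t_end / steps
    vx, vy = dv0 / 2, -dv0 / 2
    for _ in range(steps):
        dvy = -(gm * vx + vy / r_out) / c_l   # Eq. (5.3)
        dvx = -(gm * vy + vx / r_out) / c_l   # Eq. (5.4)
        vx, vy = vx + dvx * dt, vy + dvy * dt
    return vx - vy

GM, R_OUT, C_L = 1e-3, 20e3, 50e-15   # hypothetical values (assumptions)
DV0, T_END = 1e-3, 200e-12            # initial imbalance and simulated time

dv = regenerate(GM, R_OUT, C_L, DV0, T_END)
tau_fit = T_END / math.log(dv / DV0)            # fitted growth time constant
tau_exact = R_OUT * C_L / (GM * R_OUT - 1)      # tau / (A - 1)
tau_approx = C_L / GM                           # Eq. (5.10)
print(f"fit {tau_fit * 1e12:.2f} ps, tau/(A-1) {tau_exact * 1e12:.2f} ps, "
      f"C_L/g_m {tau_approx * 1e12:.2f} ps")
```

The fitted value follows \(\tau/(A-1)\), and \(C_L/g_m\) is the large-gain approximation of the same quantity.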

Based on the previous analysis, the speed of the latch is limited by two factors. The first is the load capacitance. The second is the strength of the positive feedback, i.e. $g_m$ in Eq.(5.10). The load capacitance includes the gate capacitance of M8 or M9, the input capacitance of the LT inverter and the source capacitance of M9 and M10. Of these, the gate capacitance of M8 or M9 is the largest. The most direct way to speed up the latch is therefore to reduce the size of M8, M9. However, this also lowers $g_m$, which slows the latch down, and it increases the mismatch of M8, M9. Another way is to increase the biasing current to get a higher $g_m$, at the cost of a larger power consumption. A good compromise must be found for the latch.

An important criterion for the speed of the latch is the settling time required to reach a recognizable logic 1 or logic 0. According to Eq.(5.9), the required settling time is:

$$T_{latch} = \tau_l \ln \frac{\Delta V_{logic}}{\Delta V_0} \quad (5.11)$$

Eq.(5.11) shows that a small input leads to a longer settling time, so the worst case for the latch is an input of one LSB. In this design, the settling time for a one-LSB input is tuned to 1 ns, which is about one eighth of a clock cycle for a 128 MHz clock.
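As a rough illustration of Eq.(5.11), the sketch below evaluates the settling time for two input levels. The logic swing, the latch time constant and the assumption that \(\Delta V_0\) equals the input multiplied by the pre-amplifier gain of 200 are all hypothetical choices made for this example.

```python
import math

def latch_settling_time(tau_l, dv_logic, dv0):
    """Eq. (5.11): time for the latch to regenerate dv0 up to dv_logic."""
    return tau_l * math.log(dv_logic / dv0)

LSB = 2.0 / 2**14      # V, from the 2 V peak-to-peak, 14-bit specification
DV_LOGIC = 1.8         # V, hypothetical logic swing (assumption)
TAU_L = 230e-12        # s, hypothetical latch time constant (assumption)
PREAMP_GAIN = 200      # assumed scaling from comparator input to latch input

t_1lsb = latch_settling_time(TAU_L, DV_LOGIC, PREAMP_GAIN * LSB)
t_8lsb = latch_settling_time(TAU_L, DV_LOGIC, PREAMP_GAIN * 8 * LSB)
print(f"1 LSB: {t_1lsb * 1e9:.2f} ns, 8 LSB: {t_8lsb * 1e9:.2f} ns")
```

With these assumed numbers the one-LSB case lands close to 1 ns, and every factor of 8 in input amplitude saves \(\tau_l \ln 8\) of settling time.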

### 5.1.2 Pre-amplifier and offset calibration

Due to the offset of the latch, a pre-amplifier is required to provide a gain of 200 at 128 MHz to reduce the input referred offset. There are two main design challenges for the pre-amplifier: speed and offset. For speed, the pre-amplifier must provide a gain of 200 at 128 MHz with sufficiently small hysteresis. For offset, the pre-amplifier itself introduces an offset due to device mismatch, so it must be able to perform offset calibration.

The pre-amplifier employs a three-stage architecture, shown in Figure 5.6, which can achieve high gain with low power consumption[15]. For a total gain of 200, the gain is allocated in the proportion 8:5:5, i.e. the first, second and third stages have gains of 8, 5 and 5.
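The effect of the gain allocation on the input referred offset can be sketched as follows. Each offset is divided by the gain in front of it, as in Eq.(5.2). The per-stage offsets are hypothetical, and adding them linearly is a pessimistic simplification made for this example.

```python
from math import prod

def input_referred_offset(stage_gains, stage_offsets, latch_offset):
    """Refer each offset to the input by dividing by the gain ahead of it."""
    total, gain_so_far = 0.0, 1.0
    for gain, offset in zip(stage_gains, stage_offsets):
        total += offset / gain_so_far
        gain_so_far *= gain
    return total + latch_offset / gain_so_far   # Eq. (5.2) for the latch

GAINS = [8, 5, 5]                  # allocation from the text
LATCH_OFFSET = 10.63e-3            # V, from Figure 5.4
OFFSETS = [100e-6, 500e-6, 500e-6] # V, hypothetical stage offsets (assumption)

total_gain = prod(GAINS)
latch_only = input_referred_offset(GAINS, [0.0, 0.0, 0.0], LATCH_OFFSET)
with_stages = input_referred_offset(GAINS, OFFSETS, LATCH_OFFSET)
print(f"gain {total_gain}, latch only {latch_only * 1e6:.2f} uV, "
      f"with stages {with_stages * 1e6:.2f} uV")
```

The first-stage offset enters undivided, which is why that stage needs the lowest offset of the three.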

As the pre-amplifier of a comparator is always working in open loop state, there is no need to consider the linearity and stability. The design should focus on the speed and offset calibration.

#### 5.1.2.1 First stage of the pre-amplifier

The first stage of the pre-amplifier is critical for the whole pre-amplifier. Its offset contributes directly to the input referred offset, as there is no preceding stage whose gain would reduce it. The first stage should therefore have the lowest offset of the three stages. Figure 5.7 presents the schematic of the first stage pre-amplifier.

Figure 5.6: The architecture of the pre-amplifier.

Figure 5.7: The first stage pre-amplifier.

The first stage pre-amplifier uses a current-bleeding modified 5T amplifier. It employs a pMOS input pair (M3, M4) to reduce the 1/f noise. For a traditional 5T amplifier to achieve a gain of 8, the W/L of M3, M4 must be much larger than that of M7, M8. This leads to small M7, M8, which means a large thermal noise. To handle this issue, we introduce M5, M6 to increase the gain. For a traditional 5T amplifier, the gain is:

$$A_1 \approx \frac{g_{m4}}{g_{m8}} \approx \sqrt{\frac{\mu_p (W_4/L_4)}{\mu_n (W_8/L_8)}} \quad (5.12)$$

For a current-bleeding modified 5T amplifier, the gain changes to:

$$A_2 \approx \frac{g_{m4}}{g_{m8}} \approx \sqrt{\frac{I_{M4}\, \mu_p (W_4/L_4)}{(I_{M4} - I_{M6})\, \mu_n (W_8/L_8)}} \quad (5.13)$$

M5 and M6 act as constant current sources that divert part of the input-pair current away from M7, M8 to increase the gain. M5-M11 all have the same length, which simplifies the layout. As there is no offset calibration in the first stage, M3 and M4 should be kept large. The first stage pre-amplifier is driven by the capacitor digital to analog converter (CDAC), whose capacitance is much larger than the input capacitance of the first stage, so a large M3, M4 is acceptable.
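Dividing Eq.(5.13) by Eq.(5.12) shows that, for unchanged device sizes, the gain improves by the square root of the ratio between the input-pair current and the current left in the load. A minimal sketch, with hypothetical bias currents chosen only for illustration:

```python
import math

def bleeding_gain_boost(i_input: float, i_bleed: float) -> float:
    """Ratio A_2 / A_1 from Eqs. (5.12) and (5.13) for unchanged device sizes."""
    if not 0 <= i_bleed < i_input:
        raise ValueError("bleed current must be in [0, i_input)")
    return math.sqrt(i_input / (i_input - i_bleed))

# Hypothetical currents: 100 uA per input branch, 75 uA bled away from the load.
boost = bleeding_gain_boost(i_input=100e-6, i_bleed=75e-6)
print(f"gain boost: {boost:.2f}x")
```

Bleeding three quarters of the branch current doubles the gain without shrinking the load transistors.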

#### 5.1.2.2 Second and third stage of the pre-amplifier

The second and third stages of the pre-amplifier include the offset calibration and provide enough gain. Normally, there are two ways to do the offset calibration: input offset storage (IOS) and output offset storage (OOS)[16]. Both are also called auto-zero methods. Figure 5.8 shows the two methods.

The principle of auto-zero is to sample the unwanted quantity (offset) and then subtract it from the input signal, either at the input or the output of the amplifier[16]. To sample the unwanted quantity, a pair of capacitors is unavoidable, either at the input or the output of the amplifier. This leads to a large load capacitance, which is not suitable for a high speed pre-amplifier. In this design, we introduce another offset calibration method that avoids the large load capacitance. Figure 5.9 shows the schematic of the second and third stage pre-amplifier.

The basic working principle of this offset calibration is that the unwanted quantity is sampled on the capacitors C1 and C2, which then bias M7 and M8 with different currents to compensate the offset.

The pre-amplifier has two working phases, as shown in Figure 5.10. In phase 1, the switches $S_1$ and $S_2$ are closed and the pre-amplifier works as a normal 5T amplifier. We assume that a positive input offset ($V_o$) is applied at the in- node, and that the inputs are shorted and connected to a common voltage ($V_{cm}$).

Figure 5.8: The input offset storage (IOS) and output offset storage (OOS).

Figure 5.9: The second and third stage pre-amplifier.

Figure 5.10: The working principle of the offset calibration.

The offset reduction can first be analyzed at the system level. We assume that the gain of the pre-amplifier is $A_1$ in phase 1 and $A_2$ in phase 2. In phase 1, the output signal is:

$$\Delta V_{out} = V_oA_1 \quad (5.14)$$

In phase 2, the switches $S_1$ and $S_2$ are open. The output signal is:

$$\Delta V_{out} = V_{eff}A_2 \quad (5.15)$$

$V_{eff}$ is the input referred offset in phase 2. Ideally, opening the switches $S_1$ and $S_2$ affects the left and right sides equally, so $\Delta V_{out}$ stays the same and we can conclude that:

$$V_{eff} = \frac{A_1}{A_2}V_o \quad (5.16)$$

In phase 1, the pre-amplifier works as a normal 5T amplifier, and we assume that $M5$ and $M7$ have equal length:

$$A_1 \approx \frac{g_{m3}}{g_{m57}} \approx \sqrt{\frac{\mu_p (W_3/L_3)}{\mu_n (W_{5+7}/L_5)}} \quad (5.17)$$

Here $g_{m57}$ is the combined transconductance of $M5$ and $M7$, and $W_{5+7} = W_5 + W_7$ is their combined width. In phase 2, the transistors $M7$ and $M8$ can ideally be treated as current sources. We get:

$$A_2 \approx \frac{g_{m3}}{g_{m5}} \approx \sqrt{\frac{I_1\, \mu_p (W_3/L_3)}{(I_1 - I_{M7})\, \mu_n (W_5/L_5)}} \quad (5.18)$$
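Combining Eq.(5.16) with the two gain expressions gives a feel for the achievable offset reduction. The following is only a sketch under two idealizing assumptions that are not stated in the text: in phase 1, $M5$ and $M7$ share the same gate-source voltage, so $I_1$ splits in proportion to their widths, and in phase 2, $M7$ holds exactly that current.

$$\frac{V_{eff}}{V_o} = \frac{A_1}{A_2} \approx \sqrt{\frac{W_5}{W_5 + W_7} \cdot \frac{I_1 - I_{M7}}{I_1}}, \qquad I_{M7} = \frac{W_7}{W_5 + W_7} I_1 \;\Rightarrow\; \frac{V_{eff}}{V_o} \approx \frac{W_5}{W_5 + W_7}$$

Under these assumptions, choosing for example $W_7 = 3W_5$ would reduce the offset by a factor of four.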

The offset is reduced because the DC gain of the pre-amplifier differs between the two phases. The offset reduction can also be understood at the circuit level. The positive offset leads to unbalanced currents: $I_1$ is smaller than $I_2$. If the circuit compensates the unbalanced current, the offset is reduced. Because $I_1$ is smaller than $I_2$, $out-$ is larger than $out+$, which means the sampled bias voltage of $M8$ is larger than that of $M7$. In phase 2, $I_{M8}$ is therefore larger than $I_{M7}$, so that:

$$I_6 - I_5 < I_2 - I_1 \quad (5.19)$$

As the transistors $M7$ and $M8$ can ideally be treated as current sources in phase 2, only $I_6$ and $I_5$ influence the gain. The difference of the effective currents ($I_6$, $I_5$) in phase 2 is therefore smaller than the difference in phase 1 ($I_2$, $I_1$). The offset is reduced by the different biasing currents of $M7$ and $M8$.

The advantage of this offset calibration method is that it avoids the extra load capacitance of the auto-zero methods mentioned above, which ensures a high speed. The drawback is its offset reduction, which is normally lower than that of the auto-zero methods. For this reason, the comparator should have a relatively large size to reduce its device mismatch. In this design, as the ADCs are not used at column level, a relatively large comparator is acceptable.

Figure 5.11 presents the offset (3σ) of the whole comparator with offset calibration, based on a Monte Carlo simulation with 100 iterations.

![Figure 5.11: The offset of the comparator.](image)

The maximum offset, which includes the offset of the latch and of the pre-amplifier after calibration, is 100 μV, which is less than one LSB.

Because the pre-amplifier works in open loop, it saturates easily. Its recovery cannot be analyzed with the small-signal model alone; the large-signal behavior must also be taken into account, especially for the third stage. In other words, the recovery of the pre-amplifier suffers from slewing. The worst case occurs in the first several bits, where a large input can drive all three stages into saturation so that all of them slew. In this design, the ADC includes a self-calibration function to handle this issue, so the comparator only needs to be designed for the last few bits. To achieve an acceptable residual effect, the size and biasing current of the
The in and nin signals are the two inputs of the comparator, and out and nout are its outputs. To check the residual effect for the last two bits, nin is connected to a 1.6V reference voltage and in is driven by a 128MHz pulse signal whose amplitude increases gradually, in steps of $5\mu V$ between neighboring pulses. For the last two bits, the input difference is 2 LSB, i.e. about 250 $\mu V$. The input signal of the comparator is set up as shown in Figure 5.12. After optimization, the residual voltage is 30 $\mu V$, which is less than one LSB.

![Figure 5.12: The residual effect.](image)

5.2 Single 14-bit 4Ms/s SAR ADC

The DAC is implemented as a capacitor DAC (CDAC) and redistributes its charge during the sampling phase according to its 14 control bits. The CDAC is formed by three parts: a 6-bit linear CDAC, a 4-bit binary CDAC and a 4-bit resistive CDAC. The capacitor bank is fully differential: one input connects to the signal and the other to the reference level. The common mode is not altered, since the switching in the two halves always happens in a complementary fashion. At the end of the conversion, the voltage difference at the comparator inputs is a measure of the conversion accuracy and should be less than one LSB.

The digital logic block generates the digital control bits that realize the successive approximation algorithm. Besides the conventional logic, the digital logic block also implements the self-calibration logic and a post-processing function. This is the reason why a 14-bit ADC requires 32 clock cycles to finish one conversion. The digital bits are connected to the CDAC in the following fashion: the MSB connects to the largest capacitor and the LSB to the smallest one in the bank.
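The successive approximation algorithm itself can be sketched as a binary search. The model below is an idealized, single-ended illustration: it uses a perfect DAC, ignores the segmented CDAC, and omits the self-calibration and post-processing steps, so it is not the logic implemented in this design. The last two lines check the conversion rate under the assumption that the 32 cycles run from the 128 MHz clock.

```python
def sar_convert(v_in: float, v_ref: float = 2.0, n_bits: int = 14) -> int:
    """Ideal successive approximation: one comparator decision per bit, MSB first."""
    code = 0
    for bit in range(n_bits - 1, -1, -1):
        trial = code | (1 << bit)                # tentatively set the current bit
        v_dac = trial * v_ref / 2**n_bits        # ideal DAC output for the trial code
        if v_in >= v_dac:                        # comparator decision
            code = trial                         # keep the bit
    return code

code = sar_convert(1.0)
print(code, format(code, "014b"))

F_CLK, CYCLES_PER_CONVERSION = 128e6, 32         # assumed clocking of one ADC
conversion_rate = F_CLK / CYCLES_PER_CONVERSION  # conversions per second
```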

Thanks to the CDAC, the SAR ADC supports several sampling schemes for correlated double sampling (CDS): on-chip CDS, off-chip CDS, pseudo-differential sampling and single-ended sampling.

In the pseudo-differential mode, used for on-chip CDS, the two inputs of the ADC will be the pixel reset (reference) and the pixel signal level (signal). Those are acquired at two moments slightly shifted in time, but become available simultaneously by the use of S/H (clamping) stages. In this case the result of the AD conversion corresponds to the difference between both levels, irrespective of their absolute value.

In a single-ended mode, used for off-chip CDS and in the case without CDS (3T pixel) readout, the reference input of the ADC will see a fixed voltage with the signal input receiving the real pixel value.

5.3 Overview of interleaved 14-bit 64Ms/s SAR ADC

As a single ADC cannot provide the expected pixel output rate of around 64 Ms/s, multiple ADCs are interleaved. After a thorough trade-off analysis, interleaved ADCs were selected for implementation because of their scalability in future versions of the sensor. Figure 5.14 shows an overview of the ADC block. The ADC block includes sixteen 4Ms/s SAR ADCs to achieve an overall rate of 64Ms/s.
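The interleaving arithmetic can be sketched as below. The round-robin assignment of samples to ADCs is an assumption made for illustration; the text does not specify the actual ordering.

```python
N_ADCS = 16
ADC_RATE = 4e6      # samples per second per SAR ADC, from the text

def adc_for_sample(sample_index: int) -> int:
    """Round-robin assignment (assumed): sample n is converted by ADC n mod 16."""
    return sample_index % N_ADCS

aggregate_rate = N_ADCS * ADC_RATE        # overall sample rate of the ADC block
revisit_time = N_ADCS / aggregate_rate    # time between two samples on one ADC

print(f"aggregate rate: {aggregate_rate / 1e6:.0f} MS/s")
print(f"each ADC has {revisit_time * 1e9:.0f} ns per conversion")
print([adc_for_sample(n) for n in range(14, 19)])
```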

Figure 5.14: The overview of the ADC block.

The ADC output is parallel, whereas the low-voltage differential signaling (LVDS) sender can only handle a serial input, so 14-bit interleaved serializers are required. With sixteen 4Ms/s 14-bit ADCs, the output data rate is 16 × 4M × 14 = 896Mb/s. Two 250MHz double data rate serializers are used to serialize the output data. At 250MHz the serializer speed is not critical, so it does not suffer much from clock skew. A simple and robust serializer with the local clock generator described in chapter 4 is sufficient. Figure 5.15 shows the schematic of the 14-bit serializer.
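A quick check of the output bandwidth budget, as a sketch. It assumes that each serializer runs from a 250 MHz clock and sends two bits per clock cycle (double data rate); this reading of the serializer rate is an interpretation of the text.

```python
N_ADCS, ADC_RATE, N_BITS = 16, 4_000_000, 14
SERIALIZERS, F_SER_CLK = 2, 250_000_000       # assumed: 250 MHz clock per serializer

payload = N_ADCS * ADC_RATE * N_BITS          # bits per second produced by the ADCs
capacity = SERIALIZERS * 2 * F_SER_CLK        # DDR: two bits per clock per serializer
utilisation = payload / capacity

print(f"payload {payload / 1e6:.0f} Mb/s, capacity {capacity / 1e6:.0f} Mb/s, "
      f"utilisation {utilisation:.1%}")
```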

Figure 5.15: The 14-bit serializer.

The 14-bit serializer uses DFF chains and a Mux2 to serialize the ADC output. The inverter drives the LVDS sender in the next stage. This structure is robust at low speed. Figure 5.16 shows a functional simulation result of this serializer at 250 MHz.

The input word is 11001100001111 and, because of the inverter, the expected output is the inverted word 00110011110000. If there is no input, the output of the serializer is the training pattern 01010101, as one DFF chain is connected to VDD and the other one to VSS. The simulation confirms the function of the serializer.
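The behavior described above can be captured in a small model. This is a behavioral sketch only: the split of the word into even and odd bits across the two DFF chains and the order in which the Mux2 selects them are assumptions, not details taken from the schematic.

```python
def serialize(word: str) -> str:
    """Behavioral model: two DFF chains, a 2:1 mux and an inverting output driver."""
    chain_a = word[0::2]      # bits sent during the first clock half (assumed)
    chain_b = word[1::2]      # bits sent during the second clock half (assumed)
    out = []
    for a, b in zip(chain_a, chain_b):
        for bit in (a, b):                            # Mux2 alternates between chains
            out.append("0" if bit == "1" else "1")    # inverter before the LVDS sender
    return "".join(out)

print(serialize("11001100001111"))   # 00110011110000
print(serialize("10" * 4))           # idle chains at VDD and VSS: 01010101
```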

To test the ADC block and image sensor block independently, a special I/O cell is introduced. Figure 5.17 shows the schematic of this I/O cell.

This cell includes a strong buffer, a Mux2 and an electrostatic discharge (ESD) protection circuit. The Mux2 can switch off the output buffer when the standby signal is high to reduce the power consumption. The input of the output buffer is one of the signal lines of the 8 buses. There are 16 of these special I/O cells in parallel in the west I/O block.

When only the ADC block is tested, the image sensor block and the output buffers are powered off and the test signal is applied directly via the bond pad. When the image sensor block is tested, the output buffers are active and send the bus signals out.

5.4 Digital control block

The digital control block generates the control signals for the different parts of the sensor. It is mainly formed by several variable addressable serial parallel interfaces (VASPIs). Inside the sensor, the VASPI is used for two functions:

1. SPI upload: setting register bits in a certain register. This is used to apply static or slowly varying signals to the sensor without the need for a dedicated I/O for these signals. Multiple bits inside the same register can be changed within the same SPI upload.

2. SPI read-back: reading back the state of a single bit inside a register. This allows the correct upload of the register to be verified bit by bit. The read-back value appears on an MBS wire. To avoid conflicts on this MBS wire, the read-back state of only one bit at a time should be requested.

Compared to the conventional addressable serial parallel interface (ASPI), the variable ASPI (VASPI) has a variable SPI word length that can change per register. This allows the word length to be scaled to the data content of the register, which increases the speed of the interface. Figure 5.18 shows the schematic of a VASPI with a 2-bit address and 2-bit data.

![Figure 5.18: The schematic of the VASPI.](image)

The VASPI uses three control signals:

1. SPI_CLK: it provides the clock to drive the shift register.
2. SPI_data: it sends the serial input data for the VASPI.
3. SPI_load: it is the control signal which makes the input data load into the memory register.

The schematic shows that the upload or read-back bit is the last bit of the word. This makes it possible to vary the number of address and data bits.

The VASPIs inside the digital block are classified into three groups: the image sensor group, the ADC group, and the I/O and test group. The image sensor group generates the control signals for the row driver block, S/H circuit, column load and MBS in the image sensor part. The I/O and test group is mainly used to control the MSB and to set the programmable tuner that biases the circuits inside the readout chain and the ADC block. The three groups are selected by the two least significant bits of the address, as shown in Table 5.1.

Table 5.1: The SPI address with dedicated LSBs for a particular block.

<table>
<thead>
<tr>
<th>Chip part</th>
<th>Address bits &lt;1:0&gt;</th>
</tr>
</thead>
<tbody>
<tr>
<td>Image sensor</td>
<td>x0</td>
</tr>
<tr>
<td>ADC</td>
<td>01</td>
</tr>
<tr>
<td>I/O and test blocks</td>
<td>11</td>
</tr>
</tbody>
</table>
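The group selection in Table 5.1 amounts to a small decoder on the two address LSBs. A minimal sketch, where the function name and the string labels are hypothetical:

```python
def vaspi_group(address: int) -> str:
    """Map the two address LSBs to a register group as listed in Table 5.1."""
    lsbs = address & 0b11
    if lsbs & 0b01 == 0:              # pattern x0
        return "image sensor"
    if lsbs == 0b01:                  # pattern 01
        return "ADC"
    return "I/O and test blocks"      # pattern 11

for addr in (0b100, 0b110, 0b101, 0b111):
    print(format(addr, "03b"), vaspi_group(addr))
```

Because the image sensor group matches both 00 and 10, it has twice as many addresses available as either of the other two groups.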

Figure 5.19 presents a 3 bit address and 9 bit data (a3d9) VASPI timing diagram and bit function for the upload mode.

In upload mode, SPI_load is a pulse, triggering the propagation to and storage of the data in the registers. The register content is changed on the rising edge of the load pulse.
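The upload sequence can be illustrated with a behavioral model of an a3d9 VASPI. This is a sketch under assumptions: the bit order inside the word (address first, then data, MSB first) and the polarity of the final mode bit (1 for upload, 0 for read-back) are hypothetical. Only the position of the mode bit at the end of the word and the storage on the rising edge of SPI_load follow the text.

```python
class Vaspi:
    """Behavioral sketch of a VASPI with addr_bits address and data_bits data bits."""

    def __init__(self, addr_bits: int = 3, data_bits: int = 9):
        self.addr_bits, self.data_bits = addr_bits, data_bits
        self.frame_len = addr_bits + data_bits + 1   # plus the final mode bit
        self.shift = []                              # shift register, newest bit last
        self.registers = {}                          # address -> stored data word
        self.prev_load = False

    def clock(self, spi_data: int) -> None:
        """Rising edge of SPI_CLK: shift in one bit of SPI_data."""
        self.shift.append(spi_data & 1)
        self.shift = self.shift[-self.frame_len:]    # keep only the latest word

    def load(self, spi_load: int) -> None:
        """Store the data word on the rising edge of SPI_load (upload mode only)."""
        rising = bool(spi_load) and not self.prev_load
        self.prev_load = bool(spi_load)
        if not rising or len(self.shift) != self.frame_len:
            return
        addr = int("".join(map(str, self.shift[:self.addr_bits])), 2)
        data = int("".join(map(str, self.shift[self.addr_bits:-1])), 2)
        if self.shift[-1] == 1:                      # assumed: 1 = upload
            self.registers[addr] = data

spi = Vaspi()
for bit in "101" + "000001111" + "1":    # address 5, data 15, upload
    spi.clock(int(bit))
spi.load(1)
spi.load(0)
print(spi.registers)
```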

Figure 5.20 presents a functional simulation result of an a3d9 VASPI with 30MHz master clock (SPI_CLK).

SPI<1> is the SPI_CLK, SPI<0> is the SPI_data and SPI<2> is the SPI_load. The simulation result shows that the VASPI data bits are uploaded at the same time and that the addressing works. The upload function of the VASPI is thereby verified.

5.5 Conclusion

In this chapter, the ADC block is discussed. The analysis and simulation results of the comparator are presented in detail. An overview of a single ADC and of the whole ADC block is given, and the special I/O cell for testing is introduced. Finally, the digital control block is discussed.

Figure 5.20: The functional simulation result of an a3d9 VASPI.

In this thesis work, I worked as a part of the Caeleste team and contributed to the design of a UV sensitive image sensor. The features of this sensor are:

1. A new UV sensitive pixel, included in a $4 \times 4$ kernel, with 106 dB high dynamic range achieved by an extended floating diffusion, is developed.
2. An interleaved 14-bit, 64Ms/s SAR ADC is realized which includes a high speed, low offset comparator and self-calibrating function.

### 6.1 My contributions

My contributions in this Caeleste project were:

1. The whole analog readout chain design and verification, which includes the row driver block, S/H circuits, readout buffers and multiplexer.
2. Participation in the comparator design and verification.
3. Schematic design and verification of the digital control and I/O block.
4. Major part of the layout for the standard cell library which is used in the layout of the sensor.
5. Help with making the datasheet and building the new technology library.

### 6.2 Future work

At the time of writing this thesis, part of the layout of this work was done. The complete layout, the top-level simulation including the ADC block, and the verification still need to be carried out. For the ADC in particular, the layout style has a large influence on the mismatch of the capacitor bank.
Because of the on-chip ADC, the synchronization of the whole system should be carefully checked, and the ADC still needs optimization, including the settling of the CDAC, the power consumption of the comparator and the delay time of the control logic.
Bibliography


