## THE DESIGN OF A HIGH SPEED CMOS IMAGE SENSOR

featuring global shutter, high dynamic range and flexible exposure control





Periklis Stampoglis 2018

#### THE DESIGN OF A HIGH SPEED CMOS IMAGE SENSOR

## FEATURING GLOBAL SHUTTER, HIGH DYNAMIC RANGE AND FLEXIBLE EXPOSURE CONTROL IN 110 nm TECHNOLOGY

#### Thesis

to obtain the degree of Master of Science from the Technical University of Delft, under the authority of Prof. dr. A. Theuwissen, chairman of the promotional committee, to be publicly defended on Monday, March 18<sup>th</sup>, 2019 at 13:00.

by

#### **Periklis STAMPOGLIS**

Faculty of Electrical Engineering, Mathematics and Computer Science, Technische Universiteit Delft, Delft, The Netherlands.

This dissertation has been approved by

supervisor: Prof. dr. A. Theuwissen

daily supervisor: Mr. G. Cai

Promotional committee composition:

| Member                  | Affiliation                   |
|-------------------------|-------------------------------|
| Prof. dr. A. Theuwissen | Technische Universiteit Delft |
| Prof. dr. E. Charbon    | Technische Universiteit Delft |
| dr. B. Luyssaert        | Caeleste CVBA                 |
| Mr. G. Cai              | Caeleste CVBA                 |

*Note:* By request of the final customer, parts of this thesis (including figures and text) have been redacted. These were originally present in the document used during the defence of the thesis and reviewed by the thesis committee.



*Keywords:* CMOS image sensor, high speed, HDR, global shutter

An electronic version of this dissertation is available at http://repository.tudelft.nl/.

And He said:

$$\nabla \cdot \mathbf{E} = \frac{\rho}{\epsilon_0}$$
$$\nabla \cdot \mathbf{B} = 0$$
$$\nabla \times \mathbf{E} = -\frac{\partial \mathbf{B}}{\partial t}$$
$$\nabla \times \mathbf{B} = \mu_0 \epsilon_0 \frac{\partial \mathbf{E}}{\partial t}$$

and there was light.

## **CONTENTS**

- List of Figures
- Abstract
- Acknowledgments
- 1 Introduction
  - 1.1 Motivation
  - 1.2 Abbreviations
  - 1.3 Organization of Thesis
- 2 Background
  - 2.1 History of Electronic Imaging
  - 2.2 Silicon Photodetectors
  - 2.3 Electronic Shutters
  - 2.4 Pixel Types
    - 2.4.1 Passive Pixel Sensors (PPS)
    - 2.4.2 Active Pixel Sensors (APS)
  - 2.5 Technology Scaling
  - 2.6 Typical CIS Floorplan
  - 2.7 Common CIS Metrics
  - 2.8 Sources of Non-ideality in CIS
  - 2.9 High-speed Applications
  - 2.10 Summary
  - References
- 3 This Work
  - 3.1 Introduction
  - 3.2 System Architecture
    - 3.2.1 Top Level Architecture & Floor Plan
    - 3.2.2 The Quadrant Split
    - 3.2.3 The Kernel
    - 3.2.4 Pipelined Readout
    - 3.2.5 Signal Path: From Pixel to Output
  - 3.3 GS/HDR Pixel
  - 3.4 Pixel Driver
  - 3.5 KernelScan
  - 3.6 Readout Circuits
    - 3.6.1 Column Multiplexing
    - 3.6.2 [...]
    - 3.6.3 Amplifying Sample and Hold
    - 3.6.4 Column Readout Circuits
    - 3.6.5 Single-to-Differential Amplifier (S2D)
    - 3.6.6 Output Stage
  - 3.7 Digital Blocks
    - 3.7.1 X&Y Scanners
    - 3.7.2 X-Clock Generator
    - 3.7.3 Region-of-Interest Selector
    - 3.7.4 (A)SPI Interface
  - 3.8 Design for Testing
    - 3.8.1 Test & Variant Pixels
    - 3.8.2 VDD<sub>pix</sub> Disabling
    - 3.8.3 Column Test
    - 3.8.4 The MBS
    - 3.8.5 Wafer Probing
  - 3.9 Modes of Operation
  - 3.10 Summary
  - References
- 4 Conclusion

## **LIST OF FIGURES**

| Figure | Title                                                 |
|--------|-------------------------------------------------------|
| 2.1    | Energy band structure of different materials          |
| 2.2    | Photon absorption versus energy                       |
| 2.3    | QE versus wavelength in BSI                           |
| 2.5    | Cross-section of photodiode                           |
| 2.6    | Integrating photodiode                                |
| 2.7    | PPD potential diagram                                 |
| 2.8    | Simplified cross section of a PPD                     |
| 2.9    | Rolling shutter principle                             |
| 2.10   | Rolling shutter artefacts                             |
| 2.11   | Rolling versus Global shuttering                      |
| 2.12   | Global shutter principle                              |
| 2.13   | Common pixel structures                               |
| 2.14   | Some types of GS-capable pixels                       |
| 2.15   | CIS technology scaling                                |
| 2.16   | Floorplan of different ADC configurations             |
| 2.17   | Fill factor in BSI & FSI                              |
| 2.18   | Use of microlenses in BSI for vignetting compensation |
| 2.19   | Microscope image of microlenses                       |
| 2.20   | Decrease of MTF with increasing spatial frequency     |
| 2.21   | Sampling a sinusoidal input with ideal pixels         |
| 2.22   | Cross-section of FSI and BSI configurations           |
| 2.23   | Examples of high-speed imaging applications           |
| 3.1    | The top-level block diagram of the sensor             |
| 3.2    | Quadrant split of the sensor                          |
| 3.3    | The kernel concept                                    |
| 3.4    | Sample and Hold circuit                               |
| 3.5    | Nominal X-Scanner timing                              |
| 3.6    | Zero LBT X-Scanner timing                             |
| 3.7    | Simplified readout path diagram                       |
| 3.8    | GS/HDR pixel schematic                                |
| 3.9    | GS/HDR pixel layout                                   |
| 3.10   | Pixel driver simplified schematic                     |
| 3.11   | Layout of the pixel driver                            |
| 3.12   | Driver cell Type I schematic                          |
| 3.13   | Driver cell Type II schematic                         |
| 3.14   | Programmable strength driver                          |
| 3.15   | Driver cell Type III schematic                        |
| 3.16   | Driver cell Type IV schematic                         |
| 3.17   | Driver cell Type V schematic                          |
| 3.18   | Pixel driver top-level simulation                     |
| 3.19   | The KernelScan Concept                                |
| 3.20   | Example KernelScan timing                             |
| 3.21   | KernelScan arbitrary integration                      |
| 3.22   | KernelScan pixel patterns                             |
| 3.23   | Column multiplexing scheme                            |
| 3.24   | Column multiplexing timing                            |
| 3.25   | Column load unit cell schematic (*a*) and layout (*b*) |
| 3.26   | Column simulation                                     |
| 3.27   | Simplest S&H circuit                                  |
| 3.28   | S&H input and output                                  |
| 3.29   | S&H with amplification                                |
| 3.30   | S&H Accumulation Capacitor                            |
| 3.31   | Column Readout Circuits                               |
| 3.32   | Generation of column readout controls                 |
| 3.33   | Column readout circuits layout                        |
| 3.34   | Simulation of video buffer output                     |
| 3.35   | Video amplifier control sequence                      |
| 3.36   | S2D amplifier schematic                               |
| 3.37   | S2D amplifier layout                                  |
| 3.38   | S2D clock distribution                                |
| 3.39   | S2D - 32 Channel layout                               |
| 3.40   | S2D output transient simulation                       |
| 3.41   | S2D transfer simulation (A=1)                         |
| 3.42   | S2D transfer simulation (A=4)                         |
| 3.43   | Output stage                                          |
| 3.44   | Output load                                           |
| 3.45   | Output stage memory effect                            |
| 3.46   | Typical X-scanner chain                               |
| 3.47   | 8-phase X-scanner                                     |
| 3.48   | X-scanner layout                                      |
| 3.49   | Y periphery block diagram                             |
| 3.50   | X-Clock Generator                                     |
| 3.51   | X-Clock Generator Layout                              |
| 3.52   | Region of interest circuit                            |
| 3.53   | Layout of an ASPI register                            |
| 3.54   | Example of an ASPI transaction                        |
| 3.55   | Grayscale pixels                                      |
| 3.56   | VDD<sub>pix</sub> Disabling                           |
| 3.57   | VDD<sub>pix</sub> Disabling Cell                      |
| 3.58   | VDD<sub>pix</sub> Layout                              |
| 3.59   | Shorted Column Test                                   |
| 3.60   | MBS Cell Types                                        |
| 3.61   | Waferprobe scheme                                     |
| 3.62   | Operating Mode 1                                      |
| 3.63   | Operating Mode 2                                      |
| 3.64   | Operating Mode 3                                      |

## **ABSTRACT**

High speed imagers find applications in many fields such as scientific and medical imaging, automotive applications, machine vision and many more. In this thesis, the design of a high speed, high dynamic range (HDR) CMOS sensor with electronic global shutter (GS) and flexible exposure control is presented. The sensor is designed in a 0.11 µm CIS process, features $1k(H) \times 1k(V)$ pixels and achieves frame rates greater than 10,000 fps. A review of the architecture of the sensor is given, along with functional illustrations for each constituent block. The quadrant-based approach is described, along with the selectable region-of-interest capability. The pixel is an eleven-transistor (11T) pinned photodiode global shutter pixel, implementing HDR by means of two in-pixel capacitors. The design of the pipelined Sample & Hold, column gain and column-level Correlated Double Sampling (CDS) circuits is shown.

Keywords --- CMOS image sensor, high speed, HDR, global shutter

## **ACKNOWLEDGMENTS**

I would like to first say a wholehearted thank you to Prof. Albert Theuwissen, my supervising professor at the TU Delft. His support, guidance and patience throughout the duration of this long project are and will continue to be greatly appreciated. To have worked with someone of the calibre of Prof. Theuwissen, a reference and master of the field, has truly been an honour and a privilege.

Next, I'd like to express my gratitude to my mentor and lead designer, Gaozhan Cai, without whom this very challenging project would simply not have been possible. Gaozhan has not only been a wealth of knowledge to me throughout my time at Caeleste but also an example of critical thinking, consistency and dedication for me to aspire to.

Further, a big thank you to Bert Luyssaert, the calm force behind the organization of this challenging project. It has been a great pleasure working together over the past years.

I also want to thank Bart Dierickx, co-founder and CTO of Caeleste, not only for his profound knowledge and experience, but also for his energy and dedication to the company. It is said that if one wants to continuously improve oneself, one should strive never to be the smartest or the most knowledgeable person in the room. And in this, I like to think I've done well.

In addition, I'd like to express my appreciation to all of my colleagues at Caeleste, both those involved in the project and those who were not.

I would be remiss if I didn't also mention my gratitude to Prof. Edoardo Charbon, who was teaching the image sensor course during my year at the TU Delft. He and the work I did for his course are the reason I now find myself on this fascinating and exciting path of image sensor design.

Finally, I want to express my love to my family from the bottom of my heart. I'll always be grateful for their care, love and support through my studies and my life. I am the person I am today thanks to them.

Periklis Stampoglis Delft, March 2019

# 1

## **INTRODUCTION**

"I happen to have discovered a direct relation between magnetism and light, also electricity and light, and the field it opens is so large and I think rich"

Michael Faraday

"Deep in their roots, all flowers keep the light"

Theodore Roethke

#### **1.1.** MOTIVATION

High speed imaging sensors are an indispensable tool for capturing fast transient phenomena, with applications spanning a wide spectrum, such as scientific research, machine vision for the manufacturing and automotive fields, certain types of microscopy and many more. The continuous demand for higher frame rates at higher resolution presents the image sensor designer with numerous and fascinating challenges. This project's aim is to design such a high speed imager featuring a $1k \times 1k$ array of eleven-transistor (11T) global shutter (GS) and high dynamic range (HDR) capable pixels at a frame rate of 10,000 fps in continuous video operation. The 110 nm technology node was selected for this project, as opposed to the ever-popular 180 nm node. This offers greater device density and speed, both of which are critical for a design like this.
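To give a feel for the data throughput these targets imply, the short calculation below estimates the aggregate pixel rate and the frame period. It is only a back-of-the-envelope sketch: it assumes that "1k" means 1024 pixels per side and ignores any blanking time, and the function names are purely illustrative.

```python
def aggregate_pixel_rate(columns: int, rows: int, frames_per_second: float) -> float:
    """Pixels that must be read out every second, ignoring any blanking time."""
    return columns * rows * frames_per_second


def frame_period_us(frames_per_second: float) -> float:
    """Time available per frame, in microseconds."""
    return 1e6 / frames_per_second


if __name__ == "__main__":
    # Assumption: "1k x 1k" is taken as 1024 x 1024 pixels.
    rate = aggregate_pixel_rate(1024, 1024, 10_000)
    print(f"Aggregate pixel rate: {rate / 1e9:.2f} Gpixel/s")
    print(f"Frame period: {frame_period_us(10_000):.0f} us")
```

Under these assumptions the readout chain has to sustain roughly 10.5 Gpixel/s, with only 100 µs available per frame.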

#### **1.2.** ABBREVIATIONS

The following list provides definitions of commonly used abbreviations.

- **3T** Three transistor pixel with transfer gate
- **4T** Four transistor pixel with transfer gate and pinned photodiode
- **AB** Anti-blooming
- **ADC** Analog to Digital Converter
- **APS** Active Pixel Sensor
- **BST/BSI** Backside Thinning/Illumination
- **CCD** Charge Coupled Device
- **CDS** Correlated Double Sampling
- **CFA** Color Filter Array
- **CIS** CMOS Image Sensor
- **(C)MOS** (Complementary) Metal Oxide Semiconductor
- **CTIA** Capacitive Trans-impedance Amplifier
- **CVF** Charge to Voltage conversion Factor
- **DCSN** Dark Current Shot Noise
- **DS** Double Sampling
- **DSC** Digital Still Camera
- **DSLR** Digital Single-Lens Reflex
- **DSNU** Dark Signal Non-Uniformity
- **EMI** Electro-Magnetic Interference
- **FBT** Frame Blanking Time
- **FD** Floating Diffusion
- **FF** Fill Factor
- **FPN** Fixed-Pattern Noise
- **FPS** Frames Per Second
- **FSI** Frontside Illumination
- **FWC** Full-Well Capacity
- **GS** Global Shutter
- **HDR** High Dynamic Range
- **ITR** Integrate Then Read
- **kTC** Reset noise of capacitor C
- **LBT** Line Blanking Time
- **(MOS)FET** MOS Field-Effect Transistor
- **MTF** Modulation Transfer Function
- **PGA** Programmable Gain Amplifier
- **PLS** Parasitic Light Sensitivity (Shutter Efficiency)
- **PPD** Pinned Photodiode
- **PPS** Passive Pixel Sensor
- **PRNU** Photo-Response Non-Uniformity
- **PSN** Photon Shot Noise
- **QE** Quantum Efficiency
- **RGB** Red Green Blue
- **ROI** Region of Interest
- **RS** Rolling Shutter
- **RTS** Random Telegraph Signal
- **RWI** Read While Integrate: pipelined electronic shutter, often GS
- **SF** Source Follower
- **SN** Storage Node
- **SoC** System-on-Chip
- **SPAD** Single Photon Avalanche Diode
- **SPI** Serial Peripheral Interface
- **TG** Transfer Gate

#### **1.3.** ORGANIZATION OF THESIS

The thesis is organized in four chapters. The first includes the motivation behind this work as well as a list of the most common abbreviations the reader will find throughout the document. The second gives a timeline of the historic evolution of electronic imaging, focusing on the CMOS imager. It attempts to cover the principal points of the wide background of CIS, from the detector physics to building and characterizing CMOS imagers. The third chapter describes the design of a high-speed global shutter imager, including the floor-planning, readout scheme, pixel and justification of major design decisions. Finally, chapter four gives a conclusion and prospects for future work.

# 2

## **BACKGROUND**

"If I have seen further it is by standing on ye sholders of Giants"

Isaac Newton

"There is one simplification at least. Electrons behave ... exactly the same way as photons; they are both screwy, but in exactly the same way..."

Richard Feynman

Since the very first steps of civilization, humans have had the need to create visual representations of the world around them to convey messages, explain their surroundings and preserve their memories. From the paintings of Lascaux and the mosaics of Mesopotamia to the first photographs of Nicéphore Niépce, the instinct to visually capture the world around us has always been with us, invariant throughout time.

With the advent of the transistor and the progressive development in the field of semiconductor physics, possibilities to perform solid-state electronic imaging started to emerge.

#### **2.1.** HISTORY OF ELECTRONIC IMAGING

- **1963** Morrison (Honeywell) demonstrates a photosensitive potentiometer (photopot), a photoscanner and a two-dimensional light spot position detector manufactured in standard semiconductor technology. [1]
- **1964** Horton et al. (IBM) report the scanistor, a semiconductor device to perform image scanning. [2]
- **1966** Schuster & Strull (Westinghouse) report a 50×50 monolithic array of phototransistors. [3]
- **1967** Weckler (Fairchild) suggests operation of a floating p-n junction in photon flux integrating mode, where the junction photocurrent is integrated on a reverse-biased p-n junction capacitance. [4]
- **1967** Weimer et al. (RCA) report a 180×180 TFT element self-scanned sensor. [5]
- **1968** Dyck & Weckler (Fairchild) report a 100×100 array of photodiodes. [6]
- **1968** Noble (Plessey) reports the first active pixel design using a buried photodiode, a MOS source-follower transistor and an on-chip charge integrating amplifier. [7]
- **1970** Boyle & Smith (AT&T Bell Labs) report the first charge-coupled device (CCD). [8] The image quality achieved was so superior that it resulted in a near-complete halt of MOS imaging research. The low fixed-pattern noise and small pixel size contributed heavily to the success of CCDs in the industry.
  - *Landmark:* Birth of the CCD and near halt of research in MOS-based imaging. CCDs showed better performance across the board: they allowed for more compact pixels driven by only three phases, exhibited lower readout noise and no fixed-pattern noise, and had low on-chip power dissipation. During the 1970s and early 1980s CCD technology was aggressively developed, resulting in major improvements in quantum efficiency, fill factor, dark current, charge transfer efficiency, smear, readout rate, lag, readout noise and dynamic range. The radiation tolerance of MOS devices (needed, for example, in space applications) kept research in MOS technology active during these CCD-dominated years. Hitachi and Matsushita continued to use MOS technology for camcorder applications. [9], [10]
- **1985** Hitachi combines a MOS sensor with a horizontal three-phase CCD shift register or "Bulk Charge-transfer Device (BCD)". [11]
- **1989** VLSI Vision Ltd reports an integrated passive pixel sensor (PPS) array. [12]
- **1992** NASA's Jet Propulsion Laboratory (JPL) begins researching the active pixel sensor.
- **1993** JPL reports the first 128×128 active pixel array. [13]
- **1995** The 4T PPD active pixel is invented. [14]
- The first successful high-performance CMOS image sensor is demonstrated by JPL, AT&T and National Semiconductor.
- Canon releases the EOS-D30, their first CMOS-based DSLR, featuring an APS-C sensor designed in-house. [15]
- Canon releases the EOS-1Ds, their first CMOS-based DSLR featuring a full-frame sensor designed in-house. [15]
- Nikon releases the D2X, their first CMOS-based DSLR in the "DX" form factor. It features a 12MP sensor, designed by Nikon and manufactured by Sony. [16]
- **2007** Canon develops a 50MP CIS for DSC.
- **2007** Nikon releases the D3, their first full-frame CMOS-based DSLR. It features a 12MP sensor, designed by Nikon and manufactured by Renesas.
- **2008** Canon develops a 120MP CMOS sensor for DSC.
- **2008** Sony commercializes BSI-CIS for DSC.
- **2008** OmniVision follows shortly after, shipping a 1.4 µm pitch 1/3" BSI-CMOS sensor, fabbed by TSMC. [17]
- **2009** The first submicron pixel ($0.9\,\mu\text{m}$) is manufactured by Sony. [18]
- **2017** Sony announces at ISSCC'17 the industry's first 3-layer stacked BSI CMOS image sensor with DRAM for smartphones. It features 21.2MP, 4k video at 60 fps and 1080p video at 1000 fps. [19]

#### **2.2.** SILICON PHOTODETECTORS

Materials in the solid state are separated into three groups depending on their electrical conductivity: insulators, semiconductors and conductors. This is dictated by the band structure of the atoms comprising the material. Electrons bound to an atom can only possess specific levels of energy, which are grouped in energy bands and separated by a "forbidden" range, called the bandgap. As can be seen in Figure 2.1, in the case of conductors the two energy bands overlap (*a*), allowing electrons to easily move from the valence band to the conduction band. Conversely, in the case of insulators valence electrons cannot overcome the very large bandgap and therefore do not contribute to conduction (*b*). Semiconductors, as the name implies, are found in between the two aforementioned cases. Here, valence electrons may gain enough energy, for example from thermal excitation, to cross the bandgap and move from the valence to the conduction band (*c*).



Figure 2.1: Energy bands of a conductor (a), an insulator (b) and an intrinsic semiconductor (c)

Semiconductors exhibit a vast change in conductivity by introducing minute amounts of impurity atoms in the crystal, a process known as doping. It's this property that renders semiconductors the most essential material in manufacturing electronics. Silicon (14), one such semiconductor, comprises in various forms 25% of the Earth's crust, and growth of high quality silicon dioxide can be thermally achieved. Due to silicon's abundance and material characteristics it has been the primary semiconductor of choice for the microelectronics industry [22].

Silicon, belonging to group IV, has four valence electrons which form bonds with neighboring atoms resulting in a crystal. When doped with a group III or group V element, an acceptor or donor atom is produced. In the case of the acceptor atom, the three electrons leave an electron vacancy in the lattice, while in the donor's case, the fifth electron is not bound by the lattice, making it a free carrier. In an intrinsic semiconductor, an electron possessing an energy level equal to half the bandgap has equal probability of being in the valence or conduction band. This energy level is called the intrinsic Fermi level. Doping the semiconductor with an acceptor (p-type) results in a shift of the Fermi level away from the intrinsic level, towards the valence band. Respectively, doping with a donor (n-type) shifts the Fermi level towards the conduction band [21].

#### LIGHT ABSORPTION IN SILICON

When silicon gets illuminated, there is an energy transfer from the impinging photon to the material. Depending on the photon's energy level, one of three mechanisms is involved:

- E < 1MeV: Photoelectric effect dominates: In 1839 A. Becquerel discovered the photovoltaic effect, illustrating a close relationship between light and electricity. The photoelectric effect was first observed by Hertz [20] and later described by A. Einstein, awarding him the Nobel prize in Physics in 1921. The impinging photon undergoes an interaction with an absorber atom in which the photon completely disappears. In its place, energetic photoelectrons are ejected from their atomic binding. This results in free electrons and "holes", or rather electron vacancies. The two are commonly referred to as an electron-hole pair (EHP). This is the mechanism that is leveraged in order to fabricate photodetectors in silicon.
- E ∈ [1MeV 10MeV]: Compton scattering dominates: The incoming photon is deflected through an angle θ with respect to its original direction. This change of momentum results in a decrease of the photon's energy. This energy loss is manifested as a decrease in the photon's frequency, according to E=hv.
- E > 10MeV: Pair production dominates: At energies greater than 1022keV (the combined rest mass energy of an electron and a positron, i.e. twice 511keV), the photon, passing near the nucleus of an atom, is subjected to strong field effects from the nucleus and its energy may get converted into an electron-positron pair according to Einstein's mass-energy equivalence equation  $E = mc^2$ . The positron then annihilates with an electron from the crystal, producing two gamma-rays of 511keV.



Figure 2.2: Mechanisms of photon absorption versus energy

Photodetectors are used to convert signals from the optical to the electrical domain and find their main applications in digital imaging and high-speed communication systems. When photons with wavelengths around the 200-1200*nm* region of the electromagnetic spectrum interact with silicon, the photoelectric effect is expected almost exclusively as the absorption mechanism. Leveraging this matter-photon interaction, silicon photodetectors typically rely for their operation on a two-step process: first, the generation of light-induced electron-hole pairs, and second, their separation by means of an electric field and subsequent collection. The collected charge is finally used for the production of an electrical output signal. Impinging photons must have a sufficiently high energy to overcome the bandgap and be absorbed ( $E_{photon} = hv \ge E_g$ ). This imposes a fundamental minimum photon energy (i.e. maximum wavelength) that a silicon photodetector can detect. This can be calculated as follows:

$$E_{min} = hv_{min} = hc/\lambda_{max}$$
$$\Rightarrow \lambda_{max} = hc/E_{min}$$
$$\Rightarrow \lambda_{max} = hc/E_{g}$$

In the case of the bandgap of silicon ( $E_g = 1.14eV$ ), this value is calculated to be just under 1100nm (approximately 1090nm). For wavelengths nearing  $\lambda_{max}$ , an auxiliary phonon is required to complete the transition, since silicon is an indirect bandgap material. Hence the probability of absorption decreases significantly for wavelengths around this value. Since there is less chance of photon absorption within a given volume of the silicon substrate, the quantum efficiency reflects that – meaning that more photons are required to yield the same number of photoelectrons. As can be seen in the diagram of Figure 2.3, quantum efficiency (QE) decreases for longer wavelengths, tapering off to zero around  $\lambda_{max}$ . Photons with a wavelength greater than that have near-zero probability of being absorbed and thereby no photoelectrons are produced, resulting in zero QE.
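The cutoff relation $\lambda_{max} = hc/E_g$ can be checked numerically. The following is a minimal sketch; the function names are illustrative and not part of this work. With the 1.14 eV bandgap quoted above it evaluates to roughly 1088 nm, while the frequently quoted room-temperature value of 1.12 eV (an assumption, not taken from this text) would give about 1107 nm.

```python
# Cutoff wavelength of a photodetector from its bandgap: lambda_max = h*c / E_g
PLANCK_H = 6.62607015e-34      # Planck constant [J*s]
LIGHT_C = 2.99792458e8         # speed of light in vacuum [m/s]
ELECTRON_Q = 1.602176634e-19   # elementary charge [C], converts eV to J


def cutoff_wavelength_nm(bandgap_ev: float) -> float:
    """Longest wavelength (nm) whose photon energy still reaches the bandgap."""
    energy_j = bandgap_ev * ELECTRON_Q
    return PLANCK_H * LIGHT_C / energy_j * 1e9


def photon_energy_ev(wavelength_nm: float) -> float:
    """Photon energy E = h*c / lambda, expressed in eV."""
    return PLANCK_H * LIGHT_C / (wavelength_nm * 1e-9) / ELECTRON_Q


if __name__ == "__main__":
    for eg in (1.14, 1.12):  # value quoted in the text, and an assumed alternative
        print(f"E_g = {eg:.2f} eV -> lambda_max = {cutoff_wavelength_nm(eg):.0f} nm")
```

The small spread between the two results shows how sensitive the cutoff is to the assumed bandgap, which itself varies with temperature.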



Figure 2.3: Quantum efficiency versus wavelength in 4µm thickness BS (data courtesy of Caeleste) Remark: The oscillations seen are a result of Fabry-Pérot etaloning due to dissimilar refractive indices in the CMOS stack

Another parameter that is dependent on the impinging photon's energy, is the absorption depth in the semiconductor. It refers to the distance from the surface of the semiconductor material a photon of certain energy is most likely to be absorbed.

The photon flux at a depth *z*, possessing sufficiently high energy (E=hv) to be absorbed in the silicon, is given by:

$$\Phi(z) = \Phi_0 \cdot e^{-\alpha \cdot z}$$

Where  $\Phi_0$  is the impinging flux and  $\alpha$  the absorption coefficient, which is both material and photon energy dependent. Short wavelength (high energy) photons, such as UV and blue photons, are absorbed very close to the surface of the semiconductor, whereas for red and IR photons with lower energy this depth is much greater, in the order of micrometers. Taking silicon as an example in the plot of Figure 2.4, photons with wavelengths of approximately 400 nm are absorbed within tens of nanometers from the surface, while on the other extreme, photons with wavelengths of 1 µm-1.1 µm are absorbed at depths in the order of tens of micrometers from the surface [23].
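The exponential attenuation law above lends itself to a short numerical sketch. The helper names and the two absorption coefficients below are purely illustrative assumptions (they are not measured silicon data and are not taken from Figure 2.4); they only serve to show how the penetration depth $1/\alpha$ and the fraction absorbed in a layer of given thickness follow from $\Phi(z) = \Phi_0 e^{-\alpha z}$.

```python
import math


def remaining_flux_fraction(alpha_per_um: float, depth_um: float) -> float:
    """Phi(z) / Phi_0 = exp(-alpha * z)."""
    return math.exp(-alpha_per_um * depth_um)


def absorbed_fraction(alpha_per_um: float, thickness_um: float) -> float:
    """Fraction of the impinging flux absorbed within a layer of given thickness."""
    return 1.0 - remaining_flux_fraction(alpha_per_um, thickness_um)


def penetration_depth_um(alpha_per_um: float) -> float:
    """Depth at which the flux has dropped to 1/e of its surface value."""
    return 1.0 / alpha_per_um


if __name__ == "__main__":
    # Hypothetical absorption coefficients [1/um], chosen only for illustration.
    examples = {"strongly absorbed (short wavelength)": 10.0,
                "weakly absorbed (long wavelength)": 0.01}
    for label, alpha in examples.items():
        print(f"{label}: 1/alpha = {penetration_depth_um(alpha):g} um, "
              f"absorbed in 4 um = {absorbed_fraction(alpha, 4.0):.1%}")
```

For a real design, the coefficients would be taken from tabulated absorption data for the wavelength range of interest.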



Figure 2.4: Optical absorption coefficient and penetration depth as a function of wavelength for various semiconductors. (From [22])

In the absence of an applied or built-in electric field to separate the photogenerated electron-hole pairs, they will recombine and emit either light or heat. As previously mentioned, to detect the optical signal, the photogenerated carriers must be separated into holes and electrons and, in the majority of cases, collected (integrated) – to detect the signal efficiently, the free carrier recombination must be kept at a minimum as any recombined carrier is irrecoverably lost. In the following paragraphs the photodiode and pinned photodiode – the two most commonly used Si-based photodetectors – will be discussed, along with a brief explanation of why the Pinned Photodiode (PPD) is the de facto building block of the vast majority of digital CMOS imagers today [24].

• **Photodiode:** The photodiode, a type of planar p-n junction, is very widely used due to its performance as a light sensor and ease of manufacturing in standard CMOS processes. A cross-section of such a planar diode is shown in Figure 2.5 along with the band & potential diagrams. The fabrication of a p-n junction involves diffusing impurities (acceptors from group III and donors from group V) into the silicon, followed by their activation – typically via thermal annealing. This creates a large concentration of electrons in the donor (or n-type) region and of holes in the acceptor (or p-type) region. Due to these gradients, a diffusion current is established as electrons travel from the n-type to the p-type region and holes travel in the opposite direction. As mobile electrons move away from the n-type region, they leave behind positively charged ionized atoms, referred to as "space charge". The reciprocal happens in the p-type region. The charged atoms, being immobile due to the lattice structure, form a region known as the "space charge" or "depletion" region, where an electric field is present. The potential associated with this electric field is referred to as the "built-in potential" or  $V_{bi}$ . Consequently, the generated electric field results in a drift current of carriers, opposing the diffusion current. Thus, charge neutrality inside the diode is maintained.



Figure 2.5: Cross-section of n-/p-sub photodiode with band & potential diagrams

As previously mentioned, when an impinging photon possessing an energy E = hv (where *h* is Planck's constant and *v* the frequency) greater than the bandgap of silicon is absorbed inside the semiconductor material, an electron-hole pair is generated. Electrons and holes generated within the depletion region are swept to either the p- or n-type region due to the electric field present in the junction, resulting in a drift photocurrent through the junction from the n- to the p-type region. Electron-hole pairs generated outside the depletion region, but in its vicinity, diffuse to the depletion region and again are swept by the electric field present [22].

In order to produce a significant and easily measurable output signal, the photodiode is in most cases (with the exception of SPADs) operated as a photon flux integrator – a technique first reported by Weckler in [4]. In this mode, the diode is biased at a reverse voltage (below the breakdown voltage) and subsequently open-circuited. Under no illumination, the open-circuit voltage across the diode decays due to thermal generation of carriers, which results in a leakage current in the junction. This is known as "dark current", due to its presence even in the absence of light. Dark current is unwanted since it introduces additional noise, called "dark current shot noise", a result of the variance of thermally generated carriers under identical conditions [25]. When the diode is illuminated, a photocurrent is additionally established, with a magnitude proportional to the level of illumination, which results in a faster decay of the voltage across the diode:

$$\frac{dV_{out}}{dt} = \frac{I_{phot}}{C_{PD}(V)}$$

Figure 2.6 shows a model of the photodiode with its voltage-dependent junction capacitance  $C_{PD}(V)$  and the reset switch (RES), along with a diagram of the photodiode voltage  $V_{out}$  versus time. From 0 to  $t_1$  the diode is being reset via the RES switch. At  $t_1$  the reset switch is opened and the integration begins. As more charge is integrated, the voltage across the diode drops. Even at no illumination, the dark current will result in some voltage drop – this is called the "dark signal". At progressively increasing levels of illumination, the photocurrent established in the diode causes an increase in the slope of the response, as more charges are integrated per unit of time. At  $t_2$  the signal is sampled and the diode is subsequently reset again. Following that, the reset level is also sampled (at moment  $t_3$ ). The difference between the two samples constitutes the net photosignal.
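The integration cycle described above can be sketched numerically. The sketch below makes two simplifying assumptions that are not in the text: the junction capacitance is treated as constant rather than voltage dependent, and saturation of the diode is ignored. All component values are hypothetical and chosen only to give round numbers.

```python
def photodiode_voltage(t_s: float, v_reset: float, i_photo_a: float,
                       i_dark_a: float, c_pd_f: float) -> float:
    """Voltage across the open-circuited diode after integrating for t_s seconds.

    Assumes a constant capacitance c_pd_f, so the voltage drops linearly
    with the integrated charge (photocurrent plus dark current).
    """
    return v_reset - (i_photo_a + i_dark_a) * t_s / c_pd_f


def net_signal(v_reset_sample: float, v_signal_sample: float) -> float:
    """Difference between the reset sample (t3) and the signal sample (t2)."""
    return v_reset_sample - v_signal_sample


if __name__ == "__main__":
    V_RESET = 2.0      # reset level [V] (hypothetical)
    C_PD = 10e-15      # junction capacitance [F] (hypothetical, held constant)
    I_DARK = 1e-15     # dark current [A] (hypothetical)
    T_INT = 10e-3      # integration time t2 - t1 [s] (hypothetical)

    for i_photo in (0.0, 50e-15, 100e-15):  # increasing illumination
        v_sig = photodiode_voltage(T_INT, V_RESET, i_photo, I_DARK, C_PD)
        print(f"I_phot = {i_photo * 1e15:5.0f} fA -> "
              f"V_out(t2) = {v_sig:.3f} V, difference = {net_signal(V_RESET, v_sig) * 1e3:.1f} mV")
```

The first row of the output corresponds to the dark signal: even with zero photocurrent the dark current alone produces a small voltage drop.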



Figure 2.6: Simple photodiode model & integrating photodiode output signal sample

The response speed of the photodiode depends on three main parameters: *a*) the diffusion time of carriers outside the depletion region, *b*) the drift time of carriers inside the depletion region and *c*) the time for evacuation of charges from the photodiode.

• **Pinned Photodiode:** In order to eliminate the image lag *(i.e. incomplete charge evacuation)* present in the p-n photodiode, a p+np- photodiode structure was proposed by Teranishi in 1982 [28]. In 1984, the structure received the name "pinned photodiode" (a.k.a. PPD) in a paper published by Burkey et al. at Kodak [26]. In this paper the numerous advantages of the PPD are described. These include the increased charge capacity compared to the traditional photodiode, due to the additional p-n junction formed between the n-type region and the p+ pinning implant, and the shielding of the Si – SiO<sub>2</sub> interface by the aforementioned implant, thereby greatly reducing dark current. We thus obtain the structure illustrated in Figure 2.8, showing the buried n/p-sub junction as well as the p+ pinning implant at the surface, forming the p+/n junction.



Figure 2.7: Potential diagrams of a PD *(left)* and PPD *(right)* photodiodes (From [28], redrawn)

This back-to-back structure of two junctions results in some interesting properties that are of great value to the image sensor designer. Most notably, image lag can be greatly reduced. The main source of image lag in traditional photodiodes is attributed to the transfer gate entering a subthreshold regime towards the end of the charge transfer, as the potential of the diode  $\psi_{PD}$  drops from its initial value  $\psi_{initial}$  and reaches the subthreshold potential  $\psi_{subth}$  (see Figure 2.7). This results in incomplete charge evacuation of the diode during the transfer period, since the p-n junction is not fully depleted yet ( $\psi_{dep} > \psi_{subth}$ ); therefore some charges may not get transferred and remain on the junction.

On the other hand, the PPD structure may be fully depleted at a potential smaller than the subthreshold one ( $\psi_{PD} < \psi_{dep}$ ), as can be seen in the potential diagram of Figure 2.7. This depletion voltage is often called "pinning voltage" or  $V_{pin}$ . As a result, the entirety of the charge stored in the photodiode can be transferred quickly, within the transfer period and without the transfer gate entering into the subthreshold regime, resulting in near elimination of image lag. It should be noted, however, that other mechanisms such as potential pockets between the photodiode and transfer gate or charge spill-back at high illumination may also introduce image lag.

Furthermore, the PPD structure can be fully depleted, resulting in a thicker depletion region which provides an improvement in light sensitivity when compared to the traditional p-n photodiode. As mentioned in [28], the dark current of this photodiode is significantly lower, owing to two effects. Firstly, the p+ pinning layer serves to shield the buried n/p-sub junction from the generation-recombination centers present at the silicon-oxide interface. Secondly, the fact that the n/p-sub diode is buried results in a junction with greatly reduced surface defects, another contributor to dark current.



Figure 2.8: Simplified cross section of a PPD

This double diode structure also exhibits greater charge storage capacity compared to a simple p-n junction, resulting in a higher imager dynamic range. Finally, due to the complete charge transfer, the PPD allows for correlated double sampling which cancels or greatly reduces various noise contributors. The kTC noise of the readout can be completely eliminated and the source follower MOSFET's 1/f noise and offset can be decreased to a great extent. Since the PPD structure is completely depleted, kTC noise of the photodiode is entirely absent.

Even though compared to a simple p-n diode, the PPD is more complex to implement and performs comparatively worse at very high illumination levels, it should be evident by the benefits mentioned above why this structure is the preferred choice and hence finds its way into most CMOS image sensors today [29].

#### **2.3.** ELECTRONIC SHUTTERS

FROM the 4T pixel operation, detailed in Section 2.4.2, we can see that each row of pixels may be reset independently, allowing for purely electronic shuttering and eliminating the need for a mechanical shutter. Electronic shuttering has a number of advantages, such as allowing for very short exposure times, high-speed burst capturing and continuously variable shutter angle control since there is no physical rotating shutter. Additionally, the absence of complex and delicate mechanical parts makes for a fully solid-state camera system which can also be miniaturized to a much greater extent (e.g. for use in mobile phones) [27].



Figure 2.9: Rolling shutter principle

In the simplest variants of CIS however, be it PPS or APS, no memory elements are present in the pixel array. Consequently, pixel signals need to be read out of the pixel array and sampled on its periphery. Readout involves the column wires (typically one per column), so in order to avoid contention this readout operation must happen sequentially, row by row. Due to this sequential nature, this is called rolling shutter (RS). Although rolling shutter was the standard approach in the large majority of commercial CMOS imagers until recently, it has a noteworthy pitfall: it results in a slight temporal shift in the integration time of each row, as shown in the timing diagram of Figure 2.9. While this intraframe shift is perfectly acceptable for quasi-still photography, when imaging rapidly changing scenes or fast-moving targets, motion artefacts such as the "jello" effect and motion skew are produced. A few examples of motion skew and non-uniform exposure due to rolling shutter are illustrated in Figure 2.10.
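The row-by-row temporal shift and the resulting motion skew can be estimated with a small sketch. It assumes a constant row time and an object moving horizontally at constant speed; the function names and all numbers are hypothetical and serve only as an illustration.

```python
def row_exposure_windows(n_rows: int, row_time_s: float, exposure_s: float):
    """Return (start, end) of the integration window of every row.

    In rolling shutter, row k starts integrating k row times after row 0,
    while every row integrates for the same duration.
    """
    return [(row * row_time_s, row * row_time_s + exposure_s)
            for row in range(n_rows)]


def skew_pixels(n_rows: int, row_time_s: float, speed_px_per_s: float) -> float:
    """Horizontal displacement of a moving edge between the first and last row."""
    return (n_rows - 1) * row_time_s * speed_px_per_s


if __name__ == "__main__":
    N_ROWS = 1080         # number of rows (hypothetical)
    ROW_TIME = 10e-6      # time to read one row [s] (hypothetical)
    EXPOSURE = 1e-3       # integration time [s] (hypothetical)
    SPEED = 10_000.0      # object speed in the image plane [px/s] (hypothetical)

    windows = row_exposure_windows(N_ROWS, ROW_TIME, EXPOSURE)
    print(f"first row integrates from {windows[0][0] * 1e3:.2f} ms, "
          f"last row from {windows[-1][0] * 1e3:.2f} ms")
    print(f"skew between first and last row: "
          f"{skew_pixels(N_ROWS, ROW_TIME, SPEED):.1f} px")
```

A vertical edge moving at this speed would thus appear slanted by roughly a hundred pixels across the frame, which is the motion skew artefact described above.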



Figure 2.10: Various rolling shutter induced artefacts



Figure 2.11: Rolling versus Global shuttering

For applications where the aforementioned artefacts cannot be tolerated, like in high-speed imaging, an electronic global shutter (GS) is required. Also called "snapshot" shutter, global shuttering refers to the synchronous integration of the entire pixel array (Figure 2.12), thus eliminating any motion artefacts. In CCD-based sensors, global shutter is present "for free" due to the fundamental modus operandi of the sensor. Unfortunately, in CMOS image sensors with their memory-like column and row readout architecture, only rolling shutter is possible in a single pass readout when using a 4T pixel. However, due to the versatility of the CMOS process, placing some form of memory in-pixel is very much possible, allowing global shuttering at the cost of reduced fill factor. The first global shutter capable pixel using an in-pixel sample & hold capacitor as analog memory was proposed by Yadid-Pecht *et al.* in 1991 [31].



Figure 2.12: Global shutter principle

In section 2.4.2, a few reported global shutter capable pixel designs are shown. It is evident that in this work, due to the very high speed nature of the sensor, GS is the only option as far as shuttering is concerned. The pixel architecture developed and how it achieves global shutter and HDR operation will be discussed in section 3.3 in detail.

#### **2.4.** PIXEL TYPES

THE following section provides an overview of a few different pixel types. We distinguish two large categories, the Passive Pixel Sensor (PPS) and Active Pixel Sensor (APS), depending on whether an in-pixel amplifier is present or not.

#### **2.4.1.** PASSIVE PIXEL SENSORS (PPS)

The passive pixel or "1T" pixel is the simplest photodiode pixel and consists of the diode itself and a single transistor used as a select switch (Figure 2.13), which connects the diode to the column wire. Being the simplest pixel design, it offers great fill factor (FF), an advantage that is unfortunately countered by its poor noise performance, due to mismatch between the small pixel capacitance and the large column wire capacitance [29]. The operation of the 1T pixel is fairly straightforward. First, the select transistor is turned on and the photodiode is reverse biased through the column wire. After the transistor is turned off, the integration phase may begin. In the dark, the voltage on the junction decays over time due to the dark current of the diode. Under illumination, the photocurrent is directly proportional to the level of illumination, and so is the integrated charge [4]. By turning on the select transistor once again, the charge can be read out by means of a charge transimpedance amplifier (CTIA).



Figure 2.13: Structure of a 1T (PPS), 3T (APS) and 4T (PPD-APS) pixels

#### 2.4.2. ACTIVE PIXEL SENSORS (APS)

#### **3T PD APS PIXEL**

In order to improve on the performance of the passive pixel, Noble in 1968 proposed the addition of an in-pixel buffer, giving birth to the Active Pixel Sensor (APS) concept [7]. Noble's work led to the development of the three transistor (3T) pixel as we know it today (Figure 2.13). The detection cycle begins by turning on the RESET transistor, thereby resetting the photodiode to voltage VDD. By turning off the RESET transistor, the integration begins and photogenerated charge is integrated on the junction's capacitance, thereby reducing the voltage across it. The pixel is then read out by turning on the SELECT transistor. This allows a current source at the bottom of the column wire to bias the SF transistor which acts as a near-unity gain source follower, buffering the voltage of the photodiode to the column wire. Then, the voltage response can be measured on the column wire.

A commonly employed technique to reduce noise due to process and power supply variations is Double Sampling (DS) which can be performed as follows. After the integration and sample of the resulting voltage, the pixel is reset again and another readout operation is performed. By subtracting the signal and reset values, there is an improvement in the aforementioned noise contributors. However, it should be noted that due to the uncorrelated nature of the two samples, noise contributions such as kTC noise cannot be eliminated. An improved and well known variant of this technique, named Correlated Double Sampling (CDS) is possible in 4T pixels and will be discussed later.

In recent CIS designs, the 3T pixel has been phased out in favor of the 4T PPD pixel due to its superior performance.

#### **4T PPD APS PIXEL**

The four transistor (4T) pixel, built around a pinned photodiode (PPD), is the workhorse of modern CMOS image sensors. It retains the RES, SEL and SF transistors from the 3T design, but adds a fourth one, dubbed the 'transfer gate' (TG) as depicted in Figure 2.13. Its detection cycle begins with turning on the transfer gate and reset transistors. This results in the complete evacuation of all charge from the PPD as well as the floating diffusion (FD) node and resets the pixel. Subsequently, the transfer gate is turned off, allowing for charge integration on the PPD. At the end of the integration time, the reset transistor is turned on in order to reset the floating diffusion node and the resulting reset level is sampled via the source follower for the purposes of CDS operation. Following that, the transfer gate is turned on, allowing transfer of all signal charges under the TG and into the FD node, which results in a drop of its potential. Once the transfer is complete, the transfer gate is turned off again and the signal level is now sampled via the source follower. It should be noted that the TG pulse voltage, the doping profile under the transfer gate and the FD potential must cause a monotonic increase in potential  $\psi_{PPD} < \psi_{TG} < \psi_{FD}$  (see Figure 2.7) from the PPD to FD to allow complete transfer of all signal carriers. Any carriers under TG at the end of the transfer should be subsequently transferred to FD at the end of the pulse period and not spill back to the PPD [24].

This configuration possesses a few advantageous properties. For one, the reset level on the FD node can be sampled prior to the signal value, resulting in the two samples being correlated. As mentioned above, this means that CDS is possible and, apart from removing the noise contributions that DS does, it can also fully eliminate correlated noise contributors such as kTC, as well as reduce source-follower 1/f noise and residual offset. Also, by virtue of the fact that signal charges are read out from the floating diffusion node and not the photodiode itself, the diode capacitance is decoupled from the FD capacitance, allowing for much higher charge to voltage conversion gain (CVF) without having to sacrifice charge storage capacity of the photodiode. Another benefit of using a PPD-based pixel is that the photodiode exhibits significantly lower dark current due to the heavily doped p+ pinning implant. Due to all the reasons mentioned, a well-designed 4T pixel built in a CIS-optimized process can achieve noise levels down to a few electrons.
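The difference between DS and CDS with respect to kTC noise can be illustrated with a small Monte Carlo sketch. It is a deliberately reduced model under stated assumptions: each reset operation leaves an independent Gaussian offset on the sense node, and all other noise sources (source follower, 1/f, shot noise) are left out. The reset level, signal swing and noise magnitude are hypothetical values.

```python
import random
import statistics


def simulate_ds_vs_cds(n_frames: int, signal_v: float, ktc_sigma_v: float,
                       seed: int = 0):
    """Return (std of CDS output, std of DS output) over n_frames readouts."""
    rng = random.Random(seed)
    v_reset_nominal = 2.0  # nominal reset level [V] (hypothetical)
    cds_out, ds_out = [], []
    for _ in range(n_frames):
        # One reset event; its kTC offset stays on the node until the next reset.
        n1 = rng.gauss(0.0, ktc_sigma_v)
        reset_before = v_reset_nominal + n1
        signal_sample = reset_before - signal_v  # signal charge lowers the node

        # CDS: the reset level of the SAME reset event is subtracted.
        cds_out.append(reset_before - signal_sample)

        # DS: the pixel is reset AGAIN after the signal sample, so the
        # reference carries a new, uncorrelated kTC offset.
        n2 = rng.gauss(0.0, ktc_sigma_v)
        reset_after = v_reset_nominal + n2
        ds_out.append(reset_after - signal_sample)
    return statistics.pstdev(cds_out), statistics.pstdev(ds_out)


if __name__ == "__main__":
    cds_std, ds_std = simulate_ds_vs_cds(10_000, signal_v=0.1, ktc_sigma_v=1e-3)
    print(f"CDS residual noise: {cds_std * 1e6:.3f} uV")
    print(f"DS residual noise:  {ds_std * 1e6:.1f} uV")
```

In this model the CDS output is noise free up to floating-point rounding, whereas the DS output carries the reset noise of two independent reset events, so its standard deviation approaches $\sqrt{2}$ times the single-reset value.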

#### **GLOBAL SHUTTER PIXELS**

As discussed in Section 2.3, global shutter operation involves synchronous integration of the entire pixel array of the sensor and subsequent row-by-row readout of the resulting signal. There exists a wide variety of pixel topologies that achieve this operation, the simplest one being the ubiquitous 4T pixel, shown in Figure 2.14a.



Figure 2.14: Some types of GS-capable pixels (From [30], redrawn)

This four-transistor configuration allows for the beginning of the next exposure while the previous one is being read out, known as integrate-while-read operation (IWR), but only with uncorrelated double sampling (DS), not in-pixel correlated double sampling (CDS). It should also be noted that at high illumination levels, there is "blooming" *(i.e. charge leakage)*, due to the lack of charge draining infrastructure in the photodiode. This means that after full well is reached, charges will overflow from the pinned photodiode (PPD) into the floating diffusion, unimpeded by the transfer gate. A variant of this topology, as seen in Figure 2.14b, supports global flushing of the photodiode, resolving the blooming issue, but can be operated only in integrate-then-read (ITR) mode. With the addition of one transistor, the 5T pixel, shown in Figure 2.14c, allows for independent resetting of the photodiode and storage node, giving it the ability of IWR and suitability for high illumination levels. Nonetheless, the noise performance of all the mentioned pixel topologies is relatively poor in global shutter operation due to the presence of kTC noise, since CDS is not possible.

To mitigate this, an additional transistor can be added in the form of an in-pixel sample and hold, resulting in a 6T pixel, shown in Figure 2.14d. The advantage is that one can perform CDS to greatly improve the pixel's noise performance; however, only ITR can be performed with this design. This topology, similarly to the 4T IWR pixel, still suffers from poor performance at high illumination, which is often the case with high speed, global shutter imagers. With the further addition of a transistor, the 7T pixel design (Figure 2.14e) is capable of IWR and CDS, as well as being able to handle high illumination without the issue of charge overflow from the photodiode, since the photodiode reset transistor  $RS_{PD}$  can be used as an overflow drain [30].
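The trade-offs between the global shutter topologies discussed above can be collected in a small lookup structure. This is only a restatement of the properties given in the text; the class, field and function names are illustrative, and the boolean encoding is a simplification of what are in practice gradual trade-offs.

```python
from typing import List, NamedTuple


class GsPixel(NamedTuple):
    name: str
    iwr: bool                # integrate-while-read supported
    cds: bool                # in-pixel correlated double sampling supported
    high_illumination: bool  # copes with high illumination (no PPD overflow)


# Properties as stated in the text for the topologies of Figure 2.14.
GS_PIXELS = [
    GsPixel("4T (Fig. 2.14a)", iwr=True, cds=False, high_illumination=False),
    GsPixel("variant with global flush (Fig. 2.14b)", iwr=False, cds=False,
            high_illumination=True),
    GsPixel("5T (Fig. 2.14c)", iwr=True, cds=False, high_illumination=True),
    GsPixel("6T (Fig. 2.14d)", iwr=False, cds=True, high_illumination=False),
    GsPixel("7T (Fig. 2.14e)", iwr=True, cds=True, high_illumination=True),
]


def candidates(need_iwr: bool, need_cds: bool,
               need_high_illumination: bool) -> List[str]:
    """Names of the topologies that satisfy every requested requirement."""
    return [p.name for p in GS_PIXELS
            if (p.iwr or not need_iwr)
            and (p.cds or not need_cds)
            and (p.high_illumination or not need_high_illumination)]


if __name__ == "__main__":
    print(candidates(need_iwr=True, need_cds=True, need_high_illumination=True))
    print(candidates(need_iwr=False, need_cds=True, need_high_illumination=False))
```

Requesting IWR, CDS and high-illumination capability together leaves only the 7T design, in line with the discussion above.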

#### **2.5.** TECHNOLOGY SCALING

Everyone involved in silicon design and manufacture is familiar with Moore's law and how, through the evolution of CMOS technology nodes, smaller feature sizes have allowed for integrating progressively more transistors on a chip. While this effect does translate to CMOS image sensors, the fact that they have to interact with light introduces certain particularities in how CIS feature size and pixel pitch have evolved. Figure 2.15, reproduced from [29], illustrates the evolution over time of the minimum feature size according to the ITRS, the technology node used by CIS and the pixel size.



Figure 2.15: CIS Technology Scaling From [29]

A. Theuwissen, in [29], highlights three noteworthy remarks about the subject. Firstly, it is evident that the technology node used to manufacture image sensors consistently lags behind cutting edge nodes reported by the International Technology Roadmap for Semiconductors (ITRS). This is attributed to the fact that these processes are used to manufacture digital circuits and are therefore incompatible with the analog nature of image sensors due to high leakage, lower light sensitivity, increased device noise etc. Secondly, despite this offset, CIS technology nodes scale at a near identical rate to standard digital CMOS processes. Finally, pixel pitch – a parameter that greatly influences, among others, the imager's cost and camera volume – scales with the technology node used with a ratio of approximately 20. While reducing pixel size has commercial benefits for image sensor manufacturers, pixel size reduction results in degradation in nearly all performance characteristics of the pixel, both optically and electrically. These include lowering of the dynamic range, degradation of the signal-to-noise ratio, reduction of the depth of field and more.

#### **2.6.** TYPICAL CIS FLOORPLAN

The simplest APS CMOS image sensor floorplan consists of the pixel array, row and column scanning and decoder circuits, column amplifiers and a single chip-level ADC, as shown in Figure 2.16a. The photodiode charge is converted into a voltage by the in-pixel source follower amplifier, read out via the column wire and amplified. Finally, all the columns are multiplexed and output via a single (on- or off-chip) ADC. This architecture exhibits high area and power efficiency and low fixed-pattern noise due to all pixel values being read out by the same ADC. However, the single output limits the resulting frame rate. A popular variant which greatly improves the imager's output speed is the usage of column-level analog outputs or on-chip ADCs, as shown in Figure 2.16b. The tradeoff here is that imperfect matching between column circuits will introduce additional fixed-pattern noise, along with higher chip area and power consumption. Finally, the ADCs can be implemented inside the pixel, thereby moving the digitization of the signal as early as possible in the signal chain, as shown in Figure 2.16c. While this is an interesting concept, the advanced level of 3D integration that is required to manufacture such sensors, the effects of limited area on ADC performance and the low pixel fill-factor mean that the digital pixel sensor (DPS) is still not used in mainstream CIS designs.



Figure 2.16: Floorplan of different ADC configurations

Thanks to the evolution of CMOS technology, it has become possible to manufacture entire camera Systems-on-Chip (SoC), integrating the imaging core, ADCs, digital image post-processing, sophisticated digital interfaces and more on the same chip. However, the core floorplan is in most cases quasi-invariant, comprising the pixel array, a Y addressing circuit to perform row-by-row operations, an X scanning circuit to scan the imaging plane in the horizontal direction and the column/readout circuits.

#### **2.7.** COMMON CIS METRICS

IN the following section the most common image sensor specific performance metrics will be discussed, along with their influence on the resulting image.

- **Resolution (Pixel Count):** In a general context, image resolution refers to the amount of spatial detail present in an image. However, in the digital imaging context, the term is often used interchangeably with pixel count, referring to the number of pixels the imager uses, typically expressed as an integer pair in the horizontal and vertical direction respectively (e.g. 1920H x 1080V for Full HD). Quoting the overall number of pixels of an imager is also a popular alternative, especially in the consumer space. The typical unit employed is the Megapixel (MP), equivalent to one million pixels. When digital imaging started becoming ubiquitous in the consumer segment, camera marketers alluded to the megapixel being the be-all and end-all metric of image quality, sparking the so-called "megapixel race". A notable case is the Nokia Lumia 1020, released in 2013, featuring a 41MP 2/3" BSI sensor. It will hopefully become clear to the reader of this thesis, however, that pixel count is only one of the many non-orthogonal factors that define the final image quality an imager produces.
- **Frame Rate:** Frame Rate (FR) is defined as the maximum number of full-resolution images the imager can produce per second, typically expressed in frames per second (fps). This is what is casually referred to as the "speed" of a sensor. It should be noted that multiple "passes" might be required for a complete image frame, depending on the mode of operation and pixel topology. A "pass" is defined as the readout of the pixel array and subsequent output of the read data. A "sample" refers to settling one signal on the column wire and sampling it, on-chip, on a capacitor. For example, a 4T pixel can be operated in various modes:
  - *One pass and one sample:* RS without CDS. The signal is settled on the column and read out in a rolling manner.
  - *One pass and two samples:* RS with on-chip CDS. The reset signal is settled on the column and sampled, the photodiode transfer is performed and the signal is again settled on the column. Then the CDS is performed and the resulting signal is read out.
  - *Two passes and one sample per pass:* RS and GS possible with off-chip CDS. The reset signal is settled on the column and read out. Then, the signal is also settled and read out and the off-chip CDS is performed.

In effect, this means that the resulting frame rate also depends on the pixel type, the use of CDS or not and the type of shutter used.

In CMOS sensors, besides the full-resolution frame rate, additional specifications are often given for reduced resolutions at higher frame rates. This cropped area of the full frame is called Region-of-Interest (ROI) and is made possible by the ability of CMOS sensors to perform random access readout of pixels. The reason that a higher frame rate can be achieved at reduced pixel count is that, typically, the primary speed bottleneck in imagers does not originate from producing or sampling image data on-chip, but rather from transferring the captured data off-chip (and/or converting to a digital output). Thus, by reducing the number of pixels that are employed to image a scene and increasing the frame rate, the data rate remains unaltered. This feature is available on most CMOS imagers, allowing the user to trade off size of field-of-view (or quality) for image acquisition speed.
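The trade-off between ROI size and frame rate can be illustrated with a short sketch. It assumes that the off-chip data rate is the only bottleneck and ignores any fixed per-frame or per-row overhead; the function name and the numbers are hypothetical and do not describe a specific sensor.

```python
def roi_frame_rate(full_width, full_height, full_fps, roi_width, roi_height):
    """Estimate the ROI frame rate at a constant pixel data rate.

    Assumption: the output data rate is the only bottleneck, so the
    number of pixels read out per second stays the same.
    """
    pixel_rate = full_width * full_height * full_fps  # pixels per second
    return pixel_rate / (roi_width * roi_height)


# Hypothetical example: a 1920 x 1080 sensor running at 100 fps,
# cropped to a 960 x 540 region of interest (one quarter of the pixels).
print(roi_frame_rate(1920, 1080, 100, 960, 540))
```

Under these assumptions, reading a quarter of the pixels allows four times the frame rate. In a real imager the gain is usually smaller, since row addressing and exposure overheads do not scale with the ROI.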

- **Fill Factor:** Fill factor (FF) is defined as the ratio of the pixel's photosensitive area to the total pixel area. Front-side illuminated (FSI) CMOS sensors always have an FF smaller than unity, because part of the total pixel area is occupied by the pixel's transistors and metal interconnects. However, backside illuminated (BSI) CMOS sensors and CCD sensors achieve a fill factor nearing 100%.



Figure 2.17: Pixel top view showing reduced FF

The most prevalent solution to the reduced fill factor is the use of microlenses (Figure 2.19), which help focus the incoming light onto the area of the photodiode, increasing the effective FF. Intuitively one would think that the use of microlenses in backside illuminated sensors is moot due to their near-unity fill factor; in practice, however, they are implemented in most BSI sensors. The reason is that microlenses also help compensate for vignetting. Vignetting, a term stemming historically from film photography, refers to the reduction of brightness in the periphery of an image due to limitations in the lenses of that era. In CIS, a similar effect may arise as a consequence of lower photon absorption in the periphery of the imaging plane, due to a large chief ray angle of the incoming light.

In addition to the increased light absorption in the pixel due to a steeper light ray angle, microlenses also provide a reduction of interpixel optical crosstalk. This would otherwise result in chromatic aberration away from the center of the pixel array, especially in the case of red light, due to its large absorption depth in the silicon. The use of microlenses as well as the optical crosstalk effect are illustrated in Figure 2.18.



Figure 2.18: Use of microlenses in BSI for vignetting compensation



Figure 2.19: Microscope image of an FSI sensor (Sony IMX038, $5.6\,\mu$m pixel pitch) showing partially removed microlenses, exposing the Bayer pattern CFA underneath

- **Quantum Efficiency:** As seen in Section 2.2, silicon photodetectors rely on the photogeneration of electron-hole pairs and their subsequent separation to avoid recombination. If the photon's energy is not sufficiently high or the photon is not absorbed, it traverses the photosensitive volume and is ultimately lost. Likewise, if recombination occurs, the photon is not detected. Quantum efficiency is defined as the product of the probability that an impinging photon generates an electron-hole pair and the probability that the pair does not recombine, but is separated and subsequently detected. Simply put, QE is the ratio between the number of electrons that contribute to the output signal and the number of photons falling on the pixel.

- **Modulation Transfer Function:** Image sensors detect light intensity variations not only in the temporal but also in the spatial domain. The resolution of this spatial sampling is dependent primarily on the number of pixels available and the overall imager size. Resolution is defined as a measure of the highest spatial frequency that can be recorded at a specific contrast for a sinusoidal input. Since contrast refers to a ratio of values, two pixel values are required, therefore resolution is expressed in line-pairs per millimeter (lp/mm) [23].



Figure 2.20: Decrease of MTF with increasing spatial frequency

In the ideal case, the highest MTF achievable by an imager at the Nyquist frequency is approximately 64%. This is due to the sinusoidal input signal and the square nature of the ideal sampling pixel:  $(A_{bright} - A_{dark})/(A_{bright} + A_{dark}) = 2/\pi$ .



Figure 2.21: Sampling a sinusoidal input with ideal pixels at Nyquist frequency  $(1/f_{in} = 2Pitch_{pixel})$ 
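The $2/\pi$ figure can be checked numerically. The sketch below, using hypothetical names, averages a fully modulated sinusoid with a period of two pixel pitches over ideal square pixels aligned with its peaks and troughs, and compares the resulting contrast with the analytic value. It assumes a 100% fill factor and no optical blur.

```python
import math


def nyquist_mtf(samples_per_pixel=20_000):
    """Contrast of ideal square pixels sampling a sinusoid at Nyquist."""
    pitch = 1.0             # pixel pitch (arbitrary units)
    period = 2.0 * pitch    # input period at the Nyquist frequency

    def pixel_mean(center):
        # Midpoint-rule average of 1 + cos(2*pi*x/period) over one pixel
        total = 0.0
        for i in range(samples_per_pixel):
            x = center - pitch / 2 + (i + 0.5) * pitch / samples_per_pixel
            total += 1.0 + math.cos(2.0 * math.pi * x / period)
        return total / samples_per_pixel

    a_bright = pixel_mean(0.0)    # pixel centered on a peak
    a_dark = pixel_mean(pitch)    # neighboring pixel centered on a trough
    return (a_bright - a_dark) / (a_bright + a_dark)


print(nyquist_mtf(), 2.0 / math.pi)
```

Because the input has unity modulation, the contrast of the two pixel values equals the MTF at that frequency. Shifting the pixels relative to the sinusoid lowers the measured contrast, which is why $2/\pi$ is an upper bound.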

- **Full Well Capacity:** Full well capacity refers to the number of electrons that can be contained in the potential well of the photodiode. As discussed in Section 2.8, the full well capacity is intimately linked with the dynamic range that the imager can achieve. The top of the range is set by the well capacity and the bottom by the noise floor. Typical FWC values range from approximately 10 ke<sup>-</sup> to 1 Me<sup>-</sup>.
- **Frontside/Backside Illumination:** The CMOS process involves step-by-step manufacturing of semiconductor devices on a bulk silicon wafer. The result is a thick layer of bulk silicon that mostly acts as a mechanical support, followed by a thin region where the active devices are located. On top of these, the polysilicon layer is deposited, followed by the metal interconnect layers at the very top. Until recently, CMOS image sensors were operated in the same orientation as the manufacturing process occurred. Light enters at the top of the "stack", passes the interconnects and goes on to be detected in the photodiode. Due to the direction of illumination, this is called frontside illumination or FSI. While optimization steps can be taken to improve the performance of such a structure (such as careful metal layout to create a "funnel" shaped structure to guide light towards the diode, special steps in the process to fabricate "light pipes" and frontside thinning), it remains suboptimal. A significant number of the photons that arrive at the surface of the sensor are lost due to scattering and absorption on their way to the photodiodes, and the high aspect ratio opening in the metal interconnects results in poor angular response, exacerbating vignetting.



Figure 2.22: Cross-section of FSI and BSI configurations

The superior solution involves manufacturing the image sensor much in the same way as in FSI, but illuminating it from the back side, giving the technique its name, backside illumination or BSI. Before any light can reach the photodiodes, the thick layer of a few hundred microns of bulk silicon must be thinned down to a few microns, a process known as backside thinning. While the concept is not new [32], developing methods to perform backside thinning uniformly and reliably at large scale is what only recently allowed BSI to be implemented in consumer devices. The result is a near-unity fill factor, since the metal interconnects are now located beneath the diodes, improved quantum efficiency, better blue and UV spectral response as well as improved angular response, since the photodiodes are located very close to the sensor's surface. Two drawbacks of BSI are increased parasitic light sensitivity (PLS), since the active devices are no longer shielded by the metallization layers, and increased cost of production.

- **Dynamic Range:** The dynamic range is defined as the ratio of the brightest over the dimmest object that is detectable:

$$DR = 20 \log_{10}(N_{sat}/N_n) \ [dB]$$

where $N_{sat}$ is the maximum number of carriers that can be measured (limited by the full well capacity, FWC) and $N_n$ the minimum number of carriers detectable (limited by the noise floor). In the above equation the two expressions are equivalent since the signal power is proportional to the impinging photon flux.

It is often the case that we wish to image a scene with a very wide range of illumination (in the order of 100dB). The human eye is capable of an impressive dynamic range of approximately 90dB, while CMOS APS sensors can only deliver a range of about 40-80dB in a single exposure. This has led to the emergence of various creative techniques to achieve higher dynamic range.

The most common ways to boost dynamic range are lowering the noise floor, expanding the charge storage capability of the pixel, using a variable conversion gain, altering the charge integration time and creating a non-linear photoresponse.
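As a minimal sketch of the dynamic range definition above, the following snippet evaluates $DR = 20\log_{10}(N_{sat}/N_n)$ for two hypothetical pixels. The full well capacities and noise floors are illustrative values only.

```python
import math


def dynamic_range_db(n_sat, n_noise):
    """Dynamic range in dB from full well capacity and noise floor (both in e-)."""
    return 20.0 * math.log10(n_sat / n_noise)


# Hypothetical pixel: 10 ke- full well with a 10 e- noise floor (about 60 dB)
print(dynamic_range_db(10_000, 10))

# Lowering the noise floor to 2 e- and doubling the well capacity
# extends the range at both ends (about 80 dB)
print(dynamic_range_db(20_000, 2))
```

The second case shows the two most direct routes listed above acting together: a lower noise floor and a larger charge storage capability.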

- **Signal to Noise Ratio:** The signal to noise ratio is defined as the ratio of the wanted signal over the noise at this input level:

$$SNR = 20 \log_{10}(N_{sig}/N_n) \ [dB]$$

Two types of noise components contribute to SNR degradation: photon shot noise and the electronic readout noise. At low illumination, where few photons are captured by the pixel, photon shot noise is lower than the device noise. The SNR is therefore limited by the electronic noise. As the illumination is increased, so is the photon shot noise, due to its square-root dependence on the input signal. Since device noise is independent of illumination, the SNR of the imager is then limited by shot noise.
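The two regimes can be illustrated with a small sketch. It assumes that shot noise and readout noise are uncorrelated and therefore add in quadrature, with the shot noise equal to the square root of the signal; the function names and the 5 e<sup>-</sup> readout noise are hypothetical.

```python
import math


def snr_db(n_sig, read_noise):
    """SNR in dB for a signal of n_sig electrons.

    Assumption: photon shot noise (sqrt(n_sig)) and readout noise
    (read_noise, in e- RMS) are uncorrelated and add in quadrature.
    """
    total_noise = math.sqrt(n_sig + read_noise ** 2)
    return 20.0 * math.log10(n_sig / total_noise)


def crossover_signal(read_noise):
    """Signal level (e-) at which shot noise equals readout noise."""
    return read_noise ** 2


# Hypothetical readout noise of 5 e- RMS
for n_sig in (10, 25, 100, 1_000, 10_000):
    print(n_sig, round(snr_db(n_sig, 5.0), 1))
print("crossover at", crossover_signal(5.0), "e-")
```

Below the crossover the SNR grows by roughly 20 dB per decade of signal (readout-noise limited), while above it the growth slows to roughly 10 dB per decade (shot-noise limited).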

- **Conversion Gain:** A CMOS pixel is a device that converts a signal from the optical domain (incident photon flux) into the electrical domain (typically voltage). Ideally, there is a perfectly linear relation between optical and electrical signal. The term conversion gain refers to the coefficient between output signal produced and photocarriers generated. The term is often used interchangeably with charge-to-voltage factor (CVF) and typical units are μV/e<sup>-</sup>. It is of great interest to the CIS designer to maximize the CG of an imager in order to achieve better SNR and low-noise performance. In the typical CMOS 4T pixel, CG depends on the floating diffusion's capacitance and all associated parasitic capacitances such as the source follower's input capacitance.
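As a rough illustration of the dependence on capacitance, the conversion gain at the floating diffusion node can be estimated as $q/C_{FD}$, where $C_{FD}$ is the total node capacitance including parasitics. The sketch below uses a hypothetical capacitance value and ignores the source follower gain, which is slightly below unity and reduces the gain seen at the column.

```python
Q_E = 1.602176634e-19  # elementary charge in coulombs


def conversion_gain_uv_per_e(c_fd_farads):
    """Conversion gain at the floating diffusion node in uV per electron.

    Assumption: CG = q / C_FD, with C_FD the total node capacitance
    (floating diffusion plus parasitics); source follower gain is ignored.
    """
    return Q_E / c_fd_farads * 1e6


# Hypothetical total floating diffusion capacitance of 1.6 fF
print(conversion_gain_uv_per_e(1.6e-15))  # roughly 100 uV/e-
```

Halving the node capacitance doubles the conversion gain, which is why parasitic capacitances on the floating diffusion receive so much attention during pixel layout.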

# **2.8.** SOURCES OF NON-IDEALITY IN CIS

Image sensors, like all circuits, particularly analog ones, exhibit many different deviations from their designed theoretical operation due to a plethora of mechanisms. In imager jargon, these non-idealities are all categorized as noise, regardless of whether they strictly fit the definition. Some noise contributors are fundamental, inherent to the quantum nature of light, and cannot be overcome. Some can be suppressed by design optimization or calibrated for, and others are almost exclusively dependent on the quality and tolerances of the CMOS manufacturing process. Thus, a thorough understanding of the inner workings and subtleties of imager noise, as well as how it is influenced by circuit design, is of significant importance to the CIS designer.

Image sensor noise can be divided into two large categories, temporal noise and spatial (fixed-pattern) noise. Temporal noise stems from time-varying processes, such as thermal excitation, device 1/f noise, EMI etc., causing fluctuations in the pixel's output signal. In contrast, spatial noise originates from time-invariant non-uniformity in the manufacturing of the devices, from pixel to pixel and column to column, ADC matching and others. The major noise contributions in CIS are summarized in Table 2.1.

|                                                        | Temporal Noise                                                    | Spatial Noise                                          |
|--------------------------------------------------------|-------------------------------------------------------------------|--------------------------------------------------------|
| Quasi-invariant with $T$, $t_{int}$ and illumination   | Device noise such as kTC, 1/*f*, temporal row noise, RTS, ADC noise | Column and row FPN, cosmetic flaws                     |
| Dependent on $T$, $t_{int}$                            | Dark current shot noise (DCSN)                                    | Dark signal non-uniformity (DSNU)                      |
| Signal dependent                                       | Photon shot noise (PSN)                                           | Random and column photo-response non-uniformity (PRNU) |

Table 2.1: Overview of the different noise contributors

Therefore, the total output noise present in a pixel can be expressed as a sum of all the uncorrelated noise sources present:

$$\overline{v_{pix}^2} = \overline{v_{PSN}^2} + \overline{v_{kTC}^2} + \overline{v_{thermal}^2} + \overline{v_{1/f}^2} + \overline{v_{DCSN}^2}$$

where $v_{PSN}$ is the photon shot noise, $v_{kTC}$ is the kTC (or reset) noise, $v_{thermal}$ and $v_{1/f}$ are the MOSFET thermal and 1/f noise respectively and $v_{DCSN}$ the dark current shot noise. By dividing the pixel's voltage noise by the CVF (Charge to Voltage Factor), we can obtain an equivalent noise in terms of electrons on the photodiode. As with input-referring noise in amplifiers, this allows noise performance comparison across different pixels of varying architectures and gains.
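The quadrature sum and the conversion to input-referred electrons can be sketched as follows. The noise voltages and the CVF are hypothetical values chosen only to make the arithmetic easy to follow.

```python
import math


def total_noise_rms(*sources_rms):
    """RMS sum of uncorrelated noise sources (root of the summed variances)."""
    return math.sqrt(sum(v ** 2 for v in sources_rms))


def input_referred_electrons(v_noise_rms, cvf_v_per_e):
    """Refer a voltage noise back to electrons on the photodiode."""
    return v_noise_rms / cvf_v_per_e


# Hypothetical contributions in volts RMS: kTC, thermal and 1/f noise
v_total = total_noise_rms(300e-6, 400e-6, 120e-6)

# Hypothetical charge-to-voltage factor of 100 uV/e-
print(v_total, input_referred_electrons(v_total, 100e-6))
```

Because the variances add, the largest contributor dominates the total: removing the smallest source above changes the result only marginally.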

#### **Temporal Noise**

- **Photon Shot Noise:** Shot noise originates from the very nature of light. It thus cannot be avoided and represents the fundamental lowest noise limit an ideal imaging system can achieve. The number of photons arriving at the detector, their timing and the number of generated electron-hole pairs are all stochastic processes. The PSN on the signal can therefore be described using a Poisson distribution:

$$Q_{noise} = \sqrt{Q_{sig}}$$

where  $Q_{noise}$  is the PSN and  $Q_{sig}$  the signal (in  $e^-$ ). In the ideal case of complete absence of other noise contributors, the imager would be photon shot-noise limited, exhibiting a signal to noise ratio (SNR) dependent only on the input signal:

$$SNR_{PSN} = \frac{Q_{sig}}{Q_{noise}} = \sqrt{Q_{sig}}$$

The above equation implies that even in the case of the ideal imager, if one wishes to increase the SNR, simply more photons need to be captured and converted into electron-hole pairs. Unfortunately, there is no way of circumventing this fundamental limitation. For the CIS designer, this has two implications. First, if the photon shot noise is to be reduced, the pixel's fill factor (FF) and the quantum efficiency (QE) need to be maximized. Second, given an SNR specification, the minimum pixel full-well capacity (FWC) is a priori inferred by that specification.
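The second implication can be made concrete with a short sketch. Assuming a shot-noise-limited pixel, so that $SNR = \sqrt{Q_{sig}}$, and using the $20\log_{10}$ convention for decibels, the minimum full-well capacity follows directly from the SNR specification. The function name is hypothetical.

```python
def min_full_well(snr_db):
    """Minimum FWC (in e-) needed to reach a peak SNR given in dB.

    Assumption: the pixel is shot-noise limited at saturation,
    so SNR = sqrt(Q_sig) and Q_sig = SNR^2.
    """
    snr_linear = 10.0 ** (snr_db / 20.0)
    return snr_linear ** 2


# A 40 dB peak SNR requires on the order of 10 ke-,
# and every additional 10 dB costs a factor of ten in well capacity.
for target in (30, 40, 50):
    print(target, "dB ->", round(min_full_well(target)), "e-")
```

Any readout noise raises the required capacity further, so the values above are lower bounds.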

- **Thermal (or Johnson-Nyquist) Noise:** Thermal noise is present in all resistors and resistive devices, such as the resistive channel between the drain and source of a MOSFET. Its root cause is carrier thermal agitation in the conductor and it exhibits a spectral density that is approximately constant up to extremely high frequencies. It can be modeled either as a series voltage source or a parallel current source as follows:

$$v_n = \sqrt{4kTRB} \quad \text{or} \quad i_n = \sqrt{4kTB/R} \quad \text{(in the case of a resistor } R\text{)}$$

$$i_n = \sqrt{4kT\frac{2}{3}g_mB} \quad \text{(in the case of a MOSFET in saturation)}$$

where k is Boltzmann's constant, T the absolute temperature, R the resistance, B the bandwidth and  $g_m$  the MOSFET's transconductance.
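To give a feeling for the magnitudes involved, the expressions above can be evaluated directly. The component values below (a 1 kΩ resistor, a transconductance of 100 µS, a 1 MHz bandwidth and 300 K) are hypothetical.

```python
import math

K_B = 1.380649e-23  # Boltzmann's constant in J/K


def resistor_noise_voltage(r, t, b):
    """RMS thermal noise voltage of a resistor: sqrt(4kTRB)."""
    return math.sqrt(4.0 * K_B * t * r * b)


def resistor_noise_current(r, t, b):
    """RMS thermal noise current of a resistor: sqrt(4kTB/R)."""
    return math.sqrt(4.0 * K_B * t * b / r)


def mosfet_noise_current(gm, t, b):
    """RMS channel thermal noise current of a MOSFET in saturation."""
    return math.sqrt(4.0 * K_B * t * (2.0 / 3.0) * gm * b)


# Hypothetical operating point: 300 K and 1 MHz bandwidth
print(resistor_noise_voltage(1e3, 300.0, 1e6))   # about 4 uV RMS
print(resistor_noise_current(1e3, 300.0, 1e6))   # about 4 nA RMS
print(mosfet_noise_current(100e-6, 300.0, 1e6))  # about 1 nA RMS
```

The voltage and current forms for the resistor are equivalent representations: their ratio equals $R$.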

- **kTC (or Reset) Noise:** kTC noise expresses the uncertainty of charge stored on a capacitor due to thermal excitation of the carriers. It does not originate from the capacitor itself, but rather from the thermal noise being "frozen" at a random value when the capacitor is open-circuited. Thus, if a voltage is sampled multiple times on a capacitor, a variation in the sampled voltage can be observed. The RMS noise voltage $v_n$ and charge uncertainty on the capacitor $Q_n$ in such a case are given by:

$$v_n = \sqrt{kT/C} \qquad Q_n = \sqrt{kTC}$$

where k is Boltzmann's constant, T the absolute temperature and C the capacitance. Pixel "reset" noise is due to this very effect: connecting and disconnecting the floating diffusion (FD) capacitance from a DC voltage results in uncertainty on the reset voltage. Fortunately, pixel reset noise can be greatly suppressed via a range of techniques, the most used being correlated double sampling (CDS), as well as through various ways of performing the pixel's reset (soft, hard-to-soft and active [33] reset).
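The size of the reset noise is easy to estimate from the two expressions above. The sketch below uses a hypothetical floating diffusion capacitance of 1.6 fF at 300 K and expresses the charge uncertainty in electrons by dividing by the elementary charge.

```python
import math

K_B = 1.380649e-23     # Boltzmann's constant in J/K
Q_E = 1.602176634e-19  # elementary charge in coulombs


def ktc_noise_voltage(c, t=300.0):
    """RMS reset noise voltage: sqrt(kT/C)."""
    return math.sqrt(K_B * t / c)


def ktc_noise_electrons(c, t=300.0):
    """RMS reset noise charge sqrt(kTC), expressed in electrons."""
    return math.sqrt(K_B * t * c) / Q_E


# Hypothetical floating diffusion capacitance of 1.6 fF at 300 K
print(ktc_noise_voltage(1.6e-15))    # about 1.6 mV RMS
print(ktc_noise_electrons(1.6e-15))  # about 16 e- RMS
```

A smaller capacitance increases the noise voltage but reduces the noise expressed in electrons, which is the quantity that matters when comparing against the photo-generated signal.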

- **Flicker (or 1/*f*) Noise:** Flicker (or 1/*f*) noise refers to noise exhibiting a power spectral density which is inversely proportional to frequency. While historically there has been some controversy as to the exact origin of this type of noise in MOSFETs, it is now attributed to dangling bonds at the oxide interface, which result in extra energy states. These energy states capture and release carriers at random time intervals, thereby introducing flicker noise in the drain current. Typically, it is modeled as a series voltage source at the gate with a value of:

$$\overline{v_n^2} = \frac{K}{C_{ox}WL} \cdot \frac{1}{f}$$

where *K* is a process-dependent constant,  $C_{ox}$  the oxide capacitance, *W* and *L* are the transistor's width and length, and *f* the frequency.

As the name implies, the power of 1/f noise is inversely proportional to frequency, making it a prevalent contributor at low frequencies. Due to this, 1/f noise is easily noticeable by human vision. Fortunately, CDS can greatly reduce flicker noise due to its high-pass filter-like behavior.
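The model above also shows why flicker noise is usually tackled by increasing the gate area of the critical transistor. Integrating the spectral density from a lower to an upper frequency gives $\overline{v_n^2} = \frac{K}{C_{ox}WL}\ln(f_{high}/f_{low})$, which the following sketch evaluates. All numerical values, including the constant $K$, are hypothetical and not taken from any particular process.

```python
import math


def flicker_noise_rms(k, c_ox, w, l, f_low, f_high):
    """RMS gate-referred 1/f noise integrated from f_low to f_high.

    Integrating K / (C_ox * W * L * f) over frequency gives
    K / (C_ox * W * L) * ln(f_high / f_low).
    """
    return math.sqrt(k / (c_ox * w * l) * math.log(f_high / f_low))


# Hypothetical values, chosen only for illustration
K = 1e-25     # process-dependent constant in V^2 * F
C_OX = 5e-3   # oxide capacitance per unit area in F/m^2

small = flicker_noise_rms(K, C_OX, 1e-6, 0.5e-6, 1.0, 1e6)
large = flicker_noise_rms(K, C_OX, 2e-6, 1.0e-6, 1.0, 1e6)
print(small, large, small / large)
```

Quadrupling the gate area halves the RMS flicker noise. In a pixel this conflicts with the desire for a small source follower, both for fill factor and for conversion gain.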

- **Dark Current:** Dark current refers to the parasitic leakage current of the photodiode under no illumination. The primary cause behind dark current is the presence of generation centers with energies near the mid-gap due to imperfections in the silicon lattice such as contaminants, dislocation faults, vacancies etc. Dark current is either expressed as a current  $[e^{-}/s]$  or as current per unit area  $[pA/cm^2]$  or  $[e^{-}/m^2/s]$ . In principle, calibration of dark current is straightforward and performed by means of a "dark frame" reference. In practice, however, its spatial non-uniformity (DSNU) and temporal variability, such as dark current shot noise (DCSN) and dark current random telegraph signal (DC RTS), make the calibration process more cumbersome. The use of a pinned photodiode in modern pixel designs greatly contributes to the reduction of dark current. This is thanks to the buried diode, which suffers from fewer defects, and the shielding of the oxide interface by the pinning layer (see Figure 2.8).
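The unit conversion and the resulting shot noise can be sketched as follows. The snippet assumes that the whole area of a square pixel contributes to the dark current and that the dark charge follows Poisson statistics, so that the DCSN equals the square root of the accumulated dark charge. The current density, pixel pitch and integration time are hypothetical.

```python
import math

Q_E = 1.602176634e-19  # elementary charge in coulombs


def dark_current_e_per_s(j_pa_per_cm2, pixel_pitch_um):
    """Convert a dark current density in pA/cm^2 to e-/s for one pixel.

    Assumption: the full area of a square pixel contributes.
    """
    area_cm2 = (pixel_pitch_um * 1e-4) ** 2
    return j_pa_per_cm2 * 1e-12 * area_cm2 / Q_E


def dark_current_shot_noise(i_dark_e_per_s, t_int_s):
    """DCSN in e- RMS: square root of the accumulated dark charge."""
    return math.sqrt(i_dark_e_per_s * t_int_s)


# Hypothetical pixel: 10 pA/cm^2 and 5 um pitch, integrated for 100 ms
i_dark = dark_current_e_per_s(10.0, 5.0)
print(i_dark)                                # about 15.6 e-/s
print(dark_current_shot_noise(i_dark, 0.1))  # about 1.2 e- RMS
```

A dark frame can remove the mean dark charge but not its shot noise, which is one reason why the calibration is less straightforward in practice than in principle.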

#### **Spatial Noise**

- **Fixed-Pattern Noise (FPN):** Fixed-pattern noise is an umbrella term which is used to refer to all time-invariant non-idealities that exist in the image produced by the sensor. As the name suggests, these artefacts are present under all levels of illumination and from frame to frame. While FPN is not noise under the strict definition of the term, it is classified as such due to its similar effect on the image. One of the origins of FPN is threshold voltage  $(V_{th})$  variation, primarily of the source-follower transistor, from pixel to pixel. Another common source is mismatch between different columns of the imager, due to variation in the column circuitry. This produces variations along entire columns of the image, thus it is also known as column FPN. Stationary row noise originating from the imager's row driver can be another source of FPN. Finally, ADC mismatch, depending on the readout architecture, might contribute to column FPN in the case of column-parallel ADCs or yield more complicated artefacts in the case of output multiplexing. In general, pixel-level FPN is effectively suppressed by correlated double sampling, while column-level FPN may or may not be improved by it, depending on the location of the CDS operation along the signal chain. In all cases, however, off-chip digital calibration of FPN is very effective and relatively easy to perform since there is no time dependence.
- **PRNU and DSNU:** Photo-response non-uniformity (PRNU) is used to express the variation in pixel response under uniform illumination. Off-chip calibration of PRNU is possible by means of a gain map, which will apply the appropriate gain correction for each pixel of the imager.

Similarly, dark signal non-uniformity (DSNU) is used to express pixel offset in the absence of light. Correction of DSNU is done by a so-called "dark frame". However, the possibly strong dependence of the dark signal on temperature complicates this calibration.

In most cases, the imager will include some sort of reference pixel (such as "black pixels") in the periphery of the functional pixel array. These pixels produce a dark signal even in the presence of light, which is used for calibration of the aforementioned effects amongst others. One can create black pixels either electrically or optically. The former involves modifying a normal pixel such that the photodiode is permanently reset, rendering the pixel light insensitive. The latter is implemented by means of a metal shield on top of the pixel, physically blocking the light from reaching the photodiode. A caveat with using optically black pixels is that the metal shield might influence the dark current of the pixel, reducing its usability as a reference for calibration.
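The off-chip calibration of DSNU and PRNU described above can be sketched as a two-point correction: a dark frame removes the per-pixel offset and a gain map, derived here from a flat-field frame taken under uniform illumination, equalizes the per-pixel response. The function names and the tiny 2 x 2 frames are hypothetical, and the sketch assumes a linear pixel response with no dead pixels.

```python
def subtract_dark(frame, dark_frame):
    """Remove the per-pixel offset (DSNU) using a dark frame."""
    return [[p - d for p, d in zip(row, dark_row)]
            for row, dark_row in zip(frame, dark_frame)]


def gain_map_from_flat_field(flat_frame, dark_frame):
    """Per-pixel gain that maps each flat-field response to the array mean."""
    net = subtract_dark(flat_frame, dark_frame)
    values = [p for row in net for p in row]
    mean = sum(values) / len(values)
    return [[mean / p for p in row] for row in net]


def calibrate(raw_frame, dark_frame, gain_map):
    """Apply the offset (DSNU) and gain (PRNU) corrections to a raw frame."""
    net = subtract_dark(raw_frame, dark_frame)
    return [[p * g for p, g in zip(row, gain_row)]
            for row, gain_row in zip(net, gain_map)]


# Hypothetical 2 x 2 sensor whose right column is twice as sensitive
dark = [[10, 12], [8, 10]]
flat = [[110, 212], [108, 210]]
gains = gain_map_from_flat_field(flat, dark)
print(calibrate(flat, dark, gains))  # all pixels equal after correction
```

In practice the dark frame would have to be acquired at the same temperature and integration time as the image, in line with the temperature dependence noted above.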

# **2.9.** HIGH-SPEED APPLICATIONS

With the progressive advancements in CIS image quality and their ability to perform massively parallel readout, very high overall frame rate and data throughput can be achieved. This has resulted in a plethora of new and emerging applications for high speed image acquisition. These include time-of-flight applications [34] for non-stereo vision depth mapping and rangefinding [38], motion capture and position tracking [35]-[36], high-speed machine vision, optical flow estimation [37] and biological imaging such as fluorescence lifetime imaging microscopy and particle velocimetry. It is clear then that a wide range of existing and emerging applications rely on high speed imaging, requiring ever-improving speed and noise performance.



(a) Particle velocimetry of a methane flame [40]



(b) Fluorescence lifetime imaging [39]



(c) Human face captured with time-of-flight camera [41]

Figure 2.23: Examples of high-speed imaging applications

# **2.10. SUMMARY**

In this chapter a general background on CMOS image sensors is presented. A timeline of important steps and milestones in the history of electronic solid-state imaging is given. The principles that allow photodetection in silicon are introduced, explaining how the photoelectric effect is leveraged in order to measure light in the electrical domain. The two types of electronic shuttering are shown, followed by a quick view of passive and active pixels, along with the most commonly found active pixel architectures. Further on, the influence of process technology scaling in CIS is discussed. A typical elementary floorplan of a CMOS image sensor is shown, along with the different possible ADC configurations. Following that, the principal performance aspects that quantify the quality of an imager are presented, as well as the various sources of non-ideality that degrade the final image quality, both time-invariant and time-varying. Finally, several applications of high speed imaging are mentioned, spanning various fields such as research, medical and industrial machine vision.

#### REFERENCES

- [1] S. Morrison. "A new type of photosensitive junction device," Solid-State Electronics, vol. 5, pp. 485-494 (1963)
- [2] J. W. Horton, R. V. Mazza and H. Dym. "The scanistor A solid-state image scanner," Proceedings of the IEEE, vol. 52, no. 12, pp. 1513-1528 (1964)
- [3] M. A. Schuster and G. Strull. "A monolithic mosaic of photon sensors for solid-state imaging applications," IEEE Transactions on Electron Devices, vol. ED-13, no. 12, pp. 907-912 (1966)
- [4] G. P. Weckler. "Operation of p-n Junction Photodetectors in a Photon Flux Integrating Mode," IEEE Journal of Solid-State Circuits, vol. 2, no. 3, pp. 65-73 (1967)
- [5] P. K. Weimer, G. Sadasiv, J. E. Meyer, L. Meray-Horvath and W. S. Pike. "A self-scanned solid-state image sensor," Proceedings of the IEEE, vol. 55, no. 9, pp. 1591-1602 (1967)
- [6] R. H. Dyck and G. P. Weckler. "Integrated arrays of silicon photodetectors for image sensing," IEEE Transactions on Electron Devices, vol. 15, no. 4, pp. 196-201 (1968)
- [7] P. J. W. Noble. "Self-scanned silicon image detector arrays," IEEE Transactions on Electron Devices, vol. 15, no. 4, pp. 202-209 (1968)
- [8] W. S. Boyle and G. E. Smith. "Charge coupled semiconductor devices," The Bell System Technical Journal, vol. 49, no. 4, pp. 587-593 (1970)
- [9] S. Ohba et al. "MOS area sensor: Part II—Low-noise MOS area sensor with antiblooming photodiodes," IEEE Transactions on Electron Devices, vol. 27, no. 8, pp. 1682-1687 (1980)
- [10] K. Senda, S. Terakawa, Y. Hiroshima and T. Kunii. "Analysis of charge-priming transfer efficiency in CPD image sensors," IEEE Transactions on Electron Devices, vol. 31, no. 9, pp. 1324-1328 (1984)
- [11] H. Ando *et al.* "Design consideration and performance of a new MOS imaging device," IEEE Transactions on Electron Devices vol. ED-32, no. 5, pp. 1484-1489 (1985)
- [12] VLSI Vision Ltd. "Matrix array image sensor chip". USA United States Patent US5345266A, Sept. 23, 1989
- [13] S. K. Mendis, S. E. Kemeny and E. R. Fossum. "A 128 × 128 CMOS active pixel image sensor for highly integrated imaging systems," Proceedings of IEEE International Electron Devices Meeting, Washington, DC, USA, pp. 583-586 (1993)
- [14] P. Lee, R. Gee, M. Guidash, T-H. Lee, and E.R. Fossum. "An active pixel sensor fabricated using CMOS/CCD process technology," IEEE Workshop on Charge-Coupled Devices and Advanced Image Sensors, Dana Point, California (1995)

- [15] Wikipedia. "Comparison of Canon EOS digital cameras". https://en.wikipedia.org/wiki/Comparison\_of\_Canon\_EOS\_digital\_cameras
- [16] Ken Rockwell. "Nikon DSLR History". http://www.kenrockwell.com/nikon/dslr.htm
- [17] Image Sensors World Blog. "Omnivision Demos 1.4um BSI Sensor". http://image-sensors-world.blogspot.be/2008/05/omnivision-demos-14um-bsi-sensor.html
- [18] K. Itonaga *et al.* "0.9μm pitch pixel CMOS image sensor design methodology," IEEE International Electron Devices Meeting (IEDM), Baltimore, MD, pp. 1-4 (2009)
- [19] T. Haruta *et al.* "A 1/2.3inch 20Mpixel 3-layer stacked CMOS Image Sensor with DRAM," IEEE International Solid-State Circuits Conference (ISSCC), San Francisco, CA, pp. 76-77 (2017)
- [20] Board of Regents *Report of the Board of Regents* Smithsonian Institution, United States National Museum, Smithsonian Institution, p. 239 (1914)
- [21] H. Veendrick. "Nanometer CMOS ICs, from Basics to ASICs". Springer (2017) ISBN: 978-3-319-47595-0
- [22] Simon M. Sze, Kwok K. Ng. "Physics of Semiconductor Devices, 3<sup>rd</sup> Edition". Wiley (2006) ISBN: 978-0-471-14323-9
- [23] A. J. P. Theuwissen. "Solid-State Imaging with Charge-Coupled Devices". Kluwer (1995) ISBN: 978-0-306-47119-3
- [24] E. R. Fossum and D. B. Hondongwa. "A Review of the Pinned Photodiode for CCD and CMOS Image Sensors," IEEE Journal of the Electron Devices Society, vol. 2, no. 3, pp. 33-43 (2014)
- [25] D. Durini. "High Performance Silicon Imaging, 1<sup>st</sup> Edition". Woodhead Publishing (2014) ISBN: 978-0-85709-598-5
- [26] B. C. Burkey *et al.* "*The pinned photodiode for an interline-transfer CCD image sensor,*" IEEE Transactions on Electron Devices, pp. 28-31 (1984)
- [27] Henri Maître. "From Photon to Pixel: The Digital Camera Handbook". Wiley (2015) ISBN: 978-1-84821-847-5
- [28] N. Teranishi, A. Kohono, Y. Ishihara, E. Oda and K. Arai. "No image lag photodiode structure in the interline CCD image sensor," IEEE Transactions on Electron Devices, pp. 324-327 (1982)
- [29] A. J. P. Theuwissen. "Better Pictures Through Physics," IEEE Solid-State Circuits Magazine, vol. 2, no. 2, pp. 22-28 (2010)

- [30] Stefan Lauxtermann, Adam Lee, John Stevens, Atul Joshi. "Comparison of Global Shutter Pixels for CMOS Image Sensors," IISW Intl. Image Sensor Workshop, Maine, USA (2007)
- [31] O. Yadid-Pecht, R. Ginosar and Y. Shacham-Diamand. "A Random Access Photodiode Array for Intelligent Image Capture," IEEE Transactions on Electron Devices, vol. 38, no. 8, pp. 1772-1780 (1991)
- [32] M. M. Blouke, J. E. Hall and J. F. Breitzmann. "A 640 kilopixel CCD imager for space applications," Digest. International Electron Devices Meeting, pp. 412-414, Washington, D.C., USA (1978)
- [33] B. Pain, T.J. Cunningham, B. Hancock, G. Yang, S. Seshadri and M. Ortiz. "Reset noise suppression in two-dimensional CMOS photodiode pixels through column-based feedback-reset," Digest. International Electron Devices Meeting, pp. 809-812, San Francisco, USA (2002)
- [34] Y. Kato et al. "320x240 Back-illuminated 10um CAPD pixels for high speed modulation Time-of-Flight CMOS image sensor," 2017 Symposium on VLSI Circuits, Kyoto, pp. C288-C289 (2017)
- [35] S. Kawahito, D. Handoko, Y. Tadokoro and A. Matsuzawa "Low-power motion vector estimation using iterative search block-matching methods and a high-speed nondestructive CMOS image sensor," IEEE Transactions on Circuits and Systems for Video Technology, vol. 12, no. 12, pp. 1084-1092 (2002)
- [36] Y. Nakazawa, H. Makino, K. Nishimori, D. Wakatsuki and H. Komagata "Indoor positioning using a high-speed, fish-eye lens-equipped camera in Visible Light Communication," International Conference on Indoor Positioning and Indoor Navigation, Montbeliard-Belfort, pp. 1-8 (2013)
- [37] A. O. Ercan, Feng Xiao, Xinqiao Liu, SukHwan Lim, A. El Gamal and B. Wandell "Experimental high speed CMOS image sensor system and applications," Proceedings of IEEE Sensors, vol. 1, pp. 15-20 (2002)
- [38] M. Baba, T. Konishi and N. Kobayashi "A new fast rangefinding method based on a non-mechanical scanning mechanism and a high-speed image sensor," IEEE Instrumentation and Measurement Technology Conference Sensing, Processing, Networking. IMTC Proceedings, Ottawa, vol. 2, pp. 957-962 (1997)
- [39] https://www.picoquant.com/applications/category/life-science/fluorescence-lifetime-imaging-flim
- [40] https://en.wikipedia.org/wiki/Particle\_image\_velocimetry#/media/File:PIV\_through\_stagnation\_flame.jpg
- [41] https://en.wikipedia.org/wiki/Time-of-flight\_camera#/media/File:TOF\_Kamera\_3D\_Gesicht.jpg

# 3

# **THIS WORK**

I used to wonder how it comes about that the electron is negative. Negative-positive these are perfectly symmetric in physics. There is no reason whatever to prefer one to the other. Then why is the electron negative? I thought about this for a long time and at last all I could think was 'It won the fight!'

Albert Einstein

Measure what is measurable, and make measurable what is not so.

Galileo Galilei

In this chapter the design of a high speed, global shutter CMOS image sensor with HDR and flexible exposure control will be described. First, the sensor's top-level architecture will be detailed, followed by the pixel design, as well as the readout and digital circuits. The design for testability will also be discussed briefly, along with a few sample modi operandi of the sensor.

# **3.1.** INTRODUCTION

In this chapter the design of the sensor is detailed. The top-level architecture of the imager is discussed, and the GS/HDR pixel is shown along with its driving and readout circuits and the signal path from pixel to output. The digital circuits of the imager are also shown. Finally, the circuits that will allow for testing and characterization of the sensor once manufactured are briefly discussed at the end of the chapter.

As the design cycle progresses, the initial performance specifications and the top-level architecture study trickle down and dictate the specific implementation of the subcircuits, whose performance is verified by means of simulation. After the layout of a circuit is complete, a post-layout extraction of the parasitics allows for a more accurate simulation, giving better insight into the resulting performance of the circuits once implemented in silicon.

## **3.2.** System Architecture

In this section the high-level architecture of the image sensor will be detailed. During a preliminary top-level study of the design, the sensor's subcircuits do not yet have an exact transistor-level schematic; the focus is instead on high-level design decisions such as the sensor's floorplan, the readout architecture, the pixel topology, the kernel size and the number of output channels.

#### **3.2.1.** TOP LEVEL ARCHITECTURE & FLOOR PLAN

Image sensors are among the largest analog/mixed-signal integrated circuits, sometimes exceeding single-reticle size. As such, through careful and systematic groundwork, a floorplan of the image sensor must be established in the very early stages of the design process. This involves preliminary calculations of the required die area, signal propagation delay, data throughput (which determines the necessary number of output channels), power requirements, packaging considerations, etc.

One design parameter that is greatly influenced by the requirement for high-speed operation is the sensor's power consumption. Small amplifier settling time, rapidly switching digital circuits and charge/discharge of parasitic capacitances amongst others have direct design trade-offs with power consumption. Thus, it is of paramount importance that the sensor designer makes an accurate estimation of the power required for the operation of the imager and foresees on-chip power distribution requirements, accounting for voltage drop, inductive ringing and power supply crosstalk.



Figure 3.1: The top-level block diagram of the sensor

During this initial floor-planning phase all the major circuit blocks of the image sensor are provisionally laid out, and a first plan is made of how the signal and power supply routing will be performed in the later stages of the design. Considerations include control signal distribution and uniformity across the pixel array, the settling time of reference voltages, and the peak current that each power supply will need to handle with acceptable voltage drop. The floor plan block diagram of this work is shown in Figure 3.1, depicting all major functional blocks of the sensor.

From this planning phase, design guidelines will emerge that will dictate how many bondpads will be required, how they will be placed on the periphery of the IC, how functional blocks of the imager will be organized, laid out and interconnected amongst others.

#### **3.2.2.** THE QUADRANT SPLIT

Since the sensor described in this work is expected to produce a very high output data rate, a quadrant-based architecture is chosen at this early design stage. The quadrant approach is a proven technique used in high-speed imagers which involves splitting the sensor along the East-West and North-South axes (shown in Fig. 3.2).

This technique offers several advantages. First, the North-South split reduces the length of the column wires by a factor of 2, thus reducing the settling time of the video signal on the column wire by a factor of 4 (since both the resistance and the capacitance of the wire are halved). The second advantage of splitting along this axis is that $2 \times N$ rows can be simultaneously addressed while having only N column wires per pixel, thus simplifying the pixel array addressing scheme. Similarly, the East-West split reduces the horizontal length of the video bus, increasing the settling speed of the output signal from the Sample & Hold structures (see Section 3.6.4 for more detail).
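The factor of four can be checked with a lumped single-pole approximation in which both the resistance and the capacitance of the wire scale linearly with its length. The sketch below is illustrative only: the per-millimetre values, the wire lengths and the 12-bit accuracy target are placeholders, not figures from this design.

```python
import math

def settling_time(length_mm, r_per_mm, c_per_mm, accuracy_bits=12):
    """Time for a lumped RC wire to settle to within 1/2 LSB.

    Assumes a first-order response. A real column is a distributed RC line,
    which changes the constant but not the length-squared scaling.
    """
    r = r_per_mm * length_mm                      # total resistance, ohm
    c = c_per_mm * length_mm                      # total capacitance, farad
    n_tau = math.log(2 ** (accuracy_bits + 1))    # time constants needed
    return n_tau * r * c

# Placeholder wire parameters (assumed, not taken from the design)
R_PER_MM = 100.0     # ohm per mm
C_PER_MM = 0.2e-12   # farad per mm

full = settling_time(20.0, R_PER_MM, C_PER_MM)   # unsplit column
half = settling_time(10.0, R_PER_MM, C_PER_MM)   # North-South split column
print(f"full-length column: {full * 1e9:.2f} ns")
print(f"half-length column: {half * 1e9:.2f} ns")
print(f"ratio: {full / half:.1f}")
```

Because the settling time is proportional to the product of R and C, the ratio is independent of the placeholder values chosen.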



Figure 3.2: Quadrant split of the sensor

While the quadrant split approach results in the aforementioned benefits, which are highly desirable in a high speed image sensor, there can be drawbacks to this technique. Most notably these include cosmetic artifacts due to slight circuit variations from quadrant to quadrant, which become easily perceivable at the position on the pixel array where the quadrants abut. A well-known example is the "crosshairs" effect, which manifests itself as a cross pattern at the center of the pixel array, where all four quadrants meet.

#### **3.2.3.** THE KERNEL

In the case of high-speed imagers there is a need to achieve high framerates. Therefore multiple output channels are used in parallel to overcome the per-channel bandwidth limitations, outputting multiple pixel values at a given moment in time. From a design standpoint, it is convenient to treat the pixels that will be output simultaneously as one group. This group of pixels is called a kernel. We can distinguish between localized kernels and delocalized (or distributed) kernels, depending on whether the pixels belonging to the kernel are abutting or spread across the array.





#### **3.2.4.** PIPELINED READOUT

Image sensors are systems with a considerably large number of inputs, generating a tremendous amount of data. Getting the generated image data off-chip is the principal speed bottleneck the high speed sensor designer will be faced with. Ideally, it would be possible to implement an entirely parallel readout of the entire pixel array, much like the human eye. However, the highly planar nature of CMOS means that captured information has to be read out from the periphery of the pixel array, thus requiring multiplexing and serialization of the output data.

A wide range of interesting solutions has been proposed throughout the development of CIS to alleviate this bottleneck. One of these is in-situ analog memory, a concept dating from the days of the CCD [1]. While on-chip memory allows extremely high framerates to be achieved, the limited memory depth restricts image capture to a small number of frames and is thus not a viable solution for continuous video operation. For instance, a global shutter image sensor capable of an impressive one Tpixel/sec was reported in 2013 by Tochigi et al. [2], albeit with a maximum burst of 128 frames.

Three-dimensional integration is a very intriguing prospect for imaging as it can allow for massively parallel pixel data acquisition, digitization and processing, overcoming the limitations of traditional peripheral I/O. An impressive example of the promise of 3D integration is Sony's three-layer stacked DRAM mobile phone sensor [3]. The three chips that comprise the system are interconnected with through-silicon vias (TSVs) and are reported to be as follows. The top chip contains the pixel array and is manufactured in a 90nm BSI CIS-optimized process. The center chip contains the system's 1Gbit high-speed frame memory and pixel row drivers, and is manufactured in a 30nm DRAM-optimized process. The bottom chip contains the ADCs, digital signal processing and MIPI output interface, and is manufactured in a 40nm logic-optimized process technology.

Since for this work we don't have access to such an advanced 3D manufacturing process, optimization steps need to be made in order to ensure that the I/O infrastructure is utilized to its full potential. This essentially means that the output channels of the sensor must operate at the highest possible bandwidth that allows reaching the requirements for settling accuracy and, very importantly, that no idle I/O time should be present, where no data is being output from the image sensor.

In a non-pipelined column-parallel image sensor, there exists an output "dead time" or line blanking time (LBT). The term line blanking time, or horizontal blanking time, originates historically from raster-scanned CRT displays. It used to refer to the total duration of the synchronization pulse and front/back porches of the video signal required when drawing one line on the display. Similarly, in the imager lexicon, it refers to the time column signals need to settle, the CDS operation is performed and the signals for a specific line produced, during which no data is output from the sensor. Such an overhead is naturally not desired in time-restricted high-speed systems such as this work. The problem is solved by employing a pipelined readout architecture, for uninterrupted continuous data output. The pipelined operation in this work is achieved by using two sample and hold structures per column wire. The first structure performs the sampling of the row under control (kernel row N+1) and the second is used to output the previously sampled value (kernel row N) to the next sub-circuit down the signal chain. Due to the requirement of column-level CDS, each S&H structure needs to be able to sample two values, a signal and a reset level. Therefore, the final sampling circuit for each column wire is comprised of two banks of two sampling capacitors each. The circuit is illustrated, showing the two pipelining S&H structures, in the simplified diagram of Figure 3.4, alongside an example of a column signal and the signals sampled by the S&H structure.
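The benefit of pipelining can be illustrated with an idealized line-time model. In the sketch below, the function names and all timing values are hypothetical placeholders. The model also ignores second-order effects such as the hand-over between sampling banks, so it should be read as a first-order estimate only.

```python
def line_time_non_pipelined(t_blank_ns, n_kernels, t_kernel_ns):
    """Sampling (blanking) and readout happen one after the other."""
    return t_blank_ns + n_kernels * t_kernel_ns

def line_time_pipelined(t_blank_ns, n_kernels, t_kernel_ns):
    """Row N+1 is sampled while row N is read out; the slower task dominates."""
    return max(t_blank_ns, n_kernels * t_kernel_ns)

def output_utilization(line_time_ns, n_kernels, t_kernel_ns):
    """Fraction of the line time during which data is actually being output."""
    return n_kernels * t_kernel_ns / line_time_ns

# Placeholder timing values (assumed, not taken from the design)
T_BLANK_NS = 1000    # column settling plus CDS sampling
N_KERNELS = 16       # kernels read out per line and channel
T_KERNEL_NS = 160    # readout time per kernel

for name, fn in (("non-pipelined", line_time_non_pipelined),
                 ("pipelined", line_time_pipelined)):
    t_line = fn(T_BLANK_NS, N_KERNELS, T_KERNEL_NS)
    util = output_utilization(t_line, N_KERNELS, T_KERNEL_NS)
    print(f"{name}: line time {t_line} ns, output utilization {util:.2f}")
```

In this model the pipelined line time is set by the readout alone whenever the sampling fits within the readout period, which is the condition the two sample and hold structures are meant to exploit.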



Figure 3.4: a) Sample and Hold circuit diagram and b) associated waveforms

Due to the column multiplexing scheme, there is a subtle but important effect that must be addressed in order to achieve true zero line blanking time (LBT) operation, which is detailed below. As previously mentioned, the imager features two banks of sampling capacitors, an odd and an even one. Two FPGA-generated control signals (odd\_even\_0\_to\_15 and odd\_even\_others) are used to control from which bank of capacitors the readout will take place. The odd\_even\_0\_to\_15 control dictates the bank parity for the first 16 kernels (from the center outwards), while the odd\_even\_others does the same for the rest of the kernels. The reasoning behind this grouping scheme will become apparent below. Looking at the standard pipelined X-scanner timing diagram of Figure 3.5, the odd/even controls are operated in unison, sampling the entirety of the kernel row in the odd or even banks in an alternating fashion. At the end of the "pixel S&H [n+1]" period, no new sampling may occur since both S&H banks are being utilized – one bank is being used for readout of row [n] and the other for sampling the signals of row [n+1]. This effect prevents the continuous, uninterrupted output of data and results in line blanking time.

The ability to independently control the S&H banks of the first sixteen kernels from the rest makes it possible to introduce a timing difference between them. As shown in the "zero-LBT" timing diagram of Figure 3.6, the "odd\_even\_others" control is delayed by approximately the duration of one LBT period in this mode, compared to the nominal scan (Figure 3.5). Therefore, the modified operation can be summarized as follows:

Row [n] has already been sampled on the even S&H banks and is being read out during the period denoted "X scan [n]". At the beginning of the "pixel S&H [n+1]" period both controls are high, thereby sampling the entire [n+1] row, this time on the odd bank of S&H capacitors. At this point, the state of the sampling banks is as follows: Row [n] stored on even capacitors and being read out and row [n+1] being stored on odd capacitors. At the end of the "pixel S&H [n+1]" period, the readout of row [n] and row [n+1] overlaps – the last kernels of row [n] and the first ones of [n+1] are being output simultaneously. Here is where the independent odd/even controls are needed: During the overlap period, the last kernels of row [n] are read out from the even capacitor banks, while the first kernels of row [n+1] are read out from the odd banks. Once the readout of row [n] is complete, the "odd\_even\_others" control also toggles, allowing for the entire odd bank to begin sampling of the new signals.
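The hand-over described above can be sketched as a small behavioural model. The control names come from the text, but the polarity (a high control meaning "sample on odd, read from even"), the function names and the number of kernels per row are assumptions made for illustration.

```python
FIRST_GROUP_SIZE = 16  # kernels 0..15, counted from the centre outwards (from the text)

def governing_control(kernel_index, odd_even_0_to_15, odd_even_others):
    """Pick the control signal that applies to a given kernel."""
    if kernel_index < FIRST_GROUP_SIZE:
        return odd_even_0_to_15
    return odd_even_others

def readout_bank(control_high):
    """Assumed polarity: control high means sampling on the odd bank,
    so the even bank is the one being read out, and vice versa."""
    return "even" if control_high else "odd"

# Three phases of the zero-LBT hand-over from row n to row n+1:
# (description, odd_even_0_to_15, odd_even_others)
phases = [
    ("sampling row n+1, reading row n", True, True),
    ("overlap: tail of row n, head of row n+1", False, True),
    ("row n finished, odd_even_others toggled", False, False),
]

N_KERNELS = 32  # hypothetical number of kernels per row
for name, c_first, c_others in phases:
    first = readout_bank(governing_control(0, c_first, c_others))
    last = readout_bank(governing_control(N_KERNELS - 1, c_first, c_others))
    print(f"{name}: kernel 0 reads {first}, kernel {N_KERNELS - 1} reads {last}")
```

The middle phase is the one that a single shared odd/even control cannot express: the two kernel groups read from different banks at the same time.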

One additional detail that differs between the two timing schemes is the operation of the "frame\_sync" and "X\_sync" controls, which are used to reset the horizontal scanning registers. The specifics of these controls and how they affect the generation of the eight divided X clocks can be found in Section 3.7.1 - X Scanner.

Thanks to this method, true zero-LBT can be achieved, resulting in continuous data output from the sensor and therefore utilizing the imager's output channels as efficiently as possible.



Figure 3.5: Standard pipelined X-Scanner timing diagram



Figure 3.6: Modified zero-LBT pipelined X-Scanner timing diagram

#### **3.2.5.** SIGNAL PATH: FROM PIXEL TO OUTPUT

The signal path of the image sensor begins with the pixel, in which phototransduction occurs, converting impinging photons into a charge signal. First, the pixel is reset and the reset level of the floating diffusion is read out via the source follower. The source follower is biased by means of a constant current source load, as shown in the simplified diagram of Figure 3.7. The reset signal settles on the column wire, to which the sample and hold circuits are connected and the signal is sampled on the reset capacitor of one of the sampling circuits. In the case where no CDS is performed, this step of the process is omitted.



Figure 3.7: Simplified diagram of the sensor's readout path

At the end of integration, the signal charges are transferred to the floating diffusion and, once again, read out via the source follower. When both reset and signal levels have been sampled, the sample and hold circuit output switches are turned on, allowing the PGAs to settle the two sampled values on the video bus pair. Following that, the two pseudo-differential signals are converted into a true differential signal and amplified in the single-to-differential amplifier (S2D). At this point the S2D output is further buffered by the output stage amplifier and digitized in an off-chip ADC. As soon as the ADC conversion is complete, the output channel multiplexer switches to the next S2D and the digitization process starts once again for the signals of the next kernel.
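The reason for sampling both a reset and a signal level can be illustrated numerically: any error that is common to the two samples drops out of their difference. The voltages, the offset and the function name below are placeholders chosen for illustration only.

```python
def cds(v_reset, v_signal, gain=1.0):
    """Correlated double sampling: amplified difference of the two sampled levels."""
    return gain * (v_reset - v_signal)

# Placeholder column levels in volts (assumed, not taken from the design)
V_RESET = 2.00    # level sampled after the floating diffusion is reset
V_SIGNAL = 1.40   # level sampled after charge transfer; photo-electrons lower it
OFFSET = 0.05     # error common to both samples, e.g. source follower offset

clean = cds(V_RESET, V_SIGNAL)
with_offset = cds(V_RESET + OFFSET, V_SIGNAL + OFFSET)
print(f"without offset: {clean:.3f} V, with common offset: {with_offset:.3f} V")
```

Only errors that are identical in both samples cancel in this way; noise that changes between the two sampling moments does not.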

#### **3.3.** GS/HDR PIXEL

At the heart of the imager is the pixel array, with the pixel constituting the beginning of the sensor's signal chain, converting light into the electrical domain. Even though the design of a pixel might be assumed to be a simple feat, the circuit being made up of a handful of devices, it is far from it. In fact, it is a design process that requires a seasoned analog designer with in-depth knowledge of the associated device physics and the CMOS process, as well as the experience necessary to foresee how the circuit design and physical layout will influence the final performance aspects of the detector.

The pixel used in this work is an eleven transistor (11T) pinned photodiode global shutter pixel. The pixel schematic is illustrated in Figure 3.8, along with its physical layout in Figure 3.9. Integrate-while-read (IWR) functionality is possible thanks to the independent charge flushing of the photodiode node via transfer gate $TG_F$, which permits resetting of the PPD while the previously stored signal is read out of the floating diffusion (FD) node via the source follower transistor. Snapshot shuttering is performed by globally transferring the generated photoelectrons from the PPD into the storage capacitance MEM via the $TG_1$ transfer gate. Then, in a rolling manner, the floating diffusion node (FD) is reset and the reset level is sampled on the sample and hold structures. When this is done, the $TG_2$ transfer gate is pulsed, allowing the stored photocharge to transfer from the storage node (MEM) to the floating diffusion (FD) and again be read out and sampled onto the peripheral circuits.



Figure 3.8: GS/HDR pixel schematic

One of the numerous trade-offs that have to be considered when designing a pixel is full well capacity of the floating diffusion versus the charge-to-voltage factor or CVF. A low floating diffusion capacitance increases CVF, thus aiding in the reduction of the effect of circuit noise. However, the finite voltage swing of the FD means a decreased full-well charge. On the other hand, a large FD capacitance mitigates the limited full well but at a loss of CVF. Ultimately, a single "pixel gain" (or specific CVF) limits the dynamic range achievable by the pixel. A wide range of solutions has been proposed in the literature for extending the dynamic range of CMOS imagers, such as split diodes, multiple gains, charge skimming, logarithmic response pixels and many more [4]–[8].
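The trade-off can be made concrete with the first-order relations CVF = q / C_FD and full well = C_FD · V_swing / q. The sketch below uses placeholder capacitances, swing and circuit noise, ignores the source follower gain, and considers only downstream circuit noise expressed in volts; none of the numbers are taken from this design.

```python
Q_E = 1.602176634e-19  # elementary charge in coulomb

def cvf_uv_per_electron(c_fd):
    """Charge-to-voltage factor at the floating diffusion, in microvolt per electron."""
    return Q_E / c_fd * 1e6

def full_well_electrons(c_fd, v_swing):
    """Charge that fits within the usable FD voltage swing, in electrons."""
    return c_fd * v_swing / Q_E

def noise_electrons(c_fd, v_noise):
    """Downstream voltage noise referred back to the FD, in electrons."""
    return v_noise * c_fd / Q_E

V_SWING = 1.0      # usable FD swing in volts (placeholder)
V_NOISE = 200e-6   # downstream circuit noise in volts rms (placeholder)

for c_fd in (1e-15, 4e-15, 16e-15):
    fw = full_well_electrons(c_fd, V_SWING)
    ne = noise_electrons(c_fd, V_NOISE)
    print(f"C_FD={c_fd * 1e15:4.0f} fF  CVF={cvf_uv_per_electron(c_fd):6.1f} uV/e-  "
          f"FW={fw:8.0f} e-  noise={ne:6.2f} e-  FW/noise={fw / ne:.0f}")
```

Under these assumptions the ratio of full well to input-referred noise equals V_swing / V_noise for every capacitance, which is one way of seeing why a single pixel gain cannot extend the dynamic range by itself.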

Caeleste's three-level HDR method [9] uses transfer gates $TG3_A$ and $TG3_B$ connected to the photodiode (PPD) as lateral overflows. The skimmed charge is transferred to in-pixel capacitors $C_A$ and $C_B$, which can be merged with the floating diffusion node by operating switches $M_A$ and $M_B$ for readout. As shown in Section 3.4, the $TG3_{A,B}$ transfer gates are controlled by a special 3-level driver, which allows for a mid-level voltage $V_{MM}$ to be placed on these gates.



Figure 3.9: GS/HDR pixel layout

#### **3.4.** PIXEL DRIVER

The pixel driver is the circuit responsible for generating the control signals for operating the pixel array (such as RES, SEL, TG, etc.) and constitutes an important part of the Y periphery of the imager. This includes both global control signals, such as those that perform the snapshot operation of the imager, as well as row-by-row ('rolling') signals to realize the sequential readout. As explained in Section 3.2.3, the array control is organized in rows of kernels, each comprising 16 pixel rows. All pixels in a kernel row are driven using the same control signals, thus reducing the number of required "addresses" and consequently scanning elements in the Y-scanner. The schematic of one pixel driver cell, responsible for the control of one kernel row, is shown in Figure 3.10. Since each pixel row requires one set of drivers (one per control signal), the physical layout of the pixel driver (shown in Figure 3.11) is pitch-matched with the array. This is a constraint that has to be kept in mind when designing the pixel driver, ensuring that the driving circuits of each row will fit in the limited vertical space dictated by the pixel pitch.



Figure 3.10: Pixel driver simplified schematic, showing various driver cell types, and array

Each driver cell is comprised of latches and digital logic, level shifters and end-state drivers. The latch and logic part of the driver allow for either global or rolling addressing of the kernel row. Level shifters decouple the signal-specific power supplies from the digital domain power supply. The reason for this is two-fold – some pixel controls require a "high" level different than that of the digital logic and, additionally, power supply induced crosstalk between toggling controls is also minimized. It should also be noted that there are 16 end-state drivers for each control signal, one for each row of pixels comprising the kernel row.



Figure 3.11: Layout of the pixel driver assembly for one kernel row (West side of chip)

In this design there are five different types of driver cell. Type I is used for driving the second transfer gate control (TG<sub>2</sub>) and the HDR capacitor merge switches ($M_A$ & $M_B$). Its schematic is shown in Figure 3.12. It consists of a multiplexer, a D-latch and two end-state inverters. As mentioned, the end-state inverters have a separate power supply domain. The circuit is controlled by six control inputs: an active-low select signal, *naddressed*, generated by the Y scanner; two ASPI-generated control bits, *TG2\_addressed* and *TG2\_other*, which determine the behavior of the cell's output in the addressed and non-addressed states respectively, and which may be configured so that the driver output is controlled by either the ASPI or the bondpad input; a bondpad input that provides fast FPGA control of the *TG2* signal; an ASPI control, *TG2\_global*, which selects whether the bondpad signal is applied in a rolling fashion or globally on the entire array; and finally the *nop* control, which is used for ROI (see Section 3.7.3).



Figure 3.12: Schematic of a Type I driver cell used for TG2, MA&MB

The Type II driver (Figure 3.13) is used for driving the pixel reset (RES). The circuit is identical to the Type I driver, with the exception of the final inverter, which features programmable driving strength.



Figure 3.13: Schematic of a Type II driver cell used for RES

As seen in Figure 3.14, two ASPI-generated control bits allow the inverter to have a strength of 1, 6, 11 or 16. This feature allows for control of the transition slope of the digital signal: when fast control is not required, the lower strengths help to avoid large peak currents and improve switching uniformity across the array.
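The four strengths listed are consistent with one always-on unit inverter plus two switchable branches worth 5 and 10 units. That decomposition, the bit order and the function name below are assumptions made for illustration; the actual encoding is defined by the schematic of Figure 3.14.

```python
BASE_STRENGTH = 1                 # always-on unit inverter (assumed)
BRANCH_WEIGHTS = (5, 10)          # weights of bit 0 and bit 1 (assumed)

def inverter_strength(bit1, bit0):
    """Relative drive strength for a given pair of control bits."""
    return BASE_STRENGTH + bit0 * BRANCH_WEIGHTS[0] + bit1 * BRANCH_WEIGHTS[1]

for bit1 in (0, 1):
    for bit0 in (0, 1):
        print(f"bits {bit1}{bit0}: strength {inverter_strength(bit1, bit0)}")
```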



Figure 3.14: Schematic of the programmable strength inverter

The Type III driver (Figure 3.15) is used for driving the first transfer gate (TG1X) and the flush transfer gate (TG\_flush) of the pixel. As described in Section 3.5, four TG1 and flush controls need to be generated for the "Kernelscan" feature. The circuit uses an XOR gate at the input, a D-latch, and two inverters, again with the final being of programmable strength. Each of the four Kernelscan groups (A,B,C and D) of the kernel row has an individual ASPI-generated control, in order to realize the required timing. The TG1 bondpad control is shared amongst all drivers. The dummy multiplexers are present to better match the signal propagation delay across different driver types.



Figure 3.15: Schematic of a Type III driver cell used for TG1 and TG\_flush

The Type IV driver (Figure 3.16) is used for driving the select (SEL) signal to read out the pixel. Since the select is never operated globally this is a rolling-only variant of the Type I driver, lacking the input multiplexer and logic gates associated with global control. Again, both ASPI and bondpad control of the addressed row are possible.



Figure 3.16: Schematic of a Type IV driver cell used for SEL

The Type V driver (Figure 3.17) is used for driving the two HDR transfer gates (TG3A and TG3B). Since the CA and CB capacitors of the pixel are used as an 'overflow' in HDR mode, as described in Section 3.3, the driver must be able to drive the corresponding transfer gates at a mid-level  $V_{MM}$ .



Figure 3.17: Schematic of a Type V driver cell used for TG3A and TG3B

Figure 3.18 illustrates a top-level simulation of the pixel driver, showing each of the control signals at the bondpad level and at the center of the pixel array.



Figure 3.18: Transient top-level simulation of the pixel driver inputs and outputs (taken at the center of the pixel array)

# **3.5.** KERNELSCAN

The KernelScan feature further expands on the concept reported in [10]: it enables each pixel in a kernel (e.g. $2 \times 2$, $4 \times 4$) to have a different integration time $t_{int}$. During this special modus operandi, it is possible to perform the integration of these pixels independently, thereby obtaining higher time resolution. This extra temporal resolution is obtained primarily at the cost of lower spatial resolution, and also of additional complexity in the pixel interconnects and driving circuits. More control wires are required for the TG\_flush and TG1 controls, as well as special drivers for these controls which enable the KernelScan operation.

As an example, a  $2 \times 2$  kernel for this special operation is shown in Figure 3.19. Within each kernel, the integration time for each pixel (A, B, C & D) can be fully independently controlled by toggling the corresponding TG\_flush<sub>[A,B,C,D]</sub> and TG1<sub>[A,B,C,D]</sub> under ASPI control.

An exemplary timing diagram showing the different integration times $t_{int}$ for pixels A, B, C and D is given in Figure 3.20. This figure also illustrates how the integration time for each pixel is generated by controlling TG\_flush and TG1. The integration times for the pixels are not required to be contiguous; they can be arbitrarily positioned and varied within the overall frame time. An example timing scheme of such a non-contiguous integration time is shown in Figure 3.21.
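A minimal sketch of how such per-group integration windows could be described and checked is given below. It assumes that integration starts when TG\_flush is released and ends when TG1 transfers the charge to the storage node; the class, the function and all timing values are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class IntegrationWindow:
    flush_release_us: int   # TG_flush released, integration starts (assumed)
    tg1_transfer_us: int    # TG1 pulsed, integration ends (assumed)

    @property
    def t_int_us(self):
        return self.tg1_transfer_us - self.flush_release_us

def integration_times(windows, frame_time_us):
    """Return t_int per pixel group after checking each window fits in the frame."""
    result = {}
    for group, w in windows.items():
        if not 0 <= w.flush_release_us < w.tg1_transfer_us <= frame_time_us:
            raise ValueError(f"window for group {group} does not fit in the frame")
        result[group] = w.t_int_us
    return result

FRAME_TIME_US = 1000  # placeholder frame time
windows = {
    "A": IntegrationWindow(0, 100),
    "B": IntegrationWindow(150, 400),
    "C": IntegrationWindow(300, 350),
    "D": IntegrationWindow(900, 1000),
}
print(integration_times(windows, FRAME_TIME_US))
```

The example windows differ in both position and length, and two of them overlap, which the independent TG\_flush and TG1 controls per group are intended to permit.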



Figure 3.19: The KernelScan concept showing four pixels with different integration times

The kernel size can be 2x2, 4x4, 8x8, etc. However, the larger the kernel, the larger the number of wires within the pixel layout, since each pixel in a kernel requires its own TG\_flush and TG1 controls. A 2x2 kernel needs 4 wires per row, whereas a 4x4 kernel needs 8 wires per row, which are difficult to fit into the $16\mu$m pixel pitch. A 2x2 KernelScan is therefore chosen.
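The wire counts quoted above follow from a simple rule: a pixel row crosses as many distinct pixel groups as the kernel is wide, and each group needs one TG\_flush and one TG1 wire. The sketch below reproduces this count. The pitch share it prints assumes, unrealistically, that these are the only horizontal wires in the pixel, so it is an upper bound on the available space per wire.

```python
PIXEL_PITCH_UM = 16.0      # pixel pitch quoted in the text
CONTROLS_PER_GROUP = 2     # TG_flush and TG1

def kernelscan_wires_per_row(kernel_width):
    """KernelScan control wires that must be routed through one pixel row."""
    return CONTROLS_PER_GROUP * kernel_width

for width in (2, 4, 8):
    wires = kernelscan_wires_per_row(width)
    share = PIXEL_PITCH_UM / wires
    print(f"{width}x{width} kernel: {wires} wires per row, "
          f"at most {share:.1f} um of pitch per wire")
```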


Figure 3.20: Example KernelScan timing diagram showing the different *TG\_flush* and *TG1* signals



Figure 3.21: Example timing showing arbitrary position and length for the different integration times

There are several possible ways the KernelScan pixels can be arranged in the pixel array. Figure 3.22 shows three different patterns (A, B and C) for a 2×2 kernel scan. Pattern A is chosen in this work. This configuration is chosen due to its ease of implementation and the fact that the distance between neighboring A/B/C/D pixels is about the same, leading to the most uniform image with the least amount of artifacts.



Figure 3.22: Three different possible patterns for arranging the KernelScan pixel groups

# **3.6.** READOUT CIRCUITS

## **3.6.1.** COLUMN MULTIPLEXING

In the design of high speed image sensors, every time consuming process and speed bottleneck must be identified and reduced as much as possible. One such speed limiting factor along the signal chain is the column wire. Due to its large parasitic capacitance and resistance, significant time is needed in order for the pixel's output to settle on the wire. Since this settling is RC-limited, one cannot alleviate the problem by, for instance, sacrificing power consumption to acquire faster settling.

The scheme used to overcome this limitation uses multiple column wires and multiplexing at the S2D outputs, as shown in Figure 3.23. A group of eight kernels is illustrated, of which each is comprised of 8(H) x 16(V) pixels, as explained in Section 3.2.3. Each blue wire illustrated represents one column wire and is connected to two sample-hold stages, one for "RESET" level and another for "SIGNAL" level. Each RESET-SIGNAL pair is connected to its corresponding S2D amplifier. In total, there are eight S2D amplifiers comprising a channel. The outputs of these eight amplifiers are multiplexed to one output stage amplifier through switches SW<0> to SW<7>.



Figure 3.23: Column multiplexing scheme

The timing diagram of the column multiplexing scheme is shown in Figure 3.24. It illustrates how the X scanner will connect the "video buffers" to their corresponding video lines and S2D amplifiers sequentially, i.e. KERNEL<0> to KERNEL<15>.

Each kernel period occupies eight pixel master clock (X\_CLK) cycles, allowing for sufficient time for the signal to settle on the video bus and be sampled by the S2D amplifier. Each S2D amplifier is driven with 2 phases, namely "phi1" and "phi2", as shown in the figure. A detailed explanation of the phases of operation of the S2D amplifier is provided in Section 3.6.5.

At the end of phi2, the multiplexing switches $SW_0$ to $SW_7$ sequentially connect each S2D amplifier to the output stage. In Figure 3.24, only the timing diagrams of $SW_0$ to $SW_2$ are illustrated for the sake of simplicity. The phi1 and phi2 signals that drive the S2Ds originate from the corner of the imager and are generated by the X-Clock generator (described in Section 3.7.2) and regenerated locally in close proximity to the S2D amplifiers along with the SW multiplexing signals (described in Section 3.6.5).



Figure 3.24: Column multiplexing timing overview

To summarize, this multiplexing scheme reduces the impact of the large column settling time by a factor of eight, at the expense of replicating the S2D circuits and column wires. Since the main goal of this imager is to achieve high speed operation, this configuration constitutes an acceptable trade-off.
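The time budget created by the multiplexing can be sketched as follows. The counts of eight S2D amplifiers and eight X\_CLK cycles per kernel period come from the text. The clock frequency is a placeholder, and the assumption that SW<i> is closed during cycle i of each kernel period is a simplification of the timing in Figure 3.24.

```python
N_S2D = 8               # S2D amplifiers per output channel (from the text)
CYCLES_PER_KERNEL = 8   # X_CLK cycles per kernel period (from the text)

def active_switch(x_clk_cycle):
    """Index i of the SW<i> switch assumed closed during a given X_CLK cycle."""
    return x_clk_cycle % N_S2D

def kernel_period(x_clk_hz):
    """Time available to each S2D amplifier per kernel, in seconds."""
    return CYCLES_PER_KERNEL / x_clk_hz

X_CLK_HZ = 50e6  # placeholder clock frequency, not taken from the design

schedule = [active_switch(cycle) for cycle in range(2 * N_S2D)]
print("switch sequence:", schedule)
print(f"kernel period: {kernel_period(X_CLK_HZ) * 1e9:.0f} ns, "
      f"output sample period: {1e9 / X_CLK_HZ:.0f} ns")
```

Each amplifier therefore works at one eighth of the output sample rate, while the output stage still delivers one value per X\_CLK cycle.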

## **3.6.2.** COLUMN LOAD

The column load's primary function is to sink a constant current via the column wire to bias the in-pixel source follower amplifier. The current source is implemented using an NMOS transistor, allowing for a programmable column current, adjustable by means of the biasing circuits found in the corners of the imager. A second vital function of the column load is the column wire precharge. The column wire, which runs vertically across half the height of the imaging array (and the entire height in the case of a non-quadranted sensor), presents a comparatively large parasitic capacitance that needs to be charged for each pixel value that is settled on the wire.




Figure 3.25: Column load unit cell schematic (a) and layout (b)

The source follower can charge the column in a short amount of time as it's able to provide a high current from its power supply. However, discharging the column in order to read out the next signal from the pixel is a slow process due to the large capacitance being now discharged by the relatively small current sunk by the NMOS current source. The precharge function of the column load circuit serves to greatly accelerate this discharge, allowing for much shorter time between subsequent column signals. This is achieved by a second constant current source that is connected to the column wire and activated by a digital control signal. Through a programmable biasing circuit, the current sunk by the second source can be adjusted, thus dictating the intensity of the precharge effect.
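The effect of the precharge current can be estimated with the slew relation t = C · ΔV / I. The sketch below treats the discharge as purely slew-limited and ignores the final small-signal settling of the source follower. The capacitance, voltage step and currents are placeholders, not values from this design.

```python
def discharge_time(c_col, delta_v, i_bias, i_precharge=0.0):
    """Slew-limited time to pull the column down by delta_v, in seconds."""
    return c_col * delta_v / (i_bias + i_precharge)

# Placeholder values (assumed, not taken from the design)
C_COL = 2e-12        # column parasitic capacitance, farad
DELTA_V = 1.0        # voltage step to be discharged, volt
I_BIAS = 10e-6       # static column bias current, ampere
I_PRECHARGE = 90e-6  # additional current while precharge is active, ampere

t_bias_only = discharge_time(C_COL, DELTA_V, I_BIAS)
t_precharge = discharge_time(C_COL, DELTA_V, I_BIAS, I_PRECHARGE)
print(f"bias current only: {t_bias_only * 1e9:.0f} ns")
print(f"with precharge:    {t_precharge * 1e9:.0f} ns")
```

Since the extra current only flows while the digital precharge control is active, the speed-up does not require a permanently higher column bias.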

Furthermore, the column load cell includes three additional NMOS transistors, controlled by "Test", "enable\_ARB" and "ARB" (which stands for Anti Row-Banding). The first two are SPI-generated controls used for the "shorted column test" (described in Section 3.8.3) and the Anti Row-Banding feature, respectively. The ARB transistor is driven by an analog voltage, generated by an on-chip RDAC, and provides a means of tuning the low-side clamping of the column voltage. In the case of a saturated or defective (hot) pixel, the column voltage can potentially reach a very low value, resulting in the column current source coming out of saturation. In turn this reduces the IR drop on the power supply lines and as a consequence increases the bias for the rest of the columns due to an increase in $V_{gs}$. The final effect is that neighboring columns will appear brighter, forming a bright band on the affected row (hence the name Anti Row-Banding). Once the ARB feature is turned on via the "enable\_ARB" control, the column voltage is clamped via the ARB transistor (Figure 3.25a) and the described effect is eliminated.

Figure 3.26 shows a column transient simulation for two pixels, illustrating the column voltage, sample and hold reset and signal voltage, as well as some of the associated control signals.



Figure 3.26: Simulation of a column and two illuminated pixels

### **3.6.3.** Amplifying Sample and Hold

Sample and Hold circuits, as the name implies, are analog circuits that can sample a voltage of continuously varying nature and hold the constant value for a certain period of time. They are, in essence, an analog memory element. The simplest S&H circuit is built using a switch and a capacitor. By closing the switch, the capacitor is charged to the input voltage. When the switch is opened again, the voltage is stored on the capacitor. In almost all cases, the capacitor voltage is buffered in order for the output of the S&H to be able to drive the next sub-circuit. An example of such a simple implementation is shown in Figure 3.27. In Figure 3.28 the input and output waveforms are plotted, along with the sampling moments where the switch is closed.
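The behavior of such an ideal zero-order S&H can be sketched in a few lines. This is a purely behavioral model: switch resistance, charge injection and droop are ignored, and the sine input and sampling instants are arbitrary choices made for illustration.

```python
import math


def sample_and_hold(signal, sample_times, t_end, dt):
    """Return (times, held) for an ideal zero-order sample and hold.

    `signal` is a function of time. The output holds the value captured
    at the most recent sampling moment until the next one arrives.
    """
    times, held = [], []
    pending = sorted(sample_times)
    value = 0.0  # capacitor assumed discharged before the first sample
    steps = int(round(t_end / dt))
    for k in range(steps + 1):
        t = k * dt
        # Capture the input at every sampling moment that has been reached.
        while pending and pending[0] <= t + 1e-12:
            value = signal(pending.pop(0))
        times.append(t)
        held.append(value)
    return times, held


times, held = sample_and_hold(lambda t: math.sin(2 * math.pi * t),
                              sample_times=[0.0, 0.25, 0.5, 0.75],
                              t_end=1.0, dt=0.05)
```

The `held` list is a staircase version of the input, comparable in shape to the output trace of Figure 3.28.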

Sample and Hold circuits are widely employed in CMOS imagers, as the main means of performing the Double Sampling (DS) and Correlated Double Sampling (CDS) noise compensation techniques. When the CDS operation is implemented inside the pixel, we talk about pixel-level CDS, whereas when CDS is done in the peripheral circuits of the imager, we talk about column-level CDS. Both, however, employ some form of S&H circuit.



Figure 3.27: Simplest implementation of a zero-order Sample & Hold circuit

Even though pixel-level CDS might initially appear to be a more elegant and attractive solution to the problem of correlated double sampling, the loss in fill factor due to the added in situ memory element and the added complexity are not desired in high speed image sensors where sensitivity is of great importance. One such example is this work, where the integration time is very short and a lowering in sensitivity would lead to higher photon shot noise.



Figure 3.28: Zeroth-order Sample and Hold input (grey) and output (red) - Sampling moments indicated by the dotted lines

Thus, in this case we opt to sample the pixel signals in the periphery of the image sensor. The pipelining architecture of the sampling circuit was briefly introduced in Section 3.2.4, illustrating how the zero blanking time operation is achieved using two sampling structures, each comprising two banks of capacitors. In this section, more detail is provided on the design and features of the circuit.

As in any noisy signal chain, providing amplification of the signal as early as possible yields an improved noise performance. In the same way, succeeding the source follower amplifier inside the pixel, it is desirable to introduce a gain stage as early as possible. This signal amplification is typically provided by a programmable gain amplifier (PGA) at the bottom of the column wire, providing various levels of amplification. After the PGA, the signal is sampled on the Sample and Hold structures and continues down the signal chain.

In this work, the very large number of output channels and the general complexity needed to achieve the required imaging speed result in a very dense physical layout. This makes it impossible to use Metal-Insulator-Metal (MIM) capacitors and a typical PGA solution using a capacitor in the feedback loop. Instead, we chose to combine the amplification with the sampling circuit, resulting in an S&H circuit that can also provide gain via switched capacitor action.



Figure 3.29: The Sample & Hold circuit with amplification feature

The amplifying S&H circuit has two main modes of operation, selectable via the SPI interface of the imager. It can either work as a traditional unity-gain sampling circuit or as an amplifying sample and hold circuit. The two modi operandi are detailed below.

- **Unity-gain mode operation:** In this mode, switches  $\Phi_S$  and  $\Phi_A$  are fixed in the closed and open position respectively. Only the input sampling switch is operated to sample the column signal. In this configuration, the circuit behaves like a traditional S&H with an equivalent capacitance of  $C_{total} = C_1 + C_2 + C_3 + C_4 = 4C_{unit}$ .
- **Amplifying mode operation:** In this mode we can distinguish two phases in the circuit's operation: a sampling phase and an amplifying phase. When sampling, switches  $\Phi_S$  are closed and the column voltage is sampled to all four capacitors in parallel. Once sampling is complete, switches  $\Phi_S$  are opened and via switches  $\Phi_A$ , the four capacitors are connected in series (or "stacked"), resulting in the voltage across them being the sum of voltages on the individual capacitors. It is evident however that if the bottom plates of the capacitors were connected to signal ground, the resulting voltage across the series connected capacitors would greatly exceed the input range of the succeeding buffer amplifier. Instead, the bottom plates of the capacitors are connected to two DC reference voltages, DC<sub>1</sub> and DC<sub>2</sub>. Then, the voltage across the four capacitors in the amplification phase is:

$$V_{out} = DC_1 + V_{C1} + V_{C2} + V_{C3} + V_{C4}$$
$$V_{out} = 4 \times V_{in} - (DC_1 + 2 \times DC_2)$$

Due to the fact that the S&H capacitances are implemented using accumulation capacitors (see Figure 3.30), it must be made certain that the junction between the n-well and substrate remains reverse biased. During amplifying operation for instance, a saturated or hot pixel might result in a column voltage  $V_{in} = 0.6V$ . With a  $V_{DC1} = 0V$  and  $V_{DC2} = 2.4V$ , the resulting voltages of nodes N2 and N4 would be  $V_{N2} = 2V_{in} - V_{DC2} = -1.2V$  and  $V_{N4} = 4V_{in} - 2V_{DC2} - V_{DC1} = -2.4V$ , therefore forward biasing the aforementioned junctions. This would not only impede the operation of the S&H but also result in high substrate peak current. Having two independent DC references and being able to limit the lowest voltage of the column via the Anti Row-Banding (ARB) feature (see Section 3.6.2), ensures that this condition is avoided.
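The numerical example above can be reproduced with a small sketch of the stacking operation. It assumes ideal capacitors (no parasitics or charge sharing) and that the sampling references alternate DC1, DC2, DC1, DC2 along the stack; that ordering is an assumption chosen because it reproduces the expressions for  $V_{N2}$  and  $V_{N4}$ , and the intermediate values for N1 and N3 follow from it rather than from the text. The substrate is assumed to sit at 0 V.

```python
def stacked_node_voltages(v_in, v_dc1, v_dc2):
    """Voltages of nodes N1..N4 after the four capacitors are stacked.

    Each capacitor holds (v_in - reference) from the sampling phase and
    the bottom of the stack is tied to DC1 during amplification.
    """
    refs = [v_dc1, v_dc2, v_dc1, v_dc2]  # assumed sampling reference per capacitor
    node = v_dc1
    nodes = []
    for ref in refs:
        node += v_in - ref               # add the voltage stored on this capacitor
        nodes.append(node)
    return nodes


def forward_bias_risk(nodes, v_substrate=0.0):
    """Flag nodes that fall below the (assumed) substrate potential."""
    return [v < v_substrate for v in nodes]


nodes = stacked_node_voltages(v_in=0.6, v_dc1=0.0, v_dc2=2.4)
print([round(v, 3) for v in nodes], forward_bias_risk(nodes))
```

Raising the minimum column voltage (as the ARB clamp does) or adjusting the two references moves all four nodes upward in this model, which is the mechanism relied upon to keep the well junction reverse biased.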



Figure 3.30: Cross-section of the accumulation capacitor used in the amplifying S&H

### **3.6.4.** COLUMN READOUT CIRCUITS

The column readout circuits are responsible for sampling the column signals and multiplexing them out onto the video bus. For each column, the readout circuits comprise four amplifying sample-and-hold circuits (detailed in Section 3.6.3), four so-called "video amplifiers" and four multiplexing switches. Optionally, the video amplifiers can preset both video lines to a programmable voltage as a means of eliminating any memory effects (such as ghosting). An overview of the design is shown in Figure 3.31, along with the physical layout of the circuits in Figure 3.33. The multiplexing switches are controlled by two signals (kernel\_select\_odd & kernel\_select\_even) which are generated by the X scanner circuit. Depending on the value of the 'odd\_even' controls (explained in Section 3.2.4), the corresponding 'kernel\_select' parity signal is generated.



Figure 3.31: Overview of the column readout circuits showing the S&H, video buffer and kernel select logic

Apart from the "kernel\_select" controls, the column circuits require six additional control signals which dictate how the sampling operation will occur. These are "SHS", "SHR" and "SHR\_Vblack" for both odd and even parities. The two SHS controls perform the sampling of the pixel's signal level in all modes for each respective bank of capacitors. When on-chip CDS mode is used, the dark reference (or reset) level originates from the pixel and is read out via the column wire. As such, the "SHR\_Vblack" controls are unused.

However, when the sensor is operated in its high-speed non-CDS mode, only the signal level is read from the pixel since, as discussed previously, the column settling time is an important limiting factor for the sensor's operating speed. Therefore a "dark reference" must be provided to the S2D for correct operation, due to the pseudo-differential nature of its input. Similarly, this voltage reference is also used as a dark level for outputting the low gain signal of HDR mode. In both these cases, two sampling switches controlled by "SHR\_Vblack\_odd" and "SHR\_Vblack\_even" allow sampling of said reference " $V_{black}$ " on the reset capacitors. This reference voltage is generated by a resistive 8-bit digital-to-analog converter (DAC). It should be noted that while this reference for non-CDS operation could have been provided directly to the input of the S2D amplifier, it is of interest to place it as early as possible along the signal path. By doing so, the reference signal experiences near-identical disturbances (such as crosstalk coupled from surrounding circuits and power supplies) as the pixel signal. Since the S2D produces a signal from the difference of its two inputs, induced common-mode disturbances are minimized to a greater extent.



Figure 3.32: Logic to generate the control signals for the column readout circuits

Figure 3.32 shows the digital logic used to generate the six aforementioned controls. In total there are three input signals, generated by the controlling FPGA. These are sample reset (SHR), sample signal (SHS) and odd\_even\_others, whose function is described in Section 3.2.4. The ASPI generated 'on\_chip\_CDS' signal selects which set of dark reference sampling switches will be controlled via the bondpad. When set to low, the SHR line controls SHR\_Vblack\_odd or SHR\_Vblack\_even (depending on the parity set by odd\_even\_others). When set to high, the SHR\_odd or SHR\_even are selected instead. Since the signal is always sampled from the column, the 'on\_chip\_CDS' signal has no effect on it.
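The decoding described above can be expressed as a small truth-function. This is a behavioral sketch only, not the gate-level implementation of Figure 3.32, and it assumes that odd\_even\_others = 1 selects the odd bank; the actual polarity is not stated in the text.

```python
def column_controls(shr, shs, odd_even_others, on_chip_cds):
    """Decode the three FPGA inputs and the ASPI bit into six controls.

    Assumption: odd_even_others = 1 selects the odd bank, 0 the even bank.
    """
    odd = bool(odd_even_others)
    even = not odd
    shr, shs, cds = bool(shr), bool(shs), bool(on_chip_cds)
    return {
        # The signal is always sampled from the column, regardless of CDS mode.
        "SHS_odd": shs and odd,
        "SHS_even": shs and even,
        # With on-chip CDS the reset level is sampled from the column.
        "SHR_odd": shr and cds and odd,
        "SHR_even": shr and cds and even,
        # Without on-chip CDS the V_black reference is sampled instead.
        "SHR_Vblack_odd": shr and not cds and odd,
        "SHR_Vblack_even": shr and not cds and even,
    }


print(column_controls(shr=1, shs=0, odd_even_others=1, on_chip_cds=0))
```

At most one of the four reset-related outputs is active for any input combination, which matches the description of SHR being steered by parity and by the CDS mode bit.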



Figure 3.33: Layout of the column circuits of one column showing a S&H stage and video amplifier

Figure 3.34 shows the reset and signal output of the video amplifiers being settled on the video bus and Figure 3.35 shows the eight phases of select and preset signals controlling the video amplifiers. The preset control presets the reset and signal video bus lines to a known voltage prior to signal settling, in order to eliminate any possible memory effect.



Figure 3.34: Transient simulation showing the outputs of the first Reset-Signal pair of video buffers along with the select and preset control signals



Figure 3.35: Video amplifier control sequence

### **3.6.5.** SINGLE-TO-DIFFERENTIAL AMPLIFIER (S2D)

With the S&H stage storing the reset and signal values originating from the pixel, the video amplifier is selected by the X Scanner and the two voltages are allowed to settle on a video bus pair (as shown in Figures 3.7 and 3.23), after which they will be sampled by the S2D (shown in Figure 3.36).



Figure 3.36: Schematic of a single S2D amplifier

It is at this point in the signal path that the CDS operation is performed by the S2D. By subtracting '*Reset*' from '*Signal*', it converts the pseudo-differential input to a fully differential output, which can then be buffered and sent off-chip for digitization. The S2D can be regarded as two principal sub-circuits: a sampling circuit comprised of input buffers  $A_1$ , switches  $\Phi_1$  and  $\Phi_2$  and capacitors  $C_S$ , followed by a capacitive trans-impedance amplifier (CTIA) stage  $A_3$  with programmable gain.

The operation of the S2D amplifier is divided into a sample and an amplify phase,  $\Phi_1$  and  $\Phi_2$  respectively. During  $\Phi_1$  both '*Reset*' and '*Signal*' are sampled on capacitors  $C_S$  and the feedback capacitors are reset. During  $\Phi_2$  the bottom plates of the  $C_S$  capacitors are connected to the corresponding common-mode voltage reference ( $V_{CMP}/V_{CMN}$ ), while the top plates are connected to the inverting inputs of the  $A_3$  amplifiers. This results in the original signal being amplified by a gain  $G = C_S/C_F$ , where  $C_F$  is the parallel combination of  $C_{F0}$  and any of the enabled feedback capacitors  $C_{F1}$  and  $C_{F2}$ . Therefore, the voltages of the two outputs  $outp_{S2D}$  and  $outn_{S2D}$  can be expressed as follows:

$$\begin{split} V_{outp} &= -(V_{signal} - V_{reset}) \times \frac{C_S}{C_F} + V_{CMP} \\ V_{outn} &= +(V_{signal} - V_{reset}) \times \frac{C_S}{C_F} + V_{CMN} \end{split}$$

It is worth noting that voltage clamping of the output is also implemented at this point in the signal chain. Potential defective 'hot' pixels or other faults along the signal chain may produce output signals outside the nominal range. This would cause clipping of the output stage amplifiers and, in turn, incomplete settling of the following sample, which manifests itself as ghosting in the image.

Output voltage clamping serves to prevent this condition from occurring. Two reference voltages  $V_{clamp_H}$  and  $V_{clamp_L}$ , are generated by on-chip DACs. These voltages are then buffered and connected to two pairs of clamping diodes, implemented using diode-connected NMOS transistors. If the output of the A<sub>3</sub> CTIA amplifier goes higher than  $V_{clamp_H} + V_{th}$  or lower than  $V_{clamp_L} - V_{th}$ , the voltage is clamped. Since the additional current is sunk (or sourced) by the clamp buffers A<sub>4</sub>, they should be sized accordingly, stronger than the A<sub>3</sub> amplifiers.
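The S2D transfer expressions and the clamping window can be combined into a short numerical sketch. The amplifier is treated as ideal, the diode clamp is approximated as a hard limit at the stated thresholds, and all capacitor and voltage values in the usage line are hypothetical.

```python
def s2d_outputs(v_signal, v_reset, c_s, c_f0, c_f1=0.0, c_f2=0.0,
                v_cmp=0.0, v_cmn=0.0):
    """Ideal S2D outputs (outp, outn) following the expressions in the text.

    c_f1 and c_f2 are the optional feedback capacitors; pass 0.0 for a
    capacitor that is not enabled. Parallel capacitances simply add.
    """
    c_f = c_f0 + c_f1 + c_f2
    gain = c_s / c_f
    diff = v_signal - v_reset
    return -diff * gain + v_cmp, diff * gain + v_cmn


def clamp(v_out, v_clamp_l, v_clamp_h, v_th):
    """Hard-limit v_out to [v_clamp_l - v_th, v_clamp_h + v_th]."""
    return max(v_clamp_l - v_th, min(v_out, v_clamp_h + v_th))


# Hypothetical example: C_S = 4 units, only C_F0 = 1 unit enabled (gain 4).
outp, outn = s2d_outputs(v_signal=1.0, v_reset=1.2, c_s=4.0, c_f0=1.0)
print(round(outp, 3), round(outn, 3), round(outp - outn, 3))
```

Because the two outputs move in opposite directions, the differential swing is twice the single-ended gain times the input difference, consistent with the "gain four (eight differentially)" wording used for the transfer plots.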



Figure 3.37: Layout of the eight S2D amplifiers of one output channel in a 4×2 configuration (rotated 90° CCW)

As previously mentioned in Section 3.6.1, eight S2D amplifiers are multiplexed out to a single output channel. In layout therefore, they are grouped into a single cell, whose width must be pitch-matched with that of the output stage. The layout of such a cell is illustrated in Figure 3.37. The eight phases of the phi1 and phi2 clocks required for the S2Ds of each quadrant are derived from the master pixel clock in each corner of the chip by the X-Clock Generator cell (see Section 3.7.2). In order to maintain uniformity of these signals across the width of the quadrant, the clocks are routed to its center and then distributed to the S2Ds by means of a clock tree. The acceptable worst-case clock skew determines the number of stages the tree requires. In the case of this work, a two-stage buffered clock tree is used, which feeds four groups of 256 S2Ds (i.e. 32 outputs), resulting in a worst-case skew of under 1 ns. It is of interest to keep the number of stages as low as possible, in an effort to minimize peak power consumption and noise generated by the digital circuits in the proximity of the sensitive analog circuitry of the S2Ds. The clock tree scheme is shown in Figure 3.38.

The physical layout of a group of 32 S2D amplifiers is shown in Figure 3.39. A transient simulation of an S2D amplifier at reduced clock speed is used to check for settling accuracy, shown in Figure 3.40. It illustrates an alternating darkest to brightest signal output from the  $A_3$  CTIA amplifier, which is used as an input to the inter-stage buffer  $A_5$  (see Section 3.6.6). Also shown in the same figure is the output of the inter-stage buffer  $A_5$  settling on the input gate of the final output buffer  $A_6$ . Finally, input/output transfer and gain plots are shown for programmable gains 1 and 4 in Figures 3.41 and 3.42, which are used to determine linearity of the S2D circuit and effectiveness of the voltage clamping circuit.



Figure 3.38: Distribution of the S2D clock signals



Figure 3.39: Layout of 32 channels of S2Ds showing the amplifiers and control circuits



Figure 3.40: Transient simulation showing the output of the S2D's CTIA stage, output of the interstage buffer and off-chip ADC input



Figure 3.41: Simulation of the S2D transfer at gain one (two differentially) with the clamping function disabled (case 0) and enabled (case 1)



Figure 3.42: Simulation of the S2D transfer at gain four (eight differentially) with the clamping function disabled (case 0) and enabled (case 1)

### **3.6.6.** OUTPUT STAGE

As shown in the signal path overview (Section 3.2.5), eight S2D amplifiers are multiplexed out to one output stage, which is comprised of four buffer amplifiers. The fast data rate (eight times that of the S2D amplifiers) of the multiplexed output, combined with the high driven load of the PCB track and ADC input, requires very large transistors in the output amplifiers  $A_6$  to achieve accurate and fast settling. Therefore, due to their large input gate capacitance, a single S2D amplifier is unable to drive them at the required speed at a reasonable power consumption. To get around this issue, an inter-stage buffer amplifier  $A_5$  is used following the multiplexing switches, as shown in Figure 3.43.



Figure 3.43: The output stage of a single channel showing the two stages of amplifiers and bondpads

As previously mentioned, the output buffers are responsible for outputting the analog pixel values to the off-chip ADC at the full rate of the pixel clock. Therefore, it must be ensured that the large capacitive load presented by the PCB track and the input capacitance of the off-chip ADC can be driven with sufficient settling accuracy by the amplifiers (Figure 3.44). This is done by simulating the worst case signal excursion from one sample to the next, from completely black (-1V) to completely white (+1V), with a modeled load of the PCB and ADC at the output. In Figure 3.45, the memory effect (the effect of the previous sample on the current one) is plotted versus the output stage bias current (in  $\mu$ A). Above approximately 7.5  $\mu$ A, the worst case memory effect is smaller than 1%, as shown in the plot.



Figure 3.44: The load driven by the output amplifier



Figure 3.45: Simulation of the worst-case memory effect of the output stage amplifier

# **3.7.** DIGITAL BLOCKS

In order for the analog core of the imager to operate and produce an image, a wide variety of control signals need to be generated and precisely orchestrated in time. These include the pixel control signals, such as those that perform the reset, select and charge transfer operations. The general rule is to run these control lines along the East-West axis of the imager core. In addition, scanning signals are required in both the X and Y direction to address a specific group of pixels at any given moment. All of the aforementioned signals that are required by the analog core need to be in precise temporal alignment with the group of controls that realize the sampling, amplifier control and output multiplexing. It is common to group the generation circuitry of a large majority of these controls in a functional circuit block called the Sequencer.

The reliable way to synthesize these control signals, while maintaining precise timing under all operating conditions, is by deriving them all from a global, stable, high-frequency timebase. It is commonplace for the X-clock (also called pixel clock) to be chosen. The X-clock can be either generated off-chip using a crystal oscillator or an FPGA and provided externally to the imager or can be synthesized on-chip by means of an oscillator and a phase-locked loop (PLL).

In the following sections the principal digital circuits of the imager will be introduced, along with their operation and design considerations.

### 3.7.1. X&Y SCANNERS

The job of the X and Y scanners is to generate the column and row select signals respectively, ensuring that only the pixels that will be read out are active and connected to the downstream readout electronics. In their simplest incarnation they typically consist of a chain of series-connected D-type flip-flops. By presetting and then clocking all the flip-flops in unison, a single "one" will propagate down the chain, thereby "scanning" it from beginning to end.

#### X SCANNER

In a typical X-scanner (Figure 3.46), the pixel array would be scanned from left to right, going through all the columns, one by one. However, as discussed in Section 3.2.2, this sensor uses a quadrant split — consequently, each individual quadrant needs to be scanned separately. In effect, the X-scanner is comprised of four chain groups, one for each quadrant, resulting in the sensor being scanned from the center and outwards.



Figure 3.46: Example of a typical X-scanner flip-flop chain

Previously, in Section 3.6.1, the time-division multiplexing readout strategy was discussed. A fundamental requirement for this scheme to work is the ability to scan the pixel array using eight scanning signals with a constant phase offset from one to the next, as illustrated in Figure 3.24. Hence, in this work, each X-scanner circuit for each quadrant is comprised of eight parallel scan chains, as illustrated in Figure 3.47. The X-Clock generator circuit, described in Section 3.7.2, takes care of the generation of the eight synchronization pulses and offset clocks which are required to drive the entire circuit. The synchronization signals serve to reset the chain and load it with the appropriate values (HIGH for the first flip-flop and LOW for the rest of the chain) in order to initiate the subsequent scan. Each output of the flip-flop chains is connected to two AND logic gates. The other input of each of these gates is an ASPI-generated signal (odd/even). The resulting output (kernel\_select) determines which parity of the column circuits will be used to store the pixel signals (as shown in Figure 3.31). The physical layout of one X-scanner cell is shown in Figure 3.48.
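A simplified behavioral model of the eight parallel scan chains is sketched below. It assumes that all chains are synchronized at the same instant and that chain p shifts on every pixel-clock tick for which tick mod 8 equals p; in the actual circuit the synchronization pulses are themselves phase-offset. The polarity of the odd/even control is likewise an assumption.

```python
class ScanChain:
    """One-hot shift register modelling a chain of D flip-flops."""

    def __init__(self, length):
        self.state = [0] * length

    def sync(self):
        # Load HIGH into the first flip-flop and LOW into the rest.
        self.state = [1] + [0] * (len(self.state) - 1)

    def clock(self):
        # Each flip-flop takes its predecessor's value; a LOW enters the chain.
        self.state = [0] + self.state[:-1]


def kernel_select(q, odd_even):
    """Gate one flip-flop output with the parity control (two AND gates).

    Assumption: odd_even = 1 routes the pulse to kernel_select_odd.
    """
    return {"kernel_select_odd": q & odd_even,
            "kernel_select_even": q & (1 - odd_even)}


def run_eight_phase(length, ticks):
    """Advance eight chains and return the active position of each one."""
    chains = [ScanChain(length) for _ in range(8)]
    for chain in chains:
        chain.sync()
    for tick in range(1, ticks + 1):
        chains[tick % 8].clock()  # only the chain whose phase is due shifts
    return [c.state.index(1) if 1 in c.state else None for c in chains]


print(run_eight_phase(length=4, ticks=4))
```

After four pixel-clock ticks only four of the eight chains have advanced, which illustrates the staggered scanning that the time-division multiplexed readout relies on.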



Figure 3.47: The 8-phase X-scanner developed for this work



Figure 3.48: Layout of the X-scanner circuit block

#### **Y SCANNER**

The Y Scanner is comprised of a simple flip-flop scan chain with much more relaxed specifications compared to its horizontal counterpart. The line scanning frequency is much lower than the horizontal one, since the scanner advances by one row only once the entire imager has been scanned horizontally. For a quick estimate of the relation between the two scanning periods, ignoring line overhead time, we can write  $T_{Yclk} = N_H \cdot T_{Xclk}$  for a non-quadrant sensor and  $T_{Yclk} = N_H/2 \cdot T_{Xclk}$  for a quadrant-based sensor, where  $N_H$  is the horizontal kernel count. A block diagram of the Y scanner and row driver for the West side of the chip is shown in Figure 3.49, showing how kernel rows are addressed.
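The estimate above amounts to a one-line calculation. The kernel count and clock period used below are hypothetical and serve only to show the factor of two gained by the quadrant split.

```python
def line_period(n_h, t_xclk, quadrant=False):
    """Row-scan period estimate, ignoring line overhead time."""
    kernels_per_scan = n_h / 2 if quadrant else n_h
    return kernels_per_scan * t_xclk


# Hypothetical values: 128 horizontal kernels, 10 ns X-clock period.
print(line_period(128, 10e-9), line_period(128, 10e-9, quadrant=True))
```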



Figure 3.49: Block diagram of the Y periphery showing the Y scanner

### 3.7.2. X-CLOCK GENERATOR

The X-Clock Generator circuit (shown in Fig. 3.50), located in the corner blocks of the sensor, is responsible for generating the necessary signals to drive the X Scanner. Its primary function is to divide the incoming high speed pixel clock (X Clock) into eight phases with a constant 45° phase offset to drive the eight X Scanner chains. Additionally, it generates the synchronization signals that reset the scan chains of the X Scanner, in order to initiate a new scan.



Figure 3.50: Eight phase X-Clock generator schematic

The operation of the circuit is summarized as follows: The high frequency pixel clock is provided to the circuit as an input (clk\_in), which is used to clock the four D-type flip-flops configured as a frequency divider when the framesync signal is low. This generates the eight divided clocks "clkdiv<0:7>". These eight clocks are used as an input to the NON OVERLAP cell to produce a differential non-overlapping replica of these signals.
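One common way to obtain eight equally spaced phases from four flip-flops is a twisted-ring (Johnson) counter, in which the inverted output of the last flip-flop feeds the first; the four outputs and their complements then form eight clocks at one eighth of the input frequency. The text does not state the exact divider topology, so the sketch below is an assumption used to illustrate the principle, and the phase ordering it produces need not match the "clkdiv<0:7>" indexing of the actual circuit.

```python
def johnson_divider(n_cycles, n_ff=4):
    """Simulate a twisted-ring (Johnson) counter of n_ff flip-flops.

    Returns one row per input clock cycle holding 2 * n_ff phases:
    the flip-flop outputs followed by their complements.
    """
    q = [0] * n_ff                     # state after a reset
    history = []
    for _ in range(n_cycles):
        q = [1 - q[-1]] + q[:-1]       # inverted last output feeds the first
        history.append(q + [1 - b for b in q])
    return history


def rising_edges(history, phase):
    """Indices of the input cycles at which the given phase rises."""
    trace = [row[phase] for row in history]
    return [i for i in range(1, len(trace))
            if trace[i] == 1 and trace[i - 1] == 0]


history = johnson_divider(24)
for phase in range(8):
    print(phase, rising_edges(history, phase))
```

Each phase rises once every eight input cycles and successive phases are offset by one input cycle, i.e. by 45° of the divided clock period.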

Additionally, the high frequency X Clock is also provided to the first sync cell. Its function is to receive a clock and a pulse input at any arbitrary time, and in turn produce an output pulse that is aligned in phase with the clock input and has a duration of exactly one clock cycle. The synchronization signal, having been processed by the sync cell, is now guaranteed to be aligned with the master pixel clock and free of glitches. This clean version is used as an input to two NAND gates, which in combination with an ASPI-generated sync\_method (and complementary nsync\_method) control signal, determines whether the incoming synchronization pulse is used as a frame synchronization or line synchronization. In the former case, the clock divider is reset, in addition to sync signals being generated, in order to reset the scan chains of the X Scanner. In the latter, the phase of the divided clocks is not corrupted, since the clock divider is kept running uninterrupted, with the sync signals generated at the next appropriate moment in time. The output signals of the circuit are illustrated in Figures 3.5 and 3.6. Additionally, the physical layout of the X-Clock generator circuit is shown in Figure 3.51.



Figure 3.51: Layout of the X-Clock generator circuit

### **3.7.3.** REGION-OF-INTEREST SELECTOR

In all types of pixel drivers with the exception of Type IV (detailed in Section 3.4), the D-latch is controlled by a "no operation" (nop) control. When "nop" is set low, the D-latch is transparent, allowing propagation of the signal. When "nop" is set high, the output is latched. In Region-of-Interest (ROI) mode, the "nop" control is used to lock the state of all the appropriate pixel drivers, making the unused pixels act like black pixels. By doing so, the power consumption and peak current of the pixel drivers are reduced as the ROI size decreases. The schematic of the ROI selection circuit of the North side is shown in Figure 3.52. The South side schematic is simply a mirrored version.
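The latch behavior described above can be captured in a minimal behavioral model. The class name and interface are hypothetical; only the transparent-when-low, hold-when-high behavior comes from the text.

```python
class NopLatch:
    """D-latch gated by an active-high 'nop' (no operation) control."""

    def __init__(self, q=0):
        self.q = q

    def update(self, d, nop):
        # nop low: transparent, the output follows the input.
        # nop high: the output keeps its previous value.
        if not nop:
            self.q = d
        return self.q


latch = NopLatch()
print(latch.update(1, nop=0), latch.update(0, nop=1), latch.update(0, nop=0))
```

A driver whose latch is held in this way ignores further control activity, which is why locked rows stop contributing to the switching current.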



Figure 3.52: The region of interest selector circuit

The unit cell in the red dashed box of Figure 3.52 is used to control the pixel drivers in one "kernel row" (16 rows of pixels). The bits ROI<0> to ROI<4> are ASPI-generated. When the ROI<4:0> configuration word is set high, all "nop" outputs are low, so that all pixel drivers are actively controlled and the full array is utilized (no ROI operation). Similarly, by setting the appropriate ROI<4:0> word, one can "lock" the pixels of any arbitrary number of kernel rows to act like black pixels, thereby reducing the effective region of interest.

### 3.7.4. (A)SPI INTERFACE

All configuration parameters of the image sensor (e.g. bias voltages, gain settings, region of interest etc.) are programmed via an SPI-like interface. Typically an SPI interface would be implemented using a long configuration shift register into which all the corresponding programming bits are serially uploaded. While this slow configuration upload might be adequate for devices that require only a single configuration to be set, for example on power-up, in an image sensor the configuration might need to be updated more often, during its operation; hence a high upload speed is required. This can be achieved using an addressable-SPI interface (ASPI). Instead of using a single large configuration register, in the case of ASPI many small registers are used, each identifiable by a unique address. In that way it is possible to dynamically update only the configuration bits that are of interest at high speed, without the need to re-upload the sensor's entire configuration. An example layout of an ASPI register is shown in Figure 3.53.



Figure 3.53: Example layout of an ASPI register used for configuring the sensor

The interface uses four lines (three inputs and an output) for interfacing with the imager: **SPI\_CLK**, the clock input to the shift registers, **SPI\_DATA**, the data input and **SPI\_LOAD**, an input that is pulsed after the upload is complete to apply the configuration. The fourth line, **SPI\_MBS** is a multi-purpose output that, amongst other uses, permits the read-back of the sensor's configuration for verification purposes. The complete functionality of the SPI\_MBS output is described in more detail in Section 3.8.4.



Figure 3.54: Example control waveforms of an ASPI upload transaction

In Figure 3.54, an example ASPI upload transaction is shown. First the data word to be uploaded is clocked in, followed by the register address. The next bit after the LSB of the address is the upload/readback (U/R) bit. When set to upload, the uploaded data is used to update the addressed register's configuration. When set to readback, the uploaded data should only contain one high bit indicating the register position to be read back. The selected register bit value is asserted onto the SPI\_MBS line with the purpose of verifying the correct value has been set. Last, the SPI\_LOAD line is pulsed to signal the end of the transaction.
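The ordering of the fields in a transaction can be illustrated by building the bit sequence that would be shifted in on SPI\_DATA before SPI\_LOAD is pulsed. The field widths (8 bits each), the MSB-first ordering of the data word and the polarity of the U/R bit are assumptions made for this sketch; the text only fixes the order data, address, U/R, with the U/R bit following the LSB of the address.

```python
def aspi_frame(data, address, upload=True, data_bits=8, addr_bits=8):
    """Return the bits shifted in on SPI_DATA, first bit first.

    Assumptions: 8-bit fields, MSB-first ordering, U/R = 1 means upload.
    """
    if not 0 <= data < (1 << data_bits):
        raise ValueError("data does not fit in data_bits")
    if not 0 <= address < (1 << addr_bits):
        raise ValueError("address does not fit in addr_bits")
    bits = [(data >> i) & 1 for i in reversed(range(data_bits))]
    bits += [(address >> i) & 1 for i in reversed(range(addr_bits))]
    bits.append(1 if upload else 0)  # U/R bit follows the address LSB
    return bits


def readback_frame(bit_position, address, data_bits=8, addr_bits=8):
    """Frame for reading back one bit: the data word is one-hot."""
    return aspi_frame(1 << bit_position, address, upload=False,
                      data_bits=data_bits, addr_bits=addr_bits)


print(aspi_frame(0xA5, 0x03))
print(readback_frame(bit_position=0, address=0x01))
```

A controller would clock these bits out on SPI\_CLK and then pulse SPI\_LOAD; for a readback frame it would afterwards sample the SPI\_MBS line.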

# **3.8.** DESIGN FOR TESTING

Planning ahead for the testing and characterization procedures of an image sensor is of great importance, since once the chip is manufactured it is nearly impossible to have access to internal signals for debugging or to make modifications to the design. Therefore, a system called MBS is used (detailed in Section 3.8.4), alongside various ways to perform tests for short-circuits as well as the implementation of test pixels, which vary from specialized pixels insensitive to light which produce a known output value to different variants of the default pixel layout and/or topology. These techniques aid with the identification of potential issues that may arise from either the design or manufacturing, as well as help make a decision in the case a pixel design modification needs to be performed.

### **3.8.1.** TEST & VARIANT PIXELS

So-called "variant" and "test" pixels are placed around the imager's active pixel array. In normal operation of the sensor these pixels do not constitute part of the output image, but are a valuable resource not only for performing various in-camera calibrations but also for optimizing the pixel design, testing and debugging the imager.

Variant pixels with slight alterations from the default pixel design are often implemented, especially in the case of custom image sensors featuring non-standard pixel designs, with the intention of testing new pixel concepts or fine-tuning specific characteristics of the pixel such as image lag, floating diffusion full-well and others.

The most common type of variant pixels are dark pixels. They produce a dark signal, regardless of the illumination of the scene. This is achieved through either optical means, whereby the pixel's photodiode is shielded from light (typically using a metal shield) or electrical means. In the latter case, one has several options for creating such a dark pixel. The reset transistor may be tied permanently high, therefore continuously resetting the floating diffusion, or alternatively the floating diffusion node itself can be tied to the pixel power supply. However, the preferred method, where applicable, consists of permanently tying the pixel's flush gate (seen in Section 3.3) high, so that any generated photocharges in the diode are continuously evacuated to the pixel power supply. This is typically the method of choice, due to the fact that the resulting dark pixel most closely resembles the imaging pixels, both in its structure and operation. As a result, the dark pixels exhibit row and reset noise, floating diffusion dark current and other non-idealities in the same way as the default pixels, making them a more useful tool for calibration of such effects. Another common type of such specialty pixel not discussed here are auto-focus pixels which, in conjunction with other components in the camera, allow for automatic focusing of the camera's lens.



Figure 3.55: Example implementation of CDS-capable grayscale pixel design and output of a row of sixteen such pixels

Test pixels are principally used during the development, debugging and testing phase of the image sensor. They primarily consist of pixels with a known predictable output to aid during the initial testing of the sensor. One such example of fixed output test pixels are "grayscale pixels", where a group of pixels is made such that it produces a linear gradient from darkest to brightest signal, useful for the characterization of the imager's analog chain. One way of implementing grayscale pixels is by replacing the photodiodes with a multiple tap resistive voltage divider. This implementation of grayscale pixels has the advantage of being CDS-capable, thereby requiring no modification to the timing of the sensor in order to read out. The concept and output from such a set of pixels is illustrated in Figure 3.55. Another example of fixed-output test pixels is the "bit pattern" - a group of pixels forming a recognizable pattern (such as text or a logo) made up by bright and dark pixels, aiding during the debugging of the output data remapping performed by the acquisition system.
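The linear gradient produced by a resistive divider can be illustrated with a short calculation. The sketch assumes equal resistors, taps that include both rail voltages, and hypothetical rail values; the actual divider of the sensor may differ in all three respects.

```python
def grayscale_taps(v_dark, v_bright, n_pixels=16):
    """Tap voltages of an equal-resistor divider between two rails."""
    step = (v_bright - v_dark) / (n_pixels - 1)
    return [v_dark + k * step for k in range(n_pixels)]


# Hypothetical rails for a row of sixteen grayscale pixels.
taps = grayscale_taps(v_dark=0.0, v_bright=1.5)
print([round(v, 2) for v in taps])
```

Reading out such a row should yield a staircase with equal steps; any deviation from equal steps in the measured output then points to non-linearity in the analog chain rather than in the stimulus.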

## 3.8.2. VDD<sub>pix</sub> DISABLING

The VDD<sub>pix</sub> disabling circuit (Figure 3.56) is an ASPI-controlled flip-flop scan chain which allows for disconnecting the pixel power supply on a single-column basis. While unused in the normal operation of the imager, it constitutes a helpful addition for the test engineer during the characterization phase, or if the need to debug an issue with the imager arises.



Figure 3.56: Illustration of the VDD<sub>pix</sub> disabling concept

As shown in Figure 3.57, the circuit's unit cell consists of a D-type flip-flop, an inverter, an OR gate and two PMOS switches placed in series with the power supply of each column (one for the source follower's supply and one for the pixel's reset supply).



Figure 3.57: Illustration of a single  $VDD_{pix}$  Disabling cell

The inverting output  $\overline{Q}$  of the D flip-flop is used for driving the two logic gates that control the PMOS transistors, so as to reduce the loading of the non-inverting output Q, reducing the chance of signal propagation errors across the chain.
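The behaviour of the scan chain can be sketched with a small software model. This is a behavioural illustration only, not the gate-level circuit: the class name `ScanChain`, the shift direction and the polarity (a stored 1 disconnecting the column) are assumptions made for the example, and the inverter and OR gate of the unit cell are not modelled.

```python
class ScanChain:
    """Serial-in shift register with one D flip-flop per column."""

    def __init__(self, n_columns):
        self.q = [0] * n_columns  # non-inverting outputs Q, all cleared

    def clock(self, data_in):
        """One clock edge: each flip-flop takes the value of its predecessor."""
        self.q = [data_in] + self.q[:-1]

    def load(self, pattern):
        """Shift in a full pattern so that pattern[i] ends up in flip-flop i."""
        if len(pattern) != len(self.q):
            raise ValueError("pattern length must match the chain length")
        for bit in reversed(pattern):
            self.clock(bit)

    def supply_enabled(self):
        """Assumed polarity: a stored 1 disconnects that column from VDDpix."""
        return [bit == 0 for bit in self.q]


# Disconnect only the third column (index 2) of an 8-column chain.
pattern = [0, 0, 1, 0, 0, 0, 0, 0]
chain = ScanChain(8)
chain.load(pattern)
enabled = chain.supply_enabled()
print(enabled)
```

Loading a pattern takes as many clock cycles as there are columns, which is acceptable for a feature intended for characterization and debugging rather than normal operation.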

The layout of eight  $VDD_{pix}$  disabling cells is shown in Figure 3.58 with the pass transistors and digital logic clearly visible.



Figure 3.58: Layout of the VDD<sub>pix</sub> Disabling circuit

# 3.8.3. COLUMN TEST

Due to the high speed specification of this work, many column wires are required to get around the limitation of their large time constant, as seen in Section 3.6.1 – in fact there are 16 column wires for every pixel pitch of  $16 \mu m$ . Therefore the probability of shorting between adjacent wires increases. Since a single short might not be immediately evident under normal image testing, an odd/even shorted column test feature is implemented to test for that eventuality with greater ease.



Figure 3.59: Shorted Column Test (Shown for the first 8 columns)

The concept is illustrated in Figure 3.59. Switches controlled via ASPI connect either all odd or all even columns to the pixel power supply $VDD_{pix}$, while the remaining columns are pulled to ground via the column load. This creates a pattern of alternating black and white columns in the output image, in which shorts can be easily identified.
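How a short shows up in the alternating pattern can be sketched as follows. The snippet is a hedged illustration: column levels are normalized so that 1.0 stands for a column tied to $VDD_{pix}$ and 0.0 for a column pulled low, columns are numbered from 1, and the tolerance and the measured values are invented for the example.

```python
def expected_pattern(n_columns, drive_odd):
    """Normalized level per column: 1.0 if tied to VDDpix, 0.0 if pulled low.
    Columns are numbered from 1, so column 1 counts as odd (an assumption)."""
    return [1.0 if (col % 2 == 1) == drive_odd else 0.0
            for col in range(1, n_columns + 1)]


def suspect_columns(measured, drive_odd, tolerance=0.2):
    """Column numbers whose measured level deviates from the expected pattern."""
    expected = expected_pattern(len(measured), drive_odd)
    return [col
            for col, (m, e) in enumerate(zip(measured, expected), start=1)
            if abs(m - e) > tolerance]


# Invented measurement: columns 4 and 5 sit at an intermediate level,
# as one would expect if the two wires were shorted together.
measured = [1.0, 0.0, 1.0, 0.55, 0.6, 0.0, 1.0, 0.0]
print(suspect_columns(measured, drive_odd=True))  # prints [4, 5]
```

Two adjacent columns flagged together point to a short between neighbouring wires, whereas a single flagged column would suggest a different kind of defect.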

### **3.8.4.** THE MBS

The Mixed Boundary Scan (MBS) is a system comprising a chip-wide low-speed analog bus and a variety of MBS cells. The cells are small circuits that, under ASPI control (see Section 3.7.4), allow various nodes on the chip to be connected to the MBS bus. With the bus exposed to the outside world via a bondpad, the MBS system allows the test engineer to "probe" any node equipped with a sense cell in order to debug the operation of the chip. It is also possible to apply an externally generated voltage to a node via the MBS bus. The final use case for MBS is verification of the ASPI configuration. Every ASPI register is capable of placing any of its bits onto the MBS bus. This makes it possible to read back all (or part) of the data contained in the registers and verify that the intended configuration has been uploaded and set.



Figure 3.60: The three fundamental MBS cell structures

Since MBS needs to be as versatile as possible, working with both analog and digital signals and sometimes in a bidirectional fashion, a variety of MBS cells exists to serve each purpose. Figure 3.60 shows a functional diagram of the three main types of MBS cell. The analog and digital sense cells are used to probe internal nodes. The drive cell can be used either to probe a low-impedance node where no buffering is needed, such as a power supply, or to assert an external voltage onto an on-chip node.
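The register read-back use case can be sketched as a simple test routine. Everything in this snippet is hypothetical: `place_bit_on_bus` stands in for whatever ASPI command selects a register bit and for the external measurement of the bus voltage, and the logic levels and threshold are illustrative values.

```python
def read_register(place_bit_on_bus, n_bits, v_threshold):
    """Read a register back one bit at a time over the analog bus.

    place_bit_on_bus(i) is a hypothetical callback that selects bit i
    and returns the voltage then measured on the MBS bondpad.
    """
    return [1 if place_bit_on_bus(i) > v_threshold else 0
            for i in range(n_bits)]


def mismatches(uploaded, read):
    """Bit positions where the read-back value differs from the uploaded one."""
    return [i for i, (u, r) in enumerate(zip(uploaded, read)) if u != r]


# Simulated chip: bit 5 did not take the uploaded value.
uploaded = [1, 0, 1, 1, 0, 0, 1, 0]
stored = [1, 0, 1, 1, 0, 1, 1, 0]
read = read_register(lambda i: 3.3 * stored[i], n_bits=8, v_threshold=1.65)
print(mismatches(uploaded, read))  # prints [5]
```

Because the bus is a low-speed analog line, such a read-back is sequential and slow, which suits a verification step performed once after configuration.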

# **3.8.5.** WAFER PROBING

Wafer probing refers to contacting the pads of each individual integrated circuit on a silicon wafer by means of very fine needles. This process allows preliminary testing and characterization of the fabricated ICs prior to dicing, establishing early on whether their manufacture was successful or whether signs of defects are present. An early form of performance assessment is also possible. While this is a helpful step in the production chain of ICs, for chips such as image sensors, which may feature a great number of bondpads, both the cost of the complex and delicate probecard that carries all the contacting needles and the probability of causing a short-circuit increase rapidly with the number of pads.

To avoid the tremendous complexity and cost of such a probecard, a multiplexing system has been developed which allows sequential readout of all output channels on one side of the sensor via one differential pair.



Figure 3.61: Waferprobe scheme

The sensor features four differential probe output channels in total (PROBE\_OUT N/P), one for each quadrant (NW, NE, SW, SE). These comprise the outputs of the wafer probe bypass system. Each of a quadrant's output channels features a switch and set output, controlled via the sensor's ASPI interface. Once the appropriate code is uploaded, the corresponding output channel will be connected to the PROBE\_OUT(N/P) output bus for probing. The wafer probe multiplexing concept is shown for a single corner of the chip in Figure 3.61.
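The sequential readout through a shared differential pair can be sketched with a behavioural model. The class `ProbeMux`, the method names and the channel voltages below are assumptions made for illustration; the actual select codes and channel count are not modelled.

```python
class ProbeMux:
    """Behavioural model: many differential channels share one PROBE_OUT pair."""

    def __init__(self, channels):
        self.channels = list(channels)  # (positive, negative) voltage per channel
        self.selected = None            # nothing connected until a code is uploaded

    def upload_select_code(self, index):
        """Stand-in for the ASPI upload that closes one channel's switch."""
        if not 0 <= index < len(self.channels):
            raise IndexError("no such channel")
        self.selected = index

    def probe_out(self):
        """Voltages seen on the shared PROBE_OUT(N/P) pair."""
        if self.selected is None:
            raise RuntimeError("no channel selected")
        return self.channels[self.selected]


def scan_all(mux):
    """Connect each channel in turn and record its differential voltage."""
    readings = []
    for index in range(len(mux.channels)):
        mux.upload_select_code(index)
        positive, negative = mux.probe_out()
        readings.append(positive - negative)
    return readings


# Invented channel voltages in volts.
mux = ProbeMux([(1.75, 1.25), (1.5, 1.5), (1.25, 1.75)])
readings = scan_all(mux)
print(readings)  # prints [0.5, 0.0, -0.5]
```

The trade-off is test time for probecard simplicity: every channel is still checked, but only one pair of needles per quadrant is needed for the outputs.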

# **3.9.** MODES OF OPERATION

The sensor has three modes of operation, all using global shuttering. The default mode (Mode 1) is read-while-integrate (RWI) operation with on-chip correlated double sampling (CDS) and no high dynamic range (HDR) function. The second mode (Mode 2) features RWI and CDS, with the addition of HDR operation. The final operating mode (Mode 3) is RWI operation with neither CDS nor HDR.

#### **OPERATING MODE 1**

In Mode 1, both the pixel's high gain signal and dark reference are sampled and read out. More time is therefore needed to settle and sample the two values, which lowers the framerate in exchange for the benefit of correlated double sampling. Since only the pixel's high gain is used (no HDR), the TG3A and TG3B low gain transfer gates are permanently low, as are the merge switch controls MA and MB. The timing diagram for mode 1 is shown in Figure 3.62.
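The benefit of sampling both values can be shown with a short numerical sketch. It is an arithmetic illustration only, not a model of the on-chip sample and hold circuit: the voltages are invented, and it assumes that the signal level lies below the dark reference and that the per-pixel offset is identical in both samples.

```python
def cds(dark_reference, signal):
    """Correlated double sampling: difference of the two samples of one pixel."""
    return dark_reference - signal


# Three pixels receiving the same light but with different offsets (volts).
swing = 0.25                     # invented signal swing caused by the light
offsets = [0.00, 0.03, -0.02]    # invented per-pixel offsets
results = []
for offset in offsets:
    dark_reference = 2.0 + offset
    signal = dark_reference - swing
    results.append(cds(dark_reference, signal))

print([round(r, 3) for r in results])  # prints [0.25, 0.25, 0.25]
```

Any term that is common to both samples, such as a fixed offset, drops out of the difference, which is why the extra sampling time of Mode 1 buys a cleaner signal.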

#### **OPERATING MODE 2**

In Mode 2, three values are read out from the pixel (high gain signal, high gain dark reference and low gain signal), allowing for both CDS and HDR operation at a further cost in framerate. Just prior to the end of the integration time, the floating diffusion, storage node and merge capacitor are reset. At the end of the integration time the TG3 transfer gate is turned on at mid-level, thereby skimming the large majority of the charge accumulated in the photodiode into the low gain capacitor. Then the TG1 transfer gate is also pulsed on, which allows the remaining charge on the diode to be transferred to the storage node of the pixel. After that, with the merge switch off, the rolling readout of the high gain frame can commence: the floating diffusion reset level is read out, a rolling TG2 pulse transfers the storage node charge to the floating diffusion, and the high gain signal is read out. The low gain frame follows, in which the merge switch connects the low gain capacitor to the floating diffusion and the low gain signal is read out. The timing diagram for mode 2 is shown in Figure 3.63.

#### **OPERATING MODE 3**

The highest framerate is achieved in Mode 3, since only the high gain signal needs to be settled on the column wires and read out. The TG3 transfer gate and merge switch controls are permanently low and the reset control and TG2 transfer gate are toggled globally for the entire pixel array in order to reset the storage node and floating diffusion. Then, the reset is turned off and the TG1 transfer gate is globally toggled to transfer the photodiode charge to the floating diffusion node. No pixel dark reference is provided to the readout circuits for CDS and an external dark reference is used in its place, as described in Section 3.6.4. The timing diagram for mode 3 is shown in Figure 3.64.
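The three modes can be summarized by the values each one reads out per pixel. The sketch below encodes only what the descriptions above state; the `Mode` dataclass is a hypothetical helper, and the relative cost printed at the end rests on the simplifying assumption that every sample takes the same time to settle and read, which the actual timing need not satisfy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mode:
    name: str
    cds: bool
    hdr: bool
    samples: tuple  # values read out per pixel


MODES = (
    Mode("Mode 1", cds=True, hdr=False,
         samples=("high gain dark reference", "high gain signal")),
    Mode("Mode 2", cds=True, hdr=True,
         samples=("high gain dark reference", "high gain signal",
                  "low gain signal")),
    Mode("Mode 3", cds=False, hdr=False,
         samples=("high gain signal",)),
)

fastest = min(MODES, key=lambda mode: len(mode.samples))
for mode in MODES:
    # Relative cost under the equal-time-per-sample assumption.
    relative_cost = len(mode.samples) / len(fastest.samples)
    print(f"{mode.name}: CDS={mode.cds}, HDR={mode.hdr}, "
          f"{len(mode.samples)} sample(s), relative readout cost {relative_cost:.0f}x")
```

The ordering matches the text: Mode 3 reads the fewest values and is the fastest, Mode 1 adds the dark reference for CDS, and Mode 2 adds the low gain signal for HDR.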



Figure 3.62: Timing diagram for operating mode 1



Figure 3.63: Timing diagram for operating mode 2


Figure 3.64: Timing diagram for operating mode 3

## 3.10. SUMMARY

This chapter describes the design of the high-speed global shutter image sensor. It starts with the top-level architecture of the chip and the initial high-level decisions, such as the separation of the pixel array into four quadrants, the kernel concept and size, and the uninterrupted pipelined readout architecture. An overview of the signal path, from the pixel all the way to the analog output of the imager, is presented in order to give the reader a clearer top-level view of how light information is captured and propagates through the circuits of the sensor to be read out and form an image.

Next, the 11T global shutter HDR pixel is presented, alongside the pixel driver that generates all the necessary driving signals to operate the pixel. Further on, the KernelScan concept is introduced, thanks to which four subsets of pixels within a kernel can be adjusted to have an independent integration time, resulting in higher temporal resolution at the expense of lower spatial resolution.

Then, progressing logically further down the signal path, the readout circuitry is described. This includes the column multiplexing scheme – which plays an essential role in achieving the very high frame rate of this imager. Details are given on the circuits associated with the readout of the column signal, such as the column biasing current sources, the switched-capacitor sample and hold circuits with amplification, the single to differential amplifiers as well as the output stage amplifiers that drive the outputs of the chip.

Furthermore, an overview of the digital circuits of the sensor is given. These serve to control and coordinate the operation of the analog circuitry so as to capture and read out an image from the pixel array. Such circuits include the X and Y scanners, the X-Clock generator, the Region of Interest selector, as well as the ASPI circuits that constitute the digital control interface to the imager.

Having gone over the entirety of the signal path of the imager and the supporting circuits, the testing provisions of the design are discussed, along with their intended purpose. These include the test/variant pixels and the  $VDD_{pix}$  disabling, column test, MBS and wafer probe circuits.

Finally, the chapter concludes with a brief overview of the different modes of operation of the sensor, alongside Table 3.1, which summarizes the principal design specifications for this work.

| Specification              | Target                 |
|----------------------------|------------------------|
| Process technology         | 0.11 μm GS CIS         |
| Pixel size                 | 16 μm × 16 μm          |
| Pixel type                 | 11T Global Shutter HDR |
| Pixel count (H)            | 1280                   |
| Pixel count (V)            | 1024                   |
| Noise (with CDS)           | $5\,e^-_{\rm rms}$     |
| Noise (without CDS)        | $20\,e^-_{\rm rms}$    |
| Correlated double sampling | On-chip                |
| Read mode                  | IWR/GS with CDS        |
| ADC                        | Off-chip               |

Table 3.1: Principal design specifications for this work

#### REFERENCES

- [1] T. Goji Etoh et al. "A CCD image sensor of 1Mframes/s for continuous image capturing of 103 frames". IEEE International Solid-State Circuits Conference. Digest of Technical Papers, San Francisco, CA, USA, pp. 30-386 (2002)
- [2] Y. Tochigi et al. "A Global-Shutter CMOS Image Sensor With Readout Speed of 1-Tpixel/s Burst and 780-Mpixel/s Continuous". IEEE Journal of Solid-State Circuits, vol. 48, no. 1, pp. 329-338 (2013)
- [3] T. Haruta et al. "A 1/2.3inch 20Mpixel 3-layer stacked CMOS Image Sensor with DRAM". IEEE International Solid-State Circuits Conference (ISSCC), San Francisco, CA, 2017, pp. 76-77 (2017)
- [4] Takayanagi, I.; Yoshimura, N.; Mori, K.; Matsuo, S.; Tanaka, S.; Abe, H.; Yasuda, N.; Ishikawa, K.; Okura, S.; Ohsawa, S.; Otaka, T. "An Over 90 dB Intra-Scene Single-Exposure Dynamic Range CMOS Image Sensor Using a 3.0 μm Triple-Gain Pixel Fabricated in a Standard BSI Process". Sensors, 18, 203 (2018)
- [5] Willassen, T. et al. "A 1280x1080 4.2 μm Split-diode Pixel HDR Sensor in 110nm BSI CMOS Process". Intl. Image Sensor Workshop, Vaals, The Netherlands (2015)
- [6] T. Lulé, C. Mandier, A. Glais, G. Roffet, R. Monteith, B. Deschamps (STMicroelectronics, Imaging Division, Grenoble, France and Edinburgh, UK). "High Performance 1.3MPix HDR Automotive Image Sensor". Intl. Image Sensor Workshop, Vaals, The Netherlands (2015)
- [7] Johannes Solhusvik, Jiangtao Kuang, Zhiqiang Lin, Sohei Manabe, Jeong-Ho Lyu, Howard Rhodes, OmniVision Technologies "A comparison of high dynamic range CIS technologies for automotive applications". Intl. Image Sensor Workshop, Snowbird, Utah USA (2013)
- [8] Christian Bouvier, Yang Ni, New Imaging Technologies SA, France "Logarithmic Image Sensor for Wide Dynamic Range Stereo Vision System" Intl. Image Sensor Workshop, Snowbird, Utah USA (2013)
- [9] Caeleste CVBA. "Three level transfer gate". United States Patent US9780138B2, Oct. 3, 2017
- [10] Kabushiki Kaisha NAC Image Technology, NAC Image Technology KK. *"High-speed video camera"*. European Patent Application EP2451149A3, Mar. 19, 2014

# 4

## **CONCLUSION**

This work describes the design of a Caeleste high speed global shutter CMOS image sensor manufactured in a $0.11\,\mu$m technology, in which I participated under the guidance and supervision of Gaozhan Cai. The feature highlights of the sensor are:

- Full frame framerate >10,000 fps
- Flexible integration time
- Electronic global shutter
- High dynamic range
- 1k×1k active pixels
- 16 µm pixel pitch

My most significant contributions to this work have been:

- Schematic design of the X&Y scanner, pixel driver, ROI selector, Sample & Hold and S2D blocks, as well as certain parts of the corner blocks, I/O blocks and top-level schematic
- Functional verification by simulation of various sub-circuits such as the pixel driver, Sample & Hold and video amplifier, column circuits, S2D amplifiers
- Layout, DRC and LVS physical verification and post-layout simulations of the S2D amplifiers and associated circuits

### FUTURE WORK

The sensor was taped out in August 2018. The first silicon has been fully processed and, as of March 2019, is on hold for dicing and assembly, pending the manufacture of the sensor package. Once the package is available, the imager's test and characterization phase will begin.