

**Delft University of Technology** 

### High Precision Charge Detection Frontend Electronics

Mohammad Zaki, A.R.

DOI 10.4233/uuid:1496bf1b-7827-4ddd-9438-ac30ff56b938

Publication date 2025

**Document Version** Final published version

Citation (APA) Mohammad Zaki, A. R. (2025). High Precision Charge Detection Frontend Electronics. [Dissertation (TU Delft), Delft University of Technology]. https://doi.org/10.4233/uuid:1496bf1b-7827-4ddd-9438-ac30ff56b938


# High Precision Charge Detection Frontend Electronics

Alireza Mohammad Zaki




### Dissertation

for the purpose of obtaining the degree of doctor at Delft University of Technology by the authority of the Rector Magnificus Prof.dr.ir. T.J.H. Vlugt, chair of the Board for Doctorates, 4 July 2025 at 10:00

by

### Alireza MOHAMMAD ZAKI

Master of Science in Electronic Engineering, Politecnico di Milano, Italy

born in Tehran, Iran

This dissertation has been approved by the promotors.

Composition of the doctoral committee:

| Rector Magnificus      | chairperson                                |
|------------------------|--------------------------------------------|
| Prof. dr. S. Nihtianov | Delft University of Technology, promotor   |
| Dr. S. Du              | Delft University of Technology, copromotor |

Independent members:

| Prof. dr. ir. W.A. Serdijn | Delft University of Technology                 |
|----------------------------|------------------------------------------------|
| Dr. P. Ramachandra Rao     | Delft University of Technology                 |
| Prof. dr. J. Anders        | University of Stuttgart, Germany               |
| Prof. dr. K. Ozanyan       | University of Manchester, United Kingdom       |
| Dr. Y. Wang                | ASML Silicon Valley, USA                       |
| Prof. dr. A. Yarovoy       | Delft University of Technology, reserve member |

The research described in this thesis was supported by the Dutch Top Consortium for Knowledge and Innovation (TKI) HTSM.

Statement: The language and style quality of this thesis is enhanced by AI (ChatGPT), followed by a thorough text review.



Copyright © 2025 by Alireza Mohammad Zaki

ISBN 978-94-6384-810-7

All rights reserved. No part of this publication may be reproduced or distributed in any form or by any other means or stored in a database or retrieval system without the prior written permission of the author.

An electronic version of this dissertation is available at

https://repository.tudelft.nl/

To my family

# **Table of Contents**

- 1 Introduction
  - 1.1 Motivation
  - 1.2 Main Question and Research Methodology
  - 1.3 Thesis Organization
  - 1.4 References
- 2 Background Overview
  - 2.1 Introduction
  - 2.2 BSE Detector
  - 2.3 PIN-Diode Readout Modes
    - 2.3.1 Short Circuit Mode
    - 2.3.2 Open Circuit Mode
  - 2.4 Target Application Specifications
  - 2.5 State-of-the-Art Readout ASICs
    - 2.5.1 Small Pixel High Rate Detector (SPHIRD)
    - 2.5.2 Low Noise PIXel (LNPIX)
    - 2.5.3 Fast Single Photon Counting Pixel (PXF40)
    - 2.5.4 Versatile Readout ASIC with High Count Rate Capability (IBEX)
    - 2.5.5 Micro-channel Plate Readout ASIC (MIRA)
    - 2.5.6 Open Circuit Mode Readout Pixel
    - 2.5.7 Conclusions
  - 2.6 References
- 3 Readout Solutions for Short Circuit Operation Mode
  - 3.1 Introduction
  - 3.2 Analog Readout Frontend for Short Circuit Operation Mode
  - 3.3 Preamplifier Stage
    - 3.3.1 Operation Principle
    - 3.3.2 Core Amplifier
    - 3.3.3 Feedback Resistor
    - 3.3.4 Noise
    - 3.3.5 Design and Implementation
  - 3.4 Signal Shaping Filter
    - 3.4.1 Passive High-Pass Signal Shaping Filter
    - 3.4.2 Active High-Pass Signal Shaping Filter
  - 3.5 Threshold Discriminator
    - 3.5.1 Discriminator Design for Passive Shaping Filter
    - 3.5.2 Discriminator Design for Active Shaping Filter
  - 3.6 Preamplifier Reset Generator
  - 3.7 Conclusions
  - 3.8 References
- 4 Readout Solutions for Open Circuit Operation Mode
  - 4.1 Introduction
  - 4.2 Readout Frontend for Open Circuit Operation Mode
  - 4.3 Readout Frontend Architecture
    - 4.3.1 Matching of Detector and Dummy Capacitors
    - 4.3.2 Periodic Sampling
  - 4.4 Design and Implementation
    - 4.4.1 Dynamic Comparator
    - 4.4.2 Offset Compensation
    - 4.4.3 Threshold Generation
    - 4.4.4 Capacitor Matching Mechanism
    - 4.4.5 Additional Blocks
    - 4.4.6 Pixel Overview
- 5 Experimental Results
  - 5.1 Qualifications Test Setup
    - 5.1.1 Goals of the Qualification Tests
    - 5.1.2 Device Under Test (DUT)
    - 5.1.3 Test PCB
    - 5.1.4 Detector Emulating Circuit (DEC)
    - 5.1.5 Pulse Time Width Modulator
    - 5.1.6 Multiple Pulse Generator
    - 5.1.7 Shift Register
    - 5.1.8 Linear Feedback Shift Register
    - 5.1.9 Data Acquisition Board
    - 5.1.10 Poissonian-Distributed Trigger Pulses
  - 5.2 Experimental Qualification Results
    - 5.2.1 Experimental Qualification of Readout Channels Operating Circuit Mode
    - 5.2.2 Experimental Qualification of Readout Channels Operating Circuit Mode
  - 5.3 Reference
- 6 Conclusions and Future Works
  - 6.1 Conclusions
  - 6.2 Main Findings and Contributions
  - 6.3 Future Works
    - 6.3.1 Short Circuit Operation Mode
    - 6.3.2 Open Circuit Operation Mode
  - 6.4 References
- Appendix A
  - Loop and Stability Analysis
  - nal Loop Analysis
  - A.2.1 Total Loop Analysis
  - A.2.2 Stability Assessment
- Appendix B
  - B.1 FPGA Code
- Appendix C
  - C.1 Chip Gallery
- Summary
- Samenvatting
- List of Publications
- Acknowledgments
- About the Author

## 1 Introduction

#### 1.1 Motivation

Scanning electron microscopes (SEMs) have undergone extensive development in recent years [1], [2], [3]. Modern SEMs generate images revealing details of a specimen with nanometer resolution. To achieve such high resolution, the electron optics require a smaller primary beam of relatively low-energy electrons, finely focused on the target surface to deliver a scanning spot of only a few nanometers [4], [5]. This results in a smaller secondary electron beam consisting of backscattered and secondary electrons, which are sensed by a detector and its readout electronics. A well-known disadvantage of SEM is the time needed to produce an image of a meaningful area, which is long because of the small scanning spot [6], [7]. This is one of the challenges of using SEM as an in-production-line inspection tool in the semiconductor industry. One solution is to increase the scanning speed.

Overall, modern SEMs require a highly sensitive detector followed by low-noise, wide-bandwidth, power-efficient readout electronics. These three requirements contradict each other and pose an enormous challenge for the readout electronics, especially when the PIN-diode-based detector has a large area in order to capture all secondary and backscattered electrons, and hence a large capacitance [8], [9], [10]. A solution is to divide the detector area into small parts (pixels), each with its own readout channel [10]. This approach reduces the capacitance of each pixel, which helps to improve the signal-to-noise ratio (SNR) and to extend the bandwidth of the readout channels, as demanded by the increased scanning speed. However, having multiple channels puts pressure on the power budget of each pixel's readout electronics [11]. Moreover, the weak secondary electron beam, the high number of pixels, and the very short clock periods (a consequence of the fast scanning speed) lead to the need to detect single electrons landing on a pixel within one clock period, which converts the electron-current sensing mode of operation into a "single electron counting mode". Finally, the electron counting for each pixel has to happen with a very low count error (5 ppm or lower) for the SEM to produce a good-quality image.

#### 1.2 Main Question and Research Methodology

The main question to be answered in this thesis is: which frontend readout electronic architecture and circuit solution provides the best performance with respect to power efficiency (power consumption lower than 500 $\mu$W), a high time resolution of 2.5 ns, and an electron count error lower than 10 ppm?

To answer the main question, a systematic approach is undertaken, beginning with a comprehensive study of potential frontend electronic architectures. This study aims to assess their capabilities in meeting the expected performance criteria, focusing particularly on critical parameters such as noise, speed, and power consumption. Given the pivotal role of the analog frontend, and especially the preamplifier, in determining the precision, speed, and accuracy of signal processing, this component receives special attention during the evaluation of existing architectures. The outcome of this study informs the development of two new readout frontend electronic architectures specifically tailored to address the challenges of noise, speed, and power efficiency.

According to these defined criteria, novel readout frontend architectures are investigated and implemented. Their performance is evaluated through analysis, simulations, and experimental qualifications. This iterative design and evaluation process ensures that the proposed solutions not only meet but exceed the required performance criteria, establishing a new state-of-the-art in readout frontend electronics for SEM applications.

#### 1.3 Thesis Organization

Figure 1-1 provides an overview of the thesis organization, outlining the content and focus of each chapter. Chapter 2 delves into the operating principles of the PIN-diode, emphasizing the critical need for a low-noise, high-speed, and high-precision readout frontend. It also reviews state-of-the-art solutions for readout interfaces in similar applications, analyzing their limitations and identifying opportunities for enhancement.



Fig. 1-1. Organization of the thesis.

Chapters 3 and 4 investigate the performance limits of two readout frontend architectures tailored for the short circuit and open circuit operation modes of the PIN-diode, respectively. These chapters detail the design concepts, implementation strategies, and performance advantages of the proposed solutions.

Chapter 5 describes the experimental setup and auxiliary blocks required for the qualification of the proposed readout frontends to validate the performance of the designed prototypes. Moreover, Chapter 5 presents the results of the experimental qualifications. It includes a detailed characterization of the signal at each functional block, noise performance evaluations, and an assessment of detection accuracy. Chapter 6 concludes the thesis by summarizing the key contributions and insights from this research. It also offers recommendations for future work, suggesting potential directions for further improvement and innovation based on the outcomes of the study.

#### 1.4 References

- [1] N. Erdman, D. C. Bell, and R. Reichelt, "Scanning Electron Microscopy," in Springer Handbook of Microscopy, P. W. Hawkes and J. C. H. Spence, Eds., Cham: Springer International Publishing, 2019, pp. 229–318. doi: 10.1007/978-3-030-00069-1\_5.
- [2] P. W. Hawkes and E. Kasper, Principles of Electron Optics, Volume 2. Academic Press, 2017.
- [3] D. B. Williams and C. B. Carter, Transmission Electron Microscopy: A Textbook for Materials Science. Springer, 2009.
- [4] N. Brodusch, H. Demers, and R. Gauvin, Field Emission Scanning Electron Microscopy. in SpringerBriefs in Applied Sciences and Technology. Singapore: Springer, 2018. doi: 10.1007/978-981-10-4433-5.
- [5] L. Frank, M. Hovorka, I. Konvalina, Š. Mikmeková, and I. Müllerová, "Very low energy scanning electron microscopy," Nuclear Instruments and Methods in Physics Research Section A: Accelerators, Spectrometers, Detectors and Associated Equipment, vol. 645, no. 1, pp. 46–54, Jul. 2011, doi: 10.1016/j.nima.2010.12.214.
- [6] A. R. Faruqi and R. Henderson, "Electronic detectors for electron microscopy," Current Opinion in Structural Biology, vol. 17, no. 5, pp. 549–555, Oct. 2007, doi: 10.1016/j.sbi.2007.08.014.
- [7] W. Zhou and Z. L. Wang, Eds., Scanning Microscopy for Nanotechnology. New York, NY: Springer, 2007. doi: 10.1007/978-0-387-39620-0.
- [8] G. Lutz, Semiconductor Radiation Detectors. Berlin, Heidelberg: Springer Berlin Heidelberg, 2007. doi: 10.1007/978-3-540-71679-2.
- [9] H. Spieler, Semiconductor Detector Systems. Oxford University Press, 2005. doi: 10.1093/acprof:oso/9780198527848.001.0001.
- [10] Y. Wang, Z. Dong, R.-L. Lai, and K. Kanai, "Semiconductor charged particle detector for microscopy," WO2019233991A1, Dec. 12, 2019.
- [11] M. Nakhostin, Signal processing for radiation detectors. Hoboken: John Wiley, 2018.

## 2 Background Overview

This chapter explores the state-of-the-art readout frontends of diode-based electron detectors, highlighting their advantages and the challenges in achieving high-resolution, efficient imaging in SEMs. We also discuss the principles and techniques of detecting backscattered electrons (BSEs) and secondary electrons (SEs), focusing on the role of the semiconductor PIN-diode and the balance between the scanning current intensity and the beam spot size.

#### 2.1 Introduction

In electron microscopy, detecting electrons emitted from the specimen surface is essential for obtaining detailed compositional and topographical information about specimens. Scanning Electron Microscopes achieve this by scanning the specimen surface with a primary electron beam and detecting the secondary current produced by both backscattered electrons (BSEs) and secondary electrons (SEs).

Secondary Electrons, Fig. 2-1 (a), are produced when primary electrons undergo inelastic collisions with the atoms in the sample. These collisions cause the ejection of low-energy electrons, typically less than 50 eV. SEs are primarily used to image the surface morphology of a sample due to their generation close to the surface of the specimen, which makes them highly sensitive to surface features. BSEs, Fig. 2-1 (b), on the other hand, are generated through elastic collisions, where the primary electrons are deflected by atomic nuclei within the sample. The energy of BSEs is closely related to the atomic number of the atoms they interact with, making them especially valuable for material composition analysis. Higher atomic number elements tend to backscatter electrons more efficiently, resulting in brighter areas in BSE images that correspond to regions with heavier elements.

In Scanning Electron Microscopy, both SEs and BSEs are crucial for analyzing and imaging specimens, and are collectively referred to as backscattered electrons in this context. These electrons are collected using an electron detector, typically a semiconductor PIN-diode designed with a doughnut-shaped structure [1]. This design features a central hole that allows the primary electron beam to pass through while efficiently capturing backscattered electrons that are emitted from the specimen in various directions. An application-specific integrated circuit (ASIC) is coupled with the PIN-diode for signal processing and data acquisition.



Fig. 2-1. Different electron emission mechanisms: a) SE and b) BSE.

Key factors influencing imaging quality and scanning speed include the scanning current intensity and the electron beam spot size. The scanning current intensity determines the number of electrons interacting with the specimen surface, affecting signal strength and image contrast. Higher current intensities can speed up imaging but may also increase specimen damage and reduce resolution [2]. Additionally, prolonged exposure to high-energy electrons can degrade semiconductor detectors, reducing sensitivity and increasing noise over time [2], [3]. Smaller beam spot sizes, which improve spatial resolution, necessitate longer scanning times to cover the entire area. Fast scanning with short sampling periods means only a few electrons strike the detector per period, presenting challenges for precise detection [4], [5].

The strategic placement of the detector above the specimen ensures that it can effectively collect BSEs regardless of their departure angle, maximizing the detection efficiency. Electrons reaching the detector can spread out, creating a larger beam spot on the detector surface compared to the specimen surface. To address this, the detector is often segmented into multiple sensing segments/pixels as shown in Fig. 2-2. As the number of pixels increases significantly, their area reduces. Under these conditions, the probability of a pixel being hit by more than one electron per scanning step becomes negligible, enabling the use of a single-electron detection mode [1]. Highly segmented detectors are effective in mitigating common issues such as dark current, shot noise, and reduced SNR, which are more problematic in larger detector areas.



Fig. 2-2. Highly pixelized PIN-diode.

A common approach for implementing single-electron detection mode involves connecting each pixel in a two-dimensional matrix of a semiconductor detector to its own pulse processing circuit in the readout ASIC. This configuration, known as a hybrid pixel detector, ensures that the pixel pitch of the detector corresponds to the pitch of the readout channels in the ASIC. Typically, a fine-pitch flip-chip direct physical interconnection method is employed to establish the connection between the detector pixels and the readout ASIC [6], [7].

In BSE detection frontends, the primary function of the readout channels is to detect and count the total number of BSEs impacting the detector within a defined time frame, as specified by the scanning algorithm. Each readout channel generates a logical '1' through the digitizer when a BSE is detected within this timeframe. The total number of BSEs per scanning step is then calculated by aggregating the counts from all channels that have produced this logical state. The readout channels are designed to convert the charge signal generated by the detector due to BSE impacts into a digital pulse for subsequent post-processing. This involves an analog frontend for signal extraction and a digital backend for data processing, interfaced through either a multi-bit analog-to-digital converter (ADC) or, for event registration, a threshold discriminator functioning as a one-bit ADC [8], [9], [10], [11]. Converting to a digital format is crucial for accurate data analysis and image reconstruction. Figure 2-3 illustrates a simplified block diagram of a typical readout channel. The analog frontend typically consists of a preamplifier that interfaces with the detector, a gain/filter stage designed to amplify and shape the signal, and a threshold discriminator that separates the signal from noise and digitizes the hit data [8], [12]. These components work together to ensure that each detected electron is accurately recorded, enhancing the overall effectiveness and precision of the BSE detection system.
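The per-step aggregation described above can be sketched in a few lines of Python. This is only an illustration: the function name `count_bse` and the eight-channel flag list are hypothetical and do not describe the actual digital backend.

```python
def count_bse(channel_flags):
    """Return the number of channels that registered a hit in one scanning step.

    Each entry is the one-bit discriminator output of a channel:
    truthy (1) if a BSE was detected in the time frame, falsy (0) otherwise.
    """
    return sum(1 for flag in channel_flags if flag)


# Hypothetical 8-channel example: three channels fired in this time frame.
flags = [0, 1, 0, 0, 1, 0, 1, 0]
print(count_bse(flags))
```

Because each channel contributes at most one count per time frame, this sum equals the number of detected BSEs only as long as the single-electron assumption per pixel holds.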



Fig. 2-3. Simplified block diagram of a generic readout channel.

The conversion of the charge signals generated in the detector into voltage signals is a critical step in the readout process, and this is typically handled by the preamplifier stage in the analog frontend. For optimal performance, the preamplifier must be optimized for noise level, bandwidth, and power consumption relative to the capacitance of the pixel detector and the count rate capability of the system [8]. In electron counting systems, an electron signal is recorded only if it exceeds a threshold level set above the intrinsic system noise. While this threshold effectively filters out unwanted noise, it can also result in missed counts if the threshold is too high, thereby reducing detection efficiency. The linear behavior of the readout channel across its dynamic range is maintained if the counter capacity is sufficient and pulse pileup is effectively managed. Note that both the secondary and backscattered electrons leaving the surface of the specimen are equally accelerated before reaching the detector. This makes the variation in the energy of the electrons absorbed by the detector relatively small compared with the variation caused by other factors, such as Fano noise.

One of the significant challenges for state-of-the-art readout frontends is the accurate registration of weak charge signals and the ability to handle high flux rate signals generated by the detectors. This challenge is further increased by the need to detect subtle electron signals with minimal error rates and high time resolution, typically on the nanosecond scale [13]. Achieving such a high level of precision requires readout channels that offer broad bandwidth and low noise, while also maintaining moderate power consumption to prevent overheating and noise-induced signal degradation. Power efficiency in the readout channels is especially critical when managing a large number of pixels. Excessive power consumption can lead to thermal heating, which not only increases the overall noise floor but also causes bias drift in the circuit, reducing detection accuracy [11], [14]. Therefore, managing power consumption efficiently is essential to maintain the overall performance of the system.

Balancing the need for high sensitivity and fast response times with power efficiency is a complex task. The system must be able to detect weak signals without introducing significant noise, handle high flux rates, and maintain low error rates. This delicate trade-off highlights the importance of advanced circuit design in state-of-the-art readout frontends, enabling them to meet the stringent demands of high-resolution detection systems.

The error rate in charge signal detection is a critical performance metric that can be compromised by noise and inter-symbol interference (ISI). ISI refers to the overlap and accumulation of signals at the output of a low-bandwidth stage [15]. The intricate interplay between noise and ISI (as illustrated in Fig. 2-4) necessitates an optimal bandwidth that strikes a balance in error rates between the two factors for maintaining high detection accuracy [16], [17], [18].



Fig. 2-4. Conceptual plot of the trade-off between bandwidth, noise, ISI and the detection error rate.

#### 2.2 BSE Detector

The detector used in this work is a PIN-diode, which is similar to a standard PN-junction diode but features an additional layer of intrinsic semiconductor material between the p and n regions. This intrinsic layer, characterized by high resistivity, significantly increases the diode's overall depletion width, as illustrated in Fig. 2-5 [8]. The PIN-diode operates under reverse bias, where the reverse voltage depletes the intrinsic layer of the detector.

Figure 2-6 shows a simplified small-signal electrical equivalent circuit of a PIN-diode, which ignores the non-linear behavior of the junction. The circuit includes a current source ( $I_{Signal}$ ) representing the current generated by electron-hole pair creation, and a capacitor ( $C_D$ ) denoting the diode junction capacitance. The capacitance decreases with an increase in the intrinsic layer width, provided it is fully depleted.

The number and size of detector pixels are key factors in shaping the performance and efficiency of the detection system. Larger pixel areas generally lead to increased leakage current and associated noise contributions, which can degrade the SNR and complicate the design of wide-bandwidth, power-efficient readout electronics.



Fig. 2-5. PIN-diode structure.



Fig. 2-6. Electrical equivalent circuit of PIN-diode.

In advanced detectors, the diode leakage current is often comparable to the gate leakage current of connected MOSFETs, meaning that the shot noise from both leakage currents can contribute to the Dark Count Rate (DCR)—the rate of false detections in the absence of actual electrons [6], [12]. However, in high-quality detectors, the noise from leakage current is usually negligible compared to the noise introduced by the readout electronics. This highlights the importance of optimizing the design of both the detector and the readout electronics to achieve accurate results. Balancing pixel size with the readout channel design is critical to maintaining high detection performance. Larger pixels may capture more signals, but they also generate more noise, while smaller pixels can reduce noise but might limit detection efficiency.

In single-electron detection mode, the detector must have a fast response time to maintain high imaging speeds. The response time of the detector is influenced by two factors: (1) the time required for the generated electrons and holes to reach the two opposite electrodes, and (2) the time constant of the detector, defined by the junction capacitance and series resistance. Each of these factors can independently limit the detector response time. Given that the pixel area is determined by higher-level system requirements, reducing the junction capacitance by increasing the depletion width of the PIN-diode shortens the time constant. However, increasing the depletion width also lengthens the charge-collection time. The depletion width is determined by the intrinsic layer thickness of the PIN-diode, which must be fully depleted to maintain low series resistance. While higher bias voltage increases the electric field intensity in the depletion region, thereby accelerating the electrons and reducing collection time, extremely high bias voltages are not recommended for this application [13]. Additionally, once electrons reach their speed limit in silicon, further increasing the electric field intensity becomes ineffective.
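The trade-off between junction capacitance and charge-collection time can be illustrated with a rough parallel-plate estimate. The sketch below is not the author's detector model: it assumes a fully depleted parallel-plate junction with no fringing or parasitic capacitance, the pixel dimensions and capacitance range quoted in Section 2.4, and an electron saturation velocity of about 10^5 m/s in silicon. The function names are illustrative, and the resulting widths are indicative only.

```python
EPS0 = 8.854e-12   # vacuum permittivity, F/m
EPS_R_SI = 11.7    # relative permittivity of silicon
V_SAT = 1.0e5      # approximate electron saturation velocity in Si, m/s (assumed)


def depletion_width(c_d, area):
    """Depletion width W = eps * A / C_D of an ideal parallel-plate junction, in m."""
    return EPS0 * EPS_R_SI * area / c_d


def min_transit_time(width):
    """Lower bound on the drift time across the depletion region, in s.

    Assumes carriers travel the full width at the saturation velocity.
    """
    return width / V_SAT


pixel_area = (165e-6) ** 2  # 165 um x 165 um pixel, in m^2

for c_d in (30e-15, 50e-15):
    w = depletion_width(c_d, pixel_area)
    t = min_transit_time(w)
    print(f"C_D = {c_d * 1e15:.0f} fF -> W ~ {w * 1e6:.0f} um, "
          f"transit time >= {t * 1e9:.2f} ns")
```

Under these assumptions, the lower-capacitance corner implies the wider depletion region and therefore the longer minimum transit time, which is the trade-off described above.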

#### 2.3 PIN-Diode Readout Modes

There are two primary modes for reading out the signal generated by a PIN-diode: short circuit mode and open circuit mode [12], [19]. These modes differ in the method used to handle and process the charge signal produced by the diode. These methods come with their own set of advantages and limitations, impacting the effectiveness and efficiency of charge signal detection.

#### 2.3.1 Short Circuit Mode

In short circuit mode (Fig. 2-7), the PIN-diode is connected to a load with zero impedance ( $Z_L = 0$ ), ensuring that all the charge generated by the diode ( $I_{Signal}$ ) flows directly into the load rather than accumulating in the diode junction capacitance ( $C_D$ ). This mode enables continuous readout of the PIN-diode, offering real-time monitoring of charge signals. While a load with zero impedance is an ideal scenario, it can be practically approximated using a preamplifier that establishes a virtual ground at the output of the diode. This virtual ground provides a low-impedance connection, allowing fast charge transfer from the photodetector to the preamplifier.



Fig. 2-7. Short circuit readout mode of PIN-diode.

An important advantage of the short circuit mode is the prevention of charge accumulation in the diode's junction capacitance ( $C_D$ ). Furthermore, the charge-to-voltage (Q – V) conversion in this mode is virtually independent of  $C_D$  variations, significantly reducing errors caused by capacitance fluctuations and enhancing reliability in detecting small signal variations [20].

A critical aspect of the short circuit mode is optimizing the preamplifier time constant. The time constant, determined by the feedback resistance and capacitance, must be carefully matched to the diode charge collection time ( $t_{Collection}$ ). An excessively large time constant can result in signal pileup, where overlapping charge signals cause errors in BSE detection [8], [21]. Conversely, a very short time constant can lead to insufficient signal integration, reducing overall gain and sensitivity. Proper fine-tuning of the time constant ensures maximum signal gain and minimizes pileup risk. Additionally, short circuit mode excels in handling high event rates, mitigating signal pileup even in high-throughput environments. When combined with a well-designed preamplifier and signal shaping filter, it processes signals efficiently, compensating for pileup while maintaining high precision [20].

Furthermore, noise performance in short circuit mode is significantly influenced by the detector junction capacitance ( $C_D$ ). The junction capacitance, combined with the preamplifier characteristics, affects the overall noise level. High junction capacitance can lead to increased overall noise of the readout channel, which degrades the SNR and overall detection accuracy [13], [21]. Addressing these challenges requires meticulous circuit design, with a focus on reducing noise and optimizing the feedback network. However, short circuit mode's sensitivity to parasitic effects, particularly those associated with  $C_D$ , introduces additional complexity. Maintaining strict impedance control is essential but can limit flexibility, as variations in operating conditions or signal characteristics may necessitate significant adjustments to the preamplifier or circuit design. These requirements add complexity to the system and reduce its adaptability in dynamic environments [21].

Therefore, achieving optimal performance in short circuit mode necessitates precise circuit design and careful selection of preamplifier components. These efforts are essential for minimizing noise, maximizing detection accuracy, and ensuring the system's robustness across a range of operating conditions.

#### 2.3.2 Open Circuit Mode

In open circuit mode (Fig. 2-8), the PIN-diode is connected to a load with infinite impedance ( $Z_L = \infty$ ), causing all the charge generated by the diode ( $I_{Signal}$ ) to accumulate in the diode junction capacitance ( $C_D$ ) rather than flowing into the load. This configuration creates a high-impedance connection, allowing the accumulated charge to be converted into a voltage signal already in the diode itself. The voltage signal is then directly fed to a threshold discriminator, which compares it with a reference threshold level to detect the presence of an incoming signal [13], [20].



Fig. 2-8. Open circuit readout mode of PIN-diode.

The open circuit mode offers significant advantages, particularly in terms of power efficiency. One of the key benefits is that no power is consumed during the conversion of charge into voltage, which is typically the most power-hungry stage in current mode. Another advantage is that the voltage signal is held by the diode junction capacitance until it is reset. This persistent voltage signal allows for the use of a dynamic, power-efficient discriminator, which can further reduce power consumption by only activating when necessary. These features enable efficient signal processing without the need for constant power, making open circuit mode an attractive option for applications that require both high precision and low energy usage.

The performance of the discriminator is critical for accurate signal detection. For effective operation, the discriminator must exhibit high precision and a fast response time. High precision ensures that even subtle signals are accurately detected, while fast response time is essential to minimize the risk of missed signals during rapid successive events. Advanced techniques, such as auto-zeroing and dynamic threshold adjustment, may be employed to further enhance discriminator precision and responsiveness. By ensuring that the discriminator operates effectively, the detection system can achieve higher accuracy and reliability, ultimately improving the overall signal detection process.

Operating in open circuit mode presents several challenges, with charge pileup being one of the most critical [13]. Charge pileup occurs when multiple electron impacts cause the voltage over the junction capacitance ( $C_D$ ) to approach the diode threshold voltage. This leads to signal gain compression, where additional incoming charge results in diminishing changes to the output voltage. This non-linearity reduces measurement accuracy, making it difficult to accurately detect and quantify incoming signals.
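To give a sense of the voltage levels involved in open circuit mode, the following sketch evaluates the ideal voltage step per electron hit, ΔV = Q/C_D, using the charge and capacitance values quoted in Section 2.4. It assumes a constant, voltage-independent C_D and neglects the discriminator input capacitance and any gain compression, so it describes only the linear region before pileup sets in. The function name is illustrative.

```python
Q_SIGNAL = 160e-18  # nominal charge per BSE hit, in C (160 aC)


def voltage_step(c_d, q=Q_SIGNAL):
    """Ideal voltage step on the junction capacitance for one hit, in V."""
    return q / c_d


for c_d in (30e-15, 50e-15):
    dv = voltage_step(c_d)
    # Linear accumulation over three unreset hits (no compression modeled).
    print(f"C_D = {c_d * 1e15:.0f} fF: {dv * 1e3:.2f} mV per hit, "
          f"{3 * dv * 1e3:.2f} mV after three hits")
```

The steps are in the range of a few millivolts, which indicates the precision the discriminator must offer when the diode voltage is sensed directly.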

#### 2.4 Target Application Specifications

The focus of this work is on investigating the analog frontend of multi-channel readout ASICs integrated with highly segmented semiconductor-based detectors, particularly for applications requiring precise detection and registration of weak charge signals with high time resolution. The goal is to optimize signal extraction, amplification, and noise reduction to improve the accurate capture and registration of BSEs in Scanning Electron Microscopy. This investigation is based on the following three key assumptions regarding BSE behavior in the SEM environment:

- 1. **Single-Electron Detection:** Each pixel in the detector is assumed to register the impact of only one electron within one time frame.
- 2. **Low Number of Consecutive Hits:** It is assumed that the frequency of electrons hitting each pixel is low enough, so that the probability of more than three electrons hitting the pixel in three consecutive time frames is negligible.
- 3. **Fixed Energy of Detected Electrons:** The incoming electrons are assumed to have a limited variation in their energy levels. This assumption simplifies the design of the readout architecture by minimizing the need to account for significant fluctuations in the charge generated by the detector per incoming electron.

The first two assumptions are valid due to the high degree of pixelization in the detector, which ensures that each pixel is relatively small and thus less likely to receive multiple electron impacts in a short time frame. Moreover, the relatively low number of backscattered electrons produced by the specimen (typically a few tens per time frame) further supports the assumption that electron hits on individual pixels will be minimal.
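The plausibility of the single-electron assumption can be checked with a simple occupancy estimate. The sketch below is an illustration rather than part of the original analysis: it assumes that hits follow Poisson statistics and are spread uniformly over the roughly 10000 pixels of the detector, and it takes 50 BSEs per time frame as a hypothetical value within the "few tens" mentioned above.

```python
import math


def prob_multiple_hits(electrons_per_frame, n_pixels):
    """Probability that a given pixel receives two or more hits in one frame.

    Assumes Poisson-distributed hits, spread uniformly over all pixels.
    """
    lam = electrons_per_frame / n_pixels  # mean hits per pixel per frame
    return 1.0 - math.exp(-lam) * (1.0 + lam)


# Hypothetical operating point: 50 BSEs per frame over 10000 pixels.
p = prob_multiple_hits(50, 10_000)
print(f"P(>= 2 hits in one pixel per frame) ~ {p:.2e}")
```

A real BSE distribution is not uniform across the detector, so pixels near the center of the spot will see a higher occupancy than this average figure suggests.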

The detector used in this study is highly segmented, consisting of approximately 10000 pixels, each with dimensions of 165  $\mu$ m × 165  $\mu$ m. Each sensing pixel in the detector exhibits a junction capacitance in the range of C<sub>D</sub> = 30 fF – 50 fF, with a charge collection time of t<sub>Collection</sub> = 1.8 ns. The series resistance of the detector (R<sub>S</sub>) is in the range of 100 Ω to 200 Ω. Combined with the junction capacitance (C<sub>D</sub>), this results in a detector time constant of only a few picoseconds. This time constant is negligible compared to the time resolution of interest in the system, ensuring that the detector's inherent RC delay does not impact the overall temporal performance.
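The size of the detector time constant follows directly from τ = R_S · C_D. The short sketch below evaluates the two extreme corners using the series resistance range of Table 2-1 (100 Ω to 200 Ω) and the quoted capacitance range; the function name is illustrative.

```python
def rc_time_constant(r_s, c_d):
    """Detector time constant tau = R_S * C_D, in seconds."""
    return r_s * c_d


# Corner cases from the quoted ranges of R_S and C_D.
tau_min = rc_time_constant(100.0, 30e-15)
tau_max = rc_time_constant(200.0, 50e-15)

print(f"tau between {tau_min * 1e12:.0f} ps and {tau_max * 1e12:.0f} ps")
# Compare with the 2.5 ns time resolution targeted for the readout channel.
print(f"ratio to 2.5 ns: at least {2.5e-9 / tau_max:.0f}x smaller")
```

Even in the slowest corner, the time constant stays more than two orders of magnitude below the targeted time resolution.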

For each BSE impacting the detector pixel, a current signal is generated with an average amplitude of  $I_{Signal} = 89 \text{ nA} \pm 20 \%$  during the charge collection time. This corresponds to a total charge signal of  $Q_{Signal} = 160 \text{ aC} \pm 20 \%$ , equivalent to approximately 1000 electrons with an uncertainty of 200 electrons. The  $\pm 20 \%$  uncertainty in the generated signal arises from the Fano effect. Fano noise refers to the inherent statistical fluctuations in the number of electron-hole pairs generated by the interaction of incident radiation with the semiconductor detector. The total charge generated is proportional to the energy lost by the impinging BSE within the detector, and the Fano noise reflects the variations in the energy required to generate each electron-hole pair [22], [23]. Table 2-1 provides a summary of the key specifications of the PIN-diode used in this work.

| Specification                                     | Value                                                   |
|---------------------------------------------------|---------------------------------------------------------|
| Number of Pixels                                  | 10000                                                   |
| Pixel Area                                        | 165 μm × 165 μm                                         |
| Junction Capacitance (C <sub>D</sub> )            | 30 fF – 50 fF                                           |
| Series Resistance (R <sub>S</sub> )               | 100 – 200 Ω                                             |
| Charge Collection Time (t <sub>Collection</sub> ) | 1.8 ns                                                  |
| Signal Current (I <sub>Signal</sub> )             | 89 nA ± 20 %                                            |
| Total Charge Signal (Q <sub>Signal</sub> )        | $160 \text{ aC} \pm 20 \% (1000 \text{ e}^- \pm 20 \%)$ |
| Leakage Current $(I_{Leak})$                      | ~10 fA                                                  |

Table 2-1. Key specifications of the PIN-diode.
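The relation between the signal current, the charge collection time, and the total charge in Table 2-1 can be reproduced with a short calculation. The sketch assumes a rectangular current pulse, that is, a constant current of 89 nA over the full 1.8 ns collection time, which is a simplification of the real pulse shape.

```python
E_CHARGE = 1.602176634e-19  # elementary charge, in C

I_SIGNAL = 89e-9        # average signal current, in A
T_COLLECTION = 1.8e-9   # charge collection time, in s
SPREAD = 0.20           # +/-20 % variation quoted for the signal

# Rectangular-pulse assumption: Q = I * t.
q_signal = I_SIGNAL * T_COLLECTION
n_electrons = q_signal / E_CHARGE

print(f"Q_Signal ~ {q_signal * 1e18:.1f} aC")
print(f"equivalent to ~ {n_electrons:.0f} electrons")
print(f"range: {n_electrons * (1 - SPREAD):.0f} to "
      f"{n_electrons * (1 + SPREAD):.0f} electrons")
```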

The target specifications for the readout channels aim to achieve certain electron counting precision with minimum power consumption. Table 2-2 summarizes these specifications:

• **High precision:** The readout channel must be capable of accurately detecting weak charge signals, specifically those less than 200 aC (approximately 1250 e<sup>-</sup>), while maintaining an exceptionally low error rate (less than 5 ppm) and high time resolution (2.5 ns).

• **Low power consumption:** Each readout channel must operate with power consumption below 500 µW, a crucial requirement for minimizing thermal heating and preventing bias drift in the circuit, which could degrade the performance of the detection system.

These specifications reflect the demands for high accuracy and performance in electron detection, while maintaining power efficiency to support scalability in large detector arrays.

| Specification                              | Value    |
|--------------------------------------------|----------|
| Input Charge Signal (Q <sub>Signal</sub> ) | < 200 aC |
| Power Consumption                          | < 500 μW |
| Time Resolution                            | 2.5 ns   |
| Charge Detection Error Rate                | < 5 ppm  |

Table 2-2. Target Specifications for the Readout Channel.
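To illustrate how demanding the 5 ppm error-rate target is, the following sketch estimates the noise level that a simple threshold discriminator could tolerate. It is a back-of-the-envelope model, not the analysis used in this thesis: it assumes zero-mean Gaussian noise, a fixed threshold at half of the smallest expected signal (800 electrons, the −20 % corner), no ISI or pileup, and it treats 5 ppm as the allowed probability of each error type per decision. All function names are illustrative.

```python
import math


def gaussian_tail(x):
    """P(N > x) for zero-mean, unit-variance Gaussian noise N."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def error_probabilities(signal_e, threshold_e, enc_e):
    """Return (false-count, missed-count) probabilities per decision."""
    p_false = gaussian_tail(threshold_e / enc_e)
    p_miss = gaussian_tail((signal_e - threshold_e) / enc_e)
    return p_false, p_miss


def max_enc_for_target(signal_e, threshold_e, target):
    """Largest ENC (in electrons) keeping both error types below target.

    Uses bisection; both error probabilities grow monotonically with ENC.
    """
    lo, hi = 1.0, float(signal_e)
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if max(error_probabilities(signal_e, threshold_e, mid)) <= target:
            lo = mid
        else:
            hi = mid
    return lo


worst_case_signal = 800.0   # electrons, nominal 1000 e- minus 20 %
threshold = 400.0           # electrons, midpoint threshold (assumed)
enc_limit = max_enc_for_target(worst_case_signal, threshold, 5e-6)
print(f"ENC must stay below ~ {enc_limit:.0f} electrons rms")
```

Under these assumptions the tolerable noise is on the order of 90 electrons rms; ISI, threshold mismatch, and baseline drift would tighten this budget further.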

#### 2.5 State-of-the-Art Readout ASICs

This section reviews state-of-the-art readout frontends designed for high-efficiency charge signal detection in applications requiring precision and low-noise operation. These readout ASICs are typically integrated with segmented semiconductor-based detectors and operate in single photon or single electron detection modes. Six recently reported pixel readout channels are analyzed here. These solutions define the current state-of-the-art in the field and come closest to the performance targeted in this thesis. For each of the six readout channels, we analyze its performance, its advantages, and the shortcomings that prevent its direct implementation in the targeted application. Five solutions are tailored for short circuit operation mode of the photodiode, while the last one represents a preliminary exploration and simulation-based design for open circuit operating mode of the photodetector.

The readout interfaces designed for short circuit mode include a preamplifier stage that operates in either transimpedance mode (when  $\tau_F \approx t_p$ ) or charge-sensitive mode (when  $\tau_F \gg t_p$ ), where  $\tau_F$  and  $t_p$  are the feedback time constant and the peaking time at the preamplifier output, respectively. Achieving a large feedback time constant ( $\tau_F$ ) in charge-sensitive mode requires a high feedback resistance ( $R_F$ ) in the preamplifier. Various circuit implementations have been proposed to address this requirement, with many applications employing either a Krummenacher feedback network or an ICON Cell in the charge-sensitive amplifier (CSA) to realize the necessary large  $\tau_F$ . These configurations ensure effective charge collection, signal amplification, and noise suppression, making them ideal for detecting fast and low-energy charge signals with high time resolution.

#### 2.5.1 Small Pixel High Rate Detector (SPHIRD)

A notable example of a readout ASIC designed for high photon flux environments is the SPHIRD-1, as detailed in [24]. This ASIC, fabricated in a 40 nm CMOS process, is optimized for high count rate single-photon counting applications in the energy range of 15 keV to 30 keV (equivalent to charge in a range of  $Q_{in} = 4100e^- - 8300e^-$ ). Each readout pixel, as shown in Fig. 2-9, incorporates a preamplifier (operating in transimpedance mode) with a fast discharge block and a detector leakage compensation circuit. The preamplifier's output is AC-coupled to a threshold discriminator with offset trimming blocks. To address DC baseline shift issues caused by pulse undershoot, a baseline restorer (BLR) is implemented after the AC coupling capacitor.

To handle high-speed signal processing and mitigate pulse pileup, SPHIRD-1 incorporates two compensation methods: voltage-based and time-based. The voltage-based method utilizes auxiliary discriminators to detect analog pulses exceeding the amplitude of single-photon hits, which indicates pulse pileup. The time-based method uses a time-over-threshold (ToT) technique to measure the pulse duration from the main discriminator, identifying extended digital pulses that also indicate pulse pileup [24].



Fig. 2-9. Block diagram of the SPHIRD-1 pixel readout electronics.

This design achieves low noise performance, with an equivalent noise charge (ENC) of 188  $e_{rms}^-$ , and supports a count rate of up to 28.9 mega counts per second per pixel (Mcps/pixel), consuming 26  $\mu$ W per pixel. However, these advantages come with certain trade-offs. One limitation is the periodic deadtime associated with the auto-zeroing process of the discriminators, which is necessary for offset correction. This deadtime can vary between 3.5 ns and 28.9 ns, depending on the pileup compensation method used. During these deadtimes, the readout channel is blind to any particles hitting the detector surface, which increases the detection error rate and reduces overall accuracy [24].

While SPHIRD-1 demonstrates low power consumption and reasonable noise levels, its time resolution of 34 ns and periodic deadtime render it inadequate for the high-precision, time-sensitive applications central to this thesis. The periodic deadtime introduces a critical limitation, as it interrupts continuous signal monitoring, which is essential for maintaining ultra-low error rates in applications requiring uninterrupted detection accuracy. This performance gap during deadtimes significantly affects the overall system reliability, particularly in scenarios demanding continuous and precise charge signal detection.

Moreover, while SPHIRD-1 benefits from a high SNR for larger input charge signals, its detection accuracy declines for smaller amplitude signals. This is because the noise level remains constant, causing the SNR to degrade as the input signal weakens. Consequently, the reduced SNR for low-amplitude signals leads to diminished detection accuracy, further limiting the applicability of SPHIRD-1 in contexts that prioritize high precision and the ability to resolve small charge signals with minimal error.

#### 2.5.2 Low Noise PIXel (LNPIX)

The LNPIX ASIC, designed using a 130 nm CMOS process, addresses the challenges of high-resolution detection for low-energy photons (less than 10 keV or 2800e<sup>-</sup>) through its advanced circuit architecture [25]. As shown in Fig. 2-10, the ASIC integrates a preamplifier configured in charge-sensitive mode, followed by an active CR-RC shaping filter and two threshold discriminators. The CR-RC shaping filter enhances SNR by providing additional gain, pulse shaping, and noise filtration. The threshold discriminators enable the selection of an appropriate energy window for the incoming photons. To reduce offset variation at the discriminators' inputs, each pixel includes two local offset correction Digital-to-Analog Converters (DACs). The effective threshold applied to the discriminator is the combination of a global threshold and an independent correction voltage for each discriminator.

The preamplifier's feedback network features the Krummenacher network to implement a large resistor as well as compensating for the detector leakage current. Moreover, the feedback network incorporates two equal-sized capacitors ( $C_{F_1}$  and  $C_{F_2}$ ) arranged in parallel, with a rapid discharge mechanism to enhance energy resolution and count rate performance. When an input pulse arrives, it charges both feedback capacitors and registers the photon hit. This triggers a feedback discharge action in the preamplifier, where the plates of  $C_{F_2}$  are initially disconnected from their original nodes and then reconnected with swapped node positions for a few nanoseconds via CMOS switches [25]. This process discharges the preamplifier feedback capacitance within a few nanoseconds but introduces a deadtime of a few nanoseconds to the readout channel. Additionally, the charge-sharing nonidealities of the CMOS switches can lead to variable signal amplitudes after the preamplifier, potentially impacting detection accuracy for the application in this thesis.



Fig. 2-10. Block diagram of the LNPIX pixel readout electronics.

The readout channel achieves an ENC of 44  $e_{rms}^-$  while consuming 42  $\mu$ W per pixel. The fast feedback discharge capability of the preamplifier enables the detector to handle high photon flux rates with low noise levels, making it suitable for high-speed applications [25]. However, this ASIC supports a maximum count rate of up to 0.6 Mcps/pixel which corresponds to a time resolution of 1.6  $\mu$ s. In this regard, the provided time resolution does not meet the requirements of the targeted application in this thesis.

#### 2.5.3 Fast Single Photon Counting Pixel (PXF40)

The ASIC described in [26] introduces an advanced pixel readout IC architecture, designed using a 40 nm CMOS process to enhance the performance of X-ray imaging systems. This ASIC, depicted in Fig. 2-11, includes a preamplifier configured in charge-sensitive mode, with its parameters optimized for low noise and high-speed operation for input photons with 8 keV energy, corresponding to an input charge of 2200 electrons. The preamplifier is followed by an AC-coupled discriminator, where the threshold is set globally but trimmed locally using an inter-pixel trim DAC to cancel offsets and compensate for process drifts.



Fig. 2-11. Block diagram of the PXF40 pixel readout electronics.

The feedback network of the preamplifier integrates a Krummenacher network and a programmable capacitor array, each serving distinct yet complementary purposes. The Krummenacher network is employed to implement a high-value feedback resistor, which facilitates the controlled discharge of the feedback capacitor with a time constant of 4 ns. This configuration not only ensures efficient charge reset but also compensates for the detector's leakage current, maintaining the integrity of the signal.

The programmable capacitor network further enhances the functionality of the feedback network by enabling gain switching and fine-tuning. This flexibility allows for pixel-to-pixel gain uniformity, ensuring consistent performance across the array of detector pixels. Additionally, the programmable capacitors help optimize noise performance and signal processing speed, making the readout channel adaptable to varying signal conditions and enhancing the overall efficiency of the system.

In fast operation mode, the readout channel achieves a maximum count rate of 12.24 Mcps/pixel with an ENC of 212  $e_{rms}^-$  and a power consumption of 45  $\mu$ W per pixel. While this design demonstrates a well-optimized balance between noise, speed, and power consumption, it falls short of meeting the stringent requirements of the targeted application.

One critical limitation is the ENC value, which is comparable to the signal level in the target application. This low SNR adversely impacts detection accuracy, particularly in scenarios demanding high precision. Furthermore, the 81 ns time resolution of the channel is insufficient for the fast time-response demands of the target application, where a much finer resolution is required to ensure precise event timing. These shortcomings highlight the need for further optimization to address both the noise and timing limitations while maintaining low power consumption.

#### 2.5.4 Versatile Readout ASIC with High Count Rate Capability (IBEX)

The study presented in [27] introduces the IBEX readout ASIC, designed using 110 nm CMOS technology, to enhance photon detection in the energy range of 3 keV to 160 keV (corresponding to an input charge range of  $Q_{in} = 850e^- - 45000e^-$ ). As illustrated in Fig. 2-12, each pixel of the IBEX ASIC integrates a preamplifier configured in charge-sensitive mode with an advanced digital signal processing (DSP) unit. The preamplifier features a configurable feedback network, allowing precise tuning of gain and noise performance. The gain is adjusted by modulating the gate voltage of a transistor in parallel with the feedback network capacitor. The preamplifier is AC-coupled to a pulse-shaping filter, which further amplifies the signal before it reaches two threshold discriminators designed to capture the input signal within the desired energy window. Additionally, each stage of the readout channel is powered by an independent power supply to minimize electronic crosstalk.

The IBEX pixel achieves an ENC ranging from  $89 - 150 \text{ e}_{rms}^-$ , supporting a count rate of up to 10 Mcps/pixel with a power consumption between  $8 - 55 \mu W$  per channel. This design strikes a commendable balance between high performance and energy efficiency, making it a versatile readout frontend suitable for a wide range of applications, as highlighted in [27]. Its wide input dynamic range and low noise performance are particularly advantageous, enabling reliable detection across varying operational conditions.

However, despite these advantages, the IBEX pixel's time resolution of 100 ns falls short of the stringent requirements of the target application for this thesis. Applications demanding precise event timing and high temporal resolution necessitate a significantly faster response to achieve accurate detection and registration of events. As such, while IBEX excels in noise and power efficiency, its limited time resolution makes it unsuitable for this specific application.



Fig. 2-12. Block diagram of the IBEX pixel readout electronics.

#### 2.5.5 Micro-channel Plate Readout ASIC (MIRA)

The ASIC described in [28] is designed for reading out charge signals generated by electrons produced in a microchannel plate (MCP). Fabricated in 65 nm CMOS technology, the chip enables photon detection, hit counting, and spatial resolution corrections, addressing challenges such as charge sharing to enhance performance in UV spectrometers.

Figure 2-13 illustrates the block diagram of a readout pixel in the MIRA ASIC, which integrates a charge-sensitive preamplifier, a charge summing stage, a discriminator, and two 17-bit counters to enable zero-dead-time operation. The CSA stage employs a current conveyor (ICON Cell) to achieve a large feedback time constant on the order of tens of nanoseconds. The charge summing stage incorporates a specialized clustering mechanism that combines signals from adjacent pixels, effectively mitigating charge-sharing effects and preserving spatial resolution, as detailed in [28].

Key features of MIRA include its low ENC of 20  $e_{rms}^-$ , which addresses the challenges of high-resolution, high-precision single-photon detection for low-energy photons (with charge signals ranging from  $Q_{in} = 1000e^- - 16000e^-$  at the input of the readout). This is achieved through an advanced circuit architecture as described in [28] while maintaining a power consumption of 180 µW per pixel. Although the MIRA ASIC achieves good detection accuracy due to its high SNR, it supports a maximum count rate of only 0.1 Mcps/pixel, and the resulting time resolution of 10  $\mu$ s does not meet the requirements for the application targeted in this thesis.



Fig. 2-13. Block diagram of the MIRA readout electronics.

#### 2.5.6 Open Circuit Mode Readout Pixel

The literature review revealed no existing pixel readout solutions based on the detector's open circuit mode of operation (introduced in Section 2.3.2), apart from the concept described in [13]. A preliminary exploration of such a readout solution is presented in [29], conducted as part of an MSc graduation project under the author's daily supervision.

In addition to the advantages mentioned in Section 2.3.2, this method presents several challenges, as highlighted in the preliminary study [29]. A reset mechanism is necessary after each electron detection event or periodically when no events are registered within a given timeframe. This reset restores the PIN diode's voltage to a predetermined level, as depicted in Fig. 2-14, which shows the block diagram of the open circuit mode readout electronics.



Fig. 2-14. Block diagram of the open circuit mode readout electronics.

The resetting process mitigates charge pileup and the slow discharging of the junction capacitor caused by leakage currents, but it also introduces additional complexity. To prevent missing counts, the reset duration must be extremely short. Even with a short reset time, high-frequency events could reduce detection efficiency, an issue that can be addressed by implementing a reset mechanism.

Another critical challenge lies in the elevated performance demands placed on the discriminator, which must process relatively low input signals with high accuracy. Effective detection requires discriminators characterized by low noise, minimal offset, and a stable reference threshold. If the discriminator operates too slowly or lacks the necessary precision, it risks introducing two types of errors: erroneous detections, where noise is incorrectly identified as a signal, and missed detections, where valid electron impacts go unrecognized. These errors can significantly compromise system performance, especially in high-resolution applications that demand precise and accurate detection.
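The two error types can be illustrated with a simple threshold model. The sketch below is not taken from [29]; it assumes that the noise at the discriminator input is Gaussian with an rms value equal to the ENC, that the threshold is fixed, and that offset and timing effects are absent. The function name and the signal, threshold, and ENC values are illustrative choices.

```python
import math

def error_probabilities(signal_e, enc_e, threshold_e):
    """Return (false_detection, missed_detection) probabilities for one decision.

    Assumes Gaussian noise with rms value enc_e (in electrons) added to either
    zero (no event) or signal_e (event), compared against threshold_e.
    """
    # Noise alone exceeds the threshold -> erroneous detection.
    p_false = 0.5 * math.erfc(threshold_e / (enc_e * math.sqrt(2)))
    # Signal plus noise stays below the threshold -> missed detection.
    p_miss = 0.5 * math.erfc((signal_e - threshold_e) / (enc_e * math.sqrt(2)))
    return p_false, p_miss

# Illustrative values only: 1000 e- signal with the threshold at half the signal.
for enc in (27, 212):
    p_false, p_miss = error_probabilities(1000, enc, 500)
    print(f"ENC = {enc:3d} e-: P(false) = {p_false:.2e}, P(miss) = {p_miss:.2e}")
```

Under these assumptions both error probabilities depend only on the ratio between the threshold margin and the rms noise, which is why a low-noise, low-offset discriminator is needed for small input signals.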

To optimize its performance, the discriminator must strike a careful balance between sensitivity and response time, ensuring it can rapidly and reliably distinguish genuine signals from noise. Offset compensation techniques, such as auto-zeroing, can address challenges related to offset and noise but often come at the cost of periodic deadtime. This deadtime, while compensating for certain performance limitations, may negatively impact system efficiency by reducing the detection window.

Additionally, leakage currents of the reset switches can pose further challenges by distorting the charge stored in the diode's junction capacitor. This distortion is particularly problematic when detecting small signals, as it can undermine the discriminator's ability to maintain accurate and consistent detection thresholds. Addressing these interconnected issues requires meticulous design and tuning to ensure optimal system functionality.

Simulation results from the preliminary study presented in [29] predict promising performance, though practical verification remains necessary. Chapter 4 reviews the key conceptual and design choices of [29], followed by an improved design. The experimental verification of this improved design is detailed in Chapter 5.

#### 2.5.7 Conclusions

Table 2-3 summarizes and compares the key parameters of the readout frontends analyzed in this chapter, with a focus on their effectiveness in detecting fast and low-energy charge signals. The noise performance of each frontend is quantified using the Equivalent Noise Charge (ENC) metric, which measures noise levels in units of electron charge, ensuring consistency across different architectures.

|                                      | [24]      | [25]   | [26]    | [27]       | [28]   | [29]*     |
|--------------------------------------|-----------|--------|---------|------------|--------|-----------|
| Process [nm]                         | 40        | 130    | 40      | 110        | 65     | 40        |
| Pixel Area [µm <sup>2</sup> ]        | 50×50     | 75×75  | 100×100 | 75×75      | 35×35  | 80×90     |
| Input Charge [ke <sup>-</sup> ]      | 4.1 - 8.3 | 2.4    | 2.2     | 0.85 - 45  | 1 – 16 | 0.8 - 1.2 |
| ENC [e <sup>-</sup> <sub>rms</sub> ] | 188       | 44     | 212     | 89 - 150   | 20     | 27        |
| Time Resolution [ns]                 | 34        | 1666   | 81      | 100        | 10000  | 2.5       |
| Power/Pixel [µW]                     | 26        | 42     | 45      | 8 – 55     | 180    | 190       |
| Pileup Correction                    | Yes       | Yes    | No      | Yes        | No     | Yes       |
| #Threshold Bins                      | 1 – 3     | 2      | 1       | 2          | 2      | 1         |
| FoM<br>[e <sub>rms</sub> × μs × μW]  | 166.2     | 3078.7 | 772.7   | 71.2 - 825 | 36000  | 12.8      |

Table 2-3. Performance summary of the state-of-the-art readout frontends.

\*Simulation results

To facilitate a fair and comprehensive comparison of the readout frontends, a figure of merit (FoM) is introduced. The FoM is defined as the product of the ENC, time resolution, and power consumption, providing an aggregate performance measure. A lower FoM value indicates superior performance, representing an optimal balance of noise, timing precision, and energy efficiency. High time resolution and accuracy are critical for readout frontends that handle fast and low-energy charge signals. However, achieving these objectives often conflicts with maintaining low power consumption, as noted in references [8], [20]. While energy-efficient frontends excel at reducing power consumption, their detection accuracy typically degrades, particularly under high input flux conditions, as discussed in [30]. This highlights the need for new readout frontends that strike a balance between high time resolution and accuracy for low-energy particle detection, while also maintaining acceptable power consumption.
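The FoM entries of Table 2-3 can be reproduced directly from the definition. The sketch below transcribes the ENC, time resolution, and power values from the table; for the ranged entries of [27] it assumes that the lowest ENC pairs with the lowest power and the highest ENC with the highest power, which is the pairing that matches the tabulated FoM range. The dictionary and function names are illustrative.

```python
# Values transcribed from Table 2-3: (ENC [e- rms], time resolution [ns], power [uW]).
# For [27], the two corners of the quoted ranges are listed separately (assumed pairing).
frontends = {
    "[24]": (188, 34, 26),
    "[25]": (44, 1666, 42),
    "[26]": (212, 81, 45),
    "[27] best": (89, 100, 8),
    "[27] worst": (150, 100, 55),
    "[28]": (20, 10000, 180),
    "[29]*": (27, 2.5, 190),
}

def figure_of_merit(enc_e, t_res_ns, power_uw):
    """FoM = ENC x time resolution x power, in e- rms x us x uW (lower is better)."""
    return enc_e * (t_res_ns / 1000.0) * power_uw

fom = {name: figure_of_merit(*values) for name, values in frontends.items()}

# Print the frontends ranked from best (lowest FoM) to worst.
for name, value in sorted(fom.items(), key=lambda item: item[1]):
    print(f"{name:11s} FoM = {value:9.1f}")
```

Because the FoM is a plain product, a frontend with a very long time resolution is penalized heavily even when its ENC is low, which is visible in the ranking.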

This thesis addresses this performance gap by developing a readout circuit specifically designed to detect charge signals from a single pixel implemented as a PIN-diode with junction capacitance in the range of  $C_D = 30$  fF – 50 fF. The objective is to enhance the detection capabilities for signals generated by external electrons impacting the detector at random intervals, achieving both high sensitivity and efficient power usage.

Regarding the proposed solutions for the short circuit mode of operation, while their power consumption aligns with design specifications, they fall short of meeting the stringent time resolution requirements of the targeted application in this thesis. These solutions predominantly use a preamplifier operating in charge-sensitive mode, with a Krummenacher network in the feedback path. The Krummenacher network offers a large resistive path between the input and output nodes of the preamplifier and provides an inductive path to compensate for detector leakage current. While this configuration effectively handles leakage current, its inherent trade-off between noise and shaping time, combined with its sensitivity to noise variations caused by input charge fluctuations, limits its suitability for applications that require low detector leakage currents and high time resolution [31].

An alternative approach is the use of the ICON Cell to implement the CSA. The ICON Cell offers enhanced stability and uniformity across varying input charge levels, making it an excellent choice for applications requiring high precision and consistency. However, its limited ability to compensate for detector leakage current makes it less suitable in environments with significant leakage.

Although the time resolution of the ASIC incorporating a CSA with the ICON Cell [28] significantly exceeds the targeted 2.5 ns, there is considerable potential for further improvement. Advanced design techniques and the optimization of circuit configurations offer promising avenues to enhance the ASIC's response time. Strategies such as minimizing parasitic effects, refining the feedback network, and utilizing innovative amplifier architectures can help achieve the desired time resolution while simultaneously maintaining low noise levels and power consumption. These optimizations are crucial for meeting the stringent performance requirements of the targeted application.

The promising simulation-based results presented in [29] for the open circuit operation mode provide a solid foundation for further investigation and experimental characterization of the proposed readout frontend. However, to fully realize its potential, several additional components need to be integrated into the readout channel. These improvements are necessary not only to enhance performance and detection accuracy but also to facilitate experimental validation and qualification of the system's functionality.

Chapters 3 and 4 present two novel readout frontend architectures, designed specifically for the short circuit and open circuit operation modes of the PIN-diode, respectively. These chapters explore the underlying design concepts, outline implementation strategies, and highlight the performance benefits of the proposed solutions, demonstrating their potential to address the challenges of precision, speed, and noise.

# 2.6 References

- [1] Y. Wang, Z. Dong, R.-L. Lai, and K. Kanai, "Semiconductor charged particle detector for microscopy," WO2019233991A1, Dec. 12, 2019.
- [2] J. I. Goldstein, D. E. Newbury, J. R. Michael, N. W. M. Ritchie, J. H. J. Scott, and D. C. Joy, Scanning Electron Microscopy and X-Ray Microanalysis. New York, NY: Springer New York, 2018. doi: 10.1007/978-1-4939-6676-9.
- [3] A. J. Schwartz, M. Kumar, B. L. Adams, and D. P. Field, Eds., Electron Backscatter Diffraction in Materials Science. Boston, MA: Springer US, 2009. doi: 10.1007/978-0-387-88136-2.
- [4] N. Erdman, D. C. Bell, and R. Reichelt, "Scanning Electron Microscopy," in Springer Handbook of Microscopy, P. W. Hawkes and J. C. H. Spence, Eds., Cham: Springer International Publishing, 2019, pp. 229–318. doi: 10.1007/978-3-030-00069-1_5.
- [5] L. Frank, M. Hovorka, I. Konvalina, Š. Mikmeková, and I. Müllerová, "Very low energy scanning electron microscopy," Nuclear Instruments and Methods in Physics Research Section A: Accelerators, Spectrometers, Detectors and Associated Equipment, vol. 645, no. 1, pp. 46–54, Jul. 2011, doi: 10.1016/j.nima.2010.12.214.
- [6] R. Ballabriga et al., "Photon Counting Detectors for X-Ray Imaging with Emphasis on CT," IEEE Transactions on Radiation and Plasma Medical Sciences, vol. 5, no. 4, pp. 422–440, Jul. 2021, doi: 10.1109/TRPMS.2020.3002949.
- [7] T. Fritzsch et al., "Flip chip assembly of thinned chips for hybrid pixel detector applications," J. Inst., vol. 9, no. 05, p. C05039, May 2014, doi: 10.1088/1748-0221/9/05/C05039.
- [8] M. Nakhostin, Signal processing for radiation detectors. Hoboken: John Wiley, 2018.
- [9] D. B. Murphy, M. W. Davidson. Fundamentals of Light Microscopy and Electronic Imaging, 2nd Edition. Wiley, 2012.
- [10] P. Seitz and A. J. Theuwissen, Eds., Single-Photon Imaging, vol. 160. in Springer Series in Optical Sciences, vol. 160. Berlin, Heidelberg: Springer Berlin Heidelberg, 2011. doi: 10.1007/978-3-642-18443-7.
- [11] C. Hansson, K. Iniewski, X-Ray Photon Processing Detectors. Springer, 2024.
- [12] H. Spieler, Semiconductor Detector Systems. Oxford University Press, 2005. doi: 10.1093/acprof:oso/9780198527848.001.0001.
- [13] Y. Wang, Z. Dong, R.-L. Lai, and K. Kanai, "Semiconductor charged particle detector for microscopy," WO2019233991A1, Dec. 12, 2019.
- [14] R. Ballabriga et al., "Review of hybrid pixel detector readout ASICs for spectroscopic X-ray imaging," J. Inst., vol. 11, no. 01, p. P01007, Jan. 2016, doi: 10.1088/1748-0221/11/01/P01007.
- [15] E. Säckinger, Analysis and Design of Transimpedance Amplifiers for Optical Receivers, 1st ed. Wiley, 2017. doi: 10.1002/9781119264422.
- [16] M. A. Disi, "Single Electron Readout Circuit: SERCuit," M.S. thesis, Dept. Microelectronics, TU Delft, Delft, the Netherlands, 2020.
- [17] M. Al Disi, A. Mohammad Zaki, Q. Fan, and S. Nihtianov, "High-Count Rate, Low Power and Low Noise Single Electron Readout ASIC in 65nm CMOS Technology," in 2021 XXX International Scientific Conference Electronics (ET), Sozopol, Bulgaria: IEEE, Sep. 2021, pp. 1–5. doi: 10.1109/ET52713.2021.9580005.

- [18] A. Mohammad Zaki and S. Nihtianov, "Characterization Challenges of a Low Noise Charge Detection ROIC," IEEE Trans. Instrum. Meas., vol. 71, pp. 1–8, 2022, doi: 10.1109/TIM.2022.3160529.
- [19] C. W. Fabjan and H. Schopper, Eds., Particle Physics Reference Library: Volume 2: Detectors for Particles and Radiation. Cham: Springer International Publishing, 2020. doi: 10.1007/978-3-030-35318-6.
- [20] A. Rivetti, CMOS: Front-End Electronics for Radiation Sensors. Boca Raton: CRC Press, 2018. doi: 10.1201/b18599.
- [21] Sakic, Agata, and Lis K. Nanver. "Silicon Technology for Integrating High Performance Low-Energy Electron Photodiode Detectors," Dissertation, 2012.
- [22] A. A.-G. Helmy and M. Ismail, Substrate noise coupling in RFICs. in Analog circuits and signal processing series. New York: Springer, 2008.
- [23] A. Afzali-Kusha, M. Nagata, N. K. Verghese and D. J. Allstot, "Substrate Noise Coupling in SoC Design: Modeling, Avoidance, and Validation," in Proceedings of the IEEE, vol. 94, no. 12, pp. 2109-2138, Dec. 2006, doi: 10.1109/JPROC.2006.886029.
- [24] P. Grybos et al., "SPHIRD–Single Photon Counting Pixel Readout ASIC With Pulse Pile-Up Compensation Methods," IEEE Transactions on Circuits and Systems II: Express Briefs, vol. 70, no. 9, pp. 3248–3252, Sep. 2023, doi: 10.1109/TCSII.2023.3267859.
- [25] R. Kleczek, P. Kmon, P. Maj, R. Szczygiel, M. Zoladz, and P. Grybos, "Single Photon Counting Readout IC With 44 e- rms ENC and 5.5 e- rms Offset Spread With Charge Sensitive Amplifier Active Feedback Discharge," IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 70, no. 5, pp. 1882–1892, May 2023, doi: 10.1109/TCSI.2023.3241738.
- [26] R. Kleczek, P. Grybos, R. Szczygiel and P. Maj, "Single Photon-Counting Pixel Readout Chip Operating Up to 1.2 Gcps/mm2 for Digital X-Ray Imaging Systems," in IEEE Journal of Solid-State Circuits, vol. 53, no. 9, pp. 2651-2662, Sept. 2018, doi: 10.1109/JSSC.2018.2851234.
- [27] M. Bochenek et al., "IBEX: Versatile Readout ASIC With Spectral Imaging Capability and High Count Rate Capability," IEEE Transactions on Nuclear Science, vol. 65, no. 6, pp. 1285–1291, Jun. 2018, doi: 10.1109/TNS.2018.2832464.
- [28] E. Fabbrica et al., "MIRA: A Low-Noise ASIC With 35-μm Pixel Pitch for the Readout of Microchannel Plates," IEEE Transactions on Nuclear Science, vol. 71, no. 6, pp. 1339–1347, Jun. 2024, doi: 10.1109/TNS.2024.3401221.
- [29] L. Bouman, "High-Speed Readout Circuit for PIN Single Electron Detector in Voltage Mode," M.S. thesis, Dept. Microelectronics, TU Delft, Delft, the Netherlands, 2023.
- [30] R. Ballabriga et al., "Review of hybrid pixel detector readout ASICs for spectroscopic X-ray imaging," Journal of Instrumentation, vol. 11, Jan. 2016, doi: 10.1088/1748-0221/11/01/P01007.
- [31] E. Fabbrica et al., "Design of MIRA, a low-noise pixelated ASIC for the readout of micro-channel plates," Journal of Instrumentation, vol. 17, Jan. 2022, doi: 10.1088/1748-0221/17/01/C01047.

# **3 Readout Solutions for Short Circuit Operation Mode**

# 3.1 Introduction

This chapter presents circuit design solutions for the short circuit operation mode of PIN diodes in high-precision detection systems. It explores the preamplifier's role in establishing a virtual ground for efficient charge collection, highlighting key design considerations such as input impedance, noise performance, bandwidth, and feedback mechanisms. Solutions for mitigating ISI-induced errors through additional threshold levels or signal shaping filters are also studied. Furthermore, threshold discriminators are investigated for accurate signal digitization. The impact of these design choices on signal integrity, noise performance, and detection accuracy is analyzed, providing a comprehensive evaluation of optimized circuit architectures for scanning electron microscopy applications.

# 3.2 Analog Readout Frontend for Short Circuit Operation Mode

As outlined in Section 2.1, the analog frontend converts the PIN-diode's charge signal into a voltage and generates a logical '1' within the required time frame to register a BSE event. Figure 3-1 illustrates its key components:

- **Preamplifier:** Converts weak charge signals into voltage while ensuring high bandwidth and low noise for precise detection.
- **Gain/Filter Stage:** Amplifies and shapes the signal to enhance SNR and improve clarity.
- **Threshold Discriminator:** Compares the processed signal to a threshold, digitizing valid events by outputting a logical '1' when the signal exceeds the threshold.





The readout channel must minimize detection errors from noise and ISI, with the preamplifier playing a key role. It requires wide bandwidth for high time resolution, low noise for accuracy, and low power to meet system constraints. While higher power enhances noise performance through increased bias currents or advanced topologies, it raises power dissipation, potentially exceeding system limits [1], [2].

Narrowing the bandwidth reduces noise but increases ISI, as shown in Figure 2-4. Noise, a statistical error, can be minimized by bandwidth limitation, whereas ISI, a deterministic error from signal overlap, may degrade performance if uncorrected [3]. To balance this trade-off, additional circuitry can be used to mitigate ISI, which permits a narrower bandwidth and thus better noise and power efficiency [4], [5]. Additionally, periodic resetting of the preamplifier's feedback capacitor ( $C_F$ ) prevents output saturation.





power, bandwidth, and SNR. ISI compensation methods include:

- **Auxiliary threshold levels:** Multiple thresholds improve signal discrimination and reduce ISI.
- **Signal shaping filters:** These filters optimize timing to minimize ISI and enhance detection accuracy.

Figure 3-2 presents the proposed readout channel, integrating ISI compensation to sustain detection performance while optimizing noise, bandwidth, and power. The following sections detail the design of each stage, including ISI mitigation strategies.

# 3.3 Preamplifier Stage

The preamplifier in short-circuit mode must establish a stable virtual ground to ensure efficient charge collection while minimizing signal degradation. Achieving ultralow input impedance is paramount to suppress voltage drops that could compromise signal integrity. Simultaneously, the design must maintain low noise and sufficient bandwidth to enable accurate charge signal conversion. These requirements are dictated by the detector capacitance ( $C_D$ ), count rate, and power efficiency constraints, necessitating a careful balance between performance metrics. A high loop gain is essential to sustaining the virtual ground, minimizing impedance variations, and enhancing charge transfer efficiency. Through precise optimization of these parameters, the preamplifier can achieve high precision and low noise operation, ensuring robust performance in demanding detection environments [1]–[3].

Selecting the optimal operational mode for the preamplifier involves evaluating the trade-offs between the Charge Sensitive Amplifier (CSA) and the Transimpedance Amplifier (TIA) [3]. The CSA excels in high-sensitivity applications due to its superior charge integration capability, low noise characteristics, and power efficiency, making it particularly well-suited for detecting low-intensity signals [1]. In contrast, the TIA offers advantages in high-event-rate scenarios but suffers from higher noise and increased power consumption, making it less suitable for this design [2], [3].

For this implementation, the CSA architecture was chosen to maximize charge conversion accuracy while adhering to strict noise and power constraints. By integrating charge over time, the CSA enhances sensitivity to weak input signals, a crucial requirement for high-resolution detection. However, its inherently narrow bandwidth introduces challenges in optimizing the trade-off between sensitivity and signal processing speed. To address this, careful tuning of the feedback time constant ( $\tau_F$ ) is necessary to balance charge integration efficiency with bandwidth limitations.

A key innovation in this work lies in refining the CSA design to meet the stringent demands of SEM applications. The primary focus is on optimizing bandwidth while preserving low noise performance and minimizing power dissipation. Additionally, enhancing stability under high-count-rate conditions is crucial for ensuring consistent signal processing. By systematically addressing these design challenges, this research advances the development of high-precision, low-power readout circuits, contributing to the next generation of high-resolution detection systems.

#### 3.3.1 Operation Principle

A typical preamplifier for PIN-diode signal processing, shown in Fig. 3-3, operates in either charge-sensitive mode ( $\tau_F \gg t_p$ ) or transimpedance mode ( $\tau_F \approx t_p$ ), depending on the feedback time constant ( $\tau_F = R_F \cdot C_F$ ), where  $t_p$  is the peaking time at the preamplifier output.



Fig. 3-3. Schematic of a typical preamplifier tailored to the PIN-diode.

In CSA mode, the feedback time constant  $(\tau_F)$  determines the discharge tail  $(t_{tail})$  of the signal and must satisfy  $\tau_F \gg t_{Collection}$  to fully integrate the detector's charge signal in the feedback capacitor  $(C_F)$ . The CSA's transfer function, combining the forward amplifier and feedback network, is given by:

$$T_{CSA}(s) = \frac{R_F}{(1+s\tau_F)(1+s\tau_{CL})}$$
(3-1)

where  $\tau_F$  defines the discharge tail and  $\tau_{CL} = \frac{1}{GBWP} \cdot \frac{C_F + C_D + C_G}{C_F}$  determines the rise time, linked to the loop bandwidth [6]. Here,  $C_G$  is the gate capacitance of the preamplifier input transistor. For an impulse input  $Q_{in}$ , and neglecting the finite rise time set by  $\tau_{CL}$ , the CSA output voltage is:

$$V_{CSA}(t) = Q_{in} \cdot \frac{1}{C_F} \left( e^{-\frac{t}{\tau_F}} \right)$$
(3-2)

For an input charge signal of  $Q_{in} = 160 \text{ aC}$ , an output voltage signal in a range of 15 - 30 mV can be achieved through a feedback capacitance in a range of  $C_F = 5 - 10 \text{ fF}$ . The time constant of the second pole ( $\tau_{CL}$ ) is set according to the desired time resolution of the voltage signal. For the target time resolution of 2.5 ns, the CSA voltage signal must have a peaking time of  $t_p \leq 2.5 \text{ ns}$ , corresponding to a second pole beyond 400 MHz.

The first pole's time constant ( $\tau_F$ ) dictates the discharge tail ( $t_{tail}$ ) of the signal. As the amplitude decays while noise remains constant, maintaining signal stability is crucial to prevent SNR loss. To ensure accurate detection, the CSA voltage must stay steady within each time frame. For a decay below 1% over 2.5 ns,  $t_{tail}$  is set to 250 ns, preventing significant voltage drop during processing. With a femtofarad-range feedback capacitance ( $C_F$ ), achieving this requires a megaohm-range feedback resistance ( $R_F$ ). This design balances high SNR with timing and power constraints, optimizing signal processing in the readout channel.
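The sizing figures quoted above follow from Eq. (3-2) and  $\tau_F = R_F \cdot C_F$ . The sketch below recomputes them as an arithmetic check. It uses only the values stated in the text and treats the amplifier as ideal, so ballistic deficit and finite loop gain, which would lower the amplitude, are neglected. The variable names are illustrative.

```python
import math

Q_IN = 160e-18     # input charge [C], from the text
T_FRAME = 2.5e-9   # target time resolution [s], from the text
TAU_F = 250e-9     # feedback (discharge) time constant [s], from the text

for c_f in (5e-15, 10e-15):   # feedback capacitance range from the text [F]
    v_peak = Q_IN / c_f       # ideal amplitude, Eq. (3-2) evaluated at t = 0
    r_f = TAU_F / c_f         # feedback resistance from tau_F = R_F * C_F
    print(f"C_F = {c_f * 1e15:4.1f} fF: "
          f"V_peak = {v_peak * 1e3:5.1f} mV, R_F = {r_f / 1e6:5.1f} MOhm")

# Fractional droop of the exponential tail over one 2.5 ns time frame.
droop = 1.0 - math.exp(-T_FRAME / TAU_F)
print(f"Droop over {T_FRAME * 1e9:.1f} ns: {droop * 100:.2f} %")
```

The ideal amplitudes come out slightly above the quoted 15 - 30 mV range, which is consistent with the text once non-ideal charge transfer is taken into account, and the resulting feedback resistance is in the tens of megaohms.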

Noise performance is critical for maintaining a high SNR. The main noise sources include voltage noise  $(v_n)$  and current noise  $(i_n)$ , described by:

$$S_{V_n} = \frac{4KT\gamma_n}{g_m} + \frac{K_f}{C_{ox}WL} \cdot \frac{1}{f} = \alpha + \frac{A_f}{f}$$
(3-3)

$$S_{i_n} = \frac{4KT}{R_F} + 2qI_{leakage} = \beta$$
(3-4)

where K is the Boltzmann constant, T is the temperature in Kelvin, q is the electron charge,  $R_F$  is the feedback resistance,  $I_{leakage}$  is the detector leakage current,  $K_f$  is the flicker noise coefficient,  $C_{ox}$  is the gate oxide capacitance per area,  $\gamma_n$  is a noise parameter ranging from 1/2 (weak inversion) to 2/3 (strong inversion), and  $g_m$ , W and L are the transconductance and the dimensions of the preamplifier input transistor, respectively. The total noise power spectral density at the preamplifier output can be expressed as:

$$S_{V_{Preamp}} = \frac{\beta}{(2 \pi f C_F)^2} + \left(\alpha + \frac{A_f}{f}\right) \left(\frac{C_D + C_G + C_F}{C_F}\right)^2$$
(3-5)

The total noise at the output, incorporating the transfer function  $T_{CSA}(s)$ , determines the Equivalent Noise Charge (ENC) as expressed in [1]:

$$ENC^{2} = \frac{1}{C_{F}^{2}} \left( \beta \cdot t_{s} \cdot a_{p} + (C_{D} + C_{G} + C_{F}) \cdot \left( \frac{\alpha}{t_{s}} \cdot a_{w} + A_{f} \cdot a_{f} \right) \right)$$
(3-6)

where  $t_s$  is the preamplifier shaping time, indicating the signal width after the preamplifier, while  $a_p$ ,  $a_w$ , and  $a_f$  are coefficients that weight the current-noise, white-noise, and flicker-noise contributions, respectively. Figure 3-4 shows the interplay of these three noise sources in the ENC as a function of the shaping time ( $t_s$ ).



Fig. 3-4. ENC as a function of the shaping time  $(t_s)$ .

Optimizing the shaping time  $(t_s)$  is key to minimizing the ENC. A shorter  $t_s$  increases bandwidth but leads to ballistic deficit and a larger white-noise contribution, while a longer  $t_s$  filters the white voltage noise but increases the current-noise contribution to the ENC [1]. The optimum shaping time  $(t_{sopt})$  is the value at which the ENC reaches its minimum. In CSA mode, the shaping time  $(t_s)$  is proportional to the time constant of the feedback network  $(\tau_F)$ , allowing noise filtering via larger  $\tau_F$  or higher  $R_F$  [1]. Enhancing the transconductance  $(g_m)$  improves signal gain and reduces voltage noise but must be balanced against power and bandwidth constraints.
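The existence of an optimum follows from the shaping-time dependence in Eq. (3-6): one term grows with  $t_s$ , one falls with  $t_s$ , and one is independent of it. The sketch below lumps the prefactors of Eq. (3-6) into three hypothetical coefficients (`parallel`, `series`, `flicker`) in arbitrary units; these names and the numerical values are assumptions made only to illustrate the shape of the curve, not design values from this work.

```python
import math

def enc_squared(t_s, parallel, series, flicker):
    """ENC^2 model with the shaping-time dependence of Eq. (3-6).

    parallel * t_s : current-noise term, grows with t_s
    series / t_s   : white voltage-noise term, shrinks with t_s
    flicker        : 1/f-noise term, independent of t_s
    """
    return parallel * t_s + series / t_s + flicker

def optimum_shaping_time(parallel, series):
    """Shaping time where d(ENC^2)/dt_s = 0, i.e. the two t_s-dependent terms are equal."""
    return math.sqrt(series / parallel)

# Hypothetical lumped coefficients in arbitrary units, chosen only for illustration.
P, S, F = 4.0, 1.0, 0.5
t_opt = optimum_shaping_time(P, S)

# Brute-force check on a grid spanning 0.2x to about 3.2x the analytical optimum.
grid = [t_opt * (0.2 + 0.01 * k) for k in range(300)]
t_best = min(grid, key=lambda t: enc_squared(t, P, S, F))
print(f"analytical t_opt = {t_opt:.3f}, grid minimum at t = {t_best:.3f}")
```

In this model the flicker term shifts the whole curve upward without moving the optimum, so it sets a floor on the achievable ENC.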

#### 3.3.2 Core Amplifier

The CSA's core amplifier dictates the readout channel's performance, requiring optimization of gain, noise, bandwidth, and power consumption. Different architectures offer distinct trade-offs: the Cascode amplifier delivers high gain and output impedance for strong signal amplification, the folded Cascode achieves wide bandwidth with low power consumption for energy-efficient designs [7], [8], and the telescopic amplifier provides speed and efficiency but is constrained by limited voltage headroom [8]. Selecting the appropriate architecture is critical for meeting system specifications, as detailed in the following sections.

#### 3.3.2.1 DC Gain

The DC gain of the core amplifier  $(G_{Amp_{DC}})$  is critical for efficient charge collection and accurate signal amplification in a CSA. A high DC gain ensures that the feedback capacitor  $(C_F)$  stores most of the input charge rather than the detector capacitance  $(C_D)$ , improving charge-to-voltage conversion accuracy, which is essential for detecting small signals in precision applications.

According to Miller's theorem, a sufficiently high gain amplifies the effective feedback capacitance ( $C_{eff} = C_F \times (1 + G_{Amp_{DC}})$ ), ensuring that  $C_F$  dominates over  $C_D$ and enhancing charge-handling capacity (Fig. 3-5). This leads to more accurate voltage outputs by minimizing charge retention in  $C_D$ . The required condition is  $G_{Amp_{DC}} \gg (C_D/C_F)$ , ensuring efficient charge storage in  $C_F$ .

In practice,  $G_{Amp}_{DC}$  ranges from 100 to 1000, depending on  $C_D$  and  $C_F$ . While higher gain improves charge collection and sensitivity, it also increases noise and potential instability. Thus, selecting an optimal gain involves balancing accuracy, noise, stability, and power efficiency to maximize CSA performance.



Fig. 3-5. Charge division between detector and feedback capacitors due to Miller effect.

In this work, with $C_F = 5$ fF, $C_D = 50$ fF, and a DC gain of 100 (40 dB), the Miller effect increases the effective feedback capacitance to $C_{eff} = 505$ fF. This means about 91% of the input charge is stored in the feedback capacitor, and only 9% is lost to the detector capacitance. A higher DC gain would further improve charge-to-voltage conversion efficiency, ensuring most of the charge is stored in the feedback network, thus enhancing signal detection accuracy for high-precision applications.
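The charge division can be reproduced with a short calculation. The sketch below treats $C_{eff}$ and $C_D$ as a simple capacitive divider and uses $C_F = 5$ fF and $C_D = 50$ fF; the gain values other than 100 are included only for comparison.

```python
def charge_fraction_in_feedback(c_f, c_d, gain):
    """Return (C_eff, fraction of the input charge stored on C_F)."""
    c_eff = c_f * (1.0 + gain)            # Miller-multiplied feedback capacitance
    return c_eff, c_eff / (c_eff + c_d)   # capacitive divider between C_eff and C_D

C_F = 5e-15    # feedback capacitance [F]
C_D = 50e-15   # detector capacitance [F]

for gain in (100, 300, 1000):
    c_eff, fraction = charge_fraction_in_feedback(C_F, C_D, gain)
    print(f"G = {gain:4d}: C_eff = {c_eff * 1e15:6.0f} fF, "
          f"charge on C_F = {fraction * 100:.1f} %")
```

For a gain of 100 this gives $C_{eff} = 505$ fF and a stored fraction of about 91%, matching the figures quoted above.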

A high DC gain is also crucial for maintaining the virtual ground at the CSA input. By ensuring a high loop gain, the inverting terminal remains stable, minimizing parasitic signal interference and drift-induced distortions. The attenuation of unwanted signals follows $A_{eff} = 1/(1 + G_{Amp_{DC}})$. For instance, with $G_{Amp_{DC}} = 100$, $A_{eff} = 1/101 \approx 0.0099$, meaning that only about 1% of a parasitic signal appears at the virtual ground. This enhances charge transfer efficiency from the detector to the feedback capacitor, ensuring precise charge measurement.

#### 3.3.2.2 Bandwidth

To achieve the desired time resolution in the CSA, the output signal must have a rise time of less than 2.5 ns, ensuring that the signal quickly reaches its peak value for effective signal processing. The rise time is primarily determined by the loop bandwidth of the CSA, which needs to exceed 400 MHz to meet this requirement. The loop bandwidth is defined by the point where the frequency response of the core amplifier intersects with the transfer function of the feedback network, which occurs at the frequency $f_{CL}$. Maintaining this high loop bandwidth is critical for achieving the necessary rise time and overall performance of the system.

With a DC gain of 100 in the core amplifier, a detector capacitance  $C_D = 50$  fF, and a feedback capacitance  $C_F = 5$  fF, the first pole of the amplifier can be estimated to be around  $f_{p_1} = 40$  MHz. This corresponds to a gain-bandwidth product (GBWP) of more than 4.4 GHz. This high GBWP is essential for maintaining both the gain and the speed of the amplifier while ensuring the required time resolution and signal fidelity in high-precision applications. A GBWP of this magnitude allows the system to handle fast transient signals, making it possible to accurately detect and process signals with a rise time as short as 2.5 ns.
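One way to arrive at figures of this size is sketched below. It assumes a single-pole amplifier and a capacitive feedback factor $\beta = C_F/(C_F + C_D)$, so that the required GBWP is the loop bandwidth divided by $\beta$. This is an illustrative reading and not necessarily the calculation used in the thesis.

```python
def required_gbwp(f_cl, c_f, c_d):
    """GBWP needed for a loop bandwidth f_cl, assuming beta = C_F / (C_F + C_D)."""
    beta = c_f / (c_f + c_d)
    return f_cl / beta

F_CL = 400e6     # required loop bandwidth [Hz]
C_F = 5e-15      # feedback capacitance [F]
C_D = 50e-15     # detector capacitance [F]
GAIN_DC = 100    # DC gain of the core amplifier

gbwp = required_gbwp(F_CL, C_F, C_D)
f_p1 = gbwp / GAIN_DC   # first pole of a single-pole amplifier
print(f"GBWP = {gbwp / 1e9:.1f} GHz, first pole = {f_p1 / 1e6:.0f} MHz")
```

Under these assumptions the GBWP evaluates to 4.4 GHz and the first pole to 44 MHz, in line with the estimates given above.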

#### 3.3.2.3 Power Consumption

The CSA must stay within a power budget of 200 $\mu$W without compromising performance. Achieving this requires careful selection of power-efficient design techniques, focusing on minimizing current draw while preserving essential specifications like gain, bandwidth, and rise time.

One approach is to employ low-power circuit architectures, such as using minimum-sized transistors in non-critical signal paths. This reduces parasitic capacitance and leakage currents, conserving power. Additionally, adopting current-efficient topologies like folded Cascode designs can provide high gain and wide bandwidth while limiting current consumption. These topologies are particularly effective for applications like charge-sensitive amplifiers, where high performance must be maintained within stringent power limits.

Careful biasing is also critical. Properly setting the biasing points of the transistors ensures that each stage operates in its most efficient region, avoiding excessive power dissipation. Maintaining adequate voltage headroom ensures the amplifier remains within its linear operating range while using the minimum necessary supply voltage.

By optimizing these design aspects, the CSA can achieve the necessary performance (a DC gain of 100, a loop bandwidth exceeding 400 MHz, and a rise time under 2.5 ns) while staying within the 200 $\mu$W power budget. This balance is essential for low-noise signal detection in power-constrained high-resolution applications.

#### 3.3.3 Feedback Resistor

In this research, the primary challenge in designing the charge-sensitive preamplifier lies in realizing a feedback resistance ($R_F$) large enough to produce a discharge tail ($t_{tail}$) exceeding 250 ns, which provides the feedback time constant $\tau_F$ required for proper signal processing. With a feedback capacitance of $C_F = 5$ fF, this calls for a feedback resistance of at least 10 M$\Omega$. However, implementing such large resistances introduces several issues that must be addressed to maintain signal integrity and system performance.
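The relation between the tail length and the required resistance can be sketched as follows. The sketch assumes that the discharge tail corresponds to roughly five feedback time constants; the factor `n_tau = 5` is an assumption chosen because it reproduces the 10 M$\Omega$ figure, not a definition taken from the thesis.

```python
def required_feedback_resistance(t_tail, c_f, n_tau=5.0):
    """R_F giving a discharge tail of n_tau time constants (n_tau is an assumption)."""
    return t_tail / (n_tau * c_f)

T_TAIL = 250e-9   # required discharge tail [s]
C_F = 5e-15       # feedback capacitance [F]

r_f = required_feedback_resistance(T_TAIL, C_F)
print(f"R_F >= {r_f / 1e6:.1f} MOhm, tau_F = {r_f * C_F * 1e9:.0f} ns")
```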

One common solution is the use of polysilicon resistors, which can provide the necessary resistance values. However, polysilicon resistors require a significant chip area, leading to increased parasitic capacitance, which can result in higher crosstalk and noise levels [7], [9], [10]. This directly affects the system's sensitivity and degrades detection accuracy, making this approach less ideal for high-precision applications. The large area requirement also adds to the complexity and cost of the design.

An alternative solution, shown in Fig. 3-6, replaces the feedback resistor with a MOSFET biased in the ohmic region, providing the required resistance [11], [12]. The MOSFET's equivalent resistance can be modulated by periodically driving its gate with a voltage signal. This method faces two key challenges: charge injection from the MOSFET's gate-drain capacitance ( $C_{gd}$ ) can cause erroneous detection, and the variability of the MOSFET's resistance during each gate cycle introduces current noise fluctuations ( $S_{in}$ ), degrading the SNR and detection accuracy [1], [3]. While this technique enables dynamic control of the feedback resistance, these challenges hinder its suitability for precise detection applications.

These challenges underscore the necessity for more innovative and advanced solutions to overcome the issues associated with implementing large resistances in parallel with the feedback capacitance. Two promising solutions, the Krummenacher Network and the ICON Cell, provide a configuration where the feedback capacitor is paired with a large equivalent feedback resistor in parallel. While these circuit architectures offer an improvement over traditional resistor-based designs, each presents its own set of advantages and potential drawbacks, which must be carefully considered in the context of the specific application requirements.



Fig. 3-6. Implementation of a large feedback resistance by a MOSFET device.

#### 3.3.3.1 Krummenacher Network

The Krummenacher network (Fig. 3-7) is commonly employed in photon counting frontend electronics due to its ability to manage both feedback resistance and detector leakage current [10], [13]. This network provides two feedback paths: a resistive path that implements an equivalent feedback resistance of $R_F \approx 2/g_{m1}$ (assuming $g_{m1} = g_{m2}$) and discharges the feedback capacitance $C_F$ with a small constant current $I_{Krum}$, and an inductive path through $M_3$ and $C_{Krum}$ that compensates the detector leakage current so that it does not flow into the feedback resistance $R_F$.

However, in applications where the detector leakage current is negligible, the Krummenacher network may not be ideal. Its current noise contribution  $(S_{i_n})$  is directly proportional to the bias current  $I_{Krum}$ , meaning that reducing  $S_{i_n}$  will increase the shaping time since the shaping time is proportional to  $Q_{in}/I_{Krum}$ . Any variations in the input charge  $Q_{in}$  will consequently lead to changes in the shaping time, affecting the stability and performance of the system [6], [14].

In summary, while the Krummenacher network offers effective feedback resistance and leakage current compensation, its noise-shaping time tradeoff and sensitivity to variations in input charge can limit its usefulness in applications with low detector leakage current.



Fig. 3-7. Schematic of a CSA implemented by the Krummenacher network in the feedback.

#### 3.3.3.2 ICON Cell

Another method for implementing a large feedback time constant in a CSA is based on the concept of using a current mirror to reduce the current flowing through the feedback resistor ( $R_F$ ), effectively making it behave like a resistor with a much higher value than its nominal resistance, as described in [9], [14]. This technique can be realized using a current conveyor, also known as an ICON Cell, in the feedback network, as shown in Fig. 3-8.

The CSA in this method comprises an amplifying stage designated as "Amp", followed by a source follower transistor stage $M_b$ with negative feedback provided by capacitor $C_F$. Additional negative feedback is introduced through the ICON Cell. The ICON Cell generates an equivalent large resistance, represented as $R_F = (K + 1) \times R_1$, along with a corresponding feedback time constant given by $\tau_F = (K + 1) \times R_1 \times C_F$. In this formulation, K denotes the current mirroring factor, while $R_1$ refers to the physical resistor employed in the configuration. This setup effectively provides the desired large resistance and time constant, all while utilizing a smaller physical resistor $R_1$, thereby optimizing area.



Fig. 3-8. Schematic of a CSA implemented by an ICON Cell in the feedback network.

A significant advantage of this architecture lies in its influence on the current noise contribution ( $S_{i_n}$ ). This contribution is directly proportional to the bias current  $I_{out}$  in the right branch of the current mirror within the ICON Cell (shown in Fig. 3-9). Consequently, the feedback time constant ( $\tau_F$ ) remains unaffected by the current noise contribution ( $S_{i_n}$ ). This characteristic of independence from noise renders the ICON Cell an attractive option for implementing large resistances in the feedback network of a CSA, especially relevant to the application discussed in this thesis. The subsequent sections will delve deeper into the implementation of the CSA with the ICON Cell and present simulation results to illustrate its effectiveness.

In steady-state conditions, the feedback loop ensures that the current flowing into resistor $R_1$ matches the current generated by the source $I_{R_1}$. This balance results in zero current being injected into the ICON Cell, making its bias current independent of the resistor bias current $I_{R_1}$. Once the detector injects charge into the input, the output voltage signal begins to rise. This output voltage is subsequently converted into current through resistor $R_1$, which flows into the ICON Cell. Within the ICON Cell, this current is demagnified by the mirroring factor K. The demagnified current is then sent back to the input node of the CSA, where it initiates the discharge of the feedback capacitance ($C_F$).



Fig. 3-9. Schematic of the current conveyor known as an ICON Cell.

A simple analysis of the circuit shown in Fig. 3-9 shows that  $V_{out}$  and  $I_{in}$  are related by the following equation:

$$\frac{V_{out}}{I_{in}} = \frac{KR_1}{(1+sC_FR_1(K+1))(1+s\tau_{CL})}$$
(3-7)
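As a sketch of what Eq. 3-7 implies in the time domain, assume that the detector delivers its charge as an ideal impulse, $I_{in}(s) = Q_{in}$, and that $\tau_{CL} \ll \tau_F$ with $\tau_F = (K+1) R_1 C_F$. These are simplifying assumptions introduced here for illustration, and signs are ignored. Expanding Eq. 3-7 in partial fractions and transforming back gives:

$$v_{out}(t) = \frac{Q_{in} K R_1}{\tau_F - \tau_{CL}} \left(e^{-t/\tau_F} - e^{-t/\tau_{CL}}\right), \qquad V_{peak} \approx \frac{Q_{in} K R_1}{\tau_F} = \frac{K}{K+1} \cdot \frac{Q_{in}}{C_F}$$

The pulse therefore rises with $\tau_{CL}$ and decays with $\tau_F$. For the values used later in this chapter ($Q_{in} = 160$ aC, $C_F = 5$ fF, $K = 80$), the idealized peak is about 31.6 mV.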

| Parameters    | Poly Resistor  | Krummenacher              | ICON Cell          |
|---------------|----------------|---------------------------|--------------------|
| Resistance    | $R_F$          | $2/g_{m1}$                | $(K+1) \times R_1$ |
| Time Constant | $\propto R_F$  | $\propto Q_{in}/I_{Krum}$ | $\propto K R_1$    |
| Noise         | $4kT/R_F$      | $\propto I_{Krum}$        | $\propto I_{out}$  |

Table 3-1. Characteristics of the popular solutions for implementing a larger feedback resistor.

The dominant noise source in a CSA with an ICON Cell is the current noise contribution coming from the output branch of the mirror $(I_{out})$. In fact, thanks to the mirroring factor K, all the noise contributions present in the input branch of the mirror (generator $I_{R_1}$, transistor $M_b$, and resistor $R_1$) are attenuated by a factor of $K^2$ [9]. Therefore, to keep the current noise contribution small, the output branch of the ICON Cell is biased in the subthreshold region with an ultra-small current in the sub-nanoampere range. Table 3-1 presents and compares the characteristics of popular solutions for implementing a larger feedback resistor.

#### 3.3.4 Noise

The primary noise sources in the front-end electronics for this design are voltage noise $(v_n)$ and current noise $(i_n)$, both of which significantly influence the overall noise performance of the preamplifier system. As discussed in subsection 3.4.1, the feedback network's time constant $(\tau_F)$ is critical for optimizing the noise characteristics of the CSA.



Fig. 3-10. Theoretical ENC as a function of the feedback time constant ($\tau_F$).

Figure 3-10 illustrates the theoretical ENC of the CSA stage as a function of the feedback time constant ($\tau_F$). With a feedback capacitance of $C_F = 5$ fF, the minimum noise ($ENC_{opt} = 39\ e^-_{rms}$) occurs at an optimum feedback time constant of $\tau_{F_{opt}} = 32.7$ ns, corresponding to an optimum feedback resistor value of $R_{F_{opt}} = 6.6 \text{ M}\Omega$. For feedback resistors of $R_F = 5 \text{ M}\Omega$ and $R_F = 10 \text{ M}\Omega$, the calculated theoretical ENC values are $39\ e^-_{rms}$ and $40\ e^-_{rms}$, respectively. In the former case, voltage noise dominates, while in the latter, current noise becomes the primary contributor. These results suggest that the theoretical SNR can range between 20 and 30 for the given input charge $(Q_{in})$.
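The SNR range follows directly from the ENC values when the input charge is expressed in electrons. The sketch below assumes $Q_{in} = 1000$ electrons, the value used in the simulation setup later in this chapter, and defines the SNR as $Q_{in}/ENC$; the function names are illustrative.

```python
def snr(q_in_electrons, enc_electrons):
    """Signal-to-noise ratio, with signal and noise both expressed in electrons."""
    return q_in_electrons / enc_electrons

def time_constant(r_f, c_f):
    """Feedback time constant tau_F = R_F * C_F."""
    return r_f * c_f

Q_IN_ELECTRONS = 1000   # assumed input charge (simulation setup of this chapter)
C_F = 5e-15             # feedback capacitance [F]

# (R_F [ohm], theoretical ENC [e- rms]) pairs quoted in the text above
for r_f, enc_value in ((5e6, 39.0), (10e6, 40.0)):
    tau_ns = time_constant(r_f, C_F) * 1e9
    print(f"R_F = {r_f / 1e6:4.1f} MOhm: tau_F = {tau_ns:4.1f} ns, "
          f"SNR = {snr(Q_IN_ELECTRONS, enc_value):.1f}")
```

Both cases give an SNR of about 25, inside the 20 to 30 range stated above.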

#### 3.3.5 Design and Implementation

This subsection details the CSA design, optimized for the target application. A folded Cascode amplifier is chosen for its high gain, wide bandwidth, and power efficiency. The feedback network incorporates an ICON Cell to achieve the required resistance, ensuring optimal signal processing while mitigating space, noise, and parasitic capacitance challenges. CSA loop analysis and stability considerations are covered in Appendix A.

#### 3.3.5.1 Folded Cascode Amplifier

As a tradeoff between achieving low noise and maintaining high gain and high input signal throughput, a Folded Cascode architecture is employed for the core amplifier, as illustrated in Fig. 3-11. This architecture allows for a balance between the noise performance and the overall gain of the amplifier, while also supporting high-speed signal processing.



Fig. 3-11. Schematic of the Folded Cascode amplifier.

To ensure proper charge integration in the feedback capacitance ($C_F$), the condition ($C_{in}/(C_D + C_G)$) > 10 must be satisfied, where $C_D$ is the detector capacitance, $C_G$ is the gate capacitance of the amplifier input transistor ($M_1$), and $C_{in} = C_F \times (1 + G_{Amp} \cdot \frac{C_F}{C_F + C_D + C_G})$ is the equivalent feedback capacitance seen from the input. With $C_F = 5$ fF, $C_D = 50$ fF, and $C_G = 9$ fF, the DC gain of the amplifier should be $G_{Amp} > 100$ (40 dB) to integrate the input charge with less than 10% charge loss. Additionally, given these parameters, the amplifier should have its first pole ($f_{p_1}$) above 40 MHz and a GBWP greater than 4.8 GHz for proper system performance.

The specific design of the Folded Cascode amplifier, including the dimensions of the transistors and their corresponding drain currents, is detailed in Table 3-2. These parameters are optimized to ensure that the amplifier operates efficiently while meeting the requirements for noise reduction, gain, and input signal handling capabilities.

|                     | M<sub>1</sub> | M<sub>2</sub> | M<sub>3</sub> |
|---------------------|---------------|---------------|---------------|
| W [μm]              | 18            | 1.26          | 8.71          |
| L [μm]              | 0.1           | 0.25          | 0.15          |
| I<sub>D</sub> [μA]  | 118           | 2             | 2             |

Table 3-2. Dimensions and Currents of the Transistors in Folded Cascode Amplifier.



Fig. 3-12. Bode diagram of the Folded Cascode amplifier.

To enhance the DC gain, the amplifier is biased such that the output branch is driven by a small current ($I_{D2} = 2 \mu A$) while the input device (a high-threshold NMOS transistor) operates in the subthreshold region with a drain current of $I_{D1} = 118 \mu A$. Under these conditions, a transconductance of $g_{m1} = 1.5$ mS is achieved, which is primarily determined by the drain current $I_{D1}$. Figure 3-12 illustrates the simulated Bode diagram of the Folded Cascode amplifier, which indicates a DC gain of $G_{Amp} = 45.3$ dB with poles at $f_{p_1} = 44$ MHz and $f_{p_2} = 2.8$ GHz, as well as a GBWP of 8.1 GHz. In terms of noise performance, the simulated input-referred flicker noise power spectral density is $4.51 \ \mu V^2/Hz$, while the white noise power spectral density is $32.91 \ fV^2/Hz$.
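The simulated figures can be cross-checked against the requirements stated earlier in this subsection. The sketch assumes a dominant-pole response, so that the GBWP is the linear DC gain multiplied by the first pole frequency; the dictionary of requirements simply restates the thresholds from the text.

```python
def db_to_linear(gain_db):
    """Convert a voltage gain in dB to a linear ratio."""
    return 10.0 ** (gain_db / 20.0)

GAIN_DB = 45.3   # simulated DC gain [dB]
F_P1 = 44e6      # simulated first pole [Hz]

gain = db_to_linear(GAIN_DB)
gbwp = gain * F_P1   # dominant-pole approximation

# Requirements restated from the text: gain > 100, f_p1 > 40 MHz, GBWP > 4.8 GHz
checks = {
    "DC gain > 100": gain > 100,
    "f_p1 > 40 MHz": F_P1 > 40e6,
    "GBWP > 4.8 GHz": gbwp > 4.8e9,
}
print(f"gain = {gain:.0f}, GBWP = {gbwp / 1e9:.1f} GHz")
for name, passed in checks.items():
    print(f"{name}: {'met' if passed else 'not met'}")
```

The product of the linear gain (about 184) and the 44 MHz pole reproduces the quoted GBWP of 8.1 GHz.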

#### 3.3.5.2 Source Follower Stage

The core amplifier is followed by a source follower stage to form a high speed negative feedback through capacitor  $C_F$ . As shown in Fig. 3-13, the source follower stage contains a transistor  $M_b$  which is connected to a polysilicon resistor ( $R_1$ ) and a current source ( $I_{R_1}$ ) at its source and drain pins, respectively.



Fig. 3-13. Schematic of the source follower stage.

This configuration ensures stable and consistent voltage delivery to the feedback capacitor, maintaining CSA performance. A crucial design consideration for the source follower is placing its poles and zeros outside the loop bandwidth to avoid interference with the CSA's frequency response, preserving high-speed performance and time resolution. Additionally, a large equivalent resistance is required at the drain of transistor $M_b$ to maintain stability and ensure proper feedback loop operation.

The source follower's noise contribution must also be managed to prevent degradation of the CSA's noise performance. Since the noise is propagated through the core amplifier or the ICON Cell, its effect is significantly reduced. For the core amplifier, the noise is divided by the square of the amplifier's gain, and for the ICON Cell, it is reduced by the mirroring factor  $K^2$ . As a result, the source follower's input-referred noise is negligible compared to other system noise sources and does not significantly affect the overall noise performance of the CSA.

#### 3.3.5.3 ICON Cell DC biasing

The schematic of the ICON Cell, shown in Fig. 3-9, illustrates its primary function of demagnifying the signal current by a factor K. This is achieved by biasing the current mirror branches with different DC currents, ensuring that the current at the drain of the transistor  $M_b$  (Fig. 3-8) in the source follower stage is properly directed towards the ICON Cell. The transistors  $M_6$  and  $M_7$  (Fig. 3-9) act as common-gate transistors, presenting a low resistance path for the current. The equivalent resistance seen at the input of the ICON Cell can be expressed as:

$$R_{eq} = \frac{1}{g_{m_6}} || \frac{1}{g_{m_7}} = \frac{1}{g_{m_6} + g_{m_7}}$$
(3-9)

where  $g_{m_6}$  and  $g_{m_7}$  are the transconductances of  $M_6$  and  $M_7$ , respectively. The total current noise contribution ( $I_{noise}$ ) at the output branch of the ICON Cell can be expressed as:

$$I_{\text{noise}} = \sqrt{\left(2qI_{\text{DS}_9}\right)^2 + \left(2qI_{\text{DS}_{10}}\right)^2 + \frac{(I_{\text{in}})^2}{K^2}}$$
(3-10)

where  $I_{DS_9}$  and  $I_{DS_{10}}$  are the bias currents in the output branch of the ICON Cell, and  $I_{in}$  is the total equivalent noise at the input branch of the ICON Cell. This input noise

includes contributions from the source follower stage and the transistor at the input branch.

As Eq. 3-10 indicates, due to the large mirror factor K, only the noise contributions of transistors  $M_9$  and  $M_{10}$  significantly impact the total noise. These contributions are proportional to the DC current flowing in the output branch of the ICON Cell ( $I_{out}$ ). To minimize the noise, the ICON Cell's output branch is biased in the subthreshold region with a very small current ( $I_{out} = 230$  pA). This low current helps to reduce the current noise while maintaining the desired signal characteristics. The ICON Cell implements a mirroring factor K = 80, which effectively reduces the noise sources in the input branch and provides the required discharge tail for the voltage signal after the CSA. Table 3-3 presents the dimensions and currents of the transistors in the ICON Cell.

|                     | M<sub>5</sub> | M<sub>6</sub> | M<sub>7</sub> | M<sub>8</sub> | M<sub>9</sub> | M<sub>10</sub> |
|---------------------|---------------|---------------|---------------|---------------|---------------|----------------|
| W [μm]              | 9.4           | 0.55          | 1.09          | 12.15         | 0.12          | 0.24           |
| L [μm]              | 0.7           | 0.21          | 0.2           | 0.31          | 3             | 2              |
| I<sub>D</sub> [nA]  | 18.5          | 18.5          | 18.5          | 18.5          | 0.230         | 0.230          |

Table 3-3. Dimensions and Drain Currents of the Transistors in ICON Cell.
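To give a sense of scale, the current noise of the ICON Cell output branch can be compared with the thermal noise of a physical resistor of the same equivalent value. The sketch below is an order-of-magnitude estimate under stated assumptions: each of the two output-branch devices is assigned a shot-noise power spectral density of $2qI$, the temperature is taken as 300 K, and the input-branch term is neglected because of the $K^2$ attenuation. It does not reproduce Eq. 3-10 term by term.

```python
K_B = 1.380649e-23           # Boltzmann constant [J/K]
Q_E = 1.602176634e-19        # elementary charge [C]

def thermal_current_psd(resistance, temperature=300.0):
    """Thermal current-noise PSD of a resistor, 4kT/R [A^2/Hz]."""
    return 4.0 * K_B * temperature / resistance

def shot_current_psd(bias_current):
    """Shot-noise PSD of a device carrying bias_current, 2qI [A^2/Hz]."""
    return 2.0 * Q_E * bias_current

R_F = 10.82e6     # equivalent feedback resistance [ohm]
I_OUT = 230e-12   # ICON Cell output branch bias current [A]

thermal = thermal_current_psd(R_F)
shot_total = 2 * shot_current_psd(I_OUT)   # two output-branch devices (assumption)
print(f"resistor: {thermal:.2e} A^2/Hz, ICON output branch: {shot_total:.2e} A^2/Hz")
print(f"ratio: {thermal / shot_total:.1f}")
```

Under these assumptions the output branch contributes roughly an order of magnitude less current noise than a 10.82 M$\Omega$ resistor would.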



Fig. 3-14. Schematic of the ICON Cell bias driver.

The bias voltages $V_{b_6}$ and $V_{b_7}$ can be supplied externally or by the driver stage, as shown in Fig. 3-14. When provided by the driver stage, these voltages are dynamically adjusted based on the input signal at the ICON Cell's input node ($V_A$). This adjustment creates a negative feedback loop, enabling the driver stage to control the gates of $M_6$ and $M_7$. This reduces the equivalent resistance ($R_{eq}$) at the ICON Cell's input, directing more of the input signal to the ICON Cell, improving its performance, and minimizing dissipation in the source follower stage. The feedback loop also ensures the input node ($V_A$) is maintained at a fixed level defined by an external reference voltage $V_{ref}$, enhancing the stability and precision of the ICON Cell's operation under varying input conditions.

#### 3.3.5.4 CSA Programmability

The programmability of the CSA allows for dynamic adjustments in gain and timing, optimizing performance for different detection requirements. The gain of the CSA is inversely proportional to the feedback capacitance ($C_F$); the smaller the $C_F$, the larger the amplitude of the voltage signal after the CSA. Conversely, the CSA loop bandwidth is directly proportional to $C_F$. Therefore, the value of $C_F$ is set based on the gain and bandwidth requirements [15], [16]. The gain of the CSA can be adjusted by adding an identical capacitor in parallel to the existing $C_F$ through a switch controlled by the *Gain\_Prog* signal. For *Gain\_Prog* = '0' (default mode), the CSA operates in high gain mode with $C_F = 5$ fF, while for *Gain\_Prog* = '1', the CSA operates in low gain mode with $C_F = 10$ fF.

For a fixed value of $C_F$, the time width of the CSA voltage signal is proportional to the value of $R_1$ [6], [16]. The CSA can operate in slow ($R_1 = 133 \text{ k}\Omega$) or fast ($R_1 = 67 \text{ k}\Omega$) modes, depending on the status of a switch controlled by the *Width\_Prog* signal. The polysilicon resistor $R_1$, as shown in Fig. 3-8, can be implemented in two ways: 1) using one large resistor $R_a = R_1 = 133 \text{ k}\Omega$, or 2) using two identical smaller resistors $R_b = R_1/2 = 67 \text{ k}\Omega$ in series. In the first case, illustrated in Fig. 3-15(a), the fast mode (*Width\_Prog* = '1') is achieved by adding an identical resistor in parallel to $R_a$. In the second case, illustrated in Fig. 3-15(b), the fast mode is achieved by shorting out one of the $R_b$ resistors. The smaller $R_1$ resistance value in fast mode increases the DC current flowing through the source of $M_b$. To maintain the same biasing points in both modes, the drain current of $M_b$ must be increased to compensate for the additional source current. Therefore, as indicated in Fig. 3-15, an additional current source $I_R$ should be added at the drain of $M_b$.
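The four operating modes can be summarized with a small first-order model. The sketch below uses $R_F = (K+1) R_1$ with $K = 80$ and an idealized peak amplitude $Q_{in}/C_F$ for $Q_{in} = 160$ aC; it ignores the finite loop gain and the parasitics that are captured by the post-layout simulations, and the dictionary and function names are illustrative.

```python
from itertools import product

K_MIRROR = 80     # current mirroring factor of the ICON Cell
Q_IN = 160e-18    # input charge [C], about 1000 electrons

GAIN_MODES = {0: 5e-15, 1: 10e-15}   # Gain_Prog  -> C_F [F]
WIDTH_MODES = {0: 133e3, 1: 67e3}    # Width_Prog -> R_1 [ohm]

def csa_mode(gain_prog, width_prog):
    """Return (R_F, tau_F, ideal peak voltage) for a pair of control bits."""
    c_f = GAIN_MODES[gain_prog]
    r_1 = WIDTH_MODES[width_prog]
    r_f = (K_MIRROR + 1) * r_1   # equivalent feedback resistance
    tau_f = r_f * c_f            # feedback time constant
    v_peak = Q_IN / c_f          # first-order charge-to-voltage gain
    return r_f, tau_f, v_peak

for gain_prog, width_prog in product((0, 1), repeat=2):
    r_f, tau_f, v_peak = csa_mode(gain_prog, width_prog)
    print(f"Gain_Prog={gain_prog} Width_Prog={width_prog}: "
          f"R_F = {r_f / 1e6:5.2f} MOhm, tau_F = {tau_f * 1e9:6.1f} ns, "
          f"V_peak = {v_peak * 1e3:4.1f} mV")
```

In this model, halving $R_1$ halves the time constant, while doubling $C_F$ doubles the time constant and halves the amplitude.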



Fig. 3-15. CSA configurations for realizing the fast mode.

#### 3.3.5.5 Simulation Results

In the simulations for this work, the PIN-diode is represented by a digitally controlled current source in parallel with a capacitance $C_D$. This model emulates the detector's behavior by generating charge pulses with an area equivalent to $Q_{in} = 160 \text{ aC}$ (equivalent to 1000 electrons) and a pulse width matching the charge collection time of the detector, $t_{Collection} = 1.8 \text{ ns}$. This setup ensures that the CSA is stimulated in a way that closely approximates the actual charge collection dynamics.
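The stimulus parameters translate into a pulse amplitude as sketched below. A rectangular pulse shape is assumed for simplicity; the text specifies only the pulse area and width, so the actual shape used in the simulations may differ.

```python
ELEMENTARY_CHARGE = 1.602176634e-19   # [C]

def detector_pulse(n_electrons, t_collection):
    """Return (total charge [C], amplitude [A]) of a rectangular current pulse."""
    q_in = n_electrons * ELEMENTARY_CHARGE
    return q_in, q_in / t_collection

q_in, i_pulse = detector_pulse(1000, 1.8e-9)
print(f"Q_in = {q_in * 1e18:.1f} aC, pulse amplitude = {i_pulse * 1e9:.1f} nA")
```

For 1000 electrons collected in 1.8 ns, the charge is about 160 aC and the rectangular pulse amplitude is about 89 nA.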

Table 3-4 summarizes the post-layout simulation results for the designed CSA, detailing how the circuit's performance characteristics change with $C_D = 50$ fF and different feedback component values, which, in turn, adjust the effective time constants of the system. These simulated values provide insight into the CSA's behavior under various configurations, ensuring that it meets the requirements for gain, speed, and noise across different operating modes. The optimal operating point of the CSA, balancing both processing speed and SNR, is achieved with a feedback capacitance $C_F = 5$ fF and a feedback resistor $R_1 = 133 \text{ k}\Omega$. In combination with a current mirror factor K = 80, this configuration yields an equivalent feedback resistance $R_F = 10.82 \text{ M}\Omega$ and a corresponding bandwidth of 2.94 MHz for the CSA. The CSA has a power consumption of 140 $\mu$W.
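The quoted bandwidth is consistent with the corner frequency of the feedback network. The sketch below assumes that the bandwidth refers to $1/(2\pi R_F C_F)$, which is an interpretation of the text rather than a stated definition.

```python
import math

def feedback_corner_frequency(r_f, c_f):
    """Corner frequency of the parallel R_F-C_F feedback network [Hz]."""
    return 1.0 / (2.0 * math.pi * r_f * c_f)

f_corner = feedback_corner_frequency(10.82e6, 5e-15)
print(f"f = {f_corner / 1e6:.2f} MHz")
```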

| CSA Mode                             | High Gain, Slow | High Gain, Fast | Low Gain, Slow | Low Gain, Fast |
|--------------------------------------|-----------------|-----------------|----------------|----------------|
| t<sub>r</sub> [ns]                   | 2.35            | 2.1             | 1.94           | 1.91           |
| t<sub>tail</sub>\* [ns]              | 259.6           | 133.8           | 579.6          | 264.1          |
| ENC [e<sup>-</sup><sub>rms</sub>]    | 44              | 43              | 52             | 50             |
| SNR                                  | 22.7            | 23.2            | 19.2           | 20.3           |

Table 3-4. Post-layout simulation characteristics of the CSA for different values of feedback components.

\*Discharge tail of the CSA voltage signal

Figure 3-16 shows the simulated voltage signal at the output of the CSA for both fast and slow operating modes, demonstrating the variations in speed achievable with $C_F = 5$ fF for $C_D = 50$ fF. For different gain settings, Figure 3-17 illustrates the simulated output voltage signal in low and high gain modes, with $R_1 = 133 \text{ k}\Omega$ and $C_D = 50$ fF. These simulations confirm the adaptability of the CSA, providing flexibility in both speed and gain configurations, which is essential for tailoring the amplifier to diverse detection conditions and optimizing its performance within the system's design constraints.

Tables 3-5 and 3-6 list the parameter spread for the fast and slow operation modes, obtained from a 200-point Monte Carlo analysis with a feedback capacitance of $C_F = 5$ fF and $C_F = 10$ fF, respectively.

As previously discussed, the bandwidth of the CSA, and consequently its SNR, is significantly influenced by the detector junction capacitance ($C_D$). This relationship is illustrated in Figure 3-18, which presents the SNR as a function of $C_D$ for a CSA configured in high-gain mode. Two scenarios are depicted: one with $R_1 = 67 \text{ k}\Omega$ (blue curve) and another with $R_1 = 133 \text{ k}\Omega$ (red curve). The plot demonstrates a clear trend: as the detector junction capacitance ($C_D$) decreases, the SNR improves. This improvement is attributed to the reduced capacitive load, which enhances the loop bandwidth and minimizes the noise contribution from the CSA, underscoring the critical role of $C_D$ in optimizing system performance.



Fig. 3-16. Simulated CSA output voltage signal for the fast and slow modes with  $C_F = 5$  fF.



Fig. 3-17. Simulated CSA output voltage signal for the low and high gain modes with $R_1 = 133 \text{ k}\Omega$.

| Parameters                 | Fast Mode μ | Fast Mode σ | Slow Mode μ | Slow Mode σ |
|----------------------------|-------------|-------------|-------------|-------------|
| $R_F$ [M$\Omega$]          | 5.29        | 1.32        | 10.96       | 1.36        |
| t<sub>tail</sub> [ns]      | 135.6       | 32.3        | 261.6       | 44.6        |
| t<sub>p</sub> [ns]         | 2.24        | 0.31        | 2.43        | 0.29        |
| Signal<sub>max</sub> [mV]  | 29.5        | 0.5         | 30.8        | 0.7         |

Table 3-5. Parameter spread for fast and slow operation modes from a Monte Carlo analysis with $C_F = 5$ fF.

| Parameters                 | Fast Mode μ | Fast Mode σ | Slow Mode μ | Slow Mode σ |
|----------------------------|-------------|-------------|-------------|-------------|
| $R_F$ [M$\Omega$]          | 5.29        | 1.32        | 10.96       | 1.36        |
| t<sub>tail</sub> [ns]      | 268.8       | 31.6        | 584.2       | 43.8        |
| t<sub>p</sub> [ns]         | 2.08        | 0.3         | 2.15        | 0.27        |
| Signal<sub>max</sub> [mV]  | 14.86       | 0.52        | 15.07       | 0.66        |

Table 3-6. Parameter spread for fast and slow operation modes from a Monte Carlo analysis with $C_F = 10$ fF.
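A convenient way to compare the spread of quantities with different units is the relative spread $\sigma/\mu$. The sketch below computes it for the $C_F = 5$ fF results of Table 3-5; the dictionary layout and function name are illustrative.

```python
# (mean, standard deviation) pairs copied from Table 3-5 (C_F = 5 fF)
TABLE_3_5 = {
    "R_F [MOhm]":      {"fast": (5.29, 1.32),  "slow": (10.96, 1.36)},
    "t_tail [ns]":     {"fast": (135.6, 32.3), "slow": (261.6, 44.6)},
    "t_p [ns]":        {"fast": (2.24, 0.31),  "slow": (2.43, 0.29)},
    "Signal_max [mV]": {"fast": (29.5, 0.5),   "slow": (30.8, 0.7)},
}

def relative_spread(mean, sigma):
    """Coefficient of variation sigma/mu."""
    return sigma / mean

for name, modes in TABLE_3_5.items():
    fast = relative_spread(*modes["fast"]) * 100
    slow = relative_spread(*modes["slow"]) * 100
    print(f"{name:16s} fast: {fast:5.1f} %   slow: {slow:5.1f} %")
```

In these data the signal amplitude varies by only about 2%, whereas $R_F$ and the discharge tail show relative spreads of roughly 12% to 25%.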



Fig. 3-18. SNR as a function of $C_D$ for a CSA configured in high-gain mode, with $R_1 = 67 \text{ k}\Omega$ (blue curve) and $R_1 = 133 \text{ k}\Omega$ (red curve).

# 3.4 Signal Shaping Filter

The primary objective of the analog frontend design is to detect and count BSEs impacting the detector pixel within 2.5 ns, with this data encoded in the rising edge of the CSA output. After the CSA, the signal can be processed digitally via comparison with a reference level. However, CSA's limited bandwidth can cause a prolonged tail in the output, leading to signal pileup and ISI errors, as shown in Figure 3-19. Additionally, while the CSA is optimized for noise performance and power consumption, it may suffer from offset drift, further compromising detection accuracy [16].



Fig. 3-19. Example of direct discrimination after the low bandwidth CSA. The blue solid line represents the voltage signal after the CSA while the red dashed line illustrates the threshold voltage level.

To resolve these issues, a signal shaping filter is added to mitigate ISI errors and achieve the required time resolution. The filter attenuates the low-frequency offset components of the CSA output while preserving the high-frequency rising phase, implementing a high-pass transfer function [17], [18].

The filter order affects output signal characteristics like amplitude, time width, and noise performance. Higher-order filters provide better low-frequency noise attenuation and improved signal clarity but may lead to increased complexity and signal loss [1], [19]. Lower-order filters, while maintaining signal amplitude, may result in slower rise times and longer time widths, heightening the risk of ISI in high-rate detection scenarios [1]. Higher-order filters also excel at suppressing out-of-band noise, improving SNR, which is critical in high-speed applications.

Two filter solutions are considered: a passive CR network [18] and an active filter with baseline restoration (BLR) [17]. Passive filters, such as first-order CR networks, are simple, low-power, and suitable for lower-speed applications, but they attenuate signal amplitude. They offer sufficient high-frequency noise attenuation, but the longer tail in the output signal may impact high-speed performance.

Active filters use amplifiers to boost signal levels and improve noise performance. A second-order active filter, like a band-pass filter with BLR, narrows the time width, enhances SNR, and ensures sharper transitions, reducing the risk of signal overlap [17], [19]. Second-order filters are chosen to balance noise performance with minimal amplitude loss, providing better signal isolation for accurate detection.

However, higher-order active filters come with trade-offs, such as increased power consumption due to added amplifier complexity. This can be a limitation in low-power or portable applications, and the feedback network may introduce noise that needs to be managed carefully.

Choosing between passive and active filters involves balancing filter order, signal amplitude, time width, noise performance, and power consumption. The passive filter is optimized for low power, providing adequate high-frequency noise suppression, while the active filter is selected for enhanced noise performance and sharper signal transitions, crucial for accurate detection in high-resolution applications.

## 3.4.1 Passive High-pass Signal Shaping Filter

The passive high-pass signal shaping filter enhances detection accuracy by eliminating ISI-induced errors and the offset while consuming negligible power. Fig. 3-20 illustrates the circuit diagram and transfer function of this filter, which is implemented entirely with passive components.

The passive high-pass shaping filter is a first-order network with a transfer function which can be expressed as:

$$T_{HPF}(S) = \frac{SR_{HPF}C_{HPF}}{1 + SR_{HPF}(C_{HPF} + C_{Disc})}$$
(3-11)

where $C_{HPF}$ and $R_{HPF}$ are the main components of the RC network, and $C_{Disc}$ represents the total capacitance seen at the input of the next block, the discriminator. This passive shaping filter passes signal frequency contributions that are higher than a certain cut-off frequency $f_C$, which can be expressed as:

$$f_{C} = \frac{1}{2\pi\tau_{HPF}}$$
(3-12)

where  $\tau_{HPF} = R_{HPF}(C_{HPF} + C_{Disc})$  is the filter time constant which must be carefully set to maximize the SNR at the filter output and limit the signal time-width to fit in a time frame of 2.5 ns after discrimination. Regarding the rise time of the CSA voltage signal [16] and considering a margin of 80 MHz for the passband [17], the cut-off frequency f<sub>C</sub> of the filter should be below 300 MHz.
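As a quick numerical check of Eq. (3-12), the following sketch evaluates the cut-off frequency for a few sub-nanosecond time constants and compares each against the 300 MHz bound stated above. The swept values mirror the range later listed in Table 3-7; the function and constant names are illustrative and not part of the original design flow.

```python
import math

F_C_LIMIT_HZ = 300e6  # upper bound on f_C stated in the text


def cutoff_frequency_hz(tau_s: float) -> float:
    """Eq. (3-12): f_C = 1 / (2 * pi * tau_HPF)."""
    return 1.0 / (2.0 * math.pi * tau_s)


# Time constants in ns; the same range that Table 3-7 sweeps.
for tau_ns in (0.5, 0.6, 0.7, 0.8, 0.9):
    f_c = cutoff_frequency_hz(tau_ns * 1e-9)
    status = "below" if f_c < F_C_LIMIT_HZ else "above"
    print(f"tau_HPF = {tau_ns:.1f} ns -> f_C = {f_c / 1e6:6.1f} MHz ({status} 300 MHz)")
```

Under this relation, a time constant of 0.5 ns places the cut-off slightly above 300 MHz, whereas values of 0.6 ns and larger stay below it.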



Fig. 3-20. Circuit diagram (a) and the transfer function (b) of a passive high-pass shaping filter.

#### 3.4.1.1 Design Considerations

The passive shaping filter, unlike the active one, does not amplify the CSA output but attenuates it by a factor of  $C_{HPF}/(C_{HPF} + C_{Disc})$ , where  $C_{Disc}$  models the loading effect of the discriminator block. To minimize attenuation,  $C_{HPF}$  should be larger than  $C_{Disc}$  but not excessively large to avoid compromising CSA stability. For a  $C_{Disc}$  of 11.8 fF, the filter capacitor is set to 35 fF. Figure 3-21 presents simulation results showing the signal after the passive high-pass shaping filter for various time constants  $\tau_{HPF}$ .
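The attenuation factor and the resistor implied by a given time constant follow directly from Eq. (3-11). The sketch below uses the capacitor values quoted above and, as an example only, a time constant of 0.7 ns; the variable names are illustrative.

```python
import math

C_HPF = 35e-15     # filter capacitor from the text [F]
C_DISC = 11.8e-15  # discriminator input capacitance from the text [F]
TAU_HPF = 0.7e-9   # example time constant [s]; an assumption for this sketch

# High-frequency attenuation of the capacitive divider C_HPF / (C_HPF + C_Disc).
attenuation = C_HPF / (C_HPF + C_DISC)

# Resistor implied by tau_HPF = R_HPF * (C_HPF + C_Disc).
r_hpf = TAU_HPF / (C_HPF + C_DISC)

# Cut-off frequency from Eq. (3-12).
f_c = 1.0 / (2.0 * math.pi * TAU_HPF)


def t_hpf(f_hz: float) -> complex:
    """Eq. (3-11) evaluated at s = j * 2 * pi * f."""
    s = 2j * math.pi * f_hz
    return s * r_hpf * C_HPF / (1 + s * r_hpf * (C_HPF + C_DISC))


print(f"attenuation       = {attenuation:.3f}")
print(f"R_HPF             = {r_hpf / 1e3:.2f} kOhm")
print(f"|T_HPF| at f_C    = {abs(t_hpf(f_c)):.3f}")
print(f"|T_HPF| at 10 GHz = {abs(t_hpf(10e9)):.3f}")
```

With these values the filter passes roughly three quarters of the CSA amplitude well above the cut-off, and the magnitude at $f_C$ is lower by a further factor of $\sqrt{2}$, as expected for a first-order network.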

While the passive filter design is simple, variations in process and temperature affect its transfer characteristics, potentially degrading detection accuracy. Additionally, the capacitance $C_{Disc}$, representing the discriminator input capacitance, is also susceptible to process variations, altering signal amplitude, time width, and SNR. Table 3-7 summarizes the amplitude, time width, and SNR of the signal after the passive high-pass shaping filter for different time constants $\tau_{HPF}$. Simulation results show that the passive high-pass filter meets the desired requirements with a time constant of $\tau_{HPF} = 0.7$ ns.



Fig. 3-21. Signal after the passive high-pass shaping filter with different filter time constants $\tau_{HPF}$ for an input signal provided by the CSA.

Table 3-7. Amplitude, time width, and SNR for the signal after the passive high-pass shaping filter for different time constants.

| τ <sub>HPF</sub><br>[ns] | V <sub>Amp</sub><br>[mV] | t <sub>Width</sub><br>[ns] | σ <sub>Noise</sub><br>[mV <sub>rms</sub> ] | SNR  |
|--------------------------|--------------------------|----------------------------|--------------------------------------------|------|
| 0.5                      | 8.1                      | 2.8                        | 0.57                                       | 14.2 |
| 0.6                      | 9.3                      | 3.1                        | 0.63                                       | 14.6 |
| 0.7                      | 10.4                     | 3.3                        | 0.68                                       | 15.3 |
| 0.8                      | 11.2                     | 3.6                        | 0.71                                       | 15.8 |
| 0.9                      | 12                       | 3.9                        | 0.74                                       | 16.2 |

#### 3.4.1.2 Performance Verification

The amplitude of the signal after the passive shaping filter reflects attenuation due to energy loss in the RC network. To maintain high detection accuracy, minimizing the discriminator input-referred noise and offset is crucial, although this increases power consumption.

A drawback of the passive high-pass shaping filter is the signal undershoot, where the signal drops below the baseline during the falling phase, requiring a long recovery time. Figure 3-22 shows the undershoot amplitude at the filter output as a function of the filter resistor $R_{HPF}$ for various capacitors $C_{HPF}$. For $C_{HPF} = 35$ fF and $R_{HPF} = 20$ k$\Omega$, the undershoot enters a plateau with a nearly constant amplitude.

This undershoot can negatively impact detection accuracy, especially when multiple electrons hit the detector in consecutive frames. Figure 3-23 illustrates the filter output for  $C_{HPF} = 35$  fF and varying  $R_{HPF}$  values, showing a decrease in signal peak due to the undershoot, which lowers the SNR. In the worst-case scenario, the signal may fail to exceed the discriminator threshold, resulting in missed detections.



Fig. 3-22. Amplitude of the signal undershoot at the shaping filter output as a function of the filter resistor  $R_{HPF}$  for several values of the filter capacitor  $C_{HPF}$ .



Fig. 3-23. Passive shaping filter output signal for three cascading electrons hitting the detector in three consecutive time frames.

## 3.4.2 Active High-Pass Signal Shaping Filter

The active signal shaping filter uses active components to achieve the desired performance and transfer function. As shown in Fig. 3-24, a common approach is to configure the filter in a closed-loop mode, with amplifiers in the forward path and a low-pass network in the feedback loop. This feedback network performs a BLR function, reducing the gain at low frequencies by a factor of $G_{Loop}(0)$. At higher frequencies, beyond the first pole, the transfer function mirrors the forward path response.
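The effect of a low-pass feedback path on the closed-loop gain can be illustrated with a simplified model. The sketch below assumes a frequency-independent forward gain and a single-pole loop gain, which is not the actual circuit; all numerical values and names are hypothetical and chosen only for illustration. In this model the low-frequency gain is reduced by $1 + G_{Loop}(0)$, which approaches $G_{Loop}(0)$ for a large loop gain, and the response has a zero at the feedback pole and a pole at $(1 + G_{Loop}(0))$ times that frequency.

```python
import math


def closed_loop_gain(f_hz: float, a_fwd: float, g_loop0: float, f_lpf_hz: float) -> float:
    """Magnitude of A / (1 + G_loop(jf)) with a single-pole low-pass loop gain."""
    loop = g_loop0 / complex(1.0, f_hz / f_lpf_hz)
    return abs(a_fwd / (1.0 + loop))


# Hypothetical values chosen only for illustration.
A_FWD = 10 ** (17 / 20)  # forward-path gain, about 17 dB
G_LOOP0 = 9.0            # DC loop gain G_Loop(0)
F_LPF_HZ = 2.5e6         # pole of the low-pass feedback network

for f in (0.0, 2.5e6, 25e6, 250e6, 1e9):
    gain_db = 20 * math.log10(closed_loop_gain(f, A_FWD, G_LOOP0, F_LPF_HZ))
    print(f"f = {f / 1e6:7.1f} MHz -> |T| = {gain_db:6.2f} dB")
```

The printed values rise from the suppressed low-frequency gain toward the forward gain, which is the high-pass behaviour the BLR loop is meant to provide. A real forward path adds its own poles, which this sketch deliberately omits.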



Fig. 3-24. Block diagram of the active signal shaping filter.

The BLR concept tracks low-frequency components of the amplified signal in the forward path and subtracts them at the input of the shaping filter [17], [19], [20]. This enables the desired transfer function, as shown in Fig. 3-25. Optimizing the attenuation factor  $A_1$  and the zero frequency  $f_Z$  is crucial for enhancing the SNR and minimizing signal time width through the filter.



Fig. 3-25. Expected transfer function of the active shaping filter.

The transfer function of the active shaping filter can be expressed as:

$$T_{BPF}(S) = A_2 \times \frac{(1+S\tau_Z)}{(1+S\tau_{P_1})(1+S\tau_{P_2})}$$
(3-13)

where  $\tau_Z$  represents the zero introduced by the BLR, and  $\tau_{P_1}$  and  $\tau_{P_2}$  correspond to the poles associated with the loop cutoff frequency and the amplifiers in the forward path, respectively. The active shaping filter's bandwidth is primarily governed by these poles  $\tau_{P_1}$  and  $\tau_{P_2}$ , which define the filter's frequency response. At low frequencies, the behaviour of the transfer function is characterized by the attenuation factor A<sub>1</sub>, which must effectively suppress low-frequency noise. As frequency increases, the transfer function transitions toward the forward gain A<sub>2</sub>, providing the required signal amplification.

#### 3.4.2.1 Operation Principle

The negative feedback loop continuously monitors the shaping filter's output voltage, stabilizing it at a reference DC voltage,  $V_{ref}$ . This reduces low-frequency noise and mitigates offset from both the CSA and shaping filter by a factor of  $A_1$ , while amplifying the signal of interest by  $A_2$ . The reference voltage  $V_{ref}$  is set to establish the desired DC level at the filter's output.

The BLR consists of sub-blocks, including an operational transconductance amplifier (OTA), a slew-rate limited stage, and a low-pass filter. The OTA compares the shaping filter output to  $V_{ref}$  and activates the slew-rate limited stage when the reference voltage is exceeded. This stage produces a controlled voltage, driving the low-pass filter. The slew-rate limiting minimizes charge accumulation in the low-pass filter capacitor, reducing undershoot amplitude at the filter's output [17].

By adjusting the BLR bandwidth, the low-pass filter captures low-frequency components, tracking the CSA voltage tail. This signal is applied to the differential amplifier in the forward path, closing the feedback loop and subtracting it from the CSA signal. Simulated signals at each sub-block of the shaping filter are shown in Fig. 3-26.



Fig. 3-26. Simulated signals generated after each sub-block of the active shaping filter: (a) CSA and BLR output signals, (b) active shaping filter output signal.

#### 3.4.2.2 Design Considerations

To meet the application requirements, the parameters of the transfer function must be precisely configured. The attenuation factor  $A_1$  should exceed 20 dB to sufficiently suppress both the offset and the tail of the CSA voltage signal. Meanwhile, the gain in the passband  $A_2$  should be greater than 15 dB to ensure adequate signal amplification; however, excessive gain could potentially lead to stability issues. Given the CSA signal rise time of  $t_r = 2.56$  ns, the center frequency of the band-pass filter should be set at 380 MHz, with an 80 MHz margin on each sideband. The first pole  $f_1$  of the BLR should be positioned at 25 MHz. Additionally, considering the power requirements of the CSA and the overall system budget, the active shaping filter must operate within a power consumption limit of 200  $\mu$ W.

#### 3.4.2.3 Stability Analysis

The stability of the active shaping filter is critical, particularly due to the negative feedback loop, which, if not properly managed, could introduce oscillations. To ensure stability, the system's phase margin and gain margin can be evaluated using Bode plots. For the shaping filter to remain stable, the phase shift introduced by the feedback network must be constrained to less than 180 degrees at the unity-gain crossover frequency.
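A phase-margin evaluation of this kind can be sketched numerically. The example below assumes a loop gain with two real poles and a DC value $G_0$; the pole positions and $G_0$ are hypothetical and do not represent the simulated loop of this design. The unity-gain crossover is located by bisection on a logarithmic frequency axis, and the phase margin is taken as 180 degrees minus the accumulated phase lag at that frequency.

```python
import math


def loop_gain(f_hz: float, g0: float, poles_hz: tuple) -> complex:
    """Complex loop gain G0 / prod(1 + j f / f_p) for real left-half-plane poles."""
    value = complex(g0)
    for f_p in poles_hz:
        value /= complex(1.0, f_hz / f_p)
    return value


def phase_margin_deg(g0: float, poles_hz: tuple, f_lo: float = 1.0, f_hi: float = 1e12):
    """Return (crossover frequency in Hz, phase margin in degrees)."""
    if abs(loop_gain(f_lo, g0, poles_hz)) <= 1.0:
        raise ValueError("loop gain is already below unity at f_lo")
    # |loop_gain| decreases monotonically with f, so bisection on log(f) is valid.
    for _ in range(200):
        f_mid = math.sqrt(f_lo * f_hi)
        if abs(loop_gain(f_mid, g0, poles_hz)) > 1.0:
            f_lo = f_mid
        else:
            f_hi = f_mid
    f_c = math.sqrt(f_lo * f_hi)
    # Sum the per-pole lags directly so the result cannot wrap around 180 degrees.
    lag_deg = sum(math.degrees(math.atan(f_c / f_p)) for f_p in poles_hz)
    return f_c, 180.0 - lag_deg


# Hypothetical loop: DC loop gain of 9, poles at 2.5 MHz and 460 MHz.
f_cross, pm = phase_margin_deg(9.0, (2.5e6, 460e6))
print(f"crossover = {f_cross / 1e6:.2f} MHz, phase margin = {pm:.1f} degrees")
```

In this hypothetical case the second pole lies far above the crossover, so the margin stays close to 90 degrees. Moving the second pole toward the crossover frequency reduces the margin accordingly.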

In addition, careful design of the amplifier stages is essential to ensure that the overall loop gain does not exceed the stability threshold. The selection of compensation techniques, such as pole-zero placement and feedback damping, is vital for maintaining stability. Another important consideration is the slew rate of the amplifiers; if the slew rate is insufficient, it can lead to distortion and reduced bandwidth, which would negatively impact the filter's performance.

#### 3.4.2.4 Amplifiers

The signal shaping filter features two cascaded amplifiers in the forward path, designed to amplify the signal voltage and reduce digitization errors in the discriminator. Two differential amplifiers (Fig. 3-27) are connected in series, balancing gain, noise, and power consumption. Both stages utilize short-channel devices for wide bandwidth and sufficient gain. The first stage provides a differential output, and the second stage delivers a single-ended signal to drive the discriminator. The overall DC gain is 17 dB, with the first pole at 460 MHz. Figure 3-28 shows the common mode feedback schematic, and Table 3-8 lists the transistor dimensions and current values in the forward path.



Fig. 3-27. Schematic of the amplifiers in the forward path.



Fig. 3-28. Schematic of the common mode feedback in the forward path.

Table 3-8. Dimensions and the current of the transistors of the amplifier in the forward path.

|                     | M1, M2 | M3, M4 | M5, M6 | M7, M8 |
|---------------------|--------|--------|--------|--------|
| W [µm]              | 0.29   | 1.75   | 0.5    | 0.98   |
| L [µm]              | 0.1    | 0.1    | 0.1    | 0.1    |
| I<sub>D</sub> [µA]  | 25     | 25     | 23     | 23     |

#### 3.4.2.5 BLR Chain

The BLR chain in the feedback network consists of key sub-blocks that influence the shaping filter's performance and baseline stability [1]. As shown in Fig. 3-29, these include an OTA, a slew-rate limited buffer, and a low-pass filter, each contributing to the feedback loop's operation.



Fig. 3-29. Schematic of the OTA, slew-rate limited stage, low-pass filter, and level-shifter in the BLR chain.

The OTA, positioned as the first stage, functions as a differential amplifier that generates a corrective feedback signal when the output of the forward path amplifiers deviates from the reference voltage  $V_{ref}$ , ensuring baseline stability at the shaping filter's output. To minimize offset effects that could impair accuracy, the OTA is integrated into a negative feedback loop with careful transistor sizing. This design ensures high accuracy in baseline tracking while operating within the power constraints of the filter.

The slew-rate limited buffer, following the OTA, regulates the feedback signal's rate of change. Configured as a source follower PMOS device  $(M_{25})$  with a 1  $\mu$ A bias current, this stage produces a smooth voltage signal. A large capacitor  $C_{SR}$  is used to limit the slope of the OTA-generated signal, preventing overshoot or undershoot and stabilizing the baseline. The discharge of  $C_{SR}$  is faster than the charge phase due to the 5:1 mirroring ratio between  $M_{26}$  and  $M_{27}$ , helping restore the baseline steady-state after filter anomalies.

The low-pass filter in the BLR chain introduces a pole that defines the zero of the overall transfer function ($f_1$), filtering high-frequency components in the feedback signal. Implemented with a source follower branch ($M_{28}$) with a 10 nA bias current and a programmable capacitor network $C_{LPF}$, the low-pass filter stabilizes the BLR output, allowing accurate baseline drift correction without introducing noise. The time constant is $\tau_{LPF_{BLR}} = C_{LPF}/g_{m_{28}}$, since the source follower presents an output resistance of approximately $1/g_{m_{28}}$.

The final component, the level-shifter, restores the DC level of the feedback signal, ensuring the feedback signal matches the required DC level at the shaping filter's input. This mechanism helps maintain baseline stability across varying signal conditions, ensuring the output remains centered around the target baseline.

Table 3-9 shows the dimensions and current values of the transistors in the BLR chain, while Fig. 3-30 illustrates its frequency response. Figures 3-31 and 3-32 display the simulated voltage signals after the active shaping filter for the CSA in both high gain and slow/fast modes. The signal at the output remains unaffected by the discharging tail of the CSA signal, demonstrating the filter's robustness in shaping the desired signal. A minor undershoot is observed, with its amplitude depending on  $C_{LPF}$ . Simulation results indicate that this undershoot is small (about 3% of the signal amplitude), having a negligible impact on signal integrity.

Table 3-9. Dimensions and current values of the transistors in the BLR chain.

|                     | M19, M20 | M21, M22 |
|---------------------|----------|----------|
| W [µm]              | 1.24     | 0.2      |
| L [µm]              | 0.1      | 0.1      |
| I<sub>D</sub> [µA]  | 5.6      | 5.6      |



Fig. 3-30. Frequency response of the BLR chain.

Table 3-10 summarizes the amplitude, time width, and SNR of the signal at the active shaping filter output for various CSA configurations and tunable  $C_{LPF}$  ranges. The time width is measured at the 1% amplitude crossing points. The results show that the signal width exceeds 2.5 ns, indicating ISI. However, when measured at the threshold level (half of the signal amplitude), the time width remains below 2.5 ns [17]. The active shaping filter operates with a power consumption of 170  $\mu$ W.



Fig. 3-31. Simulated voltage signal after the active shaping filter for the tunable range of C<sub>LPF</sub> for CSA configured in high gain and slow modes.



Fig. 3-32. Simulated voltage signal after the active shaping filter for slow and fast operating modes of a CSA configured in high gain mode.

To assess the impact of residual ISI, the readout channel was triggered by three consecutive charge signals, as depicted in Fig. 3-33, for  $C_{LPF} = 400$  fF and the CSA programmed in high gain and slow modes. The results demonstrate that the amplitude of the residual signal in the subsequent timeframe is only 7% of the maximum signal amplitude, effectively rendering the pileup effect negligible. Consequently, the active shaping filter successfully mitigates ISI-induced errors, producing well-defined signals that align within 2.5 ns timeframes after discrimination.

| CSA Mode | C<sub>LPF</sub> [fF] | V<sub>Amp</sub> [mV] | t<sub>Width</sub> [ns] | σ<sub>Noise</sub> [mV<sub>rms</sub>] | SNR  |
|----------|----------------------|----------------------|------------------------|--------------------------------------|------|
| Slow     | 400                  | 297.7                | 3.13                   | 16.59                                | 17.9 |
| Slow     | 500                  | 329.4                | 3.67                   | 16.23                                | 20.3 |
| Slow     | 600                  | 354.1                | 3.91                   | 16.02                                | 22.1 |
| Fast     | 400                  | 281.1                | 3.09                   | 14.88                                | 18.8 |
| Fast     | 500                  | 310.3                | 3.59                   | 14.61                                | 21.2 |
| Fast     | 600                  | 335.1                | 3.85                   | 14.44                                | 23.1 |

Table 3-10. Amplitude, time width, and SNR for the signal after the active shaping filter for different configurations of CSA and the tunable range of  $C_{LPF}$ .



Fig. 3-33. Simulated voltage signal after the active shaping filter in the case of three consecutive trigger pulses for  $C_{LPF} = 400$  fF and the CSA programmed in high gain and slow modes.

# 3.5 Threshold Discriminator

Threshold discriminators play a vital role in readout circuits for Scanning Electron Microscopy and particle detectors [21], [22]. Positioned after the shaping filter, these circuits digitize analog signals by comparing them to a predefined threshold ($V_{Th}$), as illustrated in Fig. 3-34. This process enables event detection by indicating when the signal exceeds the threshold. The threshold level is crucial for balancing noise rejection and detection efficiency. While a higher threshold enhances noise rejection by improving the threshold-to-noise ratio, it may reduce detection efficiency by missing weaker valid signals [15]. This trade-off impacts system performance. For a PIN diode in short circuit mode, two main readout architectures are possible: one with a passive high-pass RC filter and another with an active bandpass filter and a BLR chain. Each architecture has distinct requirements for the threshold discriminator, affecting noise performance, offset tolerance, and power consumption.



Fig. 3-34. Discriminator operation to digitize the analog signal.

## 3.5.1 Discriminator Design for Passive Shaping Filter

In the passive high-pass RC filter configuration, the signal amplitude at the discriminator input is typically lower [18], requiring a highly sensitive discriminator design with tight control over input-referred noise and offset. To achieve this, a differential preamplifier stage is often included to amplify the incoming signal before comparison.

The preamplifier is designed for high gain and broad bandwidth, ensuring sufficient signal amplification without introducing significant noise or offset. A common approach is a two-stage amplifier topology with common-mode control, which provides the necessary gain while minimizing noise. This is followed by a series of inverters to maintain signal integrity and drive the output to the required digital levels.

To address offset issues, particularly given the low SNR, an autozeroing technique is used [23]. This technique periodically samples and stores the offset value on a capacitor, allowing the circuit to reset its offset at regular intervals. This reduces mismatch and drift, improving reliability and preventing false triggers from noise. While this configuration is more power-hungry than the active filter setup, it ensures precise event detection even at low signal amplitudes.

#### 3.5.1.1 Design Considerations

The signal after the passive RC filter has a minimum amplitude of 9 mV, which is small compared to the noise level, increasing the risk of false triggers or output saturation due to offset. As a result, controlling the noise and offset of the discriminator is essential to prevent a reduction in detection accuracy and SNR at the discriminator input.

To ensure that the noise and offset do not exceed the filter output noise, the input-referred noise and offset of the discriminator must be kept below certain thresholds. Based on the results in Table 3-7, the discriminator should have an input-referred noise of $\sigma_{Noise} < 0.6$ mV and an offset of $V_{Offset} < 2$ mV to maintain sufficient SNR and detection accuracy. Achieving these requirements typically involves a more conservative design, resulting in higher power consumption.

Offset, caused by mismatches in transistor dimensions and threshold voltages due to fabrication variations, can significantly degrade the discriminator's performance. It can be evaluated using Monte Carlo simulations during the design phase to account for these uncertainties.

Moreover, selecting the threshold level $V_{Th}$ is crucial for detection accuracy. Setting $V_{Th}$ to 6 to 8 times the total rms noise at the discriminator input node ($\sigma_{tot}$) ensures that more than 99.99% of noise samples fall below the threshold. Achieving this requires an SNR in the range of 12 to 16 at the discriminator input.
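The relation between the threshold level and the fraction of noise samples that cross it can be estimated under the assumption of zero-mean Gaussian noise, which is an assumption of this sketch rather than a result from the simulations. The example also assumes that the threshold sits at half of the signal amplitude, so that a threshold of $k\,\sigma_{tot}$ corresponds to an SNR of $2k$.

```python
import math


def noise_exceed_probability(k_sigma: float) -> float:
    """One-sided probability that zero-mean Gaussian noise exceeds k_sigma * sigma."""
    return 0.5 * math.erfc(k_sigma / math.sqrt(2.0))


for k in (6.0, 7.0, 8.0):
    p = noise_exceed_probability(k)
    # With V_Th at half the signal amplitude, the required SNR is 2 * k.
    print(f"V_Th = {k:.0f} sigma: P(noise > V_Th) = {p:.2e}, SNR for V_Th at half amplitude = {2 * k:.0f}")
```

Under the Gaussian assumption, the probability of a noise sample crossing a threshold of 6 to 8 sigma is far below the 0.01% implied by the 99.99% figure, so the stated range leaves considerable margin against false triggers.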

#### 3.5.1.2 Preamplifier

To ensure proper signal digitization, the preamplifier must provide sufficient gain and bandwidth. A two-stage amplifier topology (Fig. 3-35) was chosen for this purpose. The first stage is a fully differential amplifier with a common-mode control loop, and the second stage is a differential amplifier with a single-ended output for driving cascading inverters [24].

Simulation results show a total gain of 49.2 dB with a tolerance of 4.7 dB across process corners and a bandwidth of 181 MHz. Table 3-11 lists the transistor dimensions and current values for the preamplifier. The preamplifier consumes 104  $\mu$ W of power and has an input-referred noise of 0.326 mV, which meets the target. However, the offset is 15.3 mV [18], higher than the design specification, which could impair performance, especially at low signal amplitudes. Offset reduction techniques, discussed in the following subsection, are necessary to meet the required performance.



Fig. 3-35. Circuit diagram of a two-stage preamplifier with a common-mode control loop.

Table 3-11. Dimensions and the currents of the transistors in the two-stage preamplifier.

|                     | M1, M2 | M3, M4 | M5, M6 | M7, M8 |
|---------------------|--------|--------|--------|--------|
| W [µm]              | 1.98   | 0.87   | 1.08   | 3.3    |
| L [µm]              | 0.25   | 0.12   | 0.2    | 0.66   |
| I<sub>D</sub> [µA]  | 30     | 30     | 25     | 25     |

#### 3.5.1.3 Offset Reduction Network

A large offset in the preamplifier can degrade the detection accuracy of the ROIC by lowering the effective SNR at the discriminator input. Despite applying passive solutions, such as increasing the size of the input transistor pairs during design and layout, the offset remains larger than specified. To mitigate this issue and maintain high detection accuracy, an active offset reduction technique is necessary, though these techniques increase power consumption in the discriminator block.

Two common active offset reduction methods are preamplifier autozeroing and signal chopping. Due to the high electron rate in the target application, signal chopping is not feasible as it requires an ultra-high chopping frequency. However, preamplifier autozeroing can be implemented at much lower frequencies and is suitable for this application. In this method, the preamplifier offset is periodically sampled onto a memory capacitor ( $C_{az}$ ) during the "offset storage phase." In the subsequent "offset subtraction phase," the stored offset is subtracted from the input signal [23]. Figure 3-36 illustrates the two phases of this technique.



Fig. 3-36. Two phases of the autozeroing technique for preamplifier offset reduction: (a) offset storage phase, and (b) offset subtraction phase.

During the offset storage phase, the discriminator is disconnected from the ROIC's preceding blocks and cannot register events. This phase, known as the deadtime, causes missed electron events. To minimize missed events, the duration of this phase ($t_1$) should be as short as possible, because a longer deadtime directly lowers the detection accuracy. The duration of $t_1$ depends on the autozeroing capacitor $C_{az}$ and the preamplifier time constant $\tau_{preamp}$. Figure 3-37 shows the voltage across $C_{az}$ as a function of $t_1$ for different values of $C_{az}$. For $t_1 = 5 \times C_{az} \times \tau_{preamp}$, about 99.33% of the offset is stored, completing the offset storage.
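The 99.33% figure is what a single-pole exponential settling reaches after five time constants. The sketch below assumes that the offset storage follows such a first-order response with one effective time constant, which is a simplification; the function name is illustrative.

```python
import math


def stored_fraction(n_tau: float) -> float:
    """Fraction of the final value reached after n_tau time constants (first-order settling)."""
    return 1.0 - math.exp(-n_tau)


for n in range(1, 8):
    print(f"t_1 = {n} time constants -> {100 * stored_fraction(n):.3f} % of the offset stored")
```

Each additional time constant beyond five changes the stored value by well under 1%, while extending the deadtime in proportion.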

During the offset subtraction phase, the voltage across $C_{az}$ gradually decreases, leading to offset degradation due to gate leakage currents in the preamplifier input pairs and auxiliary switches. The larger $C_{az}$ and the lower the leakage currents, the slower the offset degradation. Figure 3-38 shows the voltage across $C_{az}$ over the duration of the offset subtraction phase, $t_2$, for various $C_{az}$ values. The total leakage current discharging the capacitor is 18 pA.

A large  $C_{az}$  requires a longer offset storage phase (t<sub>1</sub>), resulting in a longer deadtime and higher detection error rates. Based on post-layout simulations, the autozeroing capacitor is set to  $C_{az} = 1 \text{ pF}$ , with  $t_1 = 10 \text{ ns}$  and  $t_2 = 90 \text{ µs}$ . This configuration reduces the preamplifier's initial offset after autozeroing to 70 µV [18]. The qualification of the discriminator and detection accuracy assessment are detailed in [18].
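The trade-off between deadtime and offset degradation can be put into numbers with a first-order estimate. The sketch below uses the values quoted above and assumes that the 18 pA leakage stays constant over the whole subtraction phase, so that the droop is simply $I_{leak} \, t_2 / C_{az}$; this constant-current model is an assumption of the example, not the post-layout simulation itself.

```python
C_AZ = 1e-12     # autozeroing capacitor [F]
T1 = 10e-9       # offset storage phase (deadtime) [s]
T2 = 90e-6       # offset subtraction phase [s]
I_LEAK = 18e-12  # total leakage current discharging C_az [A]

# Fraction of each autozeroing cycle during which no events can be registered.
dead_fraction = T1 / (T1 + T2)

# Voltage droop on C_az at the end of the subtraction phase, assuming constant leakage.
droop_v = I_LEAK * T2 / C_AZ

print(f"deadtime fraction  = {dead_fraction:.3e}")
print(f"droop after t_2    = {droop_v * 1e3:.2f} mV")

# Effect of the capacitor choice on the droop (hypothetical alternative values).
for c_az in (0.5e-12, 1e-12, 2e-12):
    print(f"C_az = {c_az * 1e12:.1f} pF -> droop = {I_LEAK * T2 / c_az * 1e3:.2f} mV")
```

With these numbers the deadtime occupies roughly one part in ten thousand of the cycle, and the estimated droop at the end of the subtraction phase remains below the 2 mV offset requirement stated for the discriminator.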



Fig. 3-37. Voltage across the autozeroing capacitor  $C_{az}$  (normalized with  $V_{Offset}$ ) as a function of the duration of the offset storage phase  $t_1$  for different values of the autozeroing capacitor  $C_{az}$ .



Fig. 3-38. Voltage across the autozeroing capacitor  $C_{az}$  (normalized with  $V_{Offset}$ ) as a function of the duration of the offset subtraction phase  $t_2$  for different values of the autozeroing capacitor  $C_{az}$ .

## 3.5.2 Discriminator Design for Active Shaping Filter

The active shaping filter configuration, which includes a bandpass filter with BLR, provides a higher signal amplitude at the discriminator input compared to the passive RC high-pass filter. This increased amplitude results in a higher SNR, reducing the need for strict noise and offset specifications in the discriminator design.

With a higher SNR, the discriminator can operate effectively with a simpler, less sensitive design, leading to a more power-efficient solution. This approach strikes a balance between performance and power consumption, enabling accurate detection without the need for complex offset correction or higher power dissipation. As a result, the active shaping filter configuration enhances overall power efficiency while maintaining the required detection accuracy.

#### 3.5.2.1 Design Considerations

In the active shaping filter configuration, the discriminator uses a differential OTA followed by cascaded inverters. The OTA is optimized for fast response and low power while maintaining adequate noise performance. The higher signal amplitude from the active filter enables the discriminator to achieve a low input-referred offset (typically < 1 mV) and a rapid propagation delay (< 0.5 ns) [22], simplifying design by prioritizing speed over complex noise and offset management.

The cascaded inverters efficiently drive the output to the required logic levels, making this design ideal for power-constrained applications. The increased signal amplitude enhances SNR, reduces noise interference, and allows the discriminator to focus on speed and responsiveness with less sensitivity to small signal variations. Therefore, the active shaping filter configuration is optimal for applications requiring both power efficiency and high-speed detection.

#### 3.5.2.2 Design and Implementation

The differential OTA, shown in Fig. 3-39, is a five-transistor amplifier with an NMOS differential input pair and a PMOS mirrored load. The NMOS transistors align with the 450 mV baseline of the signal shaping filter, with sizing optimized to minimize input-referred offset dispersion, ensuring reliable performance across variations [22].

The tail current ( $I_{tail}$ ) plays a crucial role in power consumption, noise, and propagation delay. Simulation results indicate that  $I_{tail} = 40 \ \mu\text{A}$  strikes the optimal balance between speed, noise, offset, and power, as detailed in Table 3-12. This tail current satisfies the system's constraints. Table 3-13 lists the transistor sizes for  $I_{tail} = 40 \ \mu\text{A}$ , chosen to optimize noise performance, offset, and event detection speed [22].

Cascaded inverters drive the output signal to the digital backend, with sizing selected to minimize propagation delay and efficiently drive the load capacitance ( $C_L$ ), which includes both the digital backend and the signal transmission line. Experimental qualification and characterization of the discriminator are provided in [22].



Fig. 3-39. Schematic of the differential OTA and the inverters.

Table 3-12. Simulated Operation Metrics of the Differential OTA as a Function of the Tail Current  $I_{tail}$ 

| Tail Current $I_{tail}$ [µA] | 20    | 30    | 40    | 50    |
|------------------------------|-------|-------|-------|-------|
| Gain [dB]                    | 25.14 | 25.52 | 24.57 | 24.29 |
| Bandwidth [GHz]              | 0.811 | 0.891 | 1.075 | 1.137 |
| Rise Time [ps]               | 19.6  | 18.9  | 18.1  | 17.8  |
| Noise [µV<sub>rms</sub>]     | 290.3 | 246.2 | 228.9 | 209.7 |
| Offset [mV]                  | 0.64  | 0.57  | 0.52  | 0.48  |
| Power [µW]                   | 28.9  | 37.53 | 52.17 | 63.74 |

Table 3-13. Differential OTA Transistor Sizes for  $I_{tail} = 40 \ \mu\text{A}$ 

| Transistors | W [µm] | L [µm] |
|-------------|--------|--------|
| $M_1, M_2$  | 1.44          | 0.3    |
| $M_3, M_4$  | 7.2           | 0.48   |

## 3.6 Preamplifier Reset Generator

Incorporating a signal shaping filter into the readout channel effectively compensates for ISI-induced errors, enabling the discriminator to detect and register the arrival times of BSEs with a time resolution of 2.5 ns. However, at high BSE flux rates, the signal from the preamplifier becomes susceptible to pileup, which can lead to the saturation of the CSA. To address this issue, a reset switch can be integrated into the feedback network to discharge the feedback capacitor ( $C_F$ ), as illustrated in Fig. 3-40.



Fig. 3-40. Reset switch in feedback network of CSA.

The reset mechanism prevents CSA saturation but introduces deadtime—periods during which the readout channel cannot register incoming charge signals. This deadtime affects both detection accuracy and the output count rate by limiting the ability to register consecutive BSEs. Therefore, the reset period and mechanism must strike a balance between preventing CSA saturation and maintaining high detection performance, especially under high-rate conditions.

Several methods for activating the reset switch exist [25]. One involves comparing the signal after the preamplifier with an auxiliary threshold close to its saturation limit. If consecutive BSEs cause the signal to exceed this threshold, the reset switch is triggered. Although this method prevents saturation and improves BSE detection, it requires extra circuitry, increasing power consumption and silicon area. It also depends only on the BSE rate, ignoring their temporal pattern.

Another method activates the reset switch periodically, ensuring the preamplifier avoids saturation under high BSE flux rates. However, this method temporarily disables the readout channel during reset intervals, increasing detection errors when BSE events occur during these periods.

A more advanced approach triggers the reset switch based on the spatial and temporal pattern of incoming BSEs. By calculating the expected number of BSEs per detector pixel over consecutive time frames, a logic circuit integrated into the readout channel monitors the discriminator output and activates the reset accordingly. This method minimizes power consumption and reduces the chance of missing BSEs, balancing accuracy, power efficiency, and complexity for high-precision applications.

As outlined in Chapter 2, a maximum of three BSEs can impinge on a single detector pixel across three time frames. This can be used to reset the feedback capacitor ( $C_F$ ) in the CSA. A 3-bit shift register can store the detection pattern, triggering the reset switch in the fourth time frame to prevent CSA saturation.
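The frame-based reset idea can be sketched behaviorally as follows. This is a minimal Python model rather than the implemented logic: the names `ResetShiftRegister`, `clock`, and `simulate` are hypothetical, and the trigger condition (reset asserted in the frame after the register holds three hits) is an assumption made for illustration.

```python
class ResetShiftRegister:
    """Behavioral model of a 3-bit shift register driving the CSA reset switch."""

    def __init__(self, depth=3):
        self.depth = depth
        self.bits = [0] * depth  # discriminator output of the last `depth` frames

    def clock(self, hit):
        """Advance one time frame; return 1 if the reset switch closes in this frame."""
        # ASSUMPTION: the reset fires when every stored frame contains a hit.
        reset = int(all(self.bits))
        if reset:
            # Discharging C_F also clears the stored pattern. A hit arriving
            # during the reset frame is not registered (deadtime).
            self.bits = [0] * self.depth
        else:
            self.bits = self.bits[1:] + [1 if hit else 0]
        return reset


def simulate(hits):
    """Return the reset signal, frame by frame, for a sequence of hit flags."""
    register = ResetShiftRegister()
    return [register.clock(h) for h in hits]
```

In this model, three consecutive hits produce a reset in the fourth frame, while sparser patterns never trigger one. A different trigger condition would only change the line that computes `reset`.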



Fig. 3-41. Schematic of the reset network and the associated logic.

Implemented in the digital domain, this solution has negligible power consumption compared to the analog stages, efficiently preventing CSA saturation without increasing power demands. Figure 3-41 shows the reset network and logic integration in the readout channel design. The experimental qualification and evaluation of output count rates under different reset mechanisms are detailed in [25].

## 3.7 Conclusions

This chapter presented the design and implementation of the readout frontend for the short circuit operation mode of PIN diodes, emphasizing solutions that meet the stringent requirements of high-resolution BSE detection in SEM. Key design choices were validated through simulations and experimental results, demonstrating the effectiveness of the proposed strategies.

The preamplifier was optimized for low input impedance, minimal noise, and high bandwidth to accurately process weak signals from the detector while maintaining low power consumption. The signal shaping filter was designed with two alternatives: a passive CR filter for power-efficient operation and an active high-pass filter with gain to improve detection accuracy by relaxing the comparator's noise and offset requirements. Comparative analysis highlighted that while the passive approach is power-saving, the active solution better supports reliable detection at high count rates.

In conclusion, the design decisions presented in this chapter offer a balanced approach to achieving high performance and power efficiency in the short circuit operation mode, contributing to robust and energy-efficient electron detection systems.

## 3.8 References

- [1] M. Nakhostin, Signal processing for radiation detectors. Hoboken: John Wiley, 2018.
- [2] A. Rivetti, CMOS: Front-End Electronics for Radiation Sensors. Boca Raton: CRC Press, 2018. doi: 10.1201/b18599.
- [3] E. Säckinger, Analysis and Design of Transimpedance Amplifiers for Optical Receivers, 1st ed. Wiley, 2017. doi: 10.1002/9781119264422.
- [4] M. A. Disi, "Single Electron Readout Circuit: SERCuit," M.S. thesis, Dept. Microelectronics, TU Delft, Delft, the Netherlands, 2020.
- [5] M. Al Disi, A. Mohammad Zaki, Q. Fan, and S. Nihtianov, "High-Count Rate, Low Power and Low Noise Single Electron Readout ASIC in 65nm CMOS Technology," in 2021 XXX International Scientific Conference Electronics (ET), Sozopol, Bulgaria: IEEE, Sep. 2021, pp. 1–5. doi: 10.1109/ET52713.2021.9580005.
- [6] A. Mohammad Zaki and S. Nihtianov, "High Time Resolution, Low-Noise, Power-Efficient, Charge-Sensitive Amplifier in 40 nm Technology," in 2022 XXXI International Scientific Conference Electronics (ET), Sozopol, Bulgaria: IEEE, Sep. 2022, pp. 1–6. doi: 10.1109/ET55967.2022.9920321.
- [7] M. Ahangarianabhari, D. Macera, G. Bertuccio, P. Malcovati, and M. Grassi, "VEGA: A low-power front-end ASIC for large area multi-linear X-ray silicon drift detectors: Design and experimental characterization," Nuclear Instruments and Methods in Physics Research Section A: Accelerators, Spectrometers, Detectors and Associated Equipment, vol. 770, pp. 155–163, Jan. 2015, doi: 10.1016/j.nima.2014.10.009.
- [8] B. Razavi, Design of Analog CMOS Integrated Circuits. McGraw-Hill Higher Education, 2016.
- [9] C. Fiorini and M. Porro, "Integrated RC cell for time-invariant shaping amplifiers," IEEE Transactions on Nuclear Science, vol. 51, no. 5, pp. 1953–1960, Oct. 2004, doi: 10.1109/TNS.2004.835578.
- [10] F. Krummenacher, "Pixel detectors with local intelligence: an IC designer point of view," Nuclear Instruments and Methods in Physics Research Section A: Accelerators, Spectrometers, Detectors and Associated Equipment, vol. 305, no. 3, pp. 527–532, Aug. 1991, doi: 10.1016/0168-9002(91)90152-G.
- [11] G. Geronimo, A. Dragone, J. Grosholz, P. O'Connor, and E. Vernon, "ASIC with Multiple Energy Discrimination for High-Rate Photon Counting Applications," in 2006 IEEE Nuclear Science Symposium Conference Record, Oct. 2006, pp. 697–704. doi: 10.1109/NSSMIC.2006.355951.
- [12] G. De Geronimo et al., "ASIC for SDD-Based X-Ray Spectrometers," IEEE Transactions on Nuclear Science, vol. 57, no. 3, pp. 1654–1663, Jun. 2010, doi: 10.1109/TNS.2010.2044809.
- [13] R. Ballabriga, M. Campbell, E. H. M. Heijne, X. Llopart, and L. Tlustos, "The Medipix3 Prototype, a Pixel Readout Chip Working in Single Photon Counting Mode With Improved Spectrometric Performance," IEEE Transactions on Nuclear Science, vol. 54, no. 5, pp. 1824–1829, Oct. 2007, doi: 10.1109/TNS.2007.906163.
- [14] E. Fabbrica et al., "Design of MIRA, a low-noise pixelated ASIC for the readout of micro-channel plates," J. Inst., vol. 17, no. 01, p. C01047, Jan. 2022, doi: 10.1088/1748-0221/17/01/C01047.

- [15] A. Mohammad Zaki and S. Nihtianov, "Characterization Challenges of a Low Noise Charge Detection ROIC," IEEE Trans. Instrum. Meas., vol. 71, pp. 1–8, 2022, doi: 10.1109/TIM.2022.3160529.
- [16] A. Mohammad Zaki and S. Nihtianov, "Experimental Qualification of a Low-Noise Charge-Sensitive ROIC with Very High Time Resolution," in 2023 IEEE 32nd International Symposium on Industrial Electronics (ISIE), Helsinki, Finland: IEEE, Jun. 2023, pp. 1–6. doi: 10.1109/ISIE51358.2023.10228077.
- [17] A. Mohammad Zaki and S. Nihtianov, "Low-Offset Band-Pass Signal Shaper with High Time Resolution in 40 nm CMOS Technology," in IECON 2023- 49th Annual Conference of the IEEE Industrial Electronics Society, Singapore, Singapore: IEEE, Oct. 2023, pp. 1–5. doi: 10.1109/IECON51785.2023.10312049.
- [18] A. Mohammad Zaki, Y. Du, and S. Nihtianov, "Low-Power High Time Resolution Charge Detection ROIC in 40nm CMOS Technology," in 2024 IEEE International Instrumentation and Measurement Technology Conference (I2MTC), Glasgow, United Kingdom: IEEE, May 2024, pp. 1–6. doi: 10.1109/I2MTC60896.2024.10560791.
- [19] I. Hafizh, M. Carminati, and C. Fiorini, "TERA: Throughput-Enhanced Readout ASIC for High-Rate Energy-Dispersive X-Ray Detection," IEEE Transactions on Nuclear Science, vol. 67, no. 7, pp. 1746–1759, Jul. 2020, doi: 10.1109/TNS.2020.3001459.
- [20] G. Deda, I. Hafizh, M. Carminati, and C. Fiorini, "SCARLET: Readout ASIC for Bump-bonded SDD Array for Large Event Throughput," in 2021 IEEE Nuclear Science Symposium and Medical Imaging Conference (NSS/MIC), Oct. 2021, pp. 1–3. doi: 10.1109/NSS/MIC44867.2021.9875476.
- [21] L. Ratti, M. Manghisoni, V. Re, and G. Traversi, "Discriminators in 65 nm CMOS process for high granularity, high time resolution pixel detectors," in 2013 IEEE Nuclear Science Symposium and Medical Imaging Conference (2013 NSS/MIC), Oct. 2013, pp. 1–6. doi: 10.1109/NSSMIC.2013.6829777.
- [22] A. Mohammad Zaki, Y. Du, and S. Nihtianov, "Design and Qualification of a High-Speed Low-Power Comparator in 40 nm CMOS Technology," in 2023 XXXII International Scientific Conference Electronics (ET), Sozopol, Bulgaria: IEEE, Sep. 2023, pp. 1–5. doi: 10.1109/ET59121.2023.10278935.
- [23] R. Wu, J. Huijsing, and K. Makinwa, Precision Instrumentation Amplifiers and Read-Out Integrated Circuits (Analog Circuits and Signal Processing). New York, NY, USA: Springer, 2013. doi: 10.1007/978-1-4614-3731-4.
- [24] Y. Du, "Power-efficient, Precise Discriminator for a High Time Resolution, Low-Noise Charge Detection ROIC," M.S. thesis, Dept. Microelectronics, TU Delft, Delft, the Netherlands, 2023.
- [25] A. M. Zaki and S. Nihtianov, "Challenges of High-Resolution Electron Detection ASICs for SEM Microscopy," in 2024 9th International Conference on Mathematics and Computers in Sciences and Industry (MCSI), Aug. 2024, pp. 68–76. doi: 10.1109/MCSI63438.2024.00019.

# 4 Readout Solutions for Open Circuit Operation Mode

## 4.1 Introduction

The open circuit mode of a PIN-diode is characterized by a high-impedance configuration in which the load impedance is effectively infinite. The charge generated by incident particles accumulates in the diode's junction capacitance ( $C_D$ ), producing a voltage signal across the PIN-diode terminals. This voltage is then compared with a threshold level by a high-precision comparator. This mode of operation is often preferred for applications requiring high sensitivity and low power consumption. However, as described in Chapter 2, open circuit mode introduces challenges such as charge pileup and saturation, leading to non-linearity and loss of sensitivity at high BSE rates. This chapter focuses on these challenges, detailing circuit techniques for charge management, signal integrity, and comparator-based detection, while optimizing power consumption and performance in the readout frontend.

## 4.2 Readout Frontend for Open Circuit Operation Mode

As shown in Fig. 4-1, the heart of the readout system is a high-impedance input stage, which ensures that the charge stored in  $C_D$  is not discharged prematurely, allowing the voltage signal to remain stable until it is processed [1]. A comparator connected directly to the diode typically realizes this high-impedance load, drawing minimal current while it compares the voltage signal to a reference threshold.

In open circuit mode, the comparator determines whether a signal event has occurred by comparing the voltage across the diode to a predefined reference threshold. Its design must prioritize both high precision and fast response times. Precision is essential for detecting small voltage variations, particularly in applications that require high sensitivity, such as single-electron detection. A fast response is equally critical to detect successive signals without overlap in high-rate environments, avoiding errors caused by signal pileup [2].



Fig. 4-1. Simplified block diagram of the readout frontend for open circuit mode of operation.

A practical and efficient solution is to incorporate a reset mechanism to clear accumulated charge after each detection event. This reset mechanism is a key element in preventing charge pileup and maintaining system linearity. After each signal detection, the charge stored on the diode's junction capacitance ( $C_D$ ) must be cleared to prevent further accumulation and potential saturation. By resetting  $C_D$  to its initial state, subsequent signals can be accurately measured without interference from prior events. The reset circuit is typically synchronized with the comparator, ensuring that the system is ready to detect new signals immediately after processing the previous one. This approach ensures linearity and avoids the complications of multi-threshold designs, maintaining accuracy over a wider range of input signals.

Power efficiency is another crucial consideration in open circuit mode readout design. Since no power is consumed during the initial charge-to-voltage conversion, the focus shifts to minimizing the power used by the comparator and reset circuits. A dynamic comparator can significantly reduce power consumption during idle periods. Additionally, low-leakage components in the input stage help to preserve the stored charge without requiring continuous power input. These power-saving strategies further enhance the suitability of open circuit mode for energy-efficient applications while maintaining high detection accuracy and reliability.

## 4.3 Readout Frontend Architecture

A high-precision readout frontend (Fig. 4-2), proposed in [2] and validated through simulations, integrates several critical elements, including detector charge extraction mechanisms, periodic sampling, offset compensation, and threshold generation. These components, working within a well-defined architectural framework, collectively enable the accurate detection of small signals, minimize power consumption, and enhance the SNR.

The core of the proposed readout frontend is an 800 MHz comparator with active offset compensation to address inherent offset errors. When a hit is detected, the comparator resets the detector voltage to its bias level and injects a threshold charge onto the detector's junction capacitance ( $C_D$ ). To suppress undesirable effects such as charge injection and kickback, a complementary dummy capacitor ( $C_{dummy}$ ) is employed, connected to the comparator's opposite terminal and tied to the bias voltage via a reset switch. This configuration establishes a common-mode structure, converting these disturbances into common-mode errors and enhancing the robustness of the comparator against such interferences.



Fig. 4-2. A simplified circuit diagram of the readout circuit proposed in [2].

#### 4.3.1 Matching of Detector and Dummy Capacitors

The capacitor  $C_D$  plays a crucial role in the readout system by converting charge into voltage. However, its external placement makes it highly vulnerable to parasitic effects ( $C_{par}$ ), such as those arising during the bonding process between the detector and the readout dies. These parasitic influences can degrade its performance and compromise its intended functionality. In contrast,  $C_{dummy}$ , being integrated within the readout chip, operates in a more controlled and stable environment. This difference highlights the critical importance of achieving precise matching between  $C_D$  and  $C_{dummy}$ . Proper matching is essential to maintain the desired common-mode conditions at the comparator's input, thereby preserving the integrity and accuracy of the entire readout system.

As shown in Fig. 4-3, any mismatch between  $C_D$  and  $C_{dummy}$  can introduce an unintended differential signal, disturbing the comparator's equilibrium. This disruption manifests as differential errors that can shift the comparator's input away from its intended balance. If left unaddressed, these errors could result in false electron detections, undermining the accuracy and reliability of the detection system. The sources of mismatch include parasitic capacitances arising from the bonding process, as well as inherent variations in the values of  $C_D$  and  $C_{dummy}$  due to fabrication tolerances. These effects collectively complicate the precise matching of the two capacitors, necessitating a robust design approach to mitigate their impact.



Fig. 4-3. Differential signal at input of comparator due to the mismatch of the capacitors.
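How a capacitor mismatch turns a common-mode disturbance into a differential error can be illustrated with a short calculation. The sketch below is not taken from the design: it assumes that the same disturbance charge `q` lands on both comparator inputs and that the parasitic capacitance adds in parallel with  $C_D$ . The function name and all numeric values are hypothetical.

```python
def differential_error(q, c_d, c_par, c_dummy):
    """Differential input voltage caused by an equal charge q on both inputs.

    ASSUMPTION: the detector side sees C_D in parallel with C_par, while the
    dummy side sees only C_dummy. All quantities are in SI units (C, F, V).
    """
    v_detector = q / (c_d + c_par)
    v_dummy = q / c_dummy
    return v_detector - v_dummy


# HYPOTHETICAL values chosen only to show the trend.
matched = differential_error(q=1e-15, c_d=20e-15, c_par=0.0, c_dummy=20e-15)
mismatched = differential_error(q=1e-15, c_d=20e-15, c_par=5e-15, c_dummy=20e-15)
```

With matched capacitors the disturbance is purely common-mode and the error is zero. Under these placeholder values, adding 5 fF of parasitic capacitance on the detector side leaves a differential error of about 10 mV in magnitude.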

To overcome these challenges, an active matching mechanism is indispensable. This mechanism dynamically compensates for mismatches between  $C_D$  and  $C_{dummy}$ , effectively mitigating differential errors. By utilizing the comparator to correct these mismatches, the matching system ensures that the common-mode input conditions remain balanced. This, in turn, significantly improves the accuracy of the comparator and guarantees reliable electron detection. Subsection 4.5.4 elaborates on this active matching mechanism, detailing its implementation and evaluating its effectiveness in minimizing differential errors and preserving system integrity.

#### 4.3.2 Periodic Sampling

In open circuit readout mode, the charge induced by BSEs accumulates on the detector capacitance until a reset switch removes it. This setup enables periodic voltage monitoring instead of continuous operation, reducing power consumption. Periodic sampling is inherently more power-efficient because circuits operate only at defined intervals, but it introduces challenges such as timing precision, signal alignment, and noise sensitivity.

Although a sampling frequency of 400 MHz logically aligns with a time resolution of 2.5 ns, the random arrival of hits often leads to timing misalignments. For instance, a hit occurring immediately after a sampling event may go undetected until the next sampling cycle. To address this issue, a higher sampling frequency of 800 MHz is employed, as generated on-chip and described in subsection 4.4.5. The increased sampling rate reduces timing misalignments and minimizes residual charge interference by capturing hits closer to their actual arrival times. This not only ensures prompt resets but also significantly enhances detection reliability, particularly in high hit-rate applications.
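The benefit of the higher sampling rate can be quantified with a simple latency estimate. The sketch below assumes that a hit is registered at the first sampling instant after its arrival and that hits arrive uniformly at random relative to the sampling clock. Both are simplifying assumptions, and the function name is hypothetical.

```python
def sampling_latency_ns(f_sample_hz):
    """Return (worst-case, mean) wait in ns between a hit and the next sample.

    ASSUMPTION: arrival times are uniformly distributed over one sampling
    period, so the mean wait is half the period.
    """
    period_ns = 1e9 / f_sample_hz
    return period_ns, period_ns / 2.0


latency_400 = sampling_latency_ns(400e6)  # sampling at the time-resolution rate
latency_800 = sampling_latency_ns(800e6)  # doubled sampling rate
```

Under these assumptions, doubling the sampling frequency halves both the worst-case and the mean detection latency.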

While doubling the sampling frequency inevitably raises power consumption, the periodic monitoring approach remains more energy-efficient than continuous operation. Moreover, the improved timing accuracy and reduced signal loss achieved at the higher frequency justify the additional power expenditure, ensuring robust performance under demanding operating conditions. At 800 MHz, the reset duration ( $t_{reset}$ ) becomes critical to prevent signal loss. For a maximum hit rate of 400 MHz (2.5 ns interval), the reset time must satisfy:

$$t_{reset} < \frac{1}{f_{hit_{max}}} - t_{clk} - t_{collection} \times \frac{V_{Th}}{V_{sig}}$$
(4-1)

where  $f_{hit_{max}}$  is the maximum hit frequency of incoming events,  $t_{clk}$  is the clock period,  $t_{reset}$  is the reset time,  $t_{collection}$  is the charge collection time,  $V_{Th}$  is the threshold voltage, and  $V_{sig}$  is the signal voltage.
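As a worked illustration of Eq. 4-1, the bound on the reset time can be evaluated numerically. In the sketch below, the hit rate and clock period follow the text (400 MHz and 800 MHz), whereas the charge collection time and the ratio  $V_{Th}/V_{sig}$  are hypothetical placeholders, since their values are not given here.

```python
def max_reset_time(f_hit_max, t_clk, t_collection, v_th, v_sig):
    """Upper bound on t_reset according to Eq. 4-1 (times in seconds)."""
    return 1.0 / f_hit_max - t_clk - t_collection * (v_th / v_sig)


budget = max_reset_time(
    f_hit_max=400e6,      # maximum hit rate from the text
    t_clk=1.0 / 800e6,    # period of the 800 MHz sampling clock
    t_collection=1.0e-9,  # HYPOTHETICAL charge collection time
    v_th=0.5,             # HYPOTHETICAL threshold, normalized to v_sig
    v_sig=1.0,            # HYPOTHETICAL signal amplitude (normalized)
)
```

With these placeholder values the bound evaluates to about 0.75 ns. A longer collection time or a threshold closer to the signal amplitude tightens the budget accordingly.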

In the 40 nm CMOS process, reset times below 350 ps are achievable, ensuring reliable performance [2]. Among the evaluated solutions, sampling at 800 MHz provides the best balance between power efficiency and detection reliability. By reducing timing misalignments and preserving the benefits of periodic monitoring, it is well-suited for high-resolution, high-rate applications despite the higher power requirements.

## 4.4 Design and Implementation

The readout design consists of several specialized blocks. In addition to the primary blocks, supplementary circuits are needed to generate the high-speed 800 MHz clock signal and precisely control the reset pulse width. Both of these elements are crucial for maintaining timing accuracy and overall signal integrity. These additional components are important for the system's seamless performance and are discussed in more detail in this section.

#### 4.4.1 Dynamic Comparator

The dynamic comparator design is based on the widely used topology shown in Fig. 4-4. It includes a preamplifier stage followed by a latch and an output buffer. The differential input signal is amplified by the preamplifier, allowing the circuit to tolerate the latch's offset, which can be up to 100 mV [2]. The latch uses positive feedback and remains in an unstable equilibrium during reset. Upon comparison, the latch transitions into one of its stable states, with the preamplifier determining the final output, either '1' or '0'.



Fig. 4-4. Generic implementation of the dynamic comparator.

The considerations taken into account for the comparator design include:

- 1. **Speed**: The comparator must complete the comparison within 1.25 ns to meet the 800 MHz operating frequency, including a reset period for subsequent hits.
- 2. **Power Efficiency**: A dynamic comparator was chosen for its low static power consumption, which drops to zero after a decision or reset. Dynamic power usage is minimized through optimized switching and clocking schemes, ensuring energy efficiency.
- 3. **Preamplifier Gain**: The preamplifier, operating in weak inversion with low bias current, reduces noise and offset, enabling rapid decision-making. The latch, requiring higher speed, operates with a higher bias current. These components typically use separate current branches.
- 4. **Kickback**: Kickback noise is mitigated by minimizing gate-drain capacitance, controlling voltage variations, and ensuring common-mode kickback across terminals, preserving input signal integrity.
- 5. **Noise and Clock Timing**: Input-referred noise is analyzed to ensure precise detection of small signals. A single clock trigger was employed to synchronize the latch and preamplifier, avoiding process-related gain variations.

This combination of speed, power efficiency, noise management, and timing precision ensures the comparator's suitability for the system's demanding operational requirements.

#### 4.4.1.1 Miyahara's Dynamic Comparator

The Miyahara dynamic comparator was chosen as the core component for the readout design due to its ability to meet key performance requirements, such as speed, offset management, and low power consumption [3]. This topology offers significant advantages, particularly in handling differential signals efficiently in high-speed applications. Fig. 4-5 illustrates the comparator's circuit diagram and outlines the role of each functional block.



Fig. 4-5. Miyahara's dynamic comparator.

During the reset phase, with the clock signal (CLK) low, the Di<sup>+</sup> and Di<sup>-</sup> nodes are charged to  $V_{DD}$  through transistors  $M_4$  and  $M_5$ . Transistors  $M_6$  to  $M_9$  maintain the latch in an inactive state, while  $M_{10}$  and  $M_{11}$  set the latch output to zero volts. When the CLK switches to high,  $M_3$  discharges the Di<sup>+</sup> and Di<sup>-</sup> nodes through  $M_1$  and  $M_2$ . The discharge rates vary based on the differential input voltages, generating a controlled differential signal that determines the latch's output direction.  $M_6$  and  $M_7$  further ensure that the latch sides receive different currents depending on the differential signal, improving latch control. The latch stage uses positive feedback to quickly amplify the differential signal at the Di<sup>+</sup> and Di<sup>-</sup> nodes, producing a definitive digital output. The preamplifier and latch operate on separate current branches, allowing independent current control. This setup optimizes power efficiency by matching the biasing of each component to its function: the preamplifier uses lower current for gain, while the latch uses higher current for speed.

Furthermore, the latch does not require an additional CLK phase to trigger; it is automatically enabled once the preamplifier amplifies the input sufficiently, simplifying the design and avoiding timing issues. Miyahara's topology is thus power-efficient, consuming only dynamic power during switching and eliminating static power consumption.

Kickback, a common issue in dynamic comparators due to voltage shifts at the Di<sup>+</sup> and Di<sup>-</sup> nodes, is managed in this design as common-mode errors. This approach prevents kickback from interfering with input signals while maintaining accurate differential-mode signal detection. The Miyahara comparator's architecture effectively addresses critical design challenges, making it ideal for high-speed, precision-oriented applications in low-power systems.

#### 4.4.1.2 Preamplifier Gain

To optimize the gain of the preamplifier, it is crucial to consider the factors influencing the differential voltage at the Di<sup>+</sup> and Di<sup>-</sup> nodes, which directly impact noise reduction and offset management in the comparator. The differential voltage  $\Delta V_{Di}(t)$ across these nodes is determined by the transconductance  $(g_{m_{1,2}})$  of the input transistors, the differential input voltage  $\Delta V_{in}$ , and the effective capacitance  $C_{Di}$  on the nodes, as given in Eq. 4-2:

$$\Delta V_{\text{Di}}(t) = \frac{g_{\text{m}_{1,2}} \times \Delta V_{\text{in}}}{C_{\text{Di}}} \times t$$
(4-2)

This relation holds while the latch is disabled (before  $M_6$  and  $M_7$  activate to drive the latch to  $V_{DD}$ ). The latch triggers when the voltages on the Di<sup>+</sup> and Di<sup>-</sup> nodes fall below the threshold voltage of the top inverters  $(V_{th_{inv}})$ , which enables the latch and results in a stable output decision.

To estimate the time required to activate the latch, we approximate the discharge rate of the  $Di^+$  and  $Di^-$  nodes using the average current  $I_{cm}$  through  $M_1$  and  $M_2$ . The discharge time is expressed as:

$$t \approx C_{Di} \times \frac{V_{DD} - V_{th_{inv}}}{I_{cm}}$$
(4-3)

Combining Eq. 4-2 and Eq. 4-3, the effective gain of the preamplifier  $(A_{Amp})$  is:

$$A_{Amp} = \frac{\Delta V_{Di}}{\Delta V_{in}} = \frac{g_{m_{1,2}}}{I_{cm}} \times (V_{DD} - V_{th_{inv}})$$
(4-4)

Equation 4-4 illustrates that the preamplifier gain is independent of the parasitic capacitance at the  $Di^+$  and  $Di^-$  nodes, provided the input signal remains constant during the comparison period. However, if the input is changing during operation (e.g., when a hit signal is still arriving), the gain will depend on the parasitic capacitance, as detailed in [2].
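The cancellation of  $C_{Di}$  can be checked numerically by evaluating Eq. 4-2 at the integration time given by Eq. 4-3 and comparing the result with Eq. 4-4. The sketch below assumes a constant input during the comparison. The function names and all device values (transconductance, current, supply, inverter threshold) are hypothetical placeholders.

```python
def integration_time(c_di, v_dd, v_th_inv, i_cm):
    """Eq. 4-3: time for the Di nodes to discharge from V_DD to V_th_inv."""
    return c_di * (v_dd - v_th_inv) / i_cm


def delta_v_di(gm, dv_in, c_di, t):
    """Eq. 4-2: differential voltage built up on the Di nodes after time t."""
    return gm * dv_in / c_di * t


def preamp_gain(gm, i_cm, v_dd, v_th_inv):
    """Eq. 4-4: effective preamplifier gain."""
    return gm / i_cm * (v_dd - v_th_inv)


# HYPOTHETICAL operating point, for illustration only.
GM, I_CM, V_DD, V_TH_INV, DV_IN = 200e-6, 10e-6, 1.1, 0.3, 1e-3

gains = []
for c_di in (1e-15, 5e-15, 20e-15):  # sweep the node capacitance
    t = integration_time(c_di, V_DD, V_TH_INV, I_CM)
    gains.append(delta_v_di(GM, DV_IN, c_di, t) / DV_IN)

expected = preamp_gain(GM, I_CM, V_DD, V_TH_INV)
```

A larger  $C_{Di}$  slows the build-up of  $\Delta V_{Di}$  but lengthens the integration time by the same factor, so every entry in `gains` matches `expected` up to rounding.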

To maximize gain, two primary design strategies can be employed. First, by adjusting the sizing of  $M_3$ , the input current through  $M_1$  and  $M_2$  can be limited, pushing the transistors into weak inversion. This increases the  $g_m/I_d$  ratio, enhancing gain. Second, setting the inverter threshold ( $V_{th_{inv}}$ ) close to ground extends the time during which the input signal integrates on the Di<sup>+</sup> and Di<sup>-</sup> nodes, further increasing gain.

However, these methods to boost gain come with a trade-off: they reduce the overall comparison speed. Since the comparator operates at a high frequency of 800 MHz, the achievable gain will be constrained by speed requirements. Thus, careful balancing between gain and speed is necessary in the final design.

#### 4.4.1.3 Noise

Dynamic comparators do not have a steady-state operating point, as the transistor operating regions constantly change during each comparison cycle. This characteristic makes traditional AC noise analysis unsuitable, as there is no fixed state to measure noise consistently across all operating conditions. Instead, noise in dynamic comparators must be analyzed using time-domain techniques, which can still provide valuable insights. For example, in [4], stochastic calculus was applied for time-domain noise analysis, yielding a noise expression that, although specific to a different comparator topology than the one shown in Fig. 4-4, offers generalizable conclusions for dynamic comparators.

Minimizing the overdrive voltage of the input transistors is essential for reducing noise in dynamic comparators. To achieve this, three primary design techniques are applied. First, the width-to-length (W/L) ratio of the input transistors is increased to improve transconductance, which in turn reduces noise. However, this adjustment also increases the input capacitance, potentially attenuating the input signal amplitude and diminishing the signal-to-noise ratio, making it less desirable in some cases. Second, the current through the input transistors is reduced by adjusting the size of the tail transistor,  $M_3$ . By lowering the W/L ratio of  $M_3$ , the current through the input pair ( $M_1$  and  $M_2$ ) is limited, which effectively reduces the noise generated during operation. Lastly, lowering the input common-mode voltage ( $V_{CM}$ ) further reduces the current through the input pair, thus minimizing noise.

While these approaches successfully reduce noise, they also introduce a trade-off in the form of reduced speed, as the reduced current impacts the comparator's response time. Therefore, careful optimization of these parameters is done to achieve an optimal balance between noise reduction and the required comparator speed for the application.

#### 4.4.1.4 Power Consumption

The power consumption of a dynamic comparator is heavily influenced by the amount of parasitic capacitance that needs to be charged and discharged during each comparison cycle. The primary contributors to this parasitic load are the output capacitance and the capacitance at the differential input nodes, known as the Di<sup>+</sup> and Di<sup>-</sup> nodes. A rough estimate of the total power consumption can be expressed by the following equation:

$$P = f_{CLK} \times V_{DD}^2 \times (2C_{Di} + 1.25C_{out})$$
(4-5)

where P represents the power consumption,  $f_{CLK}$  is the clock frequency,  $C_{Di}$  represents the capacitance at the Di<sup>+</sup> and Di<sup>-</sup> nodes, and  $C_{out}$  denotes the output capacitance. The factor of 2 accounts for the charging and discharging of both Di<sup>+</sup> and Di<sup>-</sup> nodes, while the factor of 1.25 reflects the fact that only one output fully charges and discharges; the other output typically reaches approximately  $V_{DD}/2$  before discharging back to ground.
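Equation 4-5 lends itself to a quick numerical estimate. In the sketch below, the 800 MHz clock frequency is taken from the text, while the supply voltage and both capacitances are hypothetical placeholders, so the result indicates the order of magnitude only and is not a reported figure.

```python
def comparator_power(f_clk, v_dd, c_di, c_out):
    """Eq. 4-5: rough dynamic power estimate of the comparator, in watts."""
    return f_clk * v_dd ** 2 * (2.0 * c_di + 1.25 * c_out)


power = comparator_power(
    f_clk=800e6,   # clock frequency from the text
    v_dd=1.1,      # HYPOTHETICAL supply voltage
    c_di=5e-15,    # HYPOTHETICAL capacitance per Di node
    c_out=4e-15,   # HYPOTHETICAL output capacitance
)
```

With these placeholder values the estimate is about 14.5 µW. Because the expression is linear in the capacitances, halving  $C_{Di}$  and  $C_{out}$  halves the estimated power.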

To minimize power consumption in the comparator design, the key approach is to reduce parasitic capacitances at the Di<sup>+</sup> and Di<sup>-</sup> nodes. This is achieved through specific design techniques, such as optimizing the circuit layout by shortening trace lengths and selecting appropriate component sizes. By carefully managing these design factors, the parasitic capacitances are minimized, which in turn reduces the overall power consumption of the comparator while maintaining its performance within the required specifications.

#### 4.4.1.5 Simulation Results

In the dynamic comparator design, the input common-mode voltage ( $V_{CM}$ ) is set to 600 mV. This choice is intentional, as it optimizes gain while minimizing noise in the preamplifier, ensuring that the speed requirements for operation are met. The input transistors are configured with a width-to-length (W/L) ratio of 4.8 µm/100 nm, which ensures that the non-calibrated offset remains within acceptable limits. This transistor sizing also increases the transconductance ( $g_m$ ), positively contributing to the overall gain of the preamplifier. The resulting input capacitance of the comparator is 3.29 fF.

In terms of transistor types, the  $M_6$  and  $M_7$  transistors are High Voltage Threshold (HVT) devices, while  $M_8$  and  $M_9$  are Low Voltage Threshold (LVT) devices. This configuration brings the threshold of the top inverters closer to ground, further improving the preamplifier's gain. Additionally,  $M_1$  and  $M_{11}$  are LVT devices, ensuring that they can adequately pull the latch when the Di<sup>+</sup> and Di<sup>-</sup> nodes are reduced to the inverter threshold voltage ( $V_{th_{inv}}$ ).

To assess the noise characteristics of the comparator, the input voltage difference $(\Delta V_{in})$ is swept over small values around zero. For each value of $\Delta V_{in}$, the probability that the comparator outputs a '1' is recorded. The resulting curve of P(OUT = 1) versus $\Delta V_{in}$ follows a cumulative normal distribution. The mean of this distribution corresponds to the comparator's offset, while the standard deviation reflects twice the input-referred RMS noise. The probability versus $\Delta V_{in}$ curve is shown in Fig. 4-6, with both schematic and post-layout results.



Fig. 4-6. Probability of a positive output for different values of  $\Delta V_{in}$ .
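The extraction of the mean and standard deviation from such a probability curve can be sketched as follows. This is a minimal illustration, not the analysis flow used for Fig. 4-6: it assumes the curve is an ideal cumulative normal distribution, fits it by linear regression on probit values, and uses synthetic data with arbitrary parameters (50 µV mean, 140 µV standard deviation). Relating the fitted standard deviation to the input-referred noise follows the convention stated in the text above.

```python
from statistics import NormalDist


def fit_offset_and_sigma(dv_values, p_one):
    """Fit P(OUT=1) = Phi((dv - mu) / sigma) by regression on probit values."""
    std_normal = NormalDist()
    # The probit is undefined at exactly 0 and 1, so those points are skipped.
    pts = [(dv, std_normal.inv_cdf(p))
           for dv, p in zip(dv_values, p_one) if 0.0 < p < 1.0]
    n = len(pts)
    mean_x = sum(x for x, _ in pts) / n
    mean_z = sum(z for _, z in pts) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    sxz = sum((x - mean_x) * (z - mean_z) for x, z in pts)
    slope = sxz / sxx                      # equals 1 / sigma
    intercept = mean_z - slope * mean_x    # equals -mu / sigma
    return -intercept / slope, 1.0 / slope


# Synthetic sweep from -400 uV to +400 uV (values are illustrative only).
true_curve = NormalDist(mu=50e-6, sigma=140e-6)
sweep = [i * 50e-6 for i in range(-8, 9)]
probs = [true_curve.cdf(dv) for dv in sweep]

mu_fit, sigma_fit = fit_offset_and_sigma(sweep, probs)
print(f"mean = {mu_fit * 1e6:.1f} uV, sigma = {sigma_fit * 1e6:.1f} uV")
```

With simulated or measured data, each probability is an estimate from a finite number of comparisons, so the fit is only as good as the number of trials per sweep point.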

| Category                                                  | Schematic | Post-layout |
|-----------------------------------------------------------|-----------|-------------|
| Power Consumption [µW]                                    | 36.0      | 56.9        |
| Energy per Conversion [fJ]                                | 45.0      | 71.1        |
| Comparison Delay [ps]<br>$(\Delta V_{in} = 2 \text{ mV})$ | 153       | 240         |
| Input Referred Noise [µV <sub>rms</sub> ]                 | 125       | 140         |
| ENC [e <sup>-</sup> <sub>rms</sub> ]                      | 24        | 27          |
| Offset [mV]                                               | 5.59      | 5.87        |

Table 4-1. Dynamic comparator performance.

The final non-calibrated offset of the comparator shows a standard deviation of approximately 5.6 mV. It is determined in [2] that the threshold voltage ( $V_{th}$ ) mismatch alone contributes about 4.0 mV to this standard deviation in offset. The remaining offset is attributed to other mismatches within the comparator, supporting the conclusion that a significant portion of the offset arises from the $V_{th}$ mismatch of the input pair. A comprehensive summary of all simulation results for the dynamic comparator is presented in Table 4-1.
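Two of the figures above can be cross-checked with simple arithmetic. The sketch below assumes that the comparator runs at the 800 MHz comparison clock described later in this chapter, so that energy per conversion is power divided by clock rate, and that independent mismatch sources add in quadrature. The quadrature assumption is ours and is not stated in the text.

```python
import math

F_CLK_HZ = 800e6  # comparison clock rate assumed for the energy figure


def energy_per_conversion_fj(power_uw, f_clk_hz=F_CLK_HZ):
    """Energy per comparison in fJ for a given average power in uW."""
    return power_uw * 1e-6 / f_clk_hz * 1e15


def residual_sigma_mv(total_mv, vth_mv):
    """Remaining offset sigma if independent sources add in quadrature."""
    return math.sqrt(total_mv**2 - vth_mv**2)


print(f"schematic:   {energy_per_conversion_fj(36.0):.1f} fJ")
print(f"post-layout: {energy_per_conversion_fj(56.9):.1f} fJ")
print(f"non-Vth offset contribution: {residual_sigma_mv(5.6, 4.0):.2f} mV")
```

Under these assumptions the energy values agree with Table 4-1, and the mismatch sources other than $V_{th}$ account for a little under 4 mV.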

#### 4.4.2 Offset Compensation

Miyahara pioneered a method of active offset compensation in his comparator design, which is similarly required for the application discussed in this work [3]. This subsection expands on Miyahara's original concept, detailing the offset compensation technique and its relevance to this application. It is important to note that traditional offset compensation methods commonly employed in analog circuits, such as chopping and auto-zeroing, are not suitable for this application. These techniques rely on linear amplifiers to function effectively, but since the comparator in this design provides only a digital output, it can indicate the polarity of the offset without revealing its magnitude. This limitation renders such methods inapplicable for dynamic circuits. Furthermore, the study in [2] demonstrated that, even under optimal conditions, passive offset compensation fails to meet the stringent precision requirements of this application. This limitation underscores the necessity for more sophisticated active offset compensation techniques to achieve the desired performance.

In Miyahara's approach, active offset compensation is implemented using auxiliary input pairs, as shown in Fig. 4-7. In this setup, the voltage $V_C^-$ is fixed, while the voltage $V_C^+$ is dynamically adjusted to counteract the offset. The appropriate compensation voltage is determined during a calibration phase, where a charge pump incrementally adds charge to or removes charge from a storage capacitor $C_C$ connected to the $V_C^+$ node. This process is governed by the comparator's output polarity, allowing the offset to be finely tuned. Figure 4-8 illustrates example waveforms of the calibration process.



Fig. 4-7. Adjusted pre-amplifier with auxiliary input pair to allow for offset calibration.



Fig. 4-8. Offset compensation technique used in [2].

After the calibration phase, both  $V_C^+$  and  $V_C^-$  are held constant, limiting the post-calibration offset of the comparator to a maximum of one charge pump step. Ideally, the compensation voltage across the storage capacitor would remain stable indefinitely. However, in practice, gate leakage through transistors  $M_{b_1}$  and  $M_{b_2}$  causes gradual dissipation of this voltage over time. To address this, the storage capacitor is designed with a large capacitance, ensuring that the compensation voltage remains effective over extended periods.

#### 4.4.2.1 Two-sided Compensation

In the design, two-sided compensation is employed to enhance offset correction flexibility by allowing both sides of the compensation circuitry to be adjusted independently. Although this approach initially seems counterintuitive due to the use of two large capacitors instead of one, it offers a distinct advantage by effectively doubling the compensation range, as shown in Fig. 4-7. In contrast to single-sided compensation, where one side is fixed, two-sided compensation allows for more extensive offset corrections, making the system more adaptable [2], [3], [5].

The maximum compensation range is determined by the sizing of the compensation transistors  $M_{b_1}$  and  $M_{b_2}$ . While the use of two capacitors might appear to increase the overall design complexity, reducing the size of the compensation transistors does not reduce the compensation range. Smaller transistors, with lower gate leakage, enable the use of smaller capacitors without compromising the compensation voltage V<sub>C</sub> [5], [6]. As a result, the overall design area may remain the same or even decrease, despite the inclusion of two capacitors.

One of the key benefits of two-sided compensation is its ability to minimize the negative impact on preamplifier gain. The auxiliary input pair adds current to the Di<sup>+</sup> and Di<sup>-</sup> nodes, introducing a common-mode current  $I_{CM}$ , which reduces the preamplifier's gain (Eq. 4-4) and increases noise and offset. By enabling independent control over both sides of the compensation, it is possible to reduce these currents, thereby minimizing the adverse effects on the preamplifier gain. Specifically, one side of the compensation circuitry can be switched off when not needed, reducing the average compensation voltages  $V_C^+$  and  $V_C^-$ , thus preserving the comparator's performance while extending the compensation range.

This approach ensures both improved offset correction and overall efficiency without compromising the reliability and accuracy of the comparator. The two-sided compensation technique is adapted based on principles detailed in reference [3], with modifications tailored to meet the specific performance requirements of this design.

#### 4.4.2.2 Compensation Speed

Compensation speed is a critical factor in the design of the comparator, particularly when addressing the issue of charge leakage from the compensation capacitors. Excessive charge dissipation necessitates recalibration of the comparator, during which the system becomes unresponsive to incoming signals. To mitigate the impact of this downtime, the calibration process can be synchronized with the scanning operation of the SEM, which is itself a disruptive process. Nevertheless, minimizing the calibration duration is essential to reduce the interruption to the system's functionality.

In linear step calibration, as demonstrated in [3], the maximum compensation time  $T_{comp}$  can be determined using the equation:

$$T_{comp} = T_{CLK} \times \frac{V_{comp_{range}}}{V_{comp_{step}}}$$
(4-6)

where $V_{comp_{step}}$ represents the step size, and $V_{comp_{range}}$ denotes the total compensation range. Achieving high precision requires a smaller $V_{comp_{step}}$, which inevitably increases the calibration time. To address this limitation, a more efficient binary search algorithm can be employed to identify the offset. In this method, the initial step size is set to half of $V_{comp_{range}}$, with each subsequent step halving the size of the previous one. The calibration time for this approach can be expressed as:

$$T_{comp} = T_{CLK} \times \left[\log_2\left(\frac{V_{comp_{range}}}{V_{comp_{step}}}\right)\right]$$
(4-7)
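To give a feel for the difference between Eqs. 4-6 and 4-7, the sketch below evaluates both for one example. The 330 mV range is the proof-of-concept calibration range quoted in Section 4.4.2.3; the 100 µV step and the use of one 800 MHz clock period per calibration step are hypothetical assumptions, as is rounding the binary search up to a whole number of cycles.

```python
import math


def linear_cal_time(t_clk, v_range, v_step):
    """Worst-case calibration time with fixed steps (Eq. 4-6)."""
    return t_clk * (v_range / v_step)


def binary_cal_time(t_clk, v_range, v_step):
    """Calibration time with binary search (Eq. 4-7), rounded up to whole cycles."""
    return t_clk * math.ceil(math.log2(v_range / v_step))


T_CLK = 1.25e-9    # one period of an 800 MHz clock (assumed calibration rate)
V_RANGE = 330e-3   # compensation range in V
V_STEP = 100e-6    # final step size in V (hypothetical)

t_lin = linear_cal_time(T_CLK, V_RANGE, V_STEP)
t_bin = binary_cal_time(T_CLK, V_RANGE, V_STEP)
print(f"linear: {t_lin * 1e6:.3f} us, binary: {t_bin * 1e9:.1f} ns")
```

For these inputs the linear search needs 3300 clock periods in the worst case, whereas the binary search needs 12.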

While the binary search method significantly reduces calibration time compared to linear steps, its implementation introduces additional complexity. As a practical alternative, a compromise solution with decreasing step sizes is adopted. In this approach, although not as efficient as binary steps, the calibration time is still significantly shortened. The step size is reduced dynamically each time the comparator's output signal flips.
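The decreasing-step compromise can be illustrated with a small behavioural model. This is a sketch under stated assumptions rather than a model of the implemented circuit: the step is multiplied by a fixed factor of 0.5 at every output flip (the actual reduction per flip is set by the charge pump and is not specified here), the comparator is noiseless, and the 8 mV starting step, 125 µV minimum step, and the function name `calibrate` are hypothetical.

```python
def calibrate(offset_v, start_step_v, min_step_v, shrink=0.5, max_cycles=10_000):
    """Return (compensation voltage, cycles used) for a flip-driven step search."""
    v_comp = 0.0
    step = start_step_v
    last_sign = None
    for cycle in range(1, max_cycles + 1):
        # The comparator only reveals the polarity of the residual offset.
        sign = 1 if offset_v > v_comp else -1
        if last_sign is not None and sign != last_sign:
            if step <= min_step_v:
                # Output toggles at the finest step: residual is within one step.
                return v_comp, cycle
            step = max(step * shrink, min_step_v)  # flip detected: reduce the step
        v_comp += sign * step
        last_sign = sign
    return v_comp, max_cycles


OFFSET = 5.6e-3  # example offset in V
v_final, cycles = calibrate(OFFSET, start_step_v=8e-3, min_step_v=125e-6)
linear_cycles = OFFSET / 125e-6  # cycles needed with fixed 125 uV steps from zero
print(f"residual = {abs(OFFSET - v_final) * 1e6:.0f} uV after {cycles} cycles "
      f"(fixed steps: about {linear_cycles:.0f} cycles)")
```

The model stops as soon as the output toggles at the minimum step, which bounds the residual by one minimum step, in line with the one-step post-calibration limit described earlier.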

To enable this functionality, two key circuit blocks are used: an output sign swap detector and a charge pump with adjustable step size. The output sign swap detector circuit, illustrated in Fig. 4-9, monitors the comparator's output state. Each time the output is high, the corresponding latch is set to '1'. When both latches are simultaneously high, indicating that both outputs have been activated, the latches are reset, and a short pulse is generated to signify an output swap. Simulation results for this circuit are presented in Fig. 4-10.



Fig. 4-9. Circuit used to detect swapping of the output sign.



Fig. 4-10. Simulation results of the output swap detector circuit.
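The behaviour of the swap detector described above can be mirrored in a few lines. The class below is a literal behavioural sketch of the textual description (two latches, reset and pulse when both are set); the class and method names are hypothetical, and timing details of the real circuit in Fig. 4-9 are not modelled.

```python
class SwapDetector:
    """Behavioural model of the output sign swap detector."""

    def __init__(self):
        self.seen_pos = False  # latch set when OUT+ has been high
        self.seen_neg = False  # latch set when OUT- has been high

    def clock(self, out_pos, out_neg):
        """Process one comparison result; return True if a swap pulse is emitted."""
        if out_pos:
            self.seen_pos = True
        if out_neg:
            self.seen_neg = True
        if self.seen_pos and self.seen_neg:
            # Both outputs have been active: reset the latches and emit a pulse.
            self.seen_pos = False
            self.seen_neg = False
            return True
        return False


detector = SwapDetector()
results = [(1, 0), (1, 0), (0, 1), (0, 1), (1, 0)]
pulses = [detector.clock(p, n) for p, n in results]
print(pulses)
```

In this literal model both latches are cleared after a pulse, so a flip that follows immediately in the very next comparison is not flagged; whether the implemented circuit behaves the same way depends on timing details that are not covered here.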



Fig. 4-11. Charge pump implementation.

The charge pump plays a vital role in calibrating the offset by adding charge to and removing charge from a storage capacitor. This process is contingent on the output of the comparator and should only occur during calibration, indicated by an active high calibration signal *calb*. When functioning correctly, the charge pump decreases the step size each time the swap signal is activated. The implementation of this functionality is depicted in Fig. 4-11, which showcases the charge pump for the V<sub>C</sub><sup>+</sup> side; for the V<sub>C</sub><sup>-</sup> side, the output signals out<sup>+</sup> and out<sup>-</sup> are swapped.



Fig. 4-12. Simulation results of the charge pump circuit with decreasing step size.

Prior to calibration, the *calb* signal is low, ensuring that the voltages $V_{pullup}$ and $V_{pulldown}$ are set to $V_{SS}$ and $V_{DD}$, respectively. During the calibration phase, charge is extracted from the $C_{pullup}$ and $C_{pulldown}$ capacitors whenever the swap signal is triggered. This charge reduction lowers the gate-source voltage ( $V_{gs}$ ) of the current source and sink transistors, thereby decreasing the current flow through these transistors. Additionally, output switches direct the charge pump's operation based on the comparator's output. The simulation results of this process are provided in Fig. 4-12.

#### 4.4.2.3 Maximum Calibration Range

The maximum calibration range of the offset compensation circuit is primarily determined by the relative sizing of the  $M_{b_1}$  and  $M_{b_2}$  transistors compared to  $M_1$  and  $M_2$ . This relationship defines the highest offset that can be effectively corrected. To calculate the maximum calibration range, the voltages  $V_C^+$  and  $V_C^-$  are set to their extreme values—specifically,  $V_C^+$  to  $V_{DD}$  and  $V_C^-$  to  $V_{SS}$  (or vice versa).

For optimal functionality in the targeted application, the maximum calibration range must be large enough to ensure that the probability of all comparators operating within this range is acceptable. For example, setting the maximum calibration range to  $V_{calb_{max}} = 6 \times \sigma(V_{offset})$ , where  $\sigma(V_{offset})$  is the standard deviation of the offset, increases the probability of all 10000 pixels functioning correctly to 99.5 %.

It is important to note that the offset arises not only from the comparator itself but also from other factors, including mismatches in the reset switches and variations in the gate-drain capacitance ( $C_{gd}$ ) of the input pair, both of which contribute to discrepancies in the comparator's kickback. Additionally, mismatches between the detector capacitance ( $C_D$ ) and the dummy capacitance ( $C_{dummy}$ ) further impact the kickback performance. Based on these contributions, the total offset is estimated to be approximately 10 mV, guiding the sizing of the M<sub>b1</sub> and M<sub>b2</sub> transistors.

As this design serves as a proof of concept, the maximum calibration range is intentionally set high at 330 mV to ensure the system operates reliably under all conditions. For the final application, the  $M_{b_1}$  and  $M_{b_2}$  transistors can be scaled down by a factor of five, aligning with the relationship  $V_{calb_{max}} = 6 \times \sigma(V_{offset})$ . This adjustment significantly reduces the size of the storage capacitors in the final design compared to the proof-of-concept version. Post-layout simulations validated the effectiveness of the offset compensation strategy. The implementation reduced the offset of the dynamic comparator from  $1\sigma = 5.59$  mV to  $1\sigma = 172 \mu$ V, demonstrating a substantial improvement. This result confirmed the robustness of the offset compensation approach and highlighted its effectiveness in enhancing the comparator performance for the intended application.

#### 4.4.2.4 Charge Leakage

One notable challenge in the calibration process arises when the calibration mode remains active for an extended period. During such instances, the charge stored in the C<sub>pullup</sub> and C<sub>pulldown</sub> capacitors can become fully depleted due to the continuous processing of swap signals. This depletion renders the calibration mode effectively inactive.

To address this issue, the switches $S_2$ and $S_3$ in the calibration circuit (illustrated in Fig. 4-11) are intentionally designed to be significantly larger than $S_1$ and $S_4$. This design ensures that, even if the system remains in calibration mode for an extended duration, the C<sub>pullup</sub> and C<sub>pulldown</sub> capacitors gradually recharge. The higher leakage currents through the larger switches $S_2$ and $S_3$ offset the effects of the smaller switches $S_1$ and $S_4$, enabling the stored charge to rebuild over time. As a result, if the output polarity remains unchanged, the system experiences a slow increase in the calibration step size as the capacitors recharge. This behavior ensures the calibration system retains a degree of responsiveness, even after prolonged inactivity.

Moreover, the leakage characteristics of the  $M_{b_1}$  and  $M_{b_2}$  transistors, along with the  $S_5$  and  $S_6$  switches, introduce a critical challenge by causing a gradual drift in the calibration voltages over time. This leakage-induced drift undermines the accuracy of the offset compensation once the calibration mode is deactivated, thereby diminishing the overall reliability and effectiveness of the system.

With a total leakage current of 18 fA, maintaining the offset compensation within a 5% error margin over a 1 ms period necessitates a storage capacitor ( $C_C$ ) larger than 2 fF [2]. To address this, the $C_C$ is designed to be configurable, offering values of 2 fF, 5 fF, 10 fF, and 20 fF through programmable settings in the readout channel. This configurability not only ensures robust offset compensation but also provides flexibility in characterizing and optimizing the offset compensation mechanism under various operational conditions.
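The effect of the storage capacitor value on voltage drift can be estimated with $\Delta V = I_{leak} \times t / C_C$. The sketch below applies this to the four configurable values, assuming a constant 18 fA leakage over the 1 ms hold period. It does not attempt to reproduce the 5% criterion from [2], since the reference voltage for that margin is not given here.

```python
I_LEAK_A = 18e-15  # total leakage current in A
T_HOLD_S = 1e-3    # hold time between calibrations in s


def droop_v(c_store_f, i_leak_a=I_LEAK_A, t_hold_s=T_HOLD_S):
    """Drift of the stored compensation voltage, assuming constant leakage."""
    return i_leak_a * t_hold_s / c_store_f


for c_ff in (2, 5, 10, 20):
    drift_mv = droop_v(c_ff * 1e-15) * 1e3
    print(f"C_C = {c_ff:>2} fF -> drift over 1 ms = {drift_mv:.2f} mV")
```

The drift scales inversely with $C_C$, so moving from the smallest to the largest setting reduces it by a factor of ten.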

#### 4.4.3 Threshold Generation

To ensure accurate hit detection, the dynamic comparator compares the detector voltage to a predefined threshold, ensuring a '0' output in the absence of hits. Precision in setting this threshold is crucial, as inaccuracies can affect detection reliability. Two methods have been considered and analyzed for implementing this threshold:

- At the Comparator Input: A voltage slightly below the detector's bias is applied to the comparator's input, ensuring a '0' output when the detector is at its bias. A hit causes the voltage to drop, prompting the comparator to register a '1'.
- Within the Comparator: The threshold can also be embedded within the comparator by introducing an intentional asymmetry that biases the output to '0'. This must be switchable to avoid interference with offset calibration.

Although both approaches are viable, applying the threshold at the comparator's input offers distinct advantages [2]. This method addresses the issue of varying detector capacitance ( $C_D$ ), which can lead to inaccuracies when using a fixed voltage threshold. A more reliable solution is to use a fixed charge threshold, which ensures a consistent threshold-to-signal ratio, regardless of fluctuations in  $C_D$ . Applying the threshold at the comparator input, where the detector is connected, ensures that the threshold charge  $Q_{th}$  is converted to a voltage by  $C_D$ . However, implementing this method requires that  $Q_{th}$  be smaller than the signal charge ( $Q_{sig} = 1000 \text{ e}^-$ ).

Generating the required small charge threshold presents a challenge. Two main methods can be considered for this: integrating a current over time or using a capacitor to convert a voltage into charge. The current integration approach demands precise control over both the current and the timing. The timing window is particularly tight: charge generation must occur after the detector reset and before the comparator performs the next comparison. For instance, with a reset time of $T_{reset} = 300$ ps and a comparator comparison time of $T_{comp} = 350$ ps, only a narrow 600 ps window of the 1.25 ns period of the 800 MHz comparison clock remains for charge generation. This limited time frame makes the system highly sensitive to variations in process, voltage, and temperature (PVT), which presents a substantial challenge. Additionally, switching the current source introduces noise, which further degrades the threshold generation. Therefore, current integration is not the optimal solution for this application.

An alternative method involves converting a voltage into charge using a capacitor, as illustrated in Fig. 4-13. In this approach, when a voltage step  $V_{step}$  is applied to the right side of the capacitor, the voltage across the detector changes according to the following equation:

$$V_{D_{step}} = \frac{V_{step} \times C_{dec}}{C_{dec} + C_D}$$
(4-8)

Assuming  $C_{dec} \ll C_D$ , this voltage can be approximated as:

$$V_{D_{step}} \approx \frac{V_{step} \times C_{dec}}{C_D}$$
(4-9)

The corresponding charge generated is given by:

$$Q_{D_{step}} = V_{step} \times C_{dec}$$
(4-10)

This method ensures that the amount of charge generated is independent of the detector's junction capacitance  $C_{D}$ .



Fig. 4-13. Circuit diagram to convert voltage to charge.
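The approximation behind Eqs. 4-8 to 4-10 can be checked numerically. The sketch below compares the charge placed on $C_D$ using the exact capacitive divider with the ideal value $V_{step} \times C_{dec}$, across the 30 fF to 50 fF detector capacitance range considered in this chapter. The 1 fF coupling capacitor and the 80 mV step are hypothetical values chosen to land near a 500-electron threshold.

```python
Q_E = 1.602176634e-19  # elementary charge in C


def detector_step_exact(v_step, c_dec, c_d):
    """Voltage step on the detector node from the exact capacitive divider."""
    return v_step * c_dec / (c_dec + c_d)


def injected_electrons(v_step, c_dec, c_d):
    """Charge placed on C_D, in electrons, using the exact divider."""
    return detector_step_exact(v_step, c_dec, c_d) * c_d / Q_E


C_DEC = 1e-15    # coupling capacitor in F (hypothetical)
V_STEP = 80e-3   # voltage step in V (hypothetical)

ideal = V_STEP * C_DEC / Q_E  # ideal charge V_step x C_dec, in electrons
print(f"ideal: {ideal:.1f} e-")
for c_d_ff in (30, 40, 50):
    n_e = injected_electrons(V_STEP, C_DEC, c_d_ff * 1e-15)
    print(f"C_D = {c_d_ff} fF -> {n_e:.1f} e- ({100 * (n_e / ideal - 1):+.1f} %)")
```

With these values the exact result stays within a few percent of the ideal charge over the whole range, and the deviation shrinks as $C_{dec}$ becomes smaller relative to $C_D$.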

While this threshold generation method offers rapid operation, it has some significant drawbacks. The condition $C_{dec} \ll C_D$ requires the decoupling capacitor ( $C_{dec}$ ) to be on the order of a few femtofarads, while maintaining high accuracy. Although this is challenging, capacitors of this magnitude have been successfully fabricated in previous studies, with reports of 2 fF capacitors achieving an accuracy of up to 0.43 % in a 0.35 µm technology [7].

Additionally, the introduction of  $C_{dec}$  increases the equivalent capacitance at the input, reducing the signal amplitude. The overall accuracy of charge generation depends on the precision of the voltage step  $V_{step}$  and the decoupling capacitor  $C_{dec}$ . The voltage step can be generated by toggling between  $V_{SS}$  and  $V_{DD}$ , using a capacitive divider to achieve the desired step size. Although this method introduces a dependency on  $V_{DD}$ , the external power supply typically remains constant. The main concern is managing ripple voltage, which can be minimized using decoupling capacitors and wide power traces near the threshold generation blocks.

#### 4.4.3.1 Design of Decoupling Capacitor

The decoupling capacitor ( $C_{dec}$ ) plays a vital role in threshold generation by converting a voltage step ( $V_{step}$ ) into a precise amount of charge. For optimal performance, $C_{dec}$ must be small and designed with high accuracy. In this technology, MOS capacitors (MOSCAPs) are ideal for $C_{dec}$, as they exhibit minimal variance in Monte Carlo simulations, making them a suitable choice. However, MOSCAPs come with certain characteristics and limitations that must be considered.

MOSCAPs have capacitance values that vary with the applied voltage, which can complicate consistent charge generation. To mitigate this, the applied step voltage $V_{step}$ must be kept sufficiently small to ensure that the capacitance remains nearly constant. While MOSCAP values fluctuate across process corners, this variation is systematic across all pixels and can be compensated manually.

Another challenge with MOSCAPs is that their capacitance values depend on the observation point within the circuit. Both configurations were tested, with either the gate or the other terminal of the MOSCAP connected to the detector. Connecting the gate to the detector resulted in the lowest load on the detector while still achieving the desired threshold generation, so this configuration was selected.

With the detector voltage ( $V_D$ ) set to 600 mV, the dependence of capacitance on  $V_{step}$  can be analyzed. By keeping  $V_{step}$  near  $V_{DD}$ , the capacitance value remains relatively stable, as shown in Fig. 4-14. This stability suggests that biasing  $V_{step}$  close to  $V_{DD}$  is advantageous.

For the final implementation, an NMOSCAP with dimensions of 800 nm by 400 nm is chosen, resulting in an additional load capacitance of 960 aF. This design approach minimizes loading effects on the detector and ensures consistent operation within the intended voltage range.



Fig. 4-14. Capacitance C<sub>dec</sub> vs V<sub>step</sub> voltage. The C<sub>dec</sub> is implemented using a MOSFET with the source and drain shorted together.

#### 4.4.3.2 Circuit Design

The threshold generation circuit, shown in Fig. 4-15, functions by creating a small voltage step at  $V_{step}$  to adjust the detector's threshold, ensuring precise charge detection. During the reset phase, when the reset signal is high, the detector capacitance is pulled towards the reference voltage  $V_1$ . When threshold generation is enabled,  $V_{step_{big}}$  shifts from  $V_{DD}$  to ground, creating a step-down of  $V_{step}$  as determined by the capacitive divider. This momentarily lowers the detector node voltage, though the reset switch quickly restores  $V_D$  to  $V_1$ .

Once the reset signal goes low, the threshold addition phase begins. At this point,  $V_{step_{big}}$  is pulled up to  $V_{DD}$ . This upward shift in  $V_{step}$  is transferred through  $C_{dec}$  to  $V_D$ , adding the threshold charge to the detector capacitance and setting the comparator threshold. Waveform simulations, as shown in Fig. 4-16, confirm the sequence, demonstrating the detector's response during reset and the subsequent threshold addition, which adjusts the detector voltage for accurate charge measurement.





To provide flexibility in setting the threshold charge, the design incorporates an adjustable capacitance  $C_{step_2}$ , controlled by digital logic. This adjustability allows fine control over the generated threshold, with Fig. 4-17 illustrating the achievable range of values. During post-layout simulations, the MOSCAP capacitance was found to be approximately 20% higher than expected, due to additional parasitics not fully accounted for in schematic-level simulations. This discrepancy slightly increased the generated threshold, but the calibration mechanism effectively compensates for this.

The threshold generator achieves a low standard deviation of 12 electrons at a threshold setting of 500 electrons, maintaining stability across a wide range of detector junction capacitances ( $C_D$ ) from 30 fF to 50 fF. However, the threshold standard deviation shows temperature dependence, gradually increasing at a rate of 0.272 electrons per degree Celsius. This minor thermal sensitivity is manageable with appropriate thermal controls, ensuring consistent threshold levels across varying operating conditions.

Fig. 4-16. Signal waveforms of the threshold charge generation block.



Fig. 4-17. Simulation results of the charge generated by the threshold charge generator. The charge generated is expressed as a number of electrons.

#### 4.4.3.3 Layout Considerations

In the layout design, the decoupling capacitor $C_{dec}$ plays a critical role and requires special attention to minimize parasitic capacitance. To achieve this, the connections for the MOSCAPs are initially routed using poly and n-type silicon, before transitioning to metal layer 1. This routing strategy helps reduce the parasitic capacitance that metal traces could introduce, thereby preserving the precision of $C_{dec}$. Additionally, the MOSCAPs are completely enclosed within a guard ring, effectively isolating them from the surrounding p-substrate, which enhances their stability and minimizes potential interference.

To ensure reliable operation of the capacitive divider, additional buffer circuits are incorporated into the design. These buffers improve the driving capability, ensuring the capacitive divider operates accurately without signal degradation. This comprehensive layout approach—minimizing parasitics, isolating sensitive components, and ensuring robust signal driving—contributes significantly to the overall accuracy and stability of the threshold generation circuit.

#### 4.4.4 Capacitor Matching Mechanism

The mismatch between the detector capacitor  $(C_D)$  and the dummy capacitor  $(C_{dummy})$  presents a critical challenge in maintaining common-mode conditions at the comparator's input. This mismatch arises primarily due to parasitic capacitances introduced during the bonding process between the detector and readout dies, as well as process variations. Such discrepancies convert clock kickback and reset-switching artifacts into differential signals instead of common-mode effects, resulting in potential erroneous detections. The root of this issue lies in the inherent physical differences between  $C_D$ , located on the detector die and influenced by bonding-related parasitics, and  $C_{dummy}$ , integrated in the readout die. Combined, these factors can lead to a mismatch of up to 20 % between the two capacitors.

To address this issue, an active matching mechanism, depicted in Fig. 4-18, is proposed. This mechanism employs a programmable binary-scaled capacitor network to implement $C_{dummy}$, enabling dynamic tuning of the total equivalent capacitance to compensate for mismatches. The $C_{dummy}$ network includes a static capacitor of 16 fF in parallel with programmable capacitors ranging from 2 fF to 16 fF. During the matching process, $C_D$ (including parasitics) and the $C_{dummy}$ network are simultaneously charged with a fixed charge. The offset-compensated autozeroed comparator then evaluates the resulting voltage difference and identifies the capacitor with the smaller effective value. A successive approximation register (SAR) logic iteratively adjusts the configuration of the $C_{dummy}$ network until the comparator's output toggles, signifying that the input voltages are balanced.



Fig. 4-18. Capacitor matching mechanism.
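The SAR-based matching procedure can be sketched as a short behavioural model. Several details are assumptions made for illustration: the programmable capacitors are taken as the binary-weighted set 16, 8, 4, and 2 fF, the comparator is ideal and noiseless, and the search is a standard MSB-first successive approximation. The function names are hypothetical.

```python
C_STATIC_FF = 16                  # static part of the C_dummy network
BIT_WEIGHTS_FF = (16, 8, 4, 2)    # assumed binary-scaled capacitors, MSB first


def dummy_capacitance(code_bits):
    """Total C_dummy in fF for a given switch configuration."""
    return C_STATIC_FF + sum(w for w, b in zip(BIT_WEIGHTS_FF, code_bits) if b)


def dummy_is_larger(c_d_ff, c_dummy_ff):
    """Ideal comparator: with equal charge, the larger capacitor shows the smaller step."""
    q = 1.0  # arbitrary charge units, identical on both sides
    return q / c_dummy_ff < q / c_d_ff


def sar_match(c_d_ff):
    """MSB-first successive approximation of C_dummy towards C_D."""
    bits = [0] * len(BIT_WEIGHTS_FF)
    for i in range(len(bits)):
        bits[i] = 1  # tentatively switch this capacitor in
        if dummy_is_larger(c_d_ff, dummy_capacitance(bits)):
            bits[i] = 0  # overshoot: switch it back out
    return bits, dummy_capacitance(bits)


bits, c_dummy = sar_match(37.3)  # example detector capacitance in fF
print(f"code = {bits}, C_dummy = {c_dummy} fF, residual = {37.3 - c_dummy:.1f} fF")
```

In this idealized form the residual mismatch is always smaller than the least significant capacitor, provided $C_D$ lies within the tuning range of the network.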

This implementation uses the charge generation network described in subsection 4.4.3, reusing existing circuitry for efficient operation. A replica of the charge generation network is implemented in parallel with the $C_{dummy}$ network, ensuring that both networks are charged with identical amounts of charge. Consequently, the accuracy of the matching depends solely on the precision of the charge generation network. Since this block is composed entirely of capacitors and switches, achieving high matching accuracy is feasible by employing interdigitation techniques during layout to enhance process-dependent precision. Additionally, as both charge generation networks reside on the same die, they share similar parasitic characteristics, further improving matching accuracy. Advanced layout techniques, including symmetrical placement and the use of dummy structures, are also applied to minimize variations and ensure consistent performance.

Simulation results demonstrate that the proposed matching mechanism can reduce the maximum mismatch between $C_D$ and the $C_{dummy}$ network to just 1 fF. This residual mismatch results in a differential voltage ranging from 85 $\mu$V to 170 $\mu$V for input charge signals between 500 and 1000 electrons. Such a voltage is well within the tolerance range of the comparator, as it remains smaller than the comparator's offset after autozeroing.
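The order of magnitude of this differential voltage can be reproduced with $V = Q/C$. The sketch below assumes $C_D = 30$ fF, the lower end of the detector capacitance range considered in this chapter, and a dummy capacitance that is 1 fF larger; both choices are ours and serve only to illustrate the arithmetic.

```python
Q_E = 1.602176634e-19  # elementary charge in C


def mismatch_voltage_uv(n_electrons, c_d_ff, mismatch_ff):
    """Differential voltage in uV when the same charge lands on C_D and C_D + mismatch."""
    q = n_electrons * Q_E
    v_d = q / (c_d_ff * 1e-15)
    v_dummy = q / ((c_d_ff + mismatch_ff) * 1e-15)
    return (v_d - v_dummy) * 1e6


for n in (500, 1000):
    print(f"{n} e-: {mismatch_voltage_uv(n, 30.0, 1.0):.0f} uV")
```

Under these assumptions the values come out close to the simulated 85 µV to 170 µV range; a larger $C_D$ would give a smaller differential voltage for the same 1 fF mismatch.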

#### 4.4.5 Additional Blocks

In addition to the core components of the readout circuit, several auxiliary blocks are crucial for its comprehensive functionality. These supplementary blocks play a vital role in supporting various operational aspects, thereby enhancing the overall system performance and ensuring the seamless integration of each component within the readout architecture. This sub-section will explore the design and purpose of three specific additional blocks. These blocks are introduced to facilitate calibration, threshold generation, and stability under varying conditions, addressing key challenges and bolstering the robustness of the readout system.

#### 4.4.5.1 Reset Generation

The reset generator plays a crucial role in generating a reset signal with a programmable duration, which is essential for initializing the system to a consistent state, particularly during calibration mode. This ensures that the input voltages of the comparator are balanced, enabling zero-offset calibration. The reset generator circuit, illustrated in Fig. 4-19, utilizes a programmable delay line to allow precise adjustment of the reset pulse width to meet specific operational requirements. Additionally, the latch design ensures that when both the set and reset signals are high, the output  $\overline{\text{Reset}}$  remains at '1', maintaining a stable reset pulse duration, irrespective of the comparator's output timing.



Fig. 4-19. Circuit diagram of the reset generator. The latch is designed such that it gives a $\overline{\text{Reset}} =$ '1' output whenever set and reset are both '1'.



Fig. 4-20. The output waveform of the reset generator (Reset).



Fig. 4-21. Values for the different reset widths possible given an input bitcode.

Post-layout simulations, depicted in Figs. 4-20 and 4-21, reveal that this reset network provides accurate control over the pulse width, remaining unaffected by variations in the input pulse width from OUT<sup>+</sup> or OUT<sup>-</sup>. This robustness ensures the reliability of the reset function, delivering consistent performance across different operating conditions.

# 4.4.5.2 Asynchronous Reset

For the readout system to function correctly, a clock frequency of 800 MHz is required, while the chip is driven by an external 400 MHz reference clock signal. To double the frequency, the incoming 400 MHz CLK is converted to the 800 MHz CLK<sub>comp</sub> using a circuit shown in Fig. 4-22. The top section of this circuit includes an edge detector that generates a brief pulse,  $V_{SET}$ , with each rising or falling edge of the CLK signal. Each time an edge is detected, CLK<sub>comp</sub> is set to '1', signaling the comparator to begin its comparison. Once the comparator evaluates its inputs, and either OUT<sup>+</sup> or  $OUT^-$  reaches '1', indicating the comparison is complete, the circuit resets CLK<sub>comp</sub> to '0' using an OR gate. This setup prevents premature resetting of the comparator, ensuring accurate control over the timing of the comparisons.



Fig. 4-22. Circuit used to asynchronously reset the comparator and generate an 800 MHz CLK.

The output of the circuit,  $CLK_{comp}$ , may not exhibit a 50 % duty cycle, as the comparator reset is triggered by the completion of the comparison rather than a fixed clock cycle. To support rapid resetting, the reset switches  $M_4$  and  $M_5$  (as shown in Fig. 4-5) are sized relatively large, ensuring that the comparator can reset quickly, even if the comparison takes slightly longer.

Simulated waveforms, presented in Fig. 4-23, confirm the duty cycle variations, with schematic simulations showing 21.0 % and post-layout simulations showing 30.3 %. The slight increase in the post-layout duty cycle is primarily due to additional delays within the dynamic comparator, as detailed in Table 4-1. Testing with the dynamic comparator in the loop replicates the performance expected in the final pixel, providing realistic timing and confirming that the circuit can sustain an 800 MHz effective clock frequency, despite the initial 400 MHz limitation.



Fig. 4-23. Simulation results of the asynchronous reset circuit, simulated in combination with the dynamic comparator.
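As a rough numerical check, the sketch below converts the reported duty cycles into the time that CLK<sub>comp</sub> stays high in each comparison. It assumes that one comparison is launched on every edge of the 400 MHz reference, so that the CLK<sub>comp</sub> period is 1.25 ns. The function names are illustrative and are not taken from the design.

```python
REF_CLK_HZ = 400e6  # external reference clock frequency, as stated in the text


def comp_clock_period(ref_clk_hz=REF_CLK_HZ):
    """Period of CLK_comp when a comparison starts on both edges of CLK."""
    return 1.0 / (2.0 * ref_clk_hz)


def high_time(duty_cycle, ref_clk_hz=REF_CLK_HZ):
    """Time CLK_comp stays at '1' per comparison, for a duty cycle in 0..1."""
    return duty_cycle * comp_clock_period(ref_clk_hz)


# Duty cycles reported for the schematic and post-layout simulations
for label, duty in (("schematic", 0.210), ("post-layout", 0.303)):
    print(f"{label}: CLK_comp high for {high_time(duty) * 1e9:.3f} ns "
          f"per {comp_clock_period() * 1e9:.2f} ns period")
```

Under this assumption the high time is roughly 0.26 ns in the schematic simulation and 0.38 ns after layout, which leaves most of each 1.25 ns period available for resetting the comparator.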

#### 4.4.5.3 Output Buffer

The comparator's output stream consists of brief pulses at 800 MHz on the OUT<sup>+</sup> and OUT<sup>-</sup> lines. These high-frequency, short pulses cannot be directly detected by the FPGA, so an intermediate output buffer is used to convert the signals into two separate 400 MHz streams. This output buffer allows the FPGA to process the comparator's results by splitting the high-frequency signal into two phases, one for each half of the clock cycle. In this configuration, one output represents the comparator's signal during the first phase of the CLK cycle (when CLK= '1'), while the other represents the signal during the second phase (when CLK = '0').

The buffer circuit, shown in Fig. 4-24, consists of two latches with distinct base values: the top latch defaults to '0', and the bottom latch defaults to '1'. During each CLK cycle, the latches reset to their respective base values. If OUT<sup>+</sup> is high during the first phase (when CLK = '1'), the top latch is set to '1'. Conversely, during the second phase (when CLK = '0'), if  $OUT^-$  becomes high, the bottom latch is set to '0'. At the beginning of each CLK cycle, the latch outputs are sampled by flip-flops, ensuring that each pulse spans a single CLK cycle.



Fig. 4-24. Circuit diagram of the output buffer. There is a single clock cycle of delay between the input and the output of the buffer.

Simulation results, as shown in Fig. 4-25, demonstrate a predictable, single-cycle delay in the output buffer. This delay, though consistent, can be accounted for when reconstructing images for SEM applications. The output buffer's two channels are routed to separate FPGA inputs for processing. Although the final design would merge these signals using an OR gate to form a single 400 MHz bitstream, the prototype monitors each output independently to facilitate accurate classification and testing of signal integrity.



Fig. 4-25. Simulation results of the output buffer given every third output of the comparator is positive.
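The logical behaviour of the output buffer can be illustrated with a short behavioural model. This is a minimal sketch rather than a description of the transistor-level circuit. It assumes that every comparison asserts exactly one of OUT<sup>+</sup> or OUT<sup>-</sup>, that the flip-flops start at '0', and that the comparator decisions are supplied as a list with two entries per CLK cycle. The function name `demux_output_buffer` is hypothetical.

```python
def demux_output_buffer(decisions):
    """Split an 800 MHz decision stream into two 400 MHz phase streams.

    decisions: 0/1 values, two per CLK cycle (first taken while CLK = '1',
    second while CLK = '0'); 1 means OUT+ fired, 0 means OUT- fired.
    Returns (out_ph1, out_ph2). Both are delayed by one CLK cycle, so each
    list holds one more entry than the number of CLK cycles.
    """
    if len(decisions) % 2 != 0:
        raise ValueError("expected two comparator decisions per CLK cycle")
    out_ph1, out_ph2 = [0], [0]  # flip-flop outputs before the first sample (assumed '0')
    for k in range(0, len(decisions), 2):
        top_latch = 1 if decisions[k] else 0         # base '0', set by OUT+ in phase 1
        bottom_latch = 1 if decisions[k + 1] else 0  # base '1', cleared by OUT- in phase 2
        out_ph1.append(top_latch)                    # sampled at the next CLK cycle
        out_ph2.append(bottom_latch)
    return out_ph1, out_ph2


# Every third comparator decision is positive, as in the simulated case
ph1, ph2 = demux_output_buffer([1, 0, 0] * 4)
print(ph1)
print(ph2)
```

Each output list changes at most once per CLK cycle, which is what allows the FPGA to sample the two streams at 400 MHz.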

# 4.4.6 Pixel Overview

With the integration of all these functional blocks, the pixel is now operational and capable of detecting incoming signals. The complete functional block diagram of a single pixel, as introduced in this chapter, is illustrated in Fig. 4-26.



Fig. 4-26. Functional block diagram of the pixel.

The pixel's operation will be evaluated using an ideal detector, and the simulation results are presented in Fig. 4-27. In this scenario, the current pulse generated by the detector, representing an input signal, is labeled  $I_{sig}$ . While this signal is masked by the comparator's kickback, which is a common-mode disturbance, the differential component remains discernible as  $\Delta V_{in} = V_{dummy} - V_{Det}$ . This differential nature allows the comparator to effectively disregard the common-mode kickback, enabling it to make accurate decisions based on the incoming signal.

The outputs of the comparator serve critical functions within the system. They are used by the output buffer to generate the signals  $Out_{ph_1}$  and  $Out_{ph_2}$ . Simultaneously, these outputs are fed into the asynchronous reset circuit to produce the  $CLK_{comp}$  signal. Additionally, the reset generator utilizes the comparator outputs to reset the voltages on both the detector and dummy, thereby enabling the generation of a new threshold for the detector. This threshold is reflected in the  $\Delta V_{in}$  signal, which stabilizes around  $\sim 2.5$  mV in the absence of hits.

The first 200 ns of the simulation is allocated to calibrating the dynamic comparator. During this phase, the values of $V_C^+$ and $V_C^-$ are adjusted to ensure the comparator achieves a zero offset. Following this, a capacitor matching mechanism is activated for 350 ns, during which the configuration of the $C_{dummy}$ network is optimized to match the capacitance of $C_D$, including parasitics. After the matching process, offset cancellation is reactivated for an additional 200 ns to reaffirm the comparator's zero-offset state.



Fig. 4-27. Simulation results of a single pixel, using an ideal detector model.

Once the self-calibration procedures are complete, the system begins to process hits, following a recurring pattern of '1110 0011 0010'. The design undergoes extensive verification across all process corners and over a wide temperature range from 0 °C to 100 °C. Monte Carlo simulations are also conducted to validate the robustness of the calibration process. Furthermore, the clock phase for receiving hits is fine-tuned to ensure reliable functionality during hit arrivals. The impact of capacitance mismatches between the dummy and detector is evaluated as well, demonstrating that the system maintains proper operation even with a  $\pm 20\%$  variation in capacitance.

The final power consumption of a single pixel is measured at 200  $\mu$ W, marking a significant improvement over previously designed current mode readout circuits, which consumed 370  $\mu$ W. This reduction in power consumption highlights the efficiency of the current design while maintaining performance standards.

## 4.4.7 Conclusions

The open circuit operation mode of the PIN diode presents a compelling solution for high-sensitivity, low-power readout applications by leveraging the intrinsic junction capacitance for charge-to-voltage conversion. This approach eliminates the need for traditional current or charge amplifiers, thereby significantly enhancing the power efficiency of the readout frontend. The ability to perform signal conversion without active amplifiers underscores its suitability for energy-constrained systems.

However, the inherent memory effect of the detector in this mode necessitates resetting after each detection event. This operational constraint has been utilized to enable the use of dynamic comparators, which activate only at the end of each detection cycle. This selective activation helps reduce power consumption while maintaining high detection accuracy.

A notable design consideration is the temporary disengagement of the readout frontend during the offset compensation phase, leading to periodic deadtime. Nonetheless, this limitation can be tactically addressed in SEM applications by synchronizing offset compensation with natural scanning pauses, thereby mitigating performance trade-offs.

In summary, the open circuit mode offers an effective trade-off between power efficiency and system complexity, particularly for scenarios where power conservation and high detection accuracy are paramount. The design strategies presented in this chapter demonstrate how circuit-level innovations, including dynamic comparator operation and offset compensation synchronization, can address the inherent challenges of this mode, paving the way for robust and efficient readout solutions.

# 4.5 References

- [1] Y. Wang, Z. Dong, R.-L. Lai, and K. Kanai, "Semiconductor charged particle detector for microscopy," WO2019233991A1, Dec. 12, 2019.
- [2] L. Bouman, "High-Speed Readout Circuit for PIN Single Electron Detector in Voltage Mode," M.S. thesis, Dept. Microelectronics, TU Delft, Delft, the Netherlands, 2023.
- [3] M. Miyahara, Y. Asada, D. Paik, and A. Matsuzawa, "A low-noise self-calibrating dynamic comparator for high-speed ADCs," in 2008 IEEE Asian Solid-State Circuits Conference, Nov. 2008, pp. 269–272. doi: 10.1109/ASSCC.2008.4708780.
- [4] P. Nuzzo, F. De Bernardinis, P. Terreni, and G. Van der Plas, "Noise Analysis of Regenerative Comparators for Reconfigurable ADC Architectures," IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 55, no. 6, pp. 1441–1454, Jul. 2008, doi: 10.1109/TCSI.2008.917991.
- [5] R. Chen, A. Lee, Y. Hu, H. Xu, and X. Kou, "A 12-bit 75 MS/s Asynchronous SAR ADC with Gain-Boosting Dynamic Comparator," in 2024 IEEE International Symposium on Circuits and Systems (ISCAS), May 2024, pp. 1–5. doi: 10.1109/ISCAS58744.2024.10558594.
- [6] R. Wu, J. Huijsing, and K. Makinwa, Precision Instrumentation Amplifiers and Read-Out Integrated Circuits (Analog Circuits and Signal Processing). New York, NY, USA: Springer, 2013. doi: 10.1007/978-1-4614-3731-4.
- [7] H. Omran, R. T. El Afandy, M. Arsalan, and K. N. Salama, "Direct Mismatch Characterization of Femtofarad Capacitors," IEEE Transactions on Circuits and Systems II: Express Briefs, vol. 63, no. 2, pp. 151–155, Feb. 2016, doi: 10.1109/TCSII.2015.2468919.

# **5** Experimental Results

This chapter presents the experimental evaluation of the proposed readout frontends developed for high-precision charge detection. The fabricated prototype, implemented in a 40 nm CMOS process, serves as the Device Under Test (DUT), integrating two matrices of readout channels designed for short circuit and open circuit modes. The primary goal of these tests is to validate the DUT's ability to reliably detect charge signals from a PIN diode under stringent power and precision requirements. Comprehensive experiments assess key performance parameters such as gain, bandwidth, noise, and crosstalk, while also examining process-induced variations and noise coupling effects. The results offer insights for further design optimization.

# 5.1 Qualification Test Setup

The qualification test setup is designed to evaluate the performance of the fabricated readout circuits comprehensively. The DUT features a 40 nm CMOS chip integrating readout matrices and auxiliary components for flexible testing. The setup includes a power supply for biasing the chip, a current source for generating input charge signals, and an oscilloscope for real-time signal monitoring. Additionally, an FPGA-based Data Acquisition Board (DAB) is employed to program the DUT and perform comprehensive testing.

# 5.1.1 Goals of the Qualification Tests

The qualification tests aim to validate the ability of the proposed architectures to detect the charge signal from the PIN-diode and digitize it with high accuracy while adhering to a limited power budget. To achieve this, the readout channels are triggered using charge signals with tunable amplitudes, and the output signals from each functional block within the readout channel must be systematically monitored and analyzed.

Performance evaluation involves characterizing both analog and digital signals at various stages of the readout chain. This includes assessing the accuracy of the charge signal detection under critical parameters such as gain, bandwidth, noise, and threshold levels. The readout channels are implemented within a matrix configuration to enable a thorough analysis of additional noise sources, including pixel crosstalk and substrate noise coupling. These evaluations guide further refinements of the readout architecture, ensuring its suitability for high-performance SEM applications.

# 5.1.2 Device Under Test (DUT)

The DUT is a chip fabricated in 40 nm CMOS technology that includes two matrices of readout channels: a $3 \times 3$ matrix for the short circuit mode of operation and a $3 \times 4$ matrix for the open circuit mode of operation, as depicted in Fig. 5-1. Auxiliary blocks are included around the chip to enhance testability and facilitate the experimental qualifications. These blocks comprise a power regulator for biasing, a current source for generating test signals, a shift register and configuration switches for programming the readout channels and the peripheral blocks, and a dummy linear feedback shift register (LFSR) that introduces random digital noise into the chip substrate by generating random digital pulses synchronized with a given CLK<sub>In</sub> signal.

The block diagram of the DUT is shown in Fig. 5-2. The signal monitoring is facilitated by selecting the channel of interest using the *Row\_Sel* and *Col\_Sel* signals. The readout channels operating in short circuit mode are organized into three configurations. Group I and Group II are shielded from substrate-coupled noise using a deep nwell (DNW) layer, while Group II and Group III feature electrostatic discharge (ESD) protection at their input nodes, where the detector is bonded.

For readout channels operating in open circuit mode, four configurations are employed, with each row featuring a different value of auto-zeroing capacitor to optimize performance. This grouping ensures operational accuracy and robustness by tailoring the circuit configurations to specific functional and environmental requirements.

[Micrograph annotations: logic decoupling caps, decoupling caps, open circuit mode pixels (4×3), short circuit mode pixels (3×3), buffers.]

Fig. 5-1. Micrograph of the DUT including two matrices of readout channels.

[Block diagram labels: LVDS TX/RX, pulse generator, current regulator, power regulator, shift register, open circuit mode pixels (PIX 1 to PIX 12), short circuit mode pixels (PIX 1 to PIX 9, with DNW and ESD groupings), buffers, configuration switches, dummy LFSR.]

Fig. 5-2. Block diagram of the DUT including two matrices of readout channels and auxiliary blocks.

# 5.1.3 Test PCB

The external test setup devices are connected to the chip through isolation buffers implemented both on a test printed circuit board (PCB) and within the chip itself. These buffers serve to minimize noise injection from peripheral devices and prevent loading effects on the DUT. Communication between the DAB and the DUT is realized via a high-speed low-voltage differential signaling (LVDS) interface. To ensure noise immunity across the system, isolation buffers are used to channel all signals from the DAB. A photograph of the test PCB is provided in Fig. 5-3, while Fig. 5-4 illustrates the top-level overview of the test PCB structure.



Fig. 5-3. The test PCB designed for experimental qualification of the DUT.



Fig. 5-4. The top-level overview of the test PCB.

## 5.1.4 Detector Emulating Circuit (DEC)

For the purpose of qualifying the designed readout channels, the detector is substituted by an emulating circuit, providing freedom in tunability of the charge signal amplitude as well as the equivalent junction capacitance of the detector ( $C_D$ ) [1], [2].

The detector emulating circuit (DEC) consists of two functional blocks: a programmable capacitor network and a programmable current source. The capacitor network emulates the detector's equivalent junction capacitance ( $C_D$ ), adjustable between 30 fF and 50 fF during chip programming. The DEC generates fast current pulses to emulate the detector's charge signal for the readout channels.

The DEC is shown in Fig. 5-5. It comprises a current generator  $(I_{Internal})$  to supply the bias current, a programmable current mirror (ratio N) for adjusting pulse amplitude, a switch network to define pulse width based on the *Trigger* signal's duty cycle, and a current buffer to deliver pulses to the CSA input. The *Trigger* signal, generated by the DAB, creates random logical patterns ('0' and '1'), with a pulse width matching t<sub>Collection</sub>. The current generator and mirror are located at the chip periphery, while the other components are within each readout channel. The DCCS also supports external biasing (I<sub>External</sub>) controlled by the *I\_MODE* signal.

The equivalent charge of the DCCS-generated pulses is calculated as:

$$Q_{in} = t_{Collection} \times i_s \tag{5-1}$$

where $Q_{in}$ is the input charge and $i_s$ is the pulse amplitude. With $t_{Collection} = 1.8$ ns and $i_s$ ranging from 73 nA to 115 nA, the DEC produces charge signals of $Q_{in} = 130$ aC to 210 aC (800 to 1300 electrons).

To achieve the required current levels, the DEC is biased by a low-noise precision current source. The programmable mirroring factor N of the current regulator relaxes biasing constraints. With N adjustable between 9 and 14, the internal current source  $(I_{Internal} = 1 \,\mu A)$  generates the desired current range. For instance, with N = 11, the DEC produces a current pulse of  $i_s \approx 90$  nA, corresponding to 160 aC charge. Figure 5-6 shows the simulated current pulses across the specified range.



Fig. 5-5. Simplified detector emulating circuit (DEC) diagram.

The DEC can also be biased using an external precision current source (Yokogawa GS200), offering greater flexibility for tuning  $i_s$  and evaluating the current regulator's accuracy. With this setup, the current amplitude can be adjusted by modifying N or varying the external current source between 0.8  $\mu$ A and 1.3  $\mu$ A, with N = 11.



Fig. 5-6. Simulated DEC current pulses for tunable range of  $i_s$  and  $I_{Internal} = 1 \mu A$ .
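The arithmetic behind Eq. (5-1) and the quoted operating points can be reproduced with the short sketch below. It assumes that the programmable mirror divides the bias current by N, which is consistent with the quoted values ($1\,\mu A / 11 \approx 90$ nA) but is not stated explicitly in the text. The function names are illustrative.

```python
E_CHARGE = 1.602176634e-19  # elementary charge in coulombs
T_COLLECTION = 1.8e-9       # charge collection time in seconds (from the text)


def mirrored_current(i_bias, n):
    """Pulse amplitude i_s for bias current i_bias and mirror ratio n.

    Assumption: the mirror scales the bias current down by a factor n.
    """
    return i_bias / n


def pulse_charge(i_s, t_collection=T_COLLECTION):
    """Input charge Q_in = t_Collection * i_s, as in Eq. (5-1)."""
    return t_collection * i_s


def electrons(charge):
    """Number of elementary charges contained in a given charge."""
    return charge / E_CHARGE


# Internal bias of 1 uA with N = 11, the example quoted in the text
i_s = mirrored_current(1e-6, 11)
q_in = pulse_charge(i_s)
print(f"i_s = {i_s * 1e9:.1f} nA, Q_in = {q_in * 1e18:.1f} aC, "
      f"about {electrons(q_in):.0f} electrons")

# End points of the quoted amplitude range
for amp in (73e-9, 115e-9):
    q = pulse_charge(amp)
    print(f"{amp * 1e9:.0f} nA gives {q * 1e18:.1f} aC, about {electrons(q):.0f} electrons")
```

The same functions can be evaluated for an external bias between 0.8 $\mu$A and 1.3 $\mu$A with N = 11 to estimate the amplitudes reachable with the external source.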

# 5.1.5 Pulse Time Width Modulator

The DEC is driven by a signal generated by the DAB, featuring a fixed pulse width of 2.5 ns. However, since the time width of the charge signal must align with the detector's charge collection time ( $t_{Collection} = 1.8 \text{ ns}$ ), an intermediate block is required to adjust the pulse duration.

This intermediate block generates the trigger signal with a programmable length for the DEC. A schematic overview of this block is shown in Fig. 5-7. It comprises a rising-edge detector and a programmable delay line that allows selection of the desired pulse width. To ensure robust operation, an enable signal is integrated, and buffers are strategically placed within the delay line to maintain adequate signal strength. The resulting waveforms from an input step are depicted in Fig. 5-8, while Fig. 5-9 illustrates the dependence of the DEC trigger signal's time width on the programmable *SEL*<0:2> signal.



Fig. 5-7. Pulse width modulator schematic diagram.





Fig. 5-8. Simulated resulting waveforms after the pulse width modulator.

Fig. 5-9. Dependence of the *Trigger* signal's time width on the programmable *SEL*<0:2> signal.
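The principle of shaping a pulse with an edge detector and a delay line can be illustrated with a sampled logic model. The sketch below uses the textbook form of a rising-edge detector (the input ANDed with an inverted, delayed copy of itself), which may differ from the exact gate-level circuit of Fig. 5-7. The 100 ps time step and the chosen delay are assumptions made only for illustration.

```python
STEP_PS = 100  # simulation time step in picoseconds (assumed)


def shape_trigger(samples, delay_steps):
    """Return samples AND NOT(delayed samples) for a sampled logic waveform.

    The output goes high on a rising edge of the input and returns low once
    the delayed copy of the input also goes high, so the output pulse width
    equals the delay (as long as the input pulse is longer than the delay).
    """
    out = []
    for k, value in enumerate(samples):
        delayed = samples[k - delay_steps] if k >= delay_steps else 0
        out.append(1 if (value and not delayed) else 0)
    return out


# 2.5 ns wide input pulse from the DAB, shaped by a 1.8 ns delay
dab_pulse = [0] * 5 + [1] * 25 + [0] * 20
trigger = shape_trigger(dab_pulse, delay_steps=18)
print(sum(trigger) * STEP_PS, "ps")  # prints: 1800 ps
```

In this model, selecting a different delay changes the output pulse width directly, which mirrors the role of the *SEL*<0:2> setting.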

## 5.1.6 Multiple Pulse Generator

The pulse time width modulator can only generate a single pulse every other clock cycle. However, in practical scenarios, signal hits may arrive as closely as 2.5 ns apart.
To replicate these conditions and thoroughly test the system, the multiple pulse generator block is employed. This generator converts a single trigger signal from the pulse time width modulator into two or three consecutive trigger signals, with programmable spacing between them. The circuit implementation is depicted in Fig. 5-10.



Fig. 5-10. Schematic diagram of the multiple pulse generator block.

The second trigger signal is generated after the first programmable delay block, while the third trigger signal follows after the second programmable delay block. Both the second and third trigger signals can be independently activated or deactivated using the *DOUBLE\_EN*<0:4> and *TRIPLE\_EN*<0:4> signals, respectively. This flexibility allows precise control over the number and timing of the generated pulses. A simulated waveform demonstrating this functionality is shown in Fig. 5-11. It is worth noting that, due to the switching time required by the multiplexer, each successive pulse experiences a slight reduction in width, which has been accounted for in the design.



Fig. 5-11. The post-layout simulation results of the multiple pulse generator block.
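The timing relationship between the generated triggers can be summarised with a small sketch. It treats the two enable controls as simple booleans and the two delay blocks as ideal delays, which is a simplification of the 5-bit *DOUBLE\_EN* and *TRIPLE\_EN* controls. The pulse-width reduction caused by the multiplexer is not modelled, and the function name is hypothetical.

```python
def trigger_start_times(t0_ns, delay1_ns, delay2_ns, double_en=True, triple_en=False):
    """Start times (in ns) of the triggers produced from one input trigger.

    The second trigger follows the first delay block; the third follows the
    second delay block, which is placed after the first one.
    """
    times = [t0_ns]
    if double_en:
        times.append(t0_ns + delay1_ns)
    if triple_en:
        times.append(t0_ns + delay1_ns + delay2_ns)
    return times


# Three hits spaced 2.5 ns apart, the closest spacing mentioned in the text
print(trigger_start_times(0.0, 2.5, 2.5, double_en=True, triple_en=True))
```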

#### 5.1.7 Shift Register

The chip incorporates a 70-bit shift register to enable programmable configuration and dynamic adjustment of the readout channel parameters. This shift register functions as a serial-to-parallel converter, allowing input data to be sequentially loaded and then distributed across 70 control bits. Each bit corresponds to a specific programmable element within the chip, such as switches or control lines, which define the operational settings of the readout channels. By leveraging this architecture, readout parameters such as gain, threshold levels, and capacitor values can be easily modified without requiring hardware modifications or external components.

The 70-bit shift register also governs the status of switches implemented within the readout channels, facilitating real-time reconfiguration. This capability is particularly useful for evaluating multiple channel configurations and mitigating effects such as substrate noise and pixel crosstalk. Programming is achieved through a serial input interface, where the data is shifted in synchronously with a clock signal, and once all bits are loaded, a latch signal stores the settings in the respective control elements.
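The serial-in, parallel-out behaviour with a separate latch step can be sketched as follows. This is a behavioural illustration only. The bit ordering (last control bit shifted in first) and the class and method names are assumptions, not details taken from the chip.

```python
class ConfigShiftRegister:
    """Behavioural model of a serial-in, parallel-out configuration register."""

    def __init__(self, length=70):
        self.length = length
        self._shift = [0] * length   # internal shift stages
        self.outputs = [0] * length  # latched control bits seen by the chip

    def clock(self, bit):
        """Shift one bit in on a clock edge; the oldest bit falls off the end."""
        self._shift = [1 if bit else 0] + self._shift[:-1]

    def latch(self):
        """Copy the shift stages to the control outputs."""
        self.outputs = list(self._shift)

    def load(self, bits):
        """Shift in a full configuration word and latch it.

        Bits are clocked in last-first so that outputs[i] == bits[i] afterwards.
        """
        if len(bits) != self.length:
            raise ValueError(f"expected {self.length} bits, got {len(bits)}")
        for bit in reversed(bits):
            self.clock(bit)
        self.latch()


reg = ConfigShiftRegister()
word = [i % 2 for i in range(70)]  # arbitrary example pattern
reg.load(word)
print(reg.outputs == word)  # prints: True
```

Keeping the latch separate from the shift stages means the control outputs do not toggle while a new word is being shifted in.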

#### 5.1.8 Linear Feedback Shift Register

To evaluate the readout architecture's robustness against substrate noise, a dummy Linear Feedback Shift Register (LFSR) is integrated into the chip. The LFSR generates random digital pulses synchronized with an external  $CLK_{In}$  signal, thereby introducing controlled random noise into the substrate. By injecting digital noise, the LFSR emulates real-world operating conditions where noise from digital circuits can interfere with sensitive analog readout channels. Fig. 5-12 presents the schematic diagram of the 12-bit LFSR.



Fig. 5-12. Schematic diagram of the 12-bit LFSR.

The generated pulses mimic stochastic digital transitions, which couple into the substrate and affect the performance of the readout channels. This setup enables the systematic study of substrate noise coupling and its impact on the critical parameters of the readout system, such as SNR, gain stability, and noise floor. By varying the LFSR VDD and CLK<sub>In</sub> frequency, the amplitude and frequency characteristics of the noise can be tailored, allowing for comprehensive testing and validation of the readout architecture's resilience. This feature plays a pivotal role in identifying potential noise-induced vulnerabilities and optimizing the system's design to mitigate substrate noise effects.
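For illustration, a 12-bit Fibonacci LFSR can be modelled in a few lines. The tap positions of the on-chip LFSR are given only in Fig. 5-12, so the sketch below uses a commonly tabulated maximal-length feedback polynomial, $x^{12} + x^{11} + x^{10} + x^4 + 1$, as a stand-in. The seed and function names are likewise assumptions.

```python
def lfsr12_step(state):
    """Advance a 12-bit Fibonacci LFSR by one clock cycle.

    Taps 12, 11, 10 and 4 (assumed; the on-chip taps may differ).
    """
    bit = (state ^ (state >> 1) ^ (state >> 2) ^ (state >> 8)) & 1
    return (state >> 1) | (bit << 11)


def lfsr12_bits(seed=0x001, count=16):
    """Return `count` output bits (the LSB of each successive state)."""
    if not 0 < seed < 4096:
        raise ValueError("seed must be a non-zero 12-bit value")
    state, bits = seed, []
    for _ in range(count):
        bits.append(state & 1)
        state = lfsr12_step(state)
    return bits


def lfsr12_period(seed=0x001):
    """Count the clock cycles until the register returns to the seed state."""
    state, steps = lfsr12_step(seed), 1
    while state != seed:
        state = lfsr12_step(state)
        steps += 1
    return steps


print(lfsr12_bits(count=24))
print(lfsr12_period())
```

With a maximal-length polynomial the sequence repeats only after $2^{12} - 1 = 4095$ clock cycles, so the injected switching activity appears pseudo-random over short observation windows.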

#### 5.1.9 Data Acquisition Board

In the experimental evaluation of the DUT, the DAB generates a series of trigger signals and records the output data from the DUT. The DAB, based on an FPGA platform (DE10-Standard), is responsible for controlling the sequence of operations and interacting with the DUT to assess its performance. The trigger signals are generated based on digital codes stored in a register labeled 'trigger register', which precisely delineates the status of the detector within 2.5 ns timeframes. A logic state of '1' indicates the detection of a BSE landing on the detector surface at the start of each timeframe. In this test series, it is assumed that the particle always lands on the detector surface at the beginning of the timeframe. This configuration of the DAB allows for easy adjustments of chip settings without the need for resynthesizing the FPGA for each test condition.

To validate the accuracy of the detection timing, the DAB uses a reference clock generator operating at $CLK_{In} = 400$ MHz to achieve the necessary 2.5 ns resolution. The DAB simultaneously collects the digital data generated by the readout frontend, storing it in a register labeled 'data register'. The DAB then compares the logical state patterns in both the trigger register (which indicates the landing of a BSE on the detector surface) and the data register (which indicates the detection of the BSE by the readout channel in the DUT). Comparing the digital data stored in these two registers enables the calculation of the detection error rate and operational accuracy of the readout frontend [2]. The performance algorithm of the DAB is depicted in Fig. 5-13.

In addition to the basic trigger functionality, the DAB is responsible for configuring the shift register with the appropriate settings, initiating calibration when necessary, and controlling the execution of different tests. The full FPGA code is provided in Appendix B.



Fig. 5-13. Simplified block diagram of the DAB.
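The comparison between the two registers can be expressed compactly. The sketch below is not the FPGA implementation of Appendix B. It is a minimal illustration that assumes both registers have already been aligned in time (any fixed latency removed) and hold one bit per 2.5 ns timeframe. The function name and the example data register are hypothetical.

```python
def detection_stats(trigger_bits, data_bits):
    """Compare the trigger register with the data register, slot by slot.

    Returns the number of missed hits (trigger '1', data '0'), false hits
    (trigger '0', data '1') and the overall error rate per timeframe.
    """
    if len(trigger_bits) != len(data_bits):
        raise ValueError("registers must cover the same number of timeframes")
    pairs = list(zip(trigger_bits, data_bits))
    missed = sum(1 for t, d in pairs if t == 1 and d == 0)
    false_hits = sum(1 for t, d in pairs if t == 0 and d == 1)
    total = len(pairs)
    error_rate = (missed + false_hits) / total if total else 0.0
    return {"missed": missed, "false": false_hits, "error_rate": error_rate}


trigger_register = [1, 1, 1, 0, 0, 0, 1, 1, 0, 0, 1, 0]
data_register = [1, 1, 0, 0, 0, 0, 1, 1, 0, 1, 1, 0]  # one miss and one false hit
print(detection_stats(trigger_register, data_register))
```

Separating missed hits from false hits is useful when the discriminator threshold is swept, since the two error types move in opposite directions.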

#### 5.1.10 Poissonian-Distributed Trigger Pulses

In the detector, the charge signals are generated randomly. However, in the DEC, these signals are generated based on the trigger signals provided by the DAB. To emulate the random arrival of events at the detector, the DAB must send the trigger pulses at random time intervals. Furthermore, a trimmable delay can be applied to the DEC trigger pulses through chip programming, deferring the charge signal's arrival at each time slot. However, this delay is uniform for all trigger pulses.



Fig. 5-14. Pattern of logical '0' and '1' with a Poissonian distribution and the corresponding trigger pulses generated by the DAB.

The BSEs land on the detector following a Poissonian distribution [3], so the trigger pulses must exhibit the same behavior. To achieve this, a pattern of logical '0' and '1' with a Poissonian distribution can be generated using MATLAB. This pattern is then scanned by the DAB to produce Poissonian-distributed trigger signals. This approach simplifies the experimental setup, as it allows for a direct comparison between the pattern of the DEC trigger signal and the one generated by the comparator, facilitating the evaluation of the DUT operation. Figure 5-14 illustrates the pattern of logical '0' and '1' with a Poissonian distribution, along with the corresponding trigger pulses generated by the DAB.
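One way to produce such a pattern is sketched below in Python; the patterns used in this work were generated in MATLAB, so this is an equivalent illustration and not the original script. It assumes that each 2.5 ns timeframe is marked '1' when at least one arrival occurs in it, which for a Poisson process with a mean of $\lambda$ arrivals per timeframe happens with probability $1 - e^{-\lambda}$. The mean rate used in the example is arbitrary.

```python
import math
import random


def poisson_hit_pattern(mean_hits_per_slot, n_slots, seed=0):
    """Generate a 0/1 pattern for Poisson-distributed arrivals.

    A slot is '1' when at least one arrival falls in it, which occurs with
    probability 1 - exp(-mean_hits_per_slot). Multiple arrivals in one slot
    collapse to a single '1'.
    """
    if mean_hits_per_slot < 0:
        raise ValueError("mean_hits_per_slot must be non-negative")
    p_hit = 1.0 - math.exp(-mean_hits_per_slot)
    rng = random.Random(seed)  # fixed seed keeps the pattern reproducible
    return [1 if rng.random() < p_hit else 0 for _ in range(n_slots)]


pattern = poisson_hit_pattern(mean_hits_per_slot=0.5, n_slots=32, seed=1)
print("".join(str(b) for b in pattern))
```

A fixed seed makes the pattern reproducible, so the same sequence can be loaded into the trigger register and reused when comparing different chip settings.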

## 5.2 Experimental Qualification Results

This section presents the experimental assessment and evaluation of the readout channels introduced in Chapters 3 and 4, applying the test setup introduced in section 5.1. To achieve this, the output signals from each functional block are plotted and analyzed under various test scenarios. Furthermore, the detection accuracy of charge signals through the readout channels is examined as a function of the discriminator threshold levels. The impact of additional error sources, such as pixel crosstalk and substrate noise coupling, is also experimentally investigated and validated. In all conducted tests, each charge signal generated by the DEC consistently maintains a time width of 1.8 ns (corresponding to the detector charge collection time) while the signal amplitude and detector capacitance ( $C_D$ ) are tunable.

# 5.2.1 Experimental Qualification of Readout Channels Operating in Short Circuit Mode

#### 5.2.1.1 Characterization of CSA

For a trigger signal generated by the DAB and $C_D = 50$ fF, Fig. 5-15 presents the measured voltage signal after the CSA operating in fast (red curve) and slow (blue curve) modes with $C_F = 5$ fF (high-gain mode). Similarly, Fig. 5-16 presents the measured voltage signal after the CSA operating in fast (red curve) and slow (blue curve) modes with $C_F = 10$ fF (low-gain mode). A summary of signal characteristics is presented in Table 5-1. As seen, in both the slow and fast modes, the CSA fulfills the SNR and rise time requirements for either gain mode.

As discussed in Chapter 3, the transfer function and noise performance of the CSA are strongly influenced by the total input capacitance, which comprises the detector capacitance ( $C_D$ ) and additional parasitic capacitances. Figure 5-17 illustrates the measured noise at the CSA output as a function of the total input capacitance, varying from 30 fF to 50 fF, for different CSA configuration modes. The data highlights the direct correlation between increased input capacitance and elevated noise levels, emphasizing the importance of minimizing parasitic capacitances to optimize the CSA's noise performance.



Fig. 5-15. Measured voltage signal after the CSA with  $C_F = 5$  fF (high-gain mode) for the slow and fast modes in blue and red, respectively.



Fig. 5-16. Measured voltage signal after the CSA with  $C_F = 10$  fF (low-gain mode) for the slow and fast modes in blue and red, respectively.

Figure 5-18 illustrates the measured rise time $(t_r)$ of the CSA output signal as a function of the detector capacitance $(C_D)$ for various CSA configuration modes, with the input charge amplitude fixed at 160 aC. The results reveal a clear trend: as the detector capacitance increases, the rise time also increases, primarily due to the combined effects of reduced bandwidth and the additional capacitive load. Furthermore, the measurements indicate that the CSA operating in slow mode exhibits a significantly longer rise time compared to faster configurations, emphasizing the strong dependence of $t_r$ on the CSA bandwidth.



Fig. 5-17. Measured noise at the CSA output as a function of the total input capacitance, varying from 30 fF to 50 fF, for different CSA configuration modes.



Fig. 5-18. Measured signal rise time $t_r$ after the CSA as a function of detector capacitance $C_D$ for different configuration modes of the CSA, with the input charge fixed at 160 aC.

| CSA Mode                          | High Gain, Slow | High Gain, Fast | Low Gain, Slow | Low Gain, Fast |
|-----------------------------------|-----------------|-----------------|----------------|----------------|
| V<sub>Amp</sub> [mV]              | 29.45           | 28.53           | 16.47          | 13.64          |
| $\sigma_{Noise}$ [mV<sub>rms</sub>] | 1.43          | 1.38            | 0.91           | 0.82           |
| ENC [e<sub>rms</sub>]             | 49              | 48              | 55             | 60             |
| SNR                               | 20.59           | 20.67           | 18.1           | 16.63          |
| t<sub>r</sub> [ns]                | 2.56            | 2.41            | 2.27           | 2.11           |
| t<sub>tail</sub>\* [ns]           | 286.9           | 159.3           | 537.6          | 249.6          |

Table 5-1. Measured characteristics of the voltage signal after the CSA.

\*Discharge tail of the CSA voltage signal
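The SNR and ENC rows of Table 5-1 can be cross-checked from the amplitude and noise rows. The sketch below assumes that SNR is the ratio $V_{Amp}/\sigma_{Noise}$ and that ENC is the input charge divided by the SNR, with an input charge of 160 aC; neither relation nor the input charge is stated explicitly next to the table, so both are assumptions, and all names are illustrative.

```python
E_CHARGE = 1.602176634e-19  # elementary charge [C]
Q_IN = 160e-18              # ASSUMED input charge for Table 5-1 [C]

# (V_amp [mV], sigma_noise [mV rms]) copied from Table 5-1
TABLE_5_1 = {
    ("high", "slow"): (29.45, 1.43),
    ("high", "fast"): (28.53, 1.38),
    ("low", "slow"): (16.47, 0.91),
    ("low", "fast"): (13.64, 0.82),
}


def snr(v_amp_mv, sigma_mv):
    """Signal-to-noise ratio as amplitude over rms noise (assumed definition)."""
    return v_amp_mv / sigma_mv


def enc_electrons(v_amp_mv, sigma_mv, q_in=Q_IN):
    """Equivalent noise charge in electrons, taken as Q_in / SNR (assumed definition)."""
    return (q_in / E_CHARGE) / snr(v_amp_mv, sigma_mv)


for mode, (v_amp, sigma) in TABLE_5_1.items():
    print(mode, round(snr(v_amp, sigma), 2), round(enc_electrons(v_amp, sigma), 1))
```

Under these assumptions the computed values agree with the tabulated SNR to two decimals and with the tabulated ENC to within one electron.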

As the direct interface with the detector, the CSA plays a pivotal role in ensuring a sufficient SNR across all operating conditions. This requirement becomes particularly critical in the worst-case scenario, where the detector produces a charge signal that is 20% below the nominal value, equivalent to approximately 130 aC (or 800 electrons). In such cases, the CSA must effectively amplify the weak signal while maintaining a robust SNR. Figure 5-19 presents the experimentally measured SNR of the CSA as a function of the quantized levels of the input charge signal for various CSA configuration modes and $C_D = 50$ fF.
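As a quick check of the worst-case figure, assuming the nominal charge is the 160 aC used throughout these measurements (the nominal value is not restated in this paragraph):

$$Q_{\text{worst}} = 0.8 \times 160\ \text{aC} = 128\ \text{aC}, \qquad N_{e} = \frac{128 \times 10^{-18}\ \text{C}}{1.602 \times 10^{-19}\ \text{C}} \approx 799$$

which rounds to the quoted 130 aC and 800 electrons.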

Furthermore, Figure 5-20 depicts the measured SNR of the CSA as a function of the quantized levels of the input charge signal for different detector capacitance ( $C_D$ ) values, with the CSA configured in high-gain and fast modes. The results reveal a clear trend: the SNR diminishes as  $C_D$  increases. This behavior can be attributed to the additional capacitive loading, which degrades the CSA bandwidth and amplifies noise contributions, underscoring the critical trade-off between detector capacitance and signal fidelity in the readout interface.

Studying the CSA's response to consecutive trigger pulses provides critical insights into its performance under conditions of signal pileup, a common challenge in high-rate applications. To evaluate this behavior, the DEC is triggered using two logical patterns: '1100' and '1010'. The '1100' pattern tests the CSA's response when triggered in consecutive time frames, while the '1010' pattern introduces an idle time frame between two trigger pulses. By analyzing the amplitude of the voltage signals generated in each case, the extent of gain compression and amplitude loss due to idle time frames can be quantified.



Fig. 5-19. Measured SNR of the CSA as a function of the quantized levels of the input charge signal for various CSA configuration modes and  $C_D = 50$  fF.



Fig. 5-20. Measured SNR of the CSA as a function of the quantized levels of the input charge signal for different C<sub>D</sub> values, with the CSA configured in high-gain and fast modes.



Fig. 5-21. Voltage signals of the charge-sensitive ROIC for '1100' and '1010' logical trigger patterns in blue and red, respectively, with the CSA configured in slow and high-gain modes.

Figure 5-21 presents the CSA voltage signals corresponding to these two trigger patterns. The blue line represents the response for the '1100' pattern, while the red line shows the response for the '1010' pattern when the CSA operates in slow and high-gain mode. For the '1100' pattern, the amplitude of the second voltage signal is measured at 58.78 mV, while for the '1010' pattern, it is slightly reduced to 58.57 mV. Comparing the amplitudes of the first and second pulses in the '1100' pattern reveals a 0.4% gain compression under signal pileup conditions. This behavior is observed due to the nonlinear transfer characteristics and limited dynamic range of the CSA. Additionally, a detailed comparison of the two patterns indicates an amplitude loss of 0.21 mV within 2.5 ns, attributed to the idle time frame in the '1010' pattern. This loss reflects the CSA's transient behavior and highlights the challenges associated with idle intervals, which can degrade signal integrity.
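One way to reproduce the two figures quoted above is sketched below. It assumes that the first-pulse amplitude equals the 29.45 mV single-pulse value of Table 5-1 (high-gain, slow mode), which this paragraph does not state, and that gain compression is defined as the relative shortfall of the second voltage step with respect to the first. Function and variable names are illustrative.

```python
V_SINGLE_MV = 29.45     # ASSUMED first-pulse amplitude (Table 5-1, high gain, slow)
V_PEAK_1100_MV = 58.78  # second-pulse amplitude for the '1100' pattern
V_PEAK_1010_MV = 58.57  # second-pulse amplitude for the '1010' pattern


def gain_compression_pct(v_first_mv, v_piled_peak_mv):
    """Relative shortfall of the second voltage step versus the first, in percent."""
    second_step = v_piled_peak_mv - v_first_mv
    return 100.0 * (v_first_mv - second_step) / v_first_mv


def idle_frame_loss_mv(v_consecutive_mv, v_with_idle_mv):
    """Amplitude lost when an idle time frame separates the two pulses."""
    return v_consecutive_mv - v_with_idle_mv


print(round(gain_compression_pct(V_SINGLE_MV, V_PEAK_1100_MV), 1))  # percent
print(round(idle_frame_loss_mv(V_PEAK_1100_MV, V_PEAK_1010_MV), 2))  # mV
```

With these inputs the sketch gives about 0.4% compression and a 0.21 mV loss, matching the values in the text.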

#### 5.2.1.2 Characterization of the Active Shaping Filter

Figure 5-22 illustrates the measured voltage signals of the active shaping filter for a low-pass filter capacitance of $C_{LPF} = 500$ fF and a reference voltage of $V_{ref} = 450$ mV, generated in response to a single trigger pulse from the DAB. Two programming modes of the CSA (high-gain with slow and fast configurations) are evaluated to assess the shaping filter's performance. These measurements reveal that the active shaping filter effectively compresses the signal in the time domain while maintaining a relatively consistent time-width across different CSA operating modes. The elimination of the CSA signal's long tail ensures a more compact signal suitable for further processing.



Fig. 5-22. Measured voltage signal of the active shaping filter for $C_{LPF} = 500$ fF and $V_{ref} = 450$ mV for a high-gain CSA programmed in slow (blue line) and fast (red line) modes.

Despite the slight variations in amplitude and time-width between modes, the generated signals exhibit time-widths larger than the 2.5 ns target timeframe for discrimination. However, as confirmed by Table 5-2, the signal time-width at the discrimination level—set at eight times the noise power—meets the required specification. This observation highlights the active shaping filter's ability to adaptively refine the signal, ensuring that it remains within the acceptable limits for subsequent discrimination and processing stages. The experimental results validate the active shaping filter's functionality in both modes of the CSA operation.

| CSA Mode                                        | High Gain, Slow | High Gain, Fast | Low Gain, Slow | Low Gain, Fast |
|-------------------------------------------------|-----------------|-----------------|----------------|----------------|
| V<sub>Amp</sub> [mV]                            | 337.4           | 313.9           | 186.5          | 147.3          |
| $\sigma_{Noise}$ [mV<sub>rms</sub>]             | 16.63           | 14.85           | 10.35          | 8.84           |
| SNR                                             | 20.2            | 21.1            | 18.1           | 16.6           |
| t<sub>width</sub> [ns]                          | 3.72            | 3.64            | 2.61           | 2.43           |
| t<sub>width</sub> @ 8$\sigma_{Noise}$ [ns]      | 2.25            | 2.17            | 1.22           | 1.15           |

Table 5-2. Characteristics of the active shaping filter for  $C_{LPF} = 500$  fF and  $V_{ref} = 450$  mV and different programming modes of the CSA.

Figure 5-23 demonstrates the dependency of the signal time-width after the active shaping filter on the value of the trimmable capacitance ( $C_{LPF}$ ) in the BLR chain for various configuration modes of the CSA. The results indicate a clear trend: as the  $C_{LPF}$  increases, the time-width of the output voltage signal broadens. This behavior aligns with the expected dynamics of the low-pass filter in the feedback network, where a larger  $C_{LPF}$  effectively reduces the cut-off frequency, extending the signal duration. This feature provides design flexibility, enabling fine-tuning of the signal time-width to meet specific application requirements.



Fig. 5-23. Measured signal time-width after the active shaping filter as a function of the capacitor C<sub>LPF</sub> in the BLR chain for various configuration modes of the CSA.



Fig. 5-24. Measured SNR after the active shaping filter as a function of the quantized levels of the input charge signal for various values of  $C_{LPF}$  in the BLR chain.

Figure 5-24 presents the measured SNR after the active shaping filter as a function of the quantized levels of the input charge signal for various values of the trimmable capacitance ( $C_{LPF}$ ) in the BLR chain. In these measurements, the CSA is configured in high-gain and slow modes. The results reveal a noticeable decline in SNR with increasing  $C_{LPF}$  values, attributed to the widening of the filter passband. This observation underscores the trade-off between bandwidth expansion and noise suppression in the shaping filter design.

Probing the performance of the active shaping filter under a sequence of consecutive trigger pulses is crucial to evaluating its ability to compensate for ISI-induced errors. This test scenario demonstrates how the shaping filter handles harsh conditions with closely spaced signals, ensuring minimal signal distortion and accurate discrimination. Figure 5-25 presents the voltage signals generated by the active shaping filter for three consecutive trigger pulses, with the CSA programmed in both slow and fast modes. Despite the filter's ability to significantly reduce signal pileup, a small residual signal remains in subsequent timeframes. This is attributed to the generated voltage signals slightly exceeding the targeted timeframe of 2.5 ns. However, the amplitude of the residual signal in the following timeframe is only 7% of the maximum signal value, rendering the pileup effect negligible. Consequently, the shaping filter effectively mitigates ISI-induced errors and produces well-defined signals that can fit within 2.5 ns timeframes after discrimination.

When the threshold level  $(V_{th})$  is set within the appropriate range, the discriminator can accurately generate three distinct digital pulses corresponding to the three trigger pulses. This demonstrates the system's ability to maintain high detection accuracy even in the presence of consecutive signals.

Interestingly, as shown in Fig. 5-25, the amplitudes of the second and third signals do not exhibit a strictly linear relationship. This deviation is caused by gain compression in the CSA, as noted in the previous subsection. The CSA does not provide uniform gain for successive signals due to its nonlinear transfer characteristics and limited dynamic range, leading to slight variations in the amplitude ratios of the residual signals to the pileup. Nonetheless, the residual errors are sufficiently small to ensure reliable operation in practical applications.



Fig. 5-25. Measured voltage signals after the active shaping filter for three consecutive trigger pulses and  $C_{LPF} = 500$  fF for a high gain CSA programmed in slow (blue line) and fast (red line) modes.

Table 5-3 summarizes the DC offset values measured at the output of the CSA and the active shaping filter for various programming modes of the CSA. As shown, the DC baseline offset at the CSA output is significantly larger than the amplitude of the voltage signals and the noise power. This large offset poses challenges for accurate signal discrimination, as it may hinder the differentiation of small signal variations from the baseline level. The active shaping filter effectively addresses this issue through the implementation of its BLR chain and internal negative feedback loop. These mechanisms work synergistically to suppress the DC offset by a factor of 17. This substantial reduction in the baseline offset is crucial for improving the accuracy of signal discrimination, as it minimizes interference from the DC level and enhances the filter's ability to isolate the true signal components.

| CSA Program Mode                    | High Gain, Slow | High Gain, Fast | Low Gain, Slow | Low Gain, Fast |
|-------------------------------------|-----------------|-----------------|----------------|----------------|
| V<sub>Offset,CSA</sub> [mV]         | 2.7             | 5.1             | 2.4            | 5.6            |
| V<sub>Offset,Shaper</sub> [mV]      | 0.16            | 0.29            | 0.14           | 0.33           |

Table 5-3. Measured DC offset after the CSA and active shaping filter.
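The quoted suppression factor can be checked against Table 5-3 with a short calculation. The sketch below simply divides the CSA offset by the shaper offset for each mode and averages the four ratios; the averaging step is an assumption about how the single figure of 17 was obtained.

```python
# (offset after CSA [mV], offset after shaper [mV]) copied from Table 5-3
OFFSETS_MV = {
    ("high", "slow"): (2.7, 0.16),
    ("high", "fast"): (5.1, 0.29),
    ("low", "slow"): (2.4, 0.14),
    ("low", "fast"): (5.6, 0.33),
}


def suppression_factor(v_csa_mv, v_shaper_mv):
    """Ratio by which the shaper reduces the DC offset seen at the CSA output."""
    return v_csa_mv / v_shaper_mv


ratios = {mode: suppression_factor(*pair) for mode, pair in OFFSETS_MV.items()}
mean_ratio = sum(ratios.values()) / len(ratios)

for mode, ratio in ratios.items():
    print(mode, round(ratio, 1))
print("mean", round(mean_ratio, 1))
```

The per-mode ratios fall between roughly 16.9 and 17.6, consistent with the factor of 17 stated above.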

These results highlight the active shaping filter's ability to stabilize the DC level, mitigate baseline drift, and ensure accurate signal discrimination. Its robust design effectively handles ISI, processes signals within the required 2.5 ns timeframes, minimizes signal pileup, and demonstrates resilience against gain compression. These characteristics make the active shaping filter a reliable and indispensable solution for high-speed, high-precision detection and imaging applications.

#### 5.2.1.3 Charge Detection Challenges

After characterizing the core building blocks of the readout channel and identifying the optimal configuration to enhance performance, further optimization of functionality and detection accuracy requires addressing and mitigating additional error sources, particularly noise. Extrinsic noise arises from digital switching noise such as substrate coupled noise and crosstalk among the readout channels in the chip. This part explores the impact of these additional noise sources on the operational accuracy of the readout channels, emphasizing the need for the building blocks to maintain an adequate SNR under all operating conditions.

#### 5.2.1.3.1 Substrate Coupled Noise

Mixed-signal integrated circuits, combining sensitive analog stages with digital blocks, are susceptible to substrate noise coupling, primarily due to high-frequency digital signals. These signals can induce voltage fluctuations in the substrate, affecting analog transistors by modulating their threshold voltages and gains. Digital switching noise is the primary contributor to this effect.

Mitigation strategies for substrate noise coupling fall into three categories: reducing noise strength, minimizing circuit susceptibility, and reducing noise coupling [4]. While the first two require fundamental design changes, the third approach can be applied to existing circuits without major modifications.

A practical method for reducing noise coupling is limiting the swing range of digital pulses by lowering the digital power supply voltage. This approach decreases the noise generated by digital transitions; however, it comes at the cost of slower switching speeds and increased signal propagation delays, which can compromise performance in high-speed applications [4], [5]. Another widely used technique is employing an ohmic guard ring tied to the power supply or a low-noise analog supply pin. This stabilizes the substrate voltage and mitigates noise propagation [4].

For more robust isolation, a triple-well structure is an effective solution. By incorporating separate DNW regions for analog and digital circuits, the structure creates a high-impedance depletion region that blocks substrate noise coupling [4]. Additionally, isolating critical analog circuits with individual DNWs provides superior protection compared to a shared DNW for the entire analog section. As illustrated in Fig. 5-2, this technique is applied to shield critical stages of the readout pixels in Groups I and II, demonstrating its efficacy in preserving signal integrity in noise-sensitive designs.

To evaluate these techniques, the LFSR generates random digital pulses with a maximum frequency of 400 MHz to simulate substrate noise. During this test, the DEC blocks of the readout channels are left untriggered, and the noise power is quantified by analyzing the signal histogram. This analysis captures both the substrate coupled noise and the inherent random noise of the readout channel. The inherent noise is then subtracted, allowing for the isolation of the substrate noise contribution.
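The text does not specify how the inherent noise is subtracted. A common choice, assumed here, is subtraction in quadrature, which is valid when the substrate-coupled and inherent noise contributions are uncorrelated. The function name and the example numbers are hypothetical.

```python
import math


def substrate_noise_rms(total_rms, intrinsic_rms):
    """Isolate the substrate contribution assuming uncorrelated noise sources.

    total_rms:     rms noise measured with the LFSR running
    intrinsic_rms: rms noise measured with the LFSR idle
    Both arguments must share the same unit; the result has that unit too.
    """
    if total_rms < intrinsic_rms:
        raise ValueError("total noise cannot be smaller than the intrinsic noise")
    return math.sqrt(total_rms**2 - intrinsic_rms**2)


# Hypothetical example values in mV rms, not measured data
print(substrate_noise_rms(5.0, 3.0))
```

If the two contributions were correlated, a quadrature subtraction would misestimate the substrate term, so this sketch should be read as one plausible procedure and not as the one used for Fig. 5-26.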

Figure 5-26 presents the peak-to-peak substrate noise measurements for three groups of readout channels, emphasizing the DNW layer's effectiveness in mitigating substrate noise. A comparison across the three groups reveals that the DNW layer achieves an attenuation of approximately 10 dB, demonstrating its ability to effectively isolate noise-sensitive components. Additionally, the average measured noise gradient is -3.68 dB across the readout channels relative to their position to the LFSR. These results highlight the DNW layer's consistent performance in reducing substrate noise and affirm its critical role in enhancing noise resilience within the readout channel architecture.

Figure 5-27 presents the peak-to-peak substrate noise of readout channels in Group II for various power supply levels used to bias the LFSR. The results reveal that lowering the LFSR bias level reduces the digital swing range, which in turn decreases the amplitude of the coupled noise by limiting the available energy for substrate coupling. However, this approach also reduces the digital circuitry's switching speeds and increases signal propagation delays, which can impact overall system performance.



Fig. 5-26. Measured peak-to-peak substrate coupled noise of the readout channels. Group I (green), Group II (blue), and Group III (red).



Fig. 5-27. Measured peak-to-peak substrate coupled noise of the readout channels in Group II for various power supply levels used to bias the LFSR.

#### 5.2.1.3.2 Pixels Crosstalk

Each readout channel is susceptible to signal interference from neighboring pixels due to capacitive or conductive coupling, leading to crosstalk signals that have a net area of zero. The timing of these crosstalk signals is influenced by the pulse duration within the readout channel [6].



Fig. 5-28. Measured crosstalk after PIX 5 as a function of the quantized levels of input charge signal for different configuration modes of the CSA.



Fig. 5-29. Measured crosstalk to  $3\sigma_{Noise}$  ratio for PIX 5 as a function of the quantized levels of the input charge signals for different configuration modes of the CSA.

To assess crosstalk, PIX 5, positioned centrally within the 3 × 3 matrix (Fig. 5-2), was selected for testing while all other readout channels were triggered by the DAB. Figure 5-28 displays the measured crosstalk amplitude at PIX 5 as a function of the quantized input charge levels generated by the DEC, with the analysis conducted under various CSA stage configurations and a fixed total input capacitance of  $C_D = 50$  fF. The results indicate that the crosstalk amplitude increases proportionally with both the input charge signal amplitude and the CSA stage gain.

Figure 5-29 shows the measured crosstalk-to- $3\sigma$  noise ratio for PIX 5, plotted as a function of the input charge levels generated by the DEC. For input charges exceeding 160 aC, the crosstalk amplitude becomes comparable to the intrinsic noise, leading to signal distortion. This interference reduces the accuracy of the readout channel in detecting the charge signals, highlighting the need to manage crosstalk to maintain high detection fidelity.

#### 5.2.1.4 Assessment of Detection Accuracy

To experimentally assess the detection accuracy of the readout channels, the DAB generates a series of trigger signals based on digital codes stored in the trigger register. Comparing the logical state patterns in both the trigger and data registers, the DAB performs a comparative analysis to evaluate the detection error rate and operational accuracy of the designed readout channel. The 'true' counts represent the number of bits where the logical state of the trigger and data registers align. However, in certain instances, due to the inherent noise of the readout channel, the discriminator may generate additional digital pulses and assign them to subsequent time frames in the data register, resulting in 'erroneous' counts. Furthermore, any triggers missed by the discriminator are categorized as 'missed' counts. The combined total of erroneous and missed counts is denoted as 'incorrect' counts. The error rate is subsequently computed as the ratio of incorrect counts to the total number of logical states of '1' in the trigger register. This comprehensive evaluation methodology enables a thorough assessment of the readout channel performance and detection accuracy under various conditions.
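The counting rules above can be summarized in a few lines. This is a minimal sketch that assumes the trigger and data registers are available as equal-length strings of '0' and '1' that are already time-aligned (the propagation delay is taken as compensated); the function name and return format are illustrative and do not describe the DAB firmware.

```python
def count_errors(trigger, data):
    """Compare trigger and data registers bit by bit.

    Returns the true, erroneous, missed, and incorrect counts plus the error
    rate, defined as incorrect counts over the number of '1's in the trigger.
    """
    if len(trigger) != len(data):
        raise ValueError("registers must have the same length")

    pairs = list(zip(trigger, data))
    true_counts = sum(t == d for t, d in pairs)
    erroneous = sum(t == "0" and d == "1" for t, d in pairs)  # extra pulses
    missed = sum(t == "1" and d == "0" for t, d in pairs)     # lost triggers
    incorrect = erroneous + missed
    ones = trigger.count("1")
    error_rate = incorrect / ones if ones else float("nan")

    return {
        "true": true_counts,
        "erroneous": erroneous,
        "missed": missed,
        "incorrect": incorrect,
        "error_rate": error_rate,
    }


result = count_errors("110010", "100110")
print(result)
```

In this toy example one trigger is missed and one extra pulse appears, so two incorrect counts out of three triggers give an error rate of 2/3.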

To ensure precise and consistent data across all experimental test scenarios, the readout channel undergoes 100 firing cycles via the trigger code, and the outcomes are subsequently averaged. For a detector charge signal of $Q_{in} = 160 \text{ aC}$, Fig. 5-30 illustrates the average numbers of erroneous and missed counts as a function of the threshold level in the discriminator. This evaluation is conducted under the condition where the readout channel is triggered by $10^8$ well-separated charge pulses in each firing cycle, with a trigger period of 250 ns (equivalent to an event rate of 4 MHz).

Analyzing the presented data, the optimal threshold level is identified as $V_{th} = 105 \text{ mV}$. At this threshold level, the average numbers of erroneous and missed counts are nearly equal. This equilibrium signifies a balanced performance in which the readout channel mitigates both types of errors equally (in applications where this is acceptable), highlighting the efficacy of this particular threshold setting. The same experiment can be performed for other values of the detector charge signal to determine the corresponding optimum threshold level.



Fig. 5-30. Average number of erroneous (blue) and missed (red) counts as a function of the discriminator threshold level for  $Q_{in} = 160 \text{ aC}$ .



Fig. 5-31. Digital pulses generated by the readout channel for '110000....' and '111000....' trigger codes in blue and red, respectively.

To probe the operational accuracy of the readout channel when two or three events occur in consecutive time frames, the trigger codes '110000 ....' and '111000 ....' are applied in every 250 ns trigger period at the optimum threshold level. Figure 5-31 illustrates the digital pulses generated by the readout channel for these trigger codes, indicating the ability of the readout channel to accurately respond to and register events in consecutive timeframes. The measured propagation delay of the readout channel is 1.83 ns, attesting to its swift and accurate response in capturing events occurring in quick succession.



Fig. 5-32. Average number of dark counts as a function of the threshold level.



Fig. 5-33. Average error rate along with the corresponding standard deviation, as well as the 3-sigma error rate, as a function of the threshold level for  $Q_{in} = 160 \text{ aC}$ .

Furthermore, it is imperative that the readout channel refrains from generating any digital pulses at its output in the absence of charge generated by the detector, indicated by an all-zero trigger code. Digital pulses generated in this scenario are termed "dark" counts, attributed to the inherent noise within the readout channel building blocks. To evaluate this aspect, the digital output of the readout channel is systematically recorded for a specified duration while the entire trigger register maintains a logic state of '0'. With this test repeated over 100 cycles, Fig. 5-32 illustrates the average dark counts as a function of the threshold level, for a measurement time equivalent to 10<sup>10</sup> time frames in each cycle. The average number of dark counts diminishes as the threshold increases, and these counts remain negligible compared with the number of incorrect counts at all threshold levels. This is attributed to the conservative setting of the threshold, maintained at levels greater than 6 times the noise power.

To evaluate the effectiveness of the proposed architecture and estimate the detection error rate, the readout channel is activated by a trigger code consisting of $10^{10}$ Poissonian-distributed logic states of '1'. Over 100 iterations of the experimental test for $Q_{in} = 160$ aC, Fig. 5-33 illustrates the average error rate $\mu_{error}$ (in blue) along with the corresponding standard deviation $\sigma_{error}$ (in black) as a function of threshold level.

Additionally, the 3-sigma error rate, calculated through $\text{Error}_{3\sigma} = \mu_{\text{error}} + 3 \times \sigma_{\text{error}}$, is highlighted in red. This visualization provides a comprehensive overview of the system performance and the impact of threshold levels on error rates. For instance, at the optimum threshold level $V_{\text{Th}} = 105 \text{ mV}$, the average error rate of the proposed architecture is calculated as $\mu_{\text{error}} = 0.63 \text{ ppm}$ with a standard deviation of $\sigma_{\text{error}} = 0.28 \text{ ppm}$. This corresponds to a 3-sigma error rate of $\text{Error}_{3\sigma} = 1.47 \text{ ppm}$. Figure 5-34 shows the 3-sigma error rate as a function of threshold level for various detector charge signals ($Q_{\text{in}}$) spanning from 140 aC to 200 aC. As anticipated, a larger input charge further improves performance, since it yields a better SNR. The readout channels have a power consumption of 370 µW per pixel.
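The 3-sigma figure follows directly from the formula above. The sketch below also shows how $\mu_{error}$ and $\sigma_{error}$ could be obtained from a list of per-iteration error rates; it assumes the sample standard deviation is used, which the text does not specify, and the example list is hypothetical.

```python
import statistics


def error_3sigma(mu_ppm, sigma_ppm):
    """3-sigma error rate: mean plus three standard deviations, in ppm."""
    return mu_ppm + 3.0 * sigma_ppm


def summarize(rates_ppm):
    """Mean, sample standard deviation (assumed), and 3-sigma error rate."""
    mu = statistics.mean(rates_ppm)
    sigma = statistics.stdev(rates_ppm)
    return mu, sigma, error_3sigma(mu, sigma)


# Values quoted for V_Th = 105 mV and Q_in = 160 aC
print(round(error_3sigma(0.63, 0.28), 2))

# Hypothetical per-iteration error rates in ppm, for illustration only
print(summarize([1.0, 2.0, 3.0]))
```

For the quoted mean of 0.63 ppm and standard deviation of 0.28 ppm, the function returns 1.47 ppm.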

Figure 5-35 illustrates measured  $\text{Error}_{3\sigma}$  across nine readout pixels in five chip dies for a detector charge signal of  $Q_{\text{in}} = 160 \text{ aC}$  and a threshold level of  $V_{\text{Th}} = 105 \text{ mV}$ . The maximum  $\text{Error}_{3\sigma}$  across these nine pixels in five dies is less than 1.485 ppm.



Fig. 5-34. The maximum error rate as a function of the threshold level for various detector charge signals  $(Q_{in})$  spanning from 140 aC to 200 aC.



Fig. 5-35. The measured  $\text{Error}_{3\sigma}$  across nine readout pixels in five chip dies for a detector charge signal of  $Q_{in} = 160$  aC and a threshold level of  $V_{Th} = 105$  mV.

# 5.2.2 Experimental Qualification of Readout Channels Operating in Open Circuit Mode

The performance of the readout channels in open circuit mode is evaluated by analyzing  $OUT_{ph_1}$  and  $OUT_{ph_2}$ . Tests include verifying offset compensation and capacitor matching, optimizing the threshold level, and assessing the detection accuracy.

#### 5.2.2.1 Offset Compensation

For offset compensation, the inputs of the comparator are shorted to enable the mechanism. This compensation process is executed within a 200 ns time window and must be repeated periodically every 1 ms to maintain accuracy. Figure 5-36 illustrates the output buffer signals during the offset compensation procedure for a storage capacitor value of  $C_c = 2$  fF.

Initially, due to the inherent offset of the comparator, the output  $OUT_{ph_2}$  is forced to a logic '1'. However, after approximately 160 ns, the comparator outputs begin toggling, demonstrating the success of the proposed mechanism in actively compensating for the offset. This ensures robust operation by mitigating offset-induced errors.



Fig. 5-36. The output buffer signals during the offset compensation procedure for a storage capacitor value of  $C_c = 2$  fF.

#### 5.2.2.2 Capacitor Matching

After compensating for the comparator's offset, the capacitor matching mechanism is activated to correct the mismatch between the detector capacitor ( $C_D$ ) and the dummy capacitor ( $C_{dummy}$ ), thereby preventing the generation of differential signals at the comparator's input. This self-calibration process is executed within a 350 ns time window during the startup phase of the readout channel. For a detector capacitor of  $C_D = 40$  fF and a dummy capacitor of  $C_{dummy} = 30$  fF (resulting in a 10 fF mismatch), Figure 5-37 illustrates the output buffer signals during the capacitor matching process.

Initially, the differential voltage caused by the capacitor mismatch drives the comparator output  $OUT_{ph_2}$  to logic '1'. Sparse pulses then appear on  $OUT_{ph_1}$ , reflecting the SAR logic's attempts to align  $C_{dummy}$  with  $C_D$ . After approximately 140 ns, the comparator outputs begin toggling, demonstrating the successful matching of the input capacitors. After the capacitor matching mechanism, the offset compensation is reactivated to account for offset drift over the storage capacitor ( $C_C$ ) during the matching process. This results in a total initial self-calibration time of 800 ns, which includes two 25 ns idle periods between the offset compensation and capacitor matching processes.
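The 800 ns total can be verified from the individual windows, taking the sequence as offset compensation (200 ns), idle (25 ns), capacitor matching (350 ns), idle (25 ns), and a second offset compensation (200 ns). The symbols $t_{\text{OC}}$, $t_{\text{CM}}$, and $t_{\text{idle}}$ are introduced here only for this check:

$$t_{\text{cal}} = t_{\text{OC}} + t_{\text{idle}} + t_{\text{CM}} + t_{\text{idle}} + t_{\text{OC}} = 200 + 25 + 350 + 25 + 200 = 800\ \text{ns}$$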



Fig. 5-37. The output buffer signals during the capacitor matching process for a 10 fF mismatch.
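To illustrate how a successive-approximation search can align a trimmable dummy capacitor with the detector capacitor, the sketch below runs a generic binary search driven by an ideal comparator decision. It is not the SAR logic of this design: the base capacitance, LSB size, and bit count are hypothetical values chosen only so that the 40 fF example falls within range.

```python
def sar_match(c_detector_ff, c_base_ff=20.0, lsb_ff=0.5, n_bits=6):
    """Generic SAR search for the trim code that brings C_dummy up to C_D.

    All default parameters are HYPOTHETICAL. The comparator is modeled as an
    ideal sign decision on (C_D - C_dummy). Returns (code, c_dummy_ff).
    """
    code = 0
    for bit in reversed(range(n_bits)):       # test the MSB first
        trial = code | (1 << bit)
        c_trial = c_base_ff + trial * lsb_ff
        if c_trial <= c_detector_ff:          # dummy side not yet too large
            code = trial                      # keep the bit
    return code, c_base_ff + code * lsb_ff


code, c_dummy = sar_match(40.0)
print(code, c_dummy)
```

After the search the residual mismatch is below one LSB of the trim array, provided the detector capacitance lies within the trim range. A real implementation also has to cope with comparator noise and offset, which this idealized model ignores.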

#### 5.2.2.3 Assessment of Detection Accuracy

To experimentally assess the detection accuracy of the readout channels, the DAB generates trigger signals based on digital codes stored in the trigger register. By comparing the logical state patterns in the trigger and data registers, the DAB evaluates the detection error rate and operational accuracy of the readout channel. To ensure consistent results, the readout channel undergoes 100 firing cycles using the trigger code, with the outcomes averaged for reliability.

As previously mentioned, during the offset compensation phase, the readout channel is temporarily blind to incoming charge signals as it disconnects from the detector, resulting in a period of deadtime during which electrons landing on the detector surface go undetected. The duration of this phase is directly correlated with the detection accuracy. However, in specific SEM applications, periodic intermediate breaks in the scanning process can be repurposed for offset compensation [7]. Utilizing these breaks allows the detection error rate to be minimized, ensuring enhanced performance without compromising accuracy. The subsequent experimental tests are conducted under these conditions.
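To put this dead time in perspective, consider the 200 ns compensation window repeated every 1 ms, as described in Section 5.2.2.1. If compensation ran during acquisition and events arrived uniformly in time (both assumptions made only for this estimate), the blind fraction would be

$$\frac{t_{\text{OC}}}{T_{\text{rep}}} = \frac{200\ \text{ns}}{1\ \text{ms}} = 2 \times 10^{-4}$$

that is, 0.02% of the time frames, or 200 ppm. This illustrates why shifting the compensation into scanning breaks matters when the target error rates are at the ppm level.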



Fig. 5-38. Average number of erroneous (blue) and missed (red) counts after the comparator as a function of the threshold level for  $Q_{in} = 160 \text{ aC}$ .



Fig. 5-39. Average error rate as a function of the threshold level for  $Q_{in} = 160 \text{ aC}$  in a temperature range of  $20^{\circ}$ C to  $43^{\circ}$ C.

For a detector charge signal of $Q_{in} = 160 \text{ aC}$ and detector capacitance of $C_D = 30 \text{ fF}$ matched with $C_{dummy}$, Fig. 5-38 shows the average counts of erroneous and missed detections after the comparator as a function of the threshold level. This evaluation is conducted with the readout channel triggered by $10^8$ well-separated charge pulses per firing cycle, at a trigger period of 250 ns (equivalent to a 4 MHz event rate). The data reveals that the optimal threshold level is $Q_{th} \approx 88$ aC (equivalent to 550 e<sup>-</sup>), where the average counts of erroneous and missed detections are nearly equal. This methodology can be extended to other detector charge signal values to determine their corresponding optimal threshold levels.

To assess the effectiveness of the proposed architecture and estimate the detection error rate, the readout channel is activated using a trigger code comprising  $10^{10}$  Poissonian-distributed logic states of '1'. Over 100 iterations of experimental tests for  $Q_{in} = 160 \text{ aC}$ , Fig. 5-39 depicts the average error rate  $\mu_{error}$  as a function of the threshold level within a temperature range of 20°C to 43°C for a detector capacitance of  $C_D = 30$  fF matched with  $C_{dummy}$ .

The results show that the error rate sharply increases at the extreme ends of the threshold level, particularly at higher temperatures. This degradation in accuracy arises from the substantial rise in leakage currents caused by elevated temperatures, leading to increased occurrences of false positives and false negatives in the comparator. The leakage currents introduce additional noise charge at the comparator inputs, which, at higher temperatures, can rival the signal and threshold charge stored in the detector and dummy capacitors. Lowering the die temperature effectively reduces leakage currents, thereby enhancing the detection accuracy of the readout channel. In the targeted application of this thesis, the open circuit mode readout channels are designed to operate at a controlled temperature of 20°C using an active cooling mechanism [7], and all subsequent results adhere to this operating condition.

For a series of experimental tests conducted at 20°C, Fig. 5-40 illustrates the average error rate,  $\mu_{error}$  (blue), alongside its standard deviation,  $\sigma_{error}$  (black), as a function of the threshold level for  $Q_{in} = 160$  aC. Additionally, the 3-sigma error rate, calculated as  $\text{Error}_{3\sigma} = \mu_{error} + 3 \times \sigma_{error}$ , is shown in red. This representation provides an insightful analysis of system performance and highlights the influence of threshold levels on error rates. The measured propagation delay of the readout channel is 0.7 ns at 20°C, demonstrating its rapid and precise response, enabling reliable detection of consecutive events occurring in consecutive timeframes.



Fig. 5-40. Average error rate along with the corresponding standard deviation, as well as the 3-sigma error rate, as a function of the threshold level for  $Q_{in} = 160 \text{ aC}$  at 20°C.



Fig. 5-41. The 3-sigma error rate as a function of the threshold level for various detector charge signals (Q<sub>in</sub>) spanning from 140 aC to 200 aC at 20°C.

At the optimum threshold level of $Q_{th} \approx 88 \text{ aC}$ (equivalent to 550 e<sup>-</sup>), the proposed architecture achieves an average error rate of $\mu_{error} = 0.85$ ppm with a standard deviation of $\sigma_{error} = 0.29$ ppm. This corresponds to a 3-sigma error rate of $\text{Error}_{3\sigma} = 1.72$ ppm. Figure 5-41 illustrates the 3-sigma error rate as a function of the threshold level for various detector charge signals ( $Q_{in}$ ) ranging from 140 aC to 200 aC at 20°C.

As expected, higher input charge signals result in improved performance due to an enhanced SNR, further validating the effectiveness of the proposed architecture in high-accuracy detection scenarios.
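The 3-sigma error rate quoted for the optimum threshold follows directly from $\text{Error}_{3\sigma} = \mu_{error} + 3 \times \sigma_{error}$. The minimal sketch below shows the arithmetic, both from the summary statistics and from a list of per-iteration error rates. The function names are illustrative, and the use of the sample standard deviation is an assumption, since the estimator is not specified in the text.

```python
import statistics


def three_sigma_from_summary(mu_ppm: float, sigma_ppm: float) -> float:
    """Error_3sigma = mu + 3 * sigma, all quantities in ppm."""
    return mu_ppm + 3.0 * sigma_ppm


def three_sigma_from_runs(error_rates_ppm):
    """Return (mu, sigma, mu + 3*sigma) for a list of per-iteration error rates.

    The sample standard deviation is used here (an assumption).
    """
    mu = statistics.fmean(error_rates_ppm)
    sigma = statistics.stdev(error_rates_ppm)
    return mu, sigma, three_sigma_from_summary(mu, sigma)


# Values reported at the optimum threshold for Q_in = 160 aC at 20 degC.
print(f"{three_sigma_from_summary(0.85, 0.29):.2f} ppm")
```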



Fig. 5-42. The 3-sigma error rate as a function of the threshold level for a fixed input charge signal of  $Q_{in} = 160 \text{ aC}$ , with  $C_D$  values ranging from 30 fF to 50 fF.



Fig. 5-43. Average number of dark counts as a function of the threshold level at different temperatures.

As discussed in Chapter 4, the signal amplitude at the input of the comparator is inversely proportional to the value of the detector capacitance ( $C_D$ ), meaning that a larger $C_D$ results in a smaller signal amplitude. Consequently, higher detection error rates are anticipated for larger $C_D$ values. Figure 5-42 illustrates the 3-sigma error rate as a function of the threshold level for a fixed input charge signal of $Q_{in} = 160 \text{ aC}$, with $C_D$ values ranging from 30 fF to 50 fF, each matched with $C_{dummy}$. As shown, the detection accuracy deteriorates noticeably for larger $C_D$ values due to the reduced signal amplitude at the comparator input. The readout channels have a power consumption of 200 µW per pixel.
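To give a feel for the magnitudes involved, the sketch below converts the charges used in these measurements to electron counts and estimates the ideal signal amplitude $V = Q_{in}/C_D$. It is a first-order illustration only: it assumes that the whole charge is converted on $C_D$ and ignores the comparator input capacitance and other parasitics, so the actual amplitudes will be smaller. The function names are illustrative.

```python
ELEMENTARY_CHARGE_C = 1.602176634e-19  # coulombs (exact SI value)


def charge_to_electrons(q_ac: float) -> float:
    """Convert a charge in attocoulombs to a number of electrons."""
    return q_ac * 1e-18 / ELEMENTARY_CHARGE_C


def ideal_signal_mv(q_ac: float, c_d_ff: float) -> float:
    """Ideal amplitude V = Q / C_D in millivolts (parasitics ignored)."""
    return (q_ac * 1e-18) / (c_d_ff * 1e-15) * 1e3


print(f"Q_th = 88 aC  -> {charge_to_electrons(88.0):.0f} electrons")
print(f"Q_in = 160 aC -> {charge_to_electrons(160.0):.0f} electrons")
for c_d in (30.0, 40.0, 50.0):
    print(f"C_D = {c_d:.0f} fF -> V = {ideal_signal_mv(160.0, c_d):.2f} mV")
```

Under these assumptions, a 160 aC signal produces roughly 5.3 mV on 30 fF but only 3.2 mV on 50 fF, which illustrates why the error rate rises with $C_D$.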

To evaluate the dark counts of the readout channel, the digital output of the comparator is recorded for a specified duration while the entire trigger register holds a logic state of '0'. The test is run for 100 cycles, each spanning a measurement time equivalent to $10^{10}$ time frames, and is repeated at different temperatures. Fig. 5-43 illustrates the resulting average dark counts as a function of the threshold level. The average number of dark counts diminishes with a larger threshold, whereas higher temperatures increase it. At larger threshold levels the dark counts become less significant; at lower threshold levels, especially at higher temperatures, they play a more significant role in the operational accuracy of the readout channel.

Figure 5-44 illustrates the measured  $\text{Error}_{3\sigma}$  across twelve readout pixels in five chip dies for a detector charge signal of  $Q_{in} = 160 \text{ aC}$ , detector capacitance of  $C_D = 30 \text{ fF}$  and a threshold level of  $Q_{th} \approx 88 \text{ aC}$  (equivalent to 550 e<sup>-</sup>). The maximum  $\text{Error}_{3\sigma}$  across these twelve pixels in five dies is less than 1.74 ppm.



Fig. 5-44. The measured  $\text{Error}_{3\sigma}$  across twelve readout pixels in five chip dies for a detector charge signal of  $Q_{\text{in}} = 160 \text{ aC}$ , detector capacitance of  $C_D = 30 \text{ fF}$  and a threshold level of  $Q_{\text{th}} \approx 88 \text{ aC}$  (equivalent to 550 e<sup>-</sup>).

# 5.3 References

- [1] A. Mohammad Zaki and S. Nihtianov, "Characterization Challenges of a Low Noise Charge Detection ROIC," IEEE Trans. Instrum. Meas., vol. 71, pp. 1–8, 2022, doi: 10.1109/TIM.2022.3160529.
- [2] A. Mohammad Zaki and S. Nihtianov, "A High-Precision Particle Detection ROIC With an Active Shaper in 40 nm CMOS with Sub 200 aC Sensitivity," in 2024 IEEE 33rd International Symposium on Industrial Electronics (ISIE), Ulsan, Republic of Korea: IEEE, Jun. 2024, pp. 1–6. doi: 10.1109/ISIE54533.2024.10595727.
- [3] H. Spieler, Semiconductor Detector Systems. Oxford University Press, 2005. doi: 10.1093/acprof:oso/9780198527848.001.0001.
- [4] A. Afzali-Kusha, M. Nagata, N. K. Verghese, and D. J. Allstot, "Substrate Noise Coupling in SoC Design: Modeling, Avoidance, and Validation," Proceedings of the IEEE, vol. 94, no. 12, pp. 2109–2138, Dec. 2006, doi: 10.1109/JPROC.2006.886029.
- [5] A. A. G. Helmy and M. Ismail, Substrate Noise Coupling in RFICs (Analog Circuits and Signal Processing series). New York: Springer, 2008.
- [6] K. Joardar, "A simple approach to modeling cross-talk in integrated circuits," IEEE Journal of Solid-State Circuits, vol. 29, no. 10, pp. 1212–1219, Oct. 1994, doi: 10.1109/4.315205.
- [7] Y. Wang, Z. Dong, R.-L. Lai, and K. Kanai, "Semiconductor charged particle detector for microscopy," WO2019233991A1, Dec. 12, 2019.

# **6** Conclusions and Future Works

# 6.1 Conclusions

This thesis has presented a comprehensive investigation into the development of advanced readout frontend electronics for high-precision charge detection, addressing key challenges in noise reduction, time resolution, detection accuracy, and power efficiency. The analysis in Chapter 2 demonstrates that state-of-the-art solutions, while capable of achieving low power consumption and reasonable noise levels, often fall short in balancing these attributes with high time resolution, particularly under high input flux conditions.

To bridge these performance gaps, this work proposed, designed, and experimentally validated two distinct readout frontends tailored for short circuit and open circuit operation modes of a PIN diode detector. These frontends are optimized to detect weak charge signals generated by external electrons striking the semiconductor-based detector at random intervals, with a focus on achieving high sensitivity to low-energy signals while conserving power. The designs demonstrate a strategic balance between signal amplification, noise reduction, and power efficiency, meeting the demanding requirements of precision charge detection.

A comparative analysis of the proposed frontends, summarized in Table 6-1, emphasizes their capability in detecting fast, low-energy charge signals with enhanced noise performance, quantified through the Equivalent Noise Charge (ENC) metric. For this comparison, and in view of the targeted performance requirements, a figure of merit (FoM) is introduced, defined as the product of ENC, time resolution, and power consumption. A lower FoM indicates a better trade-off among noise, timing precision, and energy efficiency. It is worth noting that this FoM, although useful in this research, does not claim to be universal. The experimental findings demonstrate that the proposed readout frontends achieve a FoM comparable with state-of-the-art solutions while surpassing them in certain critical metrics. Notably, both frontends achieve a time resolution of 2.5 ns, more than ten times faster than existing designs, and maintain low noise performance, enabling detection accuracy below 6 ppm. These advancements are accomplished while keeping the power consumption below 400 $\mu$W, highlighting the efficiency of the proposed solutions.

|                                          | [1]       | [2]    | [3]     | [4]        | [5]    | [6]*      | This Work (Short Circuit Mode) | This Work (Open Circuit Mode) |
|------------------------------------------|-----------|--------|---------|------------|--------|-----------|--------------------------------|-------------------------------|
| Process [nm]                             | 40        | 130    | 40      | 110        | 65     | 40        | 40                             | 40                            |
| Pixel Area [µm<sup>2</sup>]              | 50×50     | 75×75  | 100×100 | 75×75      | 35×35  | 80×90     | 130×50                         | 150×100                       |
| Input Charge [ke<sup>-</sup>]            | 4.1 – 8.3 | 2.4    | 2.2     | 0.85 – 45  | 1 – 16 | 0.8 – 1.2 | 0.8 – 1.2                      | 0.8 – 1.2                     |
| ENC [e<sup>-</sup><sub>rms</sub>]        | 188       | 44     | 212     | 89 – 150   | 20     | 27        | 48                             | 27                            |
| Time Resolution [ns]                     | 34        | 1666   | 81      | 100        | 10000  | 2.5       | 2.5                            | 2.5                           |
| Power/Pixel [µW]                         | 26        | 42     | 45      | 8 – 55     | 180    | 190       | 370                            | 200                           |
| Pileup Correction                        | Yes       | Yes    | No      | Yes        | No     | Yes       | Yes                            | Yes                           |
| #Threshold Bins                          | 1 – 3     | 2      | 1       | 2          | 2      | 1         | 1                              | 1                             |
| 3σ Error Rate [ppm]                      | -         | -      | -       | -          | -      | -         | 1.72                           | 1.47                          |
| FoM [$e_{rms}^- \times \mu s \times \mu W$] | 166.2  | 3078.7 | 772.7   | 71.2 – 825 | 36000  | 12.8      | 40.7                           | 13.5                          |

Table 6-1. Performance summary of the state-of-the-art readout frontends.

\*Simulation results
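The FoM defined above is a plain product, so individual entries of Table 6-1 can be checked with a few lines of code. The sketch below is an illustration for a subset of rows: it takes ENC in e<sup>-</sup><sub>rms</sub>, time resolution in ns (converted to µs), and power per pixel in µW. The function and dictionary names are illustrative.

```python
def figure_of_merit(enc_e_rms: float, time_resolution_ns: float,
                    power_uw: float) -> float:
    """FoM = ENC x time resolution x power, in e-rms x us x uW."""
    return enc_e_rms * (time_resolution_ns / 1000.0) * power_uw


# (ENC [e- rms], time resolution [ns], power per pixel [uW]) from Table 6-1.
ENTRIES = {
    "[1]": (188, 34, 26),
    "[3]": (212, 81, 45),
    "[6] (simulation)": (27, 2.5, 190),
    "This work, open circuit mode": (27, 2.5, 200),
}

for name, (enc, t_ns, p_uw) in ENTRIES.items():
    print(f"{name}: FoM = {figure_of_merit(enc, t_ns, p_uw):.1f}")
```

For these rows the computed values agree with the FoM row of Table 6-1 to the displayed precision.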

A comprehensive comparative analysis highlights the unique strengths of each operational mode. The open circuit mode frontend stands out for its energy efficiency, achieving 45% lower power consumption compared to its short circuit counterpart. This makes it particularly well-suited for applications where minimizing power consumption is a critical requirement. However, despite its efficiency advantage, the open circuit mode introduces periodic deadtimes during charge detection due to the offset compensation phases, during which the readout frontend becomes temporarily blind to incoming charge signals, potentially leading to undetected events. Nevertheless, this limitation is mitigated in specific SEM applications, where natural scanning pauses can be strategically utilized for offset compensation, minimizing detection errors and maintaining performance without degradation. Additionally, this mode's temperature-sensitive nature necessitates active cooling to maintain stable and accurate detection. It is noteworthy that this design represents the first experimentally implemented and tested prototype for open circuit operation. Further investigations and deeper studies are expected to address the current limitations and unlock pathways to significantly enhance its performance. Thanks to its competitive performance, it remains a promising candidate for future advancements.

On the other hand, the short circuit mode frontend offers several critical advantages that make it indispensable for applications requiring continuous and highly accurate monitoring. Foremost among these is its superior detection accuracy, evidenced by significantly lower average and 3-sigma detection error rates compared to the open circuit mode. Furthermore, the absence of periodic deadtime in this mode allows for uninterrupted signal acquisition, making it highly suitable for applications where every event must be captured without fail. The short circuit mode's stability and reliability, despite higher power consumption, render it the preferred choice for scenarios where maintaining operational accuracy and stability are of paramount importance.

These findings underscore the complementary strengths of both proposed designs. The short circuit mode excels in scenarios prioritizing detection accuracy and stability, while the open circuit mode provides a compelling solution for energy-constrained applications. This flexibility demonstrates the adaptability of the designs for optimized deployment across a diverse range of operational requirements. In conclusion, this thesis contributes a significant advancement in the design and evaluation of high-precision readout frontends, providing both theoretical insights and practical solutions for charge detection in scanning electron microscopy and similar high-resolution detection systems. The proposed designs demonstrate a successful trade-off among key performance metrics, establishing a foundation for further innovations in low-power, high-precision readout systems.

# 6.2 Main Findings and Contributions

The proposed solutions in this research were rigorously qualified and tested through the design and fabrication of five custom chips (Appendix C), which were used to experimentally evaluate the operational accuracy and performance of the developed readout circuits. The key findings and contributions of this research work are summarized as follows:

- A thorough investigation of state-of-the-art readout ASICs identified critical limitations in detecting weak charge signals with high time resolution, precision, and low power consumption. This research addresses these limitations by proposing the design of two optimized readout frontends, featuring enhanced transfer characteristics to achieve high precision and time resolution for low-energy charge detection, all while maintaining low power consumption. (Chapters 2, 3, and 4)
- In the short circuit operation mode, a low-bandwidth preamplifier interfaces the detector, ensuring sufficient SNR to maintain precision in charge detection. The signal shaping filter achieves nanosecond-level time resolution by selectively passing relevant frequency components. The proposed solutions explore the trade-off between active and passive filter structures, which impacts the comparator's complexity and power consumption. This approach demonstrates an effective balance between speed, accuracy, and energy efficiency, addressing key challenges in high-performance charge detection systems. (Chapter 3)

- In the open circuit operation mode, significant power efficiency is achieved by leveraging the detector's intrinsic junction capacitance for charge-to-voltage conversion, effectively eliminating the need for power-hungry preamplifiers. Additionally, the detector's inherent memory characteristic allows for dynamic comparator operation to further minimize the power consumption. This design not only demonstrates improved energy efficiency but also represents the first experimentally implemented and tested prototype for open circuit operation, marking a novel contribution to the field. (Chapter 4)
- A comparative analysis demonstrated that the short circuit mode frontend delivers lower average and 3-sigma detection error rates, making it ideal for high-precision detection applications. In contrast, the open circuit mode design offers a 45% reduction in power consumption, making it a viable option for energy-efficient applications. However, for optimal high-precision performance, periodic offset compensation phases must be synchronized with scanning pauses. Additionally, active cooling is recommended to maintain detection accuracy in temperature-sensitive environments. (Chapter 5)

These findings collectively underscore the advancements made in readout electronics design for high-precision charge detection, demonstrating a successful balance between high performance and energy efficiency. The solutions developed in this thesis have the potential to contribute to more advanced and power-efficient readout systems for future high-resolution detection applications.

# 6.3 Future Works

While the work presented in this thesis represents a significant step in improving the performance and operational accuracy of the charge detection readout frontends, it does not represent the end of this pursuit. Below are some aspects of this work that could be explored for further improvement, based on innovative architectures or circuit
designs, which pose challenges due to insufficient knowledge or feasibility at the current moment.

#### 6.3.1 Short Circuit Operation Mode

The architecture of the readout frontend presents opportunities for further optimization to achieve lower power consumption while maintaining high performance and detection accuracy. Several potential improvements, which are not immediately applicable but could be explored in future work, can be identified based on the proposed solutions.

In the readout frontend employing a CSA, passive signal shaping filter, and precision comparator, two critical challenges are the small signal amplitude and baseline drift after the passive shaping filter. These factors can adversely affect event detection accuracy. One possible innovative solution to mitigate baseline drift could involve developing a hybrid baseline restoration mechanism integrated with the shaping function itself, possibly through advanced adaptive feedback loops, instead of the conventional external baseline restoration approaches, as suggested in [1]. This would aim to stabilize the signal baseline with minimal additional power consumption, but would require novel circuit designs to balance efficiency and baseline stabilization. Leveraging advanced CMOS process nodes could facilitate the integration of such adaptive feedback circuits by providing higher transistor density and improved analog performance at lower power.

For the readout frontend incorporating a CSA, active signal shaping filter, and comparator, a promising direction for improvement involves integrating the signal shaping function directly into the CSA's transfer characteristics, for example, by employing an adaptive DC servo loop to suppress low-frequency contributions, as proposed in [7], [8]. However, this concept requires the development of dynamic servo mechanisms that could adjust in real-time to varying input signals, which currently faces challenges in terms of control loop stability and response time. Utilizing advanced CMOS nodes could enable the implementation of dynamic servo loops with enhanced speed and stability, given the faster transistor switching speeds and reduced parasitics. Implementing this approach would require a substantial redesign of the CSA's core amplifier with a differential topology, which inherently increases power consumption. Although this topology offers superior common-mode rejection and stability, its design poses a significant challenge in optimizing power efficiency without compromising performance, particularly for high-precision tasks. The comparator would also need to be redesigned to handle low-noise, low-offset detection at the reduced signal amplitudes resulting from the CSA's limited gain.

Despite the complexity at the CSA and comparator levels, integrating the shaping function into the CSA and reducing the need for a separate active shaping filter could offer a more power-efficient solution, without sacrificing detection accuracy. This approach would require new circuit design techniques that combine both signal processing and shaping into a single stage, which is currently an unexplored area that needs further development. The design complexity associated with such integrated functions may be mitigated by exploiting the benefits of advanced CMOS process technologies that support mixed-signal circuit integration.

These architectural enhancements highlight the potential for energy-efficient designs and the importance of developing innovative circuit topologies to overcome the trade-offs between noise performance, signal stability, and power efficiency in high-precision readout frontends.

#### 6.3.2 Open Circuit Operation Mode

The architecture of the readout frontend could be redesigned and optimized to eliminate the deadtime associated with the offset compensation mechanism. One potential approach involves exploring a Ping-Pong readout architecture where one comparator detects the input charge signals while the other is dedicated to self-calibrating for offset compensation. By alternating between these two comparators, this architecture could enable continuous operation without deadtime during offset compensation. However, this method requires the development of fast switching mechanisms and precise synchronization, which are currently not fully feasible due to limited circuit performance and control mechanisms. The use of advanced CMOS technologies could offer faster switching and better synchronization due to their superior transistor performance and reduced parasitic effects.
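The intended benefit of the Ping-Pong scheme can be illustrated with a simple frame-level schedule. The sketch below is a conceptual model only, not a circuit proposal: the compensation period `period`, the compensation length `cal_len` (both in time frames), and the ideal, instantaneous hand-over between comparators are hypothetical assumptions.

```python
def detecting_comparator(frame: int, period: int, cal_len: int,
                         ping_pong: bool):
    """Return which comparator ('A' or 'B') detects in this frame, or None if blind.

    Single comparator: 'A' is blind during the first cal_len frames of every
    period while it performs offset compensation.
    Ping-Pong: roles swap every period; one comparator detects for the whole
    period while the other uses its first cal_len frames to recalibrate.
    """
    period_index, slot = divmod(frame, period)
    if ping_pong:
        return "A" if period_index % 2 == 0 else "B"
    return None if slot < cal_len else "A"


def blind_frames(n_frames: int, period: int, cal_len: int,
                 ping_pong: bool) -> int:
    """Count frames in which no comparator is available for detection."""
    return sum(
        detecting_comparator(f, period, cal_len, ping_pong) is None
        for f in range(n_frames)
    )


# Hypothetical numbers: compensation takes 5 out of every 100 frames.
print("single comparator:", blind_frames(1000, 100, 5, ping_pong=False))
print("ping-pong        :", blind_frames(1000, 100, 5, ping_pong=True))
```

In this idealized model the deadtime disappears entirely, at the cost of a second comparator that is powered throughout. Any non-ideal hand-over would reintroduce blind or doubly counted frames.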

While the Ping-Pong architecture presents a promising direction for eliminating deadtime, it involves considerable trade-offs, including the need for doubling the power consumption and increasing area occupation, as an additional copy of the readout channel would be implemented within the pixel. Furthermore, the switching process between comparators needs to be optimized, as any misalignment could result in missed detections, undermining the accuracy of the readout system. Advanced CMOS nodes with smaller feature sizes could alleviate the area constraints while offering power-efficient solutions for high-frequency switching circuits.

Thus, while the Ping-Pong architecture offers significant potential for eliminating deadtime, its implementation poses challenges that require a deeper understanding of fast comparator switching, synchronization, and resource optimization. A thorough investigation of these aspects is needed to balance the advantages of continuous readout with the inherent costs and complexities introduced by increased resource usage.

# 6.4 References

- [1] P. Grybos et al., "SPHIRD–Single Photon Counting Pixel Readout ASIC With Pulse Pile-Up Compensation Methods," IEEE Transactions on Circuits and Systems II: Express Briefs, vol. 70, no. 9, pp. 3248–3252, Sep. 2023, doi: 10.1109/TCSII.2023.3267859.
- [2] R. Kleczek, P. Kmon, P. Maj, R. Szczygiel, M. Zoladz, and P. Grybos, "Single Photon Counting Readout IC With 44 e- rms ENC and 5.5 e- rms Offset Spread With Charge Sensitive Amplifier Active Feedback Discharge," IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 70, no. 5, pp. 1882–1892, May 2023, doi: 10.1109/TCSI.2023.3241738.
- [3] R. Kleczek, P. Grybos, R. Szczygiel, and P. Maj, "Single Photon-Counting Pixel Readout Chip Operating Up to 1.2 Gcps/mm2 for Digital X-Ray Imaging Systems," IEEE Journal of Solid-State Circuits, vol. 53, no. 9, pp. 2651–2662, Sep. 2018, doi: 10.1109/JSSC.2018.2851234.
- [4] M. Bochenek et al., "IBEX: Versatile Readout ASIC With Spectral Imaging Capability and High Count Rate Capability," IEEE Transactions on Nuclear Science, vol. 65, no. 6, pp. 1285–1291, Jun. 2018, doi: 10.1109/TNS.2018.2832464.
- [5] E. Fabbrica et al., "MIRA: A Low-Noise ASIC With 35-µm Pixel Pitch for the Readout of Microchannel Plates," IEEE Transactions on Nuclear Science, vol. 71, no. 6, pp. 1339–1347, Jun. 2024, doi: 10.1109/TNS.2024.3401221.
- [6] L. Bouman, "High-Speed Readout Circuit for PIN Single Electron Detector in Voltage Mode," M.S. thesis, Dept. Microelectronics, TU Delft, Delft, the Netherlands, 2023.
- [7] G. Ferrari, F. Gozzini, and M. Sampietro, "A Current-Sensitive Front-End Amplifier for Nano-Biosensors with a 2MHz BW," in 2007 IEEE International Solid-State Circuits Conference. Digest of Technical Papers, Feb. 2007, pp. 164–165. doi: 10.1109/ISSCC.2007.373345.
- [8] G. Ferrari, F. Gozzini, and M. Sampietro, "Very high sensitivity CMOS circuit to track fast biological current signals," in 2006 IEEE Biomedical Circuits and Systems Conference, Nov. 2006, pp. 53–56. doi: 10.1109/BIOCAS.2006.4600306.

# **Appendix A**

### A.1 CSA Loop and Stability Analysis

In analyzing the loop gain and stability of the CSA, the circuit can be conceptualized as two nested loops: the internal loop and the total loop. The internal loop consists of the core amplifier, source follower stage, and feedback capacitor ( $C_F$ ). The total loop incorporates the ICON Cell in addition to the internal loop, as shown schematically in Fig. A-1.



Fig. A-1. Schematic of the CSA as well as the division of the loops.

### A.2 Internal Loop Analysis

The ICON Cell receives current at its input and outputs a proportional current, making the current transfer function the most convenient way to study the internal loop. In the ideal case, the current transfer function  $(T_{ideal})$  of the internal loop (Fig. A-2) is:

$$T_{ideal}(s) = \frac{I_{out}}{I_{in}} = \frac{1 + sC_F R_1}{sC_F R_1}$$
(A-1)

To determine the real transfer function $(T_{real})$, the singularities of the loop must be calculated. Given the structure of the internal loop, calculating the voltage transfer function is simpler. Once the loop's singularities are obtained from the voltage transfer function, they can be applied to $T_{ideal}$ to yield $T_{real}$.



Fig. A-2. Internal loop of the CSA.

The voltage transfer function includes a forward path, from the core amplifier's input  $(V_f)$  to its output  $(V_t)$ , and a feedback path, from  $V_t$  back to  $V_f$  through the internal loop network. The feedback transfer function ( $\beta$ ) is given by:

$$\beta(s) = \frac{V_f}{V_t} = \frac{g_{m_b}R_1}{1+g_{m_b}R_1} \times \frac{sC_F R_{out_{ICON}}}{(1+s\tau_{LF})(1+s\tau_{HF})}$$
(A-2)

where  $\tau_{LF} = C_{tot}R_{out_{ICON}}$  and  $\tau_{HF} = C_AR_1$  while  $C_{tot} = (C_D + C_G + C_F)$  and  $C_A = ((C_D + C_G) \times C_F)/C_{tot}$ .

To assess stability, the feedback network's inverse transfer function $1/\beta$ is mapped over the forward path transfer function $(A_F)$, as shown in Fig. A-3. The crossover frequencies where these functions intersect are calculated as:

$$f_{cut_1} = \frac{1}{2\pi C_{tot} R_{out_{ICON}} A(0)}$$
(A-3)

$$f_{cut_2} = GBWP \times \frac{C_F}{C_{tot}} \times \frac{g_{m_b}R_1}{1+g_{m_b}R_1}$$
(A-4)

Since  $R_{out_{ICON}}$  is on the order of  $G\Omega$ ,  $f_{cut_1}$  occurs at very low frequencies (a few Hz), allowing it to be neglected for high-frequency stability analysis. The real transfer function ( $T_{real}$ ) then closely follows ( $T_{ideal}$ ) below  $f_{cut_2}$  and aligns with  $A_F$  afterward:

$$T_{real}(s) = \frac{I_{out}}{I_{in}} = A(0) \times \frac{g_{m_b}R_1}{1+g_{m_b}R_1} \times \frac{R_{out_{ICON}}}{R_1} \times \frac{1+sC_F R_1}{(1+s\tau_{cut_1})(1+s\tau_{cut_2})(1+s\tau_{p_2})}$$
(A-4)

where  $\tau_{p_2}$  corresponds to the second pole of the core amplifier.
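The two crossover frequencies are easy to evaluate numerically. The sketch below does so for a set of component values that are placeholders chosen purely for illustration (they are not the design values of this work), simply to show how a G$\Omega$-range $R_{out_{ICON}}$ pushes $f_{cut_1}$ down to the Hz range.

```python
import math


def f_cut1(c_tot: float, r_out_icon: float, a0: float) -> float:
    """Lower crossover frequency in Hz: 1 / (2*pi*C_tot*R_out_ICON*A(0))."""
    return 1.0 / (2.0 * math.pi * c_tot * r_out_icon * a0)


def f_cut2(gbwp: float, c_f: float, c_tot: float,
           gm_b: float, r1: float) -> float:
    """Upper crossover frequency in Hz: GBWP * (C_F/C_tot) * gm_b*R1/(1+gm_b*R1)."""
    follower_gain = gm_b * r1 / (1.0 + gm_b * r1)
    return gbwp * (c_f / c_tot) * follower_gain


# HYPOTHETICAL values, for illustration only.
C_TOT = 100e-15      # F, C_D + C_G + C_F
C_F = 10e-15         # F, feedback capacitor
R_OUT_ICON = 1e9     # ohm, "on the order of GOhm"
A0 = 1e3             # DC gain of the core amplifier (60 dB)
GBWP = 1e9           # Hz, gain-bandwidth product of the core amplifier
GM_B = 1e-3          # S
R1 = 10e3            # ohm

print(f"f_cut1 = {f_cut1(C_TOT, R_OUT_ICON, A0):.2f} Hz")
print(f"f_cut2 = {f_cut2(GBWP, C_F, C_TOT, GM_B, R1) / 1e6:.1f} MHz")
```

With these placeholder values $f_{cut_1}$ is about 1.6 Hz, many decades below $f_{cut_2}$, which is why it can be neglected in the high-frequency stability analysis.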



Fig. A-3. Stability analysis of the internal loop in the CSA.

#### A.2.1 Total Loop Analysis

The total loop gain  $(G_{loop_{total}})$  combines the real transfer function  $(T_{real})$  of the internal loop and the transfer function of the ICON Cell  $(T_{ICON})$  which, considering the dominant contributions, can be written as:

$$T_{ICON}(s) = \frac{1}{K} \times \frac{1 - s\tau_{z_{ICON}}}{1 + s\tau_{p_{ICON}}}$$
(A-5)

where  $\tau_{p_{ICON}} = ((C_{gs_5} + C_{gs_9})/(g_{m_5} + g_{m_8})) + ((C_{gs_6} + C_{gs_7})/(g_{m_6} + g_{m_7}))$  and  $\tau_{z_{ICON}} = C_{ds_6}/(g_{m_9} + g_{m_{10}})$  represent the dominant pole and zero of the ICON Cell. The total loop gain ( $G_{loop_{total}}$ ) is:

$$G_{loop_{total}}(s) = T_{real} \times \frac{R_{out_{SF}}}{R_{out_{SF}} + \frac{1}{g_{m_6} + g_{m_7}}} \times T_{ICON}(s)$$
(A-6)

where  $R_{out_{SF}}$  is the resistance seen from source follower output node while  $g_{m_6}$  and  $g_{m_7}$  are the transconductances of the transistors at the input branch of the ICON Cell. Substituting the relevant transfer functions, we get:

$$G_{loop_{total}}(s) = \frac{A(0)}{K} \times \frac{g_{m_b} R_{out_{ICON}}}{1 + g_{m_b} R_1} \times \frac{R_{out_{SF}}}{R_{out_{SF}} + \frac{1}{g_{m_6} + g_{m_7}}} \times \frac{1 + sC_F R_1}{(1 + s\tau_{cut_1})(1 + s\tau_{cut_2})} \times \frac{1 - s\tau_{z_{ICON}}}{1 + s\tau_{p_{ICON}}}$$
(A-7)

#### A.2.2 Stability Assessment

In Fig. A-4, the total loop gain  $(G_{loop_{total}})$  is plotted, and the total loop cutoff frequency ( $f_{cut_{total}}$ ) is determined by finding the frequency at which  $|G_{loop_{total}}(f_{cut_{total}})| = 1$  which is:

$$f_{cut_{total}} = \frac{1}{2\pi C_F K R_1}$$
(A-8)

Based on simulations, the DC loop gain is  $G_{loop_{total}}(0) = 93$  dB with a phase margin of 82 degrees.
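The expression for $f_{cut_{total}}$ can be recovered with a short approximation. The following is a sketch under assumptions that are not stated explicitly above: the crossover lies between $f_{cut_1}$ and $f_{cut_2}$, so that $T_{real} \approx T_{ideal}$; it lies well below the zero at $1/(2\pi C_F R_1)$, which requires $K \gg 1$; the source follower divider term is close to unity; and the pole and zero of the ICON Cell lie well above the crossover, so that $T_{ICON} \approx 1/K$. Under these conditions:

$$|G_{loop_{total}}(j\omega)| \approx \frac{1}{K} \times \left| \frac{1 + j\omega C_F R_1}{j\omega C_F R_1} \right| \approx \frac{1}{\omega K C_F R_1}$$

Setting this magnitude to unity gives $\omega_{cut_{total}} = 1/(K C_F R_1)$, that is, $f_{cut_{total}} = 1/(2\pi C_F K R_1)$.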



Fig. A-4. Total loop gain of the CSA.

# **Appendix B**

## **B.1 FPGA Code**



Fig. B-1. Overview of the top-level schematic used for programming the FPGA.

# **Appendix C**

## C.1 Chip Gallery



Fig. C-1. Chip 1 – Taped-out on 8 June 2022.





Fig. C-2. Chip 2 – Taped-out on 19 October 2022.



Fig. C-3. Chip 3 – Taped-out on 19 April 2023.

Fig. C-4. Chip 4 – Taped-out on 27 March 2024.



Fig. C-5. Chip 5 – Taped-out on 27 March 2024.

## **Summary**

This Ph.D. dissertation focuses on designing high-precision readout frontends for low-energy charge detection in scanning electron microscopy (SEM), achieving a time resolution of 2.5 ns, a detection error rate below 6 ppm, and power consumption under 400 $\mu$W. Novel techniques at both the system and circuit levels were developed to enhance operational accuracy and meet the target specifications. Two prototypes were presented and experimentally tested to demonstrate the effectiveness of these techniques.

Chapter 1 introduces the motivation, research objectives, and organization of the thesis, highlighting the advancements in SEMs for nanometer-resolution imaging and the challenges posed by high scanning speeds. It emphasizes the need for sensitive detectors and low-noise, power-efficient readout electronics, which often conflict. The main research question is defined as developing a frontend readout architecture with power consumption below 500  $\mu$ W, time resolution of 2.5 ns, and an electron count error under 10 ppm. To address this, the thesis employs a systematic study and iterative design process, resulting in two novel readout frontend architectures. The chapter also outlines the structure of the thesis, covering the operating principles of the PIN diode, design details, experimental evaluations, and conclusions.

Chapter 2 provides a detailed review of the target application specifications, focusing on the design and requirements for detecting weak charge signals with high precision and time resolution. It critically analyzes the current state-of-the-art readout frontends, highlighting their strengths and inherent limitations, particularly in terms of noise performance, time resolution, and power consumption. This chapter also introduces the concept of short and open circuit readout modes for PIN-diodes, offering insights into their potential advantages for addressing the challenges identified in the existing systems.

Chapter 3 presents the design of readout solutions for the short circuit operation mode of PIN-diodes, critical for BSE detection in electron microscopy. It examines the use of a preamplifier to create a virtual ground, effectively simulating a zero-impedance load and ensuring accurate charge transfer. The chapter further explores the analog frontend components: preamplifier, signal shaping filters, and threshold discriminators. Signal shaping filters, both passive and active high-pass types, are discussed for their role in signal optimization by reducing noise and improving signal clarity. Additionally, the threshold discriminator design is analyzed for both filter types, emphasizing the importance of accurate signal discrimination to minimize detection errors. A key focus is the trade-off between power consumption, noise performance, and detection accuracy, with each stage's design detailed to ensure optimal performance and signal integrity in the short circuit mode.

Chapter 4 explores readout solutions for the open circuit mode of PIN-diodes, focusing on high sensitivity, low power consumption, and signal integrity. It highlights challenges like charge pileup and saturation, proposing solutions such as a reset mechanism and dynamic comparators. The chapter discusses an advanced frontend architecture with offset compensation and active capacitor matching for improved accuracy. Periodic sampling at 800 MHz minimizes timing misalignments, balancing power efficiency and reliability for high-resolution, high-rate applications.
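The interplay between periodic sampling, reset, and pileup can be illustrated with a toy model. The sketch below is not the circuit described in Chapter 4: it assumes an idealized sensing node on which every arrival deposits one unit of charge, a comparator that fires at a sampling instant whenever any charge is present, and an instantaneous reset after each sample. The function name and the unit-charge abstraction are hypothetical. Note that an 800 MHz clock gives a 1.25 ns sampling period, so two samples fall within each 2.5 ns time-resolution window.

```python
from typing import Sequence

SAMPLE_RATE_HZ = 800e6  # periodic sampling rate quoted in the summary


def count_detections(arrival_times_s: Sequence[float],
                     duration_s: float,
                     sample_rate_hz: float = SAMPLE_RATE_HZ) -> int:
    """Count sampling instants at which accumulated charge is detected.

    Toy model (assumption, not the thesis circuit): each arrival adds one
    unit of charge to the sensing node; at every sampling instant the
    comparator fires if any charge is present, and the node is reset.
    Arrivals that pile up within one sampling period are counted once.
    """
    period_s = 1.0 / sample_rate_hz
    n_samples = int(round(duration_s * sample_rate_hz))
    arrivals = sorted(arrival_times_s)
    detections = 0
    idx = 0
    for k in range(1, n_samples + 1):
        t_sample = k * period_s
        charge = 0
        # Accumulate every arrival since the previous sampling instant.
        while idx < len(arrivals) and arrivals[idx] < t_sample:
            charge += 1
            idx += 1
        if charge > 0:
            detections += 1  # comparator fires, then the node is reset
    return detections


if __name__ == "__main__":
    # Two well-separated arrivals are detected separately.
    print(count_detections([0.5e-9, 3.0e-9], duration_s=5e-9))
    # Two arrivals within one 1.25 ns period pile up into one detection.
    print(count_detections([0.2e-9, 0.9e-9], duration_s=5e-9))
```

In this simplified picture a higher sampling rate shortens the window in which arrivals can merge, at the cost of more comparator decisions per second, which is one way to read the balance between power efficiency and reliability mentioned above.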

Chapter 5 discusses the experimental setup and qualification of the proposed readout architectures. The device under test (DUT), a 40 nm CMOS chip with short and open circuit mode readout matrices, is tested to validate its ability to detect and digitize charge signals within the specified power budget. The test includes evaluating performance across gain, noise, bandwidth, and threshold levels, using a programmable detector-emulating circuit (DEC) to simulate charge signals. The setup features an FPGA-based Data Acquisition Board (DAB) for signal monitoring and a test PCB to run experimental qualifications.

Chapter 6 concludes the thesis by highlighting the development of advanced readout frontends for high-precision charge detection, achieving improved time resolution, accuracy, and power efficiency. The proposed designs, optimized for short circuit and open circuit modes, demonstrate excellent performance. This chapter also proposes and discusses some aspects of this work that could be explored for further improvements.

## Summary (Samenvatting)

This dissertation focuses on the design of high-precision readout frontends for low-energy charge detection in scanning electron microscopy (SEM), with a time resolution of 2.5 ns, a detection error rate below 6 ppm, and a power consumption below 400 $\mu$W. New techniques at both the system and the circuit level have been developed to improve operational accuracy and meet the target specifications. Two prototypes were presented and experimentally tested to demonstrate the effectiveness of these techniques.

Chapter 1 introduces the motivation, research objectives, and structure of the dissertation, highlighting the progress of SEMs for nanometer-resolution imaging and the challenges of high scanning speeds. It emphasizes the need for sensitive detectors and low-noise, power-efficient readout electronics, which often conflict. The main research question is defined as developing a frontend readout architecture with a power consumption of less than 500 $\mu$W, a time resolution of 2.5 ns, and an electron count error of less than 10 ppm. To achieve this, a systematic study and an iterative design process are applied, resulting in two new readout frontend architectures. The chapter also describes the structure of the dissertation, including the operating principles of the PIN diode, design details, experimental evaluations, and conclusions.

Chapter 2 provides a detailed overview of the specifications of the target application, with emphasis on the design and the requirements for detecting weak charge signals with high precision and time resolution. It critically analyzes the current state-of-the-art readout frontends, their strengths, and their inherent limitations, particularly in terms of noise performance, time resolution, and power consumption. This chapter also introduces the concept of short and open circuit readout modes for PIN-diodes, with insight into their potential advantages with respect to the challenges in existing systems.

Chapter 3 presents the design of readout solutions for the short circuit operation mode of PIN-diodes, essential for BSE detection in electron microscopy. It also examines the use of a preamplifier to create a virtual ground, effectively simulating a zero-impedance load and ensuring accurate charge transfer. The chapter further explores the analog frontend components: preamplifier, signal shaping filters, and threshold discriminators. Signal shaping filters, both passive and active high-pass types, are discussed for their role in signal optimization by reducing noise and improving signal clarity. In addition, the design of the threshold discriminator is analyzed for both filter types, emphasizing the importance of accurate signal detection to minimize detection errors. A key point of attention is the tradeoff between power consumption, noise performance, and detection accuracy, with each design stage described in detail to ensure optimal performance and signal integrity in the short circuit mode.

Chapter 4 investigates readout solutions for the open circuit mode of PIN-diodes, with emphasis on high sensitivity, low power consumption, and signal integrity. It highlights challenges such as charge pileup and saturation, and proposes solutions such as a reset mechanism and dynamic comparators. The chapter further discusses an advanced frontend architecture with offset compensation and active capacitor matching for improved accuracy. Periodic sampling at 800 MHz minimizes timing misalignments and balances power efficiency and reliability for high-resolution, high-rate applications.

Chapter 5 discusses the experimental setup and qualification of the proposed readout architectures. The Device Under Test (DUT), a 40 nm CMOS chip with short and open circuit mode readout matrices, is tested for its ability to detect and digitize charge signals within the specified power consumption. The test comprises performance evaluation in terms of gain, noise, bandwidth, and threshold levels, using a programmable detector emulation circuit (DEC) to simulate charge signals. The setup includes an FPGA-based Data Acquisition Board (DAB) for signal monitoring and a test PCB to carry out experimental qualifications.

Chapter 6 concludes the dissertation by highlighting the development of advanced readout frontends for high-precision charge detection, with improved time resolution, accuracy, and power efficiency. The proposed designs, optimized for short circuit and open circuit modes, show excellent performance. This chapter also proposes aspects of this work that can be investigated further for improvements.

# **List of Publications**

### **Journal Papers**

- A. Mohammad Zaki and S. Nihtianov, "Characterization Challenges of a Low Noise Charge Detection ROIC," in IEEE Transactions on Instrumentation and Measurement, vol. 71, pp. 1-8, 2022.
- A. Mohammad Zaki and S. Nihtianov, "Characterization of a Charge-Sensitive Readout Electronics for Pixelated PIN Detectors," in IEEE Transactions on Instrumentation and Measurement. (Ready for Submission)

### **Conference Papers**

- M. A. Disi, A. Mohammad Zaki, Q. Fan and S. Nihtianov, "High-Count Rate, Low Power and Low Noise Single Electron Readout ASIC in 65nm CMOS Technology," 2021 XXX International Scientific Conference Electronics (ET), Sozopol, Bulgaria, 2021, pp. 1-5.
- A. Mohammad Zaki and S. Nihtianov, "High Time Resolution, Low-Noise, Power-Efficient, Charge-Sensitive Amplifier in 40 nm Technology," 2022 XXXI International Scientific Conference Electronics (ET), pp. 1-6, 2022.
- A. Mohammad Zaki and S. Nihtianov, "Experimental Qualification of a Low-Noise Charge-Sensitive ROIC with Very High Time Resolution," 2023 IEEE 32nd International Symposium on Industrial Electronics (ISIE), Helsinki-Espoo, Finland, 2023.
- A. Mohammad Zaki, Yutong Du, S. Nihtianov, "Design and Qualification of a High-Speed Low-Power Comparator in 40 nm CMOS Technology," 2023 XXXII International Scientific Conference Electronics (ET), pp. 1-5, 2023.
- A. Mohammad Zaki and S. Nihtianov, "Low-Offset Band-Pass Signal Shaper with High Time Resolution in 40nm CMOS Technology," 49th Annual Conference of the IEEE Industrial Electronics Society (IECON), Singapore, 16–19 October 2023.
- A. Mohammad Zaki, Yutong Du, S. Nihtianov, "Low-Power High Time Resolution Charge Detection ROIC in 40nm CMOS Technology," 2024 IEEE International Instrumentation and Measurement Technology Conference (I2MTC), Glasgow, United Kingdom, 2024.
- A. Mohammad Zaki and S. Nihtianov, "A High-Precision Particle Detection ROIC With an Active Shaper in 40 nm CMOS with Sub 200 aC Sensitivity," 2024 IEEE 33rd International Symposium on Industrial Electronics (ISIE), Busan, South Korea, 2024.
- A. Mohammad Zaki and S. Nihtianov, "Challenges of High-Resolution Electron Detection ASICs for SEM Microscopy," 2024 9th International Conference on Mathematics and Computers in Sciences and Industry (MCSI), Rhodes Island, Greece, 2024, pp. 68-76.
- A. Mohammad Zaki, L. Bouman, S. Nihtianov, "A Readout Frontend with Sub-6ppm Error Rate for Sub-200aC Charge Detection and 200µW Power Consumption in Scanning Electron Microscopy," 2025 IEEE European Solid-State Electronics Research Conference (ESSERC), Munich, Germany, 2025.

## Acknowledgments

As I reach the final stages of this thesis, I am profoundly aware that this achievement is not solely the result of my individual efforts but rather the culmination of guidance, encouragement, and expertise from many remarkable individuals. It is with deep gratitude that I acknowledge the invaluable contributions of those who have supported and shaped this endeavor.

First and foremost, I would like to express my sincerest appreciation to my supervisor and promotor, Prof. Stoyan Nihtianov. Your guidance throughout this research has been truly indispensable. You have continuously encouraged me to think critically and explore new ideas, always offering insightful advice and a broader perspective. Your steady support and belief in my abilities have been instrumental in driving this work forward. I am immensely grateful for the opportunities you have provided and for the academic rigor you have instilled in me.

I am also profoundly grateful to my copromotor, Dr. Sijun Du, for his support and thorough review of my work. Though our interactions were limited, your insightful feedback and the time you dedicated to reviewing my thesis have been valuable in refining and finalizing this research. Your contributions are sincerely appreciated.

A special note of thanks goes to Prof. Kofi Makinwa for his exceptional support throughout my research journey. His guidance and mentorship have been truly invaluable. I have learned a great deal from his vast knowledge and his approach to problem-solving. The thoughtful advice and constructive feedback from Prof. Makinwa have significantly shaped my thinking and contributed to the depth of this work. I feel truly fortunate to have had the opportunity to learn from such an outstanding mentor.

I would also like to extend my heartfelt appreciation to the technicians in the Electronic Instrumentation Laboratory: Zu Yao, Lukasz, Ron, and Jeroen. Your exceptional expertise and dedication were crucial to the success of my experimental tests. From the intricate process of chip bonding to troubleshooting and ensuring the proper functioning of measurement devices, your support was invaluable in overcoming the technical challenges of this project. Your knowledge and patience allowed me to conduct my experiments with confidence, and I am sincerely grateful for your unwavering assistance.

My sincere gratitude also extends to Joyce, Kellen, and Elizabeth, our group management assistants, whose organizational support has been vital in ensuring the smooth progression of this research. From handling administrative tasks to coordinating logistics, your efficiency and attention to detail allowed me to focus fully on my work. I truly appreciate your tireless efforts in managing operations and your unwavering support throughout this journey. Your contributions have played a significant role in the overall success of this project, and I am deeply thankful for all that you have done.

To my colleagues and friends in our research group—Adrian, Amirhossein, Arthur, Davide, Dennis, Douwe, Efraïm, Eren, Flavia, Floris, Guilherme, Guijje, Imad, Jannik, Jacopo, Jaekyum, Karimeldeen, Lex, Lucia, Martin, Mert, Mingshuang, Mojtaba, Nandor, Nicolas, Nikola, Nuriel, Ole, Rasoul, Reza, Roger, Sandra, Suhas, Sundeep, Tianqi, Vidharshana, Yannik, Yang, Xinling, Xiaoxi, and all others—thank you for your collaborative spirit, stimulating discussions, and the camaraderie we have shared. The mutual support within our group has been essential in overcoming challenges, and I am grateful for the knowledge and insights I have gained from each of you.

I would also like to extend my heartfelt appreciation to my amazing friends Babak, Ehsan, Faezeh, Leonhard, Mahdi, Mahan, Mohammadjavad, Marjan, Neda, Rasoul, Sajjad, Sina, and so many others, who have been my constant source of joy, encouragement, and support throughout this journey. The countless shared moments of laughter, deep conversations, and unwavering encouragement have made even the toughest days more manageable. Whether through your kind words, spontaneous get-togethers, or simply being there when I needed a friend, you have all played a special role in keeping me balanced and motivated. I am incredibly grateful for the wonderful memories we've created, and I look forward to making many more together in the future!

I owe an immeasurable debt of gratitude to my parents, Mohammad and Elahe, for their unconditional love, encouragement, and unwavering belief in me. Your sacrifices, both great and small, have been the foundation upon which my academic journey was built. Through every challenge, you have provided me with the strength and motivation to persevere. Your constant support has been my anchor, and I could not have reached this point without your love and dedication.

To my brother, Erfan, thank you for your endless support and understanding. You have been a source of inspiration and strength throughout this journey. Your encouragement, kindness, and belief in me have helped me overcome many hurdles, and I am deeply grateful for your presence in my life. I wish you nothing but success and fulfillment in your own endeavors as you continue to achieve great things in your journey.

Lastly, my deepest thanks to Sophie for your love, patience, and unwavering support. Your encouragement and presence have been a constant source of strength and motivation. You have stood by me through every challenge, offering kindness, understanding, and a sense of balance. I am incredibly grateful for your support and for all the special moments we have shared throughout this journey.

This thesis represents not only the culmination of my efforts but also the collective support of all those who have been part of this journey. I am deeply grateful to each and every one of you.

> Alireza Mohammad Zaki
> Delft, February 2025


## **About the Author**



Alireza Mohammad Zaki was born in Tehran, Iran, in 1995. He received his B.Sc. degree in Electrical Engineering from Azad University, South Tehran Branch, Iran, in 2017, and his M.Sc. in Electronics Engineering from Politecnico di Milano, Italy, in 2020. For his M.Sc. thesis, he focused on the development of a readout ASIC for UV-ray spectroscopy. In 2021, he joined the Electronic Instrumentation Laboratory at Delft University of Technology, where his doctoral research centered on high-precision charge detection frontend electronics for scanning electron microscopy.