## SOME MODELS AND IMPLEMENTATIONS OF DIGITAL LOGIC FUNCTIONS USING JUNCTION CHARGE-COUPLED DEVICES



TR diss 1667 JAAP HOEKSTRA






#### THESIS

submitted to obtain the degree of doctor at the Technische Universiteit Delft, on the authority of the Rector Magnificus, Prof.drs. P.A. Schenck, to be defended in public before a committee appointed by the Board of Deans (College van Dekanen), on Tuesday 4 October 1988 at 14:00

by

#### JAAP HOEKSTRA

born in Amsterdam, doctorandus in experimental physics


#### This thesis has been approved by the supervisors: Prof.dr. M. Kleefstra (promotor), Prof.dr.ir. P.M. Dewilde (co-promotor)

Contents

- 1. INTRODUCTION
- 2. OVERVIEW
  - 2.1 Historical review
    - 2.1.1 Charge-coupled devices
    - 2.1.2 Logic circuits with charge-coupled devices
  - 2.2 Digital charge-coupled logic
  - 2.3 Introduction to systolic arrays
- 3. THEORY AND OPERATION
  - 3.1 Introduction
  - 3.2 Equations of the electrical potential
  - 3.3 [...] models
    - 3.3.1 Charge coupling I, the hydraulic model
    - 3.3.2 Charge storage I, storage in the potential well
  - 3.4 Derivation of charge-potential relationships
  - 3.5 Vertical charge transport
    - 3.5.1 Vertical charge transport I, principles
    - 3.5.2 Charge storage II, storage in the pnp-transistor
  - 3.6 More realistic models
    - 3.6.1 Charge coupling II, more realistic model
    - 3.6.2 Vertical charge transport II, modeling
    - 3.6.3 Backward charge flow
  - 3.7 [...]ic wells, and lateral confinement
  - 3.8 [...] technology of JCCL
  - 3.9 [...]on of basic JCCL structures
    - 3.9.1 Structure of the AND/OR function
    - 3.9.2 Balanced injector structure
  - 3.10 [...] characteristics
- 4. [...]IONS AND IMPLEMENTATIONS
  - 4.1 Introduction
  - 4.2 General description of JCCL
    - 4.2.1 Functions in charge domain
    - 4.2.2 Functions in charge and current domains
    - 4.2.3 Functions using exclusive ORs (XORs)
    - 4.2.4 Threshold logic
  - 4.3 The transformation of DCCL into JCCL
  - 4.4 Full adder using exclusive-OR gates
  - 4.5 Full adder/full subtractor
  - 4.6 Threshold logic full adder
  - 4.7 [...]
- 5. EXPERIMENTAL RESULTS
  - 5.1 Introduction
  - 5.2 Results on JCCL, $n \leq 2$
    - 5.2.1 Simple JCCD logic at 20 MHz
    - 5.2.2 JCCL operating up to clock frequencies of 40 MHz
    - 5.2.3 Threshold full adder
  - 5.3 Results on JCCL, $n > 2$
  - 5.4 JCCL compatible logic at clock voltages down to 2 V
- 6. SYNTHESIS
  - 6.1 Introduction
  - 6.2 Some pipelined multiplier arrays for bit level systolic array architectures
  - 6.3 A bit level spiral systolic division array
  - 6.4 Junction charge-coupled devices for bit level systolic arrays
- Appendix A: On the action formulation in semiconductor physics and modeling
- References
- Summary
- Samenvatting
- Acknowledgement
- List of symbols

# INTRODUCTION

This thesis describes some models and implementations of digital logic functions using junction charge-coupled devices (JCCDs). It is a sequel to the research on junction charge-coupled logic (JCCL), which started some years ago [1.1,1.2]. Operating at driving frequencies below a characteristic value, junction charge-coupled devices have a recognized advantage in low power and high functional density, which justifies research on applications in the field of digital integrated circuits.

Charge-coupled device (CCD) is a generic term which has come to be applied to a family of functional solid-state electronic devices. Under the application of a proper sequence of clock pulses these devices move "potential wells" filled with quantities of electrical charge in a controlled manner across a semiconductor substrate. In digital applications the most common example of the use of CCDs is found in memories. The digital information is represented by the presence or absence of a charge packet. This memory function can be extended with logical functions. However, in charge-coupled devices where charge packets are shifted at each clock pulse, these functions can only be carried out in a pipelined manner. The throughput (= the number of sets of data that can be processed by a functional block or an array in a given amount of time) of pipelined arithmetic calculations is high. The data can enter the logic part of the device at the maximum clock rate. The answer becomes available some time later, but still at the maximum clock rate. In 1982 Nash introduced charge-coupled devices for use in special purpose pipelined arrays of computing elements, such as systolic arrays. He considered the advantages of CCD logic in terms of the quantity (gate density × maximum clock frequency / power dissipation) to arrive at an appropriate figure of merit for very large scale integration (VLSI) [1.3].
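The difference between latency and throughput in a pipelined device can be made concrete with a small calculation. The following Python sketch is an illustration only: it assumes an idealized pipeline in which every stage takes exactly one clock period, and the function names and example numbers are hypothetical rather than taken from the devices discussed in this thesis.

```python
def pipeline_cycles(n_stages, n_data_sets):
    """Clock cycles needed by an ideal pipeline (one stage per clock).

    Returns (latency, total): the number of cycles before the first answer
    appears, and the number of cycles until the last answer has appeared.
    """
    if n_stages < 1 or n_data_sets < 1:
        raise ValueError("need at least one stage and one data set")
    latency = n_stages
    total = n_stages + n_data_sets - 1
    return latency, total


def throughput(n_stages, n_data_sets, f_clock_hz):
    """Average number of data sets processed per second."""
    _, total = pipeline_cycles(n_stages, n_data_sets)
    return n_data_sets * f_clock_hz / total


def figure_of_merit(gate_density, f_max_hz, power_w):
    """Gate density x maximum clock frequency / power dissipation."""
    return gate_density * f_max_hz / power_w


if __name__ == "__main__":
    # Hypothetical example: a 16-stage pipeline clocked at 10 MHz.
    for n in (1, 16, 10_000):
        print(n, throughput(16, n, 10e6))
```

For long data streams the average throughput approaches the clock rate, while the latency stays fixed at the number of stages; this is the sense in which the answer arrives later but still at the maximum clock rate.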

This thesis aims to furnish the reader with a working knowledge of the physical principles of JCCDs as used in logic applications, and to provide tools for the concise and precise description of the basic structures and synthesis of JCCL. The core topic of the first part is the analytical solution of a simplified JCCD. In JCCD literature there are several equations expressing the amount of charge that can be stored in a junction charge-coupled device. Using the simple model a correct expression is derived. Also, a new concept of the charge transport in the junction charge-coupled device is introduced, which is not based on the charge handling capacity (= the maximum amount of charge that can be contained) of the potential well from which charge is leaving, but on the equality of the electrical potential in the driving and receiving well. The description of the basic structures and synthesis of JCCL is a natural extension of the existing theory on junction charge-coupled logic. The basic structures are shown to be elements of both Boolean logic and threshold logic. Bit-level systolic arrays are considered as a main tool for the synthesis.

Briefly, this thesis is organized as follows. Chapter 2 presents an overview of research on charge-coupled devices and an introduction to systolic arrays. In Chapter 3 the basic principles, technology, and limitations of JCCDs as used in logic applications are discussed. The first part of Chapter 4 deals with the general description of JCCL. In the second part of Chapter 4 several JCCL full adders are developed and discussed. Chapter 5 shows experimental results on (i) simple logic devices and (ii) a threshold full adder. Finally, in Chapter 6, the last chapter, the synthesis of JCCL is discussed. Junction charge-coupled logic is a technology for bit-level systolic arrays. From this point of view, the theory of bit-level systolic arrays is part of a synthesis technique. Appendix A discusses a possibility for obtaining the potential relationships in semiconductor devices, here the pn-junction, not from Poisson's equation but from an action integral.

# OVERVIEW

#### 2.1 Historical review

#### 2.1.1 Charge-coupled devices

Charge-coupled devices belong to the class of charge transfer devices. The basic ideas of charge transfer devices have grown out of the development of several concepts.

One of these concepts was that of analogue shift registers. The notion of a shift register involved the passage of charge along a line of capacitors through the sequential switching of transistors. An integrated version of an analogue shift register was proposed by Sangster in 1966, under the name "bucket-brigade device" [2.1]. In 1970 integrated versions of these circuits were shown to be practical for delay and other applications.

Another development came from research on surface charge transistors (Engeler et al.) in 1970 [2.2]. This research involved a concept for controlling the transfer of stored electrical charge along the surface of a semiconductor.

Probably the most important concept originated from the work of Boyle and Smith (1970) on charge-coupled devices [2.3]. In a special issue of IEEE Transactions on Electron Devices (1976) Boyle and Smith recalled that the charge-coupled device concept was a structure that called upon existing technology, and was stimulated by the analogous work that preceded it in magnetic bubbles [2.4]. It was interesting to look for a semiconductor analogy of the magnetic bubble device. First, the charge packet was found as the semiconductor analogy of the magnetic bubble. The next problem was how to store this charge in a confined region. At this point a very important ingredient had been the development of the silicon diode camera tube. As well as the light sensitivity, the diode array had a charge storage capability. The charge could be stored in diodes for periods approaching a hundred seconds [2.5].

In their construction of the electric analogy of the magnetic bubble device Boyle and Smith used the metal-oxide-semiconductor (MOS) capacitor in depletion to store the charge. If a voltage was applied to this MOS structure a potential well was formed at the surface into which one could introduce charge (or not) to represent information. The final problem was to find a way to shift the charge from one side to the other, thereby allowing manipulation of the information. This was solved by placing the MOS capacitors close together, so that the charge passes easily from one to the next when a more attractive voltage is applied to the receiver.

The MOS charge-coupled devices in their different forms are treated in many textbooks, such as those by Sequin and Tompsett [2.6], Beynon and Lamb [2.7], Howes and Morgan [2.8], and Esser and Sangster [2.9].

Boyle and Smith have described the working of a surface-channel CCD. The next important step was the development of the buried-channel CCD (BCCD) by Esser [2.10] and Walden et al. [2.11]. In buried-channel charge-coupled devices majority carriers are transported, that is, electrons in an n-type conductivity layer, and individual charge packets are separated by a depleted region. If the MOS capacitors are replaced by reverse-biased pn-junctions we obtain a junction charge-coupled device; if they are replaced by reverse-biased Schottky barriers, we obtain a Schottky charge-coupled device. Both were first proposed by Schuermeyer et al. [2.12].

#### 2.1.2 Logic circuits with charge-coupled devices

The idea of using charge-coupled devices for digital circuits and binary logic operations originated in the early 1970s. In 1971 Kosonocky and Carnes [2.13] summarized their work on the digital operation of charge-coupled circuits. Their paper described the operation and application of charge-coupled shift registers and the necessary charge regeneration stages. Thereafter Tompsett (1972) proposed the elementary logic operations NAND and NOR based on charge regenerators [2.14].

These operations did not involve direct interactions of charge packets. The presence or absence of charge packets in a parallel CCD shift register was detected by a floating gate. Mok and Salama (1972) introduced the principle of charge overflow in logic CCDs, which made it possible to have a direct interaction of charge packets in logic devices [2.15]. They used built-in potential barriers for charge packets. In this way the basic logic operation was that of CCD majority logic (= logic that tests whether the sum of a given number of charge packets is greater than a certain amount of charge). The built-in potential barrier could be realized in essentially two ways. First, by placing an ion implantation or a local increase in oxide thickness under the gate. Second, by inserting a separate gate with an offset voltage with respect to the following gate.
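The charge-overflow principle and the majority test described above can be mimicked with a few lines of arithmetic. The Python sketch below is a toy model under stated assumptions: charge is counted in idealized unit packets, the barrier is represented only by a well capacity, and the function names `overflow` and `majority` are hypothetical. It does not model the physics of a real barrier.

```python
def overflow(packets, capacity):
    """Merge charge packets in a well that holds at most `capacity`.

    Returns (retained, spilled): the charge that stays behind the barrier
    and the charge that flows over it.
    """
    total = sum(packets)
    retained = min(total, capacity)
    spilled = total - retained
    return retained, spilled


def majority(packets, unit=1):
    """1 if more than half of the unit-sized packets are present, else 0."""
    threshold = unit * len(packets) / 2
    return 1 if sum(packets) > threshold else 0


if __name__ == "__main__":
    for a in (0, 1):
        for b in (0, 1):
            print(a, b, overflow((a, b), capacity=1))
    print(majority((1, 1, 0)), majority((1, 0, 0)))
```

In this toy model, merging two unit packets in a well of unit capacity leaves behind the logical OR of the inputs and spills their logical AND, which illustrates how a direct interaction of charge packets can yield logic functions.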

The principles of 'floating gate' logic or charge overflow using potential barriers formed the basis of all investigations on logic CCD circuits, such as those by Zimmerman et al. [2.16], Montgomery and Gamble [2.17], and Kerkhoff et al. [2.18].

With the development of JCCDs a new approach, that of charge injection and charge detection with bipolar transistors, was introduced (Wolsheimer [2.19], May et al. [2.20], and van der Klauw [2.21]).



Fig. 2.1 Addition: the carry bits propagate on the diagonal, the set of delays on the input bits is above the diagonal, the set of delays on the output lines is under the diagonal.

In 1977, in an article presenting digital charge-coupled logic (DCCL), Zimmerman et al. described a method of implementing digital logic functions based on the use of CCDs in pipelined configurations. The reason that pipelined calculations in arithmetic units are required is associated with the generation of the carry bit at each stage. For example, in the addition of two n-bit words, the two least significant bits can be added immediately and produce their sum and carry outputs. Only then is this carry available to be combined with the next significant bits to produce a new sum and carry. Figure 2.1 shows that in this manner the carry is delayed during each operation, and so the application of the next significant bits must be delayed by an equal amount. This requires a set of delays on the input lines. An analogous set of delays must be inserted on the output lines if the entire word must be available at one clock pulse sometime in the future. There is another implication of using pipelined arithmetic. As data enter at one clock phase and exit at one clock phase in the future, it is not efficient to do random calculations with pipelined techniques. This means that digital CCD technologies are best suited for signal processing functions on several blocks of data simultaneously.
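The skewed timing of Fig. 2.1 can be checked with a short simulation. The Python sketch below is a behavioural illustration, not a model of any DCCL circuit: it assumes one full-adder stage per bit, one clock period per stage, and a carry latch between neighbouring stages; the function names are hypothetical. Bit i of word k is processed at clock k + i, which corresponds to the input delays, and the result word is complete once its most significant stage has fired, which corresponds to the output delays.

```python
def full_adder(a, b, c):
    """One-bit full adder: returns (sum, carry)."""
    s = a ^ b ^ c
    carry = (a & b) | (a & c) | (b & c)
    return s, carry


def pipelined_add(pairs, n_bits):
    """Add a stream of (x, y) word pairs, one new pair entering per clock.

    Each operand must fit in n_bits. Returns (sums, n_clocks).
    """
    n_words = len(pairs)
    sums = [0] * n_words
    latch = [0] * (n_bits + 1)          # latch[i]: carry waiting for stage i
    n_clocks = n_words + n_bits - 1 if n_words else 0
    for t in range(n_clocks):
        nxt = [0] * (n_bits + 1)
        for i in range(n_bits):
            k = t - i                   # word handled by stage i at clock t
            if 0 <= k < n_words:
                x, y = pairs[k]
                s, c = full_adder((x >> i) & 1, (y >> i) & 1, latch[i])
                sums[k] |= s << i
                nxt[i + 1] = c          # carry reaches stage i+1 one clock later
        k_done = t - (n_bits - 1)       # word whose last stage fired this clock
        if 0 <= k_done < n_words:
            sums[k_done] |= nxt[n_bits] << n_bits
        latch = nxt
    return sums, n_clocks


if __name__ == "__main__":
    print(pipelined_add([(3, 5), (15, 1), (7, 9)], n_bits=4))
```

Three 4-bit additions finish in 3 + 4 - 1 = 6 clocks in this sketch, whereas a single addition already needs 4 clocks; the pipeline pays off only when many words are processed in succession.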

Fortunately a large number of algorithms are either already in a pipelined organization or can be cast into one. When a variety of such systems is considered, certain basic functions appear repeatedly. The multiplier, for example, requires adders; the fast Fourier transformation requires multipliers and adders; serial correlators require shift registers, multipliers, and accumulators; digital differential analysers use adders and shift registers to perform integration; division and Hadamard transforms require add and subtract functions. The most advanced result achieved with digital charge-coupled logic was the design of a Hadamard transformer chip of 100 mm<sup>2</sup> in 1979 [2.22].

Another approach was described by Nash in 1982. He combined charge-coupled devices with conventional MOS circuitry in such a way that the low power and high packing density of CCDs are combined with some of the high-speed combinatorial (nonclocked) logic capabilities of conventional NMOS circuits [2.23]. This approach allows information to propagate or ripple through a circuit, and can reduce the number of delays. At the moment the follow-up of this research, the description of a technology which combines charge-coupled devices with CMOS circuitry, is being carried out in several places throughout the world. Nash introduced systolic arrays as an important candidate for the application of CCD logic circuits. The special features of this kind of array, namely regular structures, pipelined architecture, special purpose, and neighbor communication, could be well matched by a charge-coupled device logic.

#### 2.2 Digital charge-coupled logic

The most elaborate research published on charge-coupled logic was that on digital charge-coupled logic [2.22]. For over 6 years many scientists worked on this topic, resulting in, among other things, 14 publications and 6 patents. It is important to recapitulate some of their conclusions. In section 4.3 the basic logic functions in digital charge-coupled logic are considered, and the transformation of logic functions from DCCL into JCCL will be discussed.

Fig. 2.2 Conventional computer architecture with a single processing element (PE).

The design and realization of DCCL logic and arithmetic circuits presented a number of very difficult conceptual and modeling problems. The basic adder cells emerged as the most difficult and most essential circuits for performing DCCL logic and arithmetic functions. The half adder had become the basic element, being easily configurable into other essential logic functions, such as charge refresh and logical AND, and having an overall performance better than the full adder. Computer models predicted speeds of 5-10 MHz. Layout problems were caused by the lack of standardized symbolism and the inability to directly interconnect two physically separated signal points with a metal conductor. The final layout obstacle was the lack of computer-aided design rule checking. In an n-channel technology the half adder was successfully demonstrated at 5 MHz. The arithmetic functions obtained in digital charge-coupled logic include 16x16 multipliers and the Hadamard transformation.
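Why a half adder can serve as a basic element is easily seen at the Boolean level. The Python sketch below is a purely logical illustration with hypothetical function names; it says nothing about how the DCCL cells were laid out. It shows the standard construction of a full adder from two half adders, and how AND and a refresh (identity) function follow from a half adder with suitable inputs.

```python
def half_adder(a, b):
    """Returns (sum, carry) of two bits."""
    return a ^ b, a & b


def full_adder(a, b, carry_in):
    """Full adder built from two half adders and an OR of their carries."""
    s1, c1 = half_adder(a, b)
    s, c2 = half_adder(s1, carry_in)
    return s, c1 | c2


def logical_and(a, b):
    """The carry output of a half adder is the AND of its inputs."""
    return half_adder(a, b)[1]


def refresh(a):
    """With the second input held at 0 the sum output reproduces the input."""
    return half_adder(a, 0)[0]


if __name__ == "__main__":
    for a in (0, 1):
        for b in (0, 1):
            for c in (0, 1):
                print(a, b, c, full_adder(a, b, c))
```

The two half-adder carries can never both be 1, so combining them with an OR gives the correct full-adder carry.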

#### 2.3 Introduction to systolic arrays

In many digital signal processing applications, there are increasing demands for large-volume or high-speed computations that have to be performed on continuous data streams. The classical Von Neumann circuit architecture fixes a limit to the computing speed. Figure 2.2 shows that in a Von Neumann machine all the processing logic is contained in a processing element (PE) and the memory (M) is located almost entirely in a separate part.

Fig. 2.3 Systolic computer architecture using local data storage.

There are two main limitations to this type of circuit organization which must be overcome to obtain maximum benefit from VLSI technology. First, the sequential nature of such machines places a basic limit on their operating capabilities in high-speed processing. Second, a limitation is caused by the long global communication path owing to the separation of the processing element from the memory. Ultimately, the computation speed of such circuitry will be dominated by the time taken to communicate information between the logic elements and the memory, rather than by the intrinsic speed of the logic devices. To overcome these limitations, research on parallel processing techniques is stimulated. Systolic arrays, Fig. 2.3, are well suited to implement a major class of signal processing algorithms. According to H.T. Kung and C.E. Leiserson [2.24], 'A systolic array is a network of processors which rhythmically compute and pass data through the system'.

A formal definition is given by S.Y. Kung [2.25]. A systolic array is a computing network possessing the following features:

- a) Synchrony: the data are rhythmically computed (timed by a global clock) and passed through the network.
- b) Regularity (Modularity and Local Interconnections): the array consists of modular elements with regular and spatially local interconnections.
- c) Temporal locality: there is at least one unit-time delay allotted so that signal transactions from one node to the next can be completed.
- d) Pipelinability: the throughput is independent of the size of the array.

If we consider bit-level systems we note that especially the features a, c, and d can easily be conformed to by charge-coupled logic. In fact, junction charge-coupled logic is a technology for bit-level systolic arrays.

Much of the research on systolic arrays is devoted to architectural design. Different approaches can be distinguished. S.Y. Kung describes a graphical approach which follows the progression of choosing an algorithm, massaging this into a more suitable form, mapping it onto an array architecture and converting this to a systolic array [2.26]. This approach is also discussed by Kung, Anneveling and Dewilde [2.29]. Other techniques are based on operations on the recurrence equations. If algorithms with 'strong' recursive properties are already available, then the data dependences can often be expressed as matrix-type computations [2.27]. McCanny and McWhirter use this approach to select bit-level solutions which can be used as building blocks in a system level design [2.28]. The final step of the trajectory is the selection of concrete algorithms for bit-level computations. Some, ad hoc, systolic arrays on the bit-level are described in chapter 6.
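The four features of a systolic array can be illustrated with a small simulation. The Python sketch below is a word-level example chosen for brevity, not one of the bit-level arrays of chapter 6, and all names in it are hypothetical. It assumes a linear array of identical cells computing a convolution: each cell holds one fixed weight, partial sums move to the next cell every clock, and input samples move along the array at half that speed (two delays per cell). All transfers are between neighbouring cells and are timed by one global clock.

```python
def systolic_fir(weights, samples):
    """Compute y[n] = sum_j weights[j] * x[n - j] on a linear systolic array."""
    k = len(weights)
    x_line = [0] * (2 * k - 1)      # sample delay line, cell j taps x_line[2*j]
    y_regs = [0] * k                # partial-sum register at the output of each cell
    out = []
    stream = list(samples) + [0] * (k - 1)   # extra clocks to flush the pipeline
    for t, x in enumerate(stream):
        x_line = [x] + x_line[:-1]           # samples advance one delay per clock
        new_y = [0] * k
        for j in range(k):
            y_in = y_regs[j - 1] if j > 0 else 0   # partial sum from the left neighbour
            new_y[j] = y_in + weights[j] * x_line[2 * j]
        y_regs = new_y
        if t >= k - 1:                       # first result leaves after k clocks
            out.append(y_regs[k - 1])
    return out


if __name__ == "__main__":
    print(systolic_fir([1, 2, 3], [1, 0, 0, 2]))
```

After an initial latency of k clocks one result leaves the array per clock, regardless of the number of cells, which is the pipelinability property in feature d.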

# THEORY AND OPERATION

#### 3.1 Introduction

In this chapter an introduction to the basic principles, technology and limitations of junction charge-coupled devices is given. Using a very simple model, namely an analytical solution of the equations describing the potential profile in a JCCD with uniform doping concentrations and abrupt junctions, it is possible to explain the essential features of JCCDs. In JCCD literature simple models have been studied; however, an analytical solution has not yet been obtained [3.1-3.4].

Unlike other integrated circuits, the charge-coupled device has no discrete equivalent circuit, that is to say, it cannot be made up of discrete devices. The typically dynamic and transient behavior of JCCDs makes it apparent that, although conventional one-dimensional considerations lead to qualitative and heuristic arguments concerning the device operation, the phenomenon of charge coupling along the transport direction is essentially two-dimensional, the phenomenon of charge transport along the lateral confinements is essentially three-dimensional, and the phenomenon of vertical charge transport is essentially



Fig. 3.1 Simple picture of the charge transfer in a CCD. Under influence of clock pulses  $(\phi_1, \phi_2, \phi_3)$  a potential well is transported. The dots represent the signal charge that is transported in the potential well.

An essential feature of JCCDs is that information is stored in the form of packets of electrical charge in potential wells created in the semiconductor by the influence of separated gates. Under the control of external voltages applied to the gates, the potential wells, and hence the charge packets, can be shifted through the semiconductor. Because of the almost linear relation between the 'depth' of the potential well and the voltage on the gate, a simple hydraulic model for the charge-storage and charge-transport mechanism is generally used to depict the operation of (J)CCD structures.

To understand how a potential well can be moved from one location to another in a JCCD structure, consider the arrangement of three separated gates shown schematically in Fig. 3.1. We assume that some charge is stored initially in the potential well under the first gate, which is clocked to 7 V. The other gates are at ground potential. The well underneath the 7 V gate will be much deeper than those underneath the grounded gates. It is only in this well that charge will be stored. By applying a succession of varying voltages to the JCCD gates a charge packet can be propagated in a controlled manner. The minimum number of clock phases needed to propagate the potential well unambiguously in one direction, in this case, is three. Figure 3.2 shows a perspective view of the two-dimensional potential distribution if one is looking against the transport direction.

Fig. 3.2 Perspective view of the two-dimensional potential distribution. The potential distribution is a result of the sequence of gate voltages as shown in (a). It clearly shows the local potential well underneath gate C.
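The need for three clock phases can be reproduced with a deliberately crude model of the hydraulic picture. The Python sketch below assumes identical, symmetric gates, with gate i driven by phase i mod n, exactly one phase high at a time, and a packet always moving completely into a neighbouring deep well. These assumptions, and the function name `clock_step`, are introduced here for illustration and ignore any built-in asymmetry a real structure may have.

```python
def clock_step(charge, active_phase, n_phases=3):
    """Advance all packets by one clock step.

    charge[i] is the charge under gate i; the gates whose phase equals
    active_phase now have the deep well. Returns (new_charge, drained),
    where drained is the charge that left the end of the register.
    """
    n = len(charge)
    shifted = [0] * n
    drained = 0
    for i, q in enumerate(charge):
        if q == 0:
            continue
        if i % n_phases == active_phase:
            shifted[i] += q                  # already under a deep well
            continue
        right = (i + 1) % n_phases == active_phase
        left = (i - 1) % n_phases == active_phase
        if left and right:
            raise ValueError("direction of transfer is ambiguous")
        if not left and not right:
            shifted[i] += q                  # no deep neighbour, packet stays
            continue
        target = i + 1 if right else i - 1
        if 0 <= target < n:
            shifted[target] += q
        else:
            drained += q                     # packet leaves the register
    return shifted, drained


if __name__ == "__main__":
    wells = [5, 0, 0, 0, 0, 0]               # packet under gate 0 (phase 0)
    for phase in (1, 2, 0):
        wells, _ = clock_step(wells, phase)
        print(phase, wells)
```

With three phases every packet has exactly one deep neighbouring well at each step, so the direction is fixed by the order of the clock pulses. With two phases in this symmetric model both neighbours of an interior gate become deep at once, and the sketch raises an error.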

#### 3.3.2 Charge storage I, storage in the potential well

Basically a junction charge-coupled device consists of a lightly doped p-type substrate and an n-type epilayer with diffused p-gates. The source and drain consist of n<sup>+</sup>-diffusions. Figure 3.3 shows a typical JCCD.

For a given JCCD structure all electrons are removed from the epilayer by applying a sufficiently large positive voltage to source and drain, while the gate and substrate are kept at ground potential. In this case, the depletion layers extending from the gate-epilayer interface and from the substrate-epilayer interface touch. At this point, the electrical potential has reached a maximum. This point is referred to as channel potential, Vch(Vg=0,Qs=0).

A well-defined local potential maximum can be created in the epilayer by clocking a gate to a positive voltage, Vg. This local potential maximum is referred to as Vch(Vg,0), and can serve as a potential well for electrons. Any electrons introduced into the structure will therefore be attracted to a plane parallel to the gate surface and passing through P, where the potential energy will be a minimum. The structure is thus capable of storing charge. The charge storage mechanism in a JCCD resembles that in a buried channel CCD: the information-carrying electrons are majority carriers which actually replace some of the electrons that were previously removed when the entire n-epilayer was depleted by the voltage applied to the source and drain. At low gate voltages, or with small quantities of stored charge, the electrons collecting in the vicinity of the plane through P produce a thin 'slab' of neutral semiconductor in an otherwise totally depleted n-type region.

Fig. 3.3 A typical JCCD consists of a p-substrate and an n-epilayer with diffused p-gates. The first and the last gates are the n<sup>+</sup>-source and n<sup>+</sup>-drain, respectively. The typical potential distribution along the x-axis has a minimum (the positive V-axis is drawn downwards). The point P indicates the potential maximum.

Consider a JCCD built up of uniform doping profiles as shown in Fig. 3.4. From the analysis of section 3.4 we derive the curves of Fig. 3.5, which show the potential distribution along the axis perpendicular to the center of the gate surface. In this case the acceptor concentration



Fig. 3.4 Impurity profile through a p-gate for an idealized JCCD.

of the gates is $10^{25}$ acceptor atoms/m$^3$, the n-epilayer of doping level $7 \times 10^{20}$ donor atoms/m$^3$ is 5 $\mu$m thick, and the substrate has $2 \times 10^{20}$ acceptor atoms/m$^3$. The upper curve shows the potential profile in the absence of a gate voltage Vg. The lowest curve shows the potential profile when a voltage of 5 V is applied to the gate. The curve in the middle shows the potential variation in the same structure when a signal charge of $7.3 \times 10^{14}$ electrons/m$^2$ is introduced. The region of constant potential indicates the physical extent of neutral semiconductor, the width of which is a few microns.

We now consider the effect of varying the voltage on a gate. A detailed analysis is given in the following section, but it is intuitively obvious that making the gate more positive must produce a general downward shift (the electron energy is plotted) in the curves of Fig. 3.5. From the analysis of section 3.4 we derive the curves of Fig. 3.6, which show the relationship between the channel potential Vch(Vg,0) and the gate voltage for various doping concentrations of the epilayer (the signal charge is zero). The figure clearly shows the almost linear relationship between the gate voltage and the maximum potential in the epilayer.



Fig. 3.5 Potential distribution along the axis perpendicular to the center of the gate surface. The curve in the middle shows the distribution in the presence of signal charge.

Figure 3.7 shows the channel potential in the absence of signal charge, Vch(0,0), as a function of the thickness of the epilayer at different epilayer dopings. It indicates the possibility of scaling the gate voltages for the logic devices down to below 5 V. The process described in this thesis was obtained by decreasing the epilayer thickness from 7 $\mu$m to 5 $\mu$m in a standard process for analogue JCCDs.

Figure 3.8 shows the situation when at a given gate voltage the 'well' is filled maximally; if more charge is introduced it will flow underneath adjacent gates, having in common (the potential at) point $x_2$. As indicated in the figure, this situation can occur if the gate voltage is equal to or less than Vch(0,0)-Vb, where Vb is the built-in voltage of the gate-epilayer junction. If the gate voltage is over Vch(0,0)-Vb, the maximum charge packet in






Fig. 3.8 The potential distribution if the potential well is filled maximally.

the 'potential well' is limited by the requirement that the gate-epilayer junction remains reverse-biased. In this case Vch(Vg,Qs) equals Vg+Vb.

Two remarks must now be made about the use of the hydraulic model. First, it is necessary to realize that the amount of charge cannot be represented by an area enclosed by the different potential curves, as is the case when we consider a bucket filled with water. The amount of charge is proportional to the length of the straight line between $x_1$ and $x_2$. Second, it is useful to notice that the shape of the potential curve is different for different amounts of charge; in this respect it resembles an elastic film more than a rigid bucket. Consequently it is not possible to obtain a very simple relation that expresses the amount of charge that can be transported in a single clock cycle per unit of gate area, Qs, in terms of the donor concentration, thickness of the epilayer, and the applied gate voltage. It can, of course, easily be expressed in terms of $x_1$ and $x_2$:

$$Qs = -q \int_{x_1}^{x_2} Ne \, dx$$
 (3.12)

but then we have to realize that $x_1$ and $x_2$ are functions of Ne, d(epi), Vg, and Vch(Vg,Qs). Here $x_2$ is situated before the point of full depletion at zero gate voltage, and $x_1$ is situated after the point where the equilibrium potential difference of the gate-epilayer junction has been built up. Formula (3.12), which expresses the signal charge in the JCCD, has not appeared in the international JCCD literature before. In section 3.4 the formulas for Qs, and its maximal value for a given gate voltage, $Q_{smax}$, are derived.
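As a minimal numerical illustration of (3.12), assume a uniform donor concentration Ne, so that the integral reduces to Qs = -q Ne (x2 - x1). The sketch below converts a signal charge density into the width of the neutral slab and back. The function names `slab_width` and `signal_charge` are hypothetical; the numbers are the doping level and signal charge quoted for Fig. 3.5.

```python
Q_E = 1.602176634e-19  # elementary charge (C)
NE = 7e20              # donor concentration of the epilayer (m^-3), quoted in the text


def slab_width(electrons_per_m2, ne=NE):
    """Width x2 - x1 (m) of the neutral slab, assuming uniform doping.

    From (3.12) with constant Ne: |Qs| = q * Ne * (x2 - x1).
    """
    return electrons_per_m2 / ne


def signal_charge(x1, x2, ne=NE):
    """Signal charge per unit gate area (C/m^2), equation (3.12) with constant Ne."""
    return -Q_E * ne * (x2 - x1)


if __name__ == "__main__":
    width = slab_width(7.3e14)  # signal charge quoted for the middle curve of Fig. 3.5
    print(f"slab width  : {width * 1e6:.2f} um")
    print(f"Qs per area : {signal_charge(0.0, width):.3e} C/m^2")
```

The sign convention follows (3.12): the stored electron charge is negative, while the slab width itself is a positive length.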



Fig. 3.10 The maximal amount of signal charge that can be handled before charge is spilled into neighboring gates, $Q_{smax}$, as a function of the gate voltage.

In Fig. 3.9 Vch(Vg,Qs) is shown as a function of the amount of signal charge in the potential well at different gate voltages. It also shows the physically allowed conditions for charge to be stored in the JCCD. The boundaries reflect

- (i) the impossibility of storing charge at a lower potential than that of the adjacent gates,
- (ii) that the gate-epilayer junction has to be reverse-biased, and
- (iii) that reach-through between the gates must not take place: the potential difference at the junction of a neighboring gate is equal to or less than the built-in voltage. For higher gate voltages this junction is forward biased and injection of holes is initiated.

The maximal signal charge that can be handled before charge is spilled into neighboring gates, $Q_{smax}$, is shown in Fig. 3.10 as a function of the gate voltage. It shows the maximal amount of charge under the same conditions as above. The maximum amount of charge, $Q_{smax}$, is also called the (signal) charge handling capability (CHC). For JCCDs the CHC curve has a maximum, $Max(Q_{smax})$, if the gate voltage equals Vch(0,0) minus the built-in voltage of the gate-epilayer junction. This maximum in the CHC curve must not be confused with the term 'Maximal Charge Handling Capability' used in CCD literature, which stands for another quantity.

#### 3.4 Derivation of charge-potential relationships

Consider that the gate and substrate are kept at ground potential, and the depletion layers extending from the gate-epilayer interface and from the substrate-epilayer interface touch. At this point the electrical potential has reached a maximum and is referred to as Vch(0,0). A well-defined local potential maximum is created by clocking a gate to a positive voltage Vg. This local potential maximum serves as a well for electrons.

In one dimension the distribution of the potential, V, and electric field, E, follow from:

$$\frac{d\mathbf{E}}{d\mathbf{x}} = - \frac{d^2 \mathbf{V}}{d\mathbf{x}^2} = \frac{\rho}{\epsilon} , \qquad \epsilon = \epsilon_{\text{Si}} \epsilon_0 \qquad (3.13)$$

If we use the abrupt approximation and neglect the minority carriers, we obtain (see Fig. 3.11):

Area I: x in interval  $[a_1, a_2]$ We use the boundary conditions  $E(a_1) = 0$ ;  $V(a_1) = Vg$ :

$$E_{I} = \frac{-q Ng}{\epsilon} (x - a_{1})$$
(3.14)

$$V_{I} = \frac{q}{2\epsilon} \left( Ng(x - a_{1})^{2} + \frac{2\epsilon}{q} Vg \right)$$
(3.15)

Fig. 3.11 Doping profile, electric field E, and potential V for an idealized and empty JCCD. The figure shows the parameters $a_1 \ldots a_5$, which are used in the derivation of the charge-potential relationships.

Area II: x in interval  $[a_2, a_4]$ With the boundary conditions  $E(a_3) = 0$ ;  $V_{I}(a_2) = V_{II}(a_2)$ :

$$E_{II} = \frac{q Ne}{\epsilon} (x-a_3)$$
(3.16)

$$V_{II} = \frac{-q Ne}{2\epsilon} (x - a_3)^2 + K$$
(3.17)

$$K = \frac{q}{2\epsilon} \left[ Ng(a_2 - a_1)^2 + Ne(a_3 - a_2)^2 + \frac{2\epsilon}{q} Vg \right] \quad (3.18)$$

Area III: x in interval  $]a_4, a_5]$ With the boundary conditions  $E(a_5) = 0$  and  $V(a_5) = 0$ :

$$E_{\text{III}} = \frac{-q \text{ Ns}}{\epsilon} (x - a_5) \qquad (3.19)$$

$$V_{\text{III}} = \frac{q \text{ Ns}}{2\epsilon} (x - a_5)^2$$
(3.20)

In these equations  $a_2$  and  $a_4$  are known;  $a_1$ ,  $a_3$ , and  $a_5$  are unknown. The unknown variables can be obtained from the following equations (continuity of the functions E and V):

$$E_{I}(a_{2}) = E_{II}(a_{2})$$
$$E_{II}(a_{4}) = E_{III}(a_{4})$$
$$V_{II}(a_{4}) = V_{III}(a_{4})$$

which result in:

$$(a_2 - a_1)Ng = (a_3 - a_2)Ne \qquad (3.21)$$

$$(a_5 - a_4)Ns = (a_4 - a_3)Ne \qquad (3.22)$$

$$Ng(a_2-a_1)^2 + Ne(a_3-a_2)^2 + \frac{2\epsilon}{q}Vg = Ns(a_5-a_4)^2 + Ne(a_4-a_3)^2$$
  
(3.23)

As we assume Ng>Ne>Ns, the solution of this set of equations can be written using:

$$A = NeNg^{-1} + 1, \qquad B = NeNs^{-1} + 1$$

as:

<u>1</u> if $A \neq B$

$$a_1 = a_2 - NeNg^{-1}(a_3 - a_2) \qquad (3.24)$$

$$a_5 = a_4 + NeNs^{-1}(a_4 - a_3) \qquad (3.25)$$

$$a_3 = \frac{-(Ba_4 - Aa_2) + \sqrt{D}}{A - B} \qquad (3.26)$$

with:

$$D = AB(a_4 - a_2)^2 + \frac{2\epsilon (Ng - Ns)}{q\,NgNs} Vg \qquad (3.27)$$

<u>2</u> if A = B

$$a_3 = \frac{a_2 + a_4}{2} - \frac{\epsilon}{q NeA(a_4 - a_2)} Vg \qquad (3.28)$$
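The solution of the continuity conditions (3.21) to (3.23) can be checked numerically. The following is a minimal sketch, not the program used for the figures: it takes the doping levels quoted in section 3.3.2 and assumes the gate-epilayer junction at $a_2 = 0$, an epilayer thickness $a_4 - a_2$ of 5 $\mu$m, and a relative permittivity of silicon of 11.7. The function names are hypothetical. Of the two roots of the quadratic in $a_3$, the one satisfying $a_2 < a_3 < a_4$ is used.

```python
import math

Q_E = 1.602176634e-19          # elementary charge (C)
EPS = 11.7 * 8.8541878128e-12  # assumed permittivity of silicon (F/m)

# Doping levels quoted in section 3.3.2 (m^-3)
NG, NE, NS = 1e25, 7e20, 2e20
# Assumed geometry: gate-epilayer junction at a2 = 0, epilayer 5 um thick
A2, A4 = 0.0, 5e-6


def solve_boundaries(vg, ng=NG, ne=NE, ns=NS, a2=A2, a4=A4):
    """Return (a1, a3, a5) for an empty well at gate voltage vg."""
    a = ne / ng + 1.0
    b = ne / ns + 1.0
    if math.isclose(a, b):
        # Symmetric case A = B
        a3 = 0.5 * (a2 + a4) - EPS * vg / (Q_E * ne * a * (a4 - a2))
    else:
        d = a * b * (a4 - a2) ** 2 + 2.0 * EPS * (ng - ns) * vg / (Q_E * ng * ns)
        # Root of the quadratic for which a2 < a3 < a4
        a3 = (-(b * a4 - a * a2) + math.sqrt(d)) / (a - b)
    a1 = a2 - ne / ng * (a3 - a2)   # field continuity at a2
    a5 = a4 + ne / ns * (a4 - a3)   # field continuity at a4
    return a1, a3, a5


def channel_potential(vg, ng=NG, ne=NE, ns=NS, a2=A2, a4=A4):
    """Vch(Vg,0): the potential maximum V_II(a3) = K of an empty well."""
    a1, a3, _ = solve_boundaries(vg, ng, ne, ns, a2, a4)
    return Q_E / (2.0 * EPS) * (ng * (a2 - a1) ** 2 + ne * (a3 - a2) ** 2) + vg


if __name__ == "__main__":
    for vg in (0.0, 2.0, 5.0, 8.0):
        print(f"Vg = {vg:4.1f} V  ->  Vch(Vg,0) = {channel_potential(vg):.2f} V")
```

Evaluating the potential from the substrate side, $\frac{q}{2\epsilon}[Ns(a_5-a_4)^2 + Ne(a_4-a_3)^2]$, should give the same value as `channel_potential`, which is a convenient consistency check on the chosen root.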

To obtain the equations for the amount of signal charge Qs that can be transported in the JCCD, when the gate-epilayer junction is reverse-biased, we consider two separate diode structures that are coupled by (see Fig. 3.12):

$$V(a_{3l}) = Vch(Vg,Qs) = V(a_{3r}) \qquad (3.29)$$

Instead of equations (3.21), (3.22), and (3.23) we obtain:

$$(a_2 - a_1)Ng = (a_{3l} - a_2)Ne \qquad (3.30)$$

$$(a_5 - a_4)Ns = (a_4 - a_{3r})Ne \qquad (3.31)$$

$$Ng(a_2-a_1)^2 + Ne(a_{3l}-a_2)^2 + \frac{2\epsilon}{q}Vg = \frac{2\epsilon}{q}Vch(Vg,Qs) \qquad (3.32)$$

$$Ns(a_5-a_4)^2 + Ne(a_4-a_{3r})^2 = \frac{2\epsilon}{q}Vch(Vg,Qs) \qquad (3.33)$$

Fig. 3.12 The potential distribution in presence of signal charge. The channel potential equals the maximal potential and depends on Vg and Qs.

From these four equations we obtain, under the same conditions as in the previous case:

$$a_{3l} = a_2 + \sqrt{\frac{2\epsilon \left[Vch(Vg,Qs) - Vg\right]}{q\,Ne\,A}} \qquad (3.34)$$

$$a_{3r} = a_4 - \sqrt{\frac{2\epsilon\, Vch(Vg,Qs)}{q\,Ne\,B}} \qquad (3.35)$$

The signal charge Qs is given by:

$$Qs = -qNe \ (a_{3r} - a_{3l}) \qquad (3.36)$$

which equals:

$$Qs = -\left[ qNe (a_4 - a_2) - \left\{ \sqrt{\frac{Vch(Vg,Qs) - Vg}{p}} + \sqrt{\frac{Vch(Vg,Qs)}{r}} \right\} \right] \qquad (3.37)$$

with

$$p = \frac{A}{2\epsilon qNe}, \qquad r = \frac{B}{2\epsilon qNe}$$

If Vb is the built-in voltage of the gate-epilayer junction:

$$Vb = \frac{kT}{q} \left( \ln \frac{NgNe}{ni^2} \right)$$
 (3.38)

then equation (3.37) holds under the following conditions:

- 1) for 0 ≤ Vg ≤ Vch(0,0)-Vb: Vch(0,0) ≤ Vch(Vg,Qs) ≤ Vch(Vg,0); this condition expresses the fact that if charge 'over-fills' the potential well it flows to the neighboring gates,
- 2) for Vg > Vch(0,0)-Vb: Vg+Vb ≤ Vch(Vg,Qs) ≤ Vch(Vg,0), which reflects the fact that the gate-epilayer junction must be reverse-biased.

The maximum signal charge, Qs,max, is obtained in the first case if Vch(Vg,Qs) = Vch(0,0), and in the second case if Vch(Vg,Qs) = Vg+Vb.
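The two regimes for the maximum signal charge can be sketched numerically as follows. This is an illustration under stated assumptions, not the calculation behind Fig. 3.10: the epilayer thickness $a_4 - a_2$ is taken as 5 $\mu$m, the relative permittivity of silicon as 11.7, the intrinsic concentration as $1.45 \times 10^{16}$ m$^{-3}$ at 300 K, and the gate-side diode is assumed to carry the potential drop Vch - Vg while the substrate-side diode carries Vch. All function names are hypothetical. The result is returned as stored electrons per unit area, that is -Qs/q.

```python
import math

Q_E = 1.602176634e-19          # elementary charge (C)
K_B = 1.380649e-23             # Boltzmann constant (J/K)
EPS = 11.7 * 8.8541878128e-12  # assumed permittivity of silicon (F/m)
NG, NE, NS = 1e25, 7e20, 2e20  # doping levels of section 3.3.2 (m^-3)
D_EPI = 5e-6                   # assumed a4 - a2 (m)
N_I = 1.45e16                  # assumed intrinsic concentration at 300 K (m^-3)

A = NE / NG + 1.0
B = NE / NS + 1.0
P = A / (2.0 * EPS * Q_E * NE)
R = B / (2.0 * EPS * Q_E * NE)


def built_in_voltage(temp=300.0):
    """Vb of the gate-epilayer junction, equation (3.38)."""
    return K_B * temp / Q_E * math.log(NG * NE / N_I ** 2)


def vch_grounded_empty():
    """Vch(0,0), obtained from (3.21)-(3.23) with Vg = 0."""
    return Q_E * NE * D_EPI ** 2 * A * B / (2.0 * EPS * (math.sqrt(A) + math.sqrt(B)) ** 2)


def stored_electrons(vch, vg):
    """Stored electrons per unit gate area (-Qs/q) for channel potential vch."""
    charge = Q_E * NE * D_EPI - math.sqrt((vch - vg) / P) - math.sqrt(vch / R)
    return charge / Q_E


def max_stored_electrons(vg):
    """Charge handling capability at gate voltage vg (electrons/m^2)."""
    vb = built_in_voltage()
    vch00 = vch_grounded_empty()
    if vg <= vch00 - vb:
        vch = vch00       # regime 1: more charge would spill to neighbouring gates
    else:
        vch = vg + vb     # regime 2: junction must stay reverse-biased
    return stored_electrons(vch, vg)


if __name__ == "__main__":
    for vg in (0.0, 2.0, 4.0, 6.0, 8.0, 10.0):
        print(f"Vg = {vg:4.1f} V  ->  Qs,max = {max_stored_electrons(vg):.3e} electrons/m^2")
```

In this sketch the curve rises with Vg in the first regime and falls in the second, so its maximum lies at Vg = Vch(0,0) - Vb, as stated for the CHC curve.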

#### 3.5 Vertical charge transport

#### 3.5.1 Vertical charge transport I, principles

Junction charge-coupled logic makes use of charge transport through JCCD channels, as well as through junctions induced by surplus charge. Horizontal charge transport, in the JCCD channel, is controlled by the clock voltages through reverse-biased pn-junctions.

We consider the combination of two packets of electrons under one gate, when the potential on the receiving gate is raised. In this case the surplus charge, if both wells are filled, is just one packet. This surplus charge can be used for vertical charge transport. Two structures for vertical charge transport are distinguished:

- i) the pnp-transistor, which is formed by the substrate, the epilayer, and the gate, and which becomes active when the gate voltage is taken above the channel potential and surplus electrons are present underneath the gate;
- ii) the injector structure, which consists of an n-diffusion placed in a p-type gate. By forward-biasing this np-junction, charge can be injected into the CCD channel, which acts as the collector of this vertical npn-transistor. This type of injector has been used successfully in filters and as input structures for multiplexers [3.4],[3.5].

Fig. 3.13 Potential profile, the dashed line representing the curve at maximal filling if the gate potential is above Vch(0,0).

The substrate pnp-transistor opens up the way to vertical charge transport out of the JCCD, or charge 'overflow'. We consider the case that the gate voltage is well above Vch(0,0)-Vb, which is the case in logic applications with JCCDs. Figure 3.13 shows the potential profiles, the dashed line representing the curve at maximal 'filling'. If surplus charge is present in this situation it cannot spread laterally; instead it will forward-bias the gate-epilayer junction. The JCCD structure will act as a pnp-transistor. The gate (emitter) will inject holes into the epilayer. This charge flow will continue until all surplus electrons have been removed.

The behavior of this vertical overflow is strongly related to the JCCD properties, since for vertical charge flow the substrate pnp-transistor is biased by surplus charge transferred through the JCCD. We treat the surplus charge storage first. The processes involved in vertical charge transport and storage of surplus charge are extensively discussed by van der Klauw [3.4].

# 3.5.2 Charge storage II, storage in the pnp-transistor

The amount of signal charge that can be stored in a potential well has been determined in section 3.3.2. All this charge is transportable, if transfer inefficiency is not considered. If we look at charge overflow when one charge packet or several charge packets are transferred into a potential well that can only contain less than the total amount of charge supplied, while the clock voltage is over Vch(0,0)-Vb, two additional charge storage mechanisms can be distinguished.

Under the influence of the externally induced field, electrons will drift towards the potential well of the receiving gate. After the potential well is filled, surplus electrons will start to decrease the potential barrier at the gate-epilayer junction. This puts the junction under forward bias and initiates injection of minority carriers at the junction. Charge flow will continue until all surplus electrons have been removed from the epilayer and the potential difference across the junction is again Vb, or until the surplus charge is 'dumped' in a drain. If a drain is not used, this process will take considerably more time than the common clock periods for JCCDs. We have two additional storage mechanisms:

- i) charge storage associated with the changing depletion layers at the gate and substrate side of the epilayer,
- ii) the storage of electrons to maintain charge neutrality if holes traverse from gate to substrate.





The amount of surplus electrons that can be stored in the changing depletion layers can, for example, easily be estimated using the simple model of section 3.3.2: a forward bias of 300 mV results in an increase in stored charge of 15 %. In real devices the increase in storage capacity is measured to be about 20 % [3.4].
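An estimate of this kind can be sketched as follows. The sketch assumes that a forward bias simply lowers the channel potential below Vg + Vb by the bias voltage in the charge expression of section 3.4, and it assumes a gate voltage of 8 V (the clock voltage used in section 3.6.1), an epilayer thickness $a_4 - a_2$ of 5 $\mu$m, a relative permittivity of 11.7, and an intrinsic concentration of $1.45 \times 10^{16}$ m$^{-3}$. None of these choices is stated in the text for this estimate, and the function names are hypothetical.

```python
import math

Q_E = 1.602176634e-19          # elementary charge (C)
K_B = 1.380649e-23             # Boltzmann constant (J/K)
EPS = 11.7 * 8.8541878128e-12  # assumed permittivity of silicon (F/m)
NG, NE, NS = 1e25, 7e20, 2e20  # doping levels of section 3.3.2 (m^-3)
D_EPI = 5e-6                   # assumed a4 - a2 (m)
N_I = 1.45e16                  # assumed intrinsic concentration at 300 K (m^-3)

A = NE / NG + 1.0
B = NE / NS + 1.0
VB = K_B * 300.0 / Q_E * math.log(NG * NE / N_I ** 2)  # built-in voltage, (3.38)


def stored_charge(vch, vg):
    """Magnitude of the stored charge (C/m^2) at channel potential vch."""
    k = 2.0 * EPS * Q_E * NE
    return Q_E * NE * D_EPI - math.sqrt(k * (vch - vg) / A) - math.sqrt(k * vch / B)


def storage_increase(vg, forward_bias):
    """Relative increase in stored charge when the junction is forward-biased."""
    full = stored_charge(vg + VB, vg)                    # junction at zero bias
    biased = stored_charge(vg + VB - forward_bias, vg)   # junction forward-biased
    return biased / full - 1.0


if __name__ == "__main__":
    print(f"increase at 300 mV forward bias: {100.0 * storage_increase(8.0, 0.3):.1f} %")
```

With these assumed parameters the increase should come out in the neighbourhood of the 15 % quoted above; a different gate voltage or geometry shifts the figure.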

There are two time-constants involved in these storage mechanisms. First, the time-constant $\tau_{\rm d}$ indicates the time necessary to decrease the depletion layers until a significant transport current starts. Thereafter, the surplus electron charge that is present in the base region can vanish by injection into the emitter or by recombination with traversing holes. The recombination can be neglected in the present technology [3.4b]. The second time-constant $\tau_{\rm ct}$ appears in the formula

$$I_{ct} = I_{ct,max} \exp \left(-t/\tau_{ct}\right)$$
(3.39)

in which $I_{ct}$ is the transport current measured at the collector terminal. This experimental curve expresses the decay of the transport current, and thus of surplus charge. Figure 3.14 shows a realistic case [3.4c]. It clearly shows that the excess charge may be transferred through the JCCD if a substrate current is tolerated.

The only way to calculate $\tau_d$ and $\tau_{ct}$ is by means of a two-dimensional and transient program which simultaneously solves Poisson's equation and the continuity equations for discrete time steps. However, for a given device they can be determined experimentally.
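Once $\tau_{ct}$ has been measured, the decay law (3.39) gives a quick estimate of how long the surplus charge needs to leave the well. The sketch below assumes a purely exponential decay starting at t = 0 (the delay $\tau_d$ is ignored) and that all surplus charge leaves through this transport current; the function names and the example time-constant are hypothetical.

```python
import math


def transport_current(t, i_max, tau_ct):
    """Transport current at time t, equation (3.39)."""
    return i_max * math.exp(-t / tau_ct)


def remaining_fraction(t, tau_ct):
    """Fraction of the surplus charge still present at time t.

    This is the integral of (3.39) from t to infinity divided by the
    integral from 0 to infinity.
    """
    return math.exp(-t / tau_ct)


def time_to_fraction(fraction, tau_ct):
    """Time after which only `fraction` of the surplus charge remains."""
    return -tau_ct * math.log(fraction)


if __name__ == "__main__":
    tau = 1e-6  # hypothetical time-constant (s), for illustration only
    t99 = time_to_fraction(0.01, tau)
    print(f"99 % of the surplus charge removed after {t99 / tau:.2f} time-constants")
```

The required time scales linearly with $\tau_{ct}$, so the clock period has to be several time-constants long if the overflow is to complete within one cycle.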

# 3.6 More realistic models

#### 3.6.1 Charge coupling II, more realistic model

Broadly outlined, the process of charge coupling can be formulated with the JCCD model of section 3.3.2. The physics of the transport of one packet is rather simple: the electrons move towards the point at the highest potential until an equilibrium is established. In equilibrium the channel potential Vch(Vg,Qs) under the transporting gate equals the channel potential under the receiving gate. We consider the case that the clock voltage is 8 V.

Figure 3.15 shows a schematic representation of the charge transport mechanism. Figure 3.15 (a) indicates that the gate voltage on A is maximal, Vg,max (8 V), while gate B is grounded. The well under gate A is maximally filled, as suggested by the shaded region. The transport of charge starts as soon as Vch(Vg,max;Qs,max) < Vch(Vg,b;0), where Vg,b indicates a value of the gate voltage on gate B during the rising edge of the clock pulse. The moment at which the channel potential under gate A, when filled with Qs,max, equals the channel potential Vch(Vg,b;0) under gate B is drawn in Fig. 3.15 (b). In the model this occurs at Vg,b = 3.8 V. In the next step both gates are at 8 V. In Fig. 3.15 (d) we see that if the clock phase on A drops, the charge transport is completed if Vch(Vg,a;0) = Vch(Vg,max;Qs,max), thus if Vg,a = 3.8 V. In the model used it is assumed that the charge is transported instantaneously.
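The gate voltage at which transfer starts can be estimated from the one-dimensional model by requiring that the empty well under gate B reaches the channel potential Vg,max + Vb of the full well under gate A. The sketch below is a rough illustration under assumed parameters (epilayer thickness $a_4 - a_2$ of 5 $\mu$m, relative permittivity 11.7, intrinsic concentration $1.45 \times 10^{16}$ m$^{-3}$); the function names are hypothetical, and exact agreement with the 3.8 V quoted above is not expected because the parameters of the original calculation are not given.

```python
import math

Q_E = 1.602176634e-19          # elementary charge (C)
K_B = 1.380649e-23             # Boltzmann constant (J/K)
EPS = 11.7 * 8.8541878128e-12  # assumed permittivity of silicon (F/m)
NG, NE, NS = 1e25, 7e20, 2e20  # doping levels of section 3.3.2 (m^-3)
D_EPI = 5e-6                   # assumed a4 - a2 (m)
N_I = 1.45e16                  # assumed intrinsic concentration at 300 K (m^-3)

A = NE / NG + 1.0
B = NE / NS + 1.0
K2 = 2.0 * EPS * Q_E * NE
VB = K_B * 300.0 / Q_E * math.log(NG * NE / N_I ** 2)  # built-in voltage, (3.38)


def stored_charge(vch, vg):
    """Magnitude of the stored charge (C/m^2) at channel potential vch."""
    return Q_E * NE * D_EPI - math.sqrt(K2 * (vch - vg) / A) - math.sqrt(K2 * vch / B)


def gate_voltage_for_empty_well(vch):
    """Gate voltage at which an empty well has channel potential vch.

    Obtained by setting the stored charge to zero and solving for Vg.
    """
    gate_side = Q_E * NE * D_EPI - math.sqrt(K2 * vch / B)
    return vch - A * gate_side ** 2 / K2


def transfer_onset_voltage(vg_max=8.0):
    """Vg,b at which charge starts to flow from the full well under gate A."""
    return gate_voltage_for_empty_well(vg_max + VB)


if __name__ == "__main__":
    print(f"transfer starts at Vg,b = {transfer_onset_voltage():.2f} V")
```

By symmetry the same voltage marks the end of the transfer on the falling edge of the clock phase on gate A.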

The charge transport can be described by a path in the plane spanned by $Q_{smax}$ and Vg. The physical region in the $Q_{smax}$, Vg variables is shaded in Fig. 3.16. It is obvious that even in practical devices the charge transport does not follow the charge handling capability curve ($Q_{smax}$ curve), as is suggested by van der Klauw [3.4d].

Fig. 3.15 Schematic representation of the charge transport mechanism. (a) represents a maximally filled potential well at a given voltage on the gate Vg. In (c) the signal charge is spread over the two gates which have the same clock voltage. In the intermediate state (b) it is not possible to indicate the signal charge and the gate voltage together in one picture because the liquid model cannot be used in this case. The left part indicates the distribution of the gate voltages and the right part the distribution of the signal charge. In (e) the transport of charge is completed.

Fig. 3.16 The charge transport describes a path in the plane spanned by $Q_{smax}$ and Vg. The physically allowed region in this plane is shaded.

Another way to look at this process is to consider the path that is described in the Vch(Vg,Qs) and Qs variables. Figure 3.17 shows this path. The path from a to e is the "charge line" of the charge under gate A. From a to c Vg is not changed but half the charge has drifted towards gate B. The curve from c to d is obtained using the relations:






$$Qs,max(Vg=Vg,max) = Qs_{A} + Qs_{B}$$
(3.40)

and

$$Vch(Qs_A, Vg_A) = Vch(Qs_B, Vg_{max})$$
 (3.41)

The "charge line" of the charge under gate B is exactly the reverse path.

We notice that the first half of the charge is transported from the well under A to that under B during the last part of the rising edge of the clock phase on gate B, and that the second half of the charge is transported at the beginning of the voltage drop on gate A.
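The section of the charge line between c and d can be traced numerically by solving (3.40) and (3.41) together: for each gate voltage on A, a common channel potential is sought at which the charges under A and B add up to Qs,max. The sketch below uses bisection and the one-dimensional charge expression of section 3.4 under assumed parameters (epilayer thickness $a_4 - a_2$ of 5 $\mu$m, relative permittivity 11.7, intrinsic concentration $1.45 \times 10^{16}$ m$^{-3}$). The function names are hypothetical and the numbers are illustrative only.

```python
import math

Q_E = 1.602176634e-19          # elementary charge (C)
K_B = 1.380649e-23             # Boltzmann constant (J/K)
EPS = 11.7 * 8.8541878128e-12  # assumed permittivity of silicon (F/m)
NG, NE, NS = 1e25, 7e20, 2e20  # doping levels of section 3.3.2 (m^-3)
D_EPI = 5e-6                   # assumed a4 - a2 (m)
N_I = 1.45e16                  # assumed intrinsic concentration at 300 K (m^-3)

A = NE / NG + 1.0
B = NE / NS + 1.0
K2 = 2.0 * EPS * Q_E * NE
VB = K_B * 300.0 / Q_E * math.log(NG * NE / N_I ** 2)  # built-in voltage, (3.38)


def stored_charge(vch, vg):
    """Magnitude of the stored charge (C/m^2) at channel potential vch."""
    return Q_E * NE * D_EPI - math.sqrt(K2 * (vch - vg) / A) - math.sqrt(K2 * vch / B)


def bisect_decreasing(f, lo, hi, iterations=100):
    """Root of a decreasing function f on [lo, hi] with f(lo) >= 0 >= f(hi)."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if f(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def vch_empty(vg):
    """Vch(Vg,0): channel potential at which the well under vg holds no charge."""
    return bisect_decreasing(lambda v: stored_charge(v, vg), vg, vg + 50.0)


def share_charge(vg_a, vg_max=8.0):
    """Return (Vch, Qs_A, Qs_B) satisfying (3.40) and (3.41)."""
    q_total = stored_charge(vg_max + VB, vg_max)
    lo = vg_max + VB          # gate B junction must stay reverse-biased
    hi = vch_empty(vg_a)      # above this potential the well under A is empty
    if hi <= lo:
        return lo, 0.0, q_total   # transfer completed: all charge under B
    vch = bisect_decreasing(
        lambda v: stored_charge(v, vg_a) + stored_charge(v, vg_max) - q_total, lo, hi
    )
    return vch, stored_charge(vch, vg_a), stored_charge(vch, vg_max)


if __name__ == "__main__":
    for vg_a in (8.0, 7.0, 6.0, 5.0, 4.0, 3.0):
        vch, qa, qb = share_charge(vg_a)
        print(f"Vg,A = {vg_a:.1f} V  Vch = {vch:.2f} V  Qs_A/Qs,max = {qa / (qa + qb):.2f}")
```

At Vg,A = Vg,max the charge is split evenly, and as Vg,A drops the share under gate A shrinks until the well under A is empty.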

# 3.6.2 Vertical charge transport II, modeling

Fig. 3.18 Schematic representation of the combination of two completely filled potential wells. (a) initial situation, (b) charge is divided over the gates at the overlap of the clock pulses. In (c) the situation is drawn in which the well in the middle is filled maximally. The gate voltages in this case are shown in the left part. From this time slot vertical charge transport through the pnp-transistor is initiated.

As said before, the process of vertical charge transport can only be described by means of a two-dimensional and transient computer program. However, we can formulate this transport qualitatively using the simple model. Figure 3.18 shows the combination of two completely filled potential wells. The time slot when all gates are at gate voltage Vg,max is depicted in Fig. 3.18 (b). The common well is filled to 2/3 of Qs,max at the voltage Vg,max. In Fig. 3.19 the "charge line" of gate B goes through the points a' to c'; that of gates A and C from a to c. The transport from the wells under A and C to B starts at Vg(B) = 3.8 V, as indicated in Fig. 3.19. The condition that gate B is filled maximally and the channel potential under gate A and C equals the channel potential



Fig. 3.20 Schematic representation of backward charge flow, indicated by the curved arrow.

the value at the beginning of the next pulse [3.4e].

The question is, of course, whether backward charge flow rules out the possibility of having logic devices which obtain the function in more than one clock cycle, in other than the present technology. Is backward charge flow an intrinsic mechanism?

Certainly, the possible values for the gate voltage are limited. For example, in the 7 $\mu$m epi devices the channel potential Vch(0,0) is approximately 9 V, while the reach-through between the gates starts at voltages near 10 V [3.4f]. And in the 5 $\mu$m epi series we have to deal with large parasitic potential wells that reach potentials of 1-1.5 V above Vch(0,0). These conditions probably account for the existence of backward charge flow in the present technology. Unfortunately, these limitations are a consequence of a technology that is most favorable not for digital but for analogue applications. In principle, the problem should easily be solved, except for very high frequencies.

Fig. 3.21 Schematic representation of the equipotential lines in the simplified JCCD model.

# 3.7 Parasitic wells, and lateral confinement

To a large extent the quality of a JCCD is determined by the smoothness of the channel potential profile in the transport direction. Distortions in the potential profile act as parasitic wells for electrons.

Figure 3.21 shows a schematic view of the equipotential lines in a JCCD with doping profiles as described in the simple model and with a gate-to-gate distance of 5 $\mu$m. A large parasitic potential well can be observed between gates A and B. Even when gate C is clocked to 5 V, it will not completely disappear, as can be seen from the equipotential lines around gate C. Consider another case, in which the well underneath gate C is filled with electrons in such a way that Vch(Vg,Qs) equals 10 V (of course the gate voltage is in this case above 5 V). If the clock voltage then drops, electrons will be trapped in the parasitic well which remains. These trapped electrons will not be transported, thus a large transfer inefficiency can be expected. To eliminate the parasitic potentials between the gates, two implantations are performed: an overall surface implantation, which makes it possible to decrease the gate-to-gate distance (increase of reach-through voltage), and a phosphorus channel implantation through the mask for the gates, which locally increases the impurity concentration.

Fig. 3.22 The presence of parasitic wells and parasitic channels at the crossings of the lateral gap and the inter-gate gaps.

The lateral confinement of JCCD channels, in logic applications, is formed by the same p-type diffusion as used for the gates. The JCCD gates are in this way embedded in a large gate at ground potential. The epilayer underneath will then be fully depleted by the operation of adjacent CCDs.

We consider the situation as depicted schematically in Fig. 3.22. Due to the variation of distance between the p-type gates and the p-type confinement, parasitic potential wells will appear at the crossings of the lateral gap and the inter-gate gap. If the transport channel makes an angle of 45°, a parasitic channel results, which can transport charge during the overlap of the clock phases of the transporting and receiving gate.

#### 3.8 Current technology of JCCL

The current fabrication process of junction charge-coupled devices, in an adapted standard process for fabricating bipolar circuitry, is outlined.

The ideal JCCD technology meets the following requirements:

- a) Smooth potential distribution: there are no distortions in the potential of the channel when all gates are at ground potential and the epilayer is fully depleted.
- b) No parasitic barriers: the formation of parasitic barriers underneath the gates must be avoided, because they are hardly affected by voltage differences applied between the gates [3.3].
- c) Only small parasitic wells: in a technology using ptype islands for lateral confinement, parasitic wells will be present. Their size should be as small as possible.
- d) High reach-through voltage: the terminal voltage difference between the p-regions at which reach-through occurs should be well above Vch(0,0).
- e) Optimal properties of pnp-substrate transistor: the maximum value of the collector transport current of the substrate transistor should be as high as possible, and the decay time as short as possible.
- f) Compatible with good bipolar circuitry: use a technology in which peripheral circuitry, such as clock drivers, can be integrated.

Of course, if some requirements cannot be simultaneously fulfilled, a compromise must be made. Apart from this, the fabrication process, which has been used, has mainly been developed for analog JCCD applications, and consequently it does not have the best features for digital applications.

The main processing steps in the current fabrication process are listed in Table 3.1. The resulting impurity concentration profile through a p-type gate, which was calculated with the technological simulation program SUPREM II [3.8], is shown in Fig. 3.23.

| PROCESS STEP |  | DETAILS |
|---|---|---|
| p-substrate | resistivity | 50-80 Ωcm |
| n-epitaxial layer | resistivity | 8 Ωcm |
|  | thickness | 5 μm |
| p-isolation (DP) | sheet res. | 310 Ω/□ |
| phosphorus surf. implant |  | $3 \times 10^{\ldots}$ cm$^{-2}$, 30 keV |
| annealing |  | 1200 °C, 60 min. |
| phosphorus channel implant |  | $8 \times 10^{12}$ cm$^{-2}$, 30 keV |
| annealing |  | 1135 °C, 85 min. |
| p-base diffusion (SP) | sheet res. | 195-235 Ω/□ |
| gate | depth | 0.95 μm |
| emitter p… |  | 0.4 μm |
| n-emitter diffusion (source/drain) (S…) | sheet res. | 10-15 Ω/□ |
|  | … | 1 μm |
| contact holes 1st layer (CO) | min. dim. | 4x4 μm<sup>2</sup> |
| interconnection 1st layer (IN) | min. … | 3 μm |
|  | spacing (min) | 3 μm |
| anodisation |  | 2 |
| contact holes 2nd layer (CO2) | min. dim. | 5x5 μm<sup>2</sup> |
| interconnect 2nd layer (IN) | min. … | 12 μm |
|  | s… | 8 μm |
| scratch protection (CB) |  |  |



real JCCD.

The JCCD structure was analyzed with equipotential plots made with SEMMY2 [3.9]. SEMMY2 is a program for the analysis of two-dimensional electrostatic (zero current) semiconductor problems. It solves Poisson's equation using Boltzmann statistics for the charge density. The current densities are assumed to be zero and consequently the quasi-Fermi levels to be constant throughout the device. The terminal voltages are converted into electrical potentials using built-in voltage contributions. The assumption of constant quasi-Fermi levels causes a contradiction if different voltages are applied, as in the JCCD structure in Fig. 3.24. Then in the n-type epilayer different quasi-Fermi levels for the holes, corresponding to the applied voltages on the p-type gates, are defined. This is overcome by assuming that the minority carrier concentration is negligible. The different quasi-Fermi levels for holes in p-gate and p-type substrate are coupled by a curved part in the n-type epilayer. This does not violate the no-current condition because the minority carrier concentration is very small. The potential distribution in





Fig. 3.24 (a) The equipotential lines in the JCCD if no signal charge is present; (b) The equipotential lines in the JCCD if the maximal amount of signal charge is present underneath gate B.



Fig. 3.25 The one-dimensional potential distributions in the JCCD extracted from the figures 3.24 (a),(b).

the presence of charge, Fig. 3.24(b), is obtained using the condition that the voltage over the gate-epilayer junction is just the built-in voltage, and so represents Qs,max.

Figure 3.24(a) shows the three gates of a JCCD cell. Gate A appears partly on the upper left of the plot. Gate C appears partly on the upper right of the plot. Gates A and B and the substrate are held at ground potential, while the epilayer is fully depleted. The gate voltage on gate C is 6 V, which is slightly more than the JCCD channel potential, at Vg=0 and Qs=0, minus the built-in voltage.

A parasitic potential well can be observed between gates A and B. This well does not completely disappear when gate C is clocked to 6 V. Electrons within this well are not transported and thus contribute to the transfer ineffi ciency.

The value of the potential maximum at the silicon-silicon dioxide interface, Vs, between two gates at ground potential is approx. 5.5 V, the value of this potential between gate B and C is approx. 9 V. When a positive voltage is applied to a gate, the potential maximum Vs increases and its position is shifted towards the positive gate. If no avalanche breakdown occurs, the maximum allowable voltage applied to gate C is Vs minus the built-in voltage (reachthrough condition). For higher voltages this junction is forward biased and injection of holes is started. Although experiments show maximum voltages slightly over Vg=Vch(0,0)-Vb, before reach-through between the gates occurs, the plot does not show this effect.

From Fig. 3.24 the potential profile along a line perpendicular to the gate surface through the middle of the gate can be obtained. This is depicted in Fig. 3.25. This figure makes it possible to compare this real JCCD with the simplified model. Clearly, the shape of the curve is qualitatively predicted by the simple model. Because of the phosphorus channel implant the position of the potential maximum is closer to the gate epilayer junction.

#### 3.9 Operation of basic JCCL structures

#### 3.9.1 Structure of the AND/OR function

As in all CCD logic, the AND/OR function is the basic configuration. The OR function is the easiest to implement. If two unit charge packets are transferred into a unit-area potential well, the resulting charge in this well represents the OR function of the input packets. The normalization of this charge packet can be used to obtain the AND function. In DCCL this normalization is performed laterally by spilling the surplus charge over a potential barrier which is controlled by a DC voltage. In JCCL the normalization is performed vertically through the gate. The AND function explicitly uses the overflow signal that results from bringing together two charge packets in a potential well that can contain only one charge packet. The total overflow current is $\beta$ times the surplus charge, where $\beta$ is the amplification of the pnp-transistor. The layout and a cross-section of the AND/OR function are given in Figs. 3.26(a) and 3.26(b). The configuration essentially consists of two JCCD channels, one giving the OR of the two inputs and the other giving the AND function. The charge packets X and Y are present in the input/OR channel. The clock phases are indicated in the upper right corners of the gates. The arrow and '$\geq 2$' denote an overflow current if two or more charge packets are present. This overflow current is injected in another channel, forming the result of the AND function.

Fig. 3.26 (a) Schematic topview of the AND/OR function; (b) Cross-section of the AND/OR function.

In the logic AND channel, the injector current Iin must result in a full charge packet within the interval bounded by the onset of overflow somewhere during the falling edge of the previous clock phase, $t_1$, and its termination somewhere during the rising edge of the following clock phase, $t_2$. The clock phase of the injector during this interval is high, as Fig. 3.27 illustrates.

Fig. 3.27 The clock waveforms at charge injection. Charge injection by the injector gate that is driven by $\phi_2$ is initiated at time $t_1$ and is terminated at time $t_2$.

An expression for the value of the resistor R was obtained by van der Klauw [3.4]. The overflow current, $I_0$, is taken constant over the interval $t_2 - t_1$, as can be assumed if $t_2 - t_1$ is small compared with the decay time of the pnp-transport current. The injector structure is seen as a diode parallel to a capacitor C. The diode represents the injector gate in which the base current is negligible, which implies that all the injector current flows into the JCCD channel. The capacitor C represents the injector capacitance including wiring. This capacitor introduces a short delay, typically about 1 ns, in the injection process since it has to be charged to the voltage level at which the diode takes over the current $I_0$, if the value of the resistor R is high enough. After the capacitor has been sufficiently charged the current $I_0$ flows through the injector and the resistor R only. The injector is assumed to follow the ideal relation:

$$I_{in} = I_s \exp(qV_g/kT)$$
 (3.46)

in which Is is the saturation current of the injector, and Vg is the applied voltage. The injector current has to create a charge packet Qs,max(Vg) in the remaining time  $\Delta t$ . The required voltage is also the voltage across R and so [3.4]:

$$I_0 = \frac{Q}{\Delta t} + \frac{kT}{qR} \ln \left( \frac{Q}{\Delta t} \cdot \frac{1}{I_s} \right)$$
(3.47)

The minimum value for R then equals:

$$R = \frac{kT}{q} \ln \left( \frac{Q}{\Delta t} \cdot \frac{1}{I_s} \right) / \left( I_0 - \frac{Q}{\Delta t} \right)$$
(3.48)

At room temperature, the saturation current of an injector gate with a 10x10  $\mu$ m<sup>2</sup> emitter is about 65 fA; using Qmax(7V) = 0.08 pC, I<sub>0</sub> = 0.12 mA (experimental value),  $\Delta$ t = 15 ns (for a device operating at 10 MHz), a minimum resistor value of 4.1 k $\Omega$  is required.
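The numerical example above can be checked with a short calculation. The following is a minimal sketch of Eq. (3.48); the function name `min_load_resistor` and the choice of 300 K for "room temperature" are assumptions made for illustration.

```python
import math

# Physical constants (SI units)
K_BOLTZMANN = 1.380649e-23     # J/K
Q_ELECTRON = 1.602176634e-19   # C


def min_load_resistor(q_packet, dt, i_overflow, i_sat, temperature=300.0):
    """Minimum load resistor R according to Eq. (3.48), in ohms.

    q_packet    : charge packet to be injected, Q (C)
    dt          : time available for injection, delta t (s)
    i_overflow  : overflow current I_0 (A)
    i_sat       : injector saturation current I_s (A)
    temperature : absolute temperature (K); 300 K is an assumed value
    """
    i_inject = q_packet / dt                    # required injector current Q/dt
    if i_overflow <= i_inject:
        raise ValueError("I_0 must exceed Q/dt to obtain a positive R")
    v_thermal = K_BOLTZMANN * temperature / Q_ELECTRON    # kT/q
    v_injector = v_thermal * math.log(i_inject / i_sat)   # voltage across R
    return v_injector / (i_overflow - i_inject)


# Values quoted in the text: Q = 0.08 pC, dt = 15 ns, I_0 = 0.12 mA, I_s = 65 fA
r_min = min_load_resistor(q_packet=0.08e-12, dt=15e-9,
                          i_overflow=0.12e-3, i_sat=65e-15)
print(f"R_min = {r_min / 1e3:.2f} kOhm")
```

With the quoted values the result lies close to the 4.1 kΩ stated in the text.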

As long as additional capacitive loads from other injectors or wires do not increase the delay too much, it is possible to implement wired-OR functions in JCCL. Figure 3.28 shows an implementation of the function F = PQ + XY using a wired-OR operation [3.4].

Fig. 3.28 Schematic layout of the function F=P·Q+X·Y, using a wired-OR operation.

Fig. 3.29 Schematic layout of a complex inverter calculating $(X \cdot Y) \cdot \overline{(P \cdot Q)}$.

#### 3.9.2 Balanced injector structure

To obtain a complete description of logic devices we have to define a complementation operator. For this purpose the so-called balanced injector structure is defined.

The balanced injector structure consists of an injector gate to which two overflow structures are connected, one to the emitter of the n+pn-transistor and one to the base of this transistor. If an overflow current is offered to the base, it is converted into a voltage across the resistor and consequently switches off the process of injection if an overflow current is offered to the emitter. The possibility of injecting a charge packet now depends on both overflow currents.

An example of the use of the balanced injector structure is shown in Fig. 3.29. In this case, two identical load resistors R are inserted in the emitter and base connections with the clock. The voltage drops across these resistors will be identical if all inputs are a logical "1". The net voltage drop across the injector junction is therefore zero and no charge is injected. The output charge packet represents $(X \cdot Y) \cdot \overline{(P \cdot Q)}$.

#### 3.10 JCCL characteristics

The JCCL characteristics were measured and extensively discussed by van der Klauw [3.4]. In the following these results are summarized.

First the transfer inefficiency is discussed. The physical mechanisms that are responsible for signal charge loss in JCCDs include:

- i) inadequate time for the carriers to move over the required distance under the influence of diffusion or an externally induced field,
- ii) potential barrier humps that may exist in the channel potential profile,
- iii) potential wells that are not emptied by the externally induced field,
- iv) the trapping of carriers at bulk states and later release.

If the transfer inefficiency, $\epsilon$, is the fraction of the charge that is left behind as a charge packet is transferred from one well to the next, and n is the number of transfers, then for digital applications the transfer inefficiency product, n$\epsilon$, must be equal to or less than 0.25 if a 50 percent reduction in the noise margin between logic levels is acceptable [3.10]. The transfer inefficiency is different at voltages below and above Vch(0,0)-Vb, because of additional loss of charge due to injection into p-gates, if surplus charge is present. In the present case this loss of charge is less than 0.1%. In a typical JCCL process $\epsilon$ is measured to be between $10^{-2}$ and $10^{-3}$. Charge losses that are independent of the amount of charge are usually denoted by $\delta$. In the case where no precautions are taken to avoid large parasitic wells at the crossing of the lateral gap and the intergate gap this $\delta$ can be as large as 15% of the total charge packet if measured after 30 transfers.

Power dissipation in dynamic systems increases linearly with the clock frequency as $P = C V^2 f$. In JCCL devices, however, the contribution of vertical overflow currents dominates this figure in most cases. A convenient way to characterize power dissipation in JCCL devices is to express the total load of the clock by the equivalent number of unit area gates, $n_{eq}$, with equivalent capacitance $C_{eq}$ which is the average value over the clock voltage swing:

 $P = n_{eq} C_{eq} V^2 f = n_{eq} Q_{eq} V f$ (3.49)

$Q_{eq}$ represents the amount of charge that flows through the clock voltage source to the unit area gate, in either direction during a single clock cycle. In $C_{eq}$ the capacitance to the lateral confinement and to adjacent gates, and the capacitance of an average length of interconnect are included. In the same way the corresponding charges are included in $Q_{eq}$. Although these contributions are constant for a fixed clock voltage swing, the power dissipation is dependent on the presence of a charge packet in the JCCD channel since the charge packet contributes to $Q_{eq}$. Values for $n_{eq}$ and $Q_{eq}$ are obtained from models of the charge overflow. If we consider a typical logic circuit consisting of 10 gates in which three gates can be used for vertical overflow, $n_{eq}$ is estimated to be in the range of 50-100, and $Q_{eq}$ in the range of 0.2-0.4 pC.
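To give a feeling for the magnitudes implied by Eq. (3.49), the sketch below evaluates the clock power for the quoted ranges of $n_{eq}$ and $Q_{eq}$. The clock swing of 7 V and the clock frequency of 10 MHz are assumptions chosen for illustration (they echo the values used in the resistor example of section 3.9.1); they are not stated for this estimate in the text.

```python
def clock_power(n_eq, q_eq, v_clock, f_clock):
    """Clock power P = n_eq * Q_eq * V * f from Eq. (3.49), in watts."""
    return n_eq * q_eq * v_clock * f_clock


V_CLOCK = 7.0    # V, assumed clock voltage swing
F_CLOCK = 10e6   # Hz, assumed clock frequency

# Lower and upper ends of the ranges quoted for a typical 10-gate circuit
p_low = clock_power(n_eq=50, q_eq=0.2e-12, v_clock=V_CLOCK, f_clock=F_CLOCK)
p_high = clock_power(n_eq=100, q_eq=0.4e-12, v_clock=V_CLOCK, f_clock=F_CLOCK)

print(f"P between {p_low * 1e3:.2f} mW and {p_high * 1e3:.2f} mW")
```

Under these assumptions the estimate lies between roughly 0.7 mW and 2.8 mW; other clock swings or frequencies scale the result linearly.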

The maximum and minimum operating frequencies of JCCL devices are determined by two aspects: effects specific to vertical overflow and some general JCCD characteristics. The minimum operating frequency is determined by the JCCD dark current, which is approximately 12 nA/cm<sup>2</sup> at 300 K. If a filling of 10% of a full packet by this current can be tolerated, then the permissible storage time of a signal is just below one second. This implies a clock frequency of, say, 100 Hz if a hundred JCCD cells are passed from input to output. The use of vertical overflow limits the maximum operating frequency. In the available time for vertical overflow a full charge packet must be created. The limit is at present somewhere between 50 and 100 MHz [3.3,3.4].
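The storage-time estimate can be reproduced with a rough calculation. This is a sketch only: the unit gate area of 10x10 µm² and the full packet of 0.08 pC are assumptions borrowed from the example in section 3.9.1, and one clock cycle per JCCD cell is assumed.

```python
def storage_time(dark_current_density, gate_area, q_full, fill_fraction):
    """Time (s) in which the dark current fills fill_fraction of a full packet.

    dark_current_density : A/cm^2
    gate_area            : cm^2
    q_full               : full charge packet (C)
    """
    dark_current = dark_current_density * gate_area   # A
    return fill_fraction * q_full / dark_current


J_DARK = 12e-9           # A/cm^2 at 300 K, from the text
GATE_AREA = 1e-3 * 1e-3  # cm^2, assumed 10 um x 10 um unit gate
Q_FULL = 0.08e-12        # C, assumed full packet (Qmax at 7 V)
N_CELLS = 100            # cells between input and output, from the text

t_store = storage_time(J_DARK, GATE_AREA, Q_FULL, fill_fraction=0.10)
f_min = N_CELLS / t_store   # assumes one clock cycle per cell

print(f"storage time = {t_store:.2f} s, minimum clock = {f_min:.0f} Hz")
```

With these assumptions the storage time comes out below one second and the minimum clock frequency is of the order of 100 Hz, in line with the estimate in the text.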

The noise margin in JCCL is defined as the maximum amount of spurious charge that can be tolerated at the input of a logic stage while giving the correct output signal. The input-output characteristics for several logic circuits were measured in great detail. Also, the dependences of these characteristics on the clock voltages at several clock frequencies were measured. The conclusion was that the performance of logical circuits at clock voltages well above the channel potential is acceptable.


# CONFIGURATIONS AND IMPLEMENTATIONS

#### 4.1 Introduction

If a suitable technology is available the possibilities of constructing logic functions with JCCDs are:

- a) Logic functions using only transfers in the charge domain. In this way simple Boolean functions and, say, the carry function can be obtained.
- b) Logic functions using transfers in the charge and current domains. The following classes are distinguished:

i) Boolean logic: in this case the logic functions are based on the conditional operator implication (i.e. if x is true then so is y). Then x is said to imply y, written $x \rightarrow y$, if Table 4.1 is satisfied (a table of this sort is referred to as a truth table). The arrow symbol is used in many different contexts, so extreme care is needed in deciphering its meaning. So:

$$x \rightarrow y$$
(4.1)

is the same as

$$(\neg x) \vee y$$
(4.2)

where $\neg$ is the Boolean complementation operator.

| x | y | x→y | ¬(x→y) |
|---|---|-----|--------|
| 0 | 0 | 1   | 0      |
| 0 | 1 | 1   | 0      |
| 1 | 0 | 0   | 1      |
| 1 | 1 | 1   | 0      |

Table 4.1 Truth table of logical implication operator

ii) Threshold logic: a function $f(x_1, x_2, ..., x_n)$ is a threshold function if a set of numbers $\{w_1, w_2, ..., w_n\}$ (called weights) and a number T (called threshold) exist such that $f(x_1, x_2, ..., x_n) = 1$ if and only if:

$$\sum_{i=1}^{n} w_i x_i \ge T$$
(4.3)

where $x_i$ = 0 or 1 and the multiplication and summation are arithmetic (rather than Boolean). In JCCL, a threshold function can be realized using a single device (called a threshold element), as shown in Fig. 4.1.

Fig. 4.1 Representation of a threshold element. The numbers w<sub>1</sub>,w<sub>2</sub>...w<sub>n</sub> are weights, T is the threshold.

iii) Other mixtures of horizontal charge transport and overflow currents. The earliest full adders were designed in a largely ad hoc way, using the different possibilities of charge transport. If digital charge-coupled logic (DCCL) is transformed into JCCL the resulting devices belong to this category too.
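The definition of a threshold function given under ii) can be written out as a small executable sketch. The function name `threshold` and the example weights are illustrative choices; the code models only the logical behaviour, not the JCCL device.

```python
from itertools import product


def threshold(weights, t, inputs):
    """Evaluate a threshold function: 1 if sum(w_i * x_i) >= T, else 0.

    The products and the sum are arithmetic, not Boolean; inputs are 0 or 1.
    """
    if len(weights) != len(inputs):
        raise ValueError("one weight per input is required")
    if any(x not in (0, 1) for x in inputs):
        raise ValueError("inputs must be binary")
    return int(sum(w * x for w, x in zip(weights, inputs)) >= t)


# With unit weights, T = n gives the n-input AND and T = 1 the n-input OR.
for x in product((0, 1), repeat=3):
    assert threshold((1, 1, 1), 3, x) == (x[0] & x[1] & x[2])
    assert threshold((1, 1, 1), 1, x) == (x[0] | x[1] | x[2])

# A weighted example: the first input counts twice.
print(threshold((2, 1, 1), 3, (1, 0, 1)))
```

The final call evaluates the weighted sum 2·1 + 1·0 + 1·1 = 3 against the threshold T = 3.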

An introduction to a systematic approach to obtain functions in JCCL is given. Logic functions with JCCDs were first described by May et al. [4.1]. The inverter structure was introduced by Kleefstra. The basic AND/OR device and the wired-OR function were described by van der Klauw [4.2]. An overview of the symbols and variables used in the description of JCCL is depicted in Fig. 4.2. The rectangles represent the various gates. The area of the injector gate is taken as a unit. The areas of the normal gates and the injector gates are expressed by the size of the symbols. The size of the symbols for the overflow gates is not proportional to the area of the overflow gate; instead the function performed by the overflow gate is denoted. In the upper right corner the clock phase by which the gate is driven is indicated. The variables representing currents are typed in lower case, the variables representing charge packets are typed in capitals.

#### 4.2 General description of JCCL

# 4.2.1 Functions in charge domain

A survey of basic configurations combining different numbers of injector gates with overflow gates of different sizes is shown in Fig. 4.3. In this and the following sections the load resistors are left out for the sake of simplicity.

[Fig. 4.2 legend: gate with area A/2, connected to clock phase $\phi_k$ (k=1,2,3); gate with area A; injector gate with area A; drain; overflow gate performing the function $\geq n$; x,y,z,p,q,r are variables in the current domain; X,Y,Z,P,Q,R are variables in the charge domain.]

Fig. 4.2 Symbols and variables used in the description of JCCL. The area of the overflow gates does not have a particular meaning; the overflow gate performs the function ≥n.

Fig. 4.3 JCCL basic devices. The input currents are denoted in the upper row. In all devices the overflow gate is connected to clock phase $\phi_2$.

Figure 4.3 shows an upper triangular matrix with nonzero functions. The first row shows the OR-functions of the input variables. On the diagonal we find the AND-functions of the input variables. At the intersection of the second row and the third column, the function is the OR-function of all possible combinations of the AND-function of two variables (without using the inverse of a variable), f = xy + xz + yz. This basic function in JCCL equals the carry function as used in a full adder (FA). This function is the majority function of its three inputs. The majority function is realized by a multi-input, single-output gate whose output has the same value as the majority of its inputs. To avoid possible confusion over what constitutes a "majority", it is universally taken that the number of gate inputs is an odd number, that is n = 3 or 5 etc., from which the gate output may be expressed as

$$f(x) = \begin{cases} 1 & \text{if } \sum_{i=1}^{n} x_i \ge n/2 \\ 0 & \text{otherwise} \end{cases} \qquad (4.4)$$

Also the majority functions of more inputs can easily be realized.

If the matrix is expanded, the intersection of the third row and fourth column would give the OR function of all possible AND functions of three variables.
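The structure of this matrix can be checked by brute force. The sketch below is a purely logical model: an overflow gate "≥k" fed by n unit inputs is represented by the hypothetical helper `overflow_at_least`, and it is compared with the OR of all AND-terms of k variables.

```python
from itertools import combinations, product


def overflow_at_least(k, inputs):
    """Idealized overflow gate: 1 if at least k of the unit inputs are 1."""
    return int(sum(inputs) >= k)


def or_of_ands(k, inputs):
    """OR of all AND-terms that can be formed from k of the inputs."""
    return int(any(all(term) for term in combinations(inputs, k)))


# Row k, column n of the matrix: k <= n gives the nonzero (upper triangular) part.
for n in range(1, 5):
    for k in range(1, n + 1):
        for x in product((0, 1), repeat=n):
            assert overflow_at_least(k, x) == or_of_ands(k, x)

# Second row, third column: the carry (majority) function xy + xz + yz.
for x, y, z in product((0, 1), repeat=3):
    assert overflow_at_least(2, (x, y, z)) == (x & y) | (x & z) | (y & z)
```

The first row (k = 1) reduces to the OR function and the diagonal (k = n) to the AND function, as described above.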

#### 4.2.2 Functions in charge and current domain

To obtain a complete description of logic devices for two variables we have to define a complementation operator. For this purpose the so-called balanced injector structure is defined. The possibility of injecting a charge packet depends on both overflow currents (x and y). The truth table of the balanced injector structure is given in Fig. 4.4. The truth table is the same as that of the function $x \cdot \neg y$ or $\neg(x \rightarrow y)$. If y is 0 we have the normal injector structure. If x always equals 1, that is, if x is a so-called 1-generator, we obtain the inverter structure. The AND, OR, and invert operations form a functionally complete set of operators, so all logical functions can be constructed. Using the overflow currents x and y, the balanced injector structure, and the overflow gates " $\geq 1$ " and " $\geq 2$ " we can build a two-valued Boolean algebra.
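The logical behaviour of the balanced injector described above can be summarized in a few lines. This is an idealized sketch; the helper names are hypothetical, and the assignment of x to the emitter side and y to the base side is an assumption based on section 3.9.2.

```python
from itertools import product


def implies(x, y):
    """Logical implication x -> y, i.e. (not x) or y."""
    return int((not x) or y)


def balanced_injector(x, y):
    """Idealized balanced injector: a packet is injected only if x = 1 and y = 0.

    x models the overflow current offered to the emitter side and y the one
    offered to the base side (an assumed assignment of the two inputs).
    """
    return int(x and not y)


# The balanced injector realizes the non-implication NOT(x -> y).
for x, y in product((0, 1), repeat=2):
    assert balanced_injector(x, y) == 1 - implies(x, y)

for v in (0, 1):
    assert balanced_injector(v, 0) == v        # y = 0: normal injector
    assert balanced_injector(1, v) == 1 - v    # x = 1 (1-generator): inverter
```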

| x | y | BI | ¬y | x·¬y |
|---|---|----|----|------|
| 0 | 0 | 0  | 1  | 0    |
| 0 | 1 | 0  | 0  | 0    |
| 1 | 0 | 1  | 1  | 1    |
| 1 | 1 | 0  | 0  | 0    |

Fig. 4.4 The balanced injector structure and truth table showing the equivalence with the non-implication operation.

This description of JCCL yields, in principle, the possibility of building every logical Boolean function. However, owing to clocked logic, and for minimizing the device area and delay time, the most interesting functions are performed in a single charge transport step. Table 4.2 gives the truth tables and the names for the 16 functions of two variables. Figure 4.5 shows that all Boolean functions of two variables, except the function $F_9 = x \odot y$ (the equivalence function), can be realized in a single transport step. In particular the functions inhibition ($F_2 = x \cdot \neg y$) and exclusive-OR ($F_6 = x \oplus y$) are easily realized in JCCL. In contrast, the standard logic gates NAND and NOR are not realized easily because they need 1-generators.

|          | x=0, y=0 | x=0, y=1 | x=1, y=0 | x=1, y=1 | function | name            |
|----------|----------|----------|----------|----------|----------|-----------------|
| $F_0$    | 0        | 0        | 0        | 0        | 0        | null            |
| $F_1$    | 0        | 0        | 0        | 1        | x∧y      | AND             |
| $F_2$    | 0        | 0        | 1        | 0        | ¬(x→y)   | non-implication |
| $F_3$    | 0        | 0        | 1        | 1        | x        | transfer        |
| $F_4$    | 0        | 1        | 0        | 0        | ¬(y→x)   | non-implication |
| $F_5$    | 0        | 1        | 0        | 1        | y        | transfer        |
| $F_6$    | 0        | 1        | 1        | 0        | x⊕y      | exclusive-OR    |
| $F_7$    | 0        | 1        | 1        | 1        | x∨y      | OR              |
| $F_8$    | 1        | 0        | 0        | 0        | ¬(x∨y)   | NOR             |
| $F_9$    | 1        | 0        | 0        | 1        | x⊙y      | equivalence     |
| $F_{10}$ | 1        | 0        | 1        | 0        | ¬y       | complement      |
| $F_{11}$ | 1        | 0        | 1        | 1        | y→x      | implication     |
| $F_{12}$ | 1        | 1        | 0        | 0        | ¬x       | complement      |
| $F_{13}$ | 1        | 1        | 0        | 1        | x→y      | implication     |
| $F_{14}$ | 1        | 1        | 1        | 0        | ¬(x∧y)   | NAND            |
| $F_{15}$ | 1        | 1        | 1        | 1        | 1        | identity        |

Table 4.2 Truth tables and names of the Boolean functions of two variables.
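The numbering of the 16 functions follows from reading each truth-table row as a 4-bit binary number. The sketch below makes this explicit; the names `truth_vector` and `NAMED` are illustrative, and the column order (x,y) = (0,0), (0,1), (1,0), (1,1) is assumed.

```python
from itertools import product

# Column order of the truth table: (x, y) = (0,0), (0,1), (1,0), (1,1)
INPUTS = list(product((0, 1), repeat=2))


def truth_vector(index):
    """Truth-table row of F_index; the most significant bit is the (0,0) entry."""
    return tuple((index >> (3 - i)) & 1 for i in range(4))


# A few named functions, keyed by their index
NAMED = {
    1: lambda x, y: x & y,          # AND
    2: lambda x, y: x & (1 - y),    # non-implication (inhibition)
    6: lambda x, y: x ^ y,          # exclusive-OR
    7: lambda x, y: x | y,          # OR
    8: lambda x, y: 1 - (x | y),    # NOR
    9: lambda x, y: 1 - (x ^ y),    # equivalence
    13: lambda x, y: (1 - x) | y,   # implication x -> y
    14: lambda x, y: 1 - (x & y),   # NAND
}

for index, fn in NAMED.items():
    assert truth_vector(index) == tuple(fn(x, y) for x, y in INPUTS)

for index in range(16):
    print(f"F{index:<2d}", *truth_vector(index))
```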

The possibilities of creating functions of 3 variables in a single charge transport step can be given using the following properties:

- property 1) With the overflow gate "≥1" we can build all maxterms (3 variables forming an OR-term, each containing all the input variables in either true or complemented form).
- property 2) With the overflow gate "≥1" we can also build terms with 2 literals, in either true or complemented form, OR-ed together with the third (e.g. x.¬y+z, z.¬x+x.¬z+¬y).
- property 3) With the overflow gate "≥2" we can build the OR-function of two-literal expressions of the form xy+xz+yz and forms with the substitutions x→¬x, y→¬y, z→¬z.
- property 4) With the overflow gate "≥3" we can build all minterms (3 variables forming an AND-term, each containing all the input variables in either true or complemented form).

Fig. 4.5 JCCL devices for Boolean functions of two variables. All the non-trivial functions are given, except the equivalence function $x \odot y$, which cannot be realized in one transfer step. All overflow gates are connected to $\phi_2$.

In general, one way to proceed is to make a Karnaugh-map of the required function and consider the possibilities of expressing the function in the above described constructions. A boundary condition is formed by the geometry of possible devices.

#### 4.2.3 Functions using exclusive-OR functions

The exclusive-OR (XOR) function is easily realized in JCCL. The AND plus exclusive-OR operators form a functionally complete set of operators, but require augmenting with steady logic 1 signals to provide full functional completeness. Further, as the usually listed rules of Boolean algebra do not normally include the exclusive-OR operator, it is frequently not classed as a Boolean logic gate in the same sense as the AND, OR, NAND, NOR family. Yet it and its variations are extremely powerful logic building blocks [4.3].

The basic exclusive-OR gate is generally considered to be a two-input gate, which with input  $x_1$  and  $x_2$  obeys the following rules:

$$f(x) = \begin{cases} 1 & \text{if } x_1 \neq x_2 \\ 0 & \text{if } x_1 = x_2 \end{cases} \qquad (4.5)$$

Algebraically we may re-express this action in the more usual way:

$$f(x) = [x_1 \oplus x_2] \qquad (4.6)$$

As seen in Fig. 4.6, the three-input exclusive-OR gate may be regarded as an odd-parity gate, that is, a gate whose output is 1 when an odd number of its input signals is 1. In this way it resembles the sum output of a full adder. The three-input exclusive-OR gate is logically equivalent to two two-input exclusive-OR gates in cascade.



Fig. 4.6 Three-input exclusive-OR gates: (a) three inputs, the "2k+1" denotes that the output is a logical one if an odd number of logical ones is offered at the input; (b) three-input exclusive-OR made by a cascade of two two-input exclusive-OR gates, the "=1" denotes that the output is a logical one if exactly one input is a logical one; (c) truth table.
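The equivalences stated above (odd parity, cascade of two two-input gates, and the sum output of a full adder) can be verified exhaustively. The following sketch uses illustrative helper names and models only the logic.

```python
from itertools import product


def xor2(a, b):
    """Two-input exclusive-OR as in Eq. (4.5): 1 if the inputs differ."""
    return int(a != b)


def odd_parity(*inputs):
    """1 if an odd number of the inputs is 1."""
    return sum(inputs) % 2


def full_adder(x, y, carry_in):
    """Return (sum, carry_out) of a one-bit full adder."""
    total = x + y + carry_in
    return total % 2, total // 2


for x, y, z in product((0, 1), repeat=3):
    # odd-parity gate equals a cascade of two two-input exclusive-OR gates
    assert odd_parity(x, y, z) == xor2(xor2(x, y), z)
    s, carry = full_adder(x, y, z)
    assert s == odd_parity(x, y, z)          # sum output is the 3-input XOR
    assert carry == int(x + y + z >= 2)      # carry output is the majority
```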

# 4.2.4 Threshold logic

As has been pointed out in section 4.2.1 and Fig. 4.3, majority functions of more inputs can easily be realized. Figure 4.7 illustrates a three-input majority gate and its JCCL realization. The logic functions which are directly realized by majority (and minority) gates are in the class of symmetric functions. All symmetric functions are characterized by some "symmetry" in their input variables, whereby the interchanging of two (or more) inputs causes no change in the output function f(x). If this invariance holds for all possible pairs of input variables then the function is said to be completely symmetric. The basic AND/OR gates are clearly completely symmetric in their inputs. Similarly all the majority and p-out-of-q type functions are completely symmetric.





Fig. 4.7 Three-input majority gate (a) and its JCCL implementation (b). The output f(x) will be a logical one if the majority of its inputs are a logical one, that is, two or more inputs are "one".

Symmetric functions form an important class of Boolean functions, and recognition of the symmetry may aid in producing an efficient realization of the function [4.3].

The majority gates introduce two features as follows:

- (i) It is possible to specify a logic gate that generalizes the basic AND/OR functions.
- (ii) The functions realized by majority gates are completely symmetric.

The simple AND/OR gates and majority gates may be considered as particular and simple cases of the general class of threshold logic gates. Threshold logic gates have binary-valued input and output signals, however the gate inputs need not each have the same "importance" in determining the 0 or 1 gate output state. The threshold-logic gate is a logic circuit that can by some means "weight" its various binary inputs, sum the resultant weighted products, and give a gate output 1 or 0 if this weighted sum is above or below a chosen threshold value.

In JCCL it is possible to obtain threshold logic gates if the input weighting factors are not varied over a large range. The weights are implemented by injector gates of different sizes. Figure 4.8 illustrates the general symbol used for a threshold logic gate, together with the general expression for the gate output f(x). Figure 4.9 illustrates some specific threshold logic gates together with the threshold expressions and corresponding Boolean expressions for each. Also the design structure in JCCL is depicted.

Fig. 4.8 Symbol in the "old" symbol system (a), a possible symbol in the IEC system (b), and the general expression for a threshold logic element.

The example in Fig. 4.9(c) and (d) is a special threshold function called the 4-universal threshold function [4.4]. A gate or building block is said to be N-universal for N, a positive integer, when it can be used to realize any positive N-dimensional threshold function f by application of signals indicative of the arguments of f, and possibly signals with a constant value 0 or 1, to the inputs of said gate or building block. In the simplest example of an N-universal gate, N=4. There is only one 4-universal function of 4 arguments. A gate realizing this function has 4 inputs and it can realize any of the 10 essentially different, 3-argument positive threshold functions, plus the threshold function $\langle 2x_1+x_2+x_3+x_4 \rangle_3$. The 4-universal 4-argument function is defined by the following Boolean equation:

$$U_4 (x_1, x_2, x_3, x_4) = x_1 (x_2 + x_3 + x_4) + x_2 x_3 x_4$$
(4.7)



$$\begin{aligned} f(x) &= \langle 2x_1 + x_2 + x_3 + x_4 \rangle_3, \text{ threshold} \\ &= [x_1 (x_2 + x_3 + x_4) + x_2 x_3 x_4], \text{ Boolean} \end{aligned}$$

Fig. 4.9 Examples of a three-input threshold gate (a), with its JCCL implementation (b), and a four-input threshold gate(c) with its JCCL implementation.
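The equality of the threshold form and the Boolean form of the 4-universal function can be checked exhaustively. The sketch below also tries three substitutions; these were chosen here for illustration and are not necessarily the ones listed in Table 4.3.

```python
from itertools import product


def u4_threshold(x1, x2, x3, x4):
    """Threshold form <2*x1 + x2 + x3 + x4>_3 of the 4-universal function."""
    return int(2 * x1 + x2 + x3 + x4 >= 3)


def u4_boolean(x1, x2, x3, x4):
    """Boolean form x1(x2 + x3 + x4) + x2 x3 x4 of Eq. (4.7)."""
    return (x1 & (x2 | x3 | x4)) | (x2 & x3 & x4)


# Both forms agree on all 16 input combinations.
for x in product((0, 1), repeat=4):
    assert u4_threshold(*x) == u4_boolean(*x)

# Illustrative substitutions (assumptions of this sketch):
for a, b, c in product((0, 1), repeat=3):
    assert u4_threshold(0, a, b, c) == (a & b & c)            # x1 = 0: AND
    assert u4_threshold(1, a, b, c) == (a | b | c)            # x1 = 1: OR
    assert u4_threshold(a, b, b, c) == int(a + b + c >= 2)    # x2 = x3: majority
```

Fixing an input to a constant or tying two inputs together thus turns the single four-input gate into different three-argument threshold functions.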

Table 4.3 demonstrates that this function is 4-universal by making substitutions of variables which correspond to the application of signals. The first ten functions are the 10 different, positive threshold functions of 3 arguments.

Table 4.3 The substitutions in the 4-universal threshold function  $(x_i)$  for obtaining the 10 essentially different positive threshold functions of 3 variables [4.4].

A historical note must be made. The research on threshold logic gates was carried out during the era which began in 1963 and ended in 1973. The reasons for the increased interest in threshold logic gates and functions then were the discoveries of some important potential advantages over traditional Boolean realizations. The end of the period was marked by the notion that a possible new technology was necessary for efficient implementation of threshold gates. A selected literature survey on threshold logic can be found in [4.5].

#### 4.3 The transformation of DCCL into JCCL

The most elaborate research published on charge-coupled logic is that on digital charge-coupled logic [4.6]. The transformation presented here offers the possibility of obtaining the whole range of logic functions used in DCCL.

Fig. 4.10 AND/OR functions in DCCL and JCCL. (a) DCCL AND/OR structure; (b) Decomposition-level drawing of the AND/OR structure; (c) Equivalent JCCL AND/OR structure.

To make the transformation of DCCL structures into JCCL structures clear, the structures are first described by decomposition-level drawings. Decomposition-level drawings show the physical and logical relations between input and output charges (and currents) and logic values [4.7]. It should be emphasized that the models describe ideal physical relations between inputs and outputs of gates.

Figure 4.10 (a) shows the basic AND/OR function in DCCL. The operation of this AND/OR function can be explained briefly in the following way. The charge packets X and Y are transferred towards the common storage electrode D. If both inputs are logical ones, the common storage electrode will contain a charge quantity which is twice that of a logical one. This quantity is normalized by providing a potential barrier. One charge quantity is spilled over the potential barrier and forms the logical AND function. The charge that is left is transported on an alternate clock phase and forms the logical OR. In Fig. 4.10 (b) a decomposition-level drawing is depicted. The left part of Fig. 4.10 (b) shows the decomposition-level drawing of an addition in the charge domain, the right part shows the structure performing the $\geq 2$ operation together with a shift of the charge that remains. Figure 4.10 (c) shows the equivalent JCCL AND/OR.

The DCCL AND/OR gate may be altered to fulfill the exclusive-OR function. The exclusive-OR function is the sum output of the half adder. The DCCL half adder function is shown in Fig. 4.11 (a). In the exclusive-OR implementation the output is taken from the OR function output. However, the output is corrected for the (1+1) state by detecting the AND output with a special gate that changes the potential of the transfer gate and blocks the OR output. In this way the sum of the half adder is obtained. The charge that is left if the transfer gate is blocked is transferred and forms the carry function of the half adder, as is illustrated in the truth table in Table 4.3. The decomposition-level drawing of the DCCL half adder is depicted in Fig. 4.11 (b). It is obtained by considering the carry output and the branch with the barrier to form functionally one part. This part yields as a functional result the carry, and if complemented and AND-ed with the OR output it yields the sum of the half adder. This result is illustrated in the truth table in Table 4.4. The new symbol in the decomposition-level drawing is the symbol for the function Inhibition (x, y). An overview of the symbols used in the decomposition-level drawings is given in Fig. 4.12.
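The decomposition described above can be followed step by step in a small idealized model. The three building blocks below (addition, semi-threshold and inhibition) are modelled after the symbols of Fig. 4.12; the function names are illustrative, and the semi-threshold is assumed to switch at z ≥ k.

```python
from itertools import product


def addition(*charges):
    """Addition in the charge domain: the packets are merged in one well."""
    return sum(charges)


def semi_threshold(z, k):
    """Semi-threshold >= k, returning (F_yes, F_no).

    F_yes is 1 if z >= k, else 0. F_no is the charge that remains:
    k - 1 if z >= k, else z.
    """
    if z >= k:
        return 1, k - 1
    return 0, z


def inhibition(x1, x2):
    """Inhibition: x1 AND NOT x2."""
    return int(x1 and not x2)


def and_or(x, y):
    """AND/OR structure: addition followed by the >= 2 operation."""
    f_and, f_or = semi_threshold(addition(x, y), 2)
    return f_and, f_or


def half_adder(x, y):
    """Half adder: the carry is the AND output, the sum is OR inhibited by the carry."""
    carry, or_out = and_or(x, y)
    return inhibition(or_out, carry), carry


for x, y in product((0, 1), repeat=2):
    assert and_or(x, y) == (x & y, x | y)
    assert half_adder(x, y) == (x ^ y, x & y)
```

The model describes ideal relations between inputs and outputs only; it says nothing about timing or charge losses.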

To obtain the half adder circuit in JCCL we have to realize that the charge transfer in a channel cannot be blocked by a change in the potential on a gate in the same way as it is done in DCCL. This is due to the fact that



Fig. 4.11 Half adder function in DCCL and JCCL. (a) DCCL half adder structure; (b) Decomposition-level drawing of the half adder structure; (c) To make it possible to find the JCCL equivalent half adder structure, the drawing is extended; (d) Equivalent JCCL half adder structure.


Fig. 4.12 Symbols used in decomposition-level drawings.

- Addition: $F = x_1 + \cdots + x_n$
- Semi-threshold ($\geq k$): yes-output $F_{Yes} = 1$ if $z \geq k$, $F_{Yes} = 0$ if $z < k$; no-output $F_{No} = k-1$ if $z \geq k$, $F_{No} = z$ if $z < k$
- Inhibition: $F = [x_1 - x_2]$, Boolean

| X | Y | S | C |
|---|---|---|---|
| 0 | 0 | 0 | 0 |
| 0 | 1 | 1 | 0 |
| 1 | 0 | 1 | 0 |
| 1 | 1 | 0 | 1 |

Table 4.3 Truth table of a half adder.

| X | Y | ≥2 (carry) | OR | OR·¬(≥2) (sum) |
|---|---|------------|----|----------------|
| 0 | 0 | 0          | 0  | 0              |
| 0 | 1 | 0          | 1  | 1              |
| 1 | 0 | 0          | 1  | 1              |
| 1 | 1 | 1          | 1  | 0              |

Table 4.4 Truth table of the half adder built with an OR function AND-ed with the complement of the function ≥2.

| x | y | C' | C (carry) | ¬C·OR | (¬C·OR)+xyC' (sum) |
|---|---|----|-----------|-------|--------------------|
| 0 | 0 | 0  | 0         | 0     | 0                  |
| 0 | 0 | 1  | 0         | 1     | 1                  |
| 0 | 1 | 0  | 0         | 1     | 1                  |
| 0 | 1 | 1  | 1         | 0     | 0                  |
| 1 | 0 | 0  | 0         | 1     | 1                  |
| 1 | 0 | 1  | 1         | 0     | 0                  |
| 1 | 1 | 0  | 1         | 0     | 0                  |
| 1 | 1 | 1  | 1         | 0     | 1                  |

Table 4.5 Truth table of a full adder, illustrating the building blocks for constructing a full adder in DCCL and JCCL.

the JCCD is a buried-channel CCD. However, if we have a current, the injection mechanism of an injector structure can be blocked with another current (balanced injector structure). To find the JCCL structure for the same operation we first extend the decomposition-level drawing. Without loss of generality, we can insert a ' $\geq$ 1' building block after the ' $\geq$ 2'. By testing the ' $\geq$ 1' condition, in JCCL, we obtain an overflow current that can be injected in another channel. The 'no'-output of the  $\geq$ 1 condition is the functional zero and can be omitted. Figure 4.11 (c) gives the adapted decomposition-level drawing and Fig. 4.11 (d) the equivalent JCCL half adder circuit.
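The decomposition into building blocks can be mimicked in a few lines of Python. This is a minimal sketch, not a device model: charge packets are represented by small integers, the function names are hypothetical, the no-output of the semi-threshold block is assumed to be k-1 whenever z ≥ k, and the Boolean inhibition [x1 - x2] is assumed to clamp at 0.

```python
def addition(*packets):
    """Addition block: merge charge packets, F = x1 + ... + xn."""
    return sum(packets)


def semi_threshold(z, k):
    """Semi-threshold block '>=k', returning (F_yes, F_no).

    F_yes is 1 when z >= k (an overflow occurs), otherwise 0.
    F_no is the charge left behind: k - 1 when z >= k, otherwise z
    (assumed reading of the symbol table).
    """
    if z >= k:
        return 1, k - 1
    return 0, z


def inhibition(x1, x2):
    """Inhibition block: F = [x1 - x2], clamped to the Boolean values 0 and 1."""
    return 1 if x1 - x2 > 0 else 0


def half_adder(x, y):
    """Half adder assembled from the blocks above; returns (sum, carry)."""
    z = addition(x, y)
    carry, or_out = semi_threshold(z, 2)  # the no-output of '>=2' equals x OR y
    total = inhibition(or_out, carry)     # OR AND-ed with the complement of carry
    return total, carry


if __name__ == "__main__":
    for x in (0, 1):
        for y in (0, 1):
            print(x, y, half_adder(x, y))
```

With two Boolean inputs the no-output of the '≥2' block is never larger than one packet, which is why it can serve directly as the OR signal in this sketch.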


The full adder circuit in DCCL, Fig. 4.13 (a), can be implemented by adding a third input to the input AND gate, an additional barrier and a storage location, and an OR gate. In this way, the carry function still occurs if two or more inputs are logical ones: carry = xy+yz+zx. The sum function is now built as the OR of ¬carry·OR (as in the half adder) and x·y·z. The truth table in Table 4.5 illustrates this functioning. The decomposition-level drawing for this full adder is depicted in Fig. 4.13 (b). This drawing can be directly mapped into a JCCL full adder structure. The JCCL full adder is shown schematically in Fig. 4.13 (c).
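The Boolean construction of the full adder can be checked exhaustively. The sketch below is only a logic-level illustration with hypothetical function names; it says nothing about the charge-domain implementation.

```python
from itertools import product


def full_adder(x, y, z):
    """Full adder built from carry, OR and the three-input AND; returns (sum, carry)."""
    carry = (x & y) | (y & z) | (z & x)           # two or more inputs are 1
    or_out = x | y | z
    total = ((1 - carry) & or_out) | (x & y & z)  # (NOT carry AND OR) OR xyz
    return total, carry


def matches_binary_addition():
    """True when sum + 2*carry equals x + y + z for all eight input combinations."""
    for x, y, z in product((0, 1), repeat=3):
        total, carry = full_adder(x, y, z)
        if total + 2 * carry != x + y + z:
            return False
    return True


if __name__ == "__main__":
    print(matches_binary_addition())
```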

The full adder circuit obtained in this way is basically








Fig. 4.13 Full adder function in DCCL and corresponding structure in JCCL

- (a) DCCL full adder
- (b) decomposition-level drawing for the full adder
- (c) equivalent JCCL full adder structure.






the same full adder as described by van der Klauw [4.2], which was derived in a heuristic way. The transformation formalism shows that it is possible to obtain the DCCL functions in JCCL.

### 4.4 Full adder using exclusive-OR gates in JCCL

If the fan-out of the logical devices is greater than 2, the logical operations can also be performed in separate structures. The implementation of a half adder can be straightforward using an exclusive-OR configuration for the sum-output:

$$S = \neg y \cdot x + \neg x \cdot y = x \oplus y \tag{4.8}$$

and an AND configuration for the carry-output:

$$C = x \cdot y \tag{4.9}$$

The configurations are depicted schematically in Fig. 4.14. If we use a cascade of half adders, the sum and carry-output have to be of the same type of gate, then the



Fig. 4.15 Schematic topview JCCL full adder using exclusive-OR building blocks.

number of gates is the same as in the previously considered half adder. But one resistor is saved and the delay in the circuit is only one-third of the delay in the previous half adder.

The full adder is also easily implemented using exclusive-OR gates. The sum-output is realized with a cascade of two exclusive-OR gates:

$$S = (x \oplus y) \oplus z \tag{4.10}$$

While the carry is obtained using a  $\geq 2$  gate as can be seen



Fig. 4.16 Photomicrograph of the JCCL full adder using exclusive-OR gates.

in Fig. 4.15. The z-input must be delayed before it is combined with the other inputs. In practical applications this could be an advantage (see for example section 6.4 of this thesis). Figure 4.16 shows a photomicrograph of this full adder. It is not easy to compare this full adder, at this stage, with the previous one. A comparison between complete cells, in a way that is possible to cascade full adder cells, is made in the last section of this chapter.
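The role of the delayed z-input can be illustrated with a small stream model. This is a sketch under stated assumptions: each exclusive-OR stage is taken to cost exactly one step, the z-input is held for one step so that it meets x ⊕ y, and the function names are hypothetical. The carry is shown as a plain '≥2' test without any timing.

```python
def xor_full_adder_sums(xs, ys, zs):
    """Return the sum bits for streams of x, y and z inputs.

    Stage 1 forms x XOR y while z waits one step (the assumed delay);
    stage 2 combines the two, giving S = (x XOR y) XOR z.
    """
    sums = []
    held = None  # (x XOR y, delayed z) waiting between the two stages
    for x, y, z in zip(xs, ys, zs):
        if held is not None:
            partial, z_delayed = held
            sums.append(partial ^ z_delayed)
        held = (x ^ y, z)
    if held is not None:  # flush the last pair out of the pipeline
        partial, z_delayed = held
        sums.append(partial ^ z_delayed)
    return sums


def carry(x, y, z):
    """'>=2' gate: 1 when at least two of the three inputs are 1."""
    return 1 if x + y + z >= 2 else 0


if __name__ == "__main__":
    xs = [0, 0, 0, 0, 1, 1, 1, 1]
    ys = [0, 0, 1, 1, 0, 0, 1, 1]
    zs = [0, 1, 0, 1, 0, 1, 0, 1]
    print(xor_full_adder_sums(xs, ys, zs))
    print([carry(x, y, z) for x, y, z in zip(xs, ys, zs)])
```

Without the held z value, stage 2 would combine x ⊕ y with the z bit of the next operand set, which is why the delay is needed in a pipelined cell.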

### 4.5 Full adder/full subtractor

As an application of the more general theory of section 4.2.2, in which functions with two variables in a Boolean logic were discussed, the full subtractor is discussed. The truth table of a full subtractor is shown in Table 4.6; it shows the borrow (B) and difference (D) functions. The difference function equals the sum function in a full adder. Figure 4.17 shows the Karnaugh map of the borrow function. It shows that

| x<sub>i</sub> | y<sub>i</sub> | B<sub>i</sub> | B<sub>i+1</sub> | D<sub>i</sub> |
|----------------|----------------|----------------|------------------|----------------|
| 0              | 0              | 0              | 0                | 0              |
| 0              | 0              | 1              | 1                | 1              |
| 0              | 1              | 0              | 1                | 1              |
| 0              | 1              | 1              | 1                | 0              |
| 1              | 0              | 0              | 0                | 1              |
| 1              | 0              | 1              | 0                | 0              |
| 1              | 1              | 0              | 0                | 0              |
| 1              | 1              | 1              | 1                | 1              |

Table 4.6 Truth table for subtraction. B is the borrow from a previous stage. D stands for difference. The function performed is x minus y





$$B = \neg x \cdot y + \neg x \cdot z + y \cdot z \tag{4.11}$$

and equals the carry function if the x is complemented. Using the property that with the overflow gate " $\geq$ 2" we can build the OR-function of two-literal expressions of the form xy+yz+xz and the forms with the substitutions x→¬x, y→¬y or z→¬z (property 3 in section 4.2.2), this borrow function can easily be implemented. The combination of the carry and borrow function in one device is possible. In this way we can obtain a full adder/full subtractor cell in which no additional logic parts, such as exclusive-ORs, are necessary.

The cell is composed bearing in mind that the sum function as well as the difference function are obtained in two charge transfer steps using two exclusive-OR gates. The combined carry/borrow device is, together with its components, drawn in Fig. 4.19. The functions are determined by a control signal M such that:

$$f(x,y,z,M) = \begin{cases} xy + yz + xz & \text{if } M=0, \\ \neg x\,y + yz + \neg x\,z & \text{if } M=1. \end{cases} \tag{4.12}$$

A photomicrograph of this combined carry/borrow function is shown in Fig. 4.18.
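The combined carry/borrow function can be checked against ordinary binary addition and subtraction. The sketch below is a logic-level illustration only: complementing x with an exclusive-OR on M is a modelling shortcut chosen here, not a description of how the device realizes the control signal, and the function names are hypothetical.

```python
from itertools import product


def carry_borrow(x, y, z, m):
    """Combined function: carry xy + yz + xz if m == 0, borrow (NOT x)y + yz + (NOT x)z if m == 1."""
    xe = x ^ m  # x is complemented when M = 1 (modelling shortcut)
    return (xe & y) | (y & z) | (xe & z)


def sum_or_difference(x, y, z):
    """Sum and difference share the same function: x XOR y XOR z."""
    return x ^ y ^ z


def matches_arithmetic():
    """Check both modes against integer arithmetic for all eight input combinations."""
    for x, y, z in product((0, 1), repeat=3):
        d = sum_or_difference(x, y, z)
        if x + y + z != d + 2 * carry_borrow(x, y, z, 0):
            return False  # adder mode: x + y + carry-in
        if x - y - z != d - 2 * carry_borrow(x, y, z, 1):
            return False  # subtractor mode: x - y - borrow-in
    return True


if __name__ == "__main__":
    print(matches_arithmetic())
```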

### 4.6 Threshold logic full adder

Conventional threshold gate full adders can be found in the literature on threshold gates published during the 1960s. Two designs are depicted in Fig. 4.20. The first, Fig. 4.20 (a), shows a full adder based on two 4-universal building blocks [4.4]. It operates as an adder with input



Fig. 4.18 Photomicrograph of a combined carry/borrow device in JCCL.



Fig. 4.19 Schematic topview:

- (a) carry device for full adder
  (b) borrow device for full subtractor
  (c) combined carry/borrow device.



Fig. 4.20 Threshold gate full adders: (a) using two U<sub>4</sub> building blocks (b) simplified by using a '≥2' carry.

signals x, y, and C(i). One U<sub>4</sub> block produces the sum-output S(i) and the other produces the carry-output C(i+1). Figure 4.20 (b) shows a full adder that is obtained by simplifying the part that produces the carry signal [4.8]. The schematic topview of this full adder is shown in Fig. 4.21. In the chapter on experimental results, the actual layout and design considerations are discussed (section 5.2.3).

### 4.7 Summary and comparison of full adders

In this chapter several full adders have been presented. The adders have different advantages and disadvantages. Further, the list of adders is not complete, for example,



Fig. 4.21 Schematic topview of JCCL full adder based on the simplified circuit of fig. 4.20 (b).

full adders can be obtained using majority gates only.

The full adder (FA-1), first described, was obtained by a transformation from DCCL into JCCL. It possesses the advantage that the inputs are combined in a single channel which gives the sum and carry. The fan-out, if the adders are used in a pipeline manner, is one. A disadvantage of this adder is the number of transport shifts in the charge domain that is necessary, namely four. The number of load resistors is four.

The second full adder (FA-2), which uses exclusive-OR gates to obtain the sum, only uses two transport shifts, in the charge domain, and three load resistors are necessary. The fan-out of two could be a disadvantage. If a full adder is used in a cell for pipeline multiplication

overflow current is increased by approximately 40 % if the epilayer thickness is decreased from 7  $\mu$m to 5  $\mu$m [5.1].

In this chapter experimental results of the logic functions realized in the 5  $\mu$m epilayer process are presented. Also, some comments will be given on results with JCCD logic functions claimed in the literature. The experimental results on the logic functions are split up into two categories, belonging to different values of the number of transfers, n. Section 5.2 shows results on JCCL if  $n \le 2$. In this section the experimental results on simple JCCD logic functions, as described in two articles [5.2,5.3], are presented. Experiments with the threshold full adder complete this part. In section 5.3 the results on JCCL if more than two transfers are involved are discussed. The final section, 5.4, describes an interesting experiment which is related to the JCCL research, but violates an important condition on the values of the terminal voltages. It shows the possibility of having a charge-coupled logic at clock voltages under the channel potential Vch(0,0) [5.4].

### 5.2 Results on JCCL, $n \leq 2$

### 5.2.1 Simple JCCD Logic at 20 MHz [5.2]

Using the properties of junction charge-coupled devices, simple logic functions operating up to 20 MHz can be obtained. The AND function of two inputs and the carry function of three inputs show an improvement of clock frequency, when compared with earlier results of junction charge-coupled logic and other synchronously clocked CCD logic, such as digital charge-coupled logic and multivalued logic CCDs.

Figure 5.1 shows schematically the configuration of the AND and carry functions, basically consisting of one JCCD channel. Charge packets are injected into the channel by means of an injector structure. The clock phases are indicated in the upper right corners of the gates. The arrow denotes an overflow current, if surplus charge is present. If more than one well underneath an injector is



Fig. 5.1 Configuration of gates for carry and AND functions



Fig. 5.2 Schematic diagram of the device structure.

filled, an overflow current will occur at the next clock phase. This overflow current can be injected elsewhere, forming the result of the function. The remaining packet, underneath the overflow gate, is transported to the drain. Load resistors have been left out for simplicity. A schematic diagram of the total test structure is shown in Fig. 5.2

A photomicrograph of the logic circuit, performing the carry function, is shown in Fig. 5.3. The channel potential is 5.2 V and the clock voltage is 6.3 V. Internal load resistors are 2.2  $\Omega$ . The results of the AND and carry functions at clock frequencies of 21.1 MHz are shown in Fig. 5.4. Fig. 5.4a shows the AND functions. Figs. 5.4b and c show all permutations of input signals, performing the carry function of three inputs, with carry (x, y, z) = xy + yz + xz. This function is an essential part of a full adder.



20 µm

Fig. 5.3 Photomicrograph of the carry function.

Earlier publications on JCCL demonstrated the operation of AND and NAND functions at frequencies up to 5 MHz [5.5]. Logic circuits with multivalued CCDs operated up to 1 MHz [5.6]. Groups working on digital charge-coupled logic (DCCL) reported a half-adder successfully operating at 5 MHz [5.7].

### 5.2.2 Junction charge-coupled logic operating up to clock frequencies of 40 MHz [5.3]

Using the properties of junction charge-coupled logic, logic functions operating up to 40 MHz can be obtained. A test device has been developed which minimizes parasitic transport channels between non-neighboring gates. Experiments with this device, performing the carry function of three inputs, show an improvement of clock frequency when compared with earlier results with junction charge-coupled logic and other synchronously clocked CCD logic, such as digital charge-coupled logic and multivalued logic CCDs. The operation frequency has been increased up to 40 MHz.





Fig. 5.4 Experimental results at 21.1 MHz: (a) AND function (b),(c) carry function





Fig. 5.5 JCCL device performing the carry function at higher frequencies.

Figure 5.5 shows a drawing of the tested device, which performs the carry function of three input currents. The three inputs are indicated with arrows. The input currents are converted into charge packets using a charge injector structure, which is basically an additional n+ diffusion in a p-gate (the npn transistor created in this way has its collector in the JCCD channel). The charge packets under the input gates are transported towards a central gate where the function is performed. The gate can only contain one charge packet; excess charge causes an overflow current representing the functional result.

On the left side of this central gate a small gate and a drain, respectively, are visible. The main improvements in this test device are a 5  $\mu$ m epilayer thickness and a constriction near the small gate. In previous devices electrons drifted around the corners towards the drain instead of contributing to the vertical charge transport of the central gate.

The response of the carry circuit at a clock frequency of 40.7 MHz is shown in Figs. 5.6a and 5.6b. The lower traces present the output signal across 50  $\Omega$, 10 mV/div. The other traces present the input signals; a high-level voltage implies the input of charge that is sampled by one clock phase. The clock voltage is 7 V and the internal load resistors are 2.2 k$\Omega$. The photographs show all permutations of input signals, performing the carry function, with carry(x,y,z) = xy + yz + xz. This function is an essential part of a full adder.

The results of the presented basic JCCL function show that it is possible to obtain charge-coupled logic at interesting frequencies. If CCD logic is combined with the inherent small memories, it provides an interesting opportunity for pipeline architectures on the bit-level [5.8].





Fig. 5.6 Response of the carry function at 40.7 MHz.

### 5.2.3 Threshold full adder

This section shows the results of the threshold full adder, which has been described in section 4.6. The actual layout of the full adder and the peripheral circuitry of the test device is depicted in Fig. 5.7. It shows the p-gates, the resistors, and the n+-diffusions (dotted areas); it also shows the contact holes. In the layout, the following parts can be distinguished:

- a) the input channels  $I_1$,  $I_2$, and  $I_3$,
- b) the 1-generator (1),
- c) the output channels  $O_1$  and  $O_2$,
- d) the part performing the carry function, A,
- e) the part creating the sum signal, C,
- f) the central drain in the full adder structure, B.

The full adder structure is described in section 4.6. In the implementation special attention is given to the shape of the boundaries near the central drain. Constrictions in the main channel are made to avoid parasitic channels, through which electrons can reach the central drain before they are used to create an overflow current.

The value of the resistors is 4 kΩ. The input and output structures are standard JCCL structures. For the full adder function only the parts A, B, C, and three resistors are necessary. The remaining parts are input and output structures. The size of the full adder part is approximately 180x100  $\mu$ m<sup>2</sup>.

Figure 5.8 shows the response of the circuit at a clock frequency of 1.1 MHz. The clock voltage is 5.8 V. The upper trace shows the carry signal. The second trace shows the output of the sum channel. The three lower traces represent the input signals. In this case two input signals are exactly the same. The picture shows the input combinations (0,0,0), (1,1,0), and (1,1,1). The operation can easily be verified.



Fig. 5.7 Layout of the threshold full adder. The layout shows the diffusions for the gates and n<sup>+</sup>-areas (dotted), and the contact holes. The parts that can be distinguished are explained in the text.

Chapter 5



Fig. 5.8 Response of the full adder circuit.

### 5.3 Results on JCCL, n > 2

Measurements on basic logic functions with more than two charge transfers, realized in the process with  $5\mu m$ epilayer thickness, reveal a problem. The functions show that the logic operations are not performed correctly for every combination of input signals. They all show that the first "1" which should be present in the output signal is absent.

A first question is whether this is a consequence of the decrease in epilayer thickness. To answer the question we investigate references [5.1], [5.5] and [5.9], which claim results with JCCD logic functions in the 7  $\mu$m epilayer process. Careful inspection shows that the logic operations are not performed correctly there either. In reference [5.1] the response to the input signals shows that the 'spike' of the first "1" is reduced to approximately 10% of its normal length. This is the case in both the AND-OR-INVERT circuit and the full adder circuit. In the first case, due to the same mechanism, the switch-off also takes too long. In reference [5.9] one input, the lower input in the sum, supplies more than one charge packet, resulting in extra 'spikes' in the output. All these circuits obtain the logical result in more than one clock cycle.

Figure 5.9 clarifies the statement that the logic operations are not performed correctly. Because of the delay properties of the CCD, a direct comparison between the input and output signals is sometimes difficult. To examine the operation, the input signals are delayed by a time  $\Delta t$  in the figures on the right side. In these drawings the clock feedthrough is removed as well. The arrows denote the incorrect responses. Two remarks should be made on this procedure. First, the delay  $\Delta t$  is only known up to 2/3 of a clock cycle,  $\Delta t = \Delta t' \pm 3/2 \phi$ (clock). This is due to uncertainties in the delay in the input and output structures. However, the proposed shifts are obvious. The second remark concerns the small signals, especially those at the beginning of the response. All output pulses are a result of overflow signals. This implies that even small signals in the response are a result of putting together charge packets that create a surplus charge of at least 40 %. The last oscillogram, Fig. 5.9d, shows another published oscillogram concerning the operation of JCCL. It is clear that this one cannot be investigated because of the absence of the input signals. A possible explanation of the described incomplete operations is the backward charge flow discussed in section 3.6.3, which results from the non-ideal technology of the present JCCL.

### 5.4 Charge coupled transistor logic, a JCCL compatible logic at clock voltages down to 2 V [5.4]

The possibilities for a charge-coupled logic operating at low clock voltages are investigated. Charge-coupled transistor logic (CCTL) is a variant of junction charge-coupled logic (JCCL). It is driven by a low voltage clock swing. The described experiment shows the AND function realised in CCTL operating at 12.5 MHz and driven by a clock voltage swing of 3 V. Compared with other synchronously clocked CCD logic, the experiment shows a decrease of the clock voltage swing.

The experiment has been carried out to investigate the possibilities of a logic device, compatible with Junction





Fig. 5.9 Responses of the various logic devices in JCCL. The responses can be found in: (a) [5.9]; (b), (c) [5.1]; (d) [5.5]. In the text an explanation is proposed for the irregularities denoted by an arrow in the right parts.


Charge-Coupled Logic (JCCL), operating at low clock voltages. Charge-Coupled Transistor Logic (CCTL) combines properties of bipolar transistors and junction charge-coupled devices.

CCTL probably fits best in the class of charge-coupled logic. The disadvantages of charge-coupled logic in MOS processes, like DCCL [5.10], include the inability to directly interconnect two physically separated locations with a metal conductor, the performance only at low frequencies (typically up to 5 MHz), and the high clock voltage swing, which makes it almost impossible to be compatible with any 5 Volt (digital) MOS process. JCCL [5.5], made in a bipolar process, solves some of these problems. It is compatible with bipolar circuitry, it can connect physically separated locations with metal conductors, and it can operate at frequencies up to 40 MHz. However, it is driven by a clock voltage swing of 6-10 Volt.

CCTL, made in this experiment in the same technology as JCCL, can have the same advantages as JCCL, but is driven by a clock voltage swing of less than 3 Volt.

Figure 5.10 shows a photomicrograph of the device performing the AND function of two input currents, i(x) and i(y). The cross-section along line PP' is depicted in Fig. 5.11. The structure consists of a lightly doped p-type substrate and an n-type epilayer with diffused p-type gates. The p-gates are connected to a three-phase clock.

The cross-section shows an injector gate at clock phase  $\phi_1$, the central gate, which performs the AND function, at  $\phi_2$, a gate at  $\phi_3$, and the n+ drain connected to a voltage source at voltage V<sub>n+</sub>.

The result of the AND function in CCTL at a clock frequency of 12.5 MHz is shown in Fig. 5.12. The output spikes are delayed by 2/3 of the clock period, due to the charge-coupled device operation. One part of this delay is caused by the transfer in the logic function, the other part is caused by the transport in an input circuit that converts a charge packet into a current. The circuit has been tested up to 15 MHz. The clock voltage swing is 3



Fig. 5.10 Photomicrograph of the device performing the AND function.



Fig. 5.11 Cross-section along the line PP'.





V, and the voltage on the n+ gate, V<sub>n+</sub>, equals 4 V. The successful operation of the logic device is determined by the proper choice of the potential on the n+ gate at a given clock voltage swing. Figure 5.13 gives the relation between these voltages at different frequencies. The line indicates the minimal values that are acceptable.

### Discussion

The transportable charge (Qs) in a JCCL can be split up into three components, Qs = Qw0 + Qwe + Qf, where Qw0 is the charge in the potential well under the condition that the gate-epilayer junction is reverse-biased, Qwe is the excess charge that can be stored in the potential well if the gate-epilayer junction is forward-biased but no measurable substrate current is flowing, and Qf is the excess of majority charges compensating the charge of the traversing minority carriers which make up the forward transport current. In CCTL the drain becomes a source, Vn+ < 6 V, which fixes the potential in the epilayer. This can be seen as a background charge. The potential on Vn+ is chosen such that the background charge fills the potential well until the pnp transistor becomes active. In this way the transportable charge is probably only Qf. Further research is necessary.



Fig. 5.13 Relation between the clock voltage,  $V_{cl}$, and the voltage on the n<sup>+</sup>-gate,  $V_{n^+}$.


## SYNTHESIS

#### 6.1 Introduction

As has been stated in section 2.3, junction charge-coupled logic is a technology for bit level systolic arrays. It is timed by a global clock, and the transport of charge packets is related to a unit time delay. If JCCL is used in a regular array consisting of modular cells with only neighbor cell interconnections, in which the throughput is independent of the size of the array, then we obtain a bit level systolic array.

Several other technologies can be used for implementing bit level systolic arrays. The most advanced devices implement important signal processing functions using systolic arrays of single bit processors based on gated full adders in a CMOS process. These commercial chips include a bit-slice correlator chip, FIR filters, and Winograd Fourier transforms [6.1].

Parallel bit-level pipelined VLSI designs for signal processing using a mature CMOS technology show a multiplication throughput of 70 MHz [6.2]. Although not completely pipelined, so not systolic on the lowest level, these circuits are excellent examples of building blocks for implementing those signal processing algorithms where throughput and real-time operation are the major concerns, and latency is not a critical factor.

Bit level systolic architectures are very powerful for implementing signal processing functions with JCCL. Systematic approaches to establish a method whereby system level systolic circuits are constructed from building blocks which are themselves systolic arrays at the bit level have been discussed by McCanny and McWhirter [6.1], Danielson [6.3], Li and Wah [6.4], S.Y. Kung [6.5] and Deprettere et al. [6.25].

McCanny and McWhirter investigated bit level systolic arrays for computing sums of products. For application in bit parallel multipliers, they combined two designs, that of McCanny and McWhirter [6.6] and that of Hoekstra [6.7], which were derived on a heuristic basis. Section 6.3 discusses a spiral systolic array for bit level division of two words. It accomplishes the design of the full adder/ full subtractor cell of section 4.5. Finally, section 6.4 discusses the use of JCCL in bit level systolic arrays [6.8].

### 6.2 Some pipelined multiplier arrays for bit level systolic array architectures

In 1982 McCanny and McWhirter proposed bit level systolic arrays to improve the pipelining rate of systolic arrays (at the word level) and to better utilize the current integration level of VLSI technology [6.9]. They described a circuit for the pipelined multiplication of two continuous streams of 4-bit positive numbers.

The array, depicted in Fig. 6.1, is a pipelined carry-save multiplier, but it differs from previously proposed carry-save devices [6.11] in one important respect. Each bit of b(n) interacts with only one bit of a(n) on a given clock cycle and so no broadcasting of data takes place.



Fig. 6.1 Bit level systolic multiplier array based on carry-save algorithm [6.9].

Each cell comprises a full adder, some simple logic and a number of delays (latches). The logic performed is:

$$S = S' \oplus (a \cdot b) \oplus C'$$

$$C = (a \cdot b) \cdot S' + (a \cdot b) \cdot C' + S' \cdot C'$$

The rectangles represent delays which enable multiplications to be carried out in a completely pipelined and parallel way. The multiplication array is already systolized.
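The cell logic given above amounts to a full adder whose third operand is the bit product a·b. A minimal logic-level sketch (hypothetical function names, no delays or latches modelled) confirms that the outgoing sum and carry encode S' + a·b + C':

```python
from itertools import product


def multiplier_cell(a, b, s_in, c_in):
    """One cell: S = S' XOR (a.b) XOR C', C = (a.b).S' + (a.b).C' + S'.C'."""
    p = a & b
    s_out = s_in ^ p ^ c_in
    c_out = (p & s_in) | (p & c_in) | (s_in & c_in)
    return s_out, c_out


def cell_is_consistent():
    """True when S + 2C equals S' + a*b + C' for all sixteen input combinations."""
    for a, b, s_in, c_in in product((0, 1), repeat=4):
        s_out, c_out = multiplier_cell(a, b, s_in, c_in)
        if s_out + 2 * c_out != s_in + a * b + c_in:
            return False
    return True


if __name__ == "__main__":
    print(cell_is_consistent())
```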

The numbers a and b to be multiplied are input to the circuit along the upper edges of the array with their constituent bits staggered by means of external delays, as shown. The most significant bit of a (i.e.  $a_2$) and the least significant bit of b (i.e.  $b_0$) enter the circuit first. The second most significant bit of a and the second least significant bit of b follow one delay later during which the logic operation is performed. This ensures that as each bit of a moves across the array it meets every bit of b, one at each of the cells which it crosses. If the bits enter every clock cycle then the latency of the array is twelve clock cycles. However, it is possible to start a new multiplication and to complete the product of a previous one on every clock cycle.

An alternative systolized bit parallel multiplier proposed by Hoekstra [6.7] is shown in Fig. 6.2. In this multiplier array the least significant bit of a and the least significant bit of b enter the circuit first. The circuit can be seen as the 'systolization' of the ripple-carry multiplier array [6.10].

Figure 6.2 shows the complete multiplier array, including the necessary delays, for the multiplication of two 4-bit positive numbers. This multiplier array uses the familiar "multiplication on paper" procedure. As is illustrated in Fig. 6.3, this array is just the "multiplication on paper" array as seen in a mirror.

The elementary cell of the lattice is the so-called innerproduct step processor which performs the following operations on the binary information:



Fig. 6.2 Bit level systolic multiplier array based on ripple-carry algorithm [6.7]. The rectangles represent the delays. The crossing-rectangles represent delays in horizontal and vertical direction.




Fig. 6.3 Mirror image of "multiplication on paper" procedure.

- (i) copy a and b
- (ii) transfer a and b (delay)
- (iii) multiply a and b to obtain product P
- (iv) add incoming sum S', product P and incoming carry C' to obtain outgoing sum S and outgoing carry C.

All input and output signals are fully synchronized, obtained by skewing. The arrows represent shifts at discrete time intervals. The rectangles represent delays which enable multiplications to be carried out in a completely pipelined and parallel way, which is the fundamental reason why very high throughputs are feasible.

The numbers to be multiplied are fed in parallel from the left side and shifted through the array. The carries are shifted in parallel with the a's. At the bottom the sums S(0), S(1) etc. are obtained. The products a(m).b(n) are calculated at the crossings of bit streams a and b. These products are obtained in the following time order:

- (i) first a(0).b(0)
- (ii) then a(0).b(1)
- (iii) next, at the same time, a(0).b(2) and a(1).b(0), and so on.

S(0) is obtained first. The latency of this array is 11 clock cycles, but new numbers can be entered every cycle.

The circuit constitutes a pipelined shift-and-add multiplier, where the carries ripple through each stage of the calculation. However, this does not induce extra delays because in each cell, as a consequence of the systolic properties, a bit of b(n) interacts with only one bit of a(n), and a new sum and carry are formed within a given clock cycle. Thus the clock speed is limited only by the propagation delay of a single cell. In this array the number of cells for the multiplication of two m-bit words is $m^2$.
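The cell operations (iii) and (iv) and the $m^2$ cell count can be checked with a small functional model. The sketch below is not cycle-accurate: the delays and the skewing of Fig. 6.2 are left out, and the names `ipsp_cell`, `array_multiply`, `to_bits` and `from_bits` are our own illustrative choices. It only assumes that row j of the array handles bit b(j), that the carry ripples along the row, and that the carry leaving a row becomes the next higher sum bit.

```python
def ipsp_cell(a, b, s_in, c_in):
    """One inner-product step: multiply a and b, then add sum and carry."""
    p = a & b                                        # (iii) product P
    s_out = s_in ^ p ^ c_in                          # (iv) outgoing sum S
    c_out = (p & s_in) | (p & c_in) | (s_in & c_in)  # (iv) outgoing carry C
    return s_out, c_out


def to_bits(value, width):
    """Bits of value, least significant bit first."""
    return [(value >> i) & 1 for i in range(width)]


def from_bits(bits):
    """Integer represented by a list of bits, least significant bit first."""
    return sum(bit << i for i, bit in enumerate(bits))


def array_multiply(a_bits, b_bits):
    """Functional (untimed) model of an m x m ripple-carry cell array.

    Both operands are assumed to have the same length m.
    Returns the 2m product bits and the number of cells evaluated.
    """
    m = len(a_bits)
    sums = [0] * (2 * m)
    cells_used = 0
    for j, b in enumerate(b_bits):          # one row of cells per bit of b
        carry = 0
        for i, a in enumerate(a_bits):      # one cell per bit of a
            sums[i + j], carry = ipsp_cell(a, b, sums[i + j], carry)
            cells_used += 1
        sums[j + m] = carry                 # carry leaving the row
    return sums, cells_used
```

For two 4-bit operands, `array_multiply(to_bits(a, 4), to_bits(b, 4))` evaluates 16 cells, in agreement with the $m^2$ count given above.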

Compared with the multiplier array described by McCanny and McWhirter, the reduction in the number of cells equals

$$\tfrac{1}{2}m(3m+1) - m^2 = \tfrac{1}{2}(m^2+m) \tag{6.5}$$

whereas the number of delays increases by only

$$\tfrac{1}{2}m(m-3) + 1 \tag{6.6}$$

This is important because delays require much less silicon area than whole cells.
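As a quick numerical illustration of (6.5) and (6.6), the sketch below evaluates both expressions for a given word length. The function name `cell_counts` is hypothetical; the formulas are taken directly from the two equations above.

```python
def cell_counts(m):
    """Evaluate equations (6.5) and (6.6) for m-bit operands.

    Returns (cells in the McCanny-McWhirter array, cells in the
    ripple-carry array, cells saved, extra delays needed).
    """
    mccanny_cells = m * (3 * m + 1) // 2     # first term of (6.5)
    ripple_cells = m * m                     # m^2 cells
    cells_saved = mccanny_cells - ripple_cells
    extra_delays = m * (m - 3) // 2 + 1      # equation (6.6)
    return mccanny_cells, ripple_cells, cells_saved, extra_delays
```

For m = 4 this gives 26 cells against 16, a saving of 10 cells at the cost of 3 extra delays.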

## 6.3 A bit level spiral systolic division array

There is a class of systolic arrays in which neighbor communication not only exists between cells within the array but also between the outer cells, as if the array is projected on a cylinder. In Fig. 6.4 an example is shown. This type of array belongs to the class of spiral systolic arrays [6.12]. The array for bit level binary division described below is such a spiral systolic array.

The method we use for binary division of two positive numbers is a 'systolization' of the principle of a nonrestoring division array [6.13,6.14]. An example illustrates the method.

Fig. 6.4 Spiral systolic array. The spiral interconnections imply global wiring.

The principle of a nonrestoring array is that, when, in any row of the array, a subtraction has caused a change of sign in the remainder, the next row is arranged to add rather than to subtract.

| Quotient bit | Operation | Operand / partial remainder | Note     |
|--------------|-----------|-----------------------------|----------|
|              | 4 / 7     | 100                         | dividend |
|              | subtract  | 111                         | divisor  |
| 0            |           | 1101                        |          |
|              | add       | 111                         |          |
| 1            |           | 0001                        |          |
|              | subtract  | 111                         |          |
| 0            |           | 1011                        |          |
|              | add       | 111                         |          |
| 0            |           | 1101                        |          |
|              | add       | 111                         |          |
| 1            |           | 0001                        |          |
|              | subtract  | 111                         |          |
| 0            |           | 1011                        |          |
| etc.         |           |                             |          |

Thus, the quotient is 0.10010...

We see that after each subtraction or addition the divisor is shifted to the right. The most significant bit of each partial remainder is its sign bit; it determines whether the next operation is to be a subtraction or an addition, and its complement is the corresponding quotient bit. If this bit is a logical one, the next row has to be added to the present row and the quotient bit is zero; if the bit is a logical zero, the next row has to be subtracted from the present row and the quotient bit is one.
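The rule can be illustrated with a short sketch that works on signed integers instead of bit rows. It is an arithmetic model only, under the assumption that shifting the divisor to the right is replaced by the equivalent doubling of the partial remainder; the function names are hypothetical and the model requires 0 <= dividend < 2 x divisor.

```python
from fractions import Fraction


def nonrestoring_divide(dividend, divisor, n_bits):
    """Return n_bits quotient bits (integer bit first) and the partial remainders."""
    bits, remainders = [], []
    r = dividend - divisor                  # the first row always subtracts
    for _ in range(n_bits):
        bits.append(1 if r >= 0 else 0)     # complement of the sign bit
        remainders.append(r)
        # non-negative remainder: subtract next; negative remainder: add next
        r = 2 * r - divisor if r >= 0 else 2 * r + divisor
    return bits, remainders


def quotient_value(bits):
    """Value of the bit string b0.b1b2... as an exact fraction."""
    return sum(Fraction(bit, 2 ** k) for k, bit in enumerate(bits))


def as_twos_complement(value, width=4):
    """Partial remainder written as a width-bit two's complement string."""
    return format(value & (2 ** width - 1), "0{}b".format(width))
```

For the example 4/7, `nonrestoring_divide(4, 7, 6)` yields the quotient bits 0, 1, 0, 0, 1, 0 and partial remainders that read 1101, 0001, 1011, 1101, 0001, 1011 in 4-bit two's complement.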

The spiral systolic array for binary division is depicted in Fig. 6.5. The mode input M of any row, which carries the information for adding or subtracting, is connected to the output of the most significant bit of the previous row. Essentially it is a linear pipelined structure, which is transformed into a two-dimensional lattice by adding delay (represented by the rectangles) in the vertical transport direction (bits a(i)) and adjusting the delay in the diagonal direction (bits b(j)). The number of unit delays is denoted in each rectangle. The elementary cell of the lattice is an add/subtract cell. It performs the following operations on the binary information:

- (i) copy b (b(j) is the jth bit of the divisor)
- (ii) transfer b (delay)
- (iii) add b to the incoming a' with a given carry C', or subtract b from the incoming a' with a given borrow B'
- (iv) use carry if M=0 or borrow if M=1
- (v) $a = a' \oplus b \oplus C'/B'$
- (vi) $C/B = b \cdot a' + b \cdot C'/B' + a' \cdot C'/B'$ if M=0, and $C/B = b \cdot \overline{a'} + b \cdot C'/B' + \overline{a'} \cdot C'/B'$ if M=1
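Operations (v) and (vi) can be written as a small Boolean function. This is a sketch under the assumption that the term garbled in the source for M=1 is the complement of a'; the function and argument names are ours.

```python
def add_subtract_cell(a_in, b, cb_in, mode):
    """Add/subtract cell: mode 0 adds with carry, mode 1 subtracts with borrow.

    a_in  : incoming bit a'
    b     : divisor bit
    cb_in : incoming carry C' (mode 0) or borrow B' (mode 1)
    Returns (a, C/B) as in operations (v) and (vi).
    """
    a_out = a_in ^ b ^ cb_in                     # (v), identical for both modes
    x = a_in if mode == 0 else 1 - a_in          # a' for carry, complement of a' for borrow
    cb_out = (b & x) | (b & cb_in) | (x & cb_in) # (vi)
    return a_out, cb_out
```

With mode 0 the cell behaves as a full adder (a' + b + C' = a + 2C); with mode 1 it behaves as a full subtractor (a' - b - B' = a - 2B).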

All input and output signals are fully synchronized, which is achieved by skewing. The arrows represent shifts at discrete time intervals. The rectangles represent the delays. The bits of the dividend are fed in parallel from the left and shifted through the array. At the same left side the quotient bits and the control signal M are obtained. The manipulations are performed in the following time order:

- (1) first $a_0' - b_0$
- (2) then $a_1' - b_1$
- (3) next $a_2' - b_2$
- (4) then $0 \pm b_0$
- (5) next $a_0 \pm b_1$, and so on.

Fig. 6.5 Bit level systolic array for non-restoring division. The function performed is $Q=N/R$, where $N=(a_0',a_1',a_2')$ and $R=(b_0,b_1,b_2)$. See also example in the text.

The examples illustrate the possibilities of using JCCL in bit level systolic arrays and the ways to build algorithms with JCCL. The following section discusses an application of JCCL in bit level systolic arrays.

## 6.4 Junction charge-coupled devices for bit level systolic arrays [6.8]

#### 0 Abstract

The application of junction charge-coupled devices (JCCDs) within the concept of bit-level systolic arrays is discussed. The extremely small basic memory cell and the low power dissipation of CCDs make them candidates for bit-level systolic arrays if suitable fast logic functions can be realized. Junction charge-coupled logic (JCCL) provides a good solution to the need for a large amount of local memory. The implicit regeneration of charge packets and the variety of logic functions are strong arguments for using junction CCDs. A JCCL inner-product step processor is described.

#### 1 Introduction

The application of junction charge-coupled devices (JCCDs) is discussed within the concept of bit-level systolic arrays. Charge-coupled device (CCD) structures are very small and have excellent pipeline-type memory functions. JCCDs form a special category: the memory devices themselves perform the logic functions. This property can be used for reducing memory device size in bit-level systolic cells.

The combination of CCDs and systolic arrays was introduced by Nash [6.15]. He considered the advantages of CCD logic, for special purpose applications, in terms of throughput per unit power to arrive at an appropriate figure of merit for VLSI. The decisive factor is that a CCD memory cell is extremely small and has a low power dissipation if used in dynamic signal processing. The most advanced result, achieved by Allen et al. [6.16], is the design of a Hadamard transformer chip.

The advantages of junction charge-coupled logic (JCCL) over conventional CCD logic are, among other things, the improvement of the maximum clock rate and the possibility of directly interconnecting two physically separate locations by means of a metal conductor.

The structures of the JCCD memories and logic parts are introduced, and advantages and disadvantages of JCCL are discussed. Figure 6.6 shows the position of JCCL with regard to digital logic circuits. The MOS-line is divided into the static NMOS and CMOS logic and the intrinsically dynamic CCD logic. JCCD can be seen as the bipolar counterpart of this CCD logic.

#### 2 Bit-level IPSP cells

For real-time digital signal processing with systolic arrays, pipelining at all levels should be pursued. At the bit level this requires a large amount of local memory for transferring information from cell to cell, and for skewing input and output signals. For certain algorithms it may even be necessary to include memory cells in between the cells themselves. The area used by latches in a bit-level systolic cell in the CMOS/SOS convolver array of GEC [6.17,6.18] is 50-70% of the total circuit area. One way to improve this figure is to use dynamic latches instead of the static latches, which were used for radiation hardness reasons. Another way is to use techniques combining logic functions with CCD memories. The dynamic behavior and serial access of CCDs comply very well with the concept of systolic arrays.

Fig. 6.7 Structures of bit-level IPSP cells in CMOS/SOS and in JCCL. *a* CMOS: latch, full adder. *b* JCCL: coefficient delay: $100 \times 50\ \mu m^2$; gate delay: $20 \times 40\ \mu m^2$; full adder: $100 \times 200\ \mu m^2$. Total cell areas are comparable.

Figure 6.7 shows the structures of bit-level inner-product step processor (IPSP) cells performing:

$$P = ab$$

$$S' = S \oplus P \oplus C$$

$$C' = PS + CP + SC$$

These functions are combined with the following operations on the binary information:

- (i) copy a and b
- (ii) transfer a and b (delay).
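The complete cell behavior, logic functions plus the copy and transfer of a and b, fits in a few lines. The sketch below is purely logical and ignores timing; the name `ipsp` is hypothetical, and the primed outputs follow the notation of the three equations above.

```python
def ipsp(a, b, s, c):
    """Bit-level inner-product step: returns (a, b, S', C').

    a and b are passed on unchanged, modelling operations (i) and (ii).
    """
    p = a & b                              # P = ab
    s_out = s ^ p ^ c                      # S' = S xor P xor C
    c_out = (p & s) | (c & p) | (s & c)    # C' = PS + CP + SC
    return a, b, s_out, c_out
```

For every input combination S + P + C equals S' + 2C', so the cell is a full adder whose third input is the product bit.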

It should be noted that the sequence of the delay and functions in the case of JCCL is arbitrary; the incoming signals are distributed over the coefficient delays and the logical functions. In the MOS cell all input signals are latched. So there are 4 latches in the IPSP cell. In JCCL there are two different types of delays because there is a 3-phase clocking scheme which will be explained later on. The coefficient delay is functionally comparable with the data latch, whereas the gate delay is just one or two extra transfers, which are very small in size.

What is the justification for research on JCCL in an inner-product step processor? First of all, the clocked and pipelined (J)CCD logic is a technology for bit-level systolic arrays. Further, JCCL is processed in a standard bipolar process, providing the possibility of integrating good bipolar circuitry for data input and output, and clock drivers. With regard to size and functionality, we can say that the first research model of the JCCL IPSP cell has the same size as the IPSP cell of the CMOS/SOS convolver [6.17], using the same line widths. A photomicrograph of this cell is shown in Fig. 6.8. The benefits of CCD logic must be based on the low power dissipation when the cells are used in a dynamic mode. The power dissipation of this cell is estimated to be 3 mW.

#### 3 Application of CCDs

The concept of a memory device, or delay, using charge stored on capacitors is the basic idea of CCDs. A CCD, in its simplest form, is an array of closely spaced MOS capacitors. If at some time a positive voltage is applied to the metal electrode, a potential well is formed in the p-type silicon, in which electrons can be stored. The CCD operation consists of a time-discrete transfer of separate quantities of electrons from one potential well to another. This transfer is directed by the application of clocked voltages to a concatenation of gates. Further information on CCDs and their applications may be found in, for example, Sequin and Tompsett [6.19], and Howes and Morgan [6.20].

Fig. 6.8 Photomicrograph of a JCCL IPSP cell.

For digital signals an empty well represents a "0", and a well filled with electrons represents a "1". In terms of digital filters a block diagram of an N-stage CCD is given in Fig. 6.9, and the system function is given by:

$$H(z) = \left[ \frac{1-\epsilon}{1-\epsilon z^{-1}} \right]^{N} z^{-N}$$

An ideal delay by one clock period is $z^{-1}$, and an ideal delay by N clock periods is $z^{-N}$. However, the CCD delay is not ideal; each time charge is transferred from the (k-1)th stage to the kth stage, a fraction $\epsilon$ of the charge is left behind (typically $\epsilon \approx 10^{-3}$).
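The effect of the charge left behind can be made visible with a small time-domain simulation. The sketch assumes the simplest possible stage model, in which at every clock each well passes on a fraction $1-\epsilon$ of its charge and keeps the fraction $\epsilon$; per stage this corresponds to the system function $(1-\epsilon)z^{-1}/(1-\epsilon z^{-1})$. The function names and this stage model are our own assumptions, not taken from the cited references.

```python
from math import comb


def ccd_impulse_response(n_stages, eps, n_samples):
    """Output of an n_stages CCD delay line for a unit charge packet at n = 0."""
    wells = [0.0] * n_stages        # charge stored under stages 1..N
    x_prev = 0.0                    # input sample of the previous clock
    response = []
    for n in range(n_samples):
        x = 1.0 if n == 0 else 0.0
        # charge offered to stage k comes from stage k-1 (or from the input)
        sources = [x_prev] + wells[:-1]
        wells = [(1.0 - eps) * src + eps * q for src, q in zip(sources, wells)]
        response.append(wells[-1])
        x_prev = x
    return response


def closed_form(n_stages, eps, n):
    """Impulse response of [(1-eps) z^-1 / (1 - eps z^-1)]^N at sample n."""
    if n < n_stages:
        return 0.0
    k = n - n_stages
    return (1.0 - eps) ** n_stages * comb(n_stages - 1 + k, k) * eps ** k
```

With an exaggerated value such as eps = 0.1 the packet arrives after N clocks with amplitude $(1-\epsilon)^N$, followed by a tail of trailing charge; for $\epsilon \rightarrow 0$ the response reduces to the ideal delay $z^{-N}$.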

#### 4 Structure of logic JCCD functions

JCCDs offer interesting advantages over MOS CCDs in the area of logic applications. Junction CCDs have reverse-biased pn-junctions as gates. Their operation is treated by Kleefstra [6.21], who also reported the first experimental results [6.22]. Junction charge-coupled logic was first described by May et al. [6.23].

The devices performing logic operations are constructed by using the JCCD property of vertical charge transport through the gates. This property, which does not have a counterpart in MOS CCDs, is exploited in two ways. First, in the case of charge injection, an $n^+$-emitter is placed in a p-gate. Electrons can be injected by forward biasing the $n^+$p junction. Charge can be injected into the CCD channel, which acts as the collector of this vertical npn transistor. Second, in the case of charge detection, the potential on the gate is raised, and surplus charge that is created by bringing two charge packets together may flow from the channel into the gate as long as the gate potential exceeds the channel potential underneath the neighboring gates. The gain of the vertical pnp transistor is used to regenerate the charge packet.

Logic circuits can be designed using charge injection and charge overflow structures. By combining different structures and the CCD property of charge transport in the channel, logical functions can be constructed in several ways.

The JCCDs are driven by a 3-phase clock. The contents of a charge packet are determined by the clock voltage and the area of the p-gate. The operation of a JCCL device will be illustrated with the function f(x,y) = x + y. The device, drawn in Fig. 6.10, consists of two injector gates A and C, an overflow gate B with the size of an injector gate, and two gates with minimum dimensions, D and E. These last two form a so-called charge circulator. The logical "1" is, in this case, represented by an overflow current. At time $t=t_1$ the voltage $\phi$ on the injector gates A and C and on gate E is raised ($\phi_1 > 0$); charge can be injected under the injector gates A and C by means of overflow currents $i_x$ and $i_y$. At this time charge is present under gate E of the charge circulator. At time $t=t_2$ ($\phi_2 > 0$) the contents of A, C and E are combined in the overflow gate B. If one or both of the wells under the injector gates are filled, overflow will occur. This overflow current represents the logical "1" because it can be converted into a single charge packet; thus f(0,1) = 1, f(1,0) = 1 and f(1,1) = 1. If there is no injection ($i_x=i_y=0$), overflow will not occur, f(0,0) = 0, and only one minimum charge packet will circulate. At time $t=t_3$ this packet is transferred to gate D, where it is normalized to a minimum packet.

Fig. 6.10 JCCL device performing f(x,y) = x + y. *b* $t = t_1$, $\phi_1 > 0$; *c* $t = t_2$, $\phi_2 > 0$; *d* $t = t_3$, $\phi_3 > 0$.

Owing to space limitations a systematic approach to JCCL functions cannot be given here; only a very brief overview is given. The possibilities to construct logic functions with JCCDs are:

- (a) Logic functions using only transfers in the charge domain. In this way simple Boolean functions and, say, the carry function can be obtained.
- (b) Logic functions using transfers in the charge and current domains. We distinguish the following classes:
  - (i) logic functions using exclusively the balanced injector structures. An example is the exclusive-OR, which will be discussed in the next section.
  - (ii) threshold logic
  - (iii) other mixtures of charge transport and balanced injector structures.
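The charge bookkeeping of the f(x,y) = x + y device described above can be mimicked with a toy model. The charge units below are purely hypothetical (an injector packet of 4 units and a minimum packet of 1 unit); the only assumptions taken from the description are that the overflow gate B can hold one injector packet, that the minimum packet from gate E is always added, and that any surplus leaves as overflow current.

```python
INJECTOR_PACKET = 4                  # hypothetical charge units of a filled injector well
MINIMUM_PACKET = 1                   # hypothetical charge units of the circulating packet
OVERFLOW_CAPACITY = INJECTOR_PACKET  # gate B has the size of an injector gate


def jccl_or(x, y):
    """Toy charge model of the OR device: returns (logic output, circulating charge)."""
    charge_a = INJECTOR_PACKET if x else 0   # t1: injection under gate A
    charge_c = INJECTOR_PACKET if y else 0   # t1: injection under gate C
    # t2: contents of A, C and E are combined under the overflow gate B
    charge_b = charge_a + charge_c + MINIMUM_PACKET
    overflow = max(0, charge_b - OVERFLOW_CAPACITY)
    retained = charge_b - overflow
    # t3: the retained charge moves to gate D and is normalized to a minimum packet
    circulating = min(retained, MINIMUM_PACKET)
    return (1 if overflow > 0 else 0), circulating
```

In this model overflow occurs for every input combination except x = y = 0, while the circulator always ends up with exactly one minimum packet.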

In general, functions are developed and verified with the help of a logic simulation computer program. A memory structure can now be interpreted as the logic function f(x)=x, or just as the transfer of a desired result until all logic functions in the cell are completed. So the size of the coefficient delays depends on the number of transfers necessary for completing the cell's logic. The gate delays are used to obtain the results simultaneously, and are, in fact, part of the logic function.

#### 5 Description of a JCCL IPSP

Recently, van der Klauw [6.24] described a JCCL full adder (FA) circuit. This FA could be used to construct an IPSP cell. That article states that the FA operations are calculated within one clock cycle. This is true. However, if used in an IPSP cell, the FA operations must be accompanied by an AND function, and the result must be available in such a form that it can be used as an input in the next cell. In general this will result in an operation time of two clock cycles for this FA and, consequently, the size of the coefficient delays will be doubled. The total area of a cell based on this FA will be about 0.13 $\rm mm^2$, which is twice the size of the IPSP cell of the CMOS/SOS correlator.

For applications in bit-level systolic arrays it is preferable to use logic structures which complete the logic function in one shift after the creation of charge packets underneath injector gates. Figure 6.11 shows a number of possibilities. By combining the carry function with exclusive-OR functions it is possible to obtain a cell in which the total function is obtained within one clock cycle. This cell has a total area of about 0.07 $\rm mm^2$.

A last remark, before we describe the cell, is that it is possible to exchange the charge circulator for a drain. In this case we only have to use a smaller overflow gate and one extra transfer gate; the power dissipation, of such a structure, will be less, but a direct voltage is necessary. At present we do not know which configuration will be most suitable; research on this point is still going on.

An inner-product step processor can be constructed with JCCL coefficient delays, an AND, two exclusive-ORs, and a carry function. It is shown schematically in Fig. 6.12 and will be discussed briefly. The exclusive-OR device is easily obtained by combining two balanced injector structures, an overflow gate and a charge circulator. If only one current is offered, just one injector is opened. If both currents are offered, both injectors are turned off, and no charge is supplied to the overflow gate. The AND and carry functions were shown in Fig. 6.11, and can be verified easily. The carry function in Fig. 6.12 is already combined with the required gate delays. The coefficient delays are drawn. The delay of the coefficient b is doubled, due to the fact that this cell can be used for bit-level multiplication based on the shift-and-add algorithm [6.7].

#### 6 Conclusions

JCCL has some useful properties such as operation at frequencies up to 50 MHz (according to computer simulations), combination with bipolar circuitry (integrated clock drivers, input, output) and small memory structures.

Fig. 6.11 JCCL basic devices. *a* f(x) = x; *b* f(x) = 0; *c* f(x,y) = x + y (OR); *d* f(x,y) = xy (AND); *e* f(x,y) = 0; *f* f(x,y,z) = x + y + z; *g* f(x,y,z) = xy + yz + zx (carry); *h* f(x,y,z) = xyz.

Fig. 6.12 Logic and delays of a JCCL IPSP.

In general, in any synchronous circuit, clock skew will limit the maximum rate of operation. To obtain operation at high frequencies, we should use the benefits of the following facts:

- (a) only aluminum interconnections are used,
- (b) operation at 250 MHz has been achieved with JCCD delay lines,
- (c) simple logic functions have been operated at clock frequencies over 30 MHz.

Advantages over MOS CCDs result from the vertical charge transport. In particular, the implicit regeneration of charge packets (after each charge injector structure) and the variety of logic functions are strong arguments for using JCCDs. There are also some disadvantages, such as higher sensitivity to process variations, and larger size of the JCCD basic delay cell. For use in bit-level systolic arrays the arguments for CCDs are the expected low power dissipation and the small area used by memory and delay. In the case of JCCL this area is estimated to be 20-25% of the total cell area.

# APPENDIX A

On the action formulation in semiconductor physics and modeling

#### Introduction

The physics of electromagnetic phenomena in semiconductors can be described by two basically different sets of equations. First there are the familiar Maxwell equations. When combined with the equations for charge transport, they provide the familiar description of carrier dynamics in semiconductor structures. Another method is to describe the physics in terms of the quantity called the action S. The position that action occupies in physics stems from a fundamental law of physics: the principle of least action, whose classical formulation states that in real processes observed in nature action is extremal (its variation vanishes). This variational principle was introduced into physics by Fermat (1662) and today plays a profound role in ordinary mechanics, relativistic mechanics and in (quantum) field theory.

In standard textbooks on semiconductor (device) physics the basic equations for semiconductor device operation are the Maxwell equations, current-density equations, and continuity equations. In the case of the pn-junction diode the physical quantities, such as diffusion potential, electric field, and depletion layer width in thermal equilibrium, are derived from Poisson's equation. Subsequently, (one-dimensional) analytic device modeling is based on solutions of the one-dimensional field equations and, if there, most numerical device modeling is based on finite-difference methods.

The action S can be derived from Maxwell's equations, and Maxwell's equations can be derived from the principle of least action. The two sets of equations are equally "fundamental". However, if device modeling is done analytically and in one dimension, the physical quantities can also be easily obtained by the principle of extremal action. In the case of numerical device modeling, the different sets of equations give a different view on the numerical methods and the physics behind them. If the physics is described by the principle of least action, numerical methods like the finite element method follow as the natural proceeding.

In standard texts on the modeling of semiconductor devices the finite element method, which originates from fluid dynamics, is only approached from the mathematical point of view, and the finite element equations are obtained from Poisson's equation. At this point two remarks can be made. First, the finite element equations can be obtained for all variationally conceived problems [A.1], and second, purely mathematically, the finite element equations can also be obtained for a range of problems for which the basic variational equation may not exist [A.2].

If electrostatic phenomena are described in terms of the action S, the physics is clear and the finite element equation is a natural result.

#### Principle of least action [A.3, A.5]

In order not to complicate the description we consider the simplest and best known system: the one-dimensional classical description of a particle [A.3]. For a nonrelativistic particle in a static potential, the action from an instant of time  $t_1$  to another instant  $t_2$  is:

$$S = \int_{t_1}^{t_2} L \, dt \tag{A.1}$$

where L is the so-called Lagrange function:

$$L = T_{kin} - U$$
 (A.2)

where $T_{kin}$ is the kinetic energy and U the potential energy.

The principle of least action asserts that the integral S must be a minimum for infinitesimal lengths of the path of integration. For a path of arbitrary length we can say only that S must be an extremum, not necessarily a minimum.

Let q=q(t) be the function for which S is a minimum. This means that S is increased when q(t) is replaced by any function of the form:

$$q(t) + \delta q(t) \tag{A.3}$$

where $\delta q(t)$ is a function which is small everywhere in the interval of time from $t_1$ to $t_2$, see Fig. A.1; $\delta q(t)$ is called a variation of the function q(t). Since for $t=t_1$ and for $t=t_2$ all functions must take the values $q(t_1)$ and $q(t_2)$ respectively, it follows that:

$$\delta q(t_1) = \delta q(t_2) = 0 \tag{A.4}$$

Fig. A.1 A varied path

The change in S when q is replaced by $q+\delta q$ is:

$$\int_{t_1}^{t_2} L(q+\delta q, \dot{q}+\delta \dot{q}, t)\, dt - \int_{t_1}^{t_2} L(q, \dot{q}, t)\, dt \tag{A.5}$$

When this difference is expanded in powers of $\delta q$ and $\delta \dot{q}$ in the integrand, the leading terms are of the first order. The necessary condition for S to have a minimum is that these terms are zero.

Specifically let:

$$S = \int_{t_1}^{t_2} L(q,\dot{q},t)\, dt \tag{A.6}$$

where q is the position and the time derivative $\dot{q}$ is the velocity. Thus the principle of least action may be written in the form:

$$\delta S = \delta \int_{t_1}^{t_2} L(q,\dot{q},t)\, dt = 0 \tag{A.7}$$

or effecting the variation

$$\int_{t_1}^{t_2} \left(\frac{\partial L}{\partial q} \,\delta q + \frac{\partial L}{\partial \dot{q}} \,\delta \dot{q}\right) \,dt = 0 \tag{A.8}$$

since $\delta \dot{q} = d\delta q/dt$, we obtain, on integrating the second term by parts:

$$\delta S = \left[ \frac{\partial L}{\partial \dot{q}}\, \delta q \right]_{t_1}^{t_2} + \int_{t_1}^{t_2} \left( \frac{\partial L}{\partial q} - \frac{d}{dt} \frac{\partial L}{\partial \dot{q}} \right) \delta q \, dt \tag{A.9}$$

The conditions (A.4) show that the integrated term in (A.9) is zero. There remains an integral which must vanish for all values of $\delta q$. This can be so only if the integrand is identically zero. Thus we have:

$$\frac{d}{dt}\left(\frac{\partial L}{\partial \dot{q}}\right) - \frac{\partial L}{\partial q} = 0 \tag{A.10}$$

This equation is called the Euler equation and is a necessary (but not sufficient) condition for the existence of a stationary value. If the Lagrange function (A.2) is substituted in this equation, it gives Newton's equation of motion. And, as we will see further on, if the formalism is extended to three dimensions and we substitute the Lagrangian for electrostatics, the Euler equation becomes Poisson's equation.
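The step to Newton's equation can be written out explicitly. As an illustration we assume a particle of mass m (a symbol not used above) with kinetic energy $T_{kin} = \frac{1}{2} m \dot{q}^2$ in a static potential U(q):

$$L = \tfrac{1}{2} m \dot{q}^2 - U(q), \qquad \frac{\partial L}{\partial \dot{q}} = m \dot{q}, \qquad \frac{\partial L}{\partial q} = -\frac{dU}{dq}$$

Substituting these derivatives in the Euler equation gives

$$\frac{d}{dt}\left(m \dot{q}\right) + \frac{dU}{dq} = 0 \quad \Longrightarrow \quad m \ddot{q} = -\frac{dU}{dq}$$

which is Newton's equation of motion with the force $-dU/dq$.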

For the actual path $\delta S$ is zero, thus S is an extremum. In most cases of physical interest the stationary value will be a minimum. The problem of determining the stationary value of S is considerably more difficult than the corresponding problem in differential calculus. Indeed, there may be no solution. In differential calculus the minimum is determined by comparing $q(t_0)$ with q(t), where t ranges over neighboring points. Here we assume the existence of an optimum path for which S is stationary, and then compare S for our (unknown) optimum path with that obtained from neighboring paths. A valid question at this point is: if we have a (not optimal) form of the action integral, how do we get the best (approximate) path? To answer the question we describe the path with:

$$q(t,\beta) = q(t,0) + \beta \eta(t)$$
(A.11)

thus

$$\delta q = \beta \eta(t) \tag{A.12}$$

We choose $q(t,\beta=0)$ as the unknown path that will minimize S. Then $q(t,\beta)$ describes a neighboring path. The function $\eta(t)$ is arbitrary except for two restrictions. First, all varied paths must pass through the fixed end points:

 $\eta(t_1) = \eta(t_2) = 0$  (A.13)

Second,  $\eta(t)$  must be differentiable. S is now a function (technically, S is a functional) of our new parameter  $\beta$ . If we do not have the optimum path the value of S will be too high. And the best approximation is to pick the  $\beta$  that gives the minimum value for S.
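The procedure of picking the best $\beta$ can be tried on a simple mechanical example. The sketch below is an illustration under our own assumptions: a particle of mass m in a uniform field with $U = mgq$, fixed end points q(0) = q(T) = 0, and the one-parameter trial family $q(t) = \beta\, t\,(T-t)$, so that $\eta(t) = t\,(T-t)$ vanishes at both end points. The action is integrated numerically and the $\beta$ with the lowest action is selected from a grid.

```python
def action(beta, g=9.81, mass=1.0, total_time=1.0, steps=400):
    """Midpoint-rule action S for the trial path q(t) = beta * t * (T - t)."""
    dt = total_time / steps
    s = 0.0
    for k in range(steps):
        t = (k + 0.5) * dt
        q = beta * t * (total_time - t)          # trial path
        v = beta * (total_time - 2.0 * t)        # its time derivative
        lagrangian = 0.5 * mass * v * v - mass * g * q   # L = T_kin - U
        s += lagrangian * dt
    return s


def best_beta(g=9.81, low=0.0, high=10.0, points=201):
    """Grid search for the parameter beta that minimizes the action."""
    betas = [low + (high - low) * k / (points - 1) for k in range(points)]
    return min(betas, key=lambda beta: action(beta, g=g))
```

For this particular trial family the exact path $q = \frac{1}{2} g\, t\,(T-t)$ is contained in the family, so the search returns a value close to g/2; any other $\beta$ gives a higher action.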

#### Formulation of S for electrostatics [A.4, A.5]

The concept of the electromagnetic field was formed by Faraday, Maxwell and others. The field is a system with an infinite number of degrees of freedom, varying in space and time. Fields carry energy and momentum. Relativistic invariance demands that the potentials of various fields be transformed in a specific manner under a fourdimensional vector. For the case of a field we have that the action integral is expressed as:

$$S = \int L \, dx dy dz dt \tag{A.14}$$

where L is again a Lagrangian adapted to the situation. The integral is taken over the whole of space-time. The action function S for the whole system, consisting of an electromagnetic field as well as the particles located in it, must consist of three parts:

 $S = S_{f} + S_{m} + S_{mf}$ (A.15)

where $S_m$ is that part of the action which depends only on the properties of the particles, that is, just the action for free particles. The quantity $S_{mf}$ is that part of the action which depends on the interaction between the particles and the field. Finally, $S_f$ is that part of the action that depends only on the properties of the field itself, that is, $S_f$ is the action for a field in the absence of charged particles.

The total expression of the action for the electromagnetic field will not be given here. It consists of an integral over the sum of actions for each of the individual particles, the potential of the field at that point of space-time at which the corresponding particle is located, and some function of the electromagnetic field tensor. This expression can be found in literature [A.4]. The action for electrostatics is [A.5]:

$$S = \frac{\epsilon}{2} \int (\nabla \phi)^2 \, dV - \int \rho \phi \, dV \qquad (A.16)$$

which is a volume integral over the space enclosed by the boundary conditions on E and $\phi$. S is an extremum for the correct potential distribution $\phi(x,y,z)$. Consider S for the correct $\phi$ plus a small deviation $\delta\phi$. The $\phi$ is what we are looking for, but we are making a variation of it to find what it has to be so that the variation of S is zero to first order.

$$S_{\phi+\delta\phi} = \frac{\epsilon}{2} \int \left[ (\nabla \phi)^2 + 2 \nabla \phi \cdot \nabla \delta \phi + (\nabla \delta \phi)^2 \right] dV - \int (\rho \phi + \rho \delta \phi)\, dV \tag{A.17}$$

disregarding second order terms we find

$$\delta S = \int (\epsilon \nabla \phi \cdot \nabla \delta \phi - \rho \delta \phi) dV \qquad (A.18)$$

We use the following equality:

$$\nabla \cdot (a\nabla b) = \nabla a \cdot \nabla b + a\nabla^2 b \tag{A.19}$$

and find that

$$\delta S = \int \left\{ \epsilon \left[ -\delta \phi \nabla^2 \phi + \nabla \cdot (\delta \phi \nabla \phi) \right] - \rho \delta \phi \right\} dV \qquad (A.20)$$

By Gauss's theorem, the divergence term integrated over the volume can be replaced by a surface integral:

$$\int_{\text{vol}} \nabla \cdot (\delta \phi \nabla \phi) \, dV = \int_{\text{surf}} \delta \phi \nabla \phi \cdot n \, da \qquad (A.21)$$

If the potential is prescribed on the boundary surface, $\delta \phi$ is zero there and the surface integral vanishes. The remaining volume integral is:

$$\delta S = \int (-\epsilon \nabla^2 \phi - \rho) \delta \phi dV \qquad (A.22)$$

Since $\delta \phi$ is arbitrary, $\delta S = 0$ requires the integrand to vanish, so that for the correct $\phi$ we obtain:

$$\nabla^2 \phi = -\rho/\epsilon \tag{A.23}$$

which is Poisson's equation. Equation (A.18) is the basic equation for obtaining the finite element equations [A.6].
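How (A.18) leads to finite element equations can be shown in one dimension. The sketch below is a minimal illustration under our own assumptions: a constant charge density $\rho$ on the interval [0, length], $\phi = 0$ at both ends, a uniform mesh and piecewise-linear trial functions. Setting $\delta S = 0$ in (A.18) for each nodal variation then gives a tridiagonal system, which is solved here with the standard Thomas algorithm. The function names are hypothetical.

```python
def solve_poisson_fem(n_elements, length=1.0, eps=1.0, rho=1.0):
    """Nodal potentials for -eps * phi'' = rho with phi(0) = phi(length) = 0.

    Requires n_elements >= 2. Returns the potentials at all nodes,
    including the two boundary nodes.
    """
    h = length / n_elements
    n = n_elements - 1                    # number of interior nodes
    diag = [2.0 * eps / h] * n            # from the term eps * grad(phi) . grad(delta phi)
    off = -eps / h
    rhs = [rho * h] * n                   # from the term rho * delta phi
    # Thomas algorithm: forward elimination
    for i in range(1, n):
        w = off / diag[i - 1]
        diag[i] -= w * off
        rhs[i] -= w * rhs[i - 1]
    # back substitution
    phi = [0.0] * n
    phi[-1] = rhs[-1] / diag[-1]
    for i in range(n - 2, -1, -1):
        phi[i] = (rhs[i] - off * phi[i + 1]) / diag[i]
    return [0.0] + phi + [0.0]


def exact_potential(x, length=1.0, eps=1.0, rho=1.0):
    """Analytic solution of Poisson's equation for the same problem."""
    return rho * x * (length - x) / (2.0 * eps)
```

For this simple problem the nodal values coincide with the analytic solution of (A.23), which can be checked by comparing `solve_poisson_fem(8)` with `exact_potential` at the nodes.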

#### Applications: the simple one-dimensional pn-junction

The potential is described with a parameter  $\alpha$ ,  $\phi = \phi(\mathbf{x}, \alpha)$ . The action now is a function of the new parameter  $\alpha$  and the condition for an extremum is:

$$\frac{\partial S}{\partial \alpha} = 0 \tag{A.24}$$

This equation gives the value of  $\alpha$  for the best approximation with the given potential function.

The abrupt junction, the linearly graded junction, and an approximation of the linearly graded junction are discussed. Probably the most characteristic feature of the simple pn-junction (homogeneous charge distribution) is that in equilibrium the electric field equals zero outside the depletion region. This criterion can be used to construct a good choice for the potential distribution. Strictly, to guess the potential function we have to know the exact values of the potential at the end points (compare equation (A.4)). For the two junctions we know one end point: the potential at x=0 is zero. The potential at the other end point is unknown, but we do know that the electric field there is zero, and this boundary condition can be used instead. In this case the Lagrangian is $L = \frac{\epsilon}{2} E^2 - \rho \phi$. Therefore we consider the integrated (boundary) term in equation (A.9):

$$\left[\frac{\partial L}{\partial E} \, \delta \phi\right]_{x_1}^{x_2} \qquad (A.25)$$

This term is zero if  $\delta\phi(x_1) = \delta\phi(x_2) = 0$ , but also if

$$\delta\phi(x_1) = \left.\frac{\partial L}{\partial E}\right|_{x=x_2} = 0 \tag{A.26}$$

In our case:

$$\frac{\partial \mathbf{L}}{\partial \mathbf{E}} = \epsilon \mathbf{E} \tag{A.27}$$

thus the boundary condition  $E(x_2) = 0$  can be used.

## The abrupt junction

We consider the abrupt junction, see Fig. A.2.

$$\rho = qN_{D} \qquad -d_{n} \le x \le 0 \qquad (A.28)$$

$$\rho = -qN_{A} \qquad 0 \le x \le d_{p}$$


Because the space-charge distribution is constant, a linear function for the E-field seems reasonable. Using the boundary conditions for $-d_n \le x \le 0$, we guess:

$$E = -\nabla \phi = \alpha (d_n + x) \tag{A.29}$$

$$\phi = -\alpha \, d_n x - \frac{1}{2} \, \alpha \, x^2 + C \qquad (A.30)$$

Now:

$$S = \frac{\epsilon \epsilon_0}{2} \int_{-d_n}^{0} (\nabla \phi)^2 \, dx - qN_D \int_{-d_n}^{0} \phi \, dx \qquad (A.31)$$

$$=\frac{\epsilon \epsilon_0 d_n^3}{6} \alpha^2 - \frac{q N_D d_n^3}{3} \alpha - q N_D d_n C \qquad (A.32)$$

Fig. A.2 Abrupt junction in thermal equilibrium: (a) space-charge distribution, (b) electric field distribution, (c) potential variation.

Fig. A.3 Linearly graded junction in thermal equilibrium: (a) space-charge distribution, (b) electric field distribution.

Then we search for the $\alpha$ that gives the minimum value of the integral:

$$\frac{\partial S}{\partial \alpha} = 0 \rightarrow \alpha = \frac{qN_D}{\epsilon \epsilon_0} \qquad (A.33)$$

And we find:

$$E = \frac{qN_D}{\epsilon \epsilon_0} (d_n + x) \qquad (A.34)$$

$$\phi = \frac{-qN_D}{\epsilon \epsilon_0} \left( d_n x + \frac{1}{2} x^2 \right) + C \qquad (A.35)$$

which are the familiar expressions for E and $\phi$. The formulas for $0 \le x \le d_p$ can be found in a similar way.
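The abrupt-junction result can be reproduced with exact arithmetic. The sketch below is illustrative only: the helper names (`integrate`, `multiply`, `action`, `stationary_alpha`) and the numerical values are assumptions, `eps` stands for $\epsilon\epsilon_0$, and C is set to zero because $\phi(0) = 0$. Since S is quadratic in $\alpha$, three samples of S fix its vertex exactly.

```python
from fractions import Fraction

def integrate(poly, lo, hi):
    """Exact integral of sum(poly[k] * x**k) from lo to hi."""
    return sum(c * (hi ** (k + 1) - lo ** (k + 1)) / (k + 1)
               for k, c in enumerate(poly))

def multiply(a, b):
    """Product of two polynomials given as coefficient lists."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def action(alpha, d_n, q_nd, eps):
    """S for the trial field E = alpha (d_n + x) on -d_n <= x <= 0, C = 0."""
    field = [alpha * d_n, alpha]                    # E(x)
    phi = [Fraction(0), -alpha * d_n, -alpha / 2]   # phi(x) with phi(0) = 0
    return (eps / 2 * integrate(multiply(field, field), -d_n, Fraction(0))
            - q_nd * integrate(phi, -d_n, Fraction(0)))

def stationary_alpha(S):
    """Vertex of a quadratic S(alpha) = a alpha^2 + b alpha + c from 3 samples."""
    s0, sp, sm = S(Fraction(0)), S(Fraction(1)), S(Fraction(-1))
    a = (sp + sm) / 2 - s0
    b = (sp - sm) / 2
    return -b / (2 * a)

# Assumed, arbitrary values in consistent units.
d_n, q_nd, eps = Fraction(3), Fraction(2), Fraction(5)
alpha_opt = stationary_alpha(lambda a: action(a, d_n, q_nd, eps))
print(alpha_opt == q_nd / eps)
```

The comparison in the last line tests the stationary value against $qN_D/\epsilon\epsilon_0$ for the chosen numbers; other values of `d_n`, `q_nd` and `eps` can be substituted to see that the depletion width drops out of $\alpha$.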

## The linearly graded junction

We consider only $-W/2 \le x \le 0$ (see also Fig. A.3):

$$\rho = -qgx \qquad (A.36)$$

For the electrical field equation we guess:

$$E = \alpha \left( \left( \frac{W}{2} \right)^2 - x^2 \right) \qquad (A.37)$$

Proceeding in a similar way as in the previous case:

$$\phi = -\alpha \left(\frac{W}{2}\right)^2 x + \frac{1}{3} \alpha x^3 + C \qquad (A.38)$$

$$S = \frac{\epsilon \epsilon_0}{2} \int_{-W/2}^{0} (\nabla \phi)^2 dx + qg \int_{-W/2}^{0} \phi x \, dx \qquad (A.39)$$

$$=\frac{4}{15}\epsilon\epsilon_0\left(\frac{W}{2}\right)^5\alpha^2-\frac{4}{15}qg\left(\frac{W}{2}\right)^5\alpha-\frac{1}{2}qgC\left(\frac{W}{2}\right)^2 \qquad (A.40)$$

Taking:

$$\frac{\partial S}{\partial \alpha} = 0 \quad \rightarrow \quad \alpha = \frac{qg}{2\epsilon\epsilon_0} \tag{A.41}$$

we obtain:

$$E = \frac{qg}{\epsilon \epsilon_0} \left( \frac{(W/2)^2 - x^2}{2} \right) \qquad (A.42)$$

which is also a familiar result.
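That the result is exact, and not merely the best fit, can be seen by checking Gauss's law $dE/dx = \rho/\epsilon\epsilon_0$ for the optimized trial field. The sketch below is an illustration under assumptions: the helper names and numerical values are hypothetical, `eps` stands for $\epsilon\epsilon_0$, and $\phi(0) = 0$ is used so that $\phi = -\alpha F(x)$ with F the antiderivative of the field shape. With this form, $S = \frac{\epsilon\epsilon_0}{2}\alpha^2\int f^2 dx + \alpha\int\rho F dx$, and the stationary $\alpha$ follows in closed form.

```python
from fractions import Fraction

def integrate(poly, lo, hi):
    """Exact integral of sum(poly[k] * x**k) from lo to hi."""
    return sum(c * (hi ** (k + 1) - lo ** (k + 1)) / (k + 1)
               for k, c in enumerate(poly))

def multiply(a, b):
    """Product of two polynomials given as coefficient lists."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def antiderivative(poly):
    """Coefficients of the integral of poly from 0 to x."""
    return [Fraction(0)] + [c / (k + 1) for k, c in enumerate(poly)]

def derivative(poly):
    """Coefficients of d(poly)/dx."""
    return [k * c for k, c in enumerate(poly)][1:]

def ritz_alpha(shape, rho, lo, hi, eps):
    """Stationary alpha for E = alpha*shape(x): -int(rho F) / (eps int(shape^2))."""
    F = antiderivative(shape)
    return (-integrate(multiply(rho, F), lo, hi)
            / (eps * integrate(multiply(shape, shape), lo, hi)))

# Assumed, arbitrary values: W/2 = 2, qg = 3, eps = 7.
half_w, qg, eps = Fraction(2), Fraction(3), Fraction(7)
shape = [half_w ** 2, Fraction(0), Fraction(-1)]   # (W/2)^2 - x^2
rho = [Fraction(0), -qg]                           # rho = -q g x

alpha = ritz_alpha(shape, rho, -half_w, Fraction(0), eps)
field = [alpha * c for c in shape]
# Gauss's law residual dE/dx - rho/eps, coefficient by coefficient.
residual = [dE - r / eps for dE, r in zip(derivative(field), rho)]
```

A residual that is identically zero means the trial family contains the true field, which is why the variational procedure returns the familiar expression here.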

#### Approximation

The examples above show that the correct expressions for the electric field and the electric potential are obtained if the correct structure of the formulas is known. If the correct structure of the electric field is not known, the minimum principle gives the best possible approximation for a given structure of the function. This is the basis of the Galerkin method [A.1]. It is illustrated by again considering the linearly graded junction; however, instead of a quadratic function, a linear function is guessed for the electric field.

We consider only  $-d_k \le x \le 0$  :

$$\rho = -qgx \tag{A.43}$$

We guess:

$$E = -\nabla \phi = \alpha (d_k + x) \qquad (A.44)$$

$$\phi = -\alpha \, d_k x - \frac{1}{2} \, \alpha \, x^2 + C \qquad (A.45)$$

Now applying the action formalism:

$$S = \frac{\epsilon \epsilon_0}{2} \int_{-d_k}^{0} (\nabla \phi)^2 dx + qg \int_{-d_k}^{0} \phi x \, dx \qquad (A.46)$$

Taking:

$$\frac{\partial S}{\partial \alpha} = 0 \rightarrow \alpha = \frac{5}{8} \frac{qg}{\epsilon \epsilon_0} d_k \qquad (A.48)$$

The results are illustrated in Fig. A.4.



Fig. A.4 The potential distribution for the linearly graded junction  $\phi$ , and the distribution  $\phi'$  that approximates the potential distribution  $\phi$ .
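The quality of the linear approximation can be quantified by comparing it with the quadratic trial field on the same interval. The sketch below is illustrative: it assumes $d_k = W/2$ (a choice not made in the text), normalized units $qg = \epsilon\epsilon_0 = d_k = 1$, $\phi(0) = 0$, and hypothetical helper names. It reproduces the factor 5/8 of (A.48), shows that the quadratic trial reaches a lower action, and compares the potential drops across the region.

```python
from fractions import Fraction

def integrate(poly, lo, hi):
    """Exact integral of sum(poly[k] * x**k) from lo to hi."""
    return sum(c * (hi ** (k + 1) - lo ** (k + 1)) / (k + 1)
               for k, c in enumerate(poly))

def multiply(a, b):
    """Product of two polynomials given as coefficient lists."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def antiderivative(poly):
    """Coefficients of the integral of poly from 0 to x."""
    return [Fraction(0)] + [c / (k + 1) for k, c in enumerate(poly)]

def ritz(shape, rho, lo, hi, eps):
    """Return (alpha, S_min, potential drop) for E = alpha*shape(x)."""
    F = antiderivative(shape)
    a = eps / 2 * integrate(multiply(shape, shape), lo, hi)  # alpha^2 term
    b = integrate(multiply(rho, F), lo, hi)                  # alpha term
    alpha = -b / (2 * a)
    drop = alpha * integrate(shape, lo, hi)                  # integral of E
    return alpha, a * alpha ** 2 + b * alpha, drop

# Normalized units (assumption): qg = eps = d_k = W/2 = 1.
d, qg, eps = Fraction(1), Fraction(1), Fraction(1)
rho = [Fraction(0), -qg]                                     # rho = -q g x
lo, hi = -d, Fraction(0)

alpha_lin, s_lin, drop_lin = ritz([d, Fraction(1)], rho, lo, hi, eps)
alpha_quad, s_quad, drop_quad = ritz([d ** 2, Fraction(0), Fraction(-1)],
                                     rho, lo, hi, eps)
drop_ratio = drop_lin / drop_quad
```

Under these assumptions the linear trial field cannot reach the action of the quadratic one, in line with the minimum principle, while its potential drop stays close to the exact value.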


#### References:

## Chapter 1.

- [1.1] E P May, C L M van der Klauw, M Kleefstra, and E A Wolsheimer, 'Junction Charge-Coupled Logic (JCCL)', IEEE Journal of Solid-State Circuits, vol SC-18, no 6, pp 767-772, Dec 1983
- [1.2] C L M van der Klauw, 'Vertical Charge Transport In Junction Charge Coupled Devices', Ph D Dissertation, Delft University of Technology, Oct 1987
- [1.3] J G Nash, 'Combinatorial Digital Logic Using Charge-Coupled Devices', IEEE Journal of Solid-State Circuits, vol SC-17, no 5, pp 957-963, Oct 1982

## Chapter 2.

- [2.1] F J L Sangster, US Patent 3546490, 1966
- [2.2] W E Engeler, J J Tiemann, and R D Baertsch, 'Surface Charge Transport In Silicon', Applied Physics Letters, vol 17, pp 469-472, 1970
- [2.3] W S Boyle and G E Smith, 'Charge Coupled Semiconductor Devices', Bell System Technical Journal, pp 587-593, April 1970
- [2.4] W S Boyle and G E Smith, 'The Inception Of Charge-Coupled Devices', IEEE Trans on Electron Devices, vol ED-23, no 7, pp 661-663, July 1976
- [2.5] E I Gordon and M H Crowell, 'A Charge Storage Target For Electron Image Sensing', Bell System Technical Journal, pp 1855-1873, Nov 1968
- [2.6] C H S Sequin and M F Tompsett, 'Charge Transfer Devices', in <u>Advances in Electronics and Electron</u> <u>Physics</u>, Supplement B (New York: Academic Press), 1975
- [2.7] J D E Beynon and D R Lamb, <u>Charge-Coupled Devices</u> and their <u>Applications</u> (London: McGraw-Hill), 1980
- [2.8] M J Howes and D V Morgan, <u>Charge-Coupled Devices and</u> <u>Systems</u> (Chichester: Wiley), 1979
- [2.9] L J M Esser and F L J Sangster, 'Charge Transfer Devices', in <u>Handbook on Semiconductors</u>, vol 4 Device Physics, ed C Hilsum (Amsterdam: North-Holland), 1981
- [2.10] C J M Esser, 'Peristaltic Charge-Coupled Device: A New Type Of Charge-Transfer Device', Electronics Letters, vol 8, pp 620-621, 1972

- [2.11] [...] Journal, vol 51, pp 1635-1640, 1972

- [2.12] F L Schuermeyer, R A Belt, C R Young, and J M Blasingame, 'New Structures For Charge-Coupled Devices', Proc IEEE, vol 60, pp 1444-1445, Nov 1972
- [2.13] W F Kosonocky and J E Carnes, 'Charge-Coupled Digital Circuits', IEEE Journal of Solid-State Circuits, vol SC-6, no 5, pp 314-322, Oct 1971
- [2.14] M F Tompsett, ' A Simple Charge Regenerator For Use With Charge-Transfer Devices And The Design Of Functional Logic Circuits', IEEE Journal of Solid-State Circuits, vol SC-7, no 3, pp 237-242, June 1972
- [2.15] T D Mok, C A T Salama, 'Logic Array Using Charge-Transfer Devices', Electronics Letters, vol 8, no 20, pp 495-496, Oct 1972
- [2.16] T A Zimmerman, R A Allen, R W Jacobs, 'Digital Charge-Coupled Logic (DCCL)', IEEE Journal of Solid-State Circuits, vol SC-12, no 5, pp 473-485, Oct 1977
- [2.17] J H Montgomery and H S Gamble, 'Basic CCD Logic Gates', The Radio and Electronic Engineer, vol 50, no 5, pp 258-268, May 1980
- [2.18] H G Kerkhoff and M L Tervoert, 'Multiple-Valued Logic Charge-Coupled Devices', IEEE Trans on Computers, vol C-30, no 9, pp 644-652, Sept 1981
- [2.19] E A Wolsheimer, 'Physics, Technology, And Applications Of Junction Charge-Coupled Devices', Ph D Dissertation, Delft University of Technology, June 1982
- [2.20] E P May, C L M van der Klauw, M Kleefstra, and E A Wolsheimer, 'Junction Charge-Coupled Logic (JCCL)', IEEE Journal of Solid-State Circuits, vol SC-18, no 6, pp 767-772, Dec 1983
- [2.21] C L M van der Klauw, 'Vertical Charge Transport In Junction Charge Coupled Devices', Ph D Dissertation, Delft University of Technology, Oct 1987
- [2.22] R A Allen et al., 'Charge-Coupled Devices In Signal Processing Systems', vol V, Final Report, U.S. Navy Contract no N0014-74-C0068, Dec 1979

- [2.23] J G Nash, 'Combinatorial Digital Logic Using Charge-Coupled Devices', IEEE Journal of Solid-State Circuits, vol SC-17, no 5, pp 957-963, Oct 1982
- [2.24] H T Kung and C E Leiserson, 'Algorithms For VLSI Processor Arrays' in <u>Introduction to VLSI Systems</u>, ed C A Mead and L Conway (New York: Addison-Wesley), pp 271-292, 1980
- [2.25] S Y Kung, 'On Supercomputing With Systolic-/Wavefront Array Processors', Proc of the IEEE, vol 72, no 7, pp 867-884, July 1984
- [2.26] S Y Kung, 'VLSI Array Processors', in <u>Systolic</u> <u>Arrays</u>, ed W Moore et al.(Bristol: Adam Hilger), pp 7-24, 1987
- [2.27] W Moore, A McCabe, and R Urquhart, 'Design Methods And Tools', in <u>Systolic Arrays</u>, ed W Moore et al. (Bristol: Adam Hilger), pp 3-6, 1987
- [2.28] J McCanny and J McWhirter, 'The Derivation And Utilisation Of Bit Level Systolic Array Architectures', in <u>Systolic Arrays</u>, ed W Moore et al.(Bristol: Adam Hilger), pp 47-59, 1987
- [2.29] S Y Kung, J Anneveling and P Dewilde, 'Hierarchical iterative flowgraph design for VLSI array processors', in IEEE Workshop on VLSI signal processing, L.A., nov 1984.

## Chapter 3.

- [3.1] G C Herman, C D Hartgring, and M Kleefstra, 'Calculation of Potential Profiles in the Junction Charge-Coupled Device', IEEE Trans on Electron Devices, vol ed-25, no 7, pp 845-847, 1978
- [3.2] M Kleefstra, 'A Simple Analysis of CCDs Driven by pn Junctions', Solid State Electronics, vol 21, pp 1005-1011, 1978
- [3.3] E A Wolsheimer, 'Physics, Technology, And Applications Of Junction Charge-Coupled Devices', Ph D Dissertation, Delft University of Technology, June 1982
- [3.4] C L M van der Klauw, 'Vertical Charge Transport In Junction Charge Coupled Devices', Ph D Dissertation, Delft University of Technology, Oct 1987

- [3.4b] ibid. pp 86.
- [3.4c] ibid. pp 89.
- [3.4d] ibid. pp 81-83.
- [3.4e] ibid. pp 90.
- [3.4f] ibid. pp 81.
- [3.5] L D Landau, and E M Lifschitz, <u>Course of Theoretical</u> <u>Physics</u>, <u>vol 1, 'Mechanics'</u>, (Oxford, Pergamon Press), chapter 1, 1969
- [3.6] R P Feynman, R B Leighton, and M Sands, <u>The Feynman Lectures on Physics</u>, <u>vol 2, 'Mainly Electromagnetism and Matter'</u>, (Reading, Massachusetts, Addison-Wesley), chapter 19, 1967
- [3.7] W R Th ten Kate, and C L M van der Klauw, 'A New Readout Structure for Radiation Silicon Strip Detectors', in IEDM Techn Digest, pp 647-650, Dec 1983
- [3.8] D A Antoniadis, S E Hansen and R W Dutton, 'SUPREM II, A Program for Process Modeling and Simulation', Report no. 5019-2, Stanford University, CA, USA, June 1978
- [3.9] S J Polak, A Wachters, H M J Vaes, A de Beer, and C den Heyer, 'A Continuation Method for the Calculation of Electrostatic Potentials in Semiconductors' in: <u>Numerical Analysis of</u> <u>Semiconductor Devices, Proc. NASECODE I</u>, (Dublin: Boole Press), pp 149-175, 1979
- [3.10] J D E Beynon and D R Lamb, <u>Charge-Coupled Devices</u> and their Applications (London: McGraw-Hill), 1980 pp 332.

## Chapter 4.

- [4.1] E P May, C L M van der Klauw, M Kleefstra, and E A Wolsheimer, 'Junction Charge-Coupled Logic (JCCL)', IEEE Journal of Solid-State Circuits, vol SC-18, no 6, pp 767-772, Dec 1983
- [4.2] C L M van der Klauw, 'Vertical Charge Transport In Junction Charge Coupled Devices', Ph D Dissertation, Delft University of Technology, Oct 1987
- [4.3] S L Hurst: <u>The Logical Processing of Digital</u> <u>Signals</u>, (New York: Crane, Russak & Co, Inc), 1978
- [4.4] S Cohen and R O Winder, 'Threshold Gate Building Blocks', IEEE Trans on Computers, vol C-18, pp 816-823, Sept 1969
- [4.5] D C Rine, 'Computer Science and Multiple-Valued Logic', (Amsterdam: Elsevier Science Publ. BV), 1984

- [4.7] H G Kerkhoff, 'Theory, Design And Applications Of Digital Charge-Coupled Devices', Ph D Dissertation, Twente University of Technology, 1984
- [4.8] D Hampel, 'Multifunction Threshold Gates', IEEE, Trans on Computers, vol C-22, no 2, pp 197-203, 1973

## Chapter 5.

- [5.1] C L M van der Klauw, 'Vertical Charge Transport In Junction Charge Coupled Devices', Ph D Dissertation, Delft University of Technology, Oct 1987
- [5.2] J Hoekstra, 'Simple JCCD Logic at 20 MHz', Electronics Letters, vol 20, pp 246-248, 1987.
- [5.3] J Hoekstra, 'Junction Charge-Coupled Logic Operating up to Clock Frequencies of 40 MHz', submitted to IEEE Electron Device Letters.
- [5.4] J Hoekstra, 'Possibilities for a Charge-Coupled Transistor Logic, a JCCL Compatible Logic Operating at Clock Voltages down to 2 V.', submitted to IEEE Electron Device Letters.
- [5.5] E P May, C L M van der Klauw, M Kleefstra, and E A Wolsheimer, 'Junction Charge-Coupled Logic (JCCL)', IEEE Journal of Solid-State Circuits, vol SC-18, no 6, pp 767-772, Dec 1983
- [5.6] H G Kerkhoff and M L Tervoert, 'Multiple-Valued Logic Charge-Coupled Devices', IEEE Trans on Computers, vol C-30, no 9, pp 644-652, Sept 1981
- [5.7] R A Allen et al., 'Charge-Coupled Devices In Signal Processing Systems', vol V, Final Report, U.S. Navy Contract no N0014-74-C0068, Dec 1979
- [5.8] J Hoekstra, 'Junction Charge-Coupled Devices for Bit-Level Systolic Arrays', IEE Proc. Pt. G. Electron. Circuits & Systems, vol 134, pp 194-199, 1987.
- [5.9] C L M van der Klauw, 'A Full Adder Using Junction Charge-Coupled Logic', IEEE Journal of Solid-State Circuits, vol SC-21, no 4, pp 584-587, Aug 1986.
- [5.10] T A Zimmerman, R A Allen, R W Jacobs, 'Digital Charge-Coupled Logic (DCCL)', IEEE Journal of Solid-State Circuits, vol SC-12, no 5, pp 473-485, Oct 1977

## Chapter 6.

- [6.1] J McCanny and J McWhirter, 'The Derivation And Utilisation Of Bit Level Systolic Array Architectures', in <u>Systolic Arrays</u>, ed W Moore et al.(Bristol: Adam Hilger), pp 47-59, 1987
- [6.2] M Hatamian and G L Cash, 'A 70 MHz 8-bit x 8-bit parallel multiplier in 2.5 micron CMOS.', IEEE Journal of Solid-State Circuits, vol SC-21, no 4, pp 505-513, Aug 1986
- [6.3] D E Danielsson, 'Serial/Parallel Convolver', IEEE Trans. on Computers, C-33, 1984, pp 652-667
- [6.4] G J Li and B W Wah, 'The Design of Optimal Systolic Arrays', IEEE Trans. on Computers, C-34, 1985, pp 66-77
- [6.5] S Y Kung, <u>VLSI Array Processors</u>, (Prentice Hall), 1988
- [6.6] J V McCanny and J G McWhirter, 'Completely Iterative, Pipelined Multiplier Array Suitable for VLSI', IEE Proc. G, Electron. Circuits & Systems, 1982, vol 129, pp 40-46
- [6.7] J Hoekstra, 'Systolic Multiplier', Electron. Lett., vol 20, 1984, pp 995-996
- [6.8] J Hoekstra, 'Junction Charge-Coupled Devices for Bit-Level Systolic Arrays', IEE Proc. Pt. G Electron. Circuits & Systems, vol 134, 1987, pp 194-199
- [6.9] J V McCanny and J G McWhirter, 'Implementation of Signal Processing Functions Using 1-Bit Systolic Arrays', Electron. Lett., vol 18, pp 241-243, 1982
- [6.11] J R Jump and S R Ahuja, 'Effective pipelining of Digital Systems', IEEE Trans. on Comp., vol C-27, pp 855-865, 1978
- [6.12] S Y Kung, 'VLSI Array Processors', IEEE ASSP Magazine, July 1985, pp 4-22
- [6.13] J J F Cavanagh, 'Digital Computer Arithmetic', (McGraw Hill) 1984
- [6.14] K J Dean, 'Cellular Arrays for Binary Division', Proc. IEE, vol 117, 1970, pp 917-920
- [6.15] Nash J G: 'Combinatorial digital logic using charge-coupled devices', IEEE J. Solid-State Circuits, 1982, SC-17, pp 957-963

- [6.16] Allen R A, Anderson J M, Hamilton F G, Huber W H, Penberg M, Zimmerman T A, Nicas R, Schneier M: 'Charge-coupled devices in signal processing systems' vol 5, Final report Navy contract no N00014-74-C-0068, TWR System Group, Redondo beach CA USA, 1979
- [6.17] Evans R A, Wood D, Wood K, McCanny J V, McWhirter J G and McCabe A P H: 'A CMOS implementation of a systolic multi-bit convolver chip', Proceedings of VLSI Conference, Trondheim, Norway, 1983, pp 227-235
- [6.18] McCabe M M, McCabe A P M, Arambepola B, Robinson I N, Corry A G: 'New algorithms and architectures for VLSI', GEC J, Sci & Technol, 1982, 48, pp 68-75
- [6.19] Sequin C H and Tompsett M F: 'Charge transfer devices',(Academic Press, New York, 1975)
- [6.20] Howes M J, and Morgan D V: 'Charge-coupled devices and systems', (Wiley, New York, 1974)
- [6.21] Kleefstra M : 'A simple analysis of CCDs driven by pn junctions', Solid-State Electron. 1978, 21, pp 1005-1011
- [6.22] Kleefstra M : 'First experimental bipolar charge-coupled device', Microelectronics, 1975, 7, pp 68-69
- [6.23] May E P, van der Klauw C L M, Kleefstra M and Wolsheimer E : 'Junction charge-coupled logic (JCCL)', IEEE J. Solid-State Circuits, 1983, SC-18, pp 767-772
- [6.24] Van der Klauw C L M : 'A full adder using junction charge-coupled logic', IEEE J. Solid State Circuits, 1986, SC-21, pp 584-587
- [6.25] E Deprettere, P Dewilde and P Udo, 'Pipelined CORDIC architecture for fast VLSI filtering and array processing', in Proc. IEEE ICASSP'84, pp 41.A.61-41.A.64, San Diego, 1984.

## Appendix A

- [A.1] Rektorys K, <u>"Variational Methods in Mathematics, Science and Engineering"</u>, (London: D Reidel)
- [A.2] Zienkiewicz O C, <u>"The Finite Element Method in Engineering Science"</u>
- [A.3] Landau L D, Lifschitz E M, <u>"Theoretical Physics, vol 1 Mechanics"</u>, (Oxford: Pergamon Press), 1969
- [A.4] Landau L D, Lifschitz E M, <u>"Theoretical Physics, vol 2 Classical Theory of Fields"</u>, (Oxford: Pergamon Press), 1969
- [A.5] Feynman R P, Leighton R B, Sands M, <u>"The Feynman Lectures on Physics, vol 2, Electrodynamics"</u>, (Reading: Addison-Wesley), 1967
- [A.6] Engl W L, Dirks H K, Meinerzhagen B, 'Device Modeling', Proceedings IEEE, vol 71, pp 10-33, 1983

# SUMMARY

This thesis describes some models and implementations of digital logic functions using junction charge-coupled devices (JCCDs). Under the application of a proper sequence of clock pulses, JCCDs move 'potential wells' filled with quantities of electrical charge in a controlled manner across a semiconductor substrate. Now, digital information can be represented by the presence or absence of a charge packet. The charge packet is transported through the JCCD. This memory function can be extended with logical JCCD devices. In this way junction charge-coupled logic (JCCL) is obtained.

The thesis furnishes the reader with a working knowledge of the physical principles of JCCDs as used in logic applications, and provides him with tools for a concise and precise description of the basic logical structures. A central part is an analytical solution of a simplified JCCD. Basically a JCCD consists of a lightly doped p-type substrate and an n-type epilayer with diffused p-gates. A well-defined, local potential maximum can be created in the epilayer by clocking a gate to a positive voltage. The structure is thus capable of storing electrons. It is not possible to obtain a simple relation that expresses the amount of charge that can be transported in a single clock cycle per unit of gate area in terms of the donor concentration, the thickness of the epilayer, and the applied gate voltage. A new description of the charge transport in the JCCD is introduced, which is based on the equality of the electrical potential in the driving and receiving well.

Besides charge transport in potential wells, JCCDs offer the possibility of vertical charge transport. Charge can be vertically injected into the JCCD through the gates by applying a vertical NPN transistor (injector), which can be made without any additional fabrication processing steps. The substrate PNP transistor, formed by the gate, the epilayer, and the substrate, can be used for the detection of surplus charge in the potential well. Using vertical charge transport, logic devices are realized.

An introduction to the general description of JCCL is given. The basic logic structures are elements of both Boolean logic and threshold logic. Several JCCL full adders are discussed. JCCL is a technology for bit-level systolic arrays. Some bit-level systolic arrays for multiplication and division, and the application of JCCDs within the concept of bit-level systolic arrays, are discussed.

Initially, the research was directed towards the realization of logic building blocks suitable for application in bit-level systolic arrays. However, after the first experiments with these cells, it was clear that more fundamental research on basic logic functions was necessary. The simple logic functions, such as AND, OR, carry, and a full adder, are implemented in a process in which the thickness of the epilayer is decreased to 5 $\mu$m. The experiments with the AND and carry functions show that the functions operate up to clock frequencies of 40 MHz. Satisfactory operation of a threshold full adder has been achieved at a clock frequency of 1.1 MHz. Another experiment has been carried out to investigate the possibility of a logic device, compatible with JCCL, operating at low voltages. An AND function is realized operating up to 15 MHz while driven by clock voltages down to 2 V. As can be concluded from the experiments, a most favorable improvement in performance can be obtained from developing a technology for digital JCCD applications and from the elimination of parasitic potential wells.

# SAMENVATTING

This thesis describes some models and realizations of logic functions using Junction Charge-Coupled Devices (JCCDs). When clock pulses are applied to the JCCD in the correct manner, "potential wells" filled with quantities of electrical charge are moved across a semiconductor substrate. The digital information is represented by the presence or absence of charge packets. Such a charge packet is transported through the JCCD. This memory function can be extended with logical JCCD building blocks. In this way Junction Charge-Coupled Logic (JCCL) is obtained.

The thesis gives an introduction to the physical principles of JCCDs as they are used in logic circuits and provides tools for the description of the basic logic structures and the synthesis of JCCL. In the first part an analytical solution of a simplified JCCD is central. A JCCD is built up from a p-type doped substrate carrying an n-type epilayer with diffused p-gates. A well-defined, local potential maximum can be created in the epilayer by applying a positive voltage to the gate. The structure is thus able to store electrons. It is not possible to express in a simple relation the amount of charge that is transported in a single clock cycle per unit of gate area in terms of the donor concentration, the thickness of the epilayer, and the applied clock voltage. A new description of the charge transport is given. Its starting point is that the electrical potential in the well from which the charge flows away equals that in the well towards which the charge flows.

Besides charge transport in potential wells, JCCDs offer the possibility of vertical charge transport. Charge can be injected vertically through the gates by making use of a vertical NPN transistor (injector), which can be made without an extra process step. The substrate PNP transistor, formed by the gate, the epilayer, and the substrate, can be used for detecting surplus charge in the potential well. With the aid of vertical charge transport, logic circuits are realized.

An introduction to the description of JCCL is given. The fundamental logic structures are elements of both Boolean logic and threshold logic. Several adders are discussed. JCCL is a technology for bit-level systolic arrays. A number of bit-level systolic arrays for multiplication and division, and the application of JCCDs in systolic arrays, are discussed.

Initially the research was aimed at realizing logic building blocks suitable for application in bit-level systolic arrays. However, after the first experiments with these cells it became clear that a more thorough investigation of the basic functions was needed. The simple logic functions AND, OR, carry, and an adder were realized in a process in which the epilayer was reduced to 5 $\mu$m. The experiments with the AND and carry functions show that they operate up to a frequency of 40 MHz. A (threshold) adder works correctly at a clock frequency of 1.1 MHz. Another experiment was carried out to investigate whether a logic structure that is compatible with JCCL can function at a low clock voltage. An AND function that works at 15 MHz with a clock voltage of only 2 V has been realized. From the experiments it can be concluded that the performance of the circuits can be improved if the fabrication technology is adapted to digital JCCD applications and if parasitic potential wells are avoided.

### ACKNOWLEDGEMENT

The research described in this thesis was financially supported by the FOM, the Dutch organisation for fundamental research on matter.

The devices were fabricated at Philips Nijmegen by Drs. A. van Hout and Mr. H. Punter.

The research could not have been carried out without the help of many people. First of all, I want to thank Judy and Tom, Jeroen and Peter for their patience.

I wish to thank Mw. Zaat-Jones for correcting my linguistic errors.

I also express my appreciation to all my colleagues and friends in the laboratory of electrical materials with whom it was a pleasure to work, in particular, Jan van Staden, Jan Chris Staalenburg, Nico Spiekerman, Gertjan Ouwerling, Ronald van Oort, Bram van der Male, Rien Geerts, Johan van den Heuvel, Kees van der Klauw, Rick Wentinck, Jannie Vermeulen, Wilf Stefan and Leon Jansen.

# LIST OF MAIN SYMBOLS

| Symbol | Description | Unit |
|---|---|---|
| C | capacitance | C V<sup>-1</sup> |
| C<sub>eq</sub> | equivalent capacitance | C V<sup>-1</sup> |
| d(epi) | epilayer thickness | m |
| E | electric field | |
| f | frequency | s<sup>-1</sup> |
| f<sub>c</sub> | clock frequency | s<sup>-1</sup> |
| I<sub>ct</sub> | collector transport current | C s<sup>-1</sup> |
| | injector current | C s<sup>-1</sup> |
| I<sub>S</sub> | saturation current | C s<sup>-1</sup> |
| I<sub>O</sub> | overflow current | C s<sup>-1</sup> |
| k | Boltzmann constant | J K<sup>-1</sup> |
| | donor impurity concentration in epilayer | m<sup>-3</sup> |
| | acceptor impurity concentration of gate | m<sup>-3</sup> |
| | acceptor impurity concentration of substrate | m<sup>-3</sup> |
| n | electron concentration | m<sup>-3</sup> |
| n | number of transfers | - |
| n<sub>eq</sub> | equivalent number of gates | - |
| n<sub>i</sub> | intrinsic carrier concentration | m<sup>-3</sup> |
| P | power | C V s<sup>-1</sup> |
| p | hole concentration | m<sup>-3</sup> |
| Q | charge | C |
| Q<sub>s</sub> | signal charge in JCCD | C m<sup>-2</sup> |
| Q<sub>s,max</sub> | maximal signal charge | C m<sup>-2</sup> |
| q | elementary charge | C |
| R | resistance | V s C<sup>-1</sup> |
| T | absolute temperature | K |
| t | time coordinate | - |
| V | potential | V |
| V<sub>ch</sub> | channel potential (= potential maximum in the JCCD channel) | V |
| V<sub>ch</sub>(0,0) | channel potential in absence of signal charge at a gate voltage of 0 V | V |
| V<sub>ch</sub>(V<sub>g</sub>,Q<sub>s</sub>) | channel potential in presence of signal charge | V |
| V<sub>b</sub> | built-in voltage | V |
| V<sub>cl</sub> | clock voltage | V |
| V<sub>g</sub> | gate voltage | V |
| x,y,z | space coordinates | - |
| x,y,z | logical inputs | - |
| ,,,,,                                       | Ur                                                          |                   |