

**Delft University of Technology** 

# Device-Aware Test for Back-Hopping Defects in STT-MRAMs

Yuan, Sicong; Taouil, Mottaqiallah; Fieback, Moritz; Xun, Hanzhi; Marinissen, Erik Jan; Kar, Gouri Sankar; Rao, Sidharth; Couet, Sebastien; Hamdioui, Said

**DOI** 10.23919/DATE56975.2023.10137071

**Publication date** 2023

**Document Version** Final published version

## Published in

2023 Design, Automation and Test in Europe Conference and Exhibition, DATE 2023 - Proceedings

#### Citation (APA)

Yuan, S., Taouil, M., Fieback, M., Xun, H., Marinissen, E. J., Kar, G. S., Rao, S., Couet, S., & Hamdioui, S. (2023). Device-Aware Test for Back-Hopping Defects in STT-MRAMs. In *2023 Design, Automation and Test in Europe Conference and Exhibition, DATE 2023 - Proceedings* (Proceedings -Design, Automation and Test in Europe, DATE; Vol. 2023-April). IEEE. https://doi.org/10.23919/DATE56975.2023.10137071

#### Important note

To cite this publication, please use the final published version (if applicable). Please check the document version above.

**Copyright** Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.


# Device-Aware Test for Back-Hopping Defects in STT-MRAMs

Sicong Yuan\*<sup>‡</sup> Mottaqiallah Taouil\*<sup>†</sup> Moritz Fieback\* Hanzhi Xun\* Erik Jan Marinissen<sup>‡</sup>

Gouri Sankar Kar<sup>‡</sup> Sidharth Rao<sup>‡</sup> Sebastien Couet<sup>‡</sup> Said Hamdioui<sup>\*†</sup>

\*TUDelft, Delft, The Netherlands <sup>†</sup> CognitiveIC, Delft, The Netherlands <sup>‡</sup>IMEC, Leuven, Belgium

{Erik.Jan.Marinissen, Gouri.Kar, Siddharth.Rao, Sebastien.Couet}@imec.be

Abstract—The development of Spin-transfer torque magnetic RAM (STT-MRAM) mass production requires high-quality dedicated test solutions, for which understanding and modeling of manufacturing defects of the magnetic tunnel junction (MTJ) is crucial. This paper introduces and characterizes a new defect called Back-Hopping (BH); it also provides its fault models and test solutions. The BH defect causes the MTJ state to oscillate during write operations, leading to write failures. The characterization of the defect is carried out based on manufactured MTJ devices. Due to the observed non-linear characteristics, the BH defect cannot be modeled with a linear resistance. Hence, device-aware defect modeling is applied by considering the intrinsic physical mechanisms; the model is then calibrated based on measurement data. Thereafter, fault modeling and analysis are performed based on circuit-level simulations, and new fault primitives/models are derived. These accurately describe the way the STT-MRAM behaves in the presence of a BH defect. Finally, a dedicated march test and a Design-for-Test solution are proposed.

#### I. INTRODUCTION

Spin-transfer torque magnetic RAM (STT-MRAM) has attracted considerable attention thanks to its competitive writing performance, endurance, retention, and low power consumption [1]. Since the first commercial MRAM product in 2006, world-leading foundries and producers, such as TSMC, Samsung, Intel, and Everspin have entered the market, leading to the single chip storage capacity increasing from 4 MB to 1 GB [1-5]. However, the further development of STT-MRAM mass production still faces critical challenges, one of which is its vulnerability to defects. Compared with the regular manufacturing process of CMOS, more defects are introduced during the STT-MRAM manufacturing process, since it involves several additional steps of the magnetic tunnel junction (MTJ) fabrication and integration [6]. Furthermore, the magnetic field and the spin-transfer torque (STT), which play an essential role in the STT-MRAM working mechanism, have introduced a variety of irregular defects [7]. Due to these additional defects, directly transplanting test methods from conventional memories to STT-MRAMs causes a high test escape rate and a high yield loss [6-9]. To overcome this challenge, it is critical to design accurate defect models, derive appropriate fault models and develop efficient test solutions.

STT-MRAM modeling and testing are developing rapidly. In 2014, the multi-victim and kink fault models were proposed by Azevedo *et al.* for field-driven MRAMs [7]; yet these models are not applicable to current-driven STT-MRAMs. In 2015, Chintaluri *et al.* studied the fault modeling of STT-MRAM by simulating the impact of resistive defects in the layout and the netlist. The following year, the same group presented a built-in-self-test (BIST) system for fault detection [9]. In 2018, Nair *et al.* presented layout-aware defect injection and fault analysis, in which dynamic incorrect read faults were observed [6]. Nevertheless, all of these works assume that STT-MRAM defects can be accurately modeled with linear resistors or parasitic capacitors, while neglecting MTJ internal physical mechanisms that may cause additional non-linear defects. As a solution, Hamdioui *et al.* put forward the concept of 'device-aware test (DAT)' [8,10,11]. In their work, the way the MTJ behaves in the presence of defects was accurately modeled; the defect models were then used in circuit-level simulations for fault analysis and the development of test solutions. The DAT approach has been applied to unique defects such as the pinhole, the synthetic anti-ferromagnet flip (SAFF) and the intermediate state (IM) [12-14]. However, these represent only a subset of the MTJ defects. To guarantee completeness and a higher outgoing product quality, it is critical to analyze all possible MTJ defects.

In this work, back-hopping (BH) in STT-MRAMs is reported based on measurement data, and the DAT approach is applied to model and detect this defect. Conventionally, the MTJ state is supposed to stay constant after a successful write operation. However, due to physical imperfections, the reference layer (RL) can become unstable and cause write failures, a phenomenon named 'back-hopping'. The major contributions of this paper are as follows:

- Characterize the BH defect and explain its physics.
- Design the device-aware defect model for BH and calibrate it with the measured Write Error Rate (WER).
- Apply the device-aware defect model into circuit-level simulations and perform the device-aware fault modeling.
- Develop DAT solutions to detect BH.

The rest of this paper is organized as follows. Section II introduces the basics of STT-MRAMs. Section III presents the characterization of BH. Section IV analyzes the physical mechanism of BH, designs the BH defect model, and calibrates the model with measurement data. Section V applies device-aware fault modeling. Section VI discusses DAT solutions for BH. Finally, Section VII concludes this paper.

{S.Yuan-4, M.Taouil, M.C.R.Fieback, H.Xun, S.Hamdioui}@tudelft.nl



Fig. 1. (a) Simplified MTJ stack, (b) 1T-1MTJ cell and its access operations.

#### II. BACKGROUND

#### A. MTJ Device Technology

The fundamental data-recording element in STT-MRAMs is the magnetic tunnel junction (MTJ); it stores one bit of data encoded in two bi-stable resistance states. Fig. 1 (a) presents the simplified schematic of an MTJ. The critical diameter (CD) in this work is  $60 \,\mathrm{nm}$ . Typically the MTJ is a sandwich structure, consisting of an ultra-thin dielectric tunnel barrier (TB) between a free layer (FL) and a pinned layer (PL). The FL is a ferromagnetic layer whose magnetization can be switched through write operations, and the TB is a thin MgO insulator. The PL is a multiple-layer stack composed of: 1) a top reference layer  $(RL_t)$, 2) a thin metal spacer, 3) a bottom reference layer  $(RL_b)$, 4) a thin spacer, 5) a thick hard layer (HL). The magnetization direction of the two RLs is the same, with a ferromagnetic coupling in between. The  $RL_{\rm b}$  is anti-ferromagnetically coupled to the HL through the Ru spacer, resulting in their opposite magnetization directions. For the defect-free device, all ferromagnetic layers within the PL stack are stable, and their magnetization never switches. When a current flows through the device, it exerts STT on the FL electrons, which may switch the FL magnetization to be either parallel or anti-parallel to that of  $RL_t$ . The MTJ resistance, which depends on the FL magnetization, is either low (i.e., P state) or high (i.e., AP state).

#### B. 1T-1MTJ Cell Design

The bottom-pinned 1 Transistor - 1 MTJ (1T-1M) bit cell structure and the related write/read operations are presented in Fig. 1 (b). The cell consists of an N-type Metal-Oxide-Semiconductor Field-Effect Transistor (MOSFET) selector and an MTJ device. The three terminals of the cell connect to the bit line (BL), the source line (SL), and the word line (WL), respectively. During the write operation, the voltage of WL selects the cell and the voltages of BL and SL control the operation type. For instance, during a '1w0' operation, the BL connects to  $V_{\rm DD}$  and the SL is grounded, introducing a writing current  $I_{w0}$  flowing through the MTJ device from FL to PL. The tunneling electrons exert STT that switches the FL magnetization to be parallel to that of  $RL_t$, i.e., the MTJ state is switched from AP to P. On the contrary, the process of '0w1' offers an opposite current  $I_{w1}$  by connecting the BL to the ground and the SL to  $V_{\rm DD}$ . The MTJ state is switched from P to AP by the reversed STT. A writing current  $I_{\rm w}$  larger than the critical current  $I_{\rm c}$  is necessary to reach a high write success rate, and the switching time  $t_w$  is inversely proportional to  $I_w - I_c$ . Notice that the  $t_w$  in every write operation is intrinsically stochastic, which should be included in MTJ models. In read operations, a read current  $I_{rd}$  much smaller than  $I_c$  is applied to avoid unwanted state switches. A sense amplifier is employed to detect the device state, leading to a short read time  $t_{rd}$  of 5 ns.

Fig. 2. Plot of MTJ switching under the application of voltage pulses

#### III. DEFECT CHARACTERIZATION OF BH

In this section, the BH defect characterization is presented by studying the MTJ state oscillation under strong stress (e.g., a large pulse voltage  $V_{\rm p}$  and a long pulse time  $t_{\rm p}$ ).

#### A. Identification of BH

Fig. 2 presents the defective MTJ switching under voltage pulses. After initializing the MTJ to the P state, a sequence of write-read operations is performed. The write operations are a series of voltage pulses with a constant  $t_p = 7 \text{ ns}$  and a staircase  $V_p$, as shown in the left part of Fig. 2;  $V_p$  first sweeps from -1.3 V to 1.4 V with a step  $\Delta V_p = 50 \text{ mV}$, and then sweeps back to -1.3 V with  $\Delta V_p = -50 \text{ mV}$  (the full sweep is not shown in Fig. 2). After every write operation, a read operation is performed to detect the MTJ state. The right part of Fig. 2 shows the measurement data. During the positive-oriented  $V_p$  sweep, the MTJ state is expected to stay at the P state after the successful AP to P switching. However, as shown in Fig. 2, the defective device oscillates between AP and P after the first AP to P switching. The faulty behavior can be attributed to BH [15,16].
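As a minimal sketch of the pulse sequence described above, the staircase amplitudes can be generated as follows. Integer millivolts are used to avoid floating-point drift. The function name `staircase_sweep_mv` is illustrative, and whether the peak amplitude is applied once or twice at the turning point is an assumption (here: once).

```python
def staircase_sweep_mv(v_min_mv=-1300, v_max_mv=1400, step_mv=50):
    """Return pulse amplitudes (in mV) for an up-sweep followed by a down-sweep.

    Assumption: the peak value is applied only once at the turning point.
    """
    up = list(range(v_min_mv, v_max_mv + 1, step_mv))
    down = list(range(v_max_mv - step_mv, v_min_mv - 1, -step_mv))
    return up + down


sweep = staircase_sweep_mv()
# Each amplitude would be applied as a 7 ns write pulse followed by a read.
```

Under this assumption the full sweep contains 109 write-read pairs.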

Fig. 3. (a) WER measurement process. (b) WER measurement of P to AP switch. (c) WER measurement of AP to P switch. (d) WER extraction.

Fig. 3 shows the Write Error Rate (WER) measurement process and results. The WER is an important metric to evaluate the STT-MRAM writing performance. The requirement on the WER varies; in this work, a defect-free device is expected to have a WER lower than  $10^{-6}$  under the write operation. For defect-free devices, stronger pulse stress should lead to a lower WER. However, for a defective MTJ with BH, strong stress causes the MTJ state to oscillate between P and AP during write operations, thus causing a high WER instead. To obtain the WER of the P to AP switch, the MTJ is first initialized to the P state. A read operation is necessary to validate a successful initialization. Then, a positive write pulse is applied to the device to switch the MTJ state, followed by a second read operation that detects the final state of the device. Such an 'initialization-read-write-read' operation cycle (see Fig. 3 (a)) is performed 1000 times for every write pulse condition, and the WER is calculated by counting the number of unexpected P states detected by the second read operations. A similar process is executed to extract the WER of the AP to P switch. Fig. 3 (b) and (c) present examples of the WER extraction for the two types of switching under strong stress, in which write errors are observed. The extracted WER is presented in Fig. 3 (d). For both types of switching, write errors are first observed under weak stress (low  $V_{\rm p}$ ) and then under strong stress (high  $V_{\rm p}$ ). Under weak stress, insufficient energy causes unsuccessful switches, which is common for all MTJ devices. Conventionally, strengthening the stress reduces the WER (as shown for the defect-free case). However, Fig. 3 (d) also shows a high WER under strong stress, which implies an unexpected MTJ state oscillation after the first successful switch. Therefore, while applying strong stress is a regular approach to guarantee a high write success rate for the defect-free device, it instead increases the WER for an MTJ suffering from the BH defect.
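The 'initialization-read-write-read' cycle above can be sketched as a simple counting loop. This is an illustrative sketch, not the measurement software used in this work: `measure_wer` and `toy_write` are hypothetical names, and `toy_write` is a stand-in device whose write fails with a fixed, assumed probability.

```python
import random


def measure_wer(write_once, n_cycles=1000, seed=0):
    """Estimate the P-to-AP WER as the fraction of cycles ending in the wrong state."""
    rng = random.Random(seed)
    errors = 0
    for _ in range(n_cycles):
        state = "P"                     # initialization (assumed verified by the first read)
        state = write_once(state, rng)  # write pulse attempting P -> AP
        if state != "AP":               # second read: an unexpected P counts as a write error
            errors += 1
    return errors / n_cycles


def toy_write(state, rng, p_fail=0.12):
    """Hypothetical device model: the write fails with probability p_fail."""
    return state if rng.random() < p_fail else "AP"


wer = measure_wer(toy_write)
```

With 1000 cycles per pulse condition, the smallest non-zero WER such a count can resolve is $10^{-3}$.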

#### B. Related Work and Potential Causes

In 2009, the term 'back-hopping' was introduced by J. Z. Sun *et al.* [15] to describe the phenomenon that the MTJ state oscillates permanently under strong stress. In 2016, W. Kim *et al.* experimentally demonstrated the physical mechanism of BH: it is the RL losing stability and getting switched that causes the MTJ state to oscillate [17]. Several micro and macro models were proposed to describe the BH defect [18,19]. However, while valuable for physical studies, these models are inappropriate for simulating large STT-MRAM arrays due to their high computational complexity and low compatibility with circuit simulations. Therefore, a device-aware BH compact model is essential to describe the STT-MRAM faulty behaviors.

The physical mechanism of BH is summarized in Fig. 4, using the example of P to AP switching [17–19]. It is suggested by W. Kim *et al.* that, if the RL stack includes multiple ferromagnetic layers, usually only the magnetization of the most unstable layer switches when BH occurs (in this work, the  $RL_t$ ) [17]. Therefore, only the magnetization of switchable layers is extracted and shown in Fig. 4. To clarify the magnetization switching, the symbols ' $\uparrow$ ' and ' $\downarrow$ ' are used to indicate the magnetization direction. Assuming the magnetization direction of both FL and  $RL_t$  is ' $\uparrow$ ' before the write operation, Fig. 4 presents a loop with four phases, in which either the FL magnetization or the  $RL_t$  magnetization switches. When the  $RL_t$  is stable, only Phase 1 occurs, representing the defect-free case. However, if the  $RL_t$  loses its stability (due to physical imperfections, such as a poor interface quality between the  $RL_{t}$  and the  $RL_{b}$  [19]), the four phases take place in sequence and loop permanently. During the whole write operation, the MTJ state oscillates within this loop. Whether a write error occurs depends on the phase in which the MTJ is at the end of the operation. The physics of the AP to P switching process with BH is similar. The physical mechanism of BH implies that, to mitigate this problem, one may improve the  $RL_t$  stability, e.g., by increasing its thickness, or improve the quality of the  $RL_{\rm t}$ /spacer/ $RL_{\rm b}$  stack, e.g., by a better annealing method.

#### IV. DEVICE-AWARE DEFECT MODELING OF BH

Due to the intrinsic nonlinear characteristics, regular defect models with linear resistors are inappropriate to represent BH. To model irregular defects, Wu *et al.* demonstrated a systematic device-aware defect modeling approach with three steps [8]: 1) physical defect analysis and modeling, 2) electrical defect modeling, 3) model optimization. In this section, the BH defect model is designed following these steps.

#### A. Physical Defect Analysis and Modeling

To physically model BH, three simplifications are made as preconditions: 1) Of all layers of the PL stack in Fig. 1 (a), only the  $RL_t$  is switchable, as it is the most vulnerable layer [17]. 2) In each of the four phases in Fig. 4, only one layer (i.e., either FL or  $RL_t$ ) is unstable, and the other is stable; the magnetization of the 'stable layer' is in fact also slightly disturbed by the STT from the current [18,19], but its impact is negligible and hence not considered here. 3) Interface effects [19] are ignored, since in our experiments including them in the compact model adds a large amount of calculation without improving accuracy. Then, the four physical phases can be analyzed one by one by considering: 1) which layer is switchable? 2) which effects influence the switching process? Next, we go through the four phases of P to AP switching in the presence of the BH defect, as shown in Fig. 4.



Fig. 4. The physical mechanism of BH in the P to AP switching process

In Phase 1, the FL is switchable, and the  $RL_t$  is stable. Under a positive write pulse, the STT switches the FL magnetization from ' $\uparrow$ ' to ' $\downarrow$ '. This is a common switching process for both defective and defect-free devices.

In Phase 2, the FL is stable, and the  $RL_{\rm t}$  is switchable. Two effects are involved in the  $RL_{\rm t}$  magnetization switching process: 1) the STT effect from the FL, 2) the pinning effect from the  $RL_{\rm b}$ . The pinning effect means that the  $RL_{\rm t}$  magnetization direction is forced to stay the same as that of  $RL_{\rm b}$  through the ferromagnetic coupling [20]. This pinning effect is strong for defect-free devices, but weak when BH occurs. To model this pinning effect, an effective pinning magnetic field  $H_{\rm p}$  is introduced [20]. The  $H_{\rm p}$  is an effective magnetic field that only acts on the  $RL_{\rm t}$  and does not affect other layers.  $H_{\rm p}$  is defined as:  $H_p = E_{ex}/(M_s * t_{RLtop})$, where  $E_{\rm ex}$  is the coupling energy,  $M_{\rm s}$  is the saturation magnetization of  $RL_{\rm t}$, and  $t_{\rm RLtop}$  is the  $RL_{\rm t}$  thickness. In this phase, a competition exists between the STT effect and the pinning effect, which respectively advance and impede the switch. Under strong stress, the STT effect wins, and switches the  $RL_{\rm t}$  magnetization from ' $\uparrow$ ' to ' $\downarrow$ '. Notice that depinning occurs in this phase, which implies  $H_{\rm p} = 0$  in Phases 3 and 4.
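The definition of  $H_{\rm p}$  above can be evaluated with a few lines of code. The sketch below assumes CGS units, in which a coupling energy per unit area in erg/cm², a saturation magnetization in emu/cm³ and a thickness in cm yield a field in Oe. The numerical values are hypothetical and chosen only for illustration; they are not measured parameters of the devices in this work.

```python
def pinning_field_oe(e_ex_erg_per_cm2, m_s_emu_per_cm3, t_rl_top_cm):
    """Effective pinning field H_p = E_ex / (M_s * t_RLtop), in Oe (CGS units assumed)."""
    return e_ex_erg_per_cm2 / (m_s_emu_per_cm3 * t_rl_top_cm)


# Hypothetical example values (not taken from the paper):
# E_ex = 0.5 erg/cm^2, M_s = 1000 emu/cm^3, t_RLtop = 1 nm = 1e-7 cm
h_p = pinning_field_oe(0.5, 1000.0, 1e-7)  # about 5000 Oe, i.e. 5 kOe
```

The expression makes explicit that a weaker coupling energy directly lowers  $H_{\rm p}$ .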

In Phase 3, the FL is switchable, and the  $RL_t$  is stable. Compared with Phase 1, both the magnetization direction of the FL and the STT offered by the  $RL_t$  are reversed. Therefore, the FL magnetization switches from ' $\downarrow$ ' to ' $\uparrow$ '.

In Phase 4, the FL is stable, and the  $RL_t$  is switchable. In this phase,  $H_p = 0$  due to the depinning, and a stray field  $H_s$  caused by other ferromagnetic layers is introduced, which only works on the  $RL_t$ . Affected by the  $H_s$  and the STT from the FL,  $RL_t$  magnetization switches from ' $\downarrow$ ' to ' $\uparrow$ '. The end of Phase 4 indicates a new start of Phase 1, hence the four phases form a complete loop, and the MTJ state oscillates permanently in this loop.

After the end of the write operation, the voltage is removed, and the pinning effect recovers if the operation stops at Phase 3 or 4. When the  $H_p$  is larger than the anisotropy magnetic field of  $RL_t$, the  $RL_t$  magnetization is pinned to ' $\uparrow$ '. In that case, write errors occur when write operations end in Phase 1 or Phase 4 [17]. However, when the  $H_p$  is not large enough, write operations ending in Phase 1 or Phase 3 cause the write error [21]. The physics of AP to P switching with BH can be analyzed by similar methods.
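The four-phase loop and the end-of-pulse error condition described above can be illustrated with a small deterministic sketch for P to AP switching. The phase durations are hypothetical inputs, and the stochastic nature of the switching time is deliberately left out; the function names are illustrative only.

```python
from itertools import cycle


def phase_at_end(t_pulse, phase_times):
    """Return the phase (1..4) in which a pulse of length t_pulse ends.

    phase_times holds the (assumed, deterministic) durations of Phases 1 to 4.
    """
    if len(phase_times) != 4 or min(phase_times) <= 0:
        raise ValueError("expected four positive phase durations")
    elapsed = 0.0
    for idx in cycle(range(4)):
        elapsed += phase_times[idx]
        if t_pulse < elapsed:
            return idx + 1


def write_error_p_to_ap(end_phase, strong_pinning):
    """Apply the rule from the text: errors in Phases {1, 4} if H_p re-pins RL_t,
    otherwise in Phases {1, 3}."""
    error_phases = {1, 4} if strong_pinning else {1, 3}
    return end_phase in error_phases


# Hypothetical phase durations in ns.
times_ns = (2.0, 3.0, 2.0, 3.0)
end = phase_at_end(8.0, times_ns)                 # pulse ends in Phase 4
err = write_error_p_to_ap(end, strong_pinning=True)
```

The sketch only covers a defective device; for a defect-free device the  $RL_t$  is stable and the loop never leaves Phase 1.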

#### B. Electrical Modeling of MTJ Devices with the BH defect

Following the obtained four-phase loop physical defect model, the electrical modeling of the defective MTJ switching from P to AP can be realized by calculating the critical parameters in each of these phases.

TABLE I. KEY PARAMETER CALCULATIONS OF DEFECTIVE MTJ MODEL IN THE P TO AP SWITCHING PROCESS.

| Phase   | Parameter | Equation |
|---------|-----------|----------|
| Phase 1 | $I_{c1}$  | $I_{c1} = \frac{1}{\eta_P} * \frac{\alpha \gamma e}{\hbar} * A * t_{FL} * M_s * H_k$ |
|         | $t_{w1}$  | $t_{w1} = \frac{\left(C + \ln\left(\frac{\pi^2}{4}\Delta\right)\right) * e * m}{4 * \mu_b * \eta_P * (I_{MTJ} - I_{c1})}$ |
| Phase 2 | $I_{c2}$  | $I_{c2} = \frac{1}{\eta_{AP}} * \frac{\alpha \gamma e}{\hbar} * A * t_{RLtop} * M_s * (H_k + H_p)$ |
|         | $t_{w2}$  | $t_{w2} = \frac{\left(C + \ln\left(\frac{\pi^2}{4}\Delta\right)\right) * e * m}{4 * \mu_b * \eta_{AP} * (I_{MTJ} - I_{c2})}$ |
| Phase 3 | $I_{c3}$  | $I_{c3} = \frac{1}{\eta_P} * \frac{\alpha \gamma e}{\hbar} * A * t_{FL} * M_s * H_k$ |
|         | $t_{w3}$  | $t_{w3} = \frac{\left(C + \ln\left(\frac{\pi^2}{4}\Delta\right)\right) * e * m}{4 * \mu_b * \eta_P * (I_{MTJ} - I_{c3})}$ |
| Phase 4 | $I_{c4}$  | $I_{c4} = \frac{1}{\eta_{AP}} * \frac{\alpha \gamma e}{\hbar} * A * t_{RLtop} * M_s * (H_k + H_s)$ |
|         | $t_{w4}$  | $t_{w4} = \frac{\left(C + \ln\left(\frac{\pi^2}{4}\Delta\right)\right) * e * m}{4 * \mu_b * \eta_{AP} * (I_{MTJ} - I_{c4})}$ |

TABLE II. PARAMETERS IN TABLE I.

| Symbol         | Parameter                         | Symbol          | Parameter                        |
|----------------|-----------------------------------|-----------------|----------------------------------|
| $I_{\rm c}$    | Critical switching current        | $t_{\rm w}$     | Switching time                   |
| $\eta_{\rm P}$ | STT efficiency of P to AP switch  | $\eta_{AP}$     | STT efficiency of AP to P switch |
| $\alpha$       | Damping factor                    | $\gamma$        | Electron gyromagnetic ratio      |
| $t_{\rm FL}$   | Thickness of the FL               | $t_{\rm RLtop}$ | Thickness of the $RL_t$          |
| $M_{\rm s}$    | Saturation magnetization          | $H_{\rm k}$     | Anisotropy magnetic field        |
| $\Delta$       | Thermal stability                 | $\mu_{b}$       | Bohr magneton                    |
| $A$            | Cross-area                        | $I_{\rm MTJ}$   | Current through the MTJ          |
| $H_{\rm p}$    | Effective coupling magnetic field | $H_{\rm s}$     | Stray field in Phase 4           |

In Fig. 4, the four phases can be categorized into two groups based on which layer is switchable. Phase 1 and Phase 3 are in the same group, in which the FL switches. The conventional MTJ compact model can be directly applied to these two phases [8]. Key parameters here are the critical switching current  $I_c$  and the switching time  $t_w$, whose calculations are obtained from Khvalkovskiy's model and Sun's model, respectively [22,23]. Phase 2 and Phase 4 form the other group, where the  $RL_t$  is switchable. To calculate  $I_c$  and  $t_w$  for these phases, it is crucial to find appropriate parameters for the  $RL_t$ . Because the  $RL_t$  and the FL are made of the same material and have similar thicknesses, we assume that all conventional parameters of the FL can be directly applied to the  $RL_t$, except the thickness. There are two additional parameters,  $H_{\rm p}$  and  $H_{\rm s}$, as described in Part A of this section. Since both parameters can be treated as magnetic fields that only act on  $RL_t$, they can be included in the  $I_c$  calculations [8]. Here  $H_p$  is used to reflect the BH defect strength, where a lower  $H_{\rm p}$  refers to a stronger defect. The equations of  $I_c$  and  $t_w$  are summarized in TABLE I, with parameters listed in TABLE II. Notice that due to the intrinsic stochasticity, a normal random function is applied to the  $t_w$, with a deviation of 10% from its nominal value at the  $3\sigma$  corners.

The electrical model for the defective MTJ switching from AP to P can be derived in a similar manner.
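The switching-time expression of TABLE I and the stochastic variation of  $t_w$  can be sketched as follows. This is a hedged illustration, not the calibrated compact model: it assumes SI units, that  $C$  is Euler's constant and that  $m$  is the magnetic moment of the switching layer (neither is listed in TABLE II), and it reads '10% at  $3\sigma$ ' as a standard deviation of one third of 10% of the nominal value. All numerical inputs in the example are hypothetical.

```python
import math
import random

E_CHARGE = 1.602176634e-19   # elementary charge e, in C
MU_B = 9.2740100783e-24      # Bohr magneton, in A*m^2
EULER_C = 0.5772156649       # assumed meaning of the constant C in TABLE I


def switching_time(i_mtj, i_c, eta, delta, moment):
    """Nominal t_w from TABLE I: (C + ln(pi^2 * Delta / 4)) * e * m / (4 * mu_b * eta * (I_MTJ - I_c))."""
    if i_mtj <= i_c:
        raise ValueError("the expression only applies for I_MTJ > I_c")
    numerator = (EULER_C + math.log(math.pi ** 2 * delta / 4.0)) * E_CHARGE * moment
    return numerator / (4.0 * MU_B * eta * (i_mtj - i_c))


def jittered(t_nominal, rng, rel_3sigma=0.10):
    """Draw a stochastic t_w: normal distribution with 3*sigma = rel_3sigma * nominal."""
    sigma = rel_3sigma * t_nominal / 3.0
    return rng.gauss(t_nominal, sigma)


# Hypothetical inputs: 100 uA write current, 50 uA critical current,
# eta = 0.6, Delta = 60, m = 4e-18 A*m^2.
t_nom = switching_time(100e-6, 50e-6, 0.6, 60.0, 4e-18)
t_sample = jittered(t_nom, random.Random(1))
```

Doubling the overdrive  $I_{MTJ} - I_c$  halves the nominal switching time, which is the inverse proportionality stated in Section II.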



Fig. 5. (a) Magnetization switching during write operations (b) BH fitting in the linear y-axis. (c) BH fitting in the log y-axis.

TABLE III. FAULT PRIMITIVE NOTATIONS.

| $\langle S/F/R \rangle$ | Explanation          | Value                              |
|-------------------------|----------------------|------------------------------------|
| S                       | Sensitizing sequence | 0, 1, 0w0, 0w1, 1w0, 1w1, 0r0, 1r1 |
| F                       | Faulty effect        | L, 0, U, 1, H                      |
| R                       | Readout value        | 0, 1, ?, -                         |

| Field | Symbol | Meaning                |
|-------|--------|------------------------|
| F     | 'L'    | MTJ extreme low state  |
| F     | '0'    | MTJ normal low state   |
| F     | 'U'    | MTJ undefined state    |
| F     | '1'    | MTJ normal high state  |
| F     | 'H'    | MTJ extreme high state |
| R     | '0'    | Readout low state      |
| R     | '1'    | Readout high state     |
| R     | '?'    | Readout random state   |
| R     | '-'    | Readout not applicable |

#### C. Fitting and Model Optimization

The fitting is carried out in Python with the WER measurement data. The fitting process consists of two steps: 1) the fitting of the defect-free device performance [8]; 2) the fitting of the defective device performance using the parameters  $H_p$  and  $H_s$ . Fig. 5 (a) shows an example of magnetization switching during write operations in the presence of BH. Fig. 5 (b) and (c) present the fitting result at  $H_s$  of 300 Oe and  $H_p$  of 4 kOe, with a linear and a log y-axis, respectively. The fitting result implies that the model is able to accurately predict the WER caused by BH. After the verification in Python, the model is ported to Verilog-A to make it compatible with circuit-level simulations.

#### V. DEVICE-AWARE FAULT MODELING OF BH

In this section, the device-aware fault modeling is performed by studying the impact of BH defect at the circuit level.

#### A. Simulation Set-up

This work limits the analysis to single-cell faults. Fault primitive (FP) notations are applied to describe the memory faults [8] (see also TABLE III):  $\langle S/F/R \rangle$, where S describes the sensitizing sequence, F describes the faulty effect, and R describes the readout value. For example,  $\langle 0r0/0/1 \rangle$  denotes an *r*0 operation on a cell that holds '0' (S=0*r*0), where the cell remains in its correct state '0' (F=0) yet the read output returns '1' (R=1) instead of the expected '0'. An FP thus specifies how the faulty memory behavior deviates from the expected one.
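To make the notation concrete, the following sketch parses an FP string into its three fields and validates them against the value sets of TABLE III. The names `FaultPrimitive` and `parse_fp` are illustrative and not part of any existing tool; the optional `f_values` argument is an assumption that lets the F alphabet be extended, as is done in Section V-B with the symbol '~'.

```python
from typing import NamedTuple

# Value sets taken from TABLE III.
S_VALUES = {"0", "1", "0w0", "0w1", "1w0", "1w1", "0r0", "1r1"}
F_VALUES = {"L", "0", "U", "1", "H"}
R_VALUES = {"0", "1", "?", "-"}


class FaultPrimitive(NamedTuple):
    s: str  # sensitizing sequence
    f: str  # faulty effect
    r: str  # readout value


def parse_fp(text, f_values=F_VALUES):
    """Parse a string such as '<0r0/0/1>' into a validated FaultPrimitive."""
    body = text.strip()
    if not (body.startswith("<") and body.endswith(">")):
        raise ValueError(f"not a fault primitive: {text!r}")
    parts = body[1:-1].split("/")
    if len(parts) != 3:
        raise ValueError(f"expected three fields in {text!r}")
    s, f, r = parts
    if s not in S_VALUES or f not in f_values or r not in R_VALUES:
        raise ValueError(f"unknown symbol in {text!r}")
    return FaultPrimitive(s, f, r)


fp = parse_fp("<0r0/0/1>")
fp_osc = parse_fp("<0w0/~/->", f_values=F_VALUES | {"~"})
```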

Cadence Spectre is adopted for circuit-level simulations. The simulation circuit consists of  $3 \times 3$  1T-1M cells and the peripheral circuits (e.g., the write drivers and the sense amplifiers). Variations introduced in the simulation include: 1) Process variations of the transistors' threshold voltage  $V_{\rm th}$ . 2) Process variations of the MTJ. 3) Stochasticity of  $t_{\rm w}$ . To realize these variations, a normal random function is applied, with a deviation of 10% from the nominal value at the  $3\sigma$  corners. The BH defect injection is executed by substituting the defect-free MTJ model with the model of the defective device. The defect strength is modeled by sweeping  $H_{\rm p}$  from 0 to infinity. 1k-cycle Monte Carlo simulations are performed for the sensitizing operations given in TABLE III.

TABLE IV. FAULT MODELING RESULTS OF BH DEFECT.

| Defect strength                    | Sensitized FP | FP name                                |
|------------------------------------|---------------|----------------------------------------|
| $H_{\rm p} \in (0, 5.2kOe)$        | <0w0/~/->*    | Write 0 oscillating fault: w0OF $\sim$ |
|                                    | <1w0/~/->     | Write 0 oscillating fault: w0OF $\sim$ |
|                                    | <1w1/~/->     | Write 1 oscillating fault: w1OF $\sim$ |
|                                    | <0w1/~/->     | Write 1 oscillating fault: w1OF $\sim$ |
| $H_{\rm p} \in (5.2, 6.7kOe)$      | <0w0/~/->*    | Write 0 oscillating fault: w0OF $\sim$ |
|                                    | <1w0/~/->     | Write 0 oscillating fault: w0OF $\sim$ |
|                                    | <1w1/~/->     | Write 1 oscillating fault: w1OF $\sim$ |
| $H_{\rm p} \in (6.7, 9.8kOe)$      | <0w0/~/->*    | Write 0 oscillating fault: w0OF $\sim$ |
|                                    | <1w0/~/->     | Write 0 oscillating fault: w0OF $\sim$ |
| $H_{\rm p} \in (9.8, 13.2kOe)$     | <0w0/~/->*    | Write 0 oscillating fault: w0OF $\sim$ |
| $H_{\rm p} \in (13.2kOe, +\infty)$ | No fault      |                                        |

#### B. Device-Aware Fault Modeling and Analysis

TABLE IV presents the simulation results; here, F of the FP is extended with a new symbol '~' to describe the STT-MRAM oscillation state during write operations. This is an irregular faulty behavior that can be described only by device-aware defect models, not by linear resistance defect models. In total, four FPs are derived from the simulation results:  $<0w0/\sim/->$,  $<0w1/\sim/->$,  $<1w0/\sim/->$,  $<1w1/\sim/->$ . We name these faults 'Write Oscillating Faults'. TABLE IV suggests that BH has a stronger effect on w0 operations, and that 0w0 is the operation most sensitive to BH; this operation causes the MTJ to oscillate for all BH defect sizes and results in a fault. During w0 operations, the MTJ stays in the low resistance state in Phase 2. With a constant  $V_{\rm p}$, the  $RL_{\rm t}$  therefore experiences a larger current and is easier to switch. The case is the same for 0w0 and 0w1 operations.
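The mapping from defect strength to sensitizing operations in TABLE IV can be captured in a small lookup. This sketch only restates the table; the function name `sensitizing_ops` is illustrative, and the treatment of values exactly on an interval boundary is an assumption, since the table lists open intervals.

```python
# (upper bound of H_p in kOe, operations that sensitize a write oscillating fault)
TABLE_IV = [
    (5.2, ("0w0", "1w0", "1w1", "0w1")),
    (6.7, ("0w0", "1w0", "1w1")),
    (9.8, ("0w0", "1w0")),
    (13.2, ("0w0",)),
]


def sensitizing_ops(h_p_koe):
    """Return the operations that trigger oscillation for a given H_p (in kOe)."""
    if h_p_koe <= 0:
        raise ValueError("H_p must be positive")
    for upper_koe, ops in TABLE_IV:
        if h_p_koe < upper_koe:   # boundary handling is an assumption
            return ops
    return ()                     # H_p above 13.2 kOe: no fault


ops_at_4koe = sensitizing_ops(4.0)  # the fitted H_p of Section IV-C lies in the first range
```

Every non-empty entry contains 0w0, which is why a test that targets this single operation covers all listed defect strengths.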

#### VI. TEST SOLUTIONS OF THE BH DEFECT

Inspecting TABLE IV reveals that the detection of the BH defect requires only the detection of the FP  $<0w0/\sim/->$ . A straightforward test solution is the following march test algorithm:  $March - BH = \{ \Updownarrow (w0, r0)^i \}$ . The ' $\Updownarrow$ ' indicates that the addressing direction is irrelevant. The march test element first tries to sensitize the fault by w0, and then reads the state by r0. Here, '*i*' refers to the number of times the march element should be repeated. If the WER =  $P_{wer}$, then the detection probability of BH is  $P_{dt} = 1 - (1 - P_{wer})^i$ . However, since  $P_{wer}$  is always below 100%, repetition alone can never guarantee a 100% detection of BH. For example, if  $P_{wer} = 12\%$, reaching  $P_{dt} \approx 99\%$  requires about i = 35 repetitions ( $1 - 0.88^{35} \approx 98.9\%$ ), and the march test length is 2 \* i \* n = 70n, where n is the array size.
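The detection-probability relation above is easy to check numerically. The following sketch evaluates  $P_{dt}$  for a given repetition count, the smallest repetition count that reaches a target  $P_{dt}$, and the resulting march test length; the function names are illustrative.

```python
import math


def detection_probability(p_wer, repeats):
    """P_dt = 1 - (1 - P_wer)^i for i repetitions of the (w0, r0) element."""
    return 1.0 - (1.0 - p_wer) ** repeats


def min_repeats(p_wer, target):
    """Smallest i such that P_dt >= target (both arguments strictly between 0 and 1)."""
    if not (0.0 < p_wer < 1.0 and 0.0 < target < 1.0):
        raise ValueError("p_wer and target must lie strictly between 0 and 1")
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - p_wer))


def march_length(repeats, n_cells, ops_per_element=2):
    """Total number of operations: 2 * i * n for the (w0, r0) element."""
    return ops_per_element * repeats * n_cells


p35 = detection_probability(0.12, 35)   # roughly 0.989
i_min = min_repeats(0.12, 0.99)         # strict lower bound on i for P_dt >= 99%
```

With  $P_{wer} = 12\%$, 35 repetitions give a detection probability of about 98.9%; reaching at least 99% under this formula takes 37 repetitions.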

One way to reduce the test length is to increase the WER. Three approaches are possible: 1) a higher  $V_p$  or a longer  $t_p$ ; 2) a higher temperature; 3) an external magnetic field. Yet each of the three has limitations. Tan *et al.* [24] report that the WER gradually rises toward a limit as the stress strengthens (i.e., a longer  $t_p$  or a higher  $V_p$ ). The temperature also has little impact on the WER under strong stress, since both the FL and  $RL_t$  work in the precessional regime, in which mainly the current, rather than the temperature, determines  $t_w$  [23]. In addition, heating the device increases the risk of breakdown. Because it can adjust  $t_w$  in the different phases, an external magnetic field appears to be an ideal way to increase the WER. Yet its impact on the WER is unpredictable because of interface effects [19].

A more efficient way to detect BH is to deploy a Design-for-Testability (DFT) circuit that directly detects the STT-MRAM oscillation, as in Fig. 6. First, the ports 'Sense (S)' and 'Detect (D)' are initialized to '0'. Then, we apply the 0w0 operation with a long pulse time to trigger BH, while feeding the write current to the DFT circuit during the operation. For a defect-free device, 'S' is always '0' and 'D' stays at '0'. However, once BH occurs and the STT-MRAM state hops, 'S' turns to '1', which in turn sets 'D' to '1'. Thereafter, even if 'S' returns to '0', 'D' stays at '1', following the logic presented in Fig. 6, and the BH defect can be detected by reading the state of 'D'. This DFT structure can guarantee the detection of BH by sensing the first hop of the MTJ state, but it still has limitations: extracting the write current requires extra circuit design effort, and the resistance ratio between the P and AP states is small during the write operation. The DFT is presented here only as a concept, which will be further explored in our future work.
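The latching behavior of 'D' described above can be illustrated with a short behavioral sketch in Python. It abstracts only the logic (D is set by the first '1' on S and then holds) and assumes that S is sampled at discrete time steps. The function name is hypothetical, and the sketch does not model the circuit of Fig. 6.

```python
def sticky_detect(sense_trace):
    """Return the D value after each S sample.

    Both S and D start at 0. D is set by the first S == 1 and then
    holds 1 regardless of later S values (set-only latch behavior).
    """
    d = 0
    d_trace = []
    for s in sense_trace:
        d = d | s          # once set, D can no longer return to 0
        d_trace.append(d)
    return d_trace

# Defect-free device: S never rises, so D stays at 0.
clean = sticky_detect([0, 0, 0, 0])
# Back-hopping device: S toggles as the MTJ state hops back and forth.
hopping = sticky_detect([0, 0, 1, 0, 1, 0])
```

Reading the final value of D after the long 0w0 pulse then distinguishes the two cases, even when S has already returned to '0'.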

#### VII. CONCLUSION

This paper characterizes the BH defect in STT-MRAMs. A device-aware BH model has been put forward to predict the WER of defective devices. Circuit-level simulations were then conducted, followed by device-aware fault modeling. Four fault primitives were derived to describe the BH-induced faulty behaviors of STT-MRAMs. Finally, test solutions were proposed.

#### ACKNOWLEDGEMENTS

This work is supported by IMEC's Industrial Affiliation Program on STT-MRAM devices.



Fig. 6. DFT for detecting the BH defect

#### REFERENCES

- [1] L. Wei *et al.*, "13.3 A 7Mb STT-MRAM in 22FFL FinFET Technology with 4ns Read Sensing Time at 0.9V Using Write-Verify-Write Scheme and Offset-Cancellation Sensing Technique," in *ISSCC*, 2019, pp. 214–216.
- [2] S. Tehrani, "Status and Outlook of MRAM Memory Technology (Invited)," in *IEDM*, 2006, pp. 1–4.
- [3] K. Lee *et al.*, "1Gbit High Density Embedded STT-MRAM in 28nm FDSOI Technology," in *IEDM*, 2019, pp. 2.2.1–2.2.4.
- [4] W.J. Gallagher *et al.*, "22nm STT-MRAM for Reflow and Automotive Uses with High Yield, Reliability, and Magnetic Immunity and with Performance and Shielding Options," in *IEDM*, 2019, pp. 2.7.1–2.7.4.
- [5] S. Aggarwal *et al.*, "Demonstration of a Reliable 1 Gb Standalone Spin-Transfer Torque MRAM For Industrial Applications," in *IEDM*, 2019, pp. 2.1.1–2.1.4.
- [6] S.M. Nair *et al.*, "Defect injection, fault modeling and test algorithm generation methodology for STT-MRAM," in *ITC*, 2018, pp. 1–10.
- [7] J. Azevedo *et al.*, "A Complete Resistive-Open Defect Analysis for Thermally Assisted Switching MRAMs," *IEEE Trans. Very Large Scale Integr. (VLSI) Syst.*, vol. 22, pp. 2326–2335, 2014.
- [8] L. Wu *et al.*, "MFA-MTJ Model: Magnetic-Field-Aware Compact Model of pMTJ for Robust STT-MRAM Design," *IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst.*, pp. 1–1, 2022.
- [9] A. Chintaluri *et al.*, "A model study of defects and faults in embedded spin transfer torque (STT) MRAM arrays," in *ATS*, 2015, pp. 187–192.
- [10] M. Fieback *et al.*, "Device-Aware Test: a New Test Approach Towards DPPB," in *ITC*, 2019, pp. 1–10.
- [11] M. Taouil *et al.*, "Device Aware Test for Memory Units," 2021, in European patent EP4026128A1.
- [12] L. Wu *et al.*, "Electrical Modeling of STT-MRAM Defects," in *ITC*, 2018, pp. 1–10.
- [13] L. Wu *et al.*, "Characterization, Modeling and Test of Synthetic Anti-Ferromagnet Flip Defect in STT-MRAMs," in *ITC*, 2020, pp. 1–10.
- [14] L. Wu *et al.*, "Characterization, Modeling, and Test of Intermediate State Defects in STT-MRAMs," *IEEE Trans. Comput.*, pp. 1–1, 2021.
- [15] J. Sun *et al.*, "High-bias backhopping in nanosecond time-domain spintorque switches of MgO-based magnetic tunnel junctions," *J. Appl. Phys.*, vol. 105, p. 07D109, 2009.
- [16] S. Rao *et al.*, "STT-MRAM array performance improvement through optimization of Ion Beam Etch and MTJ for Last-Level Cache application," in *IMW*, 2021, pp. 1–4.
- [17] W. Kim *et al.*, "Experimental Observation of Back-Hopping With Reference Layer Flipping by High-Voltage Pulse in Perpendicular Magnetic Tunnel Junctions," *IEEE Trans. Magn.*, vol. 52, pp. 1–4, 2016.
- [18] C. Abert *et al.*, "Back-Hopping in Spin-Transfer-Torque Devices: Possible Origin and Countermeasures," *Phys. Rev. Applied*, vol. 9, p. 054010, 2018.
- [19] C. Safranski *et al.*, "Interface moment dynamics and its contribution to spin-transfer torque switching process in magnetic tunnel junctions," *Phys. Rev. B*, vol. 100, p. 014435, 2019.
- [20] J. Nogués *et al.*, "Exchange bias," *J. Magn. Magn. Mater.*, vol. 192, pp. 203–232, 1999.
- [21] Z. Hou *et al.*, "Dynamics of the reference layer driven by spin-transfer torque: Analytical versus simulation model," *J. Appl. Phys.*, vol. 109, p. 113914, 2011.
- [22] A. Khvalkovskiy *et al.*, "Basic principles of STT-MRAM cell operation in memory arrays," *J. Phys. D*, vol. 46, p. 074001, 2013.
- [23] J. Sun, "Current-driven magnetic switching in manganite trilayer junctions," *J. Magn. Magn. Mater.*, vol. 202, pp. 157–162, 1999.
- [24] J. Tan *et al.*, "Role of temperature, MTJ size and pulse-width on STT-MRAM bit-error rate and backhopping," *Solid-State Electronics*, vol. 183, p. 108032, 2021.