

# Spin Wave Normalization Toward All Magnonic Circuits

Mahmoud, Abdulqader Nael; Vanderveken, Frederic; Adelmann, Christoph; Ciubotaru, Florin; Cotofana, Sorin; Hamdioui, Said

10.1109/TCSI.2020.3028050

**Publication date** 2021

**Document Version** 

Accepted author manuscript

Published in

IEEE Transactions on Circuits and Systems I: Regular Papers

Citation (APA)

Mahmoud, A. N., Vanderveken, F., Adelmann, C., Ciubotaru, F., Cotofana, S., & Hamdioui, S. (2021). Spin Wave Normalization Toward All Magnonic Circuits. IEEE Transactions on Circuits and Systems I: Regular Papers, 68(1), 536-549. Article 9226456. https://doi.org/10.1109/TCSI.2020.3028050


# Spin Wave Normalization Towards all Magnonic Circuits

Abdulqader Mahmoud,<sup>1</sup> Frederic Vanderveken,<sup>2,3</sup> Christoph Adelmann,<sup>3</sup> Florin Ciubotaru,<sup>3</sup> Sorin Cotofana,<sup>1</sup> and Said Hamdioui<sup>1</sup>

- <sup>1)</sup> Delft University of Technology, Department of Quantum and Computer Engineering, 2628 CD Delft, The Netherlands
- <sup>2)</sup> KU Leuven, Department of Materials, SIEM, 3001 Leuven, Belgium
- <sup>3)</sup> Imec, 3001 Leuven, Belgium

The key enabling factor for the utilization of Spin Wave (SW) technology in ultra-low power circuits is the ability to cascade basic SW computation blocks in an energy efficient manner. SW Majority gates, which constitute a universal gate set for this paradigm, operate on phase encoded data and are not input-output coherent in terms of SW amplitude. Thus, their cascading requires information representation conversion from SW to voltage and back, which is by no means energy effective. In this paper, we propose a novel conversion free SW gate cascading scheme that achieves SW amplitude normalization by means of a directional coupler. After introducing the normalization concept, we utilize it in the implementation of three simple circuits and, to demonstrate its potential at a larger scale, of a 2-bit inputs SW multiplier. The proposed structures are validated by means of the Object Oriented Micromagnetic Framework (OOMMF) and GPU-accelerated micromagnetics (MuMax3). Furthermore, we assess the normalization induced energy overhead and demonstrate that the proposed approach consumes 1.25x to 1.5x less energy than its conventional transducer based counterpart. Finally, we introduce a normalization based SW 2-bit inputs multiplier design and compare it with functionally equivalent SW transducer based and 16 nm CMOS designs. Our evaluation indicates that the proposed approach provides 1.34x and 6.25x energy reductions when compared with the conventional approach and the 16 nm CMOS counterpart, respectively, which demonstrates that our proposal is energy effective and opens the road towards the full utilization of the SW paradigm potential and the development of SW only circuits.

## I. INTRODUCTION

The information technology revolution resulted in a huge amount of data that needs to be processed. The processing of these data requires efficient computing platforms, which are usually implemented in CMOS technology<sup>1</sup>. So far, the performance requirements have been met through continuous CMOS downscaling<sup>2</sup>. However, CMOS downscaling has become more difficult due to: (i) the leakage wall<sup>3</sup>, (ii) the reliability wall<sup>4</sup>, and (iii) the cost wall<sup>3,4</sup>, which suggests that Moore's law will soon come to its end. Therefore, new technologies, such as graphene devices<sup>5</sup>, memristors<sup>6-11</sup>, and spintronics<sup>12-17</sup>, are being explored. Spintronic technologies based on magnetization switching<sup>18</sup>, generation of skyrmions<sup>19,20</sup>, rectified tunnel magnetoresistance<sup>21</sup>, and the anomalous Hall effect and the negative differential resistance (NDR) phenomenon<sup>22</sup> in magnetic tunnel junctions require very large current densities, of the order of $10^{11}$ to $10^{12}$ A/m<sup>2</sup>, to operate. A potentially more energy efficient spintronic technology relies on the voltage driven generation and manipulation of Spin Waves<sup>2,17,23,24</sup>. It has three main features, which make it very promising and potentially suitable for ultra-low power applications<sup>2,17</sup>: (i) ultra-low power consumption, because no current flows and thus no Joule heating is present, (ii) acceptable delay, and (iii) scalability, as the SW wavelength can reach down to a few nanometers at rf-frequencies. Therefore, new design methodologies appropriate for spin-wave based circuits, e.g., gate cascading, which is the enabling factor towards the construction of complex SW circuits, are of great interest.

To date, various SW based logic gates have been proposed<sup>23,25-41</sup>. The Mach-Zehnder interferometer was used to design the first experimental SW logic gate<sup>25</sup>. The same approach was used to design XNOR, NAND, and NOR gates<sup>26-28</sup>. Also, a transmission line based three terminal device was employed to build NOT, OR, and AND gates<sup>29-32</sup>. In addition, voltage-controlled XNOR and NAND gates were presented using a re-configurable nano-channel SW device<sup>33</sup>, and two magnon transistors were embedded between the Mach-Zehnder interferometer arms to build an XOR gate<sup>34</sup>. As opposed to the previously mentioned schemes, which encode information in the SW amplitude, alternative buffer, inverter, (N)AND, (N)OR, XOR, and Majority gate designs were proposed that encode the information in the SW phase instead<sup>23</sup>. Moreover, Majority gate designs that optimize the SW transmission efficiency by decreasing back propagation<sup>35-37</sup>, a crossbar structure appropriate for (N)OR gate implementations<sup>38</sup>, and Majority gate physical realizations<sup>39-41</sup> were reported.

However, the direct cascading of two or more such logic gates within the spin wave domain is not straightforward because they are not input-output consistent, i.e., the amplitude of the output SW originating from the interference of the input SWs is input data dependent, which can induce wrong results at the outputs of the following gates. Note that although SW based circuits, e.g., a counter<sup>42</sup>, prime factorization<sup>43</sup>, and a multiplexer<sup>44</sup>, were recently published, all of them rely on the assumption that cascading can be performed without providing actual solutions for it. They either disregarded the issue and considered that SW gates can be directly connected, which in some cases generates wrong results as gate output SWs have input data dependent amplitude levels, or assumed that cascading can be achieved by back-and-forth conversions between the SW and voltage domains, which is a power hungry process that may nullify the energy efficiency promise of the SW based computation paradigm.

In this paper, we enable direct gate cascading within the SW domain by introducing a conversion free SW normalization approach, which opens the road towards magnetic domain only circuit designs. The contribution of this paper can be summarized as follows:

- Enabling spin wave gate cascading through directional coupler: a properly designed directional coupler<sup>45</sup> is utilized to achieve logic gate SW output amplitude normalization and to pass it to the next gate.
- Proposing and analyzing different logic gate cascading structures: Domain conversion free cascading schemes for in-line<sup>41</sup> and fanout enabled ladder shaped<sup>46,47</sup> Majority gates.
- Building a SW based multiplier using directional coupler: We employed the cascading solution to build a 2-bit inputs spin wave multiplier.
- Validating the functionality: OOMMF and MuMax3 simulations are utilized to validate all the proposed structures and evaluate their delay and energy consumption.
- Assessing the structures: While the proposed gate cascading solution consumes a negligible amount of energy, it induces a 150 ns delay overhead, which we reduced to 20 ns by structure downscaling and by using a material with a higher average SW group velocity. In comparison with conversion based cascading, our method provides a 1.25x to 1.5x gate level energy reduction, which for the 2-bit inputs SW multiplier results in 1.34x and 6.25x energy reductions when compared with the SW conventional approach and the 16 nm CMOS counterpart, respectively.

FIG. 1. Spin Wave Parameters.

The paper consists of eight main sections. Section II discusses the basics and background of spin wave technology. Section III introduces and analyzes the gate cascading problem, Section IV explains the proposed solution, and Section V illustrates the construction of cascaded gates and circuits. Section VI explains the simulation platform, the performed simulations, and the utilized metrics. Section VII presents the simulation results, provides a performance comparison using the 2-bit inputs SW multiplier as discussion vehicle, and provides insight into variability and thermal effects on SW gate functionality. Finally, Section VIII concludes the paper.

## II. SPIN WAVE BASICS AND BACKGROUND

This section provides basic insight into the spin-wave fundamentals and spin-wave based computation paradigm.

### A. Spin Wave Fundamentals

FIG. 2. a) Spin Wave Device, b) Constructive and Destructive Interference.

A spin wave is the collective excitation of the magnetization in the magnetic system<sup>48</sup>. The magnetization precessional motion can be described by using the Landau-Lifshitz-Gilbert equation<sup>49,50</sup>:

$$\frac{d\vec{m}}{dt} = -|\gamma|\mu_0 \left(\vec{m} \times \vec{H}_{eff}\right) + \frac{\alpha}{M_s} \left(\vec{m} \times \frac{d\vec{m}}{dt}\right),\tag{1}$$

where  $\alpha$  is the damping constant,  $\gamma$  the gyromagnetic ratio,  $M_s$  the saturation magnetization,  $\vec{m}$  the magnetization, and  $H_{eff}$  the effective field. This effective field is the summation of all different field contributions that affect the magnetization. Considering the most common interactions, one obtains

$$H_{eff} = H_{ext} + H_{ex} + H_{demag} + H_{ani}, \tag{2}$$

where $H_{ext}$ is the external field, $H_{ex}$ the exchange field, $H_{demag}$ the demagnetizing field, and $H_{ani}$ the magneto-crystalline anisotropy field.

Spin waves can be characterized by amplitude A, phase $\phi$, frequency f (the number of precession cycles a spin completes per unit time), wavelength $\lambda$ (the shortest distance between two spins that exhibit the same behaviour), and wavenumber $k = \frac{2\pi}{\lambda}$ (the phase change per unit length, i.e., $2\pi$ for each wavelength travelled), as can be observed in Figure 1.

### B. Spin Wave Computing Paradigm

Figure 2a presents a spin-wave logic device. It consists of four regions: I, the excitation stage, where a spin wave is excited by, e.g., an antenna or a Magneto-Electric (ME) cell; B, the waveguide through which the spin wave propagates; FR, the functional region, where the spin wave can be amplified, normalized, or made to interfere with other spin waves; and O, the detection stage, where the result is detected and converted into voltage by, e.g., an antenna or an ME cell<sup>23,48,51</sup>. Note that SWs can be used as data carriers because, during their excitation, information can be encoded into their amplitude or phase at different frequencies<sup>42,52</sup>. In addition, SW interference can be utilized as the underlying principle of SW computing strategies that do not follow the well-established Boolean algebra paradigm. To gain insight into this operation principle, we make use of the interference of two SWs as discussion vehicle. Their interference is constructive if they are in phase, $\Delta \phi = 0$, and destructive if they are out of phase, $\Delta \phi = \pi$, as depicted in Figure 2b. Subsequently, assuming that logic 0/1 is represented by a spin wave with phase $0/\pi$ and that more than two waves coexist in the same waveguide, the majority principle governs their interference. Assuming, for example, that 3 SWs reach the FR, the resulting SW has a phase of 0 if at most one of them has a phase of $\pi$, and a phase of $\pi$ otherwise, which mimics the 3-input Majority gate behaviour. Note that while in the SW domain a 3-input Majority can be evaluated with one device only, its CMOS implementation requires 18 transistors<sup>23,53</sup>, which clearly indicates that an SW based implementation is potentially more compact and energy effective than its CMOS counterparts.
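The majority principle described above can be illustrated with a minimal phasor sketch. It assumes ideal, lossless, linear superposition of equal-amplitude waves; the function names (`sw_phasor`, `decode_phase`, `maj3_by_interference`) are hypothetical and only serve this illustration, they are not part of the micromagnetic simulations used in the paper.

```python
import cmath
import math
from itertools import product

def sw_phasor(bit, amplitude=1.0):
    """Phase encoding: logic 0 -> phase 0, logic 1 -> phase pi."""
    return amplitude * cmath.exp(1j * math.pi * bit)

def decode_phase(wave):
    """Phase detection: a wave closer to phase pi (negative real part) reads as logic 1."""
    return 1 if wave.real < 0 else 0

def maj3_by_interference(a, b, c):
    """Superpose three equal-amplitude waves and read the phase of the result."""
    return decode_phase(sw_phasor(a) + sw_phasor(b) + sw_phasor(c))

# Print the resulting amplitude (in units of A) and decoded logic value for all inputs.
for a, b, c in product((0, 1), repeat=3):
    total = sw_phasor(a) + sw_phasor(b) + sw_phasor(c)
    print(a, b, c, "-> amplitude", round(abs(total)), "logic", maj3_by_interference(a, b, c))
```

In this idealized picture the decoded phase always follows the majority of the inputs, while the output amplitude is either A or 3A depending on the input data.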

## III. SPIN WAVE GATE CASCADING CHALLENGE

To evaluate complex Boolean functions, one needs to be able to interconnect spin wave gates to form the required circuit. However, directly cascading Majority or any other type of SW gates may produce wrong results. To clarify this issue, let us assume the situation in Figure 3a, where a 3-input Majority (MAJ3) gate output is connected to one of the inputs of another MAJ3 gate. All input SWs are excited with the same amplitude A and frequency f, and a 0 phase corresponds to logic 0 and a $\pi$ phase to logic 1. Given that MAJ3 operation is governed by SW interference, both the amplitude and the phase of the SW gate inputs contribute to the output SW parameters. While from the point of view of an individual gate the output value is solely determined by the output SW phase, this is no longer the case when that output is utilized as input for a follow-up gate. Figures 3b and 3c present the SW interferences within the circuit when $I_1I_2I_3I_4I_5 = 00011$ and $I_1I_2I_3I_4I_5 = 00111$, respectively. As one can observe in Figure 3b, the spin waves excited at $I_1$, $I_2$, and $I_3$ interfere constructively and produce on WG D a spin wave with the same phase as $I_1$, $I_2$, and $I_3$, but with a 3A amplitude (strong majority). Subsequently, the WG D SW interacts with the $I_4$ and $I_5$ SWs in the second MAJ3 gate, which produces an output SW with amplitude A and phase 0, which is wrong given that MAJ3(0,1,1) = 1. This wrong result is induced by the fact that the MAJ3 gate operates properly only on equal amplitude SWs, which is not the case for $I_1I_2I_3I_4I_5 = 00011$. Figure 3c presents the $I_1I_2I_3I_4I_5 = 00111$ case, in which the first MAJ3 produces an A amplitude and phase 0 SW (weak majority) and the second gate produces the correct result as expected. Thus, cascading MAJ3 gates may induce wrong output results when the driving gate produces a strong majority 0 or 1 output.

FIG. 3. a) Cascaded MAJ3 Gates, Spin Wave Waveform Analysis at b) $I_1I_2I_3I_4I_5$=00011, c) $I_1I_2I_3I_4I_5$=00111.

FIG. 4. Cascaded In-Line MAJ3 Gates.

FIG. 5. Cascaded In-Line MAJ3 Gates Simulation Results.

To clarify things even more, we built the structure depicted in Figure 4, which corresponds to two cascaded MAJ3 gates, and evaluated its behaviour by means of OOMMF simulations. Figure 5 presents the OOMMF results when the parameters mentioned in Section V are utilized. Three different cases were tested: $I_1I_2I_3I_4I_5 = 00000$, $I_1I_2I_3I_4I_5 = 00111$, and $I_1I_2I_3I_4I_5 = 00011$. In the figure, red represents logic 0 and blue logic 1. As can be observed from the figure, $I_1I_2I_3I_4I_5 = 00000$ results in an output O = 0, while $I_1I_2I_3I_4I_5 = 00111$ results in an output O = 1. However, in the case of $I_1I_2I_3I_4I_5 = 00011$, the output is between logic 0 and logic 1 as a result of the strong 0 generated by the first MAJ3 gate (SW with 3A amplitude). Thus, as the theoretical analysis also suggested, wrong results are generated, which calls for the augmentation of the MAJ3 gate with an amplitude normalizer able to enable SW gate cascading and, by implication, circuit design in the spin wave domain.
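The two analytical cases of this section can be reproduced with a small sketch. As a simplifying assumption, each wave is modelled by a signed real amplitude (+A for phase 0, -A for phase $\pi$) and interference by plain addition; propagation loss, damping, and geometry are ignored, and all function names are hypothetical.

```python
def sw(bit, amplitude=1.0):
    """Signed amplitude: +A for phase 0 (logic 0), -A for phase pi (logic 1)."""
    return -amplitude if bit else amplitude

def decode(wave):
    """Phase detection on the signed amplitude: negative means phase pi, i.e. logic 1."""
    return 1 if wave < 0 else 0

def maj(a, b, c):
    """Reference Boolean majority."""
    return int(a + b + c >= 2)

def cascade_without_normalizer(i1, i2, i3, i4, i5):
    """Feed the raw output of the first MAJ3 (amplitude A or 3A) into the second MAJ3."""
    wg_d = sw(i1) + sw(i2) + sw(i3)
    return decode(wg_d + sw(i4) + sw(i5))

for bits in [(0, 0, 0, 1, 1), (0, 0, 1, 1, 1)]:
    expected = maj(maj(*bits[:3]), bits[3], bits[4])
    print(bits, "got", cascade_without_normalizer(*bits), "expected", expected)
```

In this toy model the strong-majority case 00011 decodes to the wrong value, whereas the weak-majority case 00111 decodes correctly, in line with the waveform analysis of Figure 3.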

## IV. PROPOSED SW GATE CASCADING SOLUTION

This section first introduces the proposed gate cascading concept and its operation principles. Thereafter, it demonstrates its capability to circumvent the problem presented in the previous section and illustrated in Figure 3.

### A. Proposed SW Gate Cascading Concept

The proposed gate cascading solution relies on the placement of a spin wave amplitude normalizer between the cascaded Majority gates. The normalizer is a properly designed directional coupler<sup>45</sup> able to adjust the driving Majority gate output SW amplitude to A in case of a strong majority (3A), or to leave it unchanged in weak majority cases, before passing it to the next Majority gate, as presented in Figure 6a. This behaviour is achieved by making use of the nonlinear properties of high amplitude SWs, which cause a shift in the dispersion relation, which in turn induces a wavelength shift. When two waveguides are placed close to each other, they are dipolarly coupled and form a directional coupler, as presented in Figure 6b, which enables a wavelength dependent energy transfer between the two waveguides. Thus, by properly controlling this energy transfer via the nonlinear characteristics, the spin wave amplitude can be normalised to the desired value, i.e., A in our case.

The equations describing the dispersion relations and energy transfer of the normaliser element are given in the following. A detailed derivation of the equations can be found in<sup>45,54,55</sup>. When two waveguides are placed close to each other, two spin wave modes exist. One mode has a symmetric profile over both waveguides whereas the other has an antisymmetric profile over the two waveguides. The dispersion relation of both modes is given by

$$f_o(k_x) = \frac{1}{2\pi} \sqrt{\Omega^{yy} \Omega^{zz}} \tag{3}$$

and

$$f_{s,as}(k_x) = \frac{1}{2\pi} \sqrt{(\Omega^{yy} \pm \omega_M F_{kx}^{yy}(d))(\Omega^{zz} \pm \omega_M F_{kx}^{zz}(d))},\tag{4}$$

where $f_o(k_x)$ is the SW dispersion relation in a single waveguide, $f_{s,as}(k_x)$ the symmetric and anti-symmetric dispersion relations for spin waves in coupled waveguides, $\Omega^{ii} = \omega_H + \omega_M(\lambda_{ex}^2 k_x^2 + F_{kx}^{ii}(0))$ with $i = y, z$, $\omega_H = \gamma B_{ext}$, $\omega_M = \gamma \mu_o M_s$, $M_s$ the saturation magnetization, $\gamma$ the gyromagnetic ratio, $\mu_o$ the vacuum permeability, $\lambda_{ex} = \sqrt{2A_{ex}/(\mu_o M_s^2)}$ the exchange length, $A_{ex}$ the exchange constant, $d = w + \delta$ the distance between the centres of the two waveguides, w the waveguide width, $\delta$ the gap between the two waveguides, and $F_{kx}$ the tensor that describes the dynamical magneto-dipolar interaction (introduced in<sup>45,54,55</sup>)

$$F_{kx}^{yy}(d) = \frac{1}{2\pi} \int \frac{|\sigma|^2 k_y^2}{\tilde{w}k^2} (1 - \frac{1 - e^{-kh}}{kh}) e^{ik_y d} dk_y, \tag{5}$$

$$F_{kx}^{zz}(d) = \frac{1}{2\pi} \int \frac{|\sigma|^2}{\tilde{w}} \frac{1 - e^{-kh}}{kh} e^{ik_y d} dk_y, \tag{6}$$

where $k = \sqrt{k_x^2 + k_y^2}$, h is the material thickness, $\sigma$ the Fourier transform of the spin wave profile across the width of the waveguide, and $\tilde{w}$ the normalization constant of the mode profile. When the spins are fully unpinned at the waveguide edges, $\tilde{w}$ equals the real waveguide width and $\sigma = w \operatorname{sinc}(k_y w/2)$.

When a spin wave is excited at frequencies higher than the anti-symmetric mode minimum frequency, two spin wave modes are excited at the same time: a symmetric mode with wavenumber $k_s$ and an anti-symmetric mode with wavenumber $k_{as}$. As a result of the interference between them, the overall spin wave energy resonantly transfers from one waveguide to the other after the SW has propagated over a particular distance $L_c$, as depicted in Figure 6b<sup>45,56–58</sup>. This distance $L_c$ is called the coupling length, and depends on different parameters such as the SW wavelength, the applied magnetic field, the spacing between the waveguides, the waveguide geometrical size, and the SW amplitude<sup>45</sup>. The coupling length is given by<sup>45</sup>

$$L_c = \frac{\pi}{|k_s - k_{as}|}. \tag{7}$$

FIG. 6. a) Proposed Spin Wave Gate Cascading Solution, b) Directional Coupler, c) Dispersion Relation (DR) of Isolated (I), Symmetric (S) and Asymmetric (As) Spin Wave Waveguide (WG) Modes at the Linear Region, d) Energy Transmission Ratio between Coupled Waveguides with $L_w$ = 3 $\mu$m, e) Dispersion Relation of Single, Symmetric and Asymmetric Spin Wave Waveguide Modes at the Non-linear Region (with Frequency Shift Effect).

The distribution of the SW energy over the two waveguides at the end of the normaliser depends on the coupling length $L_c$ and the length of the coupled waveguides $L_w$. The proportion of energy in the first waveguide after a distance $L_w$ is given by<sup>45</sup>

$$\frac{O_1}{O_1 + O_2} = \cos^2\left(\frac{\pi L_w}{2L_c}\right),\tag{8}$$

where $O_1$ and $O_2$ are the output energies of the first and second waveguide, as also graphically visualised in Figure 6d.
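As a quick numerical illustration of Equations (7) and (8), the sketch below evaluates the coupling length and the energy fraction remaining in the first waveguide. The wavenumbers and lengths used in the example are hypothetical placeholders chosen only to exercise the formulas; they are not values from Table I or from the simulations.

```python
import math

def coupling_length(k_s, k_as):
    """Equation (7): L_c = pi / |k_s - k_as| (wavenumbers in rad/m, result in m)."""
    return math.pi / abs(k_s - k_as)

def fraction_in_first_waveguide(L_w, L_c):
    """Equation (8): share of the energy left in the first waveguide after length L_w."""
    return math.cos(math.pi * L_w / (2.0 * L_c)) ** 2

# Hypothetical mode wavenumbers, for illustration only.
L_c = coupling_length(k_s=2.0e7, k_as=1.9e7)
for ratio in (0.5, 1.0, 2.0):
    share = fraction_in_first_waveguide(ratio * L_c, L_c)
    print(f"L_w = {ratio} L_c -> fraction in first waveguide = {share:.3f}")
```

The periodicity of Equation (8) is visible directly: the energy is fully transferred to the second waveguide at $L_w = L_c$ and returns to the first waveguide at $L_w = 2L_c$.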

As long as the SW amplitude is low, the nonlinear effects are limited. However, as the spin wave amplitude increases, the nonlinearity affects the spin wave dispersion relation, and causes a frequency shift. This dispersion relation corresponding to nonlinear spin waves is given by

$$f_{s,as}^{(nl)}(k_x) = f_{s,as}^{(0)}(k_x) + T_{kx}|a_{kx}|^2, \tag{9}$$

where $a_{kx}$ is the spin wave amplitude and $T_{kx}$ the spin wave nonlinear frequency shift coefficient, which can be calculated by<sup>45,59,60</sup>

$$T_{kx} = \frac{\omega_H - A_{kx} + \frac{B_{kx}^2}{2\omega_o^2} \left(\omega_M (4\lambda_{ex}^2 k_x^2 + F_{2kx}^{xx}(0)) + 3\omega_H\right)}{2\pi} \tag{10}$$

with

$$A_{kx} = \omega_H + \frac{\omega_M}{2} (2\lambda_{ex}^2 k_x^2 + F_{kx}^{yy}(0) + F_{kx}^{zz}(0)), \tag{11}$$

$$B_{kx} = \frac{\omega_M}{2} (F_{kx}^{yy}(0) - F_{kx}^{zz}(0)), \tag{12}$$

and

$$F_{2kx}^{xx}(d) = \frac{1}{2\pi} \int \frac{|\sigma|^2 4k_x^2}{\tilde{w}k^2} \left(1 - \frac{1 - e^{-kh}}{kh}\right) e^{ik_y d} dk_y \tag{13}$$

with $k = \sqrt{4k_x^2 + k_y^2}$.

This is also graphically presented in Figure 6e<sup>45,54</sup>. Note that the parameters we utilize for determining these dispersion relations are summarized in Table I.

The nonlinear frequency shift also affects the distribution of the energies over the two waveguides as indicated by

$$\frac{O_1}{O_1 + O_2} = \cos^2\left(\frac{\pi L_w}{2L_c} - \frac{\pi L_w}{2L_c^2} \frac{\partial L_c}{\partial f} T_{kx} |a_{kx}|^2\right). \tag{14}$$

As is clear from Equation (14), the nonlinear effects of the spin waves strongly influence the power distribution over the two waveguides. Hence, the directional coupler exhibits high sensitivity to spin wave amplitude changes. As a result, if strong coupling and high sensitivity to spin wave amplitude changes are required, the directional coupler must be long and the gap between the two coupled waveguides must be small. For example, if 0%, 50%, and 100% of the input spin wave energy should transfer to the second waveguide when its amplitude is 2A, 3A, and 4A, respectively, $L_w$ should be equal to 3 $\mu$m, the distance between the coupled waveguides (DW) 10 nm, the Yttrium Iron Garnet (YIG) waveguide thickness 30 nm and width 100 nm, the wavelength 340 nm, and the frequency 2.282 GHz<sup>45</sup>. These values are material dependent; thus, they change when another material is utilized<sup>45</sup>.
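To make the amplitude dependence of Equation (14) concrete, the following sketch evaluates it for a few amplitudes. All numerical values (coupling length, slope $\partial L_c/\partial f$, and $T_{kx}$) are hypothetical placeholders picked only to show the trend; in an actual design they would follow from Equations (3)-(13) and the material parameters.

```python
import math

def fraction_nonlinear(L_w, L_c, dLc_df, T_kx, a_kx):
    """Equation (14): energy share in the first waveguide including the nonlinear shift.

    L_w, L_c in m; dLc_df in m/Hz; T_kx in Hz per unit squared amplitude; a_kx dimensionless.
    """
    linear_term = math.pi * L_w / (2.0 * L_c)
    nonlinear_term = math.pi * L_w / (2.0 * L_c ** 2) * dLc_df * T_kx * abs(a_kx) ** 2
    return math.cos(linear_term - nonlinear_term) ** 2

# Hypothetical illustration values, not taken from the paper.
L_w, L_c = 3.0e-6, 1.5e-6
dLc_df, T_kx = 1.0e-14, -1.0e9
for a in (0.0, 0.05, 0.1, 0.2):
    print(f"|a| = {a:.2f} -> fraction in first waveguide = "
          f"{fraction_nonlinear(L_w, L_c, dLc_df, T_kx, a):.3f}")
```

For zero amplitude the expression reduces to Equation (8); as the amplitude grows, the argument of the cosine shifts and the energy split between the two waveguides changes, which is the effect the normalizer exploits.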

Note that such a directional coupler can also be utilized for other purposes, e.g., as a frequency multiplexer<sup>45</sup>. However, in this paper, we concentrate on its utilization as amplitude normalizer to enable gate cascading within the spin wave domain.

### B. DC based SW Gate Cascading Implementation

FIG. 7. (a) Proposed Gates Cascading Solution. Spin Wave Waveform Analysis (b) $I_1I_2I_3I_4I_5$=00011, (c) $I_1I_2I_3I_4I_5$=00111.

Figure 7a revisits the situation in Figure 3a and augments the waveguide connecting the two Majority gates with a directional coupler acting as amplitude normalizer. The spin waves excited at $I_1$, $I_2$, and $I_3$ interfere constructively or destructively depending on their phases, and the directional coupler normalizes the output of the first MAJ3 gate or leaves it unchanged depending on whether it signals a strong or a weak majority. If the output SW amplitude is greater than a predefined threshold, in our case the input amplitude value A, then it is normalized to A while preserving the SW phase. Otherwise, no normalization occurs and only a tiny portion of the SW power is transferred to the second waveguide due to the coupling effect. The two input combinations we previously utilized to explain the gate cascading issue, i.e., $I_1I_2I_3I_4I_5$=00011 and $I_1I_2I_3I_4I_5$=00111, are revisited to demonstrate that the directional coupler enables proper gate cascading. Assuming that all input spin waves are excited with the same amplitude A and frequency, the ones excited at $I_1$, $I_2$, and $I_3$ interfere constructively in the first case, resulting in a spin wave with 0 phase and 3A amplitude, as depicted by WG D BN in Figure 7b. Given that the SW amplitude is greater than A, it is normalized by the directional coupler to A, producing WG D AN in Figure 7b. At the second Majority gate, WG E and WG F interfere constructively, and their resultant interferes destructively with WG D AN. As a result of the overall interference process, the output SW corresponds to a logic 1, as it should. In the other case, the $I_1$ SW constructively interferes with the $I_2$ SW, and their resultant destructively interferes with the $I_3$ SW, resulting in a spin wave with 0 phase and amplitude A in WG D BN. Since the amplitude equals the threshold, no normalization occurs and the WG D AN spin wave approximately equals the WG D BN SW, as depicted in Figure 7c. Then the spin waves excited at $I_4$ and $I_5$ interfere constructively with each other and destructively with the spin wave in WG D AN, which results in a $\pi$ phase and amplitude A SW, i.e., a logic 1 as expected.

Note that the above holds true for all logic gate types, i.e., (N)AND, and (N)OR, and the proposed solution can be utilized to normalize the output of these gates if cascaded with other gates.
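The effect of the normalizer on the cascaded MAJ3 pair can be checked exhaustively with a toy model. The sketch assumes an idealized normalizer that clips any amplitude above A to exactly A while preserving the phase, neglects the small power leakage into the second waveguide for weak majorities, and represents waves as signed real amplitudes; all names are hypothetical.

```python
from itertools import product

def sw(bit, amplitude=1.0):
    """Signed amplitude: +A for phase 0 (logic 0), -A for phase pi (logic 1)."""
    return -amplitude if bit else amplitude

def decode(wave):
    """Phase detection on the signed amplitude: negative means logic 1."""
    return 1 if wave < 0 else 0

def maj(a, b, c):
    """Reference Boolean majority."""
    return int(a + b + c >= 2)

def normalize(wave, threshold=1.0):
    """Idealized directional coupler: clip amplitudes above the threshold, keep the sign."""
    if abs(wave) > threshold:
        return threshold if wave > 0 else -threshold
    return wave

def cascade_with_normalizer(i1, i2, i3, i4, i5):
    wg_d_bn = sw(i1) + sw(i2) + sw(i3)   # first MAJ3 output, amplitude A or 3A
    wg_d_an = normalize(wg_d_bn)          # amplitude A in both cases
    return decode(wg_d_an + sw(i4) + sw(i5))

mismatches = [bits for bits in product((0, 1), repeat=5)
              if cascade_with_normalizer(*bits) != maj(maj(*bits[:3]), bits[3], bits[4])]
print("input combinations with wrong output:", mismatches)
```

Under these idealizations, every one of the 32 input combinations yields MAJ(MAJ($I_1$, $I_2$, $I_3$), $I_4$, $I_5$), including the 00011 case that fails without normalization.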



FIG. 8. In-Line MAJ3 Cascaded Gates.

## V. BUILDING CASCADED SW GATES AND CIRCUITS

In order to validate our proposal and demonstrate its potential towards building spin wave circuits, we design three complex gates that make use of it. While most of the time circuit design requires the utilization of one gate output as input for only one follow-up gate, there are situations when that output has to drive more than one gate input. To cover the most common situations encountered in logic circuit implementations, we selected three different structures for demonstration purposes: (i) in-line single output MAJ3 gates, (ii) fully cascaded dual output ladder MAJ3 gates, and (iii) partially cascaded dual output ladder MAJ3 gates. While the first structure (Figure 8) can provide only one output, the second (Figure 9) and third (Figure 10) structures can provide two outputs. In addition, the three inputs in the first structure contribute approximately equally to the output, which is not the case in the second and third structures, where different inputs might therefore have to be excited at different energy levels. Note that the introduced approach is scalable and can be applied to SW gates with more outputs, but such designs are beyond the goal of this manuscript. Further, the proposed structures can mimic (N)AND, (N)OR, and X(N)OR gate behaviour as indicated in<sup>47</sup>. Additionally, in order to assess the potential of the cascading approach at circuit level, we instantiate the 2-bit inputs spin wave multiplier presented in Figure 11, whose spin wave domain only design is not possible without the proposed approach.

### A. Cascaded In-Line MAJ3 Gates

The structure in Figure 7a provides a generic gate cascading solution containing multiple bent regions, which are not SW propagation "friendly". To minimize them, we implemented the compound of two cascaded in-line Majority gates with one bent region, as depicted in Figure 8. Note that the normalized output of the first Majority gate acts as the third input of the second Majority gate.

To guarantee proper results, the structure dimensions must fulfil certain constraints as follows. If SWs should constructively interfere when they have the same phase and destructively otherwise, $d_1 = d_2 = \ldots = d_5 = n \times \lambda$, where $n = 0, 1, 2, 3, \ldots$. If the opposite behaviour is desired, i.e., SWs constructively interfere if they are out of phase and destructively otherwise, $d_1 = d_2 = \ldots = d_5 = (n + \frac{1}{2}) \times \lambda$.
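The role of the distances $d_i$ can be illustrated by tracking the phase $k d = 2\pi d/\lambda$ that a wave accumulates while travelling to the next input. The sketch assumes that $d$ is the spacing between two adjacent inputs and that propagation is lossless; the 340 nm wavelength is the value quoted in Section IV, and the function name is hypothetical.

```python
import cmath
import math

def superpose_at_second_input(bit1, bit2, d, wavelength):
    """Amplitude (in units of A) where the wave from input 1 meets the wave of input 2."""
    k = 2.0 * math.pi / wavelength                       # wavenumber
    wave1 = cmath.exp(1j * (math.pi * bit1 + k * d))      # phase picked up over distance d
    wave2 = cmath.exp(1j * math.pi * bit2)                # excited in place
    return abs(wave1 + wave2)

lam = 340e-9
for d, label in ((2.0 * lam, "d = 2 lambda"), (2.5 * lam, "d = 2.5 lambda")):
    same = superpose_at_second_input(0, 0, d, lam)
    opposite = superpose_at_second_input(0, 1, d, lam)
    print(f"{label}: same phase -> {same:.2f}, opposite phase -> {opposite:.2f}")
```

With $d = n\lambda$ equal-phase inputs add up and opposite-phase inputs cancel, whereas $d = (n + \frac{1}{2})\lambda$ swaps the two behaviours, which is the constraint stated above.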

The output of the first Majority gate must be normalized to the amplitude of the second Majority gate inputs. Assuming that all input SWs have an amplitude of A, the output of the first Majority gate must be normalized to A in case it reports a strong majority result, i.e., a 3A amplitude SW. Therefore, if the output amplitude is A, no normalization is required, whereas if the output amplitude is 3A, a normalization is performed such that 66% of the spin wave power moves into the second waveguide towards X and only 33% of it passes to the second Majority gate. To obtain this behaviour, the directional coupler is designed by making use of Equations (3)-(14) while taking into consideration different parameters, including the applied magnetic field, the spacing between the waveguides, the dimensions of the waveguides, the static magnetization orientation, and the spin wave wavelength, frequency, and amplitude.

The output position must be determined accurately to obtain the desired results, i.e., MAJ3 and inverted MAJ3 are obtained when  $d_6 = n \times \lambda$  and  $d_6 = (n + \frac{1}{2}) \times \lambda$ , respectively. Moreover, depending on a predefined phase, the output value can be phase detected, i.e.,  $\Delta \phi = 0$  represents logic 0 and  $\Delta \phi = \pi$  logic 1. By following the same line of reasoning as in Section IV.B one can easily check the correct behaviour of the two in-line cascaded gates, which is also demonstrated by the simulation results presented in Section VII Figure 12.

### B. Fully Cascaded Ladder MAJ3 Gates

As the efficient implementation of real life circuits requires gates with fanout capabilities a fanout of 2 ladder shaped MAJ3 gate has been introduced in<sup>46</sup>. Before discussing the augmentation of such a gate with directional couplers we briefly discuss its operation principle.

The upper part of the structure presented in Figure 9 constitutes a MAJ3 gate that is able to evaluate $MAJ(I_1, I_2, I_3)$ and $MAJ(I_1, I_2, I_4)$ in parallel; thus, if $I_3 = I_4$, the two values are equal and the gate exhibits a fanout of 2. As discussed in<sup>46</sup>, the waveguide topology and dimensions are determined in such a way that the input SWs can properly interfere and generate the correct output values, according to the Majority function truth table, and the SW present in the left/right arm before the directional coupler carries the $MAJ(I_1, I_2, I_3)/MAJ(I_1, I_2, I_4)$ value. Simply speaking, the MAJ3 gate operates as follows: (i) At $I_1$, $I_2$, $I_3$, and $I_4$, SWs are excited with suitable phase, i.e., phase 0 for logic 0 and phase $\pi$ for logic 1, (ii) Excited SWs propagate through the horizontal and vertical waveguides, (iii) At the "meeting" points, they interfere constructively or destructively depending on their phases, and (iv) Finally, the resultant SWs propagate downwards through the left and right arms. Note that while the ladder structure is meant to compute a Majority function, it can also evaluate basic Boolean functions. If output based phase detection is in place, which means that the output phase is compared with a predefined phase and a $0/\pi$ phase difference means logic 0/1, $(N)AND = MAJ(I_1, I_2, 0)$ and $(N)OR = MAJ(I_1, I_2, 1)$. In contrast, if threshold detection is utilized, i.e., the output is logic 1 if the output spin wave magnetization is greater than a predefined threshold and logic 0 otherwise, then $XOR = MAJ(I_1, I_2, 0)$.

FIG. 9. Fully Cascaded Ladder MAJ3 Gates.

FIG. 10. Partially Cascaded Ladder MAJ3 Gates.
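The phase based Majority evaluation described above can be illustrated with a small phasor sketch. It is an idealized model and not the micromagnetic simulation used in this paper: it assumes lossless, equal-amplitude SWs that meet with no additional phase shift (all distances equal to $n\lambda$), and the function names `sw_phasor`, `maj3`, `and2`, and `or2` are illustrative only.

```python
import cmath
import math


def sw_phasor(bit, amplitude=1.0):
    """Phase-encode a logic value: phase 0 for logic 0, phase pi for logic 1."""
    return amplitude * cmath.exp(1j * math.pi * bit)


def maj3(a, b, c):
    """Superpose three equal-amplitude SWs (idealized, lossless model).

    Returns the phase-detected logic value and the resulting amplitude.
    """
    total = sw_phasor(a) + sw_phasor(b) + sw_phasor(c)
    bit = 1 if total.real < 0 else 0  # phase pi (negative real part) -> logic 1
    return bit, abs(total)


def and2(a, b):
    """AND realized as MAJ(a, b, 0) with phase detection."""
    return maj3(a, b, 0)[0]


def or2(a, b):
    """OR realized as MAJ(a, b, 1) with phase detection."""
    return maj3(a, b, 1)[0]


if __name__ == "__main__":
    for a in (0, 1):
        for b in (0, 1):
            print(a, b, "AND:", and2(a, b), "OR:", or2(a, b))
    # Unanimous and non-unanimous inputs give different output amplitudes.
    print("amplitude 000:", round(maj3(0, 0, 0)[1], 3))
    print("amplitude 001:", round(maj3(0, 0, 1)[1], 3))
```

In this simplified picture the output phase always follows the majority, but the output amplitude depends on whether the inputs are unanimous, which is the amplitude mismatch that the directional coupler based normalization is meant to remove before the next gate.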

To make the FO2 MAJ3 gate outputs directly connectable as inputs to the following SW gates, they have to be normalized by means of 2 directional couplers, as presented in Figure 9. The circuit in the Figure operates as follows: (i) At $I_1$, $I_2$, $I_3$, $I_4$, $I_5$, and $I_6$, SWs are excited with suitable phase, (ii) The excited spin waves propagate horizontally and vertically and, at the intersection points, they interfere constructively or destructively depending on the excited SWs phases in both arms, (iii) The resulting spin waves from the first Majority gate propagate toward the couplers to be normalized, (iv) The normalized SWs propagate downward to interfere with the spin waves excited at $I_5$ and $I_6$, and (v) Finally, the resulting SWs propagate toward $O_1$ and $O_2$ such that $O_1 = MAJ(MAJ(I_1, I_2, I_3), I_5, I_6)$ and $O_2 = MAJ(MAJ(I_1, I_2, I_4), I_5, I_6)$. Note that in case $I_3 = I_4$ the two outputs are equal, thus the gate compound exhibits a fanout of 2, but when $I_3 \neq I_4$ the circuit evaluates two different functions, which can be beneficial for circuit complexity.

To guarantee correct behaviour, the input SWs must have the same amplitude and wavelength $\lambda$, which, to simplify the interference pattern, must be greater than the waveguide width $w$. The structure dimensions $d_i$, $i=1,2,\ldots,6$, must be determined in terms of $\lambda$. For instance, if SWs have to interfere constructively when they have the same phase and destructively when they are out of phase, $d_1, d_2, \ldots, d_6$ must be equal to $n\lambda$, where $n=1,2,3,\ldots$. However, if the other way around is desired, i.e., SWs with the same phase should interfere destructively and SWs that are out of phase constructively, $d_1, d_2, \ldots, d_6$ must be equal to $(n+\frac{1}{2})\lambda$, where $n=1,2,3,\ldots$. Additionally, the outputs can be captured at $O_1$ and $O_2$ located at $d_7$ and $d_8$ from the last interference point, which should be $n\lambda$ or $(n+\frac{1}{2})\lambda$ if the non-inverted or inverted output is desired, respectively. Note that the couplers needed to normalize the outputs of the first Majority gates are designed in the same way as described in the previous section.
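The $n\lambda$ versus $(n+\frac{1}{2})\lambda$ rule can be expressed as a short helper that classifies a propagation distance by the phase it adds. This is only a sketch: the function name `classify_distance` and the tolerance are illustrative, and the 340 nm wavelength is taken from the YIG simulation parameters reported in Section VI.

```python
def classify_distance(d_nm, wavelength_nm, tol=1e-6):
    """Classify a propagation distance by the phase shift it introduces.

    Returns "non-inverting" for d = n * lambda (phase shift 0),
    "inverting" for d = (n + 1/2) * lambda (phase shift pi),
    and "other" for any distance that is not a multiple of lambda / 2.
    """
    half_waves = 2.0 * d_nm / wavelength_nm  # distance in units of lambda / 2
    nearest = round(half_waves)
    if abs(half_waves - nearest) > tol:
        return "other"
    return "non-inverting" if nearest % 2 == 0 else "inverting"


if __name__ == "__main__":
    WAVELENGTH_NM = 340  # wavelength listed in the simulation parameters
    for d in (340, 510, 3740, 4080, 400):
        print(d, "nm ->", classify_distance(d, WAVELENGTH_NM))
```

With a 340 nm wavelength, distances such as 340 nm, 3.74 µm, and 4.08 µm are integer multiples of $\lambda$ and therefore non-inverting, whereas 510 nm corresponds to $1.5\lambda$ and would invert the output.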

### C. Partially Cascaded Ladder MAJ3 Gates

In this situation the FO2 MAJ3 gate is providing input to one follow up MAJ3 gate while its second output constitutes a circuit primary output, i.e., it is read out by a SW detection cell. Consequently, only one directional coupler is required as depicted in Figure 10, while the operation principle and the design steps are the same as for the previously discussed structures.

# D. 2-bit Inputs Spin Wave Multiplier

Figure 11 presents a 2-bit inputs SW multiplier that makes use of the proposed normaliser. The multiplier inputs are the operands $X = (X_1, X_0)$ and $Y = (Y_1, Y_0)$ and the control signals $C_1$ and $C_2$. The structure requires 18 excitation cells and generates a 4-bit output $Q = (Q_0, Q_1, Q_2, Q_3)$. Following the multiplication algorithm, $Q_0 = AND(X_0, Y_0)$ and $Q_1 = XOR(AND(X_1, Y_0), AND(X_0, Y_1))$ and, as depicted in Figure 11, the two AND gate outputs are normalized by 2 directional couplers to enable their cascading such that the XOR gate can correctly evaluate and detect $Q_1$. Further, $Q_2 = XOR(AND(X_1, Y_1), AND(X_0, Y_0, X_1, Y_1))$, and again 2 directional couplers are required to normalize the outputs of the $AND(X_0, Y_0, X_1, Y_1)$ and $AND(X_0, Y_0)$ and enable their cascading such that the follow-up XOR gate can correctly evaluate and detect $Q_2$. Finally, $Q_3 = AND(X_0, Y_0, X_1, Y_1)$, as can be observed in Figure 11.

FIG. 11. 2-bit Inputs Spin Wave Multiplier.
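As a sanity check of the Boolean expressions above, the following sketch verifies at the logic level that $Q_0$ to $Q_3$ reproduce the 2-bit product for all 16 input combinations. It is independent of the SW implementation; the function names are illustrative, and $Q_0$ is taken as the least significant bit.

```python
def multiply_2bit(x1, x0, y1, y0):
    """Evaluate the four multiplier outputs from the gate-level expressions."""
    and_all = x0 & y0 & x1 & y1
    q0 = x0 & y0
    q1 = (x1 & y0) ^ (x0 & y1)
    q2 = (x1 & y1) ^ and_all
    q3 = and_all
    return q0, q1, q2, q3


def check_all_inputs():
    """Compare the gate-level result with integer multiplication."""
    for x in range(4):
        for y in range(4):
            x1, x0 = (x >> 1) & 1, x & 1
            y1, y0 = (y >> 1) & 1, y & 1
            q0, q1, q2, q3 = multiply_2bit(x1, x0, y1, y0)
            if q0 + 2 * q1 + 4 * q2 + 8 * q3 != x * y:
                return False
    return True


if __name__ == "__main__":
    print("all 16 cases match:", check_all_inputs())
```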

As previously discussed, the distances depend on the chosen SW wavelength and must be accurately determined, i.e., $d_i = n\lambda$, where $i \in \{1, 2, ..., 35\}$, $n = 0, 1, 2, ...$ and $n \neq \{5, 16, 33, 35\}$, as the SWs have to interfere constructively if they have the same phase and destructively if they are out of phase ($\Delta \phi = \pi$).

Moreover, as the circuit includes AND and XOR gates, phase based detection, briefly explained in Section V.A, is required for $Q_0$ and $Q_3$ and threshold based detection for $Q_1$ and $Q_2$. The threshold based detection relies on comparing the spin wave amplitude with a given value in order to discriminate between the two logic values, i.e., an amplitude greater than the threshold corresponds to logic 1 and a lower one to logic 0. To ensure correct output detection, $d_5$ and $d_{35}$ must be $n\lambda$ to read the non-inverted output. In contrast, $Q_1$ and $Q_2$ should be located as near as possible to the interference point to minimize SW amplitude attenuation.

### VI. SIMULATION SETUP

In the following lines, the simulation platform, the utilized parameters, and the performed simulations and performance evaluation metrics are described.

## A. Simulation Platform

We make use of Object Oriented Micro Magnetic Framework (OOMMF)<sup>61</sup> and MuMax3<sup>62</sup> to validate the correct functionality of the proposed normalization solution and gate structures. In the simulations, blue represents a logic 1 and red a logic 0.

The parameters provided to the micromagnetic software are presented in Table I<sup>45</sup>. The dimensions of the structures are equal to multiples of the spin wave wavelength. Therefore, the dimensions of the structure in Figure 8 are $d_1 = d_2 = d_4 = 340\,\mathrm{nm}$, $d_3 = 3.74\,\mu\mathrm{m}$, $d_5 = 4.08\,\mu\mathrm{m}$, and $d_6 = 340\,\mathrm{nm}$, whereas the dimensions of the structures in Figures 9 and 10 are $d_1 = d_2 = d_3 = d_4 = d_5 = d_6 = d_7 = d_8 = 340\,\mathrm{nm}$ and $d_1 = d_2 = d_3 = d_4 = d_5 = d_6 = d_7 = d_8 = d_9 = 340\,\mathrm{nm}$. Moreover, as further discussed in the simulation results subsection, when making use of a YIG waveguide the directional coupler induced delay is 150 ns, which can be decreased by scaling down the structure or by utilizing another material with higher spin wave group velocity. In this work, $Fe_{60}Co_{20}B_{20}$ with Perpendicular Magnetic Anisotropy (PMA) was utilized as waveguide material. The material parameters are: magnetic saturation $M_s = 1.1\times10^6\,\mathrm{A/m}$, exchange stiffness $A_{ex} = 18.5\,\mathrm{pJ/m}$, damping constant $\alpha = 2\times10^{-4}$, and perpendicular anisotropy constant $k_{ani} = 8.3177 \times 10^5\,\mathrm{J/m^3}$<sup>63</sup>. The waveguide width is 30 nm and its thickness 1 nm. SWs are excited at a frequency of 15 GHz and have a wavelength of 100 nm. In addition, as the waveguide length should be equal to a wavelength multiple, we have chosen it to be 5 times the wavelength, i.e., 500 nm, to decrease the mutual effects of gate arms and directional couplers on each other. By making use of Equations (3)-(14) we determined the directional coupler dimensions as $L_w = 2.55\,\mu\mathrm{m}$ and $DW = 8\,\mathrm{nm}$.

TABLE I. Simulation Parameters

| Parameters                  | Values                         |
|-----------------------------|--------------------------------|
| Magnetic saturation $M_s$   | $1.4 \times 10^5 \mathrm{A/m}$ |
| Damping constant $\alpha$   | 0.0002                         |
| Waveguide thickness $t$     | $30\mathrm{nm}$                |
| Exchange stiffness $A_{ex}$ | $3.5\mathrm{pJ/m}$             |
| $L_w$                       | $3\mu\mathrm{m}$               |
| $DW$                        | $8\mathrm{nm}$                 |
| $\lambda$                   | $340\mathrm{nm}$               |
| Frequency $f$               | $2.282\mathrm{GHz}$            |

### B. Performed Simulation and Evaluation Metrics

We performed simulations on the 4 structures introduced in Section V.

Delay, power, and energy consumption are the metrics of interest to evaluate the gate cascading structures and the multiplier. The energy and delay of the transducers are based on the estimations in<sup>64</sup>, and the SW delay through the waveguides was estimated directly from OOMMF and MuMax3 simulation results. The following assumptions are made: (i) The excitation and detection cells are ME cells, i.e., $C_{ME}=1$ fF, $V_{ME}=119$ mV, Energy $= k \times C_{ME} \times V_{ME}^2$ (where $k$ is the number of excitation cells), and 0.42 ns ME cell switching delay<sup>64</sup>, (ii) SWs consume negligible energy in the waveguide and directional coupler when compared to the energy consumed by the transducers, and (iii) SWs are excited by means of pulse signals. We note that, due to the early stage of development of the SW technology, these assumptions might not be accurate and the assumed values may change in the near future.
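For illustration, the ME cell energy expression in assumption (i) can be evaluated numerically as follows. This is a minimal sketch that only plugs the stated $C_{ME}$ and $V_{ME}$ values into $k \times C_{ME} \times V_{ME}^2$; the function name `me_energy_joule` is illustrative.

```python
C_ME = 1e-15    # ME cell capacitance in farad (1 fF, assumption (i))
V_ME = 119e-3   # ME cell voltage in volt (119 mV, assumption (i))


def me_energy_joule(k, c_me=C_ME, v_me=V_ME):
    """Energy of k ME cells following Energy = k * C_ME * V_ME^2."""
    return k * c_me * v_me ** 2


if __name__ == "__main__":
    per_cell_aj = me_energy_joule(1) * 1e18  # convert joule to attojoule
    print(f"energy per ME cell: {per_cell_aj:.2f} aJ")
```

Under these values a single ME cell consumes roughly 14 aJ per activation, so the total transducer energy scales linearly with the number of cells $k$.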

### VII. SIMULATION RESULTS AND DISCUSSION

In this section, simulation results for the gate cascading structures and the spin wave multiplier are presented and commented upon. In addition, delay, power, and energy overhead are assessed and compared with domain conversion and 16 nm CMOS based functionally equivalent counterpart designs. Finally, variability and thermal effects are discussed.

FIG. 12. Cascaded In-line MAJ3 Gates: (a)  $I_1I_2I_3I_4I_5 = 00000$ , (b)  $I_1I_2I_3I_4I_5 = 00111$ , and (c)  $I_1I_2I_3I_4I_5 = 00011$ .

# A. MAJ3 Gate Cascading

#### In-Line MAJ3 Gates

Figure 12 (a), (b), and (c) present the simulation results of the two in-line cascaded MAJ3 gates (see Figure 8) for the input patterns $I_1I_2I_3I_4I_5 = 00000$, $I_1I_2I_3I_4I_5 = 00111$, and $I_1I_2I_3I_4I_5 = 00011$, respectively. By inspecting the Figures, it is clear that the output results are as expected, i.e., the output corresponding to $I_1I_2I_3I_4I_5 = 00000$ is logic 0 because all inputs are logic 0, and it is logic 1 in the other cases because, due to the proper amplitude correction induced by the directional coupler, two inputs of the second Majority gate are logic 1 and one input is logic 0.

### Fully Cascaded Ladder MAJ3 Gates

Figure 13 (a), (b), and (c) present the MuMax3 simulation results for the structure in Figure 9, corresponding to 2 fully cascaded ladder MAJ3 gates, for the input combinations $I_1I_2I_3I_4I_5I_6 = 000000$, $I_1I_2I_3I_4I_5I_6 = 001111$, and $I_1I_2I_3I_4I_5I_6 = 000011$, respectively. It is clear from the Figure that the outputs $O_1$ and $O_2$ are correct, i.e., $O_1 = O_2 = 0$ when $I_1I_2I_3I_4I_5I_6 = 000000$ because all circuit inputs are logic 0, while $O_1 = O_2 = 1$ when $I_1I_2I_3I_4I_5I_6 = 001111$ and $I_1I_2I_3I_4I_5I_6 = 000011$ because two inputs of the second MAJ3 gate are logic 1 and the other is logic 0, which demonstrates the correct behaviour of the circuit.

FIG. 13. Fully Cascaded Ladder MAJ3 Gates: (a)  $I_1I_2I_3I_4I_5 = 00000$ , (b)  $I_1I_2I_3I_4I_5 = 00111$ , and (c)  $I_1I_2I_3I_4I_5 = 00011$ .

## Partially Cascaded Ladder MAJ3 Gates

Figure 14 (a), (b), and (c) present the MuMax3 simulation results for the structure in Figure 10, corresponding to the partial cascading of 2 ladder MAJ3 gates, for the input combinations $I_1I_2I_3I_4I_5I_6 = 000000$, $I_1I_2I_3I_4I_5I_6 = 001111$, and $I_1I_2I_3I_4I_5I_6 = 000011$, respectively. By inspecting the figures, it is clear that in all cases $O_1$ assumes the correct value (logic 0 for $I_1I_2I_3I_4I_5I_6 = 000000$ because all inputs are logic 0, and logic 1 in the other cases because two inputs of the second MAJ3 gate are logic 1 and the third one is logic 0). On the other hand, $O_2$, the output of the second arm, which is not cascaded with the second MAJ3 gate, is not normalized, and correct results are also obtained at $O_2$ (logic 0 in all cases, as $I_5$ and $I_6$ do not affect its behaviour).

FIG. 14. Partially Cascaded Ladder MAJ3 Gates: (a)  $I_1I_2I_3I_4I_5 = 00000$ , (b)  $I_1I_2I_3I_4I_5 = 00111$ , and (c)  $I_1I_2I_3I_4I_5 = 00011$ .



FIG. 15.  $Q_0$  Output Simulation (a)  $X_0Y_0 = 00$ , (b)  $X_0Y_0 = 01$ , (c)  $X_0Y_0 = 10$ , and (d)  $X_0Y_0 = 11$ .

# 2-bit Inputs Spin Wave Multiplier

The 2-bit inputs spin wave multiplier in Figure 11 is validated by MuMax3 using the same parameters as for the 30nm width  $Fe_{60}Co_{20}B_{20}$  waveguide in the previous subsection.

Figure 15 presents the first output  $Q_0$  simulation results. Note that  $Q_0 = AND(X_0, Y_0) = MAJ(0, X_0, Y_0)$  thus  $C_1$  in Figure 11 should be asserted to 0.

Inspecting Figure 15 reveals $Q_0$'s correct behaviour. Note that $Q_0$ is placed at $d_5 = 510\,\mathrm{nm}$ ($n = 5$).

As $Q_1$ and $Q_2$ are computed as XOR functions, threshold detection is required to determine their values and, as such, Table II presents the $Q_1$ and $Q_2$ normalized spin wave magnetization for the different input combinations $X_0Y_0X_1Y_1 = 0000$, $X_0Y_0X_1Y_1 = 0001$, ..., and $X_0Y_0X_1Y_1 = 1111$. Note that to achieve proper circuit functionality the $C_2$ SW amplitude has to be higher than that of the input SWs by a factor of 2.25, which is the value required for the realization of the 4-input AND over the input bits. In order to implement the threshold detection, an appropriate threshold is determined for each output, i.e., the normalized threshold for $Q_1$ is 0.42, and for $Q_2$ is 0.315.

As presented in the table, the input combinations $X_0Y_0X_1Y_1 = 0000$, $0001$, $0010$, $0011$, $0100$, $0101$, $1000$, $1010$, $1100$, and $1111$ result in an output magnetization less than the threshold, thus $Q_1 = 0$, whereas $Q_1 = 1$ for $X_0Y_0X_1Y_1 = 0110$, $0111$, $1001$, $1011$, $1101$, and $1110$ because these input combinations result in output spin wave amplitudes larger than the threshold. Also, the input combinations $X_0Y_0X_1Y_1 = 0011$, $X_0Y_0X_1Y_1 = 0111$, and $X_0Y_0X_1Y_1 = 1011$ result in an output magnetization greater than the threshold, thus $Q_2 = 1$, and $Q_2 = 0$ for the remaining cases. Note that the normalized thresholds for $Q_1$ and $Q_2$ are obtained by averaging the normalized magnetization between inputs 0001 and 1001 for $Q_1$ and inputs 1011 and 0101 for $Q_2$. Note that the main reason for the quasi-continuous distribution of $Q_1$ is that the normalization does not occur as ideally wanted, because some SW energy is transferred to the second waveguide even if no normalization is required. Relying on a different coupling effect, such as exchange coupling, might improve the performance and make the design more reliable.

TABLE II. Normalized Second and Third Spin Wave Multiplier Outputs.

| $X_1$ | $Y_1$ | $X_0$ | $Y_0$ | $Q_1$ | $Q_2$  |
|-------|-------|-------|-------|-------|--------|
| 0     | 0     | 0     | 0     | 0.03  | 0.06   |
| 0     | 0     | 0     | 1     | 0.08  | 0.03   |
| 0     | 0     | 1     | 0     | 0.22  | 0.016  |
| 0     | 0     | 1     | 1     | 0.15  | 0.04   |
| 0     | 1     | 0     | 0     | 0.38  | 0.17   |
| 0     | 1     | 0     | 1     | 0.03  | 0.3    |
| 0     | 1     | 1     | 0     | 0.46  | 0.09   |
| 0     | 1     | 1     | 1     | 0.74  | 0.09   |
| 1     | 0     | 0     | 0     | 0.32  | 0.3    |
| 1     | 0     | 0     | 1     | 1     | 0.16   |
| 1     | 0     | 1     | 0     | 0.1   | 0.006  |
| 1     | 0     | 1     | 1     | 0.54  | 0.0003 |
| 1     | 1     | 0     | 0     | 0.002 | 1      |
| 1     | 1     | 0     | 1     | 0.52  | 0.7    |
| 1     | 1     | 1     | 0     | 0.52  | 0.33   |
| 1     | 1     | 1     | 1     | 0.22  | 0.2    |

FIG. 16. Fourth Spin Wave Multiplier Output (a) $X_1Y_1X_0Y_0 = 0000$, (b) $X_1Y_1X_0Y_0 = 0001$, and (p) $X_1Y_1X_0Y_0 = 1111$.
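The threshold detection of $Q_1$ and $Q_2$ can be cross-checked against the Table II values with the sketch below. It assumes the table columns are ordered $X_1, Y_1, X_0, Y_0$ (as in the table header, whereas the text lists combinations as $X_0Y_0X_1Y_1$), and it takes each threshold as the midpoint between the largest magnetization that should read logic 0 and the smallest that should read logic 1. The helper names are illustrative.

```python
# Table II: (X1, Y1, X0, Y0) -> (normalized Q1, normalized Q2)
TABLE_II = {
    (0, 0, 0, 0): (0.03, 0.06),   (0, 0, 0, 1): (0.08, 0.03),
    (0, 0, 1, 0): (0.22, 0.016),  (0, 0, 1, 1): (0.15, 0.04),
    (0, 1, 0, 0): (0.38, 0.17),   (0, 1, 0, 1): (0.03, 0.3),
    (0, 1, 1, 0): (0.46, 0.09),   (0, 1, 1, 1): (0.74, 0.09),
    (1, 0, 0, 0): (0.32, 0.3),    (1, 0, 0, 1): (1.0, 0.16),
    (1, 0, 1, 0): (0.1, 0.006),   (1, 0, 1, 1): (0.54, 0.0003),
    (1, 1, 0, 0): (0.002, 1.0),   (1, 1, 0, 1): (0.52, 0.7),
    (1, 1, 1, 0): (0.52, 0.33),   (1, 1, 1, 1): (0.22, 0.2),
}


def expected_q1(x1, y1, x0, y0):
    """Q1 = XOR(AND(X1, Y0), AND(X0, Y1))."""
    return (x1 & y0) ^ (x0 & y1)


def expected_q2(x1, y1, x0, y0):
    """Q2 = XOR(AND(X1, Y1), AND(X0, Y0, X1, Y1))."""
    return (x1 & y1) ^ (x0 & y0 & x1 & y1)


def midpoint_threshold(column, expected):
    """Midpoint between the largest 'logic 0' and smallest 'logic 1' value."""
    zeros = [v[column] for k, v in TABLE_II.items() if expected(*k) == 0]
    ones = [v[column] for k, v in TABLE_II.items() if expected(*k) == 1]
    return (max(zeros) + min(ones)) / 2


def detect(value, threshold):
    """Threshold detection: above the threshold reads as logic 1."""
    return 1 if value > threshold else 0


T_Q1 = midpoint_threshold(0, expected_q1)
T_Q2 = midpoint_threshold(1, expected_q2)
ALL_CORRECT = all(
    detect(v[0], T_Q1) == expected_q1(*k) and detect(v[1], T_Q2) == expected_q2(*k)
    for k, v in TABLE_II.items()
)

if __name__ == "__main__":
    print("Q1 threshold:", round(T_Q1, 3))
    print("Q2 threshold:", round(T_Q2, 3))
    print("all 16 rows detected correctly:", ALL_CORRECT)
```

With the Table II values, the midpoints evaluate to the 0.42 and 0.315 thresholds quoted above, and all 16 rows are detected consistently with the XOR expressions. The narrow margin for $Q_2$ (0.3 versus 0.33) reflects the non-ideal normalization discussed in the text.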

Figure 16 presents the fourth output $Q_3$ simulation results for $X_0Y_0X_1Y_1 = 0000$, $X_0Y_0X_1Y_1 = 0001$, ..., and $X_0Y_0X_1Y_1 = 1111$. As can be observed in the Figure, $Q_3$, which is $AND(X_0, Y_0, X_1, Y_1)$, is correctly evaluated.

## B. Performance Evaluation

Whereas normalization based cascading does not consume a noticeable amount of energy in comparison with its transducer based counterpart (no ME cells for domain conversion are required, and the electrons do not move but just spin and affect each other through the dipolar coupling effect), it induces a significant delay overhead. To estimate the delay, i.e., the maximum time it takes for the SW outputs to become available for further processing, we make use of the numerical simulation results; for all considered YIG waveguide based structures we computed a coupler induced delay of 150 ns.

Although this delay overhead is rather large, it can be decreased by structure downscaling and by relying on alternative materials with higher SW group velocity. Additionally, a promising method to decrease the delay is to utilize a coupling effect other than the dipolar one, which is slow by nature. The potential utilization of exchange coupling, which is significantly faster, is currently under investigation. To get an indication of the scaling effect, we validated by means of MuMax3 simulations the cascading of FO2 MAJ3 gates constructed with $Fe_{60}Co_{20}B_{20}$ waveguides of 30 nm width. Simulation results for $I_1I_2I_3I_4I_5I_6 = 000000$, $I_1I_2I_3I_4I_5I_6 = 001111$, and $I_1I_2I_3I_4I_5I_6 = 000011$ are presented in Figure 17 and one can easily check that the output values are correct. Remarkably, scaling and the material change diminished the delay overhead from 150 ns to 20 ns, as the SW group velocity is higher in the other material and the structure becomes smaller, which indicates that the overhead can potentially be further decreased towards the ps range.

In order to evaluate the practical implications of our proposal, we evaluate coupler-based and conversion-based cascading and compare them in terms of delay, power, and energy consumption. The conversion-based circuits are obtained by replacing each directional coupler in Figures 8, 9, and 10 with two transducers able to convert SWs to the charge domain and back to the SW domain. Given the assumptions in Section VI.B, the following conjectures are utilized in the evaluations: (i) Transducers (MEs) are the main contributor to the circuit power consumption, while the power consumption related to SW propagation through the waveguides and directional couplers is insignificant, (ii) SW propagation delay in the waveguide is neglected, (iii) ME transducer power consumption and delay are $34.3 \,\mu\text{W}$ and $0.42 \,\text{ns}$, respectively<sup>64</sup>, and (iv) SWs are excited by means of pulse signals. For delay calculations we identify the critical path through each considered structure. As this spans 2 ME cells and one directional coupler for the coupler based designs, and 4 ME cells for the conversion based designs, the delay sums up to $20.84 \,\text{ns}$ and $1.68 \,\text{ns}$, respectively.
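The critical path delays quoted above follow from a simple sum, sketched below. The 0.42 ns ME cell delay and the 20 ns coupler delay (scaled-down $Fe_{60}Co_{20}B_{20}$ structure) are the values stated in the text; the function name is illustrative.

```python
ME_DELAY_NS = 0.42        # ME cell switching delay
COUPLER_DELAY_NS = 20.0   # directional coupler delay of the scaled-down structure


def critical_path_delay_ns(n_me_cells, n_couplers):
    """Sum the delays of the ME cells and couplers on the critical path.

    SW propagation delay in the plain waveguides is neglected, as assumed
    in the evaluation.
    """
    return n_me_cells * ME_DELAY_NS + n_couplers * COUPLER_DELAY_NS


if __name__ == "__main__":
    coupler_based = critical_path_delay_ns(n_me_cells=2, n_couplers=1)
    conversion_based = critical_path_delay_ns(n_me_cells=4, n_couplers=0)
    print(f"coupler based:    {coupler_based:.2f} ns")
    print(f"conversion based: {conversion_based:.2f} ns")
```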

As SW propagation, interference, and normalization are assumed to happen at zero power cost, the power consumed by each design is determined by the number of ME cells it includes. Given that conversion based designs require 8, 12, and 10 ME cells, the power sums up to $274.4 \,\mu\text{W}$, $411.6 \,\mu\text{W}$, and $343 \,\mu\text{W}$ for the in-line, ladder fully, and ladder partially cascaded structures, respectively. On the other hand, coupler based structures require 6, 8, and 8 ME cells, which results in $205 \,\mu\text{W}$, $274.4 \,\mu\text{W}$, and $274.4 \,\mu\text{W}$ for the in-line, ladder fully, and ladder partially cascaded structures, respectively.

FIG. 17. Scaled Down Fully Cascaded MAJ3 Gates at (a)  $I_1I_2I_3I_4I_5 = 00000$ , (b)  $I_1I_2I_3I_4I_5 = 00111$ , and (c)  $I_1I_2I_3I_4I_5 = 00011$ .

Finally, the energy consumption can be derived as the power-delay product. We note, however, that due to the pulse operation paradigm ME activation follows a domino behaviour. Thus, each ME cell is active for the short period of time necessary for its output SW creation, i.e., the ME cell delay of $0.42\,\mathrm{ns}$<sup>64</sup>, and idle for the rest of the calculation. As the power consumed by the SW propagation through the waveguides can be neglected, the overall power consumption is determined by the number of ME cells in the circuit and the ME cell power consumption. While in general the energy is computed as the product of the overall power and the circuit delay, this is not the case for pulse mode operation, as each ME cell is only active once per circuit input evaluation and for a period of time corresponding to its latency, i.e., 0.42 ns under our assumptions. In view of this, the energy consumption can be determined by multiplying the overall power consumption with the ME cell delay, without considering the directional coupler delay. This means that the energy consumption is actually independent of the overall circuit delay, which nullifies the coupler delay overhead contribution to the energy consumption. Therefore, the energy for the coupler-based cascading is calculated by multiplying the total power with the delay of a single ME cell, which is 0.42 ns. By following this procedure, the energy consumed by conversion-based in-line, ladder fully, and ladder partially cascaded structures is derived as 115.2 aJ, 172.8 aJ, and 144 aJ, respectively, and 86.4 aJ, 115.2 aJ, and 115.2 aJ for the coupler-based counterparts.

TABLE III. Comparison with cascading based conversion

|                          | Conversion cascading |       |      | Coupler cascading |       |       |
|--------------------------|----------------------|-------|------|-------------------|-------|-------|
| Structure                | IL                   | LFC   | LPC  | IL                | LFC   | LPC   |
| Power $(\mu W)$          | 274.4                | 411.6 | 343  | 205               | 274.4 | 274.4 |
| Delay (ns)               | 1.68                 | 1.68  | 1.68 | 20.84             | 20.84 | 20.84 |
| Energy<sup>1</sup> (aJ)  | 115.2                | 172.8 | 144  | 86.4              | 115.2 | 115.2 |

<sup>1</sup> Due to pulse mode operation each ME is active for the time necessary for its output SW creation and idle for the rest of the calculation. Thus, regardless of the overall circuit delay, the energy is evaluated as the product of power consumption and the ME cell delay (0.42 ns).

Table III presents the comparison of the coupler-based and conversion-based implementations in terms of power, delay, and energy consumption. In the Table, IL, LFC, and LPC stand for In-Line, Ladder Fully Cascaded, and Ladder Partially Cascaded structures, respectively. As expected, the coupler-based approach provides a power reduction of 1.33x, 1.5x, and 1.25x for in-line, ladder fully, and ladder partially cascaded circuits, respectively. Moreover, given that pulse SW operation is utilized, the directional coupler delay overhead does not negatively affect the energy consumption and the same savings are obtained in terms of energy. Note that the coupler-based cascading may become more delay effective by further scaling down the structure, and by the utilization of other materials and/or faster coupling effects.
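The power and energy figures of Table III can be reproduced from the ME cell counts with the sketch below. One assumption is made explicit here: the per-cell power is treated as 34.3 nW, because 34.3 multiplied by 0.42 ns then yields about 14.4 aJ per cell, in line with the aJ values of Table III and with the $C_{ME} V_{ME}^2$ estimate of Section VI.B; with the µW unit as printed, the same product would come out in fJ. The reduction factors do not depend on this choice. Names are illustrative.

```python
ME_POWER_NW = 34.3   # per-cell power; the nW unit is an assumption (see lead-in)
ME_DELAY_NS = 0.42   # each ME cell is active only for its own switching delay

# Number of ME cells per structure: IL = in-line, LFC / LPC = ladder fully /
# partially cascaded, as stated in the text.
ME_CELLS = {
    "conversion": {"IL": 8, "LFC": 12, "LPC": 10},
    "coupler": {"IL": 6, "LFC": 8, "LPC": 8},
}


def power(n_cells):
    """Total power as the cell count times the per-cell power."""
    return n_cells * ME_POWER_NW


def energy_aj(n_cells):
    """Pulse-mode energy: total power times one ME cell delay (nW * ns = aJ)."""
    return power(n_cells) * ME_DELAY_NS


def reduction(structure):
    """Energy (and power) reduction of coupler-based over conversion-based."""
    conv = ME_CELLS["conversion"][structure]
    coup = ME_CELLS["coupler"][structure]
    return energy_aj(conv) / energy_aj(coup)


if __name__ == "__main__":
    for s in ("IL", "LFC", "LPC"):
        conv = energy_aj(ME_CELLS["conversion"][s])
        coup = energy_aj(ME_CELLS["coupler"][s])
        print(f"{s}: conversion {conv:.1f} aJ, coupler {coup:.1f} aJ, "
              f"reduction {reduction(s):.2f}x")
```

Because the energy is evaluated per ME cell activation, the reduction factor equals the ratio of ME cell counts (8/6, 12/8, and 10/8). Note that 6 × 34.3 evaluates to 205.8, which the text and Table III report as 205.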

To get more insight into the potential implications of our proposal, we compare the proposed 2-bit inputs multiplier with SW conversion-based and 16 nm CMOS implementation counterparts.

The CMOS implementation requires 6 AND and 2 XOR gates and its area, delay, and energy consumption are estimated based on the figures reported in<sup>65</sup>. The SW implementation for coupler-based cascading is the one described in Figure 11, and the implementation for the conversion-based cascading is designed by replacing each directional coupler with two transducers to convert SWs to the charge domain and back. The assumptions and calculation methodology utilized for the comparison of the 2 cascaded MAJ3 circuits remain in place.

Table IV presents the comparison of the 3 considered 2-bit inputs multiplier implementations in terms of energy, delay, and area. As can be observed in the Table, the spin wave implementations are more energy efficient than the 16 nm CMOS counterpart, i.e., 6.25× and 4.65× less energy for coupler-based and conversion-based cascading, respectively. Moreover, the proposed solution consumes 1.34× less energy than the approach relying on back and forth conversion between spin wave and charge domains, while having 12.5× and 4× larger delay and area, respectively. Although the proposed solution is much slower and requires a larger area, its main strong point is the ultra-low energy consumption enabled by the directional coupler utilization. As previously mentioned, the delay can be further reduced by scaling and the utilization of other materials and/or faster coupling effects, thus we are still far from reaching the ultimate energy consumption reduction horizon. Also, note that the area can be decreased by relying on different coupling effects that can substantially reduce it while obtaining the same normalization effect.

TABLE IV. 2-bit Input Multiplier Performance.

| Technology                 | 16 nm CMOS | $30\mathrm{nm}$ waveguide width SW | $30\mathrm{nm}$ waveguide width SW |
|----------------------------|------------|------------------------------------|------------------------------------|
| Implementation methodology | -          | Conversion-based Cascading         | Coupler-based Cascading            |
| Energy (fJ)                | 2          | 0.43                               | 0.32                               |
| Delay (ns)                 | 0.1        | 1.68                               | 21                                 |
| Area $(\mu \text{m}^2)$    | 6          | 5                                  | 21                                 |
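The ratios discussed for Table IV can be recomputed directly from its entries, as in the following sketch. The dictionary keys and the helper `ratio` are illustrative names.

```python
# Table IV entries: energy in fJ, delay in ns, area in square micrometers.
TABLE_IV = {
    "cmos_16nm":     {"energy_fj": 2.0,  "delay_ns": 0.1,  "area_um2": 6.0},
    "sw_conversion": {"energy_fj": 0.43, "delay_ns": 1.68, "area_um2": 5.0},
    "sw_coupler":    {"energy_fj": 0.32, "delay_ns": 21.0, "area_um2": 21.0},
}


def ratio(metric, numerator, denominator):
    """Ratio of one metric between two implementations of Table IV."""
    return TABLE_IV[numerator][metric] / TABLE_IV[denominator][metric]


if __name__ == "__main__":
    print("energy, CMOS / coupler SW:         ",
          round(ratio("energy_fj", "cmos_16nm", "sw_coupler"), 2))
    print("energy, CMOS / conversion SW:      ",
          round(ratio("energy_fj", "cmos_16nm", "sw_conversion"), 2))
    print("energy, conversion SW / coupler SW:",
          round(ratio("energy_fj", "sw_conversion", "sw_coupler"), 2))
    print("delay, coupler SW / conversion SW: ",
          round(ratio("delay_ns", "sw_coupler", "sw_conversion"), 2))
    print("area, coupler SW / conversion SW:  ",
          round(ratio("area_um2", "sw_coupler", "sw_conversion"), 2))
```

The area ratio evaluates to 4.2, which the text quotes as approximately 4×.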

# Variability and Thermal Noise Effects

The main goal of this paper is to provide the means towards energy effective spin wave gate cascading and enable the design of spin wave domain circuits. In view of this, we validated our proposal as a proof of concept without taking into account the influence of edge roughness, waveguide dimension variations, spin wave strength variation, and thermal noise. However, edge roughness and waveguide trapezoidal cross section effects have been investigated and their small impact demonstrated, as the considered gates continued to function correctly even in their presence<sup>45,66</sup>. Furthermore, the thermal noise effect was investigated<sup>45</sup>. The simulation results indicated that thermal noise has a limited effect on the gate functionality, and that the gate functions correctly at different temperatures. The investigation of variability and thermal noise effects on our proposal constitutes future work, even though we expect that they will have limited impact on spin wave circuit designs.

# VIII. CONCLUSIONS

In conclusion, we proposed a novel conversion free SW gate cascading scheme that achieves SW amplitude normalization by means of a directional coupler. After introducing the normalization concept, we utilized it for the implementation of three simple circuits of 2 cascaded Majority gates and of a 2-bit inputs SW multiplier. We validated the proposed structures by means of Object Oriented Micromagnetic Framework (OOMMF) and GPU-accelerated Micromagnetics (MuMax3) simulations. Furthermore, we assessed the normalization induced energy overhead and demonstrated that the proposed approach provides a 1.25x to 1.5x energy reduction when compared with the transducers based conventional gate cascading counterpart. Finally, we introduced a normalization based SW 2-bit inputs multiplier design and compared it with functionally equivalent state-of-the-art designs. Our evaluation indicated that the proposed scheme provided 1.34x and 6.25x energy reductions when compared with transducers based and 16 nm CMOS counterparts, respectively, which demonstrated the energy effectiveness of our proposal and its significant contribution towards the full utilization of the SW paradigm potential and the development of SW only circuits.

### ACKNOWLEDGMENTS

This work has received funding from the European Union's Horizon 2020 research and innovation program within the FET-OPEN project CHIRON under grant agreement No. 801055. It has also been partially supported by imec's industrial affiliate program on beyond-CMOS logic. F.V. acknowledges financial support from the Research Foundation—Flanders (FWO) through grant No. 1S05719N.

#### REFERENCES

- <sup>1</sup>N. D. Shah, E. W. Steyerberg, and D. M. Kent, "Big Data and Predictive Analytics: Recalibrating Expectations," JAMA, vol. 320, no. 1, pp. 27–28, 07 2018. [Online]. Available: https://doi.org/10.1001/jama.2018.5602
- <sup>2</sup>S. Agarwal, G. Burr, A. Chen, S. Das, E. Debenedictis, M. P. Frank, P. Franzon, S. Holmes, M. Marinella, and T. Rakshit, "International roadmap of devices and systems 2017 edition: Beyond cmos chapter." Sandia National Lab.(SNL-NM), Albuquerque, NM (United States), Tech. Rep., 2018.
- <sup>3</sup>D. Mamaluy and X. Gao, "The fundamental downscaling limit of field effect transistors," Applied Physics Letters, vol. 106, no. 19, p. 193503, 2015. [Online]. Available: https://doi.org/10.1063/1.4919871

- <sup>4</sup>N. Z. Haron and S. Hamdioui, "Why is cmos scaling coming to an end?" in 2008 3rd International Design and Test Workshop, 2008, pp. 98–103.
- <sup>5</sup>Y. Jiang, N. Cucu Laurenciu, and S. D. Cotofana, "On basic boolean function graphene nanoribbon conductance mapping," IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 66, no. 5, pp. 1948–1959, 2019.
- <sup>6</sup>F. Corinto and M. Forti, "Memristor circuits: Flux—charge analysis method," IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 63, no. 11, pp. 1997–2009, 2016.
- <sup>7</sup>F. Corinto, A. Ascoli, and M. Gilli, "Nonlinear dynamics of memristor oscillators," IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 58, no. 6, pp. 1323–1336, 2011.
- <sup>8</sup>D. Yu, H. H. Iu, Y. Liang, T. Fernando, and L. O. Chua, "Dynamic behavior of coupled memristor circuits," IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 62, no. 6, pp. 1607–1616, 2015.
- <sup>9</sup>A. Ascoli, S. Slesazeck, H. Mahne, R. Tetzlaff, and T. Mikolajick, "Nonlinear dynamics of a locally-active memristor," IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 62, no. 4, pp. 1165–1174, 2015.
- <sup>10</sup>M. Abu Lebdeh, H. Abunahla, B. Mohammad, and M. Al-Qutayri, "An efficient heterogeneous memristive xnor for in-memory computing," IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 64, no. 9, pp. 2427–2437, 2017.
- <sup>11</sup>Y. Halawani, B. Mohammad, M. Al-Qutayri, and S. F. Al-Sarawi, "Memristor-based hardware accelerator for image compression," IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 26, no. 12, pp. 2749–2758, 2018.
- <sup>12</sup>V. Calayir, D. E. Nikonov, S. Manipatruni, and I. A. Young, "Static and clocked spintronic circuit design and simulation with performance analysis relative to cmos," IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 61, no. 2, pp. 393–406, 2014.
- <sup>13</sup>H. Farkhani, I. L. Prejbeanu, and F. Moradi, "Las-ncs: A laser-assisted spintronic neuromorphic computing system," IEEE Transactions on Circuits and Systems II: Express Briefs, vol. 66, no. 5, pp. 838–842, 2019.
- <sup>14</sup>X. Jia, J. Yang, P. Dai, R. Liu, Y. Chen, and W. Zhao, "Spinbis: Spintronics-based bayesian inference system with stochastic computing," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 39, no. 4, pp. 789–802, 2020.

- <sup>15</sup>R. Rajaei and A. Amirany, "Nonvolatile low-cost approximate spintronic full adders for computing in memory architectures," IEEE Transactions on Magnetics, vol. 56, no. 4, pp. 1–8, 2020.
- <sup>16</sup>Y. Halawani, B. Mohammad, D. Homouz, M. Al-Qutayri, and H. Saleh, "Modeling and optimization of memristor and STT-RAM-based memory for low-power applications," IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 24, no. 3, pp. 1003–1014, 2016.
- <sup>17</sup>D. E. Nikonov and I. A. Young, "Overview of beyond-CMOS devices and a uniform methodology for their benchmarking," Proceedings of the IEEE, vol. 101, no. 12, pp. 2498–2533, Dec 2013.
- <sup>18</sup>A. Lyle, J. Harms, S. Patil, X. Yao, D. J. Lilja, and J.-P. Wang, "Direct communication between magnetic tunnel junctions for nonvolatile logic fan-out architecture," Applied Physics Letters, vol. 97, no. 15, p. 152504, 2010. [Online]. Available: https://doi.org/10.1063/1.3499427
- <sup>19</sup>S. Luo, M. Song, X. Li, Y. Zhang, J. Hong, X. Yang, X. Zou, N. Xu, and L. You, "Reconfigurable skyrmion logic gates," Nano Letters, vol. 18, no. 2, pp. 1180–1184, 2018.
- <sup>20</sup>Z. Zhang, Y. Zhu, Y. Zhang, K. Zhang, J. Nan, Z. Zheng, Y. Zhang, and W. Zhao, "Skyrmion-based ultra-low power electric-field-controlled reconfigurable (super) logic gate," IEEE Electron Device Letters, vol. 40, no. 12, pp. 1984–1987, 2019.
- <sup>21</sup>K. Zhang, K. Cao, Y. Zhang, Z. Huang, W. Cai, J. Wang, J. Nan, G. Wang, Z. Zheng, L. Chen, Z. Zhang, Y. Zhang, S. Yan, and W. Zhao, "Rectified tunnel magnetoresistance device with high on/off ratio for in-memory computing," IEEE Electron Device Letters, vol. 41, no. 6, pp. 928–931, 2020.
- <sup>22</sup>Z. Luo, Z. Lu, C. Xiong, T. Zhu, W. Wu, Q. Zhang, H. Wu, X. Zhang, and X. Zhang, "Reconfigurable magnetic logic combined with nonvolatile memory writing," Advanced Materials, vol. 29, no. 4, 2017.
- <sup>23</sup>A. Khitun and K. L. Wang, "Non-volatile magnonic logic circuits engineering," Journal of Applied Physics, vol. 110, no. 3, p. 034306, 2011. [Online]. Available: https://doi.org/10.1063/1.3609062
- <sup>24</sup>A. Mahmoud, F. Vanderveken, F. Ciubotaru, C. Adelmann, S. Cotofana, and S. Hamdioui, "n-bit data parallel spin wave logic gate," in 2020 Design, Automation & Test in Europe Conference & Exhibition (DATE), 2020, pp. 642–645.

- <sup>25</sup>M. P. Kostylev, A. A. Serga, T. Schneider, B. Leven, and B. Hillebrands, "Spin-wave logical gates," Applied Physics Letters, vol. 87, no. 15, p. 153501, 2005. [Online]. Available: https://doi.org/10.1063/1.2089147
- <sup>26</sup>T. Schneider, A. A. Serga, B. Leven, B. Hillebrands, R. L. Stamps, and M. P. Kostylev, "Realization of spin-wave logic gates," Applied Physics Letters, vol. 92, no. 2, p. 022505, 2008. [Online]. Available: https://doi.org/10.1063/1.2834714
- <sup>27</sup>K.-S. Lee and S.-K. Kim, "Conceptual design of spin wave logic gates based on a Mach–Zehnder-type spin wave interferometer for universal logic functions," Journal of Applied Physics, vol. 104, no. 5, p. 053909, 2008. [Online]. Available: https://doi.org/10.1063/1.2975235
- <sup>28</sup>I. A. Ustinova, A. A. Nikitin, A. B. Ustinov, B. A. Kalinikos, and E. Lahderanta, "Logic gates based on multiferroic microwave interferometers," in 2017 11th International Workshop on the Electromagnetic Compatibility of Integrated Circuits (EMCCompo), July 2017, pp. 104–107.
- <sup>29</sup>A. Khitun and K. L. Wang, "Nano scale computational architectures with spin wave bus," Superlattices and Microstructures, vol. 38, no. 3, pp. 184–200, 2005. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S0749603605000716
- <sup>30</sup>Y. Wu, M. Bao, A. Khitun, J.-Y. Kim, A. Hong, and K. L. Wang, "A three-terminal spin-wave device for logic applications," Journal of Nanoelectronics and Optoelectronics, vol. 4, no. 3, pp. 394–397, December 2009.
- <sup>31</sup>A. Khitun, D. E. Nikonov, M. Bao, K. Galatsis, and K. L. Wang, "Feasibility study of logic circuits with a spin wave bus," Nanotechnology, vol. 18, no. 46, p. 465202, 2007. [Online]. Available: http://stacks.iop.org/0957-4484/18/i=46/a=465202
- <sup>32</sup>A. Khitun, M. Bao, Y. Wu, J. Kim, A. Hong, A. Jacob, K. Galatsis, and K. L. Wang, "Spin wave logic circuit on silicon platform," in Fifth International Conference on Information Technology: New Generations (itng 2008), April 2008, pp. 1107–1110.
- <sup>33</sup>B. Rana and Y. Otani, "Voltage-controlled reconfigurable spin-wave nanochannels and logic devices," Physical Review Applied, vol. 9, p. 014033, Jan 2018. [Online]. Available: https://link.aps.org/doi/10.1103/PhysRevApplied.9.014033
- <sup>34</sup>A. Chumak, A. Serga, and B. Hillebrands, "Magnon transistor for all-magnon data processing," Nature Communications, vol. 5, no. 4700, 2014. [Online]. Available: https://doi.org/10.1038/ncomms5700

- <sup>35</sup>S. Klingler, P. Pirro, T. Bracher, B. Leven, B. Hillebrands, and A. V. Chumak, "Design of a spin-wave majority gate employing mode selection," Applied Physics Letters, vol. 105, no. 15, p. 152410, 2014. [Online]. Available: https://doi.org/10.1063/1.4898042
- <sup>36</sup>S. Klingler, P. Pirro, T. Bracher, B. Leven, B. Hillebrands, and A. V. Chumak, "Spin-wave logic devices based on isotropic forward volume magnetostatic waves," Applied Physics Letters, vol. 106, no. 21, p. 212406, 2015.
- <sup>37</sup>O. Zografos, S. Dutta, M. Manfrini, A. Vaysset, B. Soree, A. Naeemi, P. Raghavan, R. Lauwereins, and I. P. Radu, "Non-volatile spin wave majority gate at the nanoscale," AIP Advances, vol. 7, no. 5, p. 056020, 2017. [Online]. Available: https://doi.org/10.1063/1.4975693
- <sup>38</sup>K. Nanayakkara, A. Anferov, A. P. Jacob, S. J. Allen, and A. Kozhanov, "Cross junction spin wave logic architecture," IEEE Transactions on Magnetics, vol. 50, no. 11, pp. 1–4, Nov 2014.
- <sup>39</sup>T. Fischer, M. Kewenig, D. A. Bozhko, A. A. Serga, I. I. Syvorotka, F. Ciubotaru, C. Adelmann, B. Hillebrands, and A. V. Chumak, "Experimental prototype of a spin-wave majority gate," Applied Physics Letters, vol. 110, no. 15, p. 152401, 2017. [Online]. Available: https://doi.org/10.1063/1.4979840
- <sup>40</sup>P. Shabadi, A. Khitun, P. Narayanan, M. Bao, I. Koren, K. L. Wang, and C. A. Moritz, "Towards logic functions as the device," in 2010 IEEE/ACM International Symposium on Nanoscale Architectures, June 2010, pp. 11–16.
- <sup>41</sup>F. Ciubotaru, G. Talmelli, T. Devolder, O. Zografos, M. Heyns, C. Adelmann, and I. P. Radu, "First experimental demonstration of a scalable linear majority gate based on spin waves," in 2018 IEEE International Electron Devices Meeting (IEDM), Dec 2018, pp. 36.1.1–36.1.4.
- <sup>42</sup>P. Shabadi, S. N. Rajapandian, S. Khasanvis, and C. A. Moritz, "Design of spin wave functions-based logic circuits," SPIN, vol. 02, no. 03, p. 1240006, 2012. [Online]. Available: https://doi.org/10.1142/S2010324712400061
- <sup>43</sup>Y. Khivintsev, M. Ranjbar, D. Gutierrez, H. Chiang, A. Kozhevnikov, Y. Filimonov, and A. Khitun, "Prime factorization using magnonic holographic devices," Journal of Applied Physics, vol. 120, no. 12, p. 123901, 2016. [Online]. Available: https://doi.org/10.1063/1.4962740

- <sup>44</sup>K. Vogt, F. Fradin, J. Pearson, T. Sebastian, S. Bader, B. Hillebrands, A. Hoffmann, and H. Schultheiss, "Realization of a spin-wave multiplexer," Nature Communications, vol. 5, Apr. 2014.
- <sup>45</sup>Q. Wang, P. Pirro, R. Verba, A. Slavin, B. Hillebrands, and A. V. Chumak, "Reconfigurable nanoscale spin-wave directional coupler," Science Advances, vol. 4, no. 1, 2018. [Online]. Available: https://advances.sciencemag.org/content/4/1/e1701517
- <sup>46</sup>A. Mahmoud, F. Vanderveken, C. Adelmann, F. Ciubotaru, S. Hamdioui, and S. Cotofana, "Fan-out enabled spin wave majority gate," AIP Advances, vol. 10, no. 3, p. 035119, 2020. [Online]. Available: https://doi.org/10.1063/1.5134690
- <sup>47</sup>A. Mahmoud, F. Vanderveken, C. Adelmann, F. Ciubotaru, S. Cotofana, and S. Hamdioui, "2-output spin wave programmable logic gate," in 2020 IEEE Computer Society Annual Symposium on VLSI (ISVLSI), 2020, pp. 60–65.
- <sup>48</sup>A. V. Chumak, A. A. Serga, and B. Hillebrands, "Magnonic crystals for data processing," Journal of Physics D: Applied Physics, vol. 50, no. 24, p. 244001, 2017. [Online]. Available: http://stacks.iop.org/0022-3727/50/i=24/a=244001
- <sup>49</sup>L. Landau and E. Lifshitz, "On the theory of the dispersion of magnetic permeability in ferromagnetic bodies (reprinted from Physikalische Zeitschrift der Sowjetunion 8, part 2, 153, 1935)," in Perspectives in Theoretical Physics, L. Pitaevski, Ed. Amsterdam: Pergamon, 1992, pp. 51–65. [Online]. Available: http://www.sciencedirect.com/science/article/pii/B9780080363646500089
- <sup>50</sup>T. L. Gilbert, "A phenomenological theory of damping in ferromagnetic materials," IEEE Transactions on Magnetics, vol. 40, no. 6, pp. 3443–3449, Nov 2004.
- <sup>51</sup>V. V. Kruglyak, S. O. Demokritov, and D. Grundler, "Magnonics," Journal of Physics D: Applied Physics, vol. 43, no. 26, p. 264001, 2010. [Online]. Available: http://stacks.iop.org/0022-3727/43/i=26/a=264001
- <sup>52</sup>A. Khitun, "Multi-frequency magnonic logic circuits for parallel data processing," Journal of Applied Physics, vol. 111, no. 5, p. 054307, 2012. [Online]. Available: https://doi.org/10.1063/1.3689011
- <sup>53</sup>O. Zografos, L. Amaru, P. Gaillardon, P. Raghavan, and G. D. Micheli, "Majority logic synthesis for spin wave technology," in 2014 17th Euromicro Conference on Digital System Design, Aug 2014, pp. 691–694.

- <sup>54</sup>R. Verba, G. Melkov, V. Tiberkevich, and A. Slavin, "Collective spin-wave excitations in a two-dimensional array of coupled magnetic nanodots," Phys. Rev. B, vol. 85, p. 014427, Jan 2012. [Online]. Available: https://link.aps.org/doi/10.1103/PhysRevB.85.014427
- <sup>55</sup>M. Beleggia, S. Tandon, Y. Zhu, and M. De Graef, "On the magnetostatic interactions between nanoparticles of arbitrary shape," Journal of Magnetism and Magnetic Materials, vol. 278, no. 1, pp. 270–284, 2004. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S0304885304000186
- <sup>56</sup>H. G. Bauer, P. Majchrak, T. Kachel, C. H. Back, and G. Woltersdorf, "Nonlinear spin-wave excitations at low magnetic bias fields," Nature Communications, vol. 6, 2015.
- <sup>57</sup>A. V. Sadovnikov, E. N. Beginin, S. E. Sheshukova, D. V. Romanenko, Y. P. Sharaevskii, and S. A. Nikitov, "Directional multimode coupler for planar magnonics: Side-coupled magnetic stripes," Applied Physics Letters, vol. 107, no. 20, p. 202405, 2015. [Online]. Available: https://doi.org/10.1063/1.4936207
- <sup>58</sup>A. V. Sadovnikov, S. A. Odintsov, E. N. Beginin, S. E. Sheshukova, Y. P. Sharaevskii, and S. A. Nikitov, "Toward nonlinear magnonics: Intensity-dependent spin-wave switching in insulating side-coupled magnetic stripes," Phys. Rev. B, vol. 96, p. 144428, Oct 2017. [Online]. Available: https://link.aps.org/doi/10.1103/PhysRevB.96.144428
- <sup>59</sup>R. Verba, M. Carpentieri, G. Finocchio, V. Tiberkevich, and A. Slavin, "Excitation of propagating spin waves in ferromagnetic nanowires by microwave voltage-controlled magnetic anisotropy," Scientific Reports, vol. 6, 2016.
- <sup>60</sup>P. Krivosik and C. E. Patton, "Hamiltonian formulation of nonlinear spin-wave dynamics: Theory and applications," Physical Review B, vol. 82, p. 184428, Nov 2010. [Online]. Available: https://link.aps.org/doi/10.1103/PhysRevB.82.184428
- <sup>61</sup>M. J. Donahue and D. G. Porter, "OOMMF user's guide, version 1.0," Interagency Report NISTIR 6376, Sept 1999. [Online]. Available: http://math.nist.gov/oommf
- <sup>62</sup>A. Vansteenkiste, J. Leliaert, M. Dvornik, M. Helsen, F. Garcia-Sanchez, and B. Van Waeyenberge, "The design and verification of MuMax3," AIP Advances, vol. 4, no. 10, p. 107133, 2014. [Online]. Available: https://doi.org/10.1063/1.4899186
- <sup>63</sup>T. Devolder, J.-V. Kim, F. Garcia-Sanchez, J. Swerts, W. Kim, S. Couet, G. Kar, and A. Furnemont, "Time-resolved spin-torque switching in MgO-based perpendicularly magnetized tunnel junctions," Physical Review B, vol. 93, p. 024420, Jan 2016. [Online]. Available: https://link.aps.org/doi/10.1103/PhysRevB.93.024420

- <sup>64</sup>O. Zografos, B. Soree, A. Vaysset, S. Cosemans, L. Amarú, P. Gaillardon, G. D. Micheli, R. Lauwereins, S. Sayan, P. Raghavan, I. P. Radu, and A. Thean, "Design and benchmarking of hybrid CMOS-spin wave device circuits compared to 10nm CMOS," in 2015 IEEE 15th International Conference on Nanotechnology (IEEE-NANO), July 2015, pp. 686–689.
- <sup>65</sup>Y. Chen, A. Sangai, M. Gholipour, and D. Chen, "Schottky-barrier-type graphene nanoribbon field-effect transistors: A study on compact modeling, process variation, and circuit performance," in 2013 IEEE/ACM International Symposium on Nanoscale Architectures (NANOARCH), July 2013, pp. 82–88.
- <sup>66</sup>Q. Wang, B. Heinz, R. Verba, M. Kewenig, P. Pirro, M. Schneider, T. Meyer, B. Lagel, C. Dubs, T. Brächer, and A. V. Chumak, "Spin pinning and spin-wave dispersion in nanoscopic ferromagnetic waveguides," Physical Review Letters, vol. 122, p. 247202, Jun 2019. [Online]. Available: https://link.aps.org/doi/10.1103/PhysRevLett.122.247202