A.N.N. Mahmoud

17 records found

Journal article (2023) - Abdulqader Nael Mahmoud, Florin Ciubotaru, Frederic Vanderveken, Christoph Adelmann, Sorin Cotofana, Said Hamdioui
Spin Waves (SWs), by their nature, are excited by means of voltage driven or current driven cells under two modes: Continuous Mode Operation (CMO) and Pulse Mode Operation (PMO). Moreover, the low throughput of SW technology (caused by its high latency) can be enhanced by wave pipelining, which is inherently supported by SWs under both modes. Therefore, we propose a wave-pipelined SW circuit based on two cascaded Majority gates (SWMGs) and validate it by means of micromagnetic simulations under CMO and PMO. Our evaluation results indicate that the PMO SWMGs circuit consumes 6.7x less energy than the CMO SWMGs circuit. In addition, the evaluation shows that the wave-pipelined PMO and CMO SWMGs circuits have the same throughput, which is 2x higher than that of the non-wave-pipelined circuit. ...
Doctoral thesis (2022) - A.N.N. Mahmoud
CMOS downscaling has provided the means to efficiently process the huge amount of raw data resulting from the information technology revolution. However, this becomes more difficult because of leakage, reliability, and cost walls. To keep pace with the exploding market needs at affordable cost, novel alternative technologies are under investigation; one of them is Spin Wave (SW), which is the collective excitation of the electron spins in ferromagnetic materials. SW stands apart as one of the most promising avenues because of its ultra-low energy consumption and high scalability. This thesis: a) develops and designs spin wave based logic gates and circuits, and b) investigates the requirements for spin wave technology to outperform CMOS technology from an energy efficiency point of view. Logic gates: SW circuit design requires SW logic gates that possess fan-out capabilities. Therefore, we propose and validate novel fan-out enabled spin wave logic gates including (N)AND, (N)OR, X(N)OR, and majority gates. In addition, we present and validate novel n-bit multi-frequency data parallel spin wave logic gates, i.e., SWs with different frequencies propagate in the same waveguide while interfering with similar frequency SWs only. Moreover, we examine a SW 3-input Majority gate working under continuous and pulse mode operation regimes. Furthermore, we present and validate how pulse mode operation enables Wave Pipelining (WP) within SW. Circuits: We develop, design, and validate three major circuits, namely an adder, a multiplier, and a compressor. These make use of SW gate cascading. Firstly, we introduce and validate SW accurate and approximate full adders; the approximate full adder consumes 55% less energy than the accurate full adder but it has a 25% error rate, making it suitable for error tolerant applications. We also propose a non-binary SW computing paradigm which we use to build a non-binary SW adder. 
Then we develop SW accurate and approximate 4:2 compressors; the approximate compressor consumes 46% less energy than the accurate compressor but it has a 31% error rate. Finally, we design accurate and approximate 2-bit inputs multipliers; the approximate multiplier consumes 64% less energy than the accurate multiplier but it has a 25% error rate. SW Technology Requirements: We are interested in assessing the technological development horizon that needs to be reached to make SW circuits outperform CMOS counterparts in terms of energy efficiency. We perform a reverse-engineering-like analysis to determine transducer delay and power consumption upper bounds that can place SW circuits in the leading position. To this end, we compute the maximum transducer delay and power consumption of a 32-bit Brent-Kung adder that could potentially enable a SW implementation able to outperform its 7 nm CMOS counterpart. Our evaluations indicate that 31 nW is the maximum transducer power consumption for which a 32-bit Brent-Kung SW implementation can outperform its 7 nm CMOS counterpart in terms of energy efficiency. ...
Conference paper (2022) - Abdulqader Mahmoud, Nicoleta Cucu-Laurenciu, Frederic Vanderveken, Florin Ciubotaru, Christoph Adelmann, Sorin Cotofana, Said Hamdioui
In the early stages of a novel technology's development, it is difficult to provide a comprehensive assessment of its potential capabilities and impact. Nevertheless, some preliminary estimates can be drawn and are certainly of great interest. In this paper, we follow this line of reasoning within the framework of the Spin Wave (SW) based computing paradigm. In particular, we are interested in assessing the technological development horizon that needs to be reached in order to unleash the full SW paradigm potential such that SW circuits can outperform CMOS counterparts in terms of energy consumption. In view of the zero-power SW propagation through ferromagnetic waveguides, the overall SW circuit power consumption is determined by that associated with SW generation and sensing by means of transducers. While current antenna based transducers are clearly power hungry, recent developments indicate that magneto-electric (ME) cells have great potential for ultra-low power SW generation and sensing. Given that MEs have only been proposed at the conceptual level and no actual experimental demonstration has been reported, we cannot evaluate the impact of their utilization on the SW circuit energy consumption. However, we can perform a reverse-engineering-like analysis to determine ME delay and power consumption upper bounds that can place SW circuits in the leading position. To this end, we utilize a 32-bit Brent-Kung Adder (BKA) as a discussion vehicle and compute the maximum ME delay and power consumption that could potentially enable a SW implementation able to outperform its 7nm CMOS counterpart. We evaluate different BKA SW implementations that rely on conversion- or normalization-based gate cascading and consider continuous or pulsed SW generation scenarios. Our evaluations indicate that 31nW is the maximum transducer power consumption for which a 32-bit Brent-Kung SW implementation can outperform its 7nm CMOS counterpart in terms of energy consumption. ...
Journal article (2022) - Abdulqader Nael Mahmoud, Frederic Vanderveken, Florin Ciubotaru, Christoph Adelmann, Said Hamdioui, Sorin Cotofana
By their very nature, Spin Waves (SWs) excited at the same frequency but with different amplitudes propagate through waveguides and interfere with each other at the expense of ultra-low energy consumption. In addition, all (or part) of the SW energy can be moved from one waveguide to another by means of coupling effects. In this paper we make use of these SW features and introduce a novel non Boolean algebra based paradigm, which enables domain conversion free ultra-low energy consumption SW based computing. Subsequently, we leverage this computing paradigm by designing a non-binary spin wave adder, which we validate by means of micromagnetic simulations. To gain more insight into the proposed adder's potential, we assume a 2-bit adder implementation as a discussion vehicle, evaluate its area, delay, and energy consumption, and compare it with conventional SW and 7 nm CMOS counterparts. The results indicate that our proposal diminishes the energy consumption by a factor of 3.14× and 6× when compared with the conventional SW and 7 nm CMOS functionally equivalent designs, respectively. Furthermore, the proposed non-binary adder implementation requires the least number of devices, which indicates its potential for small chip real-estate realizations. ...

Seeking the most energy-efficient digital computing paradigm

Journal article (2022) - Abdulqader Nael Mahmoud, Frederic Vanderveken, Florin Ciubotaru, Christoph Adelmann, Said Hamdioui, Sorin Cotofana
In this article, we propose an energy-efficient spin wave (SW)-based approximate 4:2 compressor including three- and five-input majority gates. We validate our proposal by means of micromagnetic simulations and assess and compare its performance with state-of-the-art SW, 45-nm CMOS, and spin-CMOS counterparts. The evaluation results indicate that the proposed compressor consumes 31.5% less energy than its accurate SW-design version. Furthermore, it has the same energy consumption and error rate as a directional coupler (DC)-based approximate compressor, but it exhibits a 3× shorter delay. In addition, it consumes 14% less energy while having a 17% lower average error rate than its approximate 45-nm CMOS counterpart. When compared with other emerging technologies, the proposed compressor outperforms the approximate spin-CMOS-based compressor by three orders of magnitude in terms of energy consumption while providing the same error rate. Finally, the proposed compressor requires the smallest chip real estate measured in terms of devices. ...
Journal article (2022) - Abdulqader Mahmoud, Frederic Vanderveken, Florin Ciubotaru, Christoph Adelmann, Said Hamdioui, Sorin Cotofana
By their very nature Spin Waves (SWs) enable the realization of energy efficient circuits, as they propagate and interfere within waveguides without consuming noticeable energy. However, SW computing can be even more energy efficient by taking advantage of the approximate computing paradigm as many applications, e.g., multimedia and social media, are error-tolerant. In this paper, we propose an ultra-low energy Approximate Full Adder (AFA) and an Approximate 2-bit inputs Multiplier (AMUL). AFA consists of one Majority gate whereas AMUL is built by means of 3 AND gates. We validate the correct functionality of our proposal by means of micromagnetic simulations and evaluate AFA's figures of merit against state-of-the-art accurate SW, 7nm CMOS, Spin Hall Effect (SHE), Domain Wall Motion (DWM), accurate and approximate 45nm CMOS, Magnetic Tunnel Junction (MTJ), and Spin-CMOS FA implementations. Our results indicate that AFA consumes 38% and 6% less energy than state-of-the-art accurate SW and 7nm CMOS FA implementations, respectively. Moreover, it saves 56% and 20% energy when compared with accurate and approximate 45nm CMOS counterparts, respectively. Furthermore, it provides 2 orders of magnitude energy reduction when compared with accurate SHE, accurate and approximate DWM, MTJ, and Spin-CMOS, counterparts. In addition, it achieves the same error rate as approximate 45nm CMOS and Spin-CMOS FAs whereas it exhibits 50% less error rate than the approximate DWM FA. Last but not least, it outperforms its contenders in terms of area by saving at least 29% chip real-estate. AMUL is evaluated and compared with state-of-the-art SW and 16nm CMOS accurate and approximate designs. The evaluation results indicate that AMUL energy consumption is at least 2.8x and 2.6x smaller than the one of state-of-the-art SW and 16nm CMOS accurate and approximate designs, respectively. 
AMUL has an error rate of 25%, whereas the approximate CMOS multiplier has an error rate of 38%, and requires at least 64% less chip real-estate than the CMOS counterpart. ...
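The single-Majority-gate approximate full adder described in the abstract above can be illustrated with a small truth-table check. This is only a sketch under a stated assumption: the abstract does not say how the Sum output is derived, so the code assumes the common approximate-adder construction in which Sum is taken as the inverted Carry-out. The function names are hypothetical.

```python
from itertools import product


def maj3(a, b, c):
    """3-input majority: 1 when at least two inputs are 1."""
    return int(a + b + c >= 2)


def exact_full_adder(a, b, cin):
    """Reference full adder: Sum is the XOR of the inputs, Carry is their majority."""
    return a ^ b ^ cin, maj3(a, b, cin)


def approx_full_adder(a, b, cin):
    """Approximate full adder built around one majority gate.

    ASSUMPTION (not stated in the abstract): Sum is the inverted Carry-out.
    """
    carry = maj3(a, b, cin)
    return 1 - carry, carry


def error_rate():
    """Fraction of the 8 input combinations where any output bit is wrong."""
    cases = list(product((0, 1), repeat=3))
    wrong = sum(approx_full_adder(*c) != exact_full_adder(*c) for c in cases)
    return wrong / len(cases)


if __name__ == "__main__":
    print(error_rate())
```

Under this assumption only the all-zeros and all-ones inputs produce a wrong Sum, so 2 of 8 cases are in error. The Carry-out is always exact because it is the majority function itself.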
Journal article (2021) - Abdulqader Nael Mahmoud, Frederic Vanderveken, Christoph Adelmann, Florin Ciubotaru, Said Hamdioui, Sorin Cotofana
By their very nature, spin waves (SWs) with different frequencies can propagate through the same waveguide, while mostly interfering with their own species. Therefore, more SW encoded data sets can coexist, propagate, and interact in parallel, which opens the road toward hardware replication-free parallel data processing. In this article, we take advantage of these features and propose a novel data parallel SW-based computing approach. To explain and validate the proposed concept, byte-wide 2-input XOR and 3-input majority gates are implemented and validated by means of Object-Oriented MicroMagnetic Framework (OOMMF) simulations. Furthermore, we introduce an optimization algorithm meant to minimize the area overhead associated with multifrequency operation and demonstrate that it diminishes the byte-wide gate area by 30% and 41% for XOR and majority implementations, respectively. To gain insight into the practical implications of our proposal, we compare the byte-wide gates with conventional functionally equivalent scalar SW gate-based implementations in terms of area, delay, and power consumption. Our results indicate that the area optimized 8-bit 2-input XOR and 3-input majority gates require 4.47x and 4.16x less area, respectively, at the expense of 5% and 7% delay increase, respectively, without inducing any power consumption overhead. Finally, we discuss factors that are limiting the currently achievable parallelism to 8 for phase-based gate output detection and demonstrate by means of OOMMF simulations that this can be increased to 16 for threshold-detection-based gates. ...
Journal article (2021) - Abdulqader Nael Mahmoud, Frederic Vanderveken, Christoph Adelmann, Florin Ciubotaru, Sorin Cotofana, Said Hamdioui
The key enabling factor for Spin Wave (SW) technology utilization for building ultra low power circuits is the ability to energy efficiently cascade SW basic computation blocks. SW Majority gates, which constitute a universal gate set for this paradigm, operating on phase encoded data are not input output coherent in terms of SW amplitude. Thus, their cascading requires information representation conversion from SW to voltage and back, which is by no means energy effective. In this paper, a novel conversion free SW gate cascading scheme is proposed that achieves SW amplitude normalization by means of a directional coupler. After introducing the normalization concept, we utilize it in the implementation of three simple circuits and, to demonstrate its bigger scale potential, of a 2-bit inputs SW multiplier. The proposed structures are validated by means of the Object Oriented Micromagnetic Framework (OOMMF) and GPU-accelerated Micromagnetics (MuMax3). Furthermore, we assess the normalization induced energy overhead and demonstrate that the proposed approach consumes 1.25 times to 1.5 times less energy when compared with the transducers based conventional counterpart. Finally, we introduce a normalization based SW 2-bit inputs multiplier design and compare it with functionally equivalent SW transducer based and 16nm CMOS designs. Our evaluation indicates that the proposed approach provided 1.34 times and 6.25 times energy reductions when compared with the conventional approach and 16nm CMOS counterpart, respectively, which demonstrates that our proposal is energy effective and opens the road towards the full utilization of the SW paradigm potential and the development of SW only circuits. ...
Conference paper (2021) - Abdulqader Mahmoud, Frederic Vanderveken, Christoph Adelmann, Florin Ciubotaru, Said Hamdioui, Sorin Cotofana
By their very nature, voltage/current excited Spin Waves (SWs) propagate through waveguides without consuming noticeable power. If SW excitation is performed by the continuous application of voltages/currents to the input, which is usually the case, the overall energy consumption is determined by the transducer power and the circuit critical path delay, which leads to high energy consumption because SWs are slow. However, if transducers are operated in pulses, the energy becomes circuit delay independent and is mainly determined by the transducer power and delay; thus, pulse operation should be targeted. In this paper, we utilize a 3-input Majority gate (MAJ) to investigate Continuous Mode Operation (CMO) and Pulse Mode Operation (PMO). Moreover, we validate the CMO and PMO 3-input Majority gates by means of micromagnetic simulations. Furthermore, we evaluate and compare the CMO and PMO Majority gate implementations in terms of energy. The results indicate that PMO diminishes MAJ gate energy consumption by a factor of 18. In addition, we describe how PMO can open the road towards the utilization of the Wave Pipelining (WP) concept in SW circuits. We validate the WP concept by means of micromagnetic simulations and we evaluate its implications in terms of throughput. Our evaluation indicates that for a circuit formed by four cascaded MAJ gates WP increases the throughput by 3.6x. ...
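The energy argument in the abstract above (continuous excitation scales with the circuit critical path delay, pulsed excitation scales with the transducer delay) can be written as a first-order model. The sketch below assumes identical transducers and uses purely hypothetical numbers; the function names and values are illustrative and are not taken from the paper.

```python
import math


def energy_cmo(n_transducers, transducer_power_w, critical_path_delay_s):
    """Continuous mode: every transducer stays powered for the whole critical path delay."""
    return n_transducers * transducer_power_w * critical_path_delay_s


def energy_pmo(n_transducers, transducer_power_w, pulse_duration_s):
    """Pulse mode: every transducer is powered only for the excitation pulse."""
    return n_transducers * transducer_power_w * pulse_duration_s


if __name__ == "__main__":
    # HYPOTHETICAL values, chosen only to show the scaling.
    n = 3            # input transducers of a 3-input majority gate
    power = 10e-9    # 10 nW per transducer (assumed)
    t_path = 20e-9   # 20 ns circuit critical path delay (assumed)
    t_pulse = 2e-9   # 2 ns excitation pulse (assumed)

    ratio = energy_cmo(n, power, t_path) / energy_pmo(n, power, t_pulse)
    # In this simple model the ratio reduces to t_path / t_pulse.
    print(math.isclose(ratio, t_path / t_pulse))
```

In this model the saving from pulsed operation depends only on the ratio between the critical path delay and the pulse duration, which is why slower or deeper circuits benefit more.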
Conference paper (2021) - Abdulqader Mahmoud, Christoph Adelmann, Frederic Vanderveken, Sorin Cotofana, Florin Ciubotaru, Said Hamdioui
Having multi-output logic gates saves much energy because the same structure can be used to feed multiple inputs of next stage gates simultaneously. This paper proposes novel triangle shape fanout of 2 spin wave Majority and XOR gates; the Majority gate is achieved by phase detection, whereas the XOR gate is achieved by threshold detection. The proposed logic gates are validated by means of micromagnetic simulations. Furthermore, the energy and delay are estimated for the proposed structures and compared with the state-of-the-art spin wave, and 16 nm and 7 nm CMOS logic gates. The results demonstrate that the proposed structures provide energy reduction of 25%–50% in comparison to the other 2-output spin-wave devices while having the same delay, and energy reduction of 43x-0.8x when compared to the 16 nm and 7 nm CMOS counterparts while having delay overhead of 11x-40x. ...
Conference paper (2021) - Abdulqader Mahmoud, Frederic Vanderveken, Florin Ciubotaru, Christoph Adelmann, Sorin Cotofana, Said Hamdioui
Spin Waves (SWs) propagate through magnetic waveguides and interfere with each other without consuming noticeable energy, which opens the road to new ultra-low energy circuit designs. In this paper we build upon SW features and propose a novel energy efficient Full Adder (FA) design consisting of 1 Majority and 2 XOR gates, whose Sum and Carry-out outputs are generated by means of threshold and phase detection, respectively. We validate our proposal by means of MuMax3 micromagnetic simulations and we evaluate and compare its performance with state-of-the-art SW, 22 nm CMOS, Magnetic Tunnel Junction (MTJ), Spin Hall Effect (SHE), Domain Wall Motion (DWM), and Spin-CMOS implementations. Our evaluation indicates that the proposed SW FA consumes 22.5% and 43% less energy than the direct SW gate based and 22 nm CMOS counterparts, respectively. Moreover, it exhibits a more than 3 orders of magnitude smaller energy consumption when compared with state-of-the-art MTJ, SHE, DWM, and Spin-CMOS based FAs, and outperforms its contenders in terms of area by requiring at least 22% less chip real-estate. ...
Conference paper (2021) - Abdulqader Mahmoud, Frederic Vanderveken, Florin Ciubotaru, Christoph Adelmann, Sorin Cotofana, Said Hamdioui
By their very nature, Spin Waves (SWs) consume ultra-low amounts of energy, which makes them suitable for ultra-low energy consumption applications. In addition, a compressor can be utilized to further reduce the energy consumption and enhance the speed of a multiplier. Therefore, we propose a novel energy efficient SW based 4-2 compressor consisting of 4 XOR gates and 2 Majority gates. The proposed compressor is validated by means of micromagnetic simulations and compared with the state-of-the-art SW, 22 nm CMOS, Magnetic Tunnel Junction (MTJ), Domain Wall Motion (DWM), and Spin-CMOS technologies. The performance evaluation shows that the proposed compressor consumes 2.5x less and 1.25× less energy than the 22 nm CMOS and the conventional SW compressor, respectively, whereas it consumes at least 3 orders of magnitude less energy than the MTJ, DWM, and Spin-CMOS designs. Furthermore, the compressor achieves the smallest chip real-estate. In summary, the performance evaluation of our proposed compressor shows that the SW technology has the potential to progress the state-of-the-art circuit design in terms of energy consumption and scalability. ...
Conference paper (2020) - Abdulqader Mahmoud, Frederic Vanderveken, Christoph Adelmann, Florin Ciubotaru, Said Hamdioui, Sorin Cotofana
To bring Spin Wave (SW) based computing paradigm into practice and develop ultra low power Magnonic circuits and computation platforms, one needs basic logic gates that operate and can be cascaded within the SW domain without requiring back and forth conversion between the SW and voltage domains. To achieve this, SW gates have to possess intrinsic fanout capabilities, be input-output data representation coherent, and reconfigurable. In this paper, we address the first and the last requirements and propose a novel 4-output programmable SW logic gate. First, we introduce the gate structure and demonstrate that, by adjusting the gate output detection method, it can parallelly evaluate any 4-element subset of the 2-input Boolean function set {(N)AND, (N)OR, and X(N)OR}. Furthermore, we adjust the structure such that all its 4 outputs produce SWs with the same energy and demonstrate that it can evaluate Boolean function sets while providing fanout capabilities ranging from 1 to 4. We validate our approach by instantiating and simulating different gate configurations such as 4-output AND/OR, 4-output XOR/XNOR, output energy balanced 4-output AND/OR, and output energy balanced 4-output XOR/XNOR by means of Object Oriented Micromagnetic Framework (OOMMF) simulations. Finally, we evaluate the performance of our proposal in terms of delay and energy consumption and compare it against existing state-of-the-art SW and 16 nm CMOS counterparts. The results indicate that for the same functionality, our approach provides 3× and 16× energy reduction, when compared with conventional SW and 16 nm CMOS implementations, respectively. ...
Journal article (2020) - Abdulqader Mahmoud, Florin Ciubotaru, Frederic Vanderveken, Andrii V. Chumak, Said Hamdioui, Christoph Adelmann, Sorin Cotofana
This paper provides a tutorial overview over recent vigorous efforts to develop computing systems based on spin waves instead of charges and voltages. Spin-wave computing can be considered a subfield of spintronics, which uses magnetic excitations for computation and memory applications. The Tutorial combines backgrounds in spin-wave and device physics as well as circuit engineering to create synergies between the physics and electrical engineering communities to advance the field toward practical spin-wave circuits. After an introduction to magnetic interactions and spin-wave physics, the basic aspects of spin-wave computing and individual spin-wave devices are reviewed. The focus is on spin-wave majority gates as they are the most prominently pursued device concept. Subsequently, we discuss the current status and the challenges to combine spin-wave gates and obtain circuits and ultimately computing systems, considering essential aspects such as gate interconnection, logic level restoration, input-output consistency, and fan-out achievement. We argue that spin-wave circuits need to be embedded in conventional complementary metal-oxide-semiconductor (CMOS) circuits to obtain complete functional hybrid computing systems. The state of the art of benchmarking such hybrid spin-wave-CMOS systems is reviewed, and the current challenges to realize such systems are discussed. The benchmark indicates that hybrid spin-wave-CMOS systems promise ultralow-power operation and may ultimately outperform conventional CMOS circuits in terms of the power-delay-area product. Current challenges to achieve this goal include low-power signal restoration in spin-wave circuits as well as efficient spin-wave transducers. ...
Journal article (2020) - Abdulqader Mahmoud, Frederic Vanderveken, Christoph Adelmann, Florin Ciubotaru, Said Hamdioui, Sorin Cotofana
By its very nature, Spin Wave (SW) interference provides intrinsic support for Majority logic function evaluation. Due to this and the fact that the 3-input Majority (MAJ3) gate and the inverter constitute a universal Boolean logic gate set, different MAJ3 gate implementations have been proposed. However, they cannot be directly utilized for the construction of larger SW logic circuits as they lack a key cascading mechanism, i.e., fanout capability. In this paper, we introduce a novel ladder-shaped SW MAJ3 gate design able to provide a maximum fanout of 2 (FO2). The proper gate functionality is validated by means of micromagnetic simulations, which also demonstrate that the amplitude mismatch between the two outputs is negligible, proving that an FO2 is properly achieved. Additionally, we evaluate the gate area and compare it with SW state-of-the-art and 15 nm CMOS counterparts working under the same conditions. Our results indicate that the proposed structure requires 12× less area than the 15 nm CMOS MAJ3 gate and that, at the gate level, the fanout capability results in 16% area savings when compared to the state-of-the-art SW majority gate counterparts. ...
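The statement that interference intrinsically evaluates the Majority function can be checked with an idealized phasor model. The sketch below assumes phase encoding (logic 0 as phase 0, logic 1 as phase π), equal input amplitudes, and lossless linear superposition; it ignores all waveguide physics, and the function names are hypothetical.

```python
import cmath
import math
from itertools import product


def phasor(bit, amplitude=1.0):
    """Assumed phase encoding: logic 0 -> phase 0, logic 1 -> phase pi."""
    return amplitude * cmath.exp(1j * (math.pi if bit else 0.0))


def interfere(bits):
    """Linear superposition of equal-frequency waves at the gate output."""
    return sum(phasor(b) for b in bits)


def majority_by_phase(bits):
    """Phase detection: a resulting phase near pi (negative real part) reads as logic 1."""
    return 1 if interfere(bits).real < 0 else 0


if __name__ == "__main__":
    for a, b, c in product((0, 1), repeat=3):
        out = majority_by_phase((a, b, c))
        amp = round(abs(interfere((a, b, c))))
        print(a, b, c, "->", out, "amplitude", amp)
```

With three inputs the superposed amplitude is 3 when all inputs agree and 1 otherwise, so the output phase is always well defined, but the output amplitude depends on the input pattern. That amplitude variation is one reason cascading such gates needs extra care.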
Conference paper (2020) - Abdulqader Mahmoud, Frederic Vanderveken, Christoph Adelmann, Florin Ciubotaru, Sorin Cotofana, Said Hamdioui
This paper presents a 2-output Spin-Wave Programmable Logic Gate structure able to simultaneously evaluate any pair of AND, NAND, OR, NOR, XOR, and XNOR Boolean functions. Our proposal provides the means for fanout achievement within the Spin Wave computation domain and energy and area savings as two different functions can be simultaneously evaluated on the same input data. We validate our proposal by means of Object Oriented Micromagnetic Framework (OOMMF) simulations and demonstrate that by phase and magnetization threshold output sensing {AND, OR, NAND, NOR} and {XOR and XNOR} functionalities can be achieved, respectively. To gain insight into the potential practical implications of our approach, we use the proposed gate to implement a 3-input Majority gate, which we evaluate and compare with state-of-the-art equivalent implementations in terms of area, delay, and energy consumption. Our estimations indicate that the proposed gate provides 33% and 16% energy and area reduction, respectively, when compared with the spin-wave counterpart and 42% energy reduction while consuming 12x less area when compared to a 15 nm CMOS implementation. ...
Conference paper (2020) - A.N.N. Mahmoud, Frederic Vanderveken, Florin Ciubotaru, Christoph Adelmann, Sorin Cotofana, Said Hamdioui
Due to their very nature, Spin Waves (SWs) created in the same waveguide, but with different frequencies, can coexist while selectively interacting with their own species only. The absence of inter-frequency interferences isolates input data sets encoded in SWs with different frequencies and creates the premises for simultaneous data parallel SW based processing without hardware replication or delay overhead. In this paper we leverage this SW property by introducing a novel computation paradigm, which allows for the parallel processing of n-bit input data vectors on the same basic SW based logic gate. Subsequently, to demonstrate the proposed concept, we present an 8-bit data parallel 3-input Majority gate implementation and validate it by means of Object Oriented MicroMagnetic Framework (OOMMF) simulations. To evaluate the potential benefit of our proposal, we compare the 8-bit data parallel gate with the equivalent scalar SW gate based implementation. Our evaluation indicates that the 8-bit data parallel 3-input Majority gate implementation requires 4.16x less area than the scalar SW gate based equivalent counterpart while preserving the same delay and energy consumption figures. ...
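The multi-frequency idea in the abstract above, where waves at different frequencies share one waveguide and only interfere with waves of the same frequency, can be mimicked with an idealized signal model. The sketch below assumes perfect linear superposition, phase encoding (logic 1 as phase π), and arbitrary integer frequencies in arbitrary units; the frequencies, sample count, and function names are hypothetical and unrelated to the simulated devices.

```python
import math


def waveguide_signal(words, freqs, t):
    """Sum of all waves in the shared guide at time t; words[i][k] is bit k of input i."""
    return sum(
        math.cos(2 * math.pi * f * t + (math.pi if word[k] else 0.0))
        for word in words
        for k, f in enumerate(freqs)
    )


def detect_bits(words, freqs, samples=256):
    """Per-frequency phase detection by correlating with a reference cosine.

    Over a unit window with integer frequencies, cosines of different
    frequencies are orthogonal, so each channel only sees its own data.
    """
    signal = [waveguide_signal(words, freqs, n / samples) for n in range(samples)]
    bits = []
    for f in freqs:
        corr = sum(
            s * math.cos(2 * math.pi * f * n / samples) for n, s in enumerate(signal)
        )
        bits.append(1 if corr < 0 else 0)  # negative correlation -> phase pi -> logic 1
    return bits


if __name__ == "__main__":
    freqs = [10, 11, 12, 13, 14, 15, 16, 17]  # HYPOTHETICAL channel frequencies
    a = [1, 0, 1, 1, 0, 0, 1, 0]
    b = [1, 1, 0, 1, 0, 1, 0, 0]
    c = [0, 1, 1, 1, 0, 0, 0, 1]
    print(detect_bits([a, b, c], freqs))
```

Each frequency channel evaluates its own 3-input majority, so a single structure processes all eight bit positions at once in this model. The sample count must stay above twice the highest channel frequency for the orthogonality argument to hold.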