



Delft University of Technology

## European Test Symposium Teams: an Anniversary Snapshot

Jenihhin, M.; Raik, J.; Jutman, A.; Mir, S.; Taouil, M.; Fieback, M.; Bishnoi, R.; Hamdioui, S.; Ma, K.; More Authors

**DOI**

[10.1109/ETS63895.2025.11049652](https://doi.org/10.1109/ETS63895.2025.11049652)

**Publication date**

2025

**Document Version**

Final published version

**Published in**

Proceedings - 2025 IEEE European Test Symposium, ETS 2025

**Citation (APA)**

Jenihhin, M., Raik, J., Jutman, A., Mir, S., Taouil, M., Fieback, M., Bishnoi, R., Hamdioui, S., Ma, K., & More Authors (2025). European Test Symposium Teams: an Anniversary Snapshot. In *Proceedings - 2025 IEEE European Test Symposium, ETS 2025* (Proceedings of the European Test Workshop). IEEE. <https://doi.org/10.1109/ETS63895.2025.11049652>


# European Test Symposium Teams: an Anniversary Snapshot

M. Jenihhin<sup>2</sup>, J. Raik<sup>2</sup>, A. Jutman<sup>2</sup>, N. Cherezova<sup>2</sup>, R. Ubar<sup>2</sup>, L. Miclea<sup>3</sup>, S. Enyedi<sup>3</sup>, I. Stefan<sup>3</sup>, O. Stan<sup>3</sup>, C. Corches<sup>3</sup>, Z. Peng<sup>4</sup>, P. Eles<sup>4</sup>, R. Drechsler<sup>5A</sup>, S. Eggersglüß<sup>5B</sup>, G. Fey<sup>5C</sup>, A. Glowatz<sup>5B</sup>, D. Tille<sup>5B</sup>, G. Gielen<sup>6A</sup>, A. Coyette<sup>6A,B</sup>, W. Dobbelaere<sup>6B</sup>, R. Vanhooren<sup>6B</sup>, P.-Y. Chuang<sup>7</sup>, E. J. Marinissen<sup>7</sup>, G. Di Natale<sup>8</sup>, M. Barragan<sup>8</sup>, P. Maistri<sup>8</sup>, S. Mir<sup>8</sup>, E.-I. Vatajelu<sup>8</sup>, P. Bernardi<sup>9</sup>, S. Di Carlo<sup>9</sup>, P. Prinetto<sup>9</sup>, M. Sonza Reorda<sup>9</sup>, M. Violante<sup>9</sup>, H.-G. Stratigopoulos<sup>10</sup>, M. K. Michael<sup>11</sup>, S. Neophytou<sup>11</sup>, S. Hadjitheophanous<sup>11</sup>, K. Christou<sup>11</sup>, M. Skitsas<sup>11</sup>, A. Bosio<sup>12A</sup>, B. Deveautour<sup>12B</sup>, P. Girard<sup>12C</sup>, M. Traiola<sup>13A</sup>, A. Virazel<sup>12C</sup>, F. Fernandes dos Santos<sup>13A</sup>, A. Kritikakou<sup>13A,B</sup>, G. Casagranda<sup>14</sup>, M. Vallero<sup>14</sup>, F. Vella<sup>14</sup>, P. Rech<sup>14</sup>, L. M. Bolzani Poehls<sup>15A</sup>, M. Krstic<sup>15A,B</sup>, M. Andjelkovic<sup>15A</sup>, F. Vargas<sup>15A</sup>, G. Tshagharyan<sup>16</sup>, G. Harutyunyan<sup>16</sup>, V. Vardanian<sup>16</sup>, S. Shoukourian<sup>16</sup>, Y. Zorian<sup>16</sup>, J. Dworak<sup>17A</sup>, K. Nepal<sup>17B</sup>, T. Manikas<sup>17A</sup>, M. Taouil<sup>18</sup>, M. Fieback<sup>18</sup>, A. Gebregiorgis<sup>18</sup>, R. Bishnoi<sup>18</sup>, S. Hamdioui<sup>18</sup>, A. Chatterjee<sup>19A</sup>, A. Saha<sup>19A</sup>, S. Komarraju<sup>19B</sup>, K. Ma<sup>19C</sup>, C. Amarnath<sup>19D</sup>, M. Tahoori<sup>20</sup>, M. Mayahinia<sup>20</sup>, M. Rajabalipanah<sup>21</sup>, K. Basharkhah<sup>21</sup>, N. 
Nosrati<sup>21</sup>, Z. Jahanpeima<sup>21</sup>, Z. Navabi<sup>21</sup>, H.-J. Wunderlich<sup>22A</sup>, S. Hellebrand<sup>22B</sup>

<sup>2</sup>Department of Computer Systems, Tallinn University of Technology, Tallinn, Estonia,

<sup>3</sup>Technical University of Cluj-Napoca, Romania, <sup>4</sup>Embedded Systems Lab. (ESLAB), Linköping Univ., Sweden,

<sup>5A</sup>Group of Computer Architecture, University of Bremen, Bremen, Germany, <sup>5B</sup>Siemens EDA, Tessent, Hamburg, Germany, <sup>5C</sup>Institute of Embedded Systems, Hamburg University of Technology, Hamburg, Germany,

<sup>6A</sup>ESAT-MICAS, KU Leuven, Belgium, <sup>6B</sup>onsemi Belgium, Mechelen, Belgium,

<sup>7</sup>Advanced Reliability, Robustness, and Test (AR<sup>2</sup>T), imec, Leuven, Belgium,

<sup>8</sup>Univ. Grenoble Alpes, CNRS, Grenoble INP, TIMA, France, <sup>9</sup>Politecnico di Torino, Italy,

<sup>10</sup>Sorbonne Université, CNRS, LIP6, Paris, France, <sup>11</sup>Department of Electrical and Computer Engineering, University of Cyprus, Nicosia, Cyprus, <sup>12A</sup>Centrale Lyon, INSA Lyon, CNRS, Université Claude Bernard Lyon 1, CPE Lyon, INL, UMR5270, France, <sup>12B</sup>Nantes Université, CNRS, IETR UMR 6164, F-44000 Nantes, France,

<sup>12C</sup>LIRMM, University of Montpellier/CNRS, Montpellier 34000 France, <sup>13A</sup>Univ Rennes, Inria, CNRS, IRISA, Rennes, France, <sup>13B</sup>Institut universitaire de France (IUF), France, <sup>14</sup>University of Trento, Italy,

<sup>15A</sup>IHP – Leibniz Institute for High Performance Microelectronics, Frankfurt (Oder), Germany,

<sup>15B</sup>University of Potsdam, Germany, <sup>16</sup>Synopsys Armenia, Embedded Test & Repair Group, Yerevan, Armenia,

<sup>17A</sup>Southern Methodist University, Dallas, Texas, USA, <sup>17B</sup>University of St. Thomas, St. Paul, Minnesota, USA, <sup>18</sup>Delft University of Technology, The Netherlands,

<sup>19A</sup>Department of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, USA,

<sup>19B</sup>Intel Corporation, <sup>19C</sup>Rebellions AI, South Korea, <sup>19D</sup>Google Inc, Sunnyvale, CA, USA,

<sup>20</sup>Faculty of Computer Science, Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany,

<sup>21</sup>School of Electrical and Computer Engineering, College of Engineering, University of Tehran, Tehran, Iran,

<sup>22A</sup>University of Stuttgart, Germany, <sup>22B</sup>Paderborn University, Germany

**Abstract**—The IEEE European Test Symposium (ETS) has been facilitating progress in electronic systems testing since its launch in 1996. On the occasion of its 30th anniversary, this collaborative paper gathers sections by 21 ETS teams to outline their influential ideas and milestones. Each team’s section highlights historical perspective, current research, frameworks and projects as well as forward-looking research agendas in the area of electronic-based circuits and systems testing, reliability, safety, security and validation. This anniversary summary documents how research of various ETS teams, exemplifying the test community, has been evolving and transitioning from concepts to practical standards and Electronic Design Automation (EDA) tools and flows. This legacy is a strong base to drive the next generation of advances in electronic systems testing.

## 1. INTRODUCTION

Electronic systems testing has undergone an impressive evolution over the past three decades, keeping pace with the exponential growth in semiconductor complexity. Since the establishment of the IEEE European Test Symposium (started as the IEEE European Test Workshop) in the mid-1990s, the ETS community’s research has been leading and reflecting major shifts in how we design for testability, generate and apply test patterns, and ensure the reliability and security of integrated circuits. This journey spans from foundational design-for-test (DFT) breakthroughs that made large designs testable, to sophisticated techniques today that safeguard cutting-edge hardware AI accelerators and even upcoming quantum devices. Along this path, ETS served as a global forum uniting diverse research teams from Europe and all continents to collaborate on advancing the state of the art. In this joint paper, we review the milestones and the key research themes of the community based on representative contributions of several ETS teams who submitted their sections in response to an open call. A summary of the ETS history from 1996 to 2019 is also outlined in [1].

**The rise of standardized test and DFT.** In the 1990s, ETS witnessed a transformative period marked by the standardization of DFT techniques. The adoption of *IEEE 1149.1 Boundary-Scan ("JTAG")* revolutionized test access by providing a universal interface for testing and debugging integrated circuits. This was followed by advances in *full-scan design*, where all flip-flops were made scannable, enabling *automated test pattern generation (ATPG)* tools to achieve high fault coverage. Boundary scan became the foundation of test infrastructure and paved the way for later standards (e.g., *IEEE 1500* for core wrappers and *IEEE 1687 "IJTAG"* for on-chip instruments) that extended the scan paradigm. *IDDQ (I<sub>DD</sub> Quiescent)* testing became a critical method for screening defects beyond the stuck-at fault model, leveraging quiescent supply current measurements to identify faulty ICs, while *at-speed testing* emerged as a solution to detect timing-related defects in high-frequency circuits.

**Built-in self-test and hybrid testing techniques.** With the growth of complexity, *built-in self-test (BIST) techniques* have gained prominence. *Memory BIST* became essential for embedded SRAM and ROM testing, significantly reducing test time by using on-chip pattern generators and signature analyzers. For logic testing, *hybrid BIST* combined pseudorandom patterns with deterministic ATPG vectors for hard-to-detect faults, optimizing fault coverage while minimizing test time. *Software-based self-test (SBST)* emerged as a versatile solution for processors, allowing in-field testing using self-generated software routines. Initially proposed for periodic in-field testing of mission-critical systems, SBST soon became interesting for manufacturing test of processors where traditional DFT was limited, at the same time enabling at-speed testing with zero extra hardware.
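The hybrid BIST idea can be illustrated with a minimal sketch: a linear feedback shift register (LFSR) supplies pseudorandom patterns, and deterministic vectors are added only where the pseudorandom phase has not already applied them. The function names, the 4-bit width, the tap positions and the two "deterministic" vectors below are illustrative assumptions, not taken from any specific ETS contribution.

```python
def lfsr_patterns(seed: int, taps: tuple, width: int, count: int) -> list:
    """Generate `count` patterns from a Fibonacci LFSR (left shift, XOR feedback)."""
    mask = (1 << width) - 1
    state = seed & mask
    if state == 0:
        raise ValueError("an all-zero seed locks up an XOR-based LFSR")
    patterns = []
    for _ in range(count):
        patterns.append(state)
        feedback = 0
        for t in taps:                      # XOR of the tapped state bits
            feedback ^= (state >> t) & 1
        state = ((state << 1) | feedback) & mask
    return patterns


def hybrid_test_set(pseudorandom: list, deterministic: list) -> list:
    """Append only the deterministic patterns not already applied pseudorandomly."""
    applied = set(pseudorandom)
    top_up = [p for p in deterministic if p not in applied]
    return list(pseudorandom) + top_up


# Hypothetical setup: 4-bit LFSR with taps on bits 3 and 2, six pseudorandom
# patterns, and two ATPG-style vectors assumed to target hard-to-detect faults.
pseudo = lfsr_patterns(seed=0b0001, taps=(3, 2), width=4, count=6)
test_set = hybrid_test_set(pseudo, deterministic=[0b0110, 0b1111])
```

In a real flow the deterministic vectors would come from ATPG runs targeting the faults left undetected by fault simulation of the pseudorandom phase; here they are simply given as inputs.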

**The Core-Based SoC testing paradigm.** The transition to system-on-chip (SoC) designs in the late 1990s brought new challenges, and ETS played a pivotal role in developing *core-based testing*. Here, rather than treating the chip as a single entity, each embedded core could be isolated with a test wrapper and accessed via a shared *test access mechanism (TAM)* bus. The *IEEE 1500 standard* formalized core wrappers and test access mechanisms, allowing modular testing of complex SoCs. Optimizing test access and scheduling for multi-core systems became a key research focus, with ETS teams supporting innovations in TAM design and *test scheduling algorithms*.

**Compressing test data and controlling power.** In the 2000s, test data volume became a critical challenge, leading to the development of *test compression techniques*. Methods like *scan compression* reduced requirements for data storage and transfer, while *low-power testing techniques* minimized the risk of excessive switching activity during testing. ETS teams pioneered numerous solutions for *X-filling* (assigning “don’t-care” bits in test patterns to reduce toggling), *power-aware test scheduling*, and *segmented scan*, making high-quality testing feasible even for power-sensitive designs. The community led several of these developments and influenced standards (e.g., the *IEEE P1838 standard* for 3D test access specifically considering power-aware features).
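As a small illustration of X-filling, the sketch below implements adjacent fill, one common heuristic in which each don't-care bit copies the nearest specified bit to its left so that the scan-in stream toggles as rarely as possible. The function names and the example pattern are assumptions chosen for illustration; practical tools also weigh capture power and compression constraints, which this sketch ignores.

```python
def adjacent_fill(pattern: str) -> str:
    """Fill each 'X' with the closest specified bit to its left.

    Leading X's take the first specified bit in the pattern; a pattern
    consisting only of X's is filled with zeros.
    """
    bits = list(pattern)
    last = next((b for b in bits if b in "01"), "0")
    for i, b in enumerate(bits):
        if b == "X":
            bits[i] = last
        else:
            last = b
    return "".join(bits)


def transitions(pattern: str) -> int:
    """Count adjacent bit pairs that differ (a proxy for scan-shift toggling)."""
    return sum(a != b for a, b in zip(pattern, pattern[1:]))


cube = "XX1XX0XX1X"                      # hypothetical test cube with don't-cares
filled = adjacent_fill(cube)
zero_filled = cube.replace("X", "0")     # naive 0-fill for comparison
```

Comparing `transitions(filled)` with `transitions(zero_filled)` shows the reduction in toggling for this cube.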

**Advanced fault models and formal methods.** ETS research has also driven advances in fault modeling and test generation. From *decision diagram assisted techniques*, such as Structurally Synthesized Binary Decision Diagrams (SSBDDs) in the late 1980s, to the practical application of *SAT-based ATPG*, these methods have expanded the scope of testable faults, especially in complex designs. Formal methods have enabled efficient test pattern generation and fault diagnosis, ensuring that even hard-to-detect faults can be addressed.

**Analog, mixed-signal, and non-digital testing.** While digital test techniques dominated, ETS teams also expanded their focus to support *analog and mixed-signal testing*, an inherently challenging area. The *IEEE 1149.4 standard* for analog boundary scan and *oscillation-based BIST* for analog circuit self-test demonstrated that analog testing could be simplified to a digital-like measurement.

**Reliability and resilience in the Era of AI and beyond.** In recent years, the focus of electronics testing has broadened from manufacturing defect detection to ensuring *reliability* and *resilience* of systems in the field. As feature sizes shrink, the community has intensified its efforts to address soft errors, aging effects, and weak-defect tolerance in designs. A significant trend has been *cross-layer reliability approaches*, where vulnerabilities are analyzed and mitigated across different abstraction levels (from devices up to software) and system layers. The concept of *self-healing* and *self-health aware hardware* is gaining traction, advocating electronic systems that can monitor their own “health” and adapt or reconfigure themselves to tolerate faults. This vision extends the traditional scope of testing into *runtime monitoring* and maintenance, aligning with emerging standards in *silicon lifecycle management*. The advent of AI hardware has posed new reliability challenges and has become a focal point for recent contributions of the ETS teams. These include methods for *fault resilience assessment* and *selective hardening* of DNN models considering the heterogeneous vulnerability of the neural network’s components. Looking even further ahead, the ETS community is addressing *emerging technologies* like quantum computing, where quantum devices are inherently fragile, suffering from decoherence and noise, along with the threats in focus for conventional technologies, e.g. soft errors.

Over the past 30 years, the ETS teams’ collective innovations have shaped the practice of electronic testing. Many of the highlighted breakthroughs transitioned from academic papers to industry standards and EDA tools, emphasizing collaboration between researchers across sectors. The ETS teamwork is evident in numerous synergistic multi-partner projects and cross-border research initiatives. This legacy is a strong base for the next generation of advances in electronic systems testing.

## 2. TALTECH: FROM DECISION DIAGRAMS TO EDGE AI RELIABILITY<sup>1,2</sup>

*Maksim Jenihhin, Jaan Raik, Artur Jutman, Natalia Cherezova, Raimund Ubar.*

*Tallinn University of Technology, Estonia*

This section outlines the evolution of test research at Tallinn University of Technology (TalTech), spanning from pioneering work on structural decision diagrams to cutting-edge reliability solutions for Edge AI chips. We detail our historical contributions, current advancements, and future directions, emphasizing our longstanding ties with the IEEE European Test Symposium.

### 2.A. Historical perspective of related research

The origins of our test research at TalTech date back to the 1970s, a transformative era in digital testing when foundational methodologies were being established to address the growing complexity of electronic systems. During this period, decision diagrams emerged as a powerful tool for representing and manipulating Boolean functions, largely due to the seminal work of Sheldon Akers introducing Binary Decision Diagrams (BDDs)<sup>3</sup> as a compact and canonical representation of logic functions. In the 1970s, Raimund Ubar introduced Alternative Graphs (AGs)<sup>4</sup>, later refined as Structurally Synthesized Binary Decision Diagrams (SSBDDs), to model both the logic functions and gate-level structures of digital circuits (Fig. 2.1) [2]. Unlike traditional BDDs, which focus solely on functional representation, SSBDDs establish a one-to-one mapping between graph nodes and circuit signal paths, enabling efficient fault simulation and test generation. This structural awareness facilitates detailed fault simulation, test generation, and analysis of fault propagation phenomena, such as fault masking and detectability. For example, SSBDDs can represent multiple stuck-at faults as "fault packages," allowing parallel simulation<sup>5</sup> of fault effects across a circuit. This capability drastically improves efficiency compared to sequential fault injection methods, which struggle with the combinatorial explosion of multiple fault scenarios [2].
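To make the node-to-path correspondence more concrete, the following toy sketch hand-builds a decision diagram in the spirit of an SSBDD for the circuit y = (a AND b) OR (NOT c) and simulates stuck-at faults by forcing the value seen at a node. It is only an illustration under simplifying assumptions: the diagram is written by hand rather than synthesized from a netlist, the names `Node`, `SSBDD`, `simulate` and `detected_faults` are hypothetical, and faults are simulated one at a time rather than in the parallel "fault package" manner described above.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Node:
    var: str                   # primary input labelling the node
    inverted: bool             # True if the represented signal path inverts the input
    on_one: Union[str, int]    # next node name, or terminal 0/1, when node value is 1
    on_zero: Union[str, int]   # next node name, or terminal 0/1, when node value is 0


# Hand-built diagram for y = (a AND b) OR (NOT c); each node stands for one
# signal path from an input to the output of this fan-out-free circuit.
SSBDD = {
    "a": Node("a", False, "b", "nc"),
    "b": Node("b", False, 1, "nc"),
    "nc": Node("c", True, 1, 0),
}


def simulate(inputs, stuck=None, root="a"):
    """Traverse the diagram; `stuck` = (node_name, value) forces that node's value."""
    current = root
    while current not in (0, 1):
        node = SSBDD[current]
        value = inputs[node.var] ^ node.inverted
        if stuck is not None and stuck[0] == current:
            value = stuck[1]           # inject the stuck-at fault on this path
        current = node.on_one if value else node.on_zero
    return current


def detected_faults(inputs):
    """List the (node, stuck_value) faults whose output differs from the fault-free one."""
    good = simulate(inputs)
    return [(name, v) for name in SSBDD for v in (0, 1)
            if simulate(inputs, stuck=(name, v)) != good]
```

For the pattern a=1, b=1, c=1 only the nodes on the activated path can expose a fault, which reflects why a structural diagram is convenient for fault analysis: the traversal itself identifies the signal paths that matter for a given input.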

To address the growing complexity of digital systems, we extended this framework to High-Level Decision Diagrams (HLDDs). HLDDs generalize SSBDDs for higher abstraction levels, such as register transfer level (RTL) and instruction set architecture (ISA), facilitating cross-level diagnostic modelling. This uniform formalism enabled the development of implementation-independent test generation algorithms, achieving high stuck-at-fault coverage for RISC-type processors<sup>6</sup> and even modelling and verification of temporal assertions<sup>7</sup>. Our comprehensive theory of structural DDs, detailed in [2], integrates these advancements, providing a robust foundation for test-related methods across abstraction levels. Our early adoption of SSBDDs in automated test pattern generators (ATPGs) in the 1980s, deployed within the Soviet defence and computer industries, and the global licensing of the Turbo-Tester software underscore the historical impact.

<sup>1</sup>The authors thank their current academic colleagues in the TalTech team: Prof. Gert Jervan, Prof. Masoud Daneshtalab, Prof. Samuel Pagliarini, Prof. Peeter Ellervee, Dr. Tara Ghasempouri, Dr. Levent Aksoy, Dr. Anton Tsertov, Dr. Sergei Devadze, Dr. Mahdi Taheri, Dr. Mohammad H. Ahmadilivani as well as all our great postdocs, PhD and MSc students and engaged researchers.

<sup>2</sup>We acknowledge the support by the Estonian Research Council grant PRG1467 CRASHLESS, CoE "Foundations of the Universe" (TK202), and EU Grants TAICHP (#101160182) and TIRAMISU (#101169378).

<sup>3</sup>S. B. Akers, "Binary Decision Diagrams," IEEE Trans. Comput., vol. C-27, no. 6, pp. 509–516, 1978.

<sup>4</sup>R. Ubar (1976) Test generation for digital circuits with alternative graphs. Proc Tallinn Technical Univ 409:75–81 (in Russian)

<sup>5</sup>R. Ubar, S. Devadze, J. Raik and A. Jutman, "Ultra Fast Parallel Fault Analysis on Structurally Synthesized BDDs," ETS, 2007, pp. 131-136.



Fig. 2.1. Representation of a gate-level circuit by SSBDD and BDD

Among other notable research directions of the TalTech team are board-level test solutions employing JTAG and IJTAG [3] technologies that facilitated the creation in 2005 of a spin-off company Testonica Lab<sup>8</sup> led by Artur Jutman. The team has studied Built-In Self-Test<sup>9</sup>, testing of NoCs<sup>10</sup> and nanoelectronics ageing (NBTI) and mitigation [4]. It has contributed to establishing the open-source RTL analysis EDA tool zamiaCAD<sup>11</sup>, automated RTL debug<sup>12</sup> and cross-level design dependability analysis.

Beyond technical innovation, our historical perspective reflects a collaborative spirit. The team's engagement with the international test community, particularly through the ETS, began in the 1990s and has grown steadily, with the team hosting the symposium in 2005, 2020 and 2025, VLSI-SoC in 2016, editions of DDECS in 2012 and 2023 and numerous other events of the test community. The team led several successful European multi-partner projects: RIA DIAMOND (2009-2012, J. Raik), RIA BASTION (2014-2017, A. Jutman), RIA IMMORTAL (2015-2018, J. Raik), Twinning TUTORIAL (2016-2018, J. Raik), MSCA DN RESCUE (2017-2021, M. Jenihhin), and Twinning SAFEST (2021-2023, S. Pagliarini).

<sup>6</sup>A.S. Oyeniran, R. Ubar, M. Jenihhin, C.C. Gürsoy and J. Raik, "High-Level Combined Deterministic and Pseudo-exhaustive Test Generation for RISC Processors," ETS, 2019, pp. 1-6.

<sup>7</sup>M. Jenihhin, J. Raik, A. Chepurov and R. Ubar, "Temporally Extended High-Level Decision Diagrams for PSL Assertions Simulation," ETS, 2008.

<sup>8</sup>Testonica Lab OÜ <https://testonica.com/>

<sup>9</sup>G. Jervan, P. Eles, Z. Peng, R. Ubar, M. Jenihhin, "Test time minimization for hybrid BIST of core-based systems," ATS, 2003, pp. 318-323

<sup>10</sup>J. Raik, R. Ubar and V. Govind, "Test Configurations for Diagnosing Faulty Links in NoC Switches," ETS, 2007, pp. 29-34.

<sup>11</sup>A. Tsepurov, G. Bartsch, R. Dorsch, M. Jenihhin, J. Raik and V. Tihhomirov, "A scalable model based RTL framework zamiaCAD for static analysis," VLSI-SoC, 2012, pp. 171-176.

<sup>12</sup>M. Jenihhin et al., "Automated Design Error Localization in RTL Designs," in IEEE Design & Test, vol. 31, no. 1, pp. 83-92, Feb. 2014.

### 2.B. State of play and assets in related research

In the present day, our research has shifted to address one of the most pressing challenges in modern computing: ensuring cost-efficient reliability for Edge AI chips, with a particular emphasis on Deep Neural Network (DNN) hardware accelerators. The shift to Edge AI, driven by real-time inference demands in resource-constrained environments, has heightened the need for reliable hardware in safety- and mission-critical applications. Our current efforts focus on developing innovative tools and techniques<sup>1</sup> to tackle these challenges.

A recent survey<sup>2</sup> categorizes the state-of-the-art DNN reliability assessment methods into fault injection, analytical and hybrid methods. DeepVigor [5] is a semi-analytical method that assesses DNN fault resiliency by providing vulnerability value ranges for neurons and vulnerability factors for layers and models (Fig. 2.2). This approach leverages analytical pruning of the fault space, reducing the need for exhaustive simulation while pinpointing areas requiring selective hardening with minimal overhead. The tool is at the core of open-source frameworks for DNN reliability assessment and enhancement<sup>3,4</sup>.
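For contrast with the semi-analytical approach, the sketch below shows the basic step of a plain fault-injection assessment: flipping a single bit in the float32 encoding of one weight and measuring how far a neuron's output deviates from its fault-free value. This is a generic illustration and not the DeepVigor algorithm; the function names, the ReLU neuron and the example weights and inputs are assumptions.

```python
import struct


def flip_bit(value: float, bit: int) -> float:
    """Flip one bit of the IEEE 754 float32 encoding (0 = mantissa LSB, 31 = sign)."""
    (raw,) = struct.unpack("<I", struct.pack("<f", value))
    (faulty,) = struct.unpack("<f", struct.pack("<I", raw ^ (1 << bit)))
    return faulty


def neuron(weights, inputs, bias=0.0):
    """Single ReLU neuron used as the unit under assessment."""
    return max(0.0, sum(w * x for w, x in zip(weights, inputs)) + bias)


def output_deviation(weights, inputs, weight_index, bit):
    """Absolute output change caused by one bit flip in one weight."""
    golden = neuron(weights, inputs)
    faulty_weights = list(weights)
    faulty_weights[weight_index] = flip_bit(weights[weight_index], bit)
    return abs(neuron(faulty_weights, inputs) - golden)


# Hypothetical weights and inputs, all exactly representable in float32.
weights = [0.5, -0.25, 1.0]
inputs = [1.0, 2.0, 0.5]
deviation_per_bit = {bit: output_deviation(weights, inputs, 0, bit)
                     for bit in (0, 23, 30, 31)}
```

Even this toy case shows the heterogeneity that motivates selective hardening: a flip in the mantissa LSB barely changes the output, while a flip in a high exponent bit changes it by many orders of magnitude. Exhaustively repeating such injections over every bit of every parameter is what analytical pruning of the fault space aims to avoid.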



Fig. 2.2. DeepVigor's semi-analytical DNN reliability assessment through neuron vulnerability value ranges [5]

Recently, we have been exploring approximate computing (AxC) and quantization as a strategy for balancing reliability and efficiency in DNN accelerators<sup>5,6</sup>. Our current work includes in-field error correction techniques, such as the adaptive fault-tolerant approximate multiplier architecture (AdAM) [6], which achieves "negative-overhead" reliability by repurposing unutilized resources in approximate multipliers. These advancements align with industry needs for systolic-array- and data-flow-based accelerators on ASICs and FPGAs in resource-constrained environments.

Industry-scale solutions for today's (FPGA-)SoCs demand a comprehensive cross-layer analysis in the system's fault-tolerance vendor- and user-spaces<sup>7</sup>. Here, the process for fault detection, localization and recovery may be sophisticated and application-dependent, implying on-chip monitors' data analysis and SoC health management protocols<sup>8</sup>, e.g. supported by IJTAG-based infrastructure.

<sup>1</sup>M. Jenihhin et al., "Keynote: Cost-Efficient Reliability for Edge-AI Chips," LATS, 2024.

<sup>2</sup>M. H. Ahmadilivani, M. Taheri, J. Raik, M. Daneshtalab, and M. Jenihhin, "A systematic literature review on hardware reliability assessment methods for deep neural networks," ACM Comput. Surv., 2024.

<sup>3</sup><https://github.com/mhahmadiLivan/DeepVigor>

<sup>4</sup><https://github.com/mhahmadiLivan/SentinelNN>

<sup>5</sup>N. Cherezova et al., "Heterogeneous Approximation of DNN HW Accelerators based on Channels Vulnerability," VLSI-SoC, 2024, pp. 1-4.

<sup>6</sup>S. Nazari, M. Taheri, et al., "FORTUNE: A Negative Memory Overhead Hardware-Agnostic Fault Tolerance Technique in DNNs", ATS, 2024.

### 2.C. Perspectives and outlook

Looking ahead to the era dominated by AI-centric and highly integrated chips, we foresee growing demand for cost-efficient reliability solutions as Edge-AI applications expand into safety-critical domains. We are exploring advanced techniques, including in-field error correction and cross-layer reliability management, to ensure dependable DNN accelerators.

One key trend is the increasing complexity and heterogeneity of computing architectures: next-generation chips will combine general-purpose processors with AI accelerators, distributed across cloud and edge, and fabricated in advanced technologies that can be more susceptible to faults. TalTech's team is building strong assets for collaboration through participation in European research initiatives, providing a bridge between academia and industry. Notably, the group recently started coordinating EU projects MSCA DN TIRAMISU (2024-2028, M. Jenihhin) and Twinning TAICHP (2024-2027, M. Jenihhin), which focus on energy-efficient and reliable Edge AI hardware. Through these projects, TalTech researchers are partnering with universities and industry players across Europe to develop cross-layer reliability and energy-efficiency solutions from circuit-level up to model- and system-levels for safety-critical applications.

We are eager to address future challenges in digital test and reliability of Edge AI, reinforcing our commitment to the ETS ecosystem. Here, an important direction is improving the autonomy and resiliency of electronic systems. The concept of self-healing or self-aware hardware is emerging, where chips can monitor their own health and adapt to failures. This calls for extending the silicon lifecycle management solutions through SoC-level system health management that TalTech develops in collaboration with the Testonica team.

The interplay between the EU Chips Act and the EU AI Act creates a powerful synergy and amplifies the potential of open-source HW/EDA frameworks by strengthening Europe's semiconductor infrastructure. TalTech is advancing such frameworks through collaboration with the RISC-V community. It also contributes to establishing a new national chips competence centre supported by Chips JU KIIP (2025, J. Raik).

Our vision for the future is deeply collaborative. The ETS community is already very successful, yet it can strengthen its ties even further by sharing tools, datasets and expertise, co-leading joint research projects on the resilience of tomorrow's computing systems, and contributing to open-source initiatives.

<sup>7</sup>N. Cherezova, K. Shabin, M. Jenihhin, and A. Jutman, "Understanding fault-tolerance vulnerabilities in advanced SoC FPGAs for critical applications," Microelectronics Reliability, vol. 146, 2023.

<sup>8</sup>K. Shabin, M. Jenihhin, A. Jutman, S. Devadze and A. Tsertov, "On-Chip Sensors Data Collection and Analysis for SoC Health Management," DFT, 2023, pp. 1-6.

## 3. A LEGACY OF DEPENDABILITY IN CLUJ-NAPOCA, ROMANIA<sup>1</sup>

*Liviu Miclea, Szilárd Enyedi, Iulia Stefan, Ovidiu Stan, Cosmina Corcheș.*

*Technical University of Cluj-Napoca, Romania*

Electronic systems testing has been fundamental to the technological and industrial progress in Cluj-Napoca, with institutions like IPA Cluj (Design Institute for Automation, Cluj-Napoca branch) [7] playing pivotal roles. We trace the evolution of testing in Cluj from 1976 to present-day advancements continued by the DeSy (Dependable Systems) Group at the Technical University of Cluj-Napoca's Automation Department. The conferences co-organized by these teams are also important, starting with THETA in 1982 which evolved into present-day AQTR, and including ETS 2015, TSS 2015, DDECS 2019.

### 3.A. Historical perspective

Cluj-Napoca has played a pivotal role in the advancement of Romanian computer science, particularly in the fields of automation and testing. The city's contributions date back to 1958 when it organized the first national symposia on cybernetics. This foundation was further strengthened in 1975 with the establishment of the Cluj Territorial Computing Center (CTCC), which at the time turned the city into a national informatics hub.

A major milestone in Cluj's technological development was the founding of the Design Institute for Automation (IPA) Cluj in 1976. Initially conceived as a regional branch of IPA Bucharest, its establishment was influenced by a government strategy that aimed to develop automation and medical instrumentation industries in Cluj. Prof. Dr. Eng. Marius Hăngănut played a key role in defining its niche focus on automated testing technologies, an area where IPA Cluj quickly excelled.

Under Prof. Hăngănut's leadership, IPA Cluj became a center of innovation, closely collaborating with the Technical University of Cluj-Napoca (TUCN). Prof. Hăngănut later became the head of TUCN's Automation Department. Many of IPA's early specialists, such as Alfred Letia, Clement Festilă, and Olimpiu Negru, were affiliated with TUCN. The institute was instrumental in developing Romania's first programmable logic controllers (AP 101, AP 117) and the ECAROM-800 process computers. During the 1980s, IPA Cluj expanded its expertise, collaborating with major industrial players such as Automatica, Electromagnetica, and IEIA Cluj. Their efforts led to pioneering developments in distributed control systems (SDC-2050) and hydroelectric turbine regulation systems (REH-76M). One of its landmark achievements was the THETAROM family of automated test systems (1976-1993), which were successfully exported to countries like Germany, Poland, and China, enhancing Romania's reputation in automated testing technology.

In 1982, IPA Cluj, led by Prof. Hăngănut, launched the THETA Symposium (Technologies and Equipment for Automated Testing), which became an annual event until 1989. The first edition, held on November 5-6, 1982, featured 135 scientific papers and 227 participants from 28 institutions. The symposium played a crucial role in familiarizing Romania's technical community with global trends in quality assurance, reliability, and maintainability in electronics and automation.

Despite the economic and industrial challenges following the transition to a market economy in the 1990s, IPA Cluj demonstrated resilience by shifting its focus toward SCADA systems, system integration projects, and software-based automation. This transition laid the groundwork for an academic shift, with key IPA figures, including Prof. Dr. Eng. Liviu Miclea, moving into academia. Prof. Miclea, an IPA Cluj researcher from 1984 to 1995, later became a professor at TUCN, significantly influencing automation and dependable systems research.

In 1996, the THETA symposium was rebranded under the IEEE-AQTR (Automation, Quality and Testing, Robotics conference [8]) umbrella, resuming under the leadership of Ioan Stoian (Director of IPA Cluj) and Liviu Miclea (TUCN faculty). Since 2004, the conference has been co-sponsored by the IEEE Computer Society through TTTC, represented by Mr. Yervant Zorian, President of TTTC USA. The conference proceedings since 2006 have all been indexed in IEEE Xplore and Web of Science, accumulating over 1600 papers with authors from over 18 countries, with an average acceptance rate of 63%.

### 3.B. Team

The Dependable Systems (DeSy) group is led by Prof. Dr. Eng. Liviu Miclea. Other members are Prof. Dr. Eng. Honoriu Vălean, Prof. Dr. Eng. Silviu Folea, Prof. Dr. Eng. Ovidiu Stan, Assoc. Prof. Dr. Eng. Szilárd Enyedi, Assoc. Prof. Dr. Eng. Dan Goța, Lecturer Dr. Eng. Iulia Stefan, Lecturer Dr. Eng. Cosmina Corcheș, Lecturer Dr. Eng. Adela Pop, Lecturer Dr. Eng. Alexandra Fanca, Lecturer Dr. Eng. Claudiu Domuța, and Assist. Eng. Marius Misaros.

The PhD student members of the DeSy team include Eng. Henrietta-Helena Futo, Eng. Diana-Elena Niti, Eng. Alexandru Stanciu, Eng. Pavel-Alexandru Bejan, Eng. Alexandru Ciobotaru, Eng. Tudor Covrig, Eng. Lucian Farmathy-Pop, Eng. Alexandra-Elena Dobre, Eng. Vlăduț Dobra, Eng. Andreea Muscan, Eng. Razvan Dologa, Eng. Bogdan Drăghici, Eng. George Flutur, and Eng. Alexandru Jibotean.

The academic transition from IPA Cluj was solidified with the leadership of Prof. Dr. Eng. Liviu Miclea, who played a crucial role in maintaining the legacy of IPA Cluj within TUCN. After joining TUCN as a lecturer in 1995, Prof. Miclea ascended to a full professorship by 2004 and served as Head of the Automation Department (2003-2011) and Dean of the Faculty of Automation and Computer Science (2012-2024); he is currently vice-president of TUCN's Senate.

At TUCN, the Dependable Systems Research Group (DeSy, <https://desy.utcluj.ro>) continues the tradition of automation research and testing while integrating modern challenges such as cybersecurity, artificial intelligence, IoT, and cyber-physical systems (CPS) [9]. The DeSy group focuses on system reliability, security, and resilience, addressing critical sectors such as energy, transportation, and healthcare. Their notable projects include research on cybersecurity, autonomous vehicles, and intelligent systems. The group offers consulting, training, and system design services for industries and academia, ensuring the continued relevance and application of their research. Many of the current members completed their PhD studies under Prof. Miclea's or Prof. Vălean's supervision, several of them also completing training courses abroad.

<sup>1</sup>Special thanks to DeSy team member Assoc. Prof. Dr. Dan Ioan Goța.

The IEEE-AQTR biennial conference series is steered by its general chair Prof. Dr. Eng. Liviu Miclea—the DeSy team leader—with the DeSy team members heavily involved in organizing it. With a strong network of international collaborators and industry partnerships, the team remains at the forefront of dependable and secure system development.

### 3.C. Recent publications

The Dependable Systems (DeSy) group has made significant contributions to the field through numerous high-impact publications in recent years.

In 2025, the team continues the research with two new articles on the reliability of assistive robots<sup>1</sup>, and the reliability of neural networks on resource-constrained hardware<sup>2</sup>.

In 2024, DeSy analyzed the feasibility of training neural networks on data sourced directly from medical imaging equipment, rather than on the images it generates<sup>3</sup>. The team assessed the reliability of the Pepper Robot in handling office documents<sup>4</sup>. The team also explored security in IoT with Kubernetes and Raspberry Pi clusters<sup>5</sup> and analyzed the dependability of a real-time monitoring testing setup<sup>6</sup>.

In 2023, a team member co-authored a study on hybrid Retina Net classifiers for thermal imaging<sup>7</sup>. The team also examined advancements in autonomous service robots<sup>8</sup>. Meanwhile, other members of the team investigated environmental monitoring and air quality prediction for occupational health<sup>9</sup>.

<sup>1</sup>M. Misaros, L. Miclea, D. Goța, A. Stan, O. Stan, and S. Enyedi, "Reliability Calculation for Pepper Robots. Case Study: Assistive Robot With Human," in Lecture Notes in Electrical Engineering, Springer Basel AG, 2025.

<sup>2</sup>S. Enyedi, C.-A. Deac, L. Miclea, O.-P. Stan, D.-I. Gota, and M. Misaros, "A Reliability and Performance Study of Neural Networks on Resource-Constrained Platforms," in Lecture Notes in Electrical Engineering, Springer Basel AG, 2025.

<sup>3</sup>S. Enyedi, "On the Feasibility of Deep Learning Classification from Raw Signal Data in Radiology, Ultrasonography and Electrophysiology," AQTR, 2024, pp. 1–6.

<sup>4</sup>M. Misaros, O. P. Stan, S. Enyedi, A. Stan, I. Donca, and L. C. Miclea, "A Method for Assessing the Reliability of the Pepper Robot in Handling Office Documents: A Case Study," Biomimetics, vol. 9, no. 9, 2024.

<sup>5</sup>I.-C. Donca, O. P. Stan, M. Misaros, A. Stan, and L. Miclea, "Comprehensive Security for IoT Devices with Kubernetes and Raspberry Pi Cluster," Electronics, vol. 13, no. 9, 2024.

<sup>6</sup>D.-E. Nită, T. Kubik, S. Enyedi, and L. C. Miclea, "Increasing the Dependability of a Real-Time System," AQTR, 2024, pp. 1–6.

<sup>7</sup>V. Teju, K. V. Sowmya, S. R. Kandula, A. Stan, and O. P. Stan, "A Hybrid Retina Net Classifier for Thermal Imaging," Applied Sciences, vol. 13, no. 14, 2023.

<sup>8</sup>M. Misaros, O.-P. Stan, I.-C. Donca, and L.-C. Miclea, "Autonomous Robots for Services—State of the Art, Challenges, and Research Areas," Sensors, vol. 23, no. 10, 2023.

<sup>9</sup>A. Pop (Puscasiu), A. Fanca, D. Gota, and H. Valean, "Monitoring and Prediction of Indoor Air Quality for Enhanced Occupational Health," IASC, vol. 35, no. 1, pp. 925–940, 2022.

During 2022, the team members improved continuous integration and deployment for software projects<sup>10</sup>. In another key study, the team examined RFID object identification systems in IoT environments<sup>11</sup>.

An influential work [10] explored early advances in low-power IoT solutions for smart home automation, laying the groundwork for subsequent innovations in the field.

Another paper, tackling reconfiguration security for hardware agents in testing [11], is noteworthy for its applicability, having been cited by several patents in IEEE Xplore.

The DeSy team's papers have over 1600 citations in international indexes.

### 3.D. Recent projects

The DeSy group has been actively involved in multiple research and development projects, both nationally and internationally, addressing critical areas such as cybersecurity, artificial intelligence, IoT, and cyber-physical systems.

Several team members give lectures within Google's *Cybersecurity Seminars* project<sup>12</sup>, aiming to improve cybersecurity education and awareness. The team also collaborates with École Centrale de Lyon, in a project focusing on developing robust AI systems<sup>13</sup>.

DeSy was also involved in the Inno-EU+ project<sup>14</sup>, a Higher Education Initiative aimed at supporting innovation and entrepreneurship. Another key project focused on integrating cognitive capabilities into robotics for autonomous applications<sup>15</sup>.

DeSy played a vital role in a national PN2-Partnerships project that developed reliable, collaborative SCADA solutions for water resource management<sup>16</sup>. Before that, the team worked with Siemens on a project exploring the use of drones for railway infrastructure maintenance<sup>17</sup>.

DeSy contributed to a European FP7 project that developed highly personalized and ecologically sustainable product-service solutions<sup>18</sup>. Another international collaboration project with Italy's Politecnico di Torino focused on cloud-driven self-healing mechanisms<sup>19</sup>.

The DeSy team members have also received two prestigious ITC/TTTC "Gerald W. Gordon" awards, and international awards for best papers and patents.

<sup>10</sup>I.-C. Donca, O. P. Stan, M. Misaros, D. Gota, and L. Miclea, "Method for Continuous Integration and Deployment Using a Pipeline Generator for Agile Software Projects," Sensors, vol. 22, no. 12, 2022.

<sup>11</sup>C. Corches, M. Daraban, and L. Miclea, "Availability of an RFID Object-Identification System in IoT Environments," Sensors, vol. 21, no. 18, 2021.

<sup>12</sup>Google and TUCN, "Cybersecurity Seminars," 2024-2026.

<sup>13</sup>ECL and TUCN, "FortifyAI: Enhancing Efficiency, Reliability, and Security through Integrated HW-SW Co-Design for AI Algorithms," 2024-2025.

<sup>14</sup>European University of Technology Alliance, "The Innovative European University of Technology," 2021-2023.

<sup>15</sup>UPB et al., "Robots and Society: Cognitive Systems for Personal Robots and Autonomous Vehicles (ROBIN)," 2018-2020.

<sup>16</sup>IPA et al., "SCADA Federation, Collaborative Instrument for Water Management – Someș River Pilot Application (F2S)," 2014-2017.

<sup>17</sup>Siemens and TUCN, "Use of Commercial Drones for Autonomous Maintenance Services in Railways," 2013-2016.

<sup>18</sup>TUCN et al., "Collaborative Environment for Design of Ambient Intelligence Enhanced Product-Services (ProSEco)," 2013-2017.

<sup>19</sup>Politecnico di Torino and TUCN, "Designing Cloud-based Self-healing Cyber-Physical Systems (CyCloSe)," 2013-2014.

#### 4. ESLAB/LINKÖPING U: SOC TESTING AND DFT

*Zebo Peng and Petru Eles.*

*ESLAB, Linköping University, Sweden*

This section briefly presents the research activities in the areas of SoC testing and DFT at the Embedded Systems Lab (ESLAB) of Linköping University over the last 30 years. It highlights several contributions that were published in the Proceedings of the European Test Symposium and the IEEE Transactions on Very Large-Scale Integration (VLSI) Systems.

##### 4.A. Historical Perspective

The ESLAB research group traces its origins to the 1980s, with a focus on developing the CAMAD high-level synthesis (HLS) system. CAMAD was designed to transform behavioral specifications, written in VHDL, into hardware implementation structures at the register-transfer level. Our approach to HLS introduced a formal representation based on an Extended Timed Petri Nets (ETPN) model, which provides distinct yet interconnected descriptions of control and data paths. ETPN serves as an intermediate design representation, enabling an iterative transformation process that refines high-level specifications into optimized hardware implementations.

The fundamental principle of our methodology is that once a behavioral specification is mapped into an ETPN model, it can be viewed as a primitive implementation, and correctness-preserving transformations can be systematically applied to refine and improve the design. The selection of transformations is guided by an optimization strategy, which concurrently addresses key synthesis tasks, including operation scheduling, data path allocation, and control allocation. Our integrated approach enhances the likelihood of achieving a globally optimal solution. This foundational research has significantly contributed to the evolution of HLS techniques, establishing ESLAB as a pioneer in the field.

Building on the ETPN model and the transformational approach to HLS, we began developing DFT techniques in the 1990s. Our initial focus was on creating a testability analysis technique to assess the intermediate results of the HLS transformation process. This algorithm evaluates key testability metrics, including test generation costs, fault coverage, and test application time. Based on the testability analysis results, targeted testability-improvement techniques are applied to the most test-resistant areas of the design.

We developed three primary testability enhancement techniques: 1) controllability/observability-balance allocation to ensure a more uniform distribution of controllability and observability across the design; 2) partial scan insertion to selectively insert scan registers to improve test coverage while minimizing overhead; and 3) conditional scan installation to introduce conditional scan paths to enhance observability in specific design regions. Unlike traditional post-synthesis testability enhancement methods, our approach integrates testability-improvement transformations directly into the HLS process. By simultaneously considering testability factors alongside operation scheduling, data path allocation, and control allocation, we achieve globally optimized, highly testable designs.

We developed a comprehensive framework for SoC testing, incorporating a suite of algorithms to address test scheduling, test access mechanism (TAM) design, test set selection, test parallelization, and test resource placement. Our approach aims to minimize test application time and reduce TAM cost, while adhering to constraints on power consumption and resource availability. A key component of our methodology is an efficient heuristic-based approach for designing a core-level test interface (wrapper) and a TAM architecture, along with its optimized test schedule [12]. This approach ensures compliance with IEEE P1500 standards while achieving a balance between performance and efficiency.

We developed a hybrid Built-In Self-Test (BIST) solution for SoC testing, integrating pseudorandom test sequences with stored deterministic test patterns in a cost-effective manner. It achieves the optimal balance between these two test techniques, minimizing both test time and memory requirements while maintaining high fault coverage. To achieve this balance, we proposed two algorithms that determine the optimal point at which to stop pseudorandom test generation and switch to stored deterministic patterns. Additionally, to accelerate the optimization process, we introduced a fast estimation method that efficiently evaluates the expected cost of different test configurations with minimal computational overhead. Furthermore, we implemented an advanced algorithm that identifies the optimal ratio of pseudorandom sequences and deterministic test patterns, ensuring minimum energy consumption while meeting memory constraints, all without compromising test quality [13].
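The trade-off behind the switching point can be illustrated with a minimal sketch. It is not one of the algorithms from [13]: it assumes a hypothetical linear cost model (weight `alpha` per pseudorandom pattern, weight `beta` per stored deterministic pattern) and an invented table `det_needed` giving how many deterministic patterns remain necessary after `L` pseudorandom patterns, and it simply searches all candidate switching points.

```python
def best_switch_point(det_needed, alpha=1.0, beta=4.0):
    """Return (L, cost) minimizing alpha*L + beta*det_needed[L].

    det_needed[L] is the (hypothetical) number of stored deterministic
    patterns still required after applying L pseudorandom patterns.
    alpha and beta are assumed cost weights for test time and memory.
    """
    costs = [alpha * length + beta * stored
             for length, stored in enumerate(det_needed)]
    best = min(range(len(costs)), key=costs.__getitem__)
    return best, costs[best]


# Invented, non-increasing example curve: pseudorandom patterns detect
# the easy faults quickly, then their benefit flattens out.
det_needed = [20, 12, 8, 6, 5, 5, 4, 4, 4, 4]
switch_at, cost = best_switch_point(det_needed)
print(switch_at, cost)
```

With these illustrative numbers the cheapest configuration applies six pseudorandom patterns and then switches to four stored patterns. A real flow would derive `det_needed` from fault simulation, or estimate it, as the fast estimation method mentioned above does.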

We developed a power-constrained test scheduling method for SoC testing in an abort-on-first-fail environment, where testing is terminated as soon as a fault is detected. Our approach leverages defect probabilities of individual cores to strategically guide test scheduling, ensuring that the expected total test time is minimized while adhering to peak power constraints. We proposed a heuristic algorithm that identifies a near-optimal scheduling order, balancing test efficiency and power constraints. Our method was also extended to support both test-per-clock and test-per-scan approaches, making it adaptable to a wide range of SoC architectures and enhancing flexibility in practical test applications.
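To make the role of defect probabilities concrete, the following sketch computes the expected test time of a purely sequential schedule under abort-on-first-fail. It is a simplification and not the heuristic described above: it ignores power constraints and concurrent testing, assumes independent defect probabilities, and assumes that a failing core test runs to completion before the abort. The core names, times, and probabilities are invented.

```python
from itertools import permutations


def expected_time(order, t, p):
    """Expected total test time when testing stops after the first failing core.

    t[c] is the test time of core c, p[c] its assumed defect probability.
    """
    total, reach = 0.0, 1.0  # reach = probability that this test is executed
    for core in order:
        total += reach * t[core]
        reach *= 1.0 - p[core]
    return total


def ratio_order(t, p):
    """Order cores by increasing t/p (short tests of likely-faulty cores first)."""
    return sorted(t, key=lambda c: t[c] / p[c])


# Hypothetical cores with test times and defect probabilities.
t = {"A": 10.0, "B": 4.0, "C": 6.0}
p = {"A": 0.05, "B": 0.10, "C": 0.02}

order = ratio_order(t, p)
print(order, expected_time(order, t, p))
print(min(expected_time(perm, t, p) for perm in permutations(t)))
```

For this sequential special case, an exchange argument shows that sorting by `t/p` is optimal, which the brute-force check over all permutations confirms. Adding peak-power limits and parallel test sessions turns the problem into the harder scheduling task addressed in the text.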

Over the years, we have also developed a range of innovative techniques for testing hardware/software systems, addressing both hardware and software components in a unified manner. One of our key contributions is a hierarchical test generation technique for embedded systems, which ensures a seamless approach to testing integrated hardware-software architectures. Additionally, we developed a framework for optimizing assertion placement in time-constrained hardware/software modules, aimed at detecting transient and intermittent faults. This framework systematically identifies candidate assertion locations, associates optimal assertions with each location, and selects the most effective assertion set based on performance impact and assertion tightness [14].

##### 4.B. Thermal-aware Test Techniques

In recent years, our research has focused on thermal-related challenges in SoC testing, where high temperatures threaten both the test process itself and system performance and reliability. To address these challenges, a thermal-aware test scheduling technique is essential to minimize test application time while ensuring that core temperatures remain within safe operating limits.

We have developed a set of thermal-aware optimization techniques that effectively reduce test time while preventing core overheating. Our approach leverages test set partitioning, dividing test sets into shorter sequences interspersed with cooling periods to dissipate heat and maintain temperature stability. Additionally, we interleave test sequences from different test sets, strategically utilizing cooling periods and test bus bandwidth for data transportation—an approach that significantly reduces overall test time. In this context, the test scheduling problem is formulated as a combinatorial optimization problem, which we solve using constraint logic programming (CLP) to achieve an optimal test execution strategy.
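The idea of partitioning a test set into shorter sequences separated by cooling periods can be sketched with a toy single-core model. This is a greedy illustration, not the CLP-based formulation used in our work, and it omits interleaving of several test sets. The first-order thermal model and all parameters (`t_amb`, `t_max`, `heat`, `k`) are assumptions chosen only for the example.

```python
def schedule_with_cooling(test_len, t_amb=40.0, t_max=90.0, heat=8.0, k=0.1):
    """Greedily interleave test steps ('T') and cooling steps ('C').

    Assumed toy thermal model per time step:
      testing: temp += heat - k * (temp - t_amb)
      cooling: temp -= k * (temp - t_amb)
    A test step is applied only if it keeps the core at or below t_max.
    """
    temp, done, peak = t_amb, 0, t_amb
    timeline = []
    while done < test_len:
        next_temp = temp + heat - k * (temp - t_amb)
        if next_temp <= t_max:
            temp = next_temp
            done += 1
            timeline.append("T")
        else:
            temp -= k * (temp - t_amb)
            timeline.append("C")
        peak = max(peak, temp)
    return timeline, peak


timeline, peak = schedule_with_cooling(test_len=30)
print("".join(timeline))
print(f"total steps: {len(timeline)}, peak temperature: {peak:.1f}")
```

The resulting timeline shows the test set split into short bursts with cooling steps in between. In a multi-core setting, those cooling steps are exactly the slots where test sequences of other cores can be interleaved.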

Beyond temperature management during testing, research has also shown that different defects manifest at varying temperature levels, necessitating multi-temperature testing for improved defect detection. To address the extended test time challenge associated with multi-temperature testing, we developed a test scheduling technique that generates the shortest possible test schedules, ensuring that cores under test remain within a defined temperature interval while maintaining high defect detection efficiency. Our work contributes to enhanced SoC reliability by balancing test efficiency, thermal constraints, and defect detection accuracy, making multi-temperature-aware testing a viable solution for SoCs.

In advanced SoC designs, large temperature gradients also pose a significant challenge, exacerbating defects such as early-life failures and delay faults. Traditionally, thermal-aware test scheduling techniques aimed only to minimize test application time while avoiding overheating. However, these techniques may overlook defects that manifest only under specific temperature conditions or gradients. For example, certain defects, such as clock-skew related faults, may require specific temperature gradients between neighboring blocks to be detected. Consequently, efficient detection of these defects requires precise thermal control during the testing process, to ensure that temperature gradients of appropriate magnitudes are applied to the circuit under test. This challenge is even more pronounced in 3D-stacked integrated circuits (3D-SICs) compared to traditional 2D ICs, due to their inherently larger temperature gradients [15].

To address this issue, we have developed efficient methods for enforcing specified temperature gradients on 3D-SICs during both burn-in and delay-fault testing. Our approach applies high-power stimuli to specific cores through the test access mechanism, eliminating the need for external heating mechanisms. By strategically injecting high-power test stimuli, we generate controlled thermal stress conditions, allowing for targeted defect detection. To further optimize this process, we jointly schedule the execution of the test, the application of high-power stimuli, and the cooling intervals, using temperature simulations to rapidly enforce the desired temperature gradients [15]. The test schedule generation is guided by thermal functions derived from a set of thermal equations, ensuring precise temperature control throughout the testing process. Experimental results validate the effectiveness of our approach, demonstrating its ability to efficiently enforce specified temperature gradients, thereby enhancing the reliability and accuracy of burn-in and delay-fault testing for both 2D and 3D SoCs.

We have also developed a thermal-aware software-based self-testing (SBST) method leveraging bounded model checking (BMC) to ensure that test temperatures remain within specified limits while accurately detecting worst-case delay faults under high-temperature conditions. It employs sequential constraints to guide automatic test pattern generation (ATPG), ensuring that the generated delay test patterns operate in functional mode. Additionally, it utilizes multi-level processor information to reduce model complexity, prevent BMC time-out aborts, and enable automated test program generation [16]. Experimental results demonstrate that our method achieves high delay fault coverage, effectively mitigates yield loss from over-testing, and enhances reliability in temperature-sensitive scenarios.

##### 4.C. Perspectives and Future Directions

Our ongoing research will focus on thermal-aware test and design techniques for SoC and cyber-physical systems (CPS). Many traditional techniques were developed with the assumption that chips within the same batch experience similar temperature profiles during testing and operation. In reality, however, deep-submicron technologies introduce significant process variations, resulting in temperature discrepancies even among chips produced in the same fabrication process. These variations, driven by differences in thermal resistance, capacitance, and power dissipation, require adaptive test and operation strategies that account for these inconsistencies. Addressing these complexities is essential to develop robust and reliable SoC/CPS testing and design methodologies.

Traditional SoC designs that disregard process variation risk inefficiency and unreliable operation. To mitigate these issues, our aim is to develop a framework for analyzing temperature-induced performance characteristics and failure mechanisms that incorporates process variation uncertainties. Since offline temperature simulations often deviate from actual on-chip conditions, an online scheduling approach, where real-time temperature monitoring via on-chip sensors informs test execution, can enhance adaptability. However, such an approach introduces run-time overhead and delays from temperature readouts. A hybrid strategy that takes advantage of both offline and online scheduling is necessary to balance efficiency and accuracy.

Future research directions also include: 1) Integrated design methodologies that holistically address thermal, reliability, security, and performance constraints, streamlining the SoC and CPS design process. 2) On-the-fly CPS optimization, leveraging machine learning and real-time sensor data to dynamically manage fluctuating workloads that cannot be fully anticipated at design time. 3) Efficient testing methodologies capable of handling the growing complexity of CPS architectures, ensuring long-term robustness and reliability throughout the system's lifecycle.

## 5. WHEN SAT-BASED ATPG BECAME PRACTICAL<sup>1</sup>

R. Drechsler<sup>UniB</sup>, S. Eggersglüß<sup>Siemens</sup>, G. Fey<sup>TUHH</sup>, A. Glowatz<sup>Siemens</sup>,  
D. Tille<sup>Siemens</sup>.

<sup>UniB</sup>University of Bremen, Germany

<sup>Siemens</sup>Siemens EDA, Germany

<sup>TUHH</sup>Hamburg University of Technology, Germany

While test generation based on Boolean Satisfiability had been proposed early on, taking it to practice took another 15 years. The advent of powerful solvers for Boolean Satisfiability together with deep insights on circuit modeling and features of test pattern generation were the enablers.

### 5.A. Motivation

While Moore's law was still on full track with exponentially increasing numbers of gates per chip, achieving sufficient fault coverage by pre-computed tests became more and more difficult. Classical structural algorithms for *Automatic Test Pattern Generation* (ATPG) were very successful, but increasing logic depth and complexity of circuit structures started causing test pattern generation to fail. Here, failing means that the goal of more than 99.9 percent fault coverage was no longer achieved. To reach such a coverage, it is necessary to find test patterns for the modeled faults or to prove that a certain fault is untestable due to logic redundancies. Even relaxing the parameters of structural test generation algorithms like FAN and accepting longer run times for test generation did not classify all faults but left many faults unclassified, resulting in a low fault coverage.

At about the same time, modern solvers for *Boolean Satisfiability* (SAT) had been developed and shown to be very effective in solving verification problems that had not been solved before. This motivated revisiting the idea of using SAT for ATPG and, in the longer run, led to the industrial-scale application of SAT-based ATPG.

### 5.B. Historical Perspective

ATPG relies on fault models. For a long time, the stuck-at fault model was sufficient to create tests that detect all relevant physical defects. The extension to delay faults then continued to suffice until around 2010. Delay faults can be reduced to stuck-at faults for test generation, so in principle the same algorithms were suitable. The main difference is that two time frames, and thus a more difficult search problem, must be considered.

Deterministic test generation is the underlying, most complex (NP-complete) problem: it decides whether a test for a specific modeled fault exists or whether the fault is untestable. However, this effort is only required for the faults that are most difficult to test or that are difficult to prove untestable, while many faults are easily tested, i.e., many of the possible input assignments would show a difference from the correct behavior in presence of the fault. Thus, larger ATPG frameworks, as illustrated in a simplified manner in Figure 5.1, combine deterministic test generation with fault collapsing, random simulation, sorting heuristics and fault compaction to create small test suites with high fault coverage.

Fig. 5.1. ATPG Framework

Structural algorithms for ATPG like the D-algorithm, PODEM or FAN run directly on the circuit structure. The D-algorithm uses the basic insight that a Boolean difference between the faulty and the correct circuit must exist along a structural path from the site of a modeled fault to an observable (pseudo) primary output. PODEM starts at (pseudo) primary inputs to search for assignments that lead to a test for a given modeled fault. FAN starts from the fault location to justify an assignment that creates a difference between faulty and correct circuit that is then propagated towards outputs. Backtracking was needed whenever an assignment was found not to create a test. Basic learning techniques helped to transfer information between backtracking steps, but kept mainly local information and were difficult to implement.

Conceptually, all deterministic test generation algorithms exploit that the Boolean difference between the function of a faulty circuit and the correct circuit must be non-zero. This has been exploited by computing the Boolean difference directly, but the related symbolic algorithms did not scale well for large circuits and complex circuit structures. On the other hand, SAT-based ATPG was used early on, comparing two circuits with respect to their functional equivalence using a so-called miter structure. Embedding structural knowledge related to the D-algorithm propagation rules into that model significantly improved the performance, but for a long time it was not competitive with the structural algorithms.

This changed when SAT solvers became much more powerful tools. The underlying homogeneous data structure related to the conjunctive normal form, non-chronological backtracking, conflict resolution, and the resulting ability to exclude large non-solution spaces efficiently brought a gigantic advance in their efficiency.
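The miter formulation mentioned above can be illustrated on a toy example. The sketch below compares a fault-free and a faulty copy of a small circuit and reports every input assignment on which their outputs differ. It enumerates all inputs exhaustively instead of calling a SAT solver, and the two circuits and the injected stuck-at-0 fault are invented for illustration.

```python
from itertools import product


def circuit(a, b, c, stuck=None):
    """y = (a AND b) OR c; 'stuck' optionally forces the internal net n."""
    n = a & b
    if stuck is not None:
        n = stuck
    return n | c


def redundant(a, b, stuck=None):
    """y = a OR (a AND b); the AND gate is logically redundant."""
    n = a & b
    if stuck is not None:
        n = stuck
    return a | n


def find_tests(good, faulty, n_inputs):
    """Miter check: inputs where good XOR faulty evaluates to 1."""
    return [bits for bits in product((0, 1), repeat=n_inputs)
            if good(*bits) ^ faulty(*bits)]


tests = find_tests(lambda a, b, c: circuit(a, b, c),
                   lambda a, b, c: circuit(a, b, c, stuck=0), 3)
untestable = find_tests(lambda a, b: redundant(a, b),
                        lambda a, b: redundant(a, b, stuck=0), 2)
print(tests)       # assignments that detect n stuck-at-0
print(untestable)  # empty list: the fault is untestable
```

A non-empty result corresponds to a satisfiable miter (a test exists), while an empty result corresponds to an unsatisfiable one (the fault is untestable due to redundancy). A SAT solver reaches the same verdict without enumerating the exponentially many inputs.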

<sup>1</sup>We would like to thank Friedrich Hapke and Jürgen Schlöffel for their contribution and many helpful discussions.

TABLE 5.1  
INTEGRATING SAT-BASED ATPG AND FAN [17]

| Circuit | #targ.           | SAT ab. | SAT time | FAN(de) ab. | FAN(de) time | FAN+SAT ab. | FAN+SAT time |
|---------|------------------|---------|----------|-------------|--------------|-------------|--------------|
| p99k    | $1.6 \cdot 10^5$ | 0       | 6:50m    | 1398        | 6:02m        | 0           | 7:25m        |
| p177k   | $1.7 \cdot 10^5$ | 0       | 1:19h    | 270         | 16:06m       | 0           | 19:03m       |
| p462k   | $6.7 \cdot 10^5$ | 6       | 2:16h    | 1383        | 1:34h        | 0           | 1:51h        |
| p565k   | $1 \cdot 10^6$   | 0       | 2:23h    | 1391        | 2:21h        | 0           | 2:47h        |
| p1330k  | $1.5 \cdot 10^6$ | 1       | 5:05h    | 889         | 4:15h        | 0           | 5:00h        |

### 5.C. Collaborative Work and Advancement

A team was formed in 2005 consisting of academic and industrial researchers from the University of Bremen, Germany, as well as from Philips/NXP Semiconductors, Hamburg, Germany. This fruitful collaboration resulted in advancements in the field of ATPG as well as in SAT solving techniques. A short technical overview of the main achievements with respect to the ETS community is given in the following.

With the increase in SAT performance, SAT-based ATPG was able to outperform classical structural algorithms on hard problem instances, i.e., faults for which structural algorithms failed to find a test pattern or could not classify the fault as untestable [18]. While this effectively handles the deterministic test generation problem on a single fault, exploiting this efficiency within a full ATPG framework requires significant effort [17]. The integration with existing engines is an engineering aspect. Non-Boolean behavior of circuitry must be encoded efficiently. Environment constraints resulting from the integration into larger structures must be enforced to generate valid tests that can be applied in practice. If these practical aspects are treated, SAT-based ATPG can speed up the overall process significantly while increasing the fault coverage at the same time. The work in [17] showed that a significant fault coverage increase was achieved on industrial circuits. Table 5.1 shows, for selected circuits with a given number of ATPG targets (#targ.), how SAT reduces the number of unclassified faults (ab.) and how the combination FAN+SAT also yields acceptable run times.

However, growing design sizes, growing complexity, and more complex fault models also require continuous improvement of modeling and solving techniques to keep SAT-based ATPG practical. A selection of these improvements is discussed in the following.

An advantage of turning test generation into a satisfiability problem is that constraint modeling makes it easy, in principle, to embed other types of faults. While the traditional modeling covered only stuck-at faults, the SAT-based ATPG formulation was extended to transition faults, path delay faults as well as small delay defects. However, the problem remains NP-complete, so merely increasing the size of the problem instance may still lead to performance bottlenecks. Smart encoding achieves the required scalability and the guarantees on test quality needed for the practical application [19].

The traditional approach of first generating the complete SAT instance and then solving it has its disadvantages. Firstly, conflicts within untestable faults often occur locally, bounded to the fault site. While proving them unsatisfiable is trivial, the complete instance has to be generated first. Secondly, unlike circuit-based approaches, it is not sufficient to propagate the fault effect to an observation point and justify the required propagation paths. Instead, the complete SAT instance has to be solved. Incremental techniques, where a partial instance is generated initially and extended if necessary, have shown significant improvements [20].

Usually, a SAT instance is a one-to-one mapping of the circuit structure to a Boolean formula. The redundancy, often contained in the circuit logic, is therefore encoded in the SAT instance as well. This leads to increased solving runtimes. Optimizing the Boolean logic, e.g. with the application of BDDs, can significantly reduce the SAT instance size and consequently the required runtime of SAT-based ATPG [21].
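The one-to-one mapping from gates to clauses is commonly realized with the Tseitin encoding, sketched below for a two-gate circuit. The variable numbering, the example circuit, and the brute-force model enumeration (standing in for a real SAT solver) are assumptions made for illustration. Literals follow the usual signed-integer convention, where `-v` denotes the negation of variable `v`.

```python
from itertools import product


def and_gate(z, x, y):
    """CNF clauses for z <-> (x AND y)."""
    return [[-z, x], [-z, y], [z, -x, -y]]


def or_gate(z, x, y):
    """CNF clauses for z <-> (x OR y)."""
    return [[z, -x], [z, -y], [-z, x, y]]


def satisfied(clauses, assign):
    """True if every clause has at least one literal that is true under assign."""
    return all(any(assign[abs(lit)] == (lit > 0) for lit in clause)
               for clause in clauses)


def models(clauses, n_vars):
    """Enumerate all satisfying assignments (brute force, small instances only)."""
    for bits in product((False, True), repeat=n_vars):
        assign = dict(enumerate(bits, start=1))
        if satisfied(clauses, assign):
            yield assign


# Hypothetical circuit y = (a AND b) OR c with internal net n.
A, B, C, N, Y = 1, 2, 3, 4, 5
cnf = and_gate(N, A, B) + or_gate(Y, N, C)

sols = list(models(cnf, 5))
forced = list(models(cnf + [[Y], [-C]], 5))  # require y = 1 and c = 0
print(len(sols), forced)
```

Every gate contributes a fixed set of clauses, so redundant logic in the circuit shows up as redundant clauses in the instance, which is what the optimization discussed above removes. Adding the unit clauses for `y = 1` and `c = 0` leaves a single model with `a = b = 1`, showing how constraints are justified back to the inputs.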

Applying other reasoning techniques like interpolation or induction and adding structural information to the SAT solving process can further improve the speed. Replacing the SAT solver with a Pseudo-Boolean SAT or optimization solver facilitates more powerful modeling, e.g., to find longest paths.

### 5.D. Outlook

Today, SAT-based ATPG is a standard engine due to its effectiveness and versatility in including new fault models. To date, numerous further techniques have been developed to address test compaction, newer fault models, the integration of technology information, and system-level test.

## 6. ONSEMI AND KU LEUVEN: ADVANCES IN AUTOMATED DEFECT-ORIENTED TESTING OF ANALOG AND MIXED-SIGNAL INTEGRATED CIRCUITS<sup>1,2</sup>

*A. Coyette, R. Vanhooren, W. Dobbelaere - onsemi Belgium;  
G. Gielen - KU Leuven, Belgium*

During their research collaboration over the past decade, onsemi and KU Leuven have strongly invested in the development of defect models and techniques for the automated generation of defect-oriented test methods and design-for-test techniques for analog and mixed-signal (AMS) integrated circuits (ICs), with the goal to maximize defect coverage and minimize test escapes. This paper gives an overview and perspective of the main achievements.

### 6.A. Historical perspective of the research

Throughout their decade of collaborative research, onsemi and KU Leuven have realized significant achievements with high impact. These have greatly increased the detection of hard and latent defects and reduced the test escapes in AMS IC testing: 1) several analog defect models have been published; 2) novel design-for-test (DfT) structures have been presented and frameworks have been developed for their automatic insertion in AMS circuits to enhance controllability and observability; and 3) techniques have been developed for the (automated) generation of test signals and signature/test data analysis. Several of these achievements are highlighted below.

1) *Defect Models*: A fundamental and critical element of a defect-oriented testing approach obviously resides in the defect models being used. While the large body of literature uses over and over again – and with success – the same typical fault models such as the 5-fault or 6-fault models for transistors, the onsemi-KU Leuven collaboration also analyzed the weaknesses of these existing models in adequately representing physical failures, and invested in the development and validation of better analog defect models: the open gate model and the pre-activation of gate oxide breakdown.

In [22], KU Leuven-onsemi have presented a new open-gate DC fault model instead of the commonly used high-resistance model. The model is validated on experimental results on fabricated test circuits in 0.35- $\mu\text{m}$  BCD technology. The paper also presents a new testing approach to detect open defects in analog circuits, based on forcing the transistors outside their designed operation region.

Once all hard defects are adequately detected through testing, the main cause of test escapes that lead to IC failures in the field is latent defects such as pinholes. In [23], KU Leuven-onsemi have presented a compact model for such pinhole defects that can be used to develop test methods for latent defect detection. The model has been validated experimentally on a 0.35- $\mu\text{m}$  DMOS technology and establishes the practical range of model values that should be used in simulations. The experiments consist of characterizing transistors containing

<sup>1</sup>The authors acknowledge the Flemish authorities for funding this work through multiple projects: Safe-IC 1, Safe-IC 2, NoRMA and AnalogTRIC.

<sup>2</sup>Other co-authors of this work are Baris Esen, Nektar Xama and Jhon Gomez.

latent defects that have been introduced artificially by etching pinholes in their gate (see an example in Figure 6.1). The measurement results show that the drain current in transistors with defects increases with the area and depth of the defect, and that this behavior can be modeled accurately using an effective  $t_{\text{ox}}$  value during test simulations.

As a clear indication of the impact of this work, these contributions on AMS defect models have been anchored into IEEE standards such as P2427 on Analog Defect Modeling.



Fig. 6.1. Atomic force microscopy picture of a transistor on which a pinhole was introduced for the modeling of latent defects in [23].

2) *Automated insertion of DfT structures*: The focus in this part of the collaboration has been to develop DfT structures and optimization algorithms that insert these structures into the Circuit under Test (CUT) so as to increase the defect coverage at the lowest possible cost. The targeted DfT solutions were lightweight in order to minimize the silicon area overhead and to facilitate generic deployment, i.e., they should be general enough to be inserted in all or most types of circuits. For instance, one simple DfT example explored was to add test diodes to the circuit, allowing a fully simultaneous detection of potential defects<sup>3</sup>.

To enhance the controllability of a given AMS circuit during testing, a topology modification technique has been developed in which pull-up or pull-down test transistors are inserted into the initial circuit to connect targeted nodes to the ground or the power supply<sup>4</sup>. During testing, the gates of these inserted transistors, and hence the connections, are activated, thereby modifying the topology of the circuit to better expose targeted defects. Later, this topology modification method was developed further to address the specific case of screening out latent defects [24], a contribution that received the ETS 2017 Best Paper award and is illustrated in Figure 6.2.

To enhance the observability of a given circuit, a technique has been introduced [25] to determine the circuit locations to which small local detection blocks are to be connected. The presence of a targeted defect triggers such a "detector block", which then leaves a detectable trace in the power consumption. Hence,

<sup>3</sup>B. Esen, et al. "A very low cost and highly parallel DfT method for analog and mixed-signal circuits," proceedings IEEE European Test Symposium (ETS), 2017.

<sup>4</sup>A. Coyette, et al. "Automated testing of mixed-signal integrated circuits by topology modification," proceedings IEEE VLSI Test Symposium (VTS), 2015.



Fig. 6.2. Example of topology modification (in gray) applied in [24].

defect coverage is increased without extra test pins and with limited extra area.

Later on, controllability-enhancing and observability-enhancing DfT structures were co-generated and their combination was co-optimized to maximize the overall defect coverage at minimum extra test cost<sup>1</sup>. This was then extended to a full Built-In Self-Test (BIST) approach for analog IP blocks, where the blocks are co-designed with low-cost DfT structures for on-chip test signal generation and response analysis.

This work had a large impact on industry. At the International Test Conference in 2024, Siemens presented follow-up developments and results that capitalize on the topology modification and local-detection techniques. For example, ADC circuits can now be tested within microseconds with high defect coverage, and defect simulations can be completed within the same day. Further investigations are also ongoing at onsemi to effectively deploy these techniques in the field.

3) *Automated generation and analysis of test signals:* The third part of the onsemi-KU Leuven collaboration focused on methods for the automated generation of improved test signals and test response analysis to increase the coverage of hard and latent defects.

For automated analog test signal generation, the ADAGE tool has been developed<sup>2</sup>. It determines optimal test stimuli to maximize the defect coverage, while simultaneously adding the necessary DfT circuitry. It uses circuit partitioning and interval arithmetic to deal with large circuit complexity<sup>3</sup>. Experiments show a large increase in coverage at minimum test cost.

Several methods have been developed for improved test response analysis. The optimized selection of proper defect-specific masks on test outputs has been shown to improve the

<sup>1</sup>A. Coyette, et al. "Automatic generation of test infrastructures for analog integrated circuits by controllability and observability co-optimization," *Integration, the VLSI Journal*, 2016.

<sup>2</sup>A. Coyette, et al. "ADAGE: automatic DfT-assisted generation of test stimuli for mixed-signal integrated circuits," *IEEE Design & Test*, Vol. 35, no. 3, June 2018.

<sup>3</sup>A. Coyette, et al. "Automatic test signal generation for mixed-signal integrated circuits using circuit partitioning and interval analysis," *proceedings IEEE International Test Conference (ITC)*, 2016.

defect coverage<sup>4</sup>. Also, the use of Dynamic Part Average Testing (DPAT) instead of standard PAT at the final testing stage of ICs has been shown to improve the analog fault coverage<sup>5</sup>. Combining DPAT with information from visual inspection makes it possible to effectively screen out outliers at a small increase in yield loss [26], a contribution that received the ETS 2020 Best Paper award. The use of multivariate statistical methods has also been shown to be effective for building predictor models for outlier behavior<sup>6</sup>. Even relatively simple metrics such as the Difference in Distance to Mean value (DDtM) have been demonstrated to better identify latent defects at zero extra cost<sup>7</sup>.
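The difference between fixed test limits and dynamic PAT limits can be sketched as follows. This is a minimal illustration, not the procedure of the cited papers: it assumes robust statistics (median and interquartile range divided by 1.35) with a window of six robust sigmas, as commonly described in automotive PAT guidelines, and the measurement values and static limits are hypothetical.

```python
from statistics import quantiles

def dpat_limits(values, k=6.0):
    """Dynamic limits computed from the population under test itself."""
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    robust_sigma = (q3 - q1) / 1.35   # assumed robust estimate of the spread
    return median - k * robust_sigma, median + k * robust_sigma

def dpat_outliers(values, static_limits, k=6.0):
    """Indices of parts that pass the static limits but fall outside the dynamic ones."""
    static_lo, static_hi = static_limits
    dyn_lo, dyn_hi = dpat_limits(values, k)
    return [i for i, v in enumerate(values)
            if static_lo <= v <= static_hi and not dyn_lo <= v <= dyn_hi]

# Hypothetical measurements of one parameter on one wafer (arbitrary units).
readings = [1.00, 1.02, 0.98, 1.01, 0.99, 1.03, 0.97, 1.00, 1.01, 0.99, 1.40]
print(dpat_outliers(readings, static_limits=(0.5, 1.5)))  # [10]
```

The last part is within specification, so a conventional limit check passes it, but it lies far from the rest of its population and is flagged as a potential carrier of a latent defect.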

### 6.B. Current state of play

Recent work is focusing on improving the defect coverage even further, also for latent defects, to bring the level of AMS test escapes closer to the target of zero parts per billion (PPB) needed to avoid field returns. Machine learning techniques have been shown to extract even more information from test data for this purpose. An approach using pre-trained SVM classifiers showed excellent results for latent defect detection at limited yield loss and without extra tests in the test program<sup>8</sup>. Secondly, embedding test generation and signature analysis in analog IP blocks with embedded BIST not only allows continuous monitoring of circuit functioning, but also shortens the total test development time<sup>9</sup>.

### 6.C. Connection to ETS and outlook

Many of the publications resulting from the joint onsemi-KU Leuven research collaboration have been published at the European Test Symposium (ETS), and two papers even received the Best Paper Award (in 2017 and 2020). It is clear that techniques for automating and improving the effectiveness of AMS IC testing and the related DfT methods are extremely important in industry, especially for safety-critical applications such as automotive. In addition, industry is being confronted with many additional constraints. Time-to-market pressure is always high, not only for design but also for test development. The workload of the engineers also keeps rising, with additional constraints such as functional safety, cybersecurity assessment and formal requirement management. These constraints need effective solutions in the near future.

<sup>4</sup>A. Coyette, et al. "Optimization of analog fault coverage by exploiting defect-specific masking," *proceedings IEEE European Test Symposium (ETS)*, 2014.

<sup>5</sup>W. Dobbelaere, et al. "Analog fault coverage improvement using final-test dynamic part average testing," *proc. International Test Conference (ITC)*, 2016.

<sup>6</sup>N. Xama, et al. "Avoiding mixed-signal field by outlier detection of hard-to-detect defects based on multivariate statistics," *proceedings IEEE European Test Symposium (ETS)*, 2020.

<sup>7</sup>J. Gomez, et al. "DDtM: increasing latent defect detection in analog/mixed-signal ICs using the difference in distance to mean value," *IEEE Tr. on Computer-Aided Design*, Vol. 41, no. 11, Nov. 2022.

<sup>8</sup>N. Xama, et al. "Boosting latent defect coverage in automotive mixed-signal ICs using SVM classifiers," *IEEE Tr. on Computer-Aided Design*, Vol. 42, no. 10, Oct. 2023.

<sup>9</sup>J. Gomez, et al. "High-cover analog IP block test generation methodology using low-cost signal generation and output response analysis," *proceedings IEEE European Test Symposium (ETS)*, 2023.

## 7. IMEC IN TEST

*Po-Yao Chuang and Erik Jan Marinissen.  
imec – Leuven, Belgium*

### 7.A. Team Composition

For the first 24 years of imec's existence, "test" was not considered a research topic. Imec's fabs — at the time only two, one for  $\phi 200$ mm wafers and one for  $\phi 300$ mm wafers — produced numerous wafers filled with parametric test structures such as Meander-Forks and Van der Pauws. These structures help(ed) characterize various technology options as part of imec's world-leading wafer processing technology research. However, testing was seen as routine operational work, not worth patenting or publishing.

This all changed in October 2008 with the arrival of Erik Jan Marinissen at imec. Several customers from imec's 3D-Integration research program had begun inquiring about how and when to test 3D-stacked multi-die stacks. In response, imec hired Marinissen, who had previously worked for almost 20 years at Philips Research on various test and design-for-test (DfT) topics, the most important one being modular testing of embedded-core-based SoCs.

For fifteen years, imec's test research team was a one-man operation, occasionally supported by individual colleagues, suppliers, customers, and students. Among those colleagues were Phillippe Absil, Eric Beyne, Kristof Croes, Jeroen De Coster, Jaber Derakhshandeh, Bart De Wachter, Ingrid De Wolf, Luc Dupas, Mario Konijnenburg, Dimitri Linten, Rafal Magdziak, Stephan O'Loughlin, Herman Oprins, Armita Pod-pod, Siddharth Rao, Gouri Sankar Kar, Michele Stucchi, Joris Van Campenhout, and Geert Van der Plas. Among the suppliers were companies like Cadence Design Systems, Mentor Graphics (now Siemens EDA), Synopsys, Cascade Microtech (now FormFactor), Feinmetall, MPI Corporation, Technoprobe, and Tokyo Electron, while among the customers were ARM, GLOBALFOUNDRIES, Qualcomm, and TSMC.

Students from around the world traveled to Leuven, usually for internships lasting six to nine months: Jouke Verbree (TU Delft), Chun-Chuan Chi (NTHU Hsinchu), Sergej Deutsch (Braunschweig), Po-Yuan Chen (NTHU Hsinchu), Christos Papameletis (TU Delft), Tobias Burgherr (FHNW), Konstantin Shibin (TalTech), Ferenc Fodor (Cluj-Napoca), Ming Shao (KU Leuven), Yu Li (KU Leuven), Harm van Schaaijk (TU Eindhoven), Yu-Rong Jian (NTHU Hsinchu), Pai-Yu Tan (NTHU Hsinchu), Santosh Malagi (TU Delft), Leonidas Katselas (Thessaloniki), Michael Mainemer (TU Delft), Min-Chun Hu (NTHU Hsinchu), Stef Hermans (TU Eindhoven), Ricardo Pardo Parado (TU Eindhoven), Francesco Lorenzelli (Bologna), Marco Claessens (TU Eindhoven), Lizhou Wu (TU Delft), Zhan Gao (TU Eindhoven), Po-Yao Chuang (NTHU Hsinchu), Francesco Lorenzelli (KU Leuven), Sicong Yuan (TU Delft), Tsung-Hsuan (Annie) Wang (NTHU Hsinchu), and Fang-Ying (Kelly) Chen (NTHU Hsinchu).

In 2022, imec, prompted by Marinissen's Parkinson's disease, opened a vacancy for a second employee in the Test and DfT team. There was a lot of response to this vacancy



Fig. 7.1. From left to right: (a) Erik Jan Marinissen, Christos Papameletis, and Tobias Burgherr; (b) Jeroen De Coster, Santosh Malagi, Ferenc Fodor, Yu-Rong Jian, Erik Jan Marinissen, Zhan Gao, Pai-Yu Tan, and Ingrid De Wolf; (c) Lizhou Wu, Alexander Marinissen, Zhan Gao, Adelia and Erik Jan Marinissen, Santosh Malagi, and Leonidas Katselas; (d) Alexander Marinissen, Zhan Gao, Francesco Lorenzelli, Po-Yao Chuang, and Kazuki Monta.

from all over the globe, but in the end it was decided to go with Po-Yao Chuang, a PhD student of Prof. Cheng-Wen Wu at the National Tsing-Hua University (NTHU) in Hsinchu, Taiwan, who was at that time doing an internship at imec. After completing this internship in March 2023, Chuang returned to Taiwan, and then came back to start as an imec employee in July 2023. Chuang completed his PhD studies at NTHU in December 2024. Figure 7.2 shows Chuang and Marinissen on November 25, 2024, just before Chuang's departure for his PhD exam at NTHU in Hsinchu, Taiwan.



Fig. 7.2. Po-Yao Chuang (left) and Erik Jan Marinissen in Nov. 2024.

### 7.B. Test Research Topics at imec

Marinissen originally came to imec to address the test and test generation challenges of 2.5D- and 3D-stacked ICs and

hence most of the work falls under imec's industrial affiliation program on 3D system integration. Topics we worked and published on include the following.

- **3D-CoSTAR: test cost modeling and flow optimization** [*JETTA'12*] [*TODAES'15*]

Together with TU Delft and Qualcomm, we developed a versatile software tool that accounts for (1) design, (2) wafer processing, (3) stack assembly, (4) packaging, (5) testing, and (6) logistics shipment between various factories. The output of the tool is the total cost and expected quality level of a multi-die stack, as well as a cost breakdown per type.

- **Wafer matching of repositories of pre-tested wafers** [*ETS'10*] [*JETTA'12*]

Stacking entire wafers has attractive benefits, but unfortunately suffers from low compound stack yield, as one cannot avoid stacking a bad die onto a good die or vice versa. Matching individual wafers from repositories of pre-tested wafers to each other is a simple yet effective method to significantly increase the compound stack yield. We presented a mathematical model. Simulation results demonstrated that, for realistic cases, relative yield increases of 0.5% to 10% could be achieved.
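The idea behind wafer matching can be illustrated with a small sketch. It assumes two-tier stacks, wafer maps given as lists of pass/fail flags per die site, and a simple greedy matching heuristic; these choices and the wafer maps themselves are hypothetical and are not the mathematical model referred to above.

```python
def good_stacks(bottom, top):
    """Number of die sites where both the bottom and the top die are good."""
    return sum(b & t for b, t in zip(bottom, top))

def stack_in_order(bottoms, tops):
    """Baseline: stack the i-th bottom wafer with the i-th top wafer."""
    return sum(good_stacks(b, t) for b, t in zip(bottoms, tops))

def stack_greedy(bottoms, tops):
    """For each bottom wafer, pick the remaining top wafer with the best overlap."""
    remaining = list(tops)
    total = 0
    for bottom in bottoms:
        best = max(remaining, key=lambda top: good_stacks(bottom, top))
        remaining.remove(best)
        total += good_stacks(bottom, best)
    return total

# Hypothetical pre-test results (1 = good die, 0 = bad die), six die sites per wafer.
bottoms = [[1, 1, 1, 0, 0, 1], [0, 1, 1, 1, 1, 0], [1, 0, 1, 1, 0, 1]]
tops    = [[0, 1, 1, 1, 1, 0], [1, 0, 1, 1, 0, 1], [1, 1, 1, 0, 0, 1]]

print(stack_in_order(bottoms, tops))  # 7 good stacks out of 18 sites
print(stack_greedy(bottoms, tops))    # 12 good stacks out of 18 sites
```

The wafer maps are deliberately contrived to make the effect visible and do not reproduce the 0.5% to 10% range quoted above. A greedy heuristic is also not guaranteed to find the best possible pairing for every repository.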

- **Large-array fine-pitch micro-bump probing** [*ITC'14*] [*ITC-Asia'17*]

Together with Cascade Microtech, Tokyo Electron, and Technoprobe, we demonstrated the feasibility of probing large arrays with 1,752 micro-bumps at  $40\mu\text{m}$  pitch (= JEDEC's "WideIO2"), to enable pre-bond testing of non-bottom dies through their functional micro-bump interface.

- **Various other 3D probe challenges** [*ITC'15*] [27]

Includes probing on large tape frames, probing on ultra-thin wafers on tape, and probing singulated die (stacks) on tape.

- **3D-DfT test access architecture** [*VTS'10*] [*JETTA'12*]

We defined a standardizable DfT architecture based on IEEE Std 1500<sup>TM</sup> and patented certain aspects of it. In a joint development project with Cadence, we developed the corresponding DfT insertion tool flow. We designed two demonstrator test chips, one with TSMC and the other one with GLOBALFOUNDRIES.

- **Standard 3D-DfT** [28] [*Computer'21*]

In 2010, Marinissen led an IEEE-SA study group that took inventory of the needs for standards in 3D testing and subsequently started the working group that developed IEEE Std 1838<sup>TM</sup>-2019. Marinissen has been instrumental in the proliferation of this standard, through press releases, articles and interviews in trade journals, conference tutorials and papers, in-house company courses, etc.

- **Post-bond testing of passive interposers** [*ITC'11*] [*ATS'11*]

2.5D- and 5.5D-stacked ICs contain at their bottom a passive interposer that interconnects the die (stacks) on top of it. Our work systematically covers these interconnects while also using them as a test access mechanism.

- **Chiplet interconnect test** [*3DIC'23*] [*ETS'25*] [29].

Chuang and Marinissen worked jointly on improving the tests for inter-die interconnects. In comparison with the conventional True/Complement Test by Wagner, which, for  $k$  interconnects, requires  $2 \cdot \lceil \log_2(k) \rceil$  test patterns, our E<sup>2</sup>TEST is both more *effective* (as it includes coverage of *weak* open and short defects) and more *efficient* (as we no longer spend test patterns on non-realistic shorts between bumps that are too far apart). Later, E<sup>2</sup>TEST was extended to handle multiple "adjacency maps" for micro-bumps and/or interposer wiring.
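For reference, the pattern count of the True/Complement baseline can be reproduced with a short sketch. It shows only the conventional scheme, in which every interconnect receives a unique binary code word followed by its complement; it is not an implementation of E<sup>2</sup>TEST, and the function name is hypothetical.

```python
def true_complement_patterns(k):
    """Return the test patterns for k interconnects as lists of k bit values."""
    if k < 2:
        raise ValueError("need at least two interconnects")
    width = (k - 1).bit_length()  # equals ceil(log2(k)) for k >= 2
    # Pattern number 'bit' drives bit 'bit' of each net's index onto that net.
    true_part = [[(net >> bit) & 1 for net in range(k)] for bit in range(width)]
    complement_part = [[1 - value for value in pattern] for pattern in true_part]
    return true_part + complement_part

patterns = true_complement_patterns(5)
print(len(patterns))  # 6, i.e. 2 * ceil(log2(5))

# The sequence of values seen by each net over all patterns is its code word.
codes = [tuple(pattern[net] for pattern in patterns) for net in range(5)]
print(codes[0])  # (0, 0, 0, 1, 1, 1)
```

Because all code words differ and each contains both a 0 and a 1, every pair of nets is driven to opposite values at least once and every net is driven to both logic levels. The count is independent of which nets are physically adjacent, which is the inefficiency that an adjacency-aware test avoids.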

- **Chiplet interconnect repair** [*VTS'24*] [29].

Chuang and Marinissen worked jointly on an Interconnect Repair Language that describes the on-chip "repair" (= detouring) infrastructure.

Next to 3D-test and -DfT, we also worked on other test-related topics, including the following.

- **Optimization of Cell-Aware Test** [*JETTA'21*] [30].

Together with Zhan Gao, PhD student at TU/e, and Cadence Design Systems, we developed and improved Cadence's Cell-Aware ATPG tool:

- Defect location identification on the basis of parametric extraction
- Optimizing Cell-Aware ATPG using defect detection matrices
- Application of Cell-Aware Test on an advanced 3nm CMOS technology library
- Tightening the mesh size of the Cell-Aware ATPG net for catching all detectable weakest faults

- **Equipment for automatic wafer probing of silicon photonics chips** [*ETS'16*]

In a joint development project with Cascade Microtech, we developed a wafer probe station for electrical and optical measurements.

- **STT-MRAM defects, faults, and test algorithms** [*DATE'20*] [*ITC'20*] [31]

With Lizhou Wu and Sicong Yuan, PhD students at TU Delft.

- **Understanding the transistor behavior of electron-spin qubits above cryogenic temperatures** [*EDL'24*, *TED'25*]

With Francesco Lorenzelli, PhD student at KU Leuven.

### 7.C. imec and IEEE European Test Symposium 2021

imec had won the bid to host and organize the 2021 edition of ETS in the beautiful medieval city of Bruges. However, a second wave of the worldwide COVID pandemic forced us to turn ETS into an online-only event. Under the leadership of General Co-Chairs Michele Stucchi (imec) and Georges Gielen (KUL), the registration rates could be drastically reduced, which resulted in an all-time high of 346 participants. Erik Jan Marinissen (imec) organized a unique online social event: the first Global Test Community Quiz. With 175 participants, GTCQ was a great success, and it was hence repeated at ITC'21.

## 8. TIMA: TEST, RELIABILITY AND SECURITY<sup>1,2</sup>

*G. Di Natale, M. J. Barragan, P. Maistri, S. Mir, E. I. Vatajelu  
Univ. Grenoble Alpes, CNRS, Grenoble INP, TIMA, France*

### 8.A. TIMA

Established in 1994, TIMA (Techniques of Informatics and Microelectronics for integrated systems Architecture) focuses on the architecture of integrated systems, encompassing electronic engineering and computer science. The laboratory is located in downtown Grenoble (France). Faculty members are either full-time researchers at the CNRS (The National Centre for Scientific Research), or professors at the Graduate schools of Engineering and Management (Grenoble INP) and University Grenoble Alpes (UGA).

The laboratory's expertise spans the specification, design, verification, and testing of integrated circuits and systems, leveraging advancements in micro and nanoelectronics (digital, analog, RF, MEMS, and sensors) and computer science (microprocessor architectures, embedded operating systems, and algorithms). TIMA aims to address challenges related to energy consumption, cost, performance, quality, and dependability (reliability, security, and safety) while promoting design automation through advanced methods and CAD tools.

TIMA has been a significant player in the ETS community, contributing as authors of numerous papers and serving on steering, organizing, and program committees:

Steering Committee (L. Anghel, G. Di Natale, EI. Vatajelu), General Chair (L. Anghel), Program Chair (S. Mir, EI. Vatajelu), Topic Chair (G. Di Natale, S. Mir, R. Leveugle, EI. Vatajelu), TSS Chair (L. Anghel, EI. Vatajelu), Program Committee Members (M. Barragan, M. Benabdenbi, B. Courtois, G. Di Natale, S. Mir, M. Nicolaïdis, M. Portolan, EI. Vatajelu).

### 8.B. Journey Through Our Research Activities

**Test and Standards:** Our research team has made significant contributions to the domain of VLSI testing and test standards, focusing on innovative solutions for memory testing, power efficiency, and interactive testing frameworks. We have developed advanced Built-In Self-Test (BIST) methodologies, including a programmable memory BIST based on transparency, which enhances flexibility by allowing programmable test algorithms and data for detecting a wide range of memory faults<sup>3</sup>. Additionally, we proposed a mixed-signal BIST computation offloading approach leveraging IEEE 1687, which reduces computational overhead by offloading intensive DSP tasks off-chip, enabling independent optimization of analog and digital components<sup>4</sup>. In the domain of power efficiency, we introduced low-power memory repair architectures tailored for high defect densities, combining Error-Correcting Code (ECC) techniques

<sup>1</sup>Our full team: M. Benabdenbi, S. Bourdel, K. Morin Allory, E. Lauga, R. Leveugle, R. Possamai Bastos, E. Simeu, N. Zergainoh

<sup>2</sup>Our previous members: L. Anghel, M. Nicolaïdis, M. Portolan, H. Stratigopoulos, R. Velazco

<sup>3</sup>S. Boutobza, et al., A Transparent based Programmable Memory BIST, ETS06

<sup>4</sup>M. Portolan, Mixed-Signal BIST computation offloading using IEEE 1687, ETS17

with associative caches to minimize energy consumption during memory repair processes<sup>5</sup>. Furthermore, we explored interactive testing frameworks using IEEE 1687, employing PDL-1 to enable real-time communication between testers and the system under test, with implications for EDA tools and analog applications<sup>6</sup>. A key highlight of our work is the recent paper [32], which addresses the challenges of reliability in STT-MRAM under Dynamic Voltage Scaling (DVS). We proposed a novel methodology utilizing failure prediction registers to anticipate memory error probabilities at different supply voltage configurations. By dynamically adjusting the voltage based on these predictions, our approach ensures memory reliability while balancing speed, energy efficiency, defect reduction, and aging management. This work defines an acceptable voltage range that minimizes read and write errors, offering a robust solution for optimizing STT-MRAM performance in energy-constrained environments.

**AMS/RF Test and Reliability:** Since the creation of the TIMA laboratory over 30 years ago, we have regularly contributed to the field of analog, mixed-signal and RF test. Our research is focused on two complementary goals: a) enabling low-cost analog BIST solutions, and b) developing machine learning-based test solutions, usually called alternate test in the literature, for AMS/RF integrated circuits.

Regarding our contributions in analog BIST for the European Test Symposium, our team has presented state-of-the-art embedded test instruments and efficient on-chip test techniques for mixed-signal circuits. We have developed the concept of reduced-code testing for the static test of ADCs, which has been demonstrated for the on-chip characterization of pipeline<sup>7</sup> and SAR ADCs<sup>8</sup>. We have proposed accurate and spectrally pure on-chip sinusoidal stimulus generators based on the principle of harmonic cancellation for the dynamic test of ADCs<sup>9,10</sup>. We have proposed an on-chip test instrument based on subsampling techniques for the characterization of high-frequency clock jitter in industrial PLLs<sup>11</sup>. MEMS BIST solutions based on pseudorandom test stimuli and impulse response analysis have been developed<sup>12,13</sup>.
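The principle of harmonic cancellation can be illustrated numerically. The sketch below uses a textbook configuration, namely three square waves shifted by -45°, 0° and +45° and weighted 1 : √2 : 1, which cancels the 3rd and 5th harmonics. This configuration is an assumption chosen for illustration and is not taken from the cited generators, which may use different numbers of phases and weights.

```python
import math

def square(theta):
    """Unit square wave with the same period and phase as sin(theta)."""
    return 1.0 if math.sin(theta) >= 0.0 else -1.0

def combined(theta):
    """Weighted sum of three phase-shifted square waves (weights 1 : sqrt(2) : 1)."""
    shift = math.pi / 4
    return square(theta - shift) + math.sqrt(2) * square(theta) + square(theta + shift)

def harmonic_amplitude(signal, n, samples=4096):
    """Amplitude of the n-th harmonic, estimated by a discrete Fourier sum over one period."""
    re = im = 0.0
    for i in range(samples):
        theta = 2 * math.pi * (i + 0.5) / samples  # midpoints avoid the switching instants
        x = signal(theta)
        re += x * math.cos(n * theta)
        im += x * math.sin(n * theta)
    return 2 * math.hypot(re, im) / samples

for n in (1, 3, 5, 7):
    print(n, round(harmonic_amplitude(combined, n), 4))
```

Each harmonic n of a single square wave is scaled by √2 + 2cos(nπ/4) in the sum, which is zero for n = 3 and n = 5 and equals 2√2 for n = 1 and n = 7. The first uncancelled harmonic is therefore the 7th, which is easier to remove with a simple filter. In a real circuit, timing and weight inaccuracies limit the achievable cancellation.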

With respect to our contributions on alternate test methods, we have focused our efforts on exploring and overcoming the application limits of machine learning-based test for AMS/RF

<sup>5</sup>P. Papavramidou, et al., Reducing Power Dissipation in Memory Repair for High Defect Densities, ETS13

<sup>6</sup>M. Portolan, What Would Interactive Testing With 1687 Look Like?, ETS24

<sup>7</sup>A. Laraba, et al., Enhanced reduced code linearity test technique for multi-bit/stage pipeline ADCs, ETS12

<sup>8</sup>R. S. Feitoza, et al., On-chip reduced-code static linearity test of Vcm-based switching SAR ADCs using an incremental analog-to-digital converter, ETS20

<sup>9</sup>H. Malloug, et al., A 52 dB-SFDR 166 MHz sinusoidal signal generator for mixed-signal BIST applications in 28 nm FDSOI technology, ETS19

<sup>10</sup>A. Mamgain, et al., Analysis and mitigation of timing inaccuracies in high-frequency on-chip sinusoidal signal generators based on harmonic cancellation, ETS21

<sup>11</sup>H. Le Gall, et al. High frequency jitter estimator for SoCs, ETS15

<sup>12</sup>A. Dhayni, et al. MEMS built-in-self-test using MLS, ETS04

<sup>13</sup>A. Dhayni, et al., Evaluation of impulse response-based BIST techniques for MEMS in the presence of weak nonlinearities, ETS05

circuits. We proposed the concept of defect filtering for screening faulty circuits<sup>1</sup>. Since alternate test techniques are trained to learn process variations from observational data from the production line, circuits containing fabrication defects would degrade the training as the observed degradation is not caused by process variations. We then proposed the use of non-intrusive sensors for alternate test of RF circuits, addressing in particular parametric faults<sup>2</sup>. The idea of determining the root-cause of performance degradations motivated our research on feature selection, feature design and causal inference, aimed at designing optimum signature sets for robust and accurate machine learning-based test and calibration strategies<sup>3,4</sup>. These techniques were employed to develop an automated test generation flow for AMS-RF alternate test, that was demonstrated on a nonintrusive test program for mm-wave integrated circuits [33].

**Fault Tolerance and Robustness:** TIMA has made significant contributions to fault tolerance and performance analysis in complex digital systems by developing advanced models and innovative methodologies. We have designed high-level models to analyze the resilience of Networks-on-Chip (NoC) against faults, enabling the anticipation and enhancement of system robustness<sup>5</sup>. In the field of circuit robustness verification, we have proposed a hybrid algorithm combining SAT solving and simulation, capable of efficiently identifying counterexamples while considering logical, temporal, and electrical masking, as well as technological variations<sup>6</sup>. Our work has also focused on designing efficient fault detection architectures for low-power processors, particularly latch-based designs, as well as developing formal metrics to quantify system resilience through formal fault injection techniques<sup>7</sup>. Finally, we have studied self-repair mechanisms for interconnections in 3D integrated systems, leveraging adaptive serialization and deserialization techniques to ensure increased reliability<sup>8</sup>. In [34] we have explored alternative methods for assessing system dependability in compliance with functional safety standards. Traditional fault injection techniques, while effective, are costly, complex, and often applicable only late in development. To enable early and efficient evaluations, the paper highlights the advantages of cross-layer evaluation over single-layer methods, and it proposes alternative approaches such as Register Data Lifetime (RDL) analysis, Architectural Correct Execution (ACE) analysis, and software-level techniques, which help identify critical system components.

<sup>1</sup>H.-G. Stratigopoulos, et al., Defect Filter for Alternate RF Test, ETS09

<sup>2</sup>L. Abdallah, et al., Sensors for built-in alternate RF test, ETS10

<sup>3</sup>G. Leger, et al., Questioning the reliability of Monte Carlo simulation for machine learning test validation, ETS16

<sup>4</sup>M. J. Barragan et al., Efficient selection of signatures for analog/RF alternate test, ETS13

<sup>5</sup>F. Chaix, et al., A generic and high-level model of large unreliable NOCs for fault tolerance and performance, ETS14

<sup>6</sup>N. Thole, A Hybrid Algorithm to Conservatively Check the Robustness of Circuits, ETS16

<sup>7</sup>H. Yu, et al., Efficient Fault Detection Architecture Design of Latch-Based Low Power DSP/MCU Processor, ETS11

<sup>8</sup>M. Nicolaïdis, et al., I-BIRAS: Interconnect Built-In Self-Repair and Adaptive Serialization in 3D Integrated Systems, ETS11

## Hardware Security and Trust

We have made significant contributions to the security of hardware and test infrastructures by addressing vulnerabilities and developing protection mechanisms for integrated circuits (ICs). We have investigated the risks posed by hardware Trojans (HTs), proposing detection and mitigation techniques that prevent stealthy attacks exploiting shared test access mechanisms (TAMs)<sup>9</sup>. To secure access to test infrastructures, we have developed dynamic authentication mechanisms, such as integrating the SSAK protocol into MAST, ensuring secure execution of test procedures<sup>10</sup>. In the context of 2.5D/3D ICs, we have combined scan chain encryption with message integrity verification to protect against unauthorized access and malicious tampering [35]. Additionally, we have contributed to control flow integrity (CFI) verification, employing non-linear coding techniques to detect and prevent code manipulation<sup>11</sup>. Our research emphasizes the need for multi-layered security approaches, balancing protection with performance constraints, and ensuring hardware security throughout the entire lifecycle, from design to deployment<sup>12</sup>. More recently, we have started working on Physical Unclonable Functions (PUFs) as a means to enhance hardware security through cryptographic primitives used for key generation and device authentication. Our work focuses on evaluating and improving the reliability of PUFs, particularly their resistance to environmental variations, aging effects such as Bias Temperature Instability (BTI) and Time-Dependent Dielectric Breakdown (TDDB), as well as fault injection attacks (FIA) using laser-based techniques [36]. We have explored the integration of digital sensors to monitor environmental changes, such as voltage and temperature fluctuations, which could impact PUF response integrity. 
Specifically, for delay-based PUFs, we have investigated how propagation time variations can be leveraged for enhanced security, proposing mechanisms that detect anomalies and reinforce the robustness of PUF-generated cryptographic keys.
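
As an illustration of what evaluating PUF reliability can mean in practice, a common metric is the intra-device Hamming distance between a reference (enrollment) response and later re-measurements of the same challenge. The sketch below is a generic example under stated assumptions, not the evaluation flow used in the cited work: the function names, the bit-list response format, and the independent bit-flip noise model are all hypothetical.

```python
import random


def hamming_distance(a, b):
    """Number of bit positions in which two equal-length responses differ."""
    return sum(x != y for x, y in zip(a, b))


def noisy_response(reference, flip_prob, rng):
    """Hypothetical noise model: each bit flips independently with flip_prob.

    This stands in for environmental or aging effects; a real study would
    use measured or simulated responses instead.
    """
    return [bit ^ int(rng.random() < flip_prob) for bit in reference]


def reliability(reference, remeasurements):
    """Reliability = 1 - (average intra-device Hamming distance / response length)."""
    n = len(reference)
    avg_hd = sum(hamming_distance(reference, r) for r in remeasurements) / len(remeasurements)
    return 1.0 - avg_hd / n


if __name__ == "__main__":
    rng = random.Random(0)
    ref = [rng.randint(0, 1) for _ in range(128)]
    runs = [noisy_response(ref, 0.05, rng) for _ in range(100)]
    print(f"estimated reliability: {reliability(ref, runs):.3f}")
```

A value close to 1 means the responses are stable across re-measurements; a drop under a given voltage or temperature condition indicates bits that would need masking or error correction before being used for key generation.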

## 8.C. Perspectives

TIMA aims to be a key player in the major transformations of the electronics industry, aligning with the European Chips Act and addressing sustainability challenges. We will advance quality, security, and reliability in both digital and analog systems, developing methods for heterogeneous 2.5D/3D chiplet-based ASICs and enhancing the security of future circuits, including RISC-V processors. In analog and RF circuits, we focus on ultra-low-power design, adaptive systems, and AI-driven optimization to improve performance, resilience, and energy efficiency. Through European and industrial collaborations, our research strengthens the security and reliability of embedded systems in critical sectors like automotive, telecommunications, and cybersecurity.

<sup>9</sup>M. Elshamy, et al., Hardware Trojan Attacks in Analog/Mixed-Signal ICs via the Test Access Mechanism, ETS20

<sup>10</sup>M. Portolan, et al., Dynamic Authentication-Based Secure Access to Test Infrastructure, ETS20

<sup>11</sup>G. Di Natale, et al., Nonlinear Codes for Control Flow Checking, ETS20

<sup>12</sup>A. Ali Pour, PUF Enrollment and Life Cycle Management: Solutions and Perspectives for the Test Community, ETS20

## 9. POLITO'S CONTINUOUS CONTRIBUTION TO THE EUROPEAN COMMUNITY OF TEST AND RELIABILITY

*P. Bernardi, S. Di Carlo, P. Prinetto, M. Sonza Reorda, M. Violante.*

*Politecnico di Torino, Italy*

The Dept. of Control and Computer Engineering of Politecnico di Torino is one of the main contributors to the European community of test and reliability, both from a technical and from an organizational point of view. This section summarizes our main contributions from both points of view and highlights the current and future activities at PoliTo in the field<sup>1</sup>.

### 9.A. Technical contributions

Over the last 30 years, researchers from PoliTo have been very active on several subjects related to test and reliability, and have systematically presented their solutions and results at ETS. Since ETS began publishing its papers in IEEE Xplore (2003), they have authored and presented 26 papers at the event. Remarkably, these 26 papers were coauthored by more than 80 different authors from more than 20 different institutions, reflecting the wide network of research connections created by PoliTo's scientists over the last decades.

In the following we highlight the main contributions.

1) **Reliability of RAM-based FPGA devices:** The growing interest in deploying high-performance computing systems in space, combined with the need to overcome the limitations of the radiation-hardened devices available in the early 2000s, stimulated the research group at PoliTo to investigate the use of commercial off-the-shelf SRAM-based FPGAs in radiation environments. The methodology the group adopted was innovative, as it approached the problem from a different perspective. Previous works relied on empirical approaches based on either accelerated radiation ground testing or fault injection experiments. The innovation the PoliTo research group proposed lay in a systematic reverse engineering of the SRAM-based FPGA configuration memory, which made it possible to build a model of the relationship between configuration memory and FPGA resources. Thanks to this approach it was possible to develop static design validation algorithms to identify vulnerabilities of designs mapped to SRAM-based FPGAs, and to develop dependability-aware place and route algorithms to deploy soft-error immune designs in SRAM-based FPGAs. The resulting toolchain was covered in both ETS papers (e.g., [37]) and major journal papers, and it was the cornerstone for a number of research activities in collaboration with the European Space Agency, Boeing Satellite Systems, and EADS.

2) **Functional test:** PoliTo has worked extensively and actively on functional test for about thirty years. During this period, especially from 1995 to 2015, the activities on functional test courageously continued while the community's attention was concentrated mainly on structural techniques. Recently, with the growing importance of safety in some domains (e.g., automotive) and the advent of ISO 26262 about 10 years ago, PoliTo's solid background in functional techniques such as Software-Based Self-Test (SBST) enabled the application of the research results to in-field reliability, burn-in optimization, and system-level test. In particular, the practical relevance of these activities grew when Self-Test Libraries became a popular mechanism to test SoCs in the field in safety-critical applications (e.g., in automotive). Several papers on the topic appeared at ETS (e.g., [38] and [39]), also extending the scope to new fault models, to GPUs, and to the compaction of functional tests.

3) **Cross-Layer Reliability estimation:** PoliTo has made significant contributions to the field of cross-layer reliability estimation, particularly in the domains of AI-driven architectures, embedded systems, and fault-tolerant computing. Their research has focused on developing multi-layered methodologies that assess and mitigate reliability challenges stemming from hardware faults, transient errors, and aging effects across different abstraction levels of computing systems [40, 41]. A key aspect of their work has been the integration of Hardware Performance Monitoring (HPM)-based fault detection, machine learning-driven anomaly identification, and error propagation analysis, enabling a comprehensive assessment of reliability that spans circuit, architecture, and system levels. At the European Test Symposium, the PoliTo team has presented pioneering results demonstrating how HPMs can be leveraged for AI-driven fault detection, significantly improving the early-stage identification of transient errors in high-performance and safety-critical computing platforms. Their studies have also introduced novel cross-layer modeling frameworks, enabling the characterization of error propagation from hardware to application layers. This research has enhanced fault tolerance strategies in key sectors such as automotive, aerospace, and biomedical systems, fostering more resilient AI models and embedded architectures. By bridging the gap between low-level fault characterization and high-level system dependability, the PoliTo team continues to shape the landscape of cross-layer reliability estimation and self-adaptive fault-tolerant computing.

<sup>1</sup>The authors would like to acknowledge the contribution to the research activities on test and reliability by the other faculty members of PoliTo working in the area: Sarah Azimi, Corrado De Sio, Riccardo Cantoro, Maurizio Rebaudengo, Josie E. Rodriguez Condia, Annachiara Ruospo, Ernesto Sanchez, Alessandro Savino, Giovanni Squillero, Luca Sterpone.

4) **Memory and Embedded Memory test and diagnosis:** As a constant trend from the early 2000s until today, embedded memories in Systems-on-Chip continue to be a significant component of the overall yield of modern chips. Although the subject of testing embedded memories looks quite mature after decades of intensive studies, the topic has evolved in recent years to adapt to technology trends (i.e., new memory cell technologies) and to the growth in on-chip storage capacity (i.e., up to megabytes available on-chip). In production environments where large volumes of devices are manufactured, great attention is dedicated to discovering any systematic issue that may severely impact the yield. Situations in which a root cause affects a meaningful portion of the chip population are dramatic detractors of the production economy and must be intercepted as soon as possible. PoliTo's investigations covered a large spectrum of related topics, including BIST architectures, diagnostic DfT and firmware, and failure bitmap analysis, in both RAM and Flash memories.

The research activities highlighted so far were supported by several bodies, including the European Commission, the European Space Agency, and the Italian and Piedmont governments. Among the funded projects, it is worth mentioning the following:

- **AMATISTA (IST)** - Automatic Tool for Insertion and Simulation of Fault Tolerant Architectures
- **BASTION (FP7)** - Board and SoC Test Instrumentation for Ageing and No Failure Found
- **CLERECO (FP7)** - Cross-layer Early Reliability Evaluation for the Computing Continuum
- **MAMMoTH-Up (H2020 COMPET)** - Massively Extended Modular Monitoring for Upper Stages
- **VEGAS (H2020 COMPET)** - Validation of European high capacity rad-hard FPGA and Software tools
- **RESCUE (H2020 ITN)** - Interdependent Challenges of Reliability, Security and Quality in Nanoelectronic Systems Design
- **PERIOD (H2020 MSCA)** - Pursuing Efficient Reliability of Object Detection for automotive and aerospace applications
- **VITAMIN-V (HORIZON Europe)** - Virtual Environment and Tool-boxing for Trustworthy Development of RISC-V based cloud Services
- **TIRAMISU (HORIZON MSCA Doctoral Network)** - Training and Innovation in Reliable and Efficient Chip Design for Edge AI
- **DARE (HORIZON-JU-RIA)** - Digital Autonomy for RISC-V in Europe
- **FAIR (NextGenerationEU)** - Future Artificial Intelligence Research
- **SERICS (NextGenerationEU)** - Security and Rights in the CyberSpace

Finally, we recall that the ETS2005 Best Paper Award went to the paper “Multiple Errors produced by Single Upsets in FPGA configuration memory: a possible solution” by M. Sonza Reorda, Luca Sterpone and M. Violante [1].

### 9.B. Organizational contributions

PoliTo’s researchers have also played a key role in supporting the activities of the European community of test and reliability from an organizational point of view. Considering ETS (and its ancestor ETW), the following is a list of the key roles they played:

- 1997: Paolo Prinetto – General Chair
- 2000: Paolo Prinetto – Program Chair
- 2008: Matteo Sonza Reorda – General Chair
- 2012: Massimo Violante – Program Chair
- 2023: Paolo Bernardi – General Co-Chair
- 2025: Matteo Sonza Reorda – Program Co-Chair

Moreover, researchers from PoliTo played (and still play) several other roles in the ETS organization: as an example, in the ETS2025 organization Stefano Di Carlo is the Embedded Tutorial co-chair, Alessandro Savino the Fringe Workshops co-chair, Paolo Bernardi the Industrial Relations co-chair, Esteban Rodriguez the Publication co-chair, Ernesto Sanchez one of the Regional Liaison chairs, and Riccardo Cantoro the Topic co-chair for topic T2.

Paolo Prinetto, Matteo Sonza Reorda and Paolo Bernardi have been / are also part of the ETS Steering Committee.

### 9.C. Perspectives and outlook

Politecnico di Torino continues to advance research in test and reliability, addressing key challenges in modern semiconductor design.

A major focus is the testing of complex Digital and Mixed-Signal SoCs, where researchers develop efficient methodologies to enhance test coverage while minimizing cost. Similarly, ensuring the reliability of AI accelerators and neuromorphic systems is critical, particularly for safety-critical applications in automotive, healthcare, and aerospace. PoliTo explores fault detection and mitigation strategies tailored to these emerging architectures.

The adoption of AI-driven test solutions is another active area, with machine learning techniques improving fault detection, test pattern generation, and predictive diagnostics. Additionally, volume test optimization and diagnostics remain crucial for large-scale semiconductor production, with efforts aimed at enhancing fault localization and yield learning.

With the rising need for cybersecurity and hardware security, PoliTo is also developing test strategies that protect against hardware-based attacks while maintaining high efficiency.

Currently, 14 faculty members and 20+ researchers are engaged in these efforts, supported by European and national projects in collaboration with industry. PoliTo is also committed to training future experts, with its PhD students consistently earning recognition, including the 2024 TTTC E. J. McCluskey Best Doctoral Thesis Award.

Looking ahead, Politecnico di Torino remains dedicated to shaping the future of semiconductor testing, ensuring reliability in emerging technologies, and driving innovation in the field.

## 10. RESEARCH AT SORBONNE UNIVERSITÉ, CNRS, LIP6: ANALOG AND MIXED-SIGNAL ICs TESTING AND SECURITY, AI FOR ICs TESTING, TESTING ICs FOR AI, AND AI SECURITY<sup>1,2</sup>

H.-G. Stratigopoulos. Sorbonne Université, CNRS, LIP6, Paris, France

This section traces the evolution of research on topics relevant to the IEEE European Test Symposium (ETS) at the Computer Science Laboratory (LIP6) of Sorbonne Université, which is also affiliated with the French National Centre for Scientific Research (CNRS). It highlights the formation of the research group, its key contributions, current projects, and future prospects.

### 10.A. Historical perspective of related research

The research group was founded in 2015 by Dr. Stratigopoulos, Research Director at CNRS. Before that, he was a member of the TIMA laboratory at Université Grenoble Alpes and Grenoble INP. Dr. Stratigopoulos has been actively engaged with ETS, serving on the Steering Committee since 2019 and taking on various roles, including Program Chair in 2017, tutorial presenter in 2018<sup>3</sup>, organizer of the Test Spring School from 2020 to 2023, organizer of the fringe IEEE Workshop on AI Hardware: Test, Reliability and Security (AI-TREATS), and Topic Chair for dependable AI and AI for testing from 2023 to 2025.

<sup>1</sup>We acknowledge the funding by the French National Centre for Scientific Research (CNRS), the Doctoral School EDITE de Paris, the Sorbonne Center for Artificial Intelligence (SCAI), ams AG, the French National Research Agency (ANR) projects EDITSoC, STEALTH, RE-TRUSTING, and CHAMELEON, and the EU projects Penta HADES, Chips JU Resilient Trust, Horizon dAIEDGE, CHIST-ERA TruBrain, and EDF ARCHYTAS.

<sup>2</sup>We acknowledge all colleagues (Prof. Hassan Aboushady and Dr. Marie-Minerve Louérat), long-term collaborators (Prof. Yiorgos Makris from UT Dallas, Prof. Ozgur Sinanoglu from NYU Abu Dhabi, Dr. Luis A. Camuñas-Mesa from the Instituto de Microelectrónica de Sevilla, Eric Faehn from STMicroelectronics, Dr. Fei Su from Intel, Prof. Ihsen Alouani from QUB), past PhD students (Dr. Julian Leonhard, Dr. Sarah A. El-Sayed, Dr. Antonios Pavlidis, Dr. Mohamed Elshamy, Dr. Alán Rodrigo Díaz Rizo, Dr. Theofilos Spyrou), present PhD students (Spyridon Raptis, Abdelrahman Emad Abdelazim, Paul Kling, Hazem H. Hammam, Hanwen Xuan, Ioannis Kaskampas, Christos Malogiannis, Xindan Zhang, Valentin Barbaza), and post-doctoral researchers (Dr. Engin Afacan, Dr. Mariam Tlili, and Dr. Zalfa Jouni) who have contributed to this research.

<sup>3</sup>H.-G. Stratigopoulos. “Machine learning applications in IC testing”. In: *Proc. IEEE Eur. Test Symp. (ETS)*, 2018.

<sup>4</sup>H.-G. Stratigopoulos and C. Streitwieser. “Adaptive Test With Test Escape Estimation for Mixed-Signal ICs”. In: *IEEE Trans. Comput.-Aided Design Integr. Circuits Syst.* 37.10 (2018), pp. 2125–2138.

In its early years, the group focused on research in analog and mixed-signal (AMS) Integrated Circuit (IC) testing, building upon previous work conducted at the TIMA laboratory. Two key outcomes were the development of an adaptive test framework<sup>4</sup> and a generic Built-In Self-Test (BIST) technique for AMS ICs [42]. The adaptive test framework dynamically adjusts the standard test program on a per-die basis, achieving a desired trade-off between test time savings, fault coverage, and robust outlier detection. The BIST technique leverages system-wide invariances, where any deviation signals abnormal operation, and is designed for both post-manufacturing testing and concurrent error detection. Its fault coverage for an industrial ADC by STMicroelectronics was validated using the Tessent DefectSim tool by Siemens EDA.

In 2018, the group expanded its research scope to include two new areas: hardware security with a focus on AMS ICs and the testing and reliability of AI hardware accelerators, particularly for Spiking Neural Networks (SNNs).

In the field of hardware security, our initial focus was on the threat of piracy, leading to the development of various defense mechanisms based on locking and layout camouflaging. Modern IC designs increasingly depend on third-party Intellectual Property (IP) cores, rely on outsourced fabrication, and face advanced reverse engineering techniques, making them highly vulnerable to theft. Consequently, IC designers must integrate anti-piracy features into their designs to mitigate these risks.

Locking provides comprehensive protection against attackers throughout the supply chain. This technique integrates a keying mechanism into the design, making it controllable by a multi-bit digital key to resist brute-force attacks. When the correct key is applied, the circuit functions as intended, whereas an incorrect key leads to significant performance degradation or complete malfunction. The key remains confidential, is not disclosed to untrusted parties, and is securely stored in a tamper-proof memory after fabrication. Our initial approach involved applying state-of-the-art logic locking techniques to the digital section of AMS circuits<sup>5</sup>. Notably, we demonstrated an audio application where the effect of locking could be *heard* for the first time. Our second defense leveraged the digital programmability of AMS ICs, using the digital calibration word as the secret key<sup>6</sup>. Lastly, we developed a logic locking technique tailored specifically for RF transceivers [43]. Traditional logic locking methods for digital ICs are often vulnerable to counter-attacks aimed at either extracting the secret key with minimal effort or bypassing the keying mechanism altogether. Our solution is inherently resistant to such counter-attacks. The keying mechanism directly affects the synchronization between the transmitter and receiver. When an incorrect key is used, the communication link fails to establish, whereas with the correct key, the bit error rate remains unaffected.
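
To make the keying principle concrete, the following minimal sketch shows textbook XOR/XNOR key-gate insertion on a toy combinational function. It is a generic illustration of digital logic locking only, not the AMS or RF locking schemes described above; the toy circuit, the two-bit key, and all function names are assumptions made for the example.

```python
from itertools import product


def original(a, b, c):
    """Toy unlocked circuit: out = (a AND b) XOR c."""
    return (a & b) ^ c


def locked(a, b, c, key):
    """Same circuit with two hypothetical key gates inserted.

    k0 drives an XOR gate on input a (transparent when k0 == 0).
    k1 drives an XNOR gate on the output (transparent when k1 == 1).
    """
    k0, k1 = key
    w = (a ^ k0) & b
    return (w ^ c) ^ k1 ^ 1


CORRECT_KEY = (0, 1)


def error_rate(key):
    """Fraction of input patterns for which the locked output is wrong."""
    patterns = list(product((0, 1), repeat=3))
    wrong = sum(locked(a, b, c, key) != original(a, b, c) for a, b, c in patterns)
    return wrong / len(patterns)


if __name__ == "__main__":
    for key in product((0, 1), repeat=2):
        print(key, error_rate(key))
```

Only the correct key restores the original function over all input patterns; every other key corrupts at least half of them. A real design would use a much longer key so that exhaustive key trials are impractical.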

We also introduced the first analog layout camouflaging technique designed to obfuscate the sizing of layout components<sup>7</sup>. This method specifically targets defending against reverse engineering attempts.

<sup>5</sup>J. Leonhard et al. “Digitally-Assisted Mixed-Signal Circuit Security”. In: *IEEE Trans. Comput.-Aided Design Integr. Circuits Syst.* 41.8 (2022), pp. 2449–2462.

<sup>6</sup>M. Elshamy et al. “Locking by Untuning: A Lock-Less Approach for Analog and Mixed-Signal IC Security”. In: *IEEE Trans. Very Large Scale Integr. (VLSI) Syst.* 29.12 (2021), pp. 2130–2142.

<sup>7</sup>J. Leonhard et al. “Analog and Mixed-Signal IC Security via Sizing Camouflaging”. In: *IEEE Trans. Comput.-Aided Design Integr. Circuits Syst.* 40.5 (2021), pp. 822–835.

Alongside our work on anti-piracy measures for AMS ICs, we investigated Hardware Trojan (HT) attacks and corresponding defense strategies for AMS ICs. We introduced the concept of a digital-to-analog HT attack, where a compromised digital block exploits the test infrastructure to target a victim AMS block within a system-on-chip<sup>1</sup>. Additionally, we proposed a covert communication channel attack in RF transceivers, enabled by an HT embedded within the physical layer of the transmitter<sup>2</sup>. To further advance research in this domain, we released an open-source covert channel dataset generated in hardware and proposed an AI-based detection strategy<sup>3</sup>.

Our research on the testing and reliability of AI hardware accelerators specifically targets neuromorphic computing based on SNNs [44]. Our initial effort focused on understanding the failure modes of spiking neurons<sup>4</sup>. The resulting fault model was used to conduct large-scale behavioral-level fault simulations on deep SNNs, leading to the development of some of the first fault-tolerance strategies<sup>5</sup>. We then carried out a comprehensive reliability analysis of an SNN hardware accelerator by performing fault injections on actual hardware<sup>6</sup>, proposing the use of an on-chip classifier for real-time detection of abnormalities in spiking behavior<sup>7</sup>. Lastly, we proposed a functional test strategy for SNN hardware accelerators, demonstrating that inputs that are correctly classified but with low confidence provide high fault coverage<sup>8</sup>.

### 10.B. Current research, frameworks and projects

Currently, the group remains focused on the same research areas, namely hardware security and SNN test and reliability, while also introducing a new research axis on SNN security.

Existing locking techniques for AMS/RF ICs secure the system at a high level by targeting peripheral biasing circuitry, calibration, or digital sections. However, they leave the individual AMS components exposed, making them susceptible to piracy. In our recent work, we demonstrated a method to obfuscate the size and ratings of components within AMS blocks. This approach ensures that an incorrect key not only disrupts performance trade-offs but also increases the likelihood of damaging the chip as some components will suffer electrical over-stress, effectively deterring key trial-based attacks on locked fabricated chips<sup>9</sup>.

<sup>1</sup>M. Elshamy et al. “Digital-to-Analog Hardware Trojan Attacks”. In: *IEEE Trans. Circuits Syst. I, Reg. Papers* 69.2 (2022), pp. 573–586.

<sup>2</sup>A. R. Díaz-Rizo, H. Aboushady, and H.-G. Stratigopoulos. “Leaking Wireless ICs via Hardware Trojan-Infected Synchronization”. In: *IEEE Trans. Dependable Secure Comput.* 20.5 (2023), pp. 3845–3859.

<sup>3</sup>A. R. Díaz-Rizo et al. “Covert Communication Channels Based On Hardware Trojans: Open-Source Dataset and AI-Based Detection”. In: *Proc. IEEE Int. Symp. Hardw.-Oriented Secur. Trust (HOST)*. 2024, pp. 101–106.

<sup>4</sup>S. A. El-Sayed et al. “Spiking Neuron Hardware-Level Fault Modeling”. In: *Proc. IEEE Int. Symp. On-Line Test. Robust Syst. Des. (IOLTS)*. 2020.

<sup>5</sup>T. Spyrou et al. “Neuron Fault Tolerance in Spiking Neural Networks”. In: *Proc. Design Autom. Test Europe Conf. (DATE)*. 2021, pp. 743–748.

<sup>6</sup>T. Spyrou et al. “Reliability Analysis of a Spiking Neural Network Hardware Accelerator”. In: *Proc. Design Autom. Test Europe Conf. (DATE)*. 2022, pp. 370–375.

<sup>7</sup>T. Spyrou and H.-G. Stratigopoulos. “On-Line Testing of Neuromorphic Hardware”. In: *Proc. IEEE Eur. Test Symp. (ETS)*. 2023.

<sup>8</sup>S. A. El-Sayed et al. “Compact Functional Testing for Neuromorphic Computing Circuits”. In: *IEEE Trans. Comput.-Aided Design Integr. Circuits Syst.* 42.7 (2023), pp. 2391–2403.

<sup>9</sup>H. H. Hammam, H. Aboushady, and H.-G. Stratigopoulos. “Analog Circuit Anti-Piracy Security by Exploiting Device Ratings”. In: *Proc. Design, Automat. Test Eur. Conf. Exhib. (DATE)*. 2025.

In the domain of SNN testing and reliability, we recently released an open-source fault injection framework for SNNs [45]. This framework offers high flexibility, featuring a comprehensive and extensible fault model library and support for single or multiple faults, permanent or transient faults, and injection at different stages, i.e., pre-training, during training, or post-training. It also includes deterministic and statistical fault injection methods, speed-up optimizations, and a graphical environment for visualizing results. Additionally, we introduced an innovative functional testing method for SNNs that optimizes a short-duration input to maximize fault coverage [46]. This approach significantly reduces both test generation and application time compared to the state-of-the-art while effectively scaling to deep SNNs.
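
The idea of behavioral-level fault injection with permanent or transient faults can be sketched on a single leaky integrate-and-fire neuron. The code below is a standalone toy model and does not use or reproduce the API of the framework in [45]; the neuron parameters, the fault labels `"dead"` and `"saturated"`, and the `fault_steps` argument are assumptions chosen for illustration.

```python
def lif_spike_train(inputs, threshold=1.0, leak=0.9, fault=None, fault_steps=None):
    """Simulate one leaky integrate-and-fire neuron with an optional fault.

    fault: None, "dead" (output forced to 0) or "saturated" (output forced to 1).
    fault_steps: set of time steps at which the fault is active (transient fault);
                 None means the fault is active at every step (permanent fault).
    """
    v = 0.0
    spikes = []
    for t, x in enumerate(inputs):
        faulty_now = fault is not None and (fault_steps is None or t in fault_steps)
        v = leak * v + x  # leaky integration of the input current
        if faulty_now and fault == "dead":
            spike = 0
        elif faulty_now and fault == "saturated":
            spike = 1
        else:
            spike = 1 if v >= threshold else 0
        if spike:
            v = 0.0  # reset the membrane potential after a spike
        spikes.append(spike)
    return spikes


if __name__ == "__main__":
    stimulus = [0.6] * 6
    print("fault-free     :", lif_spike_train(stimulus))
    print("dead           :", lif_spike_train(stimulus, fault="dead"))
    print("saturated      :", lif_spike_train(stimulus, fault="saturated"))
    print("transient dead :", lif_spike_train(stimulus, fault="dead", fault_steps={1}))
```

Comparing the faulty spike trains with the fault-free one is the basic observation step of a fault injection campaign; in a full network the same comparison would be made on the classification output.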

In our new research direction on SNN security, we introduced an input-triggered HT attack for SNNs<sup>10</sup>. The HT trigger and payload mechanisms are confined to a single neuron, making the attack highly stealthy and difficult to detect. The trigger is trained to generate a rare output pattern in a critical neuron, and when this pattern occurs, it forces the neuron into saturation. This, in turn, injects additional spikes into the network, severely degrading classification accuracy.

### 10.C. Perspectives and Outlook

Looking ahead, AI is becoming increasingly pervasive in all aspects of modern life. As its applications continue to expand, many existing and emerging use cases require high fault tolerance and pose distinct security risks and challenges. Ensuring reliable and secure operation is essential to safely integrate AI into society, prevent misuse, and build trust in this transformative technology.

SNNs are emerging as a promising alternative to traditional Artificial Neural Networks (ANNs), offering a fundamentally different approach to information processing by mimicking brain-like functionality. To fully leverage their low-power and low-latency advantages, dedicated neuromorphic processors are being developed, necessitating parallel consideration of testing, reliability, and security aspects.

While the objectives of testing and reliability, as well as the security threats, remain similar to those of traditional ANNs, directly applying existing solutions is often impractical due to the unique spike-domain operation of SNNs, which requires fundamentally different approaches. Fault models, which serve as the foundation for test and reliability strategies, remain incomplete and fail to accurately reflect emerging hardware architectures. Likewise, test and fault-tolerance strategies frequently overlook the realities of hardware implementation. From a security perspective, threats such as adversarial attacks, backdoor attacks, model theft, and data privacy risks are largely unexplored for SNNs, with limited understanding of the attack landscape and practical defenses.

Our group remains committed to advancing research in these areas, working to improve the theory and practice of testing, reliability, and security for emerging SNN hardware architectures.

<sup>10</sup>S. Raptis et al. “Input-Triggered Hardware Trojan Attack on Spiking Neural Networks”. In: *Proc. IEEE Int. Symp. Hardw.-Oriented Secur. Trust (HOST)*. 2025.

## 11. UCY: TEST AND RELIABILITY RESEARCH<sup>1</sup>

*M. K. Michael, S. Neophytou, S. Hadjitheophanous, K. Christou, M. Skitsas. University of Cyprus, Cyprus*

This section provides a brief discussion of the test and reliability research at the University of Cyprus (UCY) over the last 20 years, in the topics of delay testing, automatic test pattern generation (ATPG) and simulation, test compaction, test embedding, fault diagnosis, on-line testing and test scheduling, software-based self-test (SBST), reliability for soft-errors and aging phenomena, fault injection based vulnerability analysis, reliability of edge-AI accelerators, and application driven system-level reliability in the edge-cloud continuum. The section outlines our main contributions, current focus and future research plans and vision.

### 11.A. Historical perspective of related research

Test research activities at UCY started in the early 2000s, a time when ultra-scaling of CMOS devices led to the emergence of new failure phenomena in extraordinarily large and complex VLSI chips. Efficient detection of timing failures during manufacturing test became crucial, due to the excessive propagation delays, process variations, crosstalk effects, etc., caused by technology scaling. In particular, delay defects became of increasing interest due to the unrelenting push of CMOS designs towards higher speeds, and new delay fault models were proposed in an attempt to formalize the challenge and allow for efficient fault detection. Our work initially concentrated on solving challenging open problems in delay testing and simulation, for a variety of delay fault models (Path Delay Faults – PDFs, critical PDFs, primitive and critical-primitive PDFs, traditional and enhanced transition delay faults) applied to different circuit abstraction levels (gate-level, functional/macro-level, and RTL-level). Our pioneering work on this topic provided theoretical formulations for fault function generation<sup>2,3</sup>, allowing for new tools and methodologies to derive and grade (simulate)<sup>4</sup> high-coverage compact test sets.

<sup>1</sup>We acknowledge the support from the Cypriot Research and Innovation Foundation, the EU H2020 programme under grant agreement No. 739551 (KIOS CoE) and the Government of the Republic of Cyprus through the Cyprus Deputy Ministry of Research, Innovation and Digital Policy.

<sup>2</sup>M.K. Michael et al., “ATPG Tools for Delay Faults at the Functional level”, ACM TODAES, 2002, pp. 33-57.

<sup>3</sup>S. Neophytou, M.K. Michael, et al., “Functions for Quality Transition Fault Tests and their Applications in Test Set Enhancement”, IEEE Trans. on CAD, 2006, pp. 3026-2035.

<sup>4</sup>S. Padmanaban, M.K. Michael, et al., “Exact Path Delay Fault Coverage with Fundamental Zero-Suppressed BDD Operations”, IEEE Trans. on CAD, 2003, pp. 305-316.

<sup>5</sup>S. Neophytou and M.K. Michael, “Test Set Generation with a Large Number of Unspecified Bits Using Static and Dynamic Techniques”, IEEE Trans. on Computers, 2010, pp. 301-316.

Fig. 11.1. [48]: Parallel test generation phase targeting hard-to-detect faults.

To minimize excessive test application time, test relaxation and compaction/compression techniques are tightly intertwined with ATPG. Computationally affordable static compaction methods can be employed; however, the more challenging problem of dynamic compaction leads to better results. In addition to our contributions in dynamic compaction methods for various delay fault models, we also explored static and dynamic test compaction techniques<sup>5</sup>, test relaxation for compaction and defect coverage improvement<sup>6</sup> [47], and guided compact test generation for on-chip test embedding schemes. Moreover, we capitalized on the computing power of the manycore era to significantly accelerate fault simulation, dynamic ATPG, and compaction, focusing on parallel algorithm design and programming tools for shared-memory manycore architectures. Parallelization is challenging, especially in compaction, and affects the quality of the results. Our parallel methodologies<sup>7,8</sup> provided impactful computational speedup and scalability with respect to state-of-the-art and commercial tools, while maintaining strong test compaction rates for single/multi-detect test sets [48]. Fig. 11.1 outlines our parallel dynamic ATPG process, utilizing parallel fault simulation.
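
To illustrate why relaxed tests with many unspecified bits help compaction, the sketch below applies a classical greedy static compaction heuristic: test cubes whose specified bits do not conflict are merged into one pattern. This is a generic textbook procedure under simplifying assumptions, not the algorithms of the cited works; the cube encoding (`'0'`, `'1'`, `'X'` strings) and the function names are hypothetical.

```python
def compatible(a, b):
    """Two test cubes are compatible if no position has conflicting specified bits."""
    return all(x == y or x == "X" or y == "X" for x, y in zip(a, b))


def merge(a, b):
    """Merge two compatible cubes, keeping every specified bit of both."""
    return "".join(y if x == "X" else x for x, y in zip(a, b))


def static_compact(cubes):
    """Greedy first-fit static compaction over a list of test cubes."""
    compacted = []
    for cube in cubes:
        for i, kept in enumerate(compacted):
            if compatible(kept, cube):
                compacted[i] = merge(kept, cube)
                break
        else:
            # No existing pattern can absorb this cube; keep it as a new pattern.
            compacted.append(cube)
    return compacted


if __name__ == "__main__":
    tests = ["1XX0", "X1X0", "0XX1", "XX11"]
    print(static_compact(tests))
```

The result depends on the order in which cubes are processed, which is one reason static compaction is cheap but generally weaker than dynamic compaction performed inside ATPG.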

A key technology utilized in many of the aforementioned works was decision diagrams (DDs), spanning from Binary DDs (BDDs), to Zero-suppressed BDDs (ZBDDs), to our newly proposed Irredundant Sum-of-Products (ISOP) ZBDDs. We provided theoretical formulations of the underlying problems, which subsequently were represented and manipulated with DDs using new operations or new DD structures. BDDs were utilized to represent fault model specific formulated test functions, while ZBDDs could efficiently represent physical/logical circuit paths<sup>9</sup>, allowing the manipulation of an exponential number of faults while avoiding any fault enumeration<sup>10</sup> [49], [50], identifying false paths or NBTI-critical paths<sup>11</sup>.
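
The need to avoid explicit fault enumeration can be seen from how quickly the number of structural paths grows. The sketch below counts input-to-output paths in a netlist graph by memoized traversal, without listing any path. It is not a ZBDD implementation and is not taken from the cited works; the graph encoding and the `diamond_chain` example circuit are assumptions made for illustration.

```python
from functools import lru_cache


def count_paths(fanout, inputs, outputs):
    """Count structural paths from primary inputs to primary outputs in a DAG.

    fanout maps each node name to the list of nodes it drives.
    Paths are counted implicitly; none of them is enumerated.
    """
    output_set = set(outputs)

    @lru_cache(maxsize=None)
    def paths_from(node):
        total = 1 if node in output_set else 0
        for succ in fanout.get(node, ()):
            total += paths_from(succ)
        return total

    return sum(paths_from(pi) for pi in inputs)


def diamond_chain(n):
    """Hypothetical circuit: n reconvergent stages, each doubling the path count."""
    fanout = {}
    for i in range(n):
        fanout[f"s{i}"] = [f"a{i}", f"b{i}"]
        fanout[f"a{i}"] = [f"s{i + 1}"]
        fanout[f"b{i}"] = [f"s{i + 1}"]
    return fanout, ["s0"], [f"s{n}"]


if __name__ == "__main__":
    for n in (3, 20, 40):
        print(n, count_paths(*diamond_chain(n)))
```

A circuit with only 40 such stages already has 2^40 paths, so any method that lists path delay faults one by one becomes impractical, whereas implicit representations keep the work proportional to the size of the structure.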

<sup>6</sup>S. Neophytou and M.K. Michael, “Test Pattern Generation for Relaxed n-detect Test Sets”, IEEE Trans. on VLSI, 2012, pp. 410-423.

<sup>7</sup>S. Hadjitheophanous, S. Neophytou, M.K. Michael, “Exploiting Shared-Memory to Steer Scalability of Fault Simulation using Multicore Systems”, IEEE Trans. on CAD, 2019, pp. 1466-1479.

<sup>8</sup>S. Hadjitheophanous, S. Neophytou, M.K. Michael, “Maintaining Scalability of Test Generation using Multicore Shared Memory System”, IEEE Trans. on VLSI, 2020, pp. 553-564.

<sup>9</sup>S. Neophytou and M.K. Michael, “Path Representation in Circuit Netlists Using Linear-Sized ZDDs with Optimal Variable Ordering”, JETTA, 2018.

<sup>10</sup>S. Neophytou and M.K. Michael, “Tackling the Complexity of Exact Path Delay Fault Grading for Path Intensive Circuits”, IEEE ETS, May 2015.

<sup>11</sup>H. Kim, S. B. Boga, A. Vitkovskiy, S. Hadjitheophanous, et al., “Use it or Lose it: Proactive, Deterministic Longevity in Future Chip Multiprocessors”, ACM TODAES, 2015, pp. 1-26.

Fig. 11.2. Main components of DaemonGuard working in unison to facilitate real-time monitoring and in-field test of functional units in a multi-core system.

We also investigated Software-Based Self-Test (SBST), which can be employed during manufacturing test or in-field. As SBST can be applied in-field, it can detect aging/wear-out induced faults<sup>12</sup>. In this context, our O/S-assisted DaemonGuard framework (Fig. 11.2) enables real-time observation to initiate on-demand selective SBST of stressed modules in multi-core chips [51], offering substantial savings in test time with negligible impact on system performance. DaemonGuard was further complemented with efficient checkpointing and recovery techniques to support gracefully degraded systems<sup>1</sup>.

To tackle the emerging threats in nanoscale CMOS technologies that impact field-level product reliability through soft errors<sup>2,3</sup>, we investigated vulnerability in RTL-level designs<sup>4</sup> and subsequently proposed effective protection solutions for single- and multi-bit upsets<sup>5</sup>. Our work on this topic also investigated acceleration of fault-injection campaigns via FPGA emulation for approximate algorithms<sup>6</sup> as well as resource-constrained Neural Networks (NNs)<sup>7</sup>, in the application domains of embedded vision and edge AI/ML.

Our technical innovations have been driven by many collaborations with international teams in the areas of test and reliability. We have been deeply engaged in the ETS community, hosting the 2017 European Test Week, chairing the technical program of 2024 ETS and the 2025 Test Spring School, along with many other roles. The team has also been very active in the

<sup>1</sup> M. Skitsas, et al., “Self Testing of Multicore Processors”, in Many Core Computing: Hardware and Software (pp. 1-30), 2019, IET.

<sup>2</sup> M. Skitsas, C. Nicopoulos, M. K. Michael, “Exploring System Availability during Software-Based Self-Testing of Multi-core CPUs”, JETTA, 2018, pp. 67-81.

<sup>3</sup> M. Ottavi, et al., “Dependable Multicore Architectures at Nanoscale: the view from Europe”, IEEE Design & Test, 2015, pp. 17-28.

<sup>4</sup> A. Kritikakou, et al., “Functional and Timing Implications of Transient Faults in Critical Systems”, IEEE IOLTS, 2022.

<sup>5</sup> M. Maniatakos, M.K. Michael, et al., “Revisiting Vulnerability Analysis in Modern Microprocessors”, IEEE Trans. on Computers, 2015, pp. 2664-2674.

<sup>6</sup> M. Maniatakos, M.K. Michael and Y. Makris, “Multiple-Bit Upset Protection in Microprocessor Memory Arrays using Vulnerability-based Parity Optimization and Interleaving”, IEEE Trans. on VLSI, 2015, pp. 2447-2460.

<sup>7</sup> I. Chadjinimas, et al., “Emulation-Based Hierarchical Fault-Injection Framework for Coarse-to-Fine Vulnerability Analysis of Hardware-Accelerated Approximate Algorithms”, ACM/IEEE DATE, 2016.

<sup>8</sup> P. Corneliou, et al., “Fine-Grained Vulnerability Analysis of Resource Constrained Neural Inference Accelerators”, IEEE DFTS, 2021.

extended international test community (VTS, DFTS, IOLTS, ISVLSI, DATE), under various organization and program roles.

### 11.B. State of play and assets in related research

Driven by challenges imposed by emerging critical application domains, enabled by embedded systems and IoT-enabled Cyber Physical Systems (CPS), in the new AI-dominated era and the edge-cloud computing paradigm, our current focus in reliability has shifted towards two directions. The first one concentrates on reliability and robustness of AI/ML HW and applications, embedded at resource-constrained edge IoT devices. One such example is our recent and on-going work on vulnerability analysis of dynamic DNNs<sup>8</sup>, with the ultimate goal of co-optimizing design and reliability parameters. Our second direction is motivated by the multiple challenges in CPS applications in the edge-hub-cloud continuum, due to their intrinsic criticality, the heterogeneity of devices in the continuum, including the diverse sensing and actuating capabilities at the edge, as well as their varied computational, communication, energy, and hardware reliability limitations. Recent work in this direction focuses on jointly optimizing latency, energy, and reliability via exact multi-objective and multi-constrained scheduling of application tasks, while considering heterogeneous devices with different reliability characteristics and constraints<sup>9,10</sup>. Our current research activities are motivated by our contributions in various EU multi-partner research projects, such as SESAME (2020-2024), our flagship EU Teaming project KIOS CoE (2017-2025), GuardAI (2024-2028) and TIRAMISU (2024-2028).

### 11.C. Perspectives and outlook

Our research outlook continues to be driven by disruptive technologies and subsequent applications, investigating the opportunities and challenges in dependability and robustness in the underlying HW/SW systems. In particular, safety-critical CPS applications deploying AI/ML solutions in distributed and heterogeneous computing environments, spanning from autonomous and connected vehicles, multi-UAV emergency response systems, to biomedical edge devices, pose growing demands for real-time, energy-efficient, and reliable execution at the network edge. Furthermore, ensuring the robustness of AI solutions is a necessary key-enabling step before adopting such approaches in safety-critical applications. Within this context, we plan to continue and extend our research efforts, to provide holistic application-driven solutions for dependability and robustness. Via our participation in EU multi-partner projects with academic and industrial collaborators, we envision high-impact multi-disciplinary work, reinforced by the synergies between the EU Chips and AI Acts. We embrace open science and collaborative research in dependability, remaining committed to the growing research vision of the ETS community.

<sup>8</sup> G. Konstantinidis, M.K. Michael, et al., “Vulnerability Analysis of Early-Exit DNNs using hardware-aware software-level fault models”, IEEE AI-TREATS (in conjunction with ETS), 2024.

<sup>9</sup> A. Kouloumpiris, et al., “An optimization framework for task allocation in the edge/hub/cloud paradigm”, FGCS Journal, 2024, pp. 1-13.

<sup>10</sup> A. Kouloumpiris, et al., “Optimal Multi-Constrained Workflow Scheduling in the Edge-Cloud Continuum”, IEEE COMPSAC, 2024, pp. 1-10.

## 12. WHEN ETS BECAME APPROXIMATED: THE FRENCH CONNECTION

*A. Bosio<sup>1</sup>, B. Deveautour<sup>2</sup>, P. Girard<sup>3</sup>, M. Traiola<sup>4</sup>, A. Virazel<sup>3</sup>.*

<sup>1</sup>*Centrale Lyon, INSA Lyon, CNRS, Université Claude Bernard Lyon 1, CPE Lyon, INL, UMR5270, France.*

<sup>2</sup>*Nantes Université, CNRS, IETR UMR 6164, F-44000 Nantes, France*

<sup>3</sup>*LIRMM, University of Montpellier/CNRS, Montpellier 34000 France*

<sup>4</sup>*Univ Rennes, CNRS, Inria, IRISA - UMR 6074, F-35000 Rennes, France*

### 12.A. Historical introduction

The *Laboratoire d'Informatique, de Robotique et de Microélectronique de Montpellier* (LIRMM) has been among the main contributors to the European community of test and reliability since 1996, the year of the ETS (at that time ETW) foundation. The works carried out at LIRMM cover different test topics: Memory Test, Diagnosis, Defect-aware test, Low Power test, Analog test, etc. In 2016, research work started on the test and reliability of Approximate-Computing-based hardware. Since then, this research has significantly impacted the scientific community and has expanded to several French institutions, namely the *Institut des Nanotechnologies de Lyon* (INL), *Institut de Recherche en Informatique et Systèmes Aléatoires* (IRISA) and *Institut d'Electronique et des Technologies du numéRique* (IETR), creating the French connection that still contributes to the test community, and in particular to ETS, both in terms of organization and scientific contributions.

### 12.B. Test and Reliability of Approximate Computing-based hardware

Approximate Computing (AxC) is nowadays an established computing paradigm which takes advantage of the inherent application resiliency. It is based on the intuitive observation that selectively relaxing non-critical specifications may lead to improvements in power consumption, run time, and/or chip area [52]. AxC has been applied to the whole digital system stack. In particular, at circuit-level AxC has been applied basically in two ways: (i) *over-scaling* and (ii) *functional approximation*. Over-scaling consists in lowering the Integrated Circuit (IC) supply voltage to reduce its energy consumption. If the circuit is systematically designed to benefit from over-scaling, the timing errors are negligible compared to the energy gain. Nevertheless, the energy gain of over-scaling techniques turns out to be small. Therefore, a considerable amount of work has been presented on circuit *functional approximation*: the circuit functionality is systematically changed – thus, some controlled errors are introduced – to achieve energy-efficient circuits. Circuit error can be measured according to different error metrics.

The work we conducted focuses on digital Approximate Circuits, regardless of the approach employed to obtain them. Since approximation changes the IC behavior, it is important to revisit test [53]. On the one hand, the occurrence of a defect in the circuit can lead it to produce unexpected catastrophic errors.

On the other hand, some defects can be tolerated when they do not induce errors over a certain level (i.e., approximation during the test procedure). If properly investigated and managed, this phenomenon could lead to an increase in the number of circuits passing the test phase. This is usually referred to as *production yield increase*. Indeed, selling acceptably-functioning circuits, which still respect the user requirements despite the defects, would increase the profit of semiconductor companies. This is especially critical due to the effect of *process variability* on CMOS technologies. Indeed, CMOS technologies at nano-scale show increasingly poor circuit yield and reliability. To take advantage of the opportunity offered by Approximate Integrated Circuits (AxICs), the conventional test flow should be revisited.

Therefore, we proposed the concept of *Approximation-Aware testing (AxA testing)*. We identified three main AxA testing phases: (i) AxA fault classification, (ii) AxA test pattern generation, (iii) AxA test set application. Briefly, *fault classification* has to divide faults into *catastrophic* (to test) and *acceptable* (not to test), according to a metric; *test pattern generation* has to produce tests able to cover all the catastrophic faults and, at the same time, to leave acceptable faults undetected; finally, the *test set application* role is to analyze the test outcomes and classify AxICs accordingly, into *catastrophically faulty*, *acceptably faulty*, and *fault-free*. Only AxICs falling into the first group will be rejected. Ultimately, this leads to a yield increase compared to the conventional test flow [52].
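The fault classification phase can be illustrated with a minimal Python sketch. It assumes a toy 4-bit approximate adder that ignores the least significant bit of each operand, single stuck-at faults on the output bits only, and worst-case absolute error as the metric. The circuit, the fault list, the metric and the threshold are all illustrative assumptions, not the setup used in [52] or [53].

```python
from itertools import product

WIDTH = 4              # operand width of the toy adder (assumption)
OUT_BITS = WIDTH + 1   # the sum needs one extra bit


def approx_add(a, b):
    """Toy approximate adder: ignores the LSB of each operand."""
    return (a & ~1) + (b & ~1)


def inject_stuck_at(value, bit, stuck_val):
    """Force one output bit to 0 or 1 (output stuck-at fault)."""
    mask = 1 << bit
    return value | mask if stuck_val else value & ~mask


def worst_case_error(bit=None, stuck_val=0):
    """Worst-case absolute error w.r.t. the exact sum over all inputs."""
    wce = 0
    for a, b in product(range(1 << WIDTH), repeat=2):
        y = approx_add(a, b)
        if bit is not None:
            y = inject_stuck_at(y, bit, stuck_val)
        wce = max(wce, abs(y - (a + b)))
    return wce


def classify_faults(threshold):
    """Split output stuck-at faults into acceptable and catastrophic."""
    classes = {"acceptable": [], "catastrophic": []}
    for bit, stuck_val in product(range(OUT_BITS), (0, 1)):
        wce = worst_case_error(bit, stuck_val)
        key = "acceptable" if wce <= threshold else "catastrophic"
        classes[key].append((bit, stuck_val, wce))
    return classes
```

For instance, `classify_faults(4)` labels the stuck-at-1 fault on output bit 0 as acceptable (worst-case error 1) and the stuck-at-0 fault on output bit 4 as catastrophic (worst-case error 18). Only the catastrophic list would then be handed to test pattern generation.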

We also investigated the fault tolerance aspects, when considering approximate circuits. Approximate Computing can inherently contribute to fault tolerance. AxC techniques, when carefully designed, can provide error resilience by reducing the overall area and power consumption of fault-tolerant architectures. For example, reducing data precision or using approximate arithmetic units can lower resource usage while still ensuring system dependability. This is particularly relevant in scenarios where traditional error-masking techniques, such as Triple Modular Redundancy (TMR), are too costly in terms of area and energy. We explored the concept of Approximate TMR (ATMR), which modifies conventional TMR by incorporating approximation-based redundancy. Unlike traditional TMR, where three identical modules are used for majority voting, ATMR introduces approximate versions of redundant tasks, reducing overhead while still maintaining fault tolerance. We showed that ATMR can achieve nearly the same reliability as TMR while consuming significantly fewer resources [54]. We addressed the challenge of mitigating area overhead in TMR systems, by using approximate hardware. We proposed the Quadruple Approximate Modular Redundancy (QAMR), which uses four approximate modules instead of three precise ones [55]. The key idea is that at least three of the four modules must always provide the same correct output, ensuring full reliability while lowering area costs. The QAMR approach leverages Approximate Computing (AxC) to selectively introduce approximations in non-critical components of the circuit, reducing hardware requirements. Unlike prior approximation-based TMR solutions, which often compromise reliability, QAMR ensures full logic masking of transient and

permanent faults. The method involves strategically modifying four approximate replicas of the original circuit so that for every given input, three modules produce an identical output. This allows QAMR to maintain the same fault tolerance level as conventional TMR while significantly reducing hardware overhead. Our evolutionary algorithm-based Design Space Exploration (DSE) approach systematically identifies Pareto-optimal QAMR implementations that balance area and delay trade-offs [56]. We use *logic falsification* and *output removal* as approximation methods. In the first, specific Boolean function values are selectively altered to achieve cost reductions without compromising error resilience. The second involves eliminating logic cones from circuit outputs. Our experiments on FPGA and ASIC technologies show that QAMR can outperform TMR in most cases, achieving area reductions while maintaining or even improving timing performance. With this study, we pushed the boundaries of fault-tolerant architectures by proving that Approximate Computing can be effectively integrated into redundancy-based systems without sacrificing reliability.
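The design constraint behind QAMR, namely that at least three of the four approximate replicas return the correct output for every input, can be checked exhaustively on a toy example. The sketch below is only an illustration under stated assumptions: the reference function (a 2-input XOR), the way replicas are approximated (each is wrong on exactly one input vector) and all function names are hypothetical, and the vote shown covers the fault-free case only. It does not reproduce the voter or the design flow of [55], [56].

```python
from collections import Counter
from itertools import product


def exact(a, b):
    """Reference (precise) function of the toy circuit."""
    return a ^ b


def make_replica(wrong_input):
    """Approximate replica: exact everywhere except on one input vector."""
    def replica(a, b):
        y = exact(a, b)
        return y ^ 1 if (a, b) == wrong_input else y
    return replica


def satisfies_qamr(replicas, inputs):
    """True if, for every input, at least three replicas are correct."""
    return all(
        sum(r(*x) == exact(*x) for r in replicas) >= 3
        for x in inputs
    )


def vote(replicas, x):
    """Fault-free plurality vote over the replica outputs."""
    return Counter(r(*x) for r in replicas).most_common(1)[0][0]


INPUTS = list(product((0, 1), repeat=2))
# Each replica errs on a different input vector: the constraint holds.
good = [make_replica(x) for x in INPUTS]
# Two replicas err on the same vector: only two are correct for (0, 0).
bad = [make_replica((0, 0)), make_replica((0, 0)),
       make_replica((1, 1)), make_replica((1, 0))]

print(satisfies_qamr(good, INPUTS), satisfies_qamr(bad, INPUTS))
```

The `bad` configuration shows why the approximations must be coordinated across replicas: as soon as two replicas deviate on the same input, the three-correct-modules property is lost for that input.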

#### Research Projects

The work carried out in AxC opened the way to several research projects (national and international). Among them we have to cite:

- AdequateDL (ANR): Approximating Deep Learning Accelerators;
- RE-TRUSTING (ANR): REliable hardware for TRUST-worthiness artificial INtelliGence;
- APROPOS (H2020 MSCA Doctoral Network): Approximate Computing for Power and Energy Optimisation;
- TAICHIP (HORIZON Twinning Bottom-Up action): Boosting TalTech Capacity in Reliable and Efficient AI-Chip Design.

### 12.C. Perspectives and Outlook

Approximate Computing (AxC) is emerging as a crucial paradigm for optimizing digital systems, but its full potential is yet to be realized. As semiconductor manufacturing faces increasing variability challenges, Approximation-Aware Testing (AxA testing) presents a promising approach to enhancing production yield by distinguishing between catastrophic and acceptable faults. This shift in testing methodology requires further refinement in fault classification metrics and test pattern generation strategies to ensure a seamless integration into industrial workflows. Additionally, as the complexity of AI and data-intensive applications grows, AxC techniques could be tailored to specific workloads, maximizing energy efficiency while maintaining application-level accuracy. Future research could explore machine learning-driven AxA testing methods, leveraging adaptive algorithms to dynamically adjust test criteria based on evolving hardware performance and defect tolerance thresholds.

Beyond testing, the fusion of Approximate Computing and fault-tolerant architectures introduces new design opportunities, particularly in critical embedded systems and low-power computing domains. Techniques like Approximate TMR (ATMR) and Quadruple Approximate Modular Redundancy (QAMR) demonstrate that fault tolerance can be achieved with significantly lower resource overhead than traditional methods. However, further investigation is needed to assess long-term reliability under varying environmental conditions and workload dynamics. The integration of evolutionary algorithm-based Design Space Exploration (DSE) paves the way for automated optimization of AxC-based fault-tolerant systems, but additional research is required to refine these methodologies for large-scale circuits. Looking ahead, the synergy between AxC and emerging hardware paradigms, such as neuromorphic and quantum computing, could redefine fault resilience strategies, pushing the boundaries of energy-efficient and robust system design.

Finally, the authors have actively contributed to and served on ETS steering, program and organization committees:

- Patrick Girard: was Program Chair in 2008, General Chair in 2013 and a member of the Steering Committee;
- Arnaud Virazel: was Local and Finance Chair in 2013 and is the 2025 Program Co-Chair;
- Alberto Bosio: was the Program Chair in 2023 and served as Topic Chair in different editions;
- Marcello Traiola: is the 2025 Publicity Co-Chair;
- Bastien Deveautour: was the Finance Chair in 2023 and the 2025 PhD Forum Co-Chair.

## 13. INRIA, TARAN: FROM RADIATION EXPERIMENTS TO MICROARCHITECTURE AND SOFTWARE: A HOLISTIC PERSPECTIVE ON RELIABILITY RESEARCH <sup>1</sup>

*Marcello Traiola, Fernando Fernandes dos Santos, Angeliki Kritikakou. Univ Rennes, Inria, CNRS, IRISA, and IUF*

### 13.A. Historical perspective of related research

Systems designed with modern technologies have increased susceptibility to external perturbation sources, such as radiation, voltage variation, and electromagnetic signals, leading to an increase in transient perturbations occurring during computation. In this context, our team has been exploring novel methods to perform a vulnerability assessment of a system and to protect architectures against faults. Our protection mechanisms have not only a low overhead in terms of area, performance, and energy, but also a significant impact on improving the resilience of the architecture under consideration. Such protections require acting at most layers of the design stack.

### 13.B. Current research, frameworks and projects

1) *Cross-layer reliability assessment approaches:* We introduced FLODAM [57], an innovative methodology that enhances the reliability assessment of hardware systems. FLODAM addresses the limitations of traditional approaches by offering a comprehensive analysis that spans from the semiconductor layer to the application layer. This cross-layer perspective ensures a more thorough understanding of potential vulnerabilities in complex hardware designs. FLODAM's approach integrates various layers of hardware design, enabling the identification of reliability issues that may not be apparent when examining individual layers in isolation. By considering interactions between layers, FLODAM provides a holistic view of system reliability, facilitating the detection of subtle faults and the development of more robust hardware solutions. One of the key advantages of FLODAM is its ability to combine the benefits of existing reliability analysis methods. Traditional techniques often focus on specific layers or aspects of hardware design, potentially overlooking critical cross-layer interactions. FLODAM bridges this gap by offering a unified framework that encompasses multiple layers, leading to more accurate and comprehensive reliability assessments. The implementation of FLODAM involves a systematic process that begins with the semiconductor layer and extends through to the application layer. This process includes the evaluation of material properties, circuit design, architectural considerations, and software interactions. By analyzing each layer in the context of the entire system, FLODAM identifies potential failure points and suggests targeted improvements to enhance overall reliability.

Similarly, we have addressed the critical issue of hardware reliability in Vision Transformers (ViTs), which are advanced machine-learning models known for their high accuracy. Due to their substantial size and complexity, ViTs are particularly susceptible to hardware faults, posing significant challenges in ensuring their dependable operation. Traditional microarchitectural fault simulations are often inadequate for evaluating large ViT models, as they can require extensive time, potentially years, to yield statistically meaningful data. To overcome this limitation, we proposed a two-level evaluation methodology that combines empirical data from neutron beam experiments with software fault simulations [58]. This approach provides a more efficient and broader assessment of ViT reliability. We conducted over 70 hours of neutron beam experiments and more than 600 hours of software fault simulations, focusing on 12 different ViT models executed on two NVIDIA GPU architectures. This extensive data collection enabled a thorough characterization of fault models within ViT kernels, identifying specific faults that are more likely to propagate and affect the output. Understanding these fault propagation pathways is essential for developing targeted mitigation strategies to enhance the robustness of ViT models against hardware-induced errors. Building upon the fault characterization findings, we introduced *MaxiMum corrupted values* (MaxiMals), a low-cost mitigation technique specifically designed to reduce the impact of transient faults in ViTs. MaxiMals operates by identifying and correcting corrupted values within the model's computations, thereby preventing the propagation of errors that could compromise the model's performance. Experimental results demonstrated that MaxiMals effectively corrected 90.7% of critical failures, with execution time overheads as low as 5.61%. This balance between fault mitigation efficacy and computational efficiency makes MaxiMals a practical solution for enhancing ViT reliability in real-world applications.

This work has been partially funded by the RAPID DGA project FLODAM, an industrial research project dedicated to the reliability assessment of RISC-V processors.

2) *Reliability Assessment Overhead Reduction:* In an effort to reduce the time and resource overhead when performing reliability assessments for Deep Neural Networks (DNNs), we proposed harDNNing [59]. Given the extensive size and complexity of DNNs, traditional fault tolerance methods often incur prohibitive overheads, making them impractical for real-world applications. To address this challenge, we proposed a comprehensive framework that assesses fault tolerance and implements cost-effective protection mechanisms tailored to the specific characteristics of DNNs. The framework begins with a datatype-and-layer-based fault injection process, which is driven by the unique attributes of the DNN under consideration. This targeted fault injection enables the identification of vulnerabilities at both the parameter and bit levels within different layers of the network. By focusing on these specific areas, the approach ensures a more precise and efficient fault analysis compared to generic fault injection methods. Following the fault injection phase, the framework employs classification-based machine learning techniques to predict the criticality of network parameters and their individual bits. This predictive analysis allows for the differentiation between critical and non-critical components, facilitating the selective application of protection measures. Such granularity ensures that resources are allocated effectively, enhancing fault tolerance without imposing unnecessary computational or memory overheads. To protect the identified critical components, we use selective Error

<sup>1</sup>Lucas Roques, Romaric Nikiema, Olivier Sentieys

Correction Codes (ECCs). This strategy provides robust protection for essential parameters and bits while maintaining low cost. The efficacy of the proposed framework is demonstrated through experiments on two Convolutional Neural Networks (CNNs), each utilizing four different data encodings. The results show that the framework enhances the fault tolerance in a cost-effective manner, making it a viable solution for deploying reliable DNNs.
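The bit-level part of such an analysis can be sketched in a few lines of Python. The example below flips each bit of a single float32 parameter and marks a bit as critical when the resulting deviation exceeds a tolerance. This is a deliberately simplified stand-in: the tolerance-based criterion, the function names and the single-weight scope are assumptions made for illustration, whereas the framework in [59] injects faults into whole networks and predicts criticality with machine-learning classifiers.

```python
import math
import struct


def flip_bit(value, bit):
    """Flip one bit (0 = mantissa LSB, 31 = sign) of a float32 value."""
    (raw,) = struct.unpack("<I", struct.pack("<f", value))
    (out,) = struct.unpack("<f", struct.pack("<I", raw ^ (1 << bit)))
    return out


def critical_bits(weight, tolerance):
    """Return the bit positions whose flip moves the weight by more than tolerance."""
    critical = []
    for bit in range(32):
        faulty = flip_bit(weight, bit)
        # Non-finite results (inf/NaN) are always treated as critical.
        if not math.isfinite(faulty) or abs(faulty - weight) > tolerance:
            critical.append(bit)
    return critical


print(critical_bits(0.5, 0.1))
```

For the weight 0.5 and a tolerance of 0.1, only the sign bit, the eight exponent bits and the two most significant mantissa bits are flagged, which is the kind of asymmetry that makes selective protection of a few bits attractive.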

3) *Impact of tools' high-level parameters on reliability:* We explored the effects of High-Level Synthesis (HLS) tools on the reliability of hardware accelerators designed for Artificial Neural Networks (ANNs) [60]. The study focuses on how different HLS design parameters, particularly resource reuse, influence the error rates of ANN accelerators when exposed to high-energy neutron radiation. We experimented on FPGA-based accelerators generated through HLS, exposing them to neutron beam tests to evaluate their resilience to transient faults. The results show that HLS parameters significantly impact the error rates, with larger accelerators exhibiting higher error rates but also delivering more correct executions before failure.

The study highlights the trade-offs between hardware resource usage and reliability. By varying the reuse parameters in HLS, we show that accelerators with higher resource reuse (smaller designs) tend to have lower error rates but longer execution times, while those with lower reuse (larger designs) achieve faster execution but are more susceptible to errors. Interestingly, despite the increased error rate, the larger designs can deliver up to $15\times$ more correct executions than the smaller designs. We measured this through the *Mean Executions Between Failures (MEBF)* metric, which provides a more holistic view of reliability by considering both error rates and execution time. The results show that larger accelerators, despite their higher error rates, can achieve significantly higher MEBF values, meaning they can perform more correct classifications before experiencing a failure. Furthermore, we observed that critical Silent Data Corruptions (SDCs, i.e., model misclassification) are more frequent in larger accelerators, which aligns with the increased likelihood of faults affecting critical computations in these designs. These findings underscore the importance of carefully selecting HLS parameters to balance performance and reliability in ANN hardware accelerators. Finally, we also compared bare-metal implementations with those running on an operating system, showing that the OS significantly increases the rate of detected unrecoverable errors (DUEs, i.e., system crashes) due to the additional complexity and resource usage.
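The reason a design with a higher error rate can still complete more correct executions follows directly from how MEBF combines the two quantities. The sketch below assumes the commonly used definition, MEBF = MTBF / execution time, which the text does not spell out. The failure rates and execution times are invented for illustration only and are not measurements from [60].

```python
def mebf(failures_per_hour, exec_time_s):
    """Mean Executions Between Failures, assuming MEBF = MTBF / execution time."""
    mtbf_s = 3600.0 / failures_per_hour   # mean time between failures, in seconds
    return mtbf_s / exec_time_s


# Hypothetical designs (illustrative numbers only):
# high reuse  -> small, slow, lower failure rate
# low reuse   -> large, fast, higher failure rate
small = mebf(failures_per_hour=0.01, exec_time_s=2.0)
large = mebf(failures_per_hour=0.04, exec_time_s=0.1)

print(f"small: {small:.0f}, large: {large:.0f}, ratio: {large / small:.1f}")
```

In this made-up case the larger design fails four times as often but runs twenty times faster, so it completes five times more executions between failures.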

This work has been partially funded by the ANR project RE-TRUSTING (ANR-21-CE24-0015-02). The RE-TRUSTING project focuses on developing fault models and performing failure analysis of hardware platforms for AI (HW-AI) to study their vulnerability. This means ensuring that the hardware is error-free and that the AI hardware does not compromise the AI prediction accuracy and does not bias AI decision-making.

4) *Timing-aware Fault Tolerance:* We also investigate the effects of transient faults on the timing behavior of real-time systems, particularly focusing on how these faults impact Worst-Case Execution Time (WCET) estimations. Traditional WCET estimation methods assume fault-free hardware, which can lead to unsafe timing guarantees when transient faults occur. These faults can significantly alter the execution time of applications, leading to potential violations of real-time constraints. Few techniques consider the timing impact of faults, especially those occurring in processor cores. Hence, we proposed a novel mechanism to mitigate these effects with minimal timing overhead, ensuring that WCET estimations remain reliable even in the presence of faults [61].

First, we performed a vulnerability analysis to assess the impact of transient faults on both functional and timing behavior. Using a RISC-V core as a case study, we performed extensive microarchitectural fault injection experiments to evaluate how faults affect execution time. The results reveal that transient faults can increase the maximum observed execution time by up to 700% compared to fault-free execution, significantly distorting WCET estimations. Also, we show that traditional fault-free WCET approaches are insufficient for systems operating in fault-prone environments. To address the timing impact of transient faults, we proposed a fault-tolerant mechanism that uses two identical cores operating in lock-step, comparing their pipeline registers at each clock cycle to detect faults. When a fault is detected, the system discards the faulty results and restores the correct values from a backup register, ensuring that the fault does not propagate and affect the execution time. The mechanism introduces a bounded overhead of only two clock cycles, making it highly efficient compared to traditional fault-tolerant techniques like watchdog timers or task re-execution, which often incur significant timing penalties. Our experiments showed that the proposed approach restores WCET estimations close to fault-free levels, even under fault injection.
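The detect-discard-restore behaviour and its bounded timing cost can be mimicked with a small behavioural model in Python. This is a toy abstraction, not the microarchitecture of [61]: the whole pipeline state is reduced to a single integer, faults only hit one core, and the names `run_lockstep` and `RECOVERY_CYCLES` are hypothetical. The only figure taken from the text is the two-cycle recovery overhead.

```python
RECOVERY_CYCLES = 2  # bounded recovery overhead reported in the text


def run_lockstep(program, faults):
    """Run `program` on two lock-stepped cores and return (final_state, cycles).

    program: list of functions mapping state -> next state (one per cycle).
    faults:  dict {step_index: corrupt_fn}; the fault hits core A only.
    """
    state_a = state_b = backup = 0
    cycles = 0
    for step, op in enumerate(program):
        next_a, next_b = op(state_a), op(state_b)
        if step in faults:
            next_a = faults[step](next_a)       # transient fault in core A
        cycles += 1
        if next_a != next_b:
            # Mismatch detected: discard both results, restore the last
            # known-good state from the backup and redo the step.
            next_a = next_b = op(backup)
            cycles += RECOVERY_CYCLES
        state_a, state_b = next_a, next_b
        backup = state_a                        # commit known-good state
    return state_a, cycles


program = [lambda s: s + 1] * 10
print(run_lockstep(program, {}))                       # fault-free run
print(run_lockstep(program, {3: lambda v: v ^ 0x8}))   # one injected bit flip
```

In this model each detected fault adds exactly `RECOVERY_CYCLES` to the execution time and leaves the final result unchanged, which is what allows a fault-aware timing bound to stay close to the fault-free one.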

This work has been partially funded by the ANR project FASY (ANR-21-CE25-0008-01). FASY aims to tackle the combined challenge of designing time-predictable and reliable multicore embedded systems. It provides the means to analyze both the functional and timing behavior of applications executed on multicore architectures, perform fault-aware WCET estimation, and design cores with time-predictable and reliable execution under faults.

### 13.C. Perspectives and outlook

In the upcoming years, we will be developing research on the reliability assessment of computing architectures relying on emerging technologies, in particular as part of the recently EU-funded project ARCHYTAS (ARCHitectures based on unconventional accelerators for dependable/energY efficient AI Systems) - European Defence Fund EDF-2023-RA, 2025-2028, in collaboration with 5 industrial partners, 6 SMEs, and 14 academic partners. The ARCHYTAS project aims to investigate and study the feasibility of non-conventional AI accelerators for defense applications that take advantage of novel technologies at the device and package level: optoelectronic-based accelerators, volatile and non-volatile processing-in-memory, and neuromorphic devices. In the project, our team will be a WP leader and conduct analyses of the state-of-the-art reliability assessment approaches for existing AI hardware accelerators regarding their applicability to ARCHYTAS accelerators.

## 14. HiCREST: RELIABILITY IN THE AI AND QUANTUM COMPUTING ERA

*G. Casagrande, M. Vallero, F. Vella, P. Rech.  
University of Trento, Italy*

The High performance Computing and Reliable Systems in Trento (HiCREST) Laboratory was created to bring together expertise in programming, models, compilers, computing architecture, physics, and reliability with the joint goal of improving both the performance and the reliability of systems. Current and future computing architectures, applications, and technologies offer unprecedented opportunities in terms of computing power and efficiency. Unfortunately, the intrinsic complexity of the computing hardware and the probabilistic nature of computation make reliability evaluation and improvement challenging. HiCREST exploits the synergy derived from different backgrounds and research interests to stimulate the design and implementation of innovative solutions and allows an investigation that crosses all the abstraction layers from the physical implementation to the algorithm definition. In this scenario, the ever-increasing interest in adopting Machine Learning in safety-critical applications, such as automotive, space, and industrial production, exacerbates the need for performance, reliability, and efficiency. The research philosophy of HiCREST is depicted in Figure 14.1 and described in [62].

### 14.A. Historical path to AI and Quantum Reliability

Reliability has always been considered a fundamental aspect of computing systems. The required reliability level depends on the criticality of the application and was typically measured through detailed investigation at transistor level. Memories were normally considered the most critical core of the computing system, and a lot of effort has been devoted to understanding and improving their reliability. With the rise of neural networks and of parallel computing architectures such as Graphics Processing Units (GPUs), the focus has shifted from memory to data-path and computing errors.

We were the first to perform a reliability study on **Graphics Processing Units (GPUs)**, back in 2013<sup>1</sup>, well before the extensive use of GPUs for machine learning acceleration. Our pioneering work showed how to properly investigate the reliability of complex and parallel devices, underlining the importance of a realistic evaluation that is not necessarily based on software fault injection. In fact, given the complexity of the architecture, a single impinging particle is unlikely to induce a simple single bit-flip at software level. It is therefore necessary to tune the fault injection with a physical- and architecture-level study. In particular, the way the work is distributed across the available computing units can significantly modify both the error rate and the error manifestation at the output (i.e., how many output elements are corrupted). We have also characterized the effect of radiation on the execution of Convolutional Neural Networks (CNNs)<sup>2</sup>, proposing innovative mitigation solutions [63], since ECC alone is not sufficient for CNN reliability.

<sup>1</sup>Rech, P. et al., 2013 18th IEEE European Test Symposium (ETS)

<sup>2</sup>Santos, Fernando Fernandes dos et al., IEEE Trans. on Rel., Analyzing and Increasing the Reliability of Convolutional Neural Networks on GPUs

Additionally, at ETS 2014 we studied how the use of an **Operating System** can impact the radiation effect on embedded systems<sup>3</sup>. We have seen that the OS does not modify the probability of Silent Data Corruptions (SDCs) but can significantly impact the Detected Unrecoverable Errors (DUEs). We have also investigated how to exploit cache concurrencies to reduce the impact of memory errors on the computation. Flushing the cache can be beneficial in reducing the error rate of a code. Later, we also investigated in depth the reasons for this behaviour, combining architectural fault injection with beam experiments on ARM systems.

Additionally, we have studied the reliability of **Field Programmable Gate Arrays**, considering various aspects such as ageing and voltage scaling, and, with a stronger focus on AI, proposing selective hardening for deep neural networks [64]. Later, we also investigated the possibility of implementing soft cores in FPGAs and evaluated their reliability.

We have also evaluated the effect of **Reduced Precision** on the reliability of machine learning applications. We have seen that reducing the data or operation precision is beneficial in reducing the error rate of the application. However, the impact of a fault on the output correctness can be higher in low-precision operations, eventually increasing the probability of misclassifications. This holds for GPUs but also for FPGAs. We have exploited the mixed-precision units of GPUs to propose a low-cost Reduced-Precision Duplication With Comparison.
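
The principle of duplication with comparison using a reduced-precision replica can be sketched in a few lines. This is only an illustration under stated assumptions, not the GPU implementation referred to above: the helper names, the emulation of single and half precision through Python's `struct` module, and the tolerance `rel_tol` are all hypothetical choices made for the example.

```python
import struct

def to_f32(x):
    """Round a Python float to IEEE 754 single precision."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def to_f16(x):
    """Round a Python float to IEEE 754 half precision."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

def flip_bit_f32(x, bit):
    """Emulate a single bit-flip in the 32-bit encoding of x."""
    (word,) = struct.unpack("<I", struct.pack("<f", x))
    return struct.unpack("<f", struct.pack("<I", word ^ (1 << bit)))[0]

def dot(xs, ys, rnd):
    """Dot product in which every intermediate value is rounded by rnd."""
    acc = 0.0
    for x, y in zip(xs, ys):
        acc = rnd(acc + rnd(rnd(x) * rnd(y)))
    return acc

def rp_dwc_dot(xs, ys, rel_tol=0.05, fault_bit=None):
    """Return (result, error_detected) for a dot product checked by a half-precision copy."""
    full = dot(xs, ys, to_f32)
    if fault_bit is not None:
        # Hypothetical fault: corrupt the full-precision result only.
        full = flip_bit_f32(full, fault_bit)
    replica = dot(xs, ys, to_f16)
    bound = rel_tol * max(abs(replica), 1.0)
    # The comparison is written so that a NaN result also counts as a mismatch.
    agrees = abs(full - replica) <= bound
    return full, not agrees

xs = [0.5, 1.25, 2.0, 0.1]
ys = [1.0, 0.3, 0.75, 4.0]
print(rp_dwc_dot(xs, ys))                # fault-free run
print(rp_dwc_dot(xs, ys, fault_bit=30))  # exponent bit-flip
print(rp_dwc_dot(xs, ys, fault_bit=0))   # least significant mantissa bit-flip
```

In this sketch the replica can only reveal errors larger than its own rounding error, so a flip in a low-order mantissa bit passes the check while a flip in the exponent is flagged. The tolerance therefore trades detection coverage against false alarms.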

We have seen that, for Commercial-Off-The-Shelf (COTS) devices, the **silicon process** used to implement the device has an impact on its radiation sensitivity. It is therefore necessary to perform radiation experiments to be sure not to underestimate the radiation-induced error rate of a COTS device.

Finally, we have put a lot of effort into understanding and mitigating the effect of radiation in **object detection applications**<sup>4</sup>, sticking to our experimental and analytical philosophy, which guarantees accurate and precise evaluations and allows the design of efficient and effective hardening solutions.

### 14.B. Current reliability research philosophy

With the evolution of computing systems, reliability qualification and improvement needed to be adapted. The complexity of modern computing architectures, which are often parallel and include dedicated functional units, makes reliability evaluation very challenging. Trying to perform an accurate and precise evaluation of each transistor and to track the fault propagation to the output is a mere illusion. Moreover, the rise of Artificial Intelligence completely changes the concept of correct execution itself, since the output is intrinsically probabilistic. In recent years HiCREST has also focused its reliability research on **Quantum Computing (QC)**, as shown in Figure 14.2. In fact, the limits to the miniaturization of classical chips, in conjunction with increasing investments in the field, have made QC a compelling technological solution for an ever larger class of problems. The key feature of

<sup>3</sup>Santini, Thiago et al., 2014 European Test Symposium (ETS), Reducing embedded software radiation-induced failures through cache memories

<sup>4</sup>Special Session at 2024 IEEE European Test Symposium (ETS), Reliability and Security of AI Hardware



Fig. 14.1. HiCREST philosophy: combine physical level beam experiments with fault injection to propose accurate evaluation and effective hardening solutions.



Fig. 14.2. HiCREST philosophy adapted to quantum technology.

QC is to exploit quantum properties of matter (superposition and entanglement) as a computing resource rather than as an interference. A quantum bit (qubit) encodes information in a controllable two-level quantum mechanical system. Since the qubit is a quantum system, its state is probabilistic and so is the output of the quantum circuit. With quantum gates, the probability of being in the 0 or 1 state can be biased. Despite the intense dedication of the community, a crucial issue still acts as a bottleneck for the large-scale adoption of QC: every existing implementation suffers from reliability problems. The focus has been on the study of the intrinsic decoherence problem, called noise, which has already led to working (but extremely expensive) suppression strategies, namely surface codes, while other threats, such as radiation, have been neglected. We have therefore decided to start a novel research line, devoted to the understanding and improvement of the radiation reliability of quantum computing. Following the HiCREST research philosophy, we are combining physical beam experiments and simulation on qubits with fault simulation and fault propagation in quantum circuits. We have seen that quantum devices are extremely likely to be corrupted by radiation<sup>1</sup>, particularly by muons (to which CMOS is immune). We have also formalized the **Quantum Vulnerability Factor (QVF)** to quantify the probability for a fault to propagate in a quantum circuit [65] and built an open-source **quantum fault injector (QuFI)** to ease the fault simulation, even considering Quanvolutional Neural Networks (QNNs)<sup>2</sup>. We have also identified the weaknesses of current **Quantum Error Correction (QEC)** codes against radiation-induced faults, proposing alternative algorithmic solutions [66].
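
The idea of injecting a fault into a quantum circuit and observing how far the output distribution moves can be sketched with a single-qubit state-vector simulation. This is a minimal illustration and neither QuFI nor the QVF definition of [65]: the two-Hadamard circuit, the rotation-based fault model `fault_gate(theta, phi)`, and the total-variation metric `output_deviation` are all assumptions made for the example.

```python
import cmath
import math

S = 1 / math.sqrt(2)
HADAMARD = ((S, S), (S, -S))

def apply(gate, state):
    """Multiply a 2x2 gate by a single-qubit state vector."""
    return tuple(sum(gate[r][c] * state[c] for c in range(2)) for r in range(2))

def fault_gate(theta, phi):
    """Hypothetical fault model: the rotation Rz(phi) * Ry(theta) applied to the qubit."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    lo, hi = cmath.exp(-0.5j * phi), cmath.exp(0.5j * phi)
    return ((lo * c, -lo * s), (hi * s, hi * c))

def run(theta=0.0, phi=0.0):
    """Run H, fault, H on |0> and return the measurement probabilities."""
    state = (1 + 0j, 0j)
    state = apply(HADAMARD, state)
    state = apply(fault_gate(theta, phi), state)  # identity when theta = phi = 0
    state = apply(HADAMARD, state)
    return [abs(amp) ** 2 for amp in state]

def output_deviation(theta, phi):
    """Total variation distance between fault-free and faulty output distributions."""
    golden, faulty = run(), run(theta, phi)
    return 0.5 * sum(abs(g - f) for g, f in zip(golden, faulty))

for theta, phi in [(0.0, 0.0), (math.pi / 2, 0.0), (0.0, math.pi)]:
    print(theta, phi, round(output_deviation(theta, phi), 6))
```

In this toy circuit a pure phase fault of magnitude pi between the two Hadamard gates turns the fault-free outcome 0 into outcome 1, which shows why phase errors that have no classical counterpart still need to be considered when evaluating fault propagation.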

### 14.C. Future challenges and opportunities

The HiCREST group is well aware of the challenges of guaranteeing high reliability in novel applications and technologies. Our idea is to have an accurate reliability evaluation, based on beam experiments or physical simulations. This is the only known way to have a realistic understanding of the fault probability and fault manifestation at higher levels of abstraction. Then, we strive to identify the faults that are more likely to corrupt the final application output. By protecting the system from these critical faults or blocking their propagation, we increase the efficiency of the hardening solution. When a mitigation solution is implemented, we always validate it with experiments, so as to avoid the risk of it being ineffective.

Additionally, we exploit the (reliability) opportunity that comes from ML and quantum technologies rather than seeing these as complex and hard to understand. The only way to ensure high reliability in future technologies is understanding the opportunities they offer, re-thinking the reliability problem as a feature more than a mere overhead.

<sup>1</sup>Giorgio Casagrande et al. "Understanding the Contributions of Terrestrial Radiation Sources to Error Rates in Quantum Devices". IEEE Transactions on Nuclear Science

<sup>2</sup>Vallero, Marzio et al., IEEE Transactions on Quantum Engineering, Understanding Logical-Shift Error Propagation in Quanvolutional Neural Networks

## 15. IHP: RELIABILITY-AWARE HARDWARE DESIGN FOR EMERGING APPLICATIONS

*Leticia Maria Bolzani Poehls<sup>1</sup>, Milos Krstic<sup>1,2</sup>, Marko Andjelkovic<sup>1</sup>, Fabian Luis Vargas<sup>1</sup>.*

<sup>1</sup>*IHP – Leibniz Institute for High Performance Microelectronics, Frankfurt Oder, Germany,*

<sup>2</sup>*University of Potsdam, Germany*

For more than a decade, the Leibniz Institute for High Performance Microelectronics (IHP) has worked actively on the development of highly reliable hardware for safety-critical applications. All activities related to this subject have been carried out in the System Architecture Department, and several relevant projects have been funded and completed in recent years. More recently, IHP has also started to focus on the quality- and reliability-aware design of high-performance hardware for edge AI applications. This historical retrospective explores these aspects further and presents a strategic vision with respect to the European Test Symposium (ETS) community.

### 15.A. IHP Historical Perspective

The Leibniz Institute for High Performance Microelectronics (IHP) was founded in 1983 as an institute of the Academy of Sciences [1]. In 1992, the institute was re-established as a GmbH and became a member of the "Arbeitsgemeinschaft Forschungseinrichtungen Blaue Liste", today the Leibniz Association, and was included in the joint federal and state funding. Since April 2017, it has been part of the nationally coordinated Research Fab Microelectronics Germany (FMD). A new institute building with a 1000 m<sup>2</sup> class 3 clean room according to DIN EN ISO 14644-1 was built and started to operate in 1999. The existing clean room was extended to 1500 m<sup>2</sup> from 2018 to 2020. With all further extensions realized until 2013 and the increased clean room area, the institute now possesses more than 12,500 m<sup>2</sup> of usable space. IHP is thus a non-university research establishment and, as a member of the Leibniz Association, is institutionally funded by the State of Brandenburg, the Community of States and the Federal Republic of Germany. Synergies are achieved through the coordinated collaboration of the Materials Research, Technology, Circuit Design, System Architectures and Wireless Systems departments with their core competencies, resulting in vertically optimized solutions. These core competencies enable IHP to make significant contributions to current and future societal challenges such as health, security, mobility, sustainability and communication technologies. The Department of System Architectures has a long-standing history of research in the domain of highly reliable hardware design for safety-critical applications and, more recently, for edge AI applications. In this context, the paper describes key activities and related projects of relevance to the European Test Symposium (ETS) community and summarizes the new strategic vision of IHP, which moves its focus to the AI domain.

The subject of reliability is a very relevant aspect of the main activities of the Department of System Architectures. The IHP team started with Milos Krstic leading the Group of Design and Test Methodology in 2010 and later also the Department of System Architectures [67, 68, 69]. In 2016, Marko Andjelkovic joined IHP, complementing the Department's expertise in the domain of fault-tolerant computing. The team's expertise in the test and reliability of safety-critical applications was further complemented when Fabian Luis Vargas joined IHP in 2022. In addition, Leticia Maria Bolzani Poehls joined IHP in January 2025 to lead the newly created Group of Neuromorphic Hardware, demonstrating the dedication to new activities in the AI domain as an integral part of IHP's vision [70, 71]. From a historical perspective, the major application driver for the subject of reliability has been space applications. In this specific context, several activities have been carried out on design methodologies for radiation-hard circuits and fault-tolerant processing. It is worth mentioning the exploration of LEON 3FT processors as early as twenty years ago. Notable developments were achieved within the EU project VHiSSI from 2012 to 2015, focusing on SpaceFibre ASIC design and evaluation within a European consortium. In addition, the development of rad-hard communication chips was the focus of the EU project SEPHY (2015-2018). Currently, IHP also has the Eurostars project SECHIS (2017-2020) and the BMBF 6G-TakeOff project, where the main goal is to develop a rad-hard baseband processor for satellites. In recent years, IHP has been dedicated to the development of space microcontrollers and reconfigurable fault-tolerant multi-processors. The work on microcontrollers started successfully with the remote terminal unit ASIC, assembling a 32-bit LEON processor into a fully functional dual-chip-in-package system. The major achievement of 2024 is the successful conclusion of the EU project MORAL, which was devoted to the development of a 32-bit microcontroller based on the internal PEAKTOP ISA.
This chip also includes a number of digital interfaces, as well as 12-bit ADCs and DACs. After validation, the chip was characterized, and tests evaluating its endurance under radiation, covering both Total Ionizing Dose (TID) and Single Event Effects (SEEs), confirmed its suitability for space missions. This work was enabled by research on design methodology for space applications, including the development of a radiation-hard library and of hardening methods at the RTL level. At the system level, the major focus was on dynamically reconfigurable multi-processing architectures supporting different operation modes, including core-level redundancy and low-power modes. The first ASIC was produced as early as 2014, demonstrating the success of the initial concepts. Another major step forward was taken within the PISA project, in which a 4-core LEON-based IC was developed that, in addition to its reconfigurability, supports adaptive voltage scaling (AVS) on the cores and memories. Finally, the project Scale4Edge aims to create an ecosystem for a scalable and extendable edge-computing platform based on the RISC-V instruction set. Within Scale4Edge, we have recently introduced the TETRISC chip, transferring the concept to the RISC-V domain. This chip adopts strategies for Silicon Lifecycle Management (SLM): dedicated on-chip sensors were introduced to support and enable the reconfiguration of multiprocessors. In recent years, IHP has added the domain of AI to its focus areas, exploring the reliability of AI processing in complex Deep-Learning accelerators and in specific FPGA-based AI processors. Moreover, IHP has started to investigate emerging technologies that are among the preferred candidates for AI acceleration. In this context, efforts have been undertaken to increase the reliability of AI processing in RRAM-based processing structures. Also in this direction, the use of Graph Neural Networks for accelerating fault injection was successfully demonstrated. Finally, the adoption of ML to analyze the impact of SETs on standard cells, as well as for real-time prediction of solar particle events in space, is being explored at IHP.

### 15.B. Current Activities

Given the clear importance that AI already has in different segments of our society, and its expected increase in the future, the capability to develop high-performance hardware under strict power-consumption constraints while ensuring high quality and reliability can be identified as the key enabler for implementing these emerging applications. In more detail, the development of AI-oriented hardware, adopting not only CMOS but also emerging technologies, is considered essential to address the von Neumann bottleneck. In recent years, the IHP team has started to move in this direction and, with the newly established Group of Neuromorphic Hardware, IHP will be prepared to tackle the most pressing concerns with respect to power consumption, quality and reliability of AI-oriented hardware. The main goal of this Group is the development of disruptive neuromorphic hardware adopting CMOS- and emerging technology-based circuits for implementing a wide range of edge AI applications. The Group takes a multidisciplinary and interdisciplinary approach that integrates activities related to different subjects, such as device technology as well as circuit and architecture design for edge AI applications. The requirements and constraints associated with edge AI applications demand the integration of not only CMOS- but also emerging technology-based circuits and architectures. This heterogeneity significantly increases the complexity of designing hardware for edge AI applications. In this context, a holistic approach considering all lifecycle phases and all abstraction levels, from device, through circuit, architecture and algorithm, to system, is considered mandatory in order to guarantee the implementation of highly reliable neuromorphic hardware. Fig. 15.1 depicts the topics addressed by the Group of Neuromorphic Hardware across all abstraction levels (device, circuit, architecture and algorithm) to guarantee the development of high-performance neuromorphic hardware following a holistic power-, quality- and reliability-aware design strategy for edge AI applications. Note that this is only possible due to the multidisciplinary team at IHP and the long-standing national and international collaborations with industry and academia. These strong collaborations directly contribute to the significant number of funded projects, such as TAICHIP, INSEKT and TWIN-RELECT, among others. In more detail, TAICHIP aims to develop a platform for designing energy-efficient and highly reliable edge AI applications based on advanced AI architectures and open-source RISC-V processor architectures. The goal of INSEKT is to develop an RRAM-based accelerator as an ASIC using the 130nm IHP technology, including design-for-testability strategies aimed at detecting manufacturing deviations at time zero. Finally, the goal of TWIN-RELECT is to develop an EDA tool able to support cross-layer reliability assessment assuming multiple faults, which allows the development of more efficient strategies to provide fault mitigation and tolerance at different abstraction levels.



Fig. 15.1. Activities of the Neuromorphic Hardware Group across all abstraction levels, aimed at the development of neuromorphic hardware for edge AI applications

### 15.C. Perspective and Outlook

In summary, IHP's team addresses a variety of topics related to the ETS community. Besides the topics covered by other related Departments, the research within the Department of System Architectures includes power-, quality- and reliability-aware design of CMOS- and emerging technology-based hardware. In more detail, the IHP team has established itself as a key contributor to the ETS community, not only regarding safety-critical applications but, with growing intensity, also regarding emerging applications. The broad internal knowledge base, in combination with the collaborative and development-oriented research strategy, will enable IHP to continue its contribution to research and society in the future.

## 16. SYNOPSYS: ENSURING QUALITY, SAFETY AND RELIABILITY IN THE EVOLUTION FROM MONOLITHIC SoCs TO MULTI-CHIPLET DESIGNS

*Grigor Tshagharyan, Gurgen Harutyunyan, Valerik Vardanian, Samvel Shoukourian, Yervant Zorian. Synopsys Armenia, Embedded Test & Repair Group, Yerevan, Armenia*

### 16.A. Historical Perspective

Under the mentorship of Dr. Zorian, the Synopsys Armenia Embedded Test & Repair team has played a pivotal role in advancing test, repair, and reliability in the semiconductor industry. This journey began in the mid-1990s with the founding of Virage Logic and has since been marked by continuous innovation in response to evolving industry challenges. While research priorities have shifted over the decades, the team's commitment to advancing the test community and solving complex problems has remained unwavering. Technological evolution over this time can be viewed in three key epochs, each defined by distinct challenges and innovative solutions.

#### *First Epoch: The Era of Yield and Quality*

In the early years, chip yield and quality were the semiconductor industry's primary concerns, and BIST solutions emerged as a cost-effective, high-efficiency alternative to traditional ATE test. With memories occupying up to 80–90% of total chip area, they became the main contributors to yield loss. Therefore, research primarily focused on realistic fault models for SRAM and robust BIST methods for their detection. Functional fault models were expanded to address complex memory defects, including static, dynamic, and linked faults, resulting in several key publications, including [72] presented at ETS. Another key research priority was SoC yield optimization, which led to the development of efficient redundancy allocation and repair strategies, as outlined in [73].

During this era, memory technology underwent significant changes. As MOSFET scaling approached its limits due to leakage and short-channel effects, the industry adopted FinFET technology, which improved electrostatic control and continued Moore's Law. However, FinFET introduced new defect mechanisms that existing fault models could not capture, prompting the development of new test algorithms for detection. Because foundries rarely share real silicon data, the Advanced Inductive Failure Analysis (AIFA) method was introduced to systematically investigate FinFET defects using memory physical layout data [74], leading to the development of more efficient, technology-specific BIST solutions.

As technology advanced, the fault landscape has significantly evolved and expanded, leading to more sophisticated failure mechanisms that demand increasingly complex test sequences for detection. This trend inspired the development of the Fault Periodicity Table (FPT) – a structured model for fault representation that not only reflects the current state but also helps anticipate future developments. This table serves as a predictive model, offering insights into the trajectory of technological development. Each column in FPT represents a fault nature associated with specific test techniques, while each row groups fault families based on their complexity.

#### *Second Epoch: The Rise of Safety and Reliability*

While quality dominated the first epoch, the second brought safety and reliability to the forefront, forming a new triad of industry priorities. The main driving force behind this paradigm shift was the rapid advancement of the automotive market, accompanied by the evolution of other markets, including industrial, medical and avionics. The trend toward greater safety and a better driving experience forced automakers to continually integrate large numbers of E/E driving assistance and infotainment systems into their vehicles. Automakers had to seek both high-performance systems (enabled by advanced FinFET nodes) and extremely low DPPM and FIT rates for safety-critical applications. This demanded a fundamental redesign of the BIST architecture and a rethinking of the test solution [75]. In this epoch, testing expanded beyond manufacturing, encompassing the entire product lifecycle, from production to in-field operation. New test modes and metrics emerged, redefining the industry's approach to fault prevention and system resilience.

#### *Third Epoch: The Era of Multi-Die Chips*

The third epoch marks a fundamental shift in the SoC design paradigm. Increasing chip complexity and diminishing yield have made single-die integration economically impractical. As a result, the industry is embracing multi-die architectures, distributing functionality across multiple dies manufactured using different technologies and with different supply chains, enabling true heterogeneous integration. Although multi-die and chiplet-based designs offer significant advantages, they also introduce new test and repair challenges, especially in ensuring scalability and efficiency across interconnected dies. With standardization efforts still in progress, developing a scalable test & repair methodology remains a priority.

Another critical factor spanning the second and third epochs is the rise of Silicon Lifecycle Management (SLM) [76]. Continuous silicon health monitoring has become essential, driven by:

- **Scaling Complexity:** With each new technology node, growing transistor counts and interconnect densities amplify the manufacturing variability.
- **System Complexity:** Tight hardware-software interactions, aging, degradation, power fluctuations, and computational workloads create new reliability challenges.
- **Advanced Packaging:** Multi-chip and 2.5D/3D stacking raise complexities with interconnect integrity, thermal concerns, and inter-die/inter-chip fault diagnosis.

### 16.B. Key Research Areas and Publications

The research carried out by the team over the past few decades can be divided into the following key topics:

#### *Fault Modeling & Test Algorithm Development*

Fault modeling has been one of the team's earliest and most extensively explored research areas, leading to numerous publications in prominent conferences and journals. Various fault families, including static, dynamic, as well as linked and unlinked faults, have been thoroughly investigated, resulting in the development of effective and/or optimal test solutions for each category. In particular, minimal march tests were proposed to detect different fault types, including detection of all unlinked static faults addressed in [VTS'2005], detection of linked static faults presented in [VTS'2006], and the detection of dynamic faults proposed in [ETS'2006]. Beyond fault detection, efficient march tests were also designed for fault localization in [DDECS'2006] and full diagnosis of static faults in [EWDTs'2006]. In addition, a software tool was developed for generating effective test algorithms tailored to a specific set of fault models, as described in [EWDTs'2007], while an advanced structure-oriented method for march test generation was introduced in [TCAD'2012].
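
For readers less familiar with march tests, the general mechanism can be sketched with the textbook March C- algorithm applied to a simulated bit-oriented memory. This is an illustration only: March C- is a well-known generic test rather than one of the minimal tests proposed in the works cited above, and the `FaultyMemory` class with its stuck-at fault model is a hypothetical stand-in for a real memory.

```python
class FaultyMemory:
    """Bit-oriented memory model with optional stuck-at cells (hypothetical)."""

    def __init__(self, size, stuck_at=None):
        self.cells = [0] * size
        self.stuck_at = stuck_at or {}  # address -> value the cell is stuck at

    def write(self, addr, value):
        self.cells[addr] = self.stuck_at.get(addr, value)

    def read(self, addr):
        return self.stuck_at.get(addr, self.cells[addr])

# March C-: {any(w0); up(r0,w1); up(r1,w0); down(r0,w1); down(r1,w0); any(r0)}
MARCH_C_MINUS = [
    ("up", [("w", 0)]),
    ("up", [("r", 0), ("w", 1)]),
    ("up", [("r", 1), ("w", 0)]),
    ("down", [("r", 0), ("w", 1)]),
    ("down", [("r", 1), ("w", 0)]),
    ("up", [("r", 0)]),
]

def run_march(memory, size, test=MARCH_C_MINUS):
    """Apply a march test and return the sorted addresses with a read mismatch."""
    failing = set()
    for direction, operations in test:
        addresses = range(size) if direction == "up" else range(size - 1, -1, -1)
        for addr in addresses:
            for op, value in operations:
                if op == "w":
                    memory.write(addr, value)
                elif memory.read(addr) != value:
                    failing.add(addr)
    return sorted(failing)

print(run_march(FaultyMemory(8), 8))                   # fault-free memory
print(run_march(FaultyMemory(8, stuck_at={3: 1}), 8))  # cell 3 stuck at 1
```

Each march element visits every address in a fixed order and applies the same short sequence of reads and writes, which is what makes such tests easy to implement in BIST hardware.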

#### *Periodicity of Faults*

Insights from previous research revealed distinct regularities and periodicities in the evolution of fault mechanisms, such as the special symmetry measure, introduced in [JETTA'2011], which was applied in BIST implementations. To systematically capture these findings, a Fault Periodicity Table (FPT) was proposed in [VTS'2013]. The FPT serves two primary purposes: facilitating the design of a generic, programmable BIST architecture capable of covering a wide range of faults and predicting emerging fault mechanisms in upcoming semiconductor technologies. In [EWDTs'2016], the FPT was expanded to include also external memory faults, while [TCAD'2019] uncovered novel periodicity and regularity properties, leading to the formulation of new rules governing fault evolution.

#### *Built-in Self-Test & Self-Repair*

Building on the findings of previous studies, a robust and flexible built-in self-test infrastructure was proposed and progressively refined in [ITC'2002], [CSIT'2011], [IOLTS'2011] and [ATS'2011], laying the groundwork for today's widely adopted BIST architectures. This concept was further enhanced by introducing programmability features, enabling more efficient algorithm coding and test execution. In parallel, yield optimization emerged as another key research focus, with considerable efforts directed toward evaluating existing redundancy allocation and self-repair algorithms, culminating in the development of an optimal repair strategy, supporting both manufacturing and in-field repair scenarios, as detailed in [MTDT'2001], [VTS'2004], and [D&T'2004].
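
The redundancy allocation problem underlying self-repair can be illustrated with a small sketch: given the faulty cells of a memory array and a limited number of spare rows and columns, decide which rows and columns to replace. The exhaustive search below is a simple stand-in chosen for clarity, not the repair strategy of the cited works, and the function name and fault representation are assumptions.

```python
def allocate_spares(faults, spare_rows, spare_cols):
    """Return (rows, cols) to replace so that every fault is covered, or None.

    faults is an iterable of (row, col) pairs. Each uncovered fault is repaired
    either by a spare row or by a spare column, and both options are explored.
    """
    def solve(remaining, rows, cols):
        uncovered = [(r, c) for r, c in remaining if r not in rows and c not in cols]
        if not uncovered:
            return rows, cols
        r, c = uncovered[0]
        if len(rows) < spare_rows:
            result = solve(uncovered, rows | {r}, cols)
            if result is not None:
                return result
        if len(cols) < spare_cols:
            return solve(uncovered, rows, cols | {c})
        return None

    return solve(sorted(faults), frozenset(), frozenset())

# Three faults share row 0, so one spare row and one spare column suffice.
print(allocate_spares({(0, 1), (0, 5), (0, 7), (3, 2)}, spare_rows=1, spare_cols=1))
# Three faults on a diagonal cannot be repaired with one row and one column.
print(allocate_spares({(0, 0), (1, 1), (2, 2)}, spare_rows=1, spare_cols=1))
```

The search is exponential in the number of spares in the worst case, which is acceptable for the handful of spares in this example but is one reason why practical built-in repair relies on more efficient allocation algorithms.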

#### *Advanced Inductive Failure Analysis*

With the advent of three-dimensional transistors, analyzing new defect types became crucial for scaling test solutions to cutting-edge technologies. To address this challenge, an AIFA flow was developed, which automates defect investigation and significantly reduces analysis time. The AIFA process involves defect injection at the GDS or SPICE Netlist level, using a technology-specific defect library, SPICE simulation to thoroughly analyze defect behavior and an automated synthesis of an efficient test algorithm with minimal manual intervention needed. The AIFA method and its implications for FinFET memories were detailed in [VTS'2014], [VTS'2015], [TDMR'2015], and [EWDTs'2015], and further enhancements to support emerging Gate-All-Around FET technology were presented in [VTS'2023] and [TDMR'2025].

#### *Functional Safety and ISO 26262 Compliance*

As Advanced Driver Assistance Systems and In-Vehicle Infotainment became integral to modern automobiles, the demand for higher safety and reliability grew exponentially. The ISO 26262 standard, developed specifically for the automotive industry, defines functional safety requirements necessary for E/E systems in production vehicles, with products classified into four Automotive Safety Integrity Levels. In [ITC'2017], a functional safety-oriented test architecture was introduced for automotive applications, addressing challenges across production, power-up, and mission-mode testing. A follow-up study in [ITC'2018] demonstrated the solution's compliance with ISO 26262. In [ITC'2020], a unified functional safety subsystem architecture was proposed for advanced nanometer nodes, featuring sophisticated ECC mechanisms for mitigating address and data errors.

#### *Silicon Lifecycle Management*

The concept of SLM has emerged as a critical research direction aimed at enhancing functional safety, as well as improving reliability, availability, and serviceability throughout a system's operational life. As part of SLM-driven research, aging and electromigration phenomena in cutting-edge memory technologies were investigated, leading to the development of an efficient in-field test methodology compliant with automotive industry standards. These studies have been published in [ITC'2018], [ITC'2022], [ETS'2023], [VTS'2024], [ITC'2024], and [DT'2024]. Another key research area has focused on reducing FIT rates and enabling predictive maintenance through in-field monitoring. The use of Error Correcting Codes and SLM monitoring subsystems for these purposes was extensively examined in [ITC'2022] and [ITC'2023]. In particular, analytical approaches have been proposed to leverage ECC data for extending memory service life and to monitor functional paths in real time for assessing SoC performance and degradation trends.

### 16.C. Perspectives and Outlook

The evolution of test and repair methodologies across the three epochs outlined earlier highlights the semiconductor industry's relentless innovation in response to ever-changing quality, safety, and scalability challenges. As the industry continues to advance across all levels of integration, new research opportunities are emerging to ensure that test and repair solutions keep pace with next-generation technologies.

At the atomic level, the transition to full-fledged 3D integration is advancing through the adoption of GAA transistors. At the component level, the evolution of next-generation non-volatile resistive and magnetic memories is rapidly progressing. At the SoC level, multi-die architectures are becoming a transformative paradigm, incorporating both protocol-based communication interfaces like HBM and PCIe and non-protocol-based general-purpose communication schemes. Future research will focus on delivering efficient and scalable solutions that not only ensure the testability and repairability of these increasingly complex systems but also address RAS challenges and enable long-term silicon health maintenance.

## 17. SMU & UST: IDENTIFYING PROBLEMS EARLY—EFFICIENT ONLINE TEST AND PREDICTIVE DELAY DETECTION<sup>1,2</sup>

J. Dworak<sup>†</sup>, K. Nepal<sup>‡</sup>, T. Manikas<sup>†</sup>.

<sup>†</sup> Southern Methodist University, Dallas, Texas, USA

<sup>‡</sup> University of St. Thomas, St. Paul, Minnesota, USA

In an era when electronics have become ever more ubiquitous in mission-critical applications, the requirement to test circuits and systems not only at manufacturing, but in the field as well, has taken on greater importance. Field test may occur when a circuit is powered on or powered off, but this may not be sufficient to ensure that latent defects, aging, and other developing or environmentally-sensitive problems will not cause silent data corruption. To address this, we have developed approaches to enhance the efficiency of in-field test application by interleaving test patterns with regularly executing assembly instructions during naturally occurring stall cycles. We have also created new design for test (DFT) circuit structures that can be used at both manufacturing test and in the field to detect defects and predict that new delay-based failures may be imminent. In this paper, we briefly describe these two approaches that we proposed in recent years at the European Test Symposium and in its affiliated IMTR workshop. We will then briefly describe some of our current and future endeavors.

### 17.A. Harvesting Wasted Clock Cycles for In-field Test

Online testing for modern microprocessors has traditionally relied on stored ATPG patterns or conventional Logic and Memory Built-In Self Test (LBIST and MBIST) methods or a Software-Based-Self-Test (SBST) that uses code snippets placed into a cache for test and verification. Most of these works involve taking the core under test offline to perform the test—restricting how often tests can be applied and making test application more intrusive. To avoid taking a processor offline, instruction level techniques that detect hazards and insert test instructions instead of NOPs in the code at compile time [77] have also been proposed. However, this approach requires modifying the instruction set.

To avoid modifying the instruction set, we developed an approach for the dynamic insertion of tests into an already-running program so that stall cycles that would otherwise be wasted may be used for field testing. We demonstrated that stall cycles may be replaced with functionality that can be used to check at least some of a core’s circuitry without changing the state of the machine, interfering with the execution of the program, or requiring modification of the Instruction Set Architecture (ISA) [78].

Figure 17.1 shows additional circuitry inserted to test the ALU (and possibly other logic in the EX pipeline stage) of a RISC-V machine in [78]. The additional circuitry consisted of a small memory added to the decode stage to hold test patterns, muxes to direct test patterns into the pipeline register when a stall is detected, and a comparator and memory in the MEM stage to determine if the test was passed. By placing the additional circuitry in this way, unacceptable delays from the ID/EX pipeline register through the ALU to the EX/MEM pipeline register can be detected more reliably.



Fig. 17.1. Additional hardware (shaded) added to facilitate ALU test. All Test control signals are denoted in red. Only portions of the pipeline directly affected by our approach are shown. [78]

To better characterize the ability to detect delays, we explored the application of our approach to transition faults. In particular, we simulated five C programs on the RISC-V core to identify when stall cycles occurred and the number of stall cycles inserted each time. Although transition faults can be detected when only a single stall cycle appears, determining *a priori* the transition coverage that can be obtained is difficult because it depends on the instruction before the stall. To obtain more definite transition coverage numbers, we need to consider cases where two consecutive stall cycles allow us to inject a preconditioning pattern and then an observation pattern. However, a question arises regarding what to do when only a single stall cycle is present with no consecutive stall. Should the next pattern pair be applied the next time a stall is encountered (Case 1), or should we continue to reapply the first pattern of a pattern pair on subsequent stalls until both patterns of the pair can be applied consecutively (Case 2)?
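The difference between the two policies can be made concrete with a small simulation. The sketch below is a simplified model under stated assumptions, not the implementation from [78]: `stall_runs` is a hypothetical list of consecutive-stall run lengths, a pattern pair counts as applied only when both of its patterns fit into two consecutive stall cycles, and pairs are applied in round-robin order.

```python
def schedule_pairs(stall_runs, num_pairs, policy):
    """Return the indices of the pattern pairs that were fully applied, in order.

    stall_runs: lengths of runs of consecutive stall cycles (hypothetical trace).
    num_pairs:  number of (preconditioning, observation) pattern pairs.
    policy:     "case1" skips to the next pair after an isolated stall,
                "case2" keeps retrying the same pair until two stalls are adjacent.
    """
    if policy not in ("case1", "case2"):
        raise ValueError("policy must be 'case1' or 'case2'")
    applied = []
    idx = 0  # index of the pair whose first pattern is applied next
    for run in stall_runs:
        remaining = run
        while remaining >= 2:
            # Two adjacent stall cycles: both patterns of the pair fit.
            applied.append(idx % num_pairs)
            idx += 1
            remaining -= 2
        if remaining == 1 and policy == "case1":
            # Isolated stall: the first pattern is applied, the pair is abandoned.
            idx += 1
        # Under "case2" idx is unchanged, so the same pair is retried later.
    return applied


runs = [1, 2, 1, 2]
print(schedule_pairs(runs, 3, "case1"))  # [1, 0]
print(schedule_pairs(runs, 3, "case2"))  # [0, 1]
```

In this toy trace, Case 1 applies pair 1 before pair 0, while Case 2 applies the pairs strictly in order; which policy detects a given fault sooner depends on the stall distribution of the program.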

For our data collection, we focused on detecting the *functional* faults, in other words, those faults that were identified as being more likely to cause an error during program execution. Different programs correspond to slightly different “functional fault lists.” Nevertheless, even when the C programs used to identify the functional faults for test pattern generation did not match the C programs considered to be “run in the field,” almost 99% of all field-based functional stuck-at faults and over 93.4% of all field-based transition faults were detected. We also found that the median time taken to detect a fault (the median of the median number of clock cycles until detection) is shorter for Case 2 than for Case 1. To measure the area and performance overhead of the added circuitry, we used the OpenLane EDA toolset to synthesize the Verilog RTL description of the RISC-V core (without the memory blocks) and map it to standard cells for ASIC implementation using the SkyWater SKY130 PDK. The die area increased by 5.4%, with a minimal impact of 1.2% on the original clock frequency. If the memories or caches that would normally accompany the processor were included in the base die area, then the overhead percentage would be even less.

<sup>1</sup>This work was supported by NSF CCF1812777 and CCF1814928

<sup>2</sup>The following students contributed to this work: Eslam Yassien, Yongjia Xu, Hui Jiang, Thach Nguyen and Alexander Coyle.

### 17.B. Dual Use Circuitry for Early Failure Warning and Test

Incorrect circuit timing often leads to errors in the field, including silent data corruption. Once a circuit violates timing, recovery can be difficult even when the violation is detected. Canary flip-flops have previously been proposed by other researchers to identify when the desired slack of a path has first been violated in functional mode even before failure occurs. MISRs (multiple input signature registers) have been shown to provide significant benefits during test—including potentially during scan-based field test—by allowing defects to be detected during scan shift [79, 80].

Both the canary flip-flop approach and the scan shift MISR require the use of additional flip-flops connected to the same circuit signals as the functional flip-flops (possibly through some additional gates). However, while the canary flip-flops may achieve their most important use in functional mode, the MISR is used primarily in test mode. By creating a single structure that performs both functions, we can amortize the cost of the additional flip-flops over the entire lifetime of the circuit—including functional mode and test. In [81], we showed how this might be achieved and showed the added benefits of the combined structure. One possible implementation of such a structure is shown in Figure 17.2.



Fig. 17.2. Dual Use Structure in which Flip-Flops are shared between Canary FF use in Functional Mode and MISR during Scan Shift [81]

The figure depicts a Mux-D Scan Flip-Flop, where the multiplexer's select line connects to the *Scan Enable* signal. Below it is a dual-use flip-flop that functions as a Canary FF when *Test Mode* is 0 and as a MISR FF when it is 1. The Canary FF circuitry includes the dual-use FF, a delay element, and an XOR gate to compare the values captured by the two flip-flops. The MISR circuitry consists of the dual-use flip-flop and an XOR gate feeding into its D input, with additional XORs for feedback. An AND gate determines the operational mode: it enables MISR functionality when *Test Mode* is 1 and adds delay to the Canary path when *Test Mode* is 0. The dual-use flip-flop minimizes delay by eliminating the multiplexer, maintaining a small slack difference between the canary and functional paths.
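The two roles of the shared flip-flops can be illustrated with a behavioral model. The sketch below is only an abstraction of the general MISR and canary ideas, not the gate-level structure of Figure 17.2: the feedback taps, the register width, and the picosecond timing values are all hypothetical, and setup time is ignored.

```python
def misr_step(state, inputs, taps):
    """Advance an n-bit MISR by one clock.

    state, inputs: lists of 0/1 of equal length.
    taps: indices of the state bits XORed together to form the feedback bit.
    """
    feedback = 0
    for t in taps:
        feedback ^= state[t]
    nxt = [feedback ^ inputs[0]]
    for i in range(1, len(state)):
        # Each stage takes the previous stage's value XORed with its own input.
        nxt.append(state[i - 1] ^ inputs[i])
    return nxt


def misr_signature(responses, taps, width):
    """Compact a sequence of response vectors into a single signature."""
    state = [0] * width
    for vec in responses:
        state = misr_step(state, vec, taps)
    return state


def canary_warning(arrival_ps, period_ps, canary_delay_ps):
    """True when the functional FF still meets timing but the delayed canary copy does not."""
    functional_ok = arrival_ps <= period_ps
    canary_ok = arrival_ps + canary_delay_ps <= period_ps
    return functional_ok and not canary_ok


good = [[1, 0, 1, 1], [0, 1, 1, 0], [1, 1, 0, 0]]
bad = [row[:] for row in good]
bad[1][2] ^= 1  # a single-bit error observed during one shift cycle
print(misr_signature(good, [3, 2], 4) != misr_signature(bad, [3, 2], 4))  # True
print(canary_warning(900, 1000, 150))  # True
```

In test mode the signature comparison exposes the corrupted shift cycle, while in functional mode the canary check raises a warning as soon as the remaining slack falls below the inserted delay, before the functional flip-flop itself fails.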

While both canary flip-flops and the MISR used to capture data during scan shift are useful, they are unacceptably costly if every functional flip-flop is paired with a dual-use flip-flop. Our analysis studied the effects of different approaches to flip-flop selection, analyzing the relative benefit of privileging slack length, fault detection, or a balance between the two. We explored three approaches for selecting flip-flops: (a) MISR FF selection using stuck-at fault cones, (b) Canary FF selection based on critical paths, and (c) a greedy Dual Purpose FF selection based on both fault cones and critical paths.

We found the Greedy Slack algorithm to be an effective compromise, capturing flip-flops at the ends of critical paths that also deliver decent additional fault detection for stuck-at faults. We chose as a maximum overhead approximately 8% of a circuit's total flip-flops, in keeping with previous research by other authors, although the specific overhead tolerance will vary based on the specific circuit and its application. Regardless of the algorithm chosen, our dual-use circuitry combining the MISR and the canary structure has a compelling benefit for the detection of both timing failures and stuck-at faults: even selecting only 8% of the flip-flops while solely considering the longest paths (i.e., Algorithm B) detects 22-88% of the low-detected stuck-at faults additional times during scan shift.
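To show how slack and fault-cone information might be combined under a flip-flop budget, the following sketch implements one possible greedy heuristic. It is not the Greedy Slack algorithm of [81], whose details are not given here: the candidate pool size (`pool_factor`), the scoring rule, and the input format are all assumptions made for illustration.

```python
def greedy_dual_use_selection(flip_flops, budget_percent=8, pool_factor=2):
    """Pick flip-flops to pair with dual-use FFs under a budget.

    flip_flops: dict mapping FF name -> (slack_ps, set of low-detected faults
                in the FF's fault cone). Hypothetical input format.
    Returns (chosen FF names, set of faults covered by the chosen FFs).
    """
    budget = max(1, len(flip_flops) * budget_percent // 100)
    # Restrict candidates to the FFs with the least slack (ends of critical paths).
    by_slack = sorted(flip_flops, key=lambda ff: flip_flops[ff][0])
    pool = by_slack[: budget * pool_factor]
    chosen, covered = [], set()
    while len(chosen) < budget and pool:
        # Greedy step: most newly covered faults first, least slack as tie-break.
        best = max(
            pool,
            key=lambda ff: (len(flip_flops[ff][1] - covered), -flip_flops[ff][0]),
        )
        chosen.append(best)
        covered |= flip_flops[best][1]
        pool.remove(best)
    return chosen, covered


# Toy design with 25 flip-flops; only the four most critical ones see faults.
design = {f"ff{i}": (100 + 10 * i, set()) for i in range(25)}
design["ff0"] = (100, {"f1"})
design["ff1"] = (110, {"f1", "f2", "f3"})
design["ff2"] = (120, {"f4"})
design["ff3"] = (130, {"f2", "f5", "f6", "f7"})
chosen, covered = greedy_dual_use_selection(design)
print(chosen)  # ['ff3', 'ff1']
```

With an 8% budget on 25 flip-flops, two are selected; both lie among the four most critical flip-flops, and together they cover six of the seven listed faults.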

### 17.C. Perspectives and Outlook

In our past work on various efficient approaches, we have focused our efforts on identifying times when tests could be applied and/or defects could be detected by harnessing resources that were otherwise going unused. We are currently working on additional optimizations to detect other types of defects using such an approach.

We are also working to build on some of our previous work in circuit monitors to develop new approaches for estimating fault coverage of functional sequences, directing test generation, and aiding in debug and diagnosis to help address issues that may lead to future silent data corruption. Some of these approaches will also have implications and benefits for hardware security.

Finally, advanced packaging techniques hold new challenges for the test, diagnosis, and repair of chiplet-based systems. Our future work will explore aspects of these developing needs as well.

## 18. HARDWARE DEPENDABILITY RESEARCH AT TU DELFT

*Mottaqiallah Taouil, Moritz Fieback, Anteneh Gebregiorgis, Rajendra Bishnoi, Said Hamdioui.*

*Delft University of Technology, The Netherlands*

### 18.A. Historical Perspective

Figure 17.3 shows the evolution of hardware dependability research at TU Delft. Such research was started in 1988 by prof. Van de Goor. His research initially focused on ATPG optimization and logic testing (both structural and functional), but quickly shifted mainly towards memory testing. He made substantial contributions to the field of SRAM and DRAM testing. These contributions include defect modeling, fault modeling, test generation, Design-for-Testability, and Built-In Self-Test. This has resulted in an impressive number of published articles, as well as the publication of the golden book on memory testing that is still relevant to date [82]. With the retirement of prof. Van de Goor around 2004, research on semiconductor testing in Delft slowed down and nearly came to a halt.

When prof. Hamdioui joined Delft University around 2008, he re-established hardware dependability topics in the Computer Engineering Lab and revitalized the university's research in Computer Engineering and testing. The Lab focuses on the invention, the design, the prototyping and the demonstration of disruptive computing accelerators/engines by making use of advanced CMOS technologies as well as of unique features of emerging technologies. The research targets a wide range of energy-constrained edge applications, including AI, such as personalized healthcare and smart environments. The research adopts a holistic approach in which it addresses the whole computing engine design stack (i.e., device technology, circuit design, architectures, compilers, algorithms and applications). This is done not only in order to maximize the computational efficiency, but also in order to further push the research quality. The main focus is on the middle layers (circuit design, architectures, and tools) with their dependability aspects.

Initially, the research on hardware dependability covered four key directions: memory testing, safety, memory reliability, and the testing of 3D stacked ICs. One of the most notable outcomes was the development of 3D-COSTAR, a tool used to evaluate test flows for 3D stacked ICs. In recognition of its impact, 3D-COSTAR was awarded the HiPEAC Technology Transfer Award in 2015. As prof. Hamdioui's group grew, he expanded the scope of the research to include topics such as device-aware test, testing and reliability of AI (Artificial Intelligence) accelerators based on emerging computing paradigms and device technologies, and hardware security.

Collaboration with both industry and academic partners has been a core philosophy of the lab. Notable collaborators include IMEC (Belgium), NXP (Netherlands), Cadence (Germany/USA), Intel (USA), STMicroelectronics (France), Intrinsic ID (Netherlands), Politecnico di Torino (Italy), Karlsruhe Institute of Technology (Germany), Aix-Marseille University (France), Tallinn University of Technology (Estonia), and RWTH Aachen University (Germany), among others.

### 18.B. Current Research

Current research on hardware dependability at TU Delft focuses on three aspects: Test, Reliability, and Security.

1) *Testing*: The ongoing test activities focus on testing emerging memory technologies as well as AI Accelerators.

*Memory Test*: We have recently demonstrated, based on solid theory and silicon measurements, that existing commercial test approaches are inaccurate for emerging memory technologies such as RRAMs, STT-MRAMs and FeFETs. Relying on such solutions will result in defective memory chips being sold to customers. We therefore developed and patented a new approach referred to as "Device-Aware Test (DAT)" [83]. DAT goes beyond Cell-Aware Test; it does not assume that a defect in a device can be modeled electrically as a linear resistor (as the state-of-the-art approach suggests), but rather incorporates the impact of the physical defect into the technology parameters of the device and thereafter into its electrical parameters. Once the defective electrical model is defined, a systematic fault analysis is performed to derive appropriate fault models and subsequently test solutions. DAT has been applied to different emerging memory technologies such as STT-MRAMs, RRAMs and FeFETs, and has been successfully demonstrated on real silicon devices of IMEC (Belgium) and STMicroelectronics (France). The results were published in leading industrial conferences, won many awards, and attracted the attention of the industry.

*Testing of AI Accelerators*: We have also been working on testing computation-in-memory (CIM) architectures for AI applications, using both traditional technologies such as SRAMs and emerging memory devices such as RRAM and FeFET. In our research we have demonstrated that existing test approaches based on standard memory tests are insufficient to obtain a high fault coverage in these architectures [84]. This is because CIM circuits operate in both memory and computing modes, and make use of different hardware in each of these. Moreover, advanced traditional memories and emerging memory devices give rise to unique defects that need special attention (e.g., appropriate modeling). As such, we are working on the development of novel test methods that address these architectural changes in a structured manner, while considering the new defects, to achieve the required fault coverage. As AI circuits rarely need to deliver 100% accuracy, one can argue that functional test may do the job for some less critical applications. This trade-off between functional and structural test is also a key question we are addressing.

2) *Reliability*: Hardware reliability research includes the analysis of reliability failure mechanisms and their impact on chip lifetime, modeling of reliability failure mechanisms, design-for-reliability, and developing mitigation schemes. The hardware reliability research at Computer Engineering Lab covers the reliability issues of advanced CMOS-based designs as well as emerging technologies such as memristors and their impact on computational accuracy.

*Reliability of advanced CMOS devices*: Deep-scaled CMOS, in the nanometer regime, suffers from lifetime reliability and design productivity challenges. More importantly, technology scaling accelerates aging-induced failures [85]. To address this issue, TU Delft has been carrying out various research activities in collaboration with IMEC, such as aging impact analysis, aging modeling, and mitigation techniques to counteract the impact of aging on logic as well as memory components.

Fig. 17.3. Evolution of Hardware Dependability Research at TU Delft

*Reliability of emerging devices:* The reliability of emerging memory devices, as well as of Computation-In-Memory (CIM) architectures based on such devices, has also been investigated. The inherent non-idealities of such devices pose major challenges and may cause functional errors and reduce the computing accuracy during deployment. Such non-idealities consist of time-zero ones (e.g., variation and wire parasitics) and time-dependent ones (e.g., endurance, device degradation, and resistance drift). The Computer Engineering Lab has been addressing these issues at different levels. At the device level, we explore the use of new memristor device structures and material compositions for better characteristics. At the circuit level, we develop innovative circuit designs (e.g., different cell structures) that can provide correct functionality in the presence of non-idealities. At the architecture level, we aim at mitigating the impact of non-idealities by changing the way in which an application is mapped to hardware.
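The effect of such non-idealities on a CIM computation can be illustrated with a toy crossbar model. The sketch below is a simplification under stated assumptions, not a model used by the lab: time-zero variation is represented as a Gaussian multiplicative error on each conductance (`sigma`), and time-dependent drift as a uniform fractional loss of conductance (`drift`); wire parasitics and endurance effects are ignored.

```python
import random


def crossbar_mvm(conductances, voltages):
    """Ideal analog matrix-vector product: column current j = sum_i G[i][j] * V[i]."""
    cols = len(conductances[0])
    return [
        sum(conductances[i][j] * voltages[i] for i in range(len(voltages)))
        for j in range(cols)
    ]


def apply_nonidealities(conductances, sigma=0.0, drift=0.0, seed=0):
    """Return a perturbed copy of the conductance matrix.

    sigma: relative standard deviation of time-zero device variation (assumed Gaussian).
    drift: fractional conductance loss over time, applied uniformly (assumed model).
    """
    rng = random.Random(seed)
    return [
        [g * (1.0 + rng.gauss(0.0, sigma)) * (1.0 - drift) for g in row]
        for row in conductances
    ]


G = [[1.0, 2.0], [3.0, 4.0]]  # programmed weights as conductances (arbitrary units)
V = [0.5, 1.0]                # input voltages
ideal = crossbar_mvm(G, V)
drifted = crossbar_mvm(apply_nonidealities(G, drift=0.5), V)
print(ideal)    # [3.5, 5.0]
print(drifted)  # [1.75, 2.5]
```

Comparing the ideal and perturbed outputs gives a first-order view of how much computational error a given level of variation or drift introduces, which is the quantity that the circuit-level and mapping-level mitigations aim to reduce.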

3) *Hardware Security:* Our research on hardware security focuses on the design and implementation of hardware countermeasures during the design stage in order to prevent attacks in the field. Part of our research aims to protect weights in computation-in-memory architectures for AI applications by making them resilient against side-channel attacks, while minimizing the impact on area. With respect to side channels, we also focus on the resiliency of conventional symmetric and asymmetric cryptographic implementations. Several articles have been published on this topic and show that schemes with extremely low area overhead are possible. Another part of the research focuses on root of trust and authentication using Physical Unclonable Functions (PUFs) [86]. Here, we focus mostly on the reliability of advanced SRAM PUFs and propose, for example, solutions against bias patterns in the PUF responses. Our latest research in this domain focuses on low-cost Trusted Execution Environments based on RISC-V's memory protection unit.

### 18.C. Outlook

The hardware dependability research at the Computer Engineering Lab will continue to address advanced technology nodes and emerging memory technologies, which are critical for the future of computing. As the industry moves toward smaller nodes and more complex devices, testing strategies must evolve to meet the unique challenges posed by technologies like FeFET, SOT, STT, and RRAM. These emerging memory technologies, coupled with new computing paradigms such as computation-in-memory and near-memory computing architectures, demand a deep understanding of their specific failure mechanisms. This knowledge is crucial not only for developing appropriate testing approaches that ensure both quality and reliability at scale, but also for creating effective design-for-test and design-for-reliability solutions. Additionally, the increasing concerns around hardware security will require focused efforts to design countermeasures against emerging attacks targeting these advanced systems, ensuring architectures remain secure without sacrificing performance. This interdisciplinary research will not only advance the field of hardware dependability but also lay the foundation for secure, reliable, and efficient future computing systems in an ever-evolving technological landscape.

## 19. MACHINE LEARNING ASSISTED TESTING AND POST-MANUFACTURE TUNING OF MIXED-SIGNAL ELECTRONICS: FROM AMPLIFIERS TO DNNs<sup>1 2 3</sup>

A. Chatterjee, A. Saha, S. Komarraju, K. Ma, C. Amarnath.  
Georgia Institute of Technology

Systems-on-Chips (SoCs) manufactured with advanced technologies have become increasingly vulnerable to uncertainties stemming from the effects of manufacturing process variability on novel device materials and designs, and need to be tested with high fault coverage. Conventional test strategies for analog/mixed-signal (AMS) circuits are driven by specification-based testing techniques, where the Device-Under-Test (DUT) specifications are measured and compared against pre-defined specification bounds to make pass/fail decisions. This requires a different test set-up for each specification, resulting in increased test time and test cost. Moreover, complex test instrumentation is required for specification measurement: instruments such as oscilloscopes and spectrum analyzers are difficult to incorporate on-chip for the purpose of built-in (self) test. Defect-based testing techniques, on the other hand, suffer from the lack of adequate fault models for complex emerging failure mechanisms in advanced deeply scaled technologies and from the plethora of defects that need to be simulated to validate the failure coverage of the relevant test stimuli applied.

In addition to the problem of testing manufactured AMS DUTs, tested devices need to be *tuned post-manufacture, for yield recovery* from manufacturing variations due to the use of advanced silicon technologies. Such tuning, however, is difficult when the testing procedure itself is hamstrung by the need to measure complex device specifications using iterative test-tune-test algorithms that incur significant test time and tuning costs. To alleviate the above problems, test techniques for AMS circuits and systems are needed that: (a) reduce dependence on traditional specification-based testing methods to the maximum extent possible, while implicitly guaranteeing the same levels of failure coverage and minimizing test time and test instrumentation costs, (b) minimize the number of defect simulations that need to be performed for effective test stimulus generation, (c) allow efficient tuning of AMS systems with minimum post-processing of DUT response signals, and (d) allow built-in (self) test (BiST) and tuning of AMS systems with *minimal on-chip test resources and external tester support*.

### 19.A. Historical perspective

To address the issues outlined above, one approach is to investigate how *machine learning concepts* can be applied to the problem of testing and tuning of AMS circuits and systems. To this end, the concept of alternate testing of AMS systems was first developed in [87]. This makes use of a *single test configuration* for an AMS DUT and applies a carefully optimized test stimulus to the DUT that *maximizes the statistical correlation* between the observed response of the DUT to the stimulus and its specification values of interest (such as gain, linearity, noise, etc.) under *specified statistics of process variations and defects*. Under such high correlation, a trained regressor, for example one based on multivariate adaptive regression splines, can be used to predict the DUT specification values directly from the observed response.

<sup>1</sup>This research was supported by the US National Science Foundation, Semiconductor Research Corporation, Intel, Texas Instruments and MARCO-DARPA

<sup>2</sup>Other authors: P. Variyam, R. Voorakaranam, S. Devarakond, V. Natarajan, X. Wang

<sup>3</sup>S. Komarraju, K. Ma and C. Amarnath are currently affiliated with Intel Corporation, Rebellions AI, and Google respectively.

To train the regressor, the specifications of an initial *training set* of DUTs selected from diverse process corners are measured. The same set of DUTs is also subjected to the optimized test stimulus discussed above. From these two datasets: (a) specification values of DUTs and (b) DUT responses to the applied test stimulus (called the *response signature*), a regressor is trained to predict the former from the latter. During testing, the DUT is tested with the optimized stimulus and its specification values are predicted using the trained regressor. Subsequently, pass/fail classification of the DUT is performed using test acceptance bounds on the predicted DUT specification values which are calibrated to account for measurement noise and regressor prediction accuracy.
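The train-then-predict flow described above can be sketched in a few lines. This is a minimal illustration under simplifying assumptions: the response signature is reduced to a single scalar feature, and ordinary least-squares line fitting is swapped in for the multivariate adaptive regression splines mentioned in the text. The function names and the guard-band parameter are hypothetical.

```python
def fit_line(signatures, specs):
    """Least-squares fit of spec = slope * signature + intercept on the training set."""
    n = len(signatures)
    mean_x = sum(signatures) / n
    mean_y = sum(specs) / n
    sxx = sum((x - mean_x) ** 2 for x in signatures)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(signatures, specs))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict_spec(model, signature):
    """Predict a specification value from an observed response signature."""
    slope, intercept = model
    return slope * signature + intercept


def passes(predicted, lower, upper, guard_band):
    """Pass/fail on the predicted spec, with acceptance bounds tightened by a
    guard band that accounts for measurement noise and prediction error."""
    return lower + guard_band <= predicted <= upper - guard_band


# Training DUTs: measured signature feature and measured specification (toy values).
model = fit_line([1.0, 2.0, 3.0, 4.0], [12.0, 14.0, 16.0, 18.0])
print(predict_spec(model, 2.5))               # 15.0
print(passes(15.0, 12.0, 18.0, 0.5))          # True
print(passes(17.8, 12.0, 18.0, 0.5))          # False
```

The last call shows the role of the calibrated acceptance bounds: a predicted value of 17.8 lies inside the nominal specification limits but is rejected because it falls within the guard band.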

However, there is a caveat: the regressor is unable to perform this prediction when a DUT corresponds to a process corner which is not represented by the process variability statistics captured by the DUTs in the training set of devices used to train the regressor. This requires the use of an *outlier detector* which detects *outlier AMS DUTs* which are passed on to the package testing process in a "real-time" manufacturing test environment. These devices are added to the training set of DUTs and the regressor is retrained on the updated training set. This process continues until the training set of DUTs encompasses the majority of realistic process variations seen in manufacturing. Eventually outlier devices are minimized, or asymptotically approach zero, at which point very few or no specification based tests need to be conducted on manufactured DUTs.
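The outlier-handling loop can be sketched as follows. The z-score test on a scalar signature is a deliberately simple stand-in for whatever outlier detector is used in practice, and `measure_spec` is a hypothetical callback representing a full specification test of the device; retraining the regressor on the enlarged training set is left to the caller.

```python
import statistics


def is_outlier(signature, training_signatures, z_limit=3.0):
    """Flag a DUT whose signature lies far outside the training distribution."""
    mean = statistics.mean(training_signatures)
    spread = statistics.pstdev(training_signatures)
    if spread == 0:
        return signature != mean
    return abs(signature - mean) / spread > z_limit


def process_dut(signature, training_set, measure_spec, z_limit=3.0):
    """Route one DUT: either predict from its signature or fall back to spec test.

    training_set: list of (signature, measured_spec) pairs, extended in place
                  when an outlier is found so the regressor can be retrained.
    """
    signatures = [s for s, _ in training_set]
    if is_outlier(signature, signatures, z_limit):
        spec = measure_spec(signature)       # full specification-based test
        training_set.append((signature, spec))
        return "spec-tested", spec
    return "predict", None


training = [(1.0, 12.0), (2.0, 14.0), (3.0, 16.0)]
toy_spec_test = lambda s: 2.0 * s + 10.0     # stand-in for a real measurement
print(process_dut(2.5, training, toy_spec_test))   # ('predict', None)
print(process_dut(10.0, training, toy_spec_test))  # ('spec-tested', 30.0)
print(len(training))                                # 4
```

As more outliers are absorbed into the training set, fewer devices trigger the fallback path, which mirrors the convergence behavior described in the paragraph above.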



Fig. 19.1. Machine learning assisted test/tuning flow.

Post-manufacture tuning of DUTs is also performed using machine learning assistance. A regressor can be trained to predict the *optimal tuning knob values* of a DUT (such as programmable bias currents and voltages) from the DUT test response signature (called *one-shot tuning*). Alternatively, gradient descent algorithms running on a chip-based processor or dedicated tuning engine can be used to iteratively tune the DUT using a cost metric based on predicted DUT specification values from a trained regressor as described earlier. This may use the tuning knob values predicted from one-shot tuning above as a starting point for iterative tuning.
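The iterative variant can be sketched as a plain gradient-descent loop over a single tuning knob. This is a minimal illustration, not the tuning engine of the cited work: `predicted_spec` is a hypothetical callable that stands in for applying the stimulus at a given knob setting and running the trained regressor on the resulting signature, the cost is the squared deviation from a target specification, and the gradient is estimated by central finite differences.

```python
def tune_knob(start, predicted_spec, target, learning_rate=0.1, steps=50, eps=1e-3):
    """Iteratively adjust one tuning knob to bring the predicted spec to target.

    start: initial knob value, e.g. the value suggested by one-shot tuning.
    predicted_spec: callable knob -> predicted specification value (assumed).
    """
    def cost(k):
        return (predicted_spec(k) - target) ** 2

    knob = start
    for _ in range(steps):
        # Central finite-difference estimate of d(cost)/d(knob).
        grad = (cost(knob + eps) - cost(knob - eps)) / (2.0 * eps)
        knob -= learning_rate * grad
    return knob


# Toy DUT model: the predicted spec depends linearly on the knob setting.
toy_dut = lambda k: 2.0 * k + 1.0
print(round(tune_knob(0.0, toy_dut, target=5.0), 6))  # 2.0
```

Starting the loop from a one-shot prediction rather than from an arbitrary value reduces the number of iterations, and therefore the number of test applications, needed to converge.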

### 19.B. State of play and assets in related research

Figure 19.2 (left) shows a DC-DC hysteretic power converter in which the feedback signal of the power converter is stimulated with an optimized external stimulus  $V_{stim}$ . The response of the converter (test response signature) is passed to a trained regressor which predicts the load and line regulation specifications of the power converter [88]. Figure 19.2 (right) shows the predicted load regulation (y-axis) vs. the measured load regulation (x-axis) of the converter. Each dot on the plot represents a different power converter DUT. It is seen that machine learning assisted post-manufacture tuning can result in yield improvements as large as 40%. The broad approach of machine learning assisted alternative testing and tuning of AMS circuits and systems has been validated on a range of AMS/RF circuits: amplifiers, mixers, voltage regulators, power converters, data converters, RF and MIMO transceiver systems and SerDES line drivers [89].



Fig. 19.2. Machine learning assisted power converter testing.

The above machine learning assisted test/tuning methodology has also been applied to *deep neural networks* (DNNs) implemented with analog RRAM crossbar arrays. Performance metrics for DNNs include classification accuracy, recall, precision and F1 score, among others. These arrays employ memristive devices that are vulnerable to manufacturing process variations that impact DNN classification accuracy. For example, specification-based testing of RRAM based DNNs implies application of 10,000 images from the CIFAR test image dataset to determine classification accuracy. This is expensive and time-consuming. So the goal is to apply as few test images as possible and predict the DNN (DUT) classification accuracy from a test response signature consisting of the concatenated responses of the final dense layer of the DNN and the average values of neurons in each hidden layer of the DNN to each test image applied. This DNN is tuned post-manufacture by adjusting the slopes and biases of the neuron activations on a per-layer basis using a tuning (optimization) algorithm ($2N$ tuning parameters for an $N$-layer DNN). Figure 19.3 (a) shows the actual (measured, y-axis) vs. the predicted (x-axis) classification accuracy of an RRAM based DNN (MobileNet) on the CIFAR-10 dataset using only 10 selected images from the dataset, while Figure 19.3 (b) shows yield vs. accuracy threshold (the minimum accuracy above which a DNN is classified as "good") for the same DNN, with a yield of 38% for an accuracy threshold of 77%. With tuning, this yield is 96% (a yield improvement of 58 percentage points) [90]. A key aspect is the ability to rapidly *test and tune DNN hardware*, including spiking networks [91], in seconds as opposed to tens of minutes or hours as per the state of the art.
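Two of the quantities used above, the per-image response signature and the yield at a given accuracy threshold, are straightforward to express in code. The sketch below follows the description in the text; the function names and all numeric values are made up for illustration and are not data from [90].

```python
def response_signature(final_layer_outputs, hidden_layer_activations):
    """Signature for one test image: the final dense layer outputs concatenated
    with the mean neuron value of each hidden layer."""
    layer_means = [sum(layer) / len(layer) for layer in hidden_layer_activations]
    return list(final_layer_outputs) + layer_means


def yield_at_threshold(accuracies, threshold):
    """Fraction of manufactured DNN instances whose accuracy meets the threshold."""
    good = sum(1 for acc in accuracies if acc >= threshold)
    return good / len(accuracies)


# One test image, a 2-class output layer and two hidden layers (toy values).
print(response_signature([0.1, 0.9], [[1.0, 3.0], [2.0, 2.0, 2.0]]))
# [0.1, 0.9, 2.0, 2.0]

# Classification accuracies of four hypothetical DNN instances.
print(yield_at_threshold([0.70, 0.75, 0.80, 0.90], 0.77))  # 0.5
```

The signatures of all applied test images would then be concatenated and passed to the trained regressor, and sweeping the threshold in `yield_at_threshold` produces a yield curve of the kind plotted in Figure 19.3 (b).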



Fig. 19.3. Machine learning assisted RRAM based DNN testing and tuning.

### 19.C. Perspectives and outlook

While there is no single test and tuning technique for AMS circuits and systems that is universally better than other techniques over the wide range of practical AMS/RF applications, machine learning based techniques have advantages in that they do not require detailed modeling of device physics and circuit equations for test development. Rather, machine learning (ML) algorithms can automatically capture interdependencies between physics-based artifacts and high-level circuit and system performance. This reduces manual modeling effort. However, ML techniques need to be trained carefully, and therein lies the adage "there is no free lunch". Such training does require designer knowledge and experience and modeling of test instrumentation nonidealities. In environments where test access is difficult, such as for embedded AMS components in SoCs, and where placing test instruments such as oscilloscopes is impossible, the use of ML based methods with built-in sensors for test response observation offers unique advantages for built-in self-testing, tuning and lifelong system maintenance. A challenge still remains, however: how can machine learning systems be trained and calibrated for embedded DUTs whose inputs and outputs are not fully controllable and observable, respectively? Moreover, the effects of nonidealities in sensors inserted into the design for response observability need to be compensated by intelligent sensor de-embedding algorithms.

An alternative to regressor based specification prediction is to use machine learning based lightweight outlier detection. The idea is to design alternative tests that characterize the behaviors of "good" devices (DUTs) using response clustering algorithms in a reduced-dimension space, for example via principal component analysis of the DUT response across diverse process corners. Devices (DUTs) that are determined not to be within the response statistics of "good" devices (outliers) are classified as "bad". Very few defective devices in the vicinity of the DUT pass/fail boundaries, as determined by device specifications, need to be simulated during test stimulus generation, and perhaps none at all. Test acceptance limits are set using boundaries determined by statistical analysis methods or trained ML classifiers. For DNNs, such alternate tests need to be developed for emerging RRAM based transformers and other generative AI architectures.
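
The outlier-detection idea can be sketched as follows, under strong simplifying assumptions: only the first principal component is kept (found by power iteration), and a device is flagged when either its score along that component or its distance from it exceeds a limit learned from known-good responses. The data, the function names, and the limits `k` and `margin` are hypothetical. This is one possible instance of the idea, not the method used in the cited work.

```python
import math


def _residual(centered_row, v):
    """Distance of a centered response from the principal axis v."""
    score = sum(a * b for a, b in zip(centered_row, v))
    return math.sqrt(sum((a - score * b) ** 2 for a, b in zip(centered_row, v)))


def fit_good_model(good, margin=3.0, iters=100):
    """Fit mean, first principal direction and limits from known-good responses."""
    n, d = len(good), len(good[0])
    mu = [sum(col) / n for col in zip(*good)]
    centered = [[x - m for x, m in zip(row, mu)] for row in good]
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    v = [1.0] * d  # power iteration converges to the top eigenvector of cov
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    scores = [sum(a * b for a, b in zip(r, v)) for r in centered]
    sigma = math.sqrt(sum(s * s for s in scores) / (n - 1))
    res_limit = margin * max(_residual(r, v) for r in centered)
    return {"mu": mu, "v": v, "sigma": sigma, "res_limit": res_limit}


def is_outlier(model, response, k=3.0):
    """Flag a device whose response lies outside the good-device statistics."""
    c = [x - m for x, m in zip(response, model["mu"])]
    score = sum(a * b for a, b in zip(c, model["v"]))
    too_far_along = abs(score) > k * model["sigma"]
    too_far_off = _residual(c, model["v"]) > model["res_limit"]
    return too_far_along or too_far_off


# Hypothetical three-point response signatures of six good devices.
GOOD = [
    [1.00, 2.01, 2.99], [1.10, 2.19, 3.31], [0.90, 1.81, 2.70],
    [1.05, 2.10, 3.14], [0.95, 1.90, 2.86], [1.02, 2.05, 3.05],
]
model = fit_good_model(GOOD)
print(is_outlier(model, [1.04, 2.08, 3.11]))  # follows the good-device trend
print(is_outlier(model, [1.00, 2.60, 2.40]))  # breaks the trend
```

Note that only good-device responses are needed to fit the model, which mirrors the observation above that few or no defective devices need to be simulated.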

## 20. KIT: BEYOND CMOS AND CMOS 2.0: RELIABLE, TESTABLE AND SECURE

M. Tahoori, M. Mayahinia<sup>1,2</sup>.

Karlsruhe Institute of Technology (KIT), Germany

This section outlines the research direction pursued over the past 16 years by the Chair of Dependable Nano Computing (CDNC), part of the Department of Informatics at the Karlsruhe Institute of Technology (KIT). CDNC's research focuses on dependable computing using emerging and advanced nanoscale technologies, with particular attention to testing, reliability, and security, and explores the use of emerging devices for novel computing paradigms. We highlight our contributions to the evolving field of dependable computing, tracing our journey from past achievements to current efforts and future directions. As a European-based academic group dedicated to various facets of dependability, the CDNC maintains a strong connection with the IEEE European Test Symposium (ETS), which will be discussed in more detail below.

### 20.A. Establishment and research vision of the CDNC

CDNC was established in 2009 by Prof. Mehdi Tahoori when he joined the Karlsruhe Institute of Technology to set up a research group on “Design and Computing in the Nano Era” financed within the German Excellence Initiative. One of the main research pillars of CDNC was the investigation and mitigation of various aging effects and mechanisms across the entire design stack, from technology all the way to microarchitecture and application. Negative and Positive Bias Temperature Instability (NBTI and PBTI) as well as Hot Carrier Injection (HCI) are the major transistor aging mechanisms. Accordingly, our early contributions to aging modeling and mitigation centered on the transistors, i.e., Front End of the Line (FEoL) elements<sup>3,4</sup>. More recently, however, particularly as a result of aggressive downscaling in advanced technology nodes, the Back End of the Line (BEoL) interconnects have become subject to severe aging through a phenomenon known as electromigration<sup>5,6</sup>. Therefore, the trajectory of our long-term reliability research has shifted toward a comprehensive cross-layer investigation of aging in FEoL and BEoL elements.

Additionally, the advent of resistive non-volatile memories has opened the door to normally-off computing, paving the way for highly energy-efficient systems. Moreover, their inherent resistive properties make them well-suited for analog in-memory computing, offering a promising solution to the memory wall bottleneck and unlocking further energy savings. However, despite these advantages, emerging non-volatile memory technologies face significant reliability challenges, including sensitivity to process variations and manufacturing defects. Addressing these issues requires robust design strategies. Consequently, exploring cross-layer approaches, from the technology to the algorithm level, for building energy-efficient yet reliable systems based on non-volatile memories represents a research direction within CDNC<sup>7,8</sup>.

<sup>1</sup>The authors would like to thank the past and current MSc, PhD students, and postdocs of the CDNC.

<sup>2</sup>The authors acknowledge the funding from the German Research Foundation, the Federal Ministry of Education and Research, European Research Council (advanced grant), imec, and Siemens EDA.

<sup>3</sup>Kiamehr, Saman, et al. “Investigation of NBTI and PBTI induced aging in different LUT implementations.” International Conference on Field-Programmable Technology. IEEE, 2011.

<sup>4</sup>Amouri, Abdulazim, et al. “Aging effects in FPGAs: An experimental analysis.” 24th international conference on Field Programmable Logic and Applications (FPL). IEEE, 2014.

<sup>5</sup>Nair, Sarath Mohanachandran, et al. “Physics based modeling of bimodal electromigration failure distributions and variation analysis for VLSI interconnects.” IEEE International Reliability Physics Symposium (IRPS). IEEE, 2020.

<sup>6</sup>Nair, Sarath Mohanachandran, et al. “Workload-aware electromigration analysis in emerging spintronic memory arrays.” IEEE Transactions on Device and Materials Reliability, 2021.

While computing devices and systems are pervasive today, a large number of important domains, such as fast-moving consumer goods and personalized medicine, have still not seen the benefits of computing. Cheap and flexible printed electronics may allow us to break these barriers. However, this is highly challenging since only a small number of devices can be integrated, and those have high variability. At CDNC, we have been the first to perform cross-disciplinary research on this topic, making breakthroughs by developing printed computing architectures for cheap yet reliable and accurate electronics<sup>9,10</sup>.

Our work on cross-layer resiliency has been extended to hardware security by uncovering *electrical-level attacks*, fundamental vulnerabilities that exploit the interconnected nature of hardware systems, bypassing logic-level and algorithmic isolation schemes. We demonstrated these vulnerabilities in Field Programmable Gate Arrays (FPGAs), widely used in Machine Learning and Deep Learning across cloud and edge computing. We were the first to showcase *remote side-channel* and *fault attacks* on cloud FPGAs, including cryptographic key recovery, without physical access, posing threats to millions of cloud users<sup>11,12</sup>.

*1) Collaborations and contributions:* The CDNC has built upon strong national and international collaborations with renowned academic and industrial partners, including Darmstadt University, Kaiserslautern University, University of Erlangen-Nuremberg, Technical University of Munich, and Dresden. On the international level, CDNC has an ongoing collaboration with the National University of Athens in Greece, the Grenoble Institute of Engineering in France, as well as several universities in the United States, including Duke University, Arizona State University, University of Illinois Urbana-Champaign, and University of California, Riverside. Besides the academic groups, CDNC has been collaborating with industry partners such as Siemens EDA, Infineon Technologies, Fraunhofer, imec, and Synopsys. Additionally, since its foundation, the CDNC has actively contributed to major testing conferences by presenting research and organizing special sessions and workshops. These include the International Test Conference (ITC), the VLSI Test Symposium (VTS), and ETS. In particular, CDNC organized the 24th ETS in Baden-Baden in 2019.

<sup>7</sup>Bishnoi, Rajendra, et al. “Read disturb fault detection in STT-MRAM.” In International Test Conference, IEEE, 2014.

<sup>8</sup>Münch, Christopher, et al. “MBIST-based Trim-Search Test Time Reduction for STT-MRAM.” IEEE 40th VLSI Test Symposium (VTS), IEEE, 2022.

<sup>9</sup>Hefenbrock, Michael, et al. “In-situ tuning of printed neural networks for variation tolerance.” Design, Automation & Test in Europe Conference & Exhibition (DATE), IEEE, 2022.

<sup>10</sup>Weller, Dennis D., et al. “Realization and training of an inverter-based printed neuromorphic computing system.” Scientific Reports, 2021.

<sup>11</sup>Schellenberg, Falk, et al. “An inside job: Remote power analysis attacks on FPGAs.” IEEE Design & Test, 2021.

<sup>12</sup>Krautter, Jonas, et al. “FPGAhammer: Remote voltage fault attacks on shared FPGAs, suitable for DFA on AES.” IACR Transactions on Cryptographic Hardware and Embedded Systems, 2018.

### 20.B. Today's research direction of the CDNC

The increasing demand for highly efficient Artificial Intelligence (AI) continues to drive and shape CDNC's research focus. In response, CDNC is advancing a cross-layer system stack design to develop energy-efficient, high-performance, reliable, and secure AI hardware accelerators. This effort spans innovations in Hyperdimensional Computing (HDC) and neural networks, implemented both on FPGAs and using in-memory computing fabrics. Also, realizing the full potential of flexible electronics calls for comprehensive hardware–software co-design, marking another key research direction within CDNC.

To further elaborate, our work presented in [92] targets compressed test pattern generation for deep neural networks, which can be applied to any hardware platform, achieving a fault coverage of up to 99.99% and a compression rate of 307.2x. Turning to specific hardware platforms, in [93] we investigate the challenge of out-of-distribution detection for quantized neural networks implemented on an FPGA, targeting safety-critical systems. Compared to prior methods, with a negligible overhead, our approach in [93] improves detection accuracy by up to 25% on average and operates effectively with binary neural networks.

Besides reconfigurable fabrics, in [94] we utilize the crossbar structure of emerging non-volatile memories and use Error-Correcting Codes (ECC) to protect the neural network parameters. The proposed approach integrates ECC parity bits directly into the neural network parameters during training. As a result, we show zero memory overhead for up to 4-bit error correction per 64-bit data, while ECC encoding has a negligible impact on baseline accuracy (1% for CIFAR-10 and 2% for CIFAR-100 datasets).

In addition, a hardware-software co-design approach to achieve an energy-efficient and reliable HDC acceleration using resistive non-volatile Content-Addressable Memory (CAM) structure is the main focus of our work, presented in [95]. To achieve a compromise between energy efficiency and process-variation susceptibility of CAM-based analog design, we conduct a holistic investigation encompassing technology, circuit, architecture, and algorithmic levels. This way, the energy consumption compared to the state of the art is improved by 8x, with negligible accuracy loss of 0.1%.

However, the HDC acceleration using the CAM structure introduces side-channel vulnerabilities, which form the main focus of our work in [96]. Specifically, the data-dependent power consumption in the CAM structure, utilized for storing and matching hypervectors, generates identifiable patterns that significantly expose class hypervectors to side-channel attacks. Our experiments indicate that merely 338 power traces suffice to accurately recover class hypervectors. To counteract this vulnerability, we propose a dual-rail hiding technique, which equalizes power consumption by duplicating CAM arrays to store complementary class hypervectors. Although this approach effectively hides power signatures and makes side-channel power attacks unsuccessful, it introduces considerable overhead, approximately an 87% increase in area and a 74% rise in power consumption.
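
The intuition behind dual-rail hiding can be illustrated with a deliberately crude model. Assume, as a simplification that is not taken from [96], that the power of a CAM search is proportional to the number of mismatching cells. A single array then leaks the Hamming distance between the query and the stored class hypervector, while an array pair storing the hypervector and its complement always shows the same total activity. All names and bit patterns below are hypothetical.

```python
def mismatches(query, stored):
    """Number of mismatching cells, used here as a crude proxy for search power."""
    return sum(q != s for q, s in zip(query, stored))


def single_rail_activity(query, stored):
    """Activity of one CAM array: depends on the stored data."""
    return mismatches(query, stored)


def dual_rail_activity(query, stored):
    """Activity of an array pair storing the data and its complement."""
    complement = [1 - bit for bit in stored]
    return mismatches(query, stored) + mismatches(query, complement)


# Hypothetical 8-bit class hypervectors and query (real ones are far longer).
class_a = [1, 0, 1, 1, 0, 0, 1, 0]
class_b = [0, 0, 1, 0, 1, 0, 1, 1]
query = [1, 0, 1, 1, 0, 0, 1, 1]

print(single_rail_activity(query, class_a), single_rail_activity(query, class_b))
print(dual_rail_activity(query, class_a), dual_rail_activity(query, class_b))
```

In this model the dual-rail activity always equals the hypervector length, because every cell mismatches in exactly one of the two arrays. The duplicated array is also the reason for the area and power overhead reported above.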

In flexible electronics, with a focus on hardware-software co-design, we propose a binary search Analog-to-Digital Converter (ADC) architecture specifically optimized for flexible electronics. Moreover, integrating an ADC optimization during the training phase of classifiers such as Multi-Layer Perceptrons (MLPs) allows the removal of redundant components, achieving up to 5x reduction in transistor count with less than 1% accuracy loss<sup>1</sup>.

Processing sensory time-series data is another use case of flexible electronics. To mitigate manufacturing variation and sensor noise, we utilize a second-order learnable low-pass filter as a hardware-based solution, and employ data augmentation along with the injection of simulated variations in hardware components during neural network training as software-based methods. The neural network directly mimics the functionality of flexible electronics, where elements such as weights correspond to the emerging non-volatile memories. This hardware-software co-design results in a ~24.7% improvement in accuracy and a ~91% power reduction<sup>2</sup>.

### 20.C. Perspective and outlook

The realization of edge AI using emerging technologies, such as non-volatile memories and flexible electronics, represents a promising future research direction for CDNC. In this context, mixed-signal intelligent on-sensor processing emerges as a key enabler for many resource-constrained edge-AI applications.

The five major challenges—scaling, memory, power, sustainability, and cost—are now limiting the effectiveness of traditional CMOS design methods at advanced nodes. These limitations demand a fundamental shift in design and technology approaches to meet the increasing performance and efficiency requirements of future applications. In strong collaboration with imec, CDNC is researching CMOS 2.0, a new paradigm driven by System Technology Co-Optimization (STCO), where technology and system design are developed in tandem.

In addition, CDNC actively contributes to the development of open-source Electronic Design Automation (EDA) tools, as we aim to contribute meaningfully to the European Chips Act. We believe that such an impact can only be achieved through strong collaboration with other leading research groups across Europe, among which the ETS community stands out as one of the most promising partners.

<sup>1</sup>Lozano Duarte, Paula Carolina, et al. "Design and In-training Optimization of Binary Search ADC for Flexible Classifiers." Proceedings of the 30th Asia and South Pacific Design Automation Conference. 2025.

<sup>2</sup>Gheshalghi, Tara, et al. "ADAPT-pNC: Mitigating Device Variability and Sensor Noise in Printed Neuromorphic Circuits with SO Adaptive Learnable Filters." 2025 Design, Automation & Test in Europe Conference & Exhibition (DATE), IEEE, 2025.

## 21. CONNECTING HIGH TO LOW ABSTRACTION LEVELS IN MAKING DESIGN AND TEST DECISIONS IN EMBEDDED SYSTEMS

*M. Rajabalipanah, K. Basharkhah, N. Nosrati, Z. Jahanpeima, Z. Navabi.*  
University of Tehran, Tehran, Iran

The focus of our research group is on abstract methodologies for realization of design, evaluation, and test of digital systems with consideration of low-level physical properties. This research summary paper presents an aspect of our work that uses our implementation of RISC-V, i.e., AFTAB, in embedded systems for various design, test, and reliability applications.

### 21.A. Preview

In our research group, we consider various aspects of digital system design, from the very abstract virtual platform level to low-level physical design. Through the use of hardware description languages, e.g., VHDL, SystemVerilog, and SystemC (TLM, AMS, UVM, etc.), we keep a connection between the physical properties of hardware and abstract models, no matter how high the level of abstraction we deal with. This connectivity is achieved through back-annotation, training of neural networks, and the use and development of domain-specific hardware languages with direct one-to-one hardware correspondence.

For exercising our methods and techniques, we have designed and implemented a RISC-V-based machine, on which several embedded systems are built. The processor's Instruction Set Simulator (ISS) is developed in SystemC for evaluating our embedded systems at the system level. Most hardware techniques that we develop are first tested in SystemC at the system level, and their lower-level hardware implementations are then derived. This applies to our reliability techniques of RISC-V power analysis. The rest of this paper gives a glimpse of some of our work related to digital system testing.

### 21.B. Current Research, Frameworks and Projects

#### 1) AFTAB:

One of our key projects is AFTAB (which stands for A Fine Tehran/Turin Architectural Being), a 32-bit processor based on the RISC-V architecture [97]. AFTAB supports the RV32IM instruction set and features interrupt and exception handling. It includes both Machine Mode (M-mode) and User Mode (U-mode) for enhanced functionality. We have developed both multi-cycle and pipelined versions of AFTAB. Its toolchain has been fully developed, and we have also created an ISS for system-level integration and testing. Additionally, we have worked on testing and security aspects of this processor, which will be discussed in detail in later sections.

The design of AFTAB's datapath and controller is illustrated in Fig. 21.1, showcasing the structured architecture of the multi-cycle processor.

#### 2) SAYAC-based Embedded System:

SAYAC is a RISC-V-like open-source academic processor, shown in Fig. 21.2, that was originally designed to support educational processor hardware architecture and implementation.



Fig. 21.1. AFTAB's datapath and controller

However, the simple and ample architecture of this processor provides added value, enabling researchers to devise new research directions in design, test, reliability, high-level modeling, and security. Over the past years, our research group has evolved the architecture into a complete ecosystem, shown in Fig. 21.3.

The SAYAC ecosystem includes open-source tools such as the Yosys synthesis tool and Qflow for the design process. In addition to these, a SystemC-based gate-level power estimation tool has been developed for power characterization and back-annotation.

##### a) Processor Testing:

Several existing Design for Test (DFT) techniques are incorporated into the SAYAC processor to make it testable and evaluate its testability for post-manufacturing faults, considering stuck-at-fault models. After several design modifications to make SAYAC test-ready, the following DFT techniques have been incorporated into SAYAC:

- Single and Multiple scan testing
- Built-in self-test (BIST) architectures, including RTS and STUMPS
- Boundary-scan IEEE 1149.1 standard
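
The principle shared by the BIST architectures listed above, pseudo-random pattern generation followed by response compaction into a signature, can be sketched as follows. This is a toy model only: the 4-bit LFSR, the simplified MISR, and the `good_cut` and `faulty_cut` functions are hypothetical stand-ins and do not describe the hardware inserted into SAYAC.

```python
def lfsr_patterns(seed, taps, width, count):
    """Fibonacci LFSR: yield `count` successive states as test patterns."""
    state = seed
    mask = (1 << width) - 1
    for _ in range(count):
        yield state
        feedback = 0
        for t in taps:
            feedback ^= (state >> t) & 1
        state = ((state << 1) | feedback) & mask


def misr_signature(responses, taps, width):
    """Compact a stream of response words into one signature (simplified MISR)."""
    sig = 0
    mask = (1 << width) - 1
    for r in responses:
        feedback = 0
        for t in taps:
            feedback ^= (sig >> t) & 1
        sig = (((sig << 1) | feedback) ^ r) & mask
    return sig


def good_cut(x):
    """Toy 4-bit circuit under test."""
    return (x ^ (x >> 1)) & 0xF


def faulty_cut(x):
    """Same circuit with output bit 0 stuck at 0."""
    return good_cut(x) & 0xE


TAPS, WIDTH = (3, 2), 4  # feedback taps of a 4-bit maximal-length LFSR
patterns = list(lfsr_patterns(seed=1, taps=TAPS, width=WIDTH, count=15))
golden = misr_signature((good_cut(p) for p in patterns), TAPS, WIDTH)
observed = misr_signature((faulty_cut(p) for p in patterns), TAPS, WIDTH)
print("pass" if observed == golden else "fail")
```

Only the final signature is compared against the golden value, which is what keeps the on-chip test hardware small. The price is a small probability of aliasing, where a faulty response stream happens to produce the golden signature.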

##### b) Memory Testing:

Memory blocks of SAYAC, including RAM, instruction ROM, and register file have been tested for memory fault models through:

- Memory BIST architectures
- Boundary scan IEEE 1149.1 architectures
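
As an illustration of what a memory BIST controller typically executes, the sketch below runs the well-known March C- algorithm on a small software memory model. The text above does not state which march algorithm is used for SAYAC, so the choice of March C- is an assumption, and the `Memory` class with its stuck-at fault injection is hypothetical.

```python
class Memory:
    """Bit-wide memory model with optional stuck-at cells."""

    def __init__(self, size, stuck_at=None):
        self.cells = [0] * size
        self.stuck_at = stuck_at or {}  # {address: forced value}

    def write(self, addr, value):
        self.cells[addr] = self.stuck_at.get(addr, value)

    def read(self, addr):
        return self.stuck_at.get(addr, self.cells[addr])


def march_c_minus(mem, size):
    """Run March C- and return the sorted list of failing addresses."""
    up = range(size)
    down = range(size - 1, -1, -1)
    # Each element: (address order, value expected on read, value to write)
    elements = [
        (up, None, 0),    # any order: w0
        (up, 0, 1),       # ascending: r0, w1
        (up, 1, 0),       # ascending: r1, w0
        (down, 0, 1),     # descending: r0, w1
        (down, 1, 0),     # descending: r1, w0
        (up, 0, None),    # any order: r0
    ]
    failures = set()
    for order, expect, write in elements:
        for addr in order:
            if expect is not None and mem.read(addr) != expect:
                failures.add(addr)
            if write is not None:
                mem.write(addr, write)
    return sorted(failures)


print(march_c_minus(Memory(8), 8))                   # fault-free memory
print(march_c_minus(Memory(8, stuck_at={5: 1}), 8))  # cell 5 stuck at 1
```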

##### c) Compiler:

Since the SAYAC processor is a custom core with an in-house ISA, a custom compiler has to be developed for it. We are currently working on an LLVM-based compiler in joint work with the Indian Institute of Technology Bhubaneswar under the supervision of Professor Srinivas Boppu. At the time of writing, the SAYAC compiler handles many C constructs. Some applications have been tested using this compiler, and the work is in progress.

Fig. 21.3. SAYAC Ecosystem

#### 3) Test Tool:

UT-DATE (University of Tehran Design and Test Environment) is a test toolchain developed to automate the testing process from start to finish. UT-DATE checks scan design rules and can insert various test hardware into the circuit under test (CUT), including scan chains, Built-In Self-Test (BIST), and Boundary Scan IEEE 1149.1 hardware. After generating fault lists and test patterns, it runs virtual tests for fault simulations. Finally, it reports test results, such as fault coverage.
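
The last two steps of such a flow, fault simulation and fault coverage reporting, can be illustrated on a toy circuit. The sketch below is not UT-DATE code: the circuit `y = (a AND b) OR c`, the net names, and the helper functions are hypothetical, and the fault list simply contains a stuck-at-0 and a stuck-at-1 fault on every net.

```python
from itertools import product

NETS = ["a", "b", "c", "n1", "y"]


def simulate(a, b, c, fault=None):
    """Evaluate y = (a AND b) OR c; `fault` is (net, stuck_value) or None."""
    def apply(net, value):
        return fault[1] if fault and fault[0] == net else value

    a, b, c = apply("a", a), apply("b", b), apply("c", c)
    n1 = apply("n1", a & b)
    return apply("y", n1 | c)


def fault_coverage(patterns):
    """Fraction of single stuck-at faults detected by at least one pattern."""
    faults = list(product(NETS, (0, 1)))
    detected = [
        flt for flt in faults
        if any(simulate(*p) != simulate(*p, fault=flt) for p in patterns)
    ]
    return len(detected) / len(faults)


print(fault_coverage([(1, 1, 0)]))
print(fault_coverage([(1, 1, 0), (0, 1, 0), (1, 0, 0), (0, 0, 1)]))
```

A fault counts as detected when at least one pattern produces an output that differs from the fault-free response. With the single pattern, 4 of the 10 faults are detected; the four-pattern set detects all of them.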

#### 4) Reliability:

We focus on enhancing reliability in both special-purpose accelerators (like neural networks) and general-purpose processors. For processors, which have a heterogeneous structure, we introduce a machine learning-based checker. This checker monitors the data bus, address bus, and control signals to detect faults. A dedicated hardware implementation is designed to meet the time, area, and power constraints of embedded systems, with additional protection provided by residue-based checkers.
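
A residue-based checker of the kind mentioned above can be sketched for an adder. The residue of the result is predicted from the residues of the operands and compared with the residue of the actual result. The modulus 3, the function names, and the injected fault are hypothetical choices for illustration, and word-width overflow is ignored.

```python
MODULUS = 3  # a single-bit error changes the result by +/- 2**k, never a multiple of 3


def residue(x):
    return x % MODULUS


def checked_add(a, b, adder=lambda x, y: x + y):
    """Return (result, ok), where ok is False if the residue check fails."""
    result = adder(a, b)
    predicted = (residue(a) + residue(b)) % MODULUS
    return result, residue(result) == predicted


def faulty_adder(x, y):
    """Adder model whose result bit 2 is flipped by a fault."""
    return (x + y) ^ 0b100


print(checked_add(25, 17))                # fault-free addition passes the check
print(checked_add(25, 17, faulty_adder))  # the flipped bit is flagged
```

The check only needs arithmetic on two-bit residues, which is why residue codes are attractive under tight area and power budgets.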

#### 5) Security:

Some work has been done on enhancing security in processors, addressing vulnerabilities and ensuring robust protection against threats. Below, we highlight some of these key efforts and advancements.

##### a) Processor Memory Protection:

Physical Memory Protection (PMP) is a key feature in RISC-V processors that provides hardware-enforced memory protection. It allows the system to define access permissions (read, write, execute) for specific memory regions, enhancing security by preventing unauthorized access to critical areas of memory. In AFTAB, we have implemented PMP to ensure robust memory protection. This feature enables AFTAB to securely manage memory access at the hardware level, making it suitable for applications requiring high security and reliability.
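
The access check performed by PMP can be modeled in a few lines. The sketch below is a simplification and not the AFTAB implementation: regions are plain byte ranges rather than the TOR/NAPOT address encodings of the RISC-V specification, entry locking is ignored, and the access is assumed to come from U-mode, so an address that matches no region is denied. The `Region` type and `pmp_allows` function are hypothetical names.

```python
from dataclasses import dataclass

# Permission bits, in the same R/W/X bit order as the low bits of a pmpcfg entry.
R, W, X = 0b001, 0b010, 0b100


@dataclass(frozen=True)
class Region:
    base: int    # first byte address of the region
    limit: int   # one past the last byte address
    perms: int   # bitwise OR of R, W, X


def pmp_allows(regions, addr, access):
    """The lowest-numbered matching region decides; no match means denied."""
    for region in regions:
        if region.base <= addr < region.limit:
            return bool(region.perms & access)
    return False


regions = [
    Region(0x0000, 0x1000, R | X),  # code: read and execute only
    Region(0x1000, 0x2000, R | W),  # data: read and write only
]
print(pmp_allows(regions, 0x0800, X))  # executing code
print(pmp_allows(regions, 0x0800, W))  # overwriting code
print(pmp_allows(regions, 0x1800, W))  # writing data
print(pmp_allows(regions, 0x3000, R))  # unmapped address
```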

##### b) Control Flow Graph Checking:

Processors in safety-critical systems are susceptible to control flow hijacking through fault injection or software-based attacks. To mitigate these threats, we employ the control flow integrity (CFI) technique. Our approach segments the program into hierarchical blocks called DICONs, each beginning at a divergence point in the control flow graph (CFG) of a program and ending at a convergence point. Custom instructions are used for DICON signature verification. Upon completing a sub-block, the collected signature is compared against the compiler's expected signature stored in the secure external memory. Our method is implemented on AFTAB, demonstrating significant advantages in memory usage and execution time while effectively detecting the aforementioned attacks.
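
The signature comparison at the end of a block can be sketched as follows. The rotate-and-XOR signature function is a hypothetical stand-in, since the text does not specify how DICON signatures are computed, and the three instruction words are ordinary RV32I encodings chosen only as sample data.

```python
def update_signature(sig, instr_word):
    """Rotate left by one bit, then XOR in the instruction word (stand-in function)."""
    sig = ((sig << 1) | (sig >> 31)) & 0xFFFFFFFF
    return sig ^ instr_word


def block_signature(instr_words):
    """Signature collected while executing the instructions of one block."""
    sig = 0
    for word in instr_words:
        sig = update_signature(sig, word)
    return sig


def verify_block(executed, expected_sig):
    """Compare the run-time signature with the one computed at compile time."""
    return block_signature(executed) == expected_sig


block = [
    0x00500093,  # addi x1, x0, 5
    0x00A00113,  # addi x2, x0, 10
    0x002081B3,  # add  x3, x1, x2
]
expected = block_signature(block)  # would be produced by the compiler

print(verify_block(block, expected))                 # intended control flow
print(verify_block([block[0], block[2]], expected))  # one instruction skipped
```

Because the signature depends on both the content and the order of the executed instructions, a skipped or redirected instruction changes it with high probability.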

##### c) AES Custom Instruction:

Cryptography plays a key role in protecting data, and AES (Advanced Encryption Standard) is well-suited for hardware implementation. We designed a 128-bit AES unit that uses the same hardware for both encryption and decryption. This unit was integrated into AFTAB by adding four custom instructions. These instructions differ in their opcode for load/store operations, while a single bit in the instruction (called "mode") determines whether encryption or decryption is performed. The input and output data are handled through AFTAB's internal memory bank. This method improves data security compared to using the AES unit as an external accelerator or co-processor, where data would need to leave the processor for encryption or decryption.

### 21.C. Connection to ETS

In ETS 2021 [98], we designed a processing element (PE) for convolution accelerators that supports row-stationary dataflow. Our PE includes an embedded mechanism for online testing of its computational parts, leveraging CNN data sparsity to avoid performance degradation. Additionally, we use a low-cost error-detecting code to verify PE interconnections and local memory.

In ETS 2023 [99], we showed that IP-core interconnects can be fully characterized by post-layout information of the IP core, load properties, and the number of destination cores they are driving. This information can be back-annotated into abstract system-level interconnect models to be used by core integrators for design space exploration (DSE). We proposed a machine-learning based methodology that uses signoff parasitic information and the actual wire data to generate the dataset and train a model. The model was evaluated in a fast, high-level SystemC environment with the SAYAC and AFTAB processors in two SoCs.

In ETS 2024 [100], we presented an executable model of a BIST-inserted IP core. This model provides the customer with the golden signature of any test configuration and gives insights into how the custom configuration may affect the quality of the test and what configuration best suits the test constraints.

In ETS 2025 [101], we introduced a gate-level back-annotation of aging characteristics that improves simulation efficiency and retains accuracy. This method is used in a uniform framework that brings in aging considerations in an event-based simulation environment and automates the extraction of the required information.

## 22. ETS AND THE COMPUTER ARCHITECTURE GROUP OF THE UNIVERSITY OF STUTTGART

*Hans-Joachim Wunderlich, University of Stuttgart, Germany; Sybille Hellebrand, Paderborn University, Germany*

Founded in the same year, the European Test Workshop (ETW, now ETS) and the Computer Architecture Group of the University of Stuttgart have a long joint history. This brief report points to some of the highlights in research and in the evolution of the conference.

### 22.A. The roots

In 1996, six researchers felt that academia and industry in Europe needed an event dedicated specifically to the test and reliability of semiconductors and systems. Chaired by Christian Landrault (LIRMM Montpellier), Joan Figueras (UPC Barcelona), Paolo Prinetto (P Torino), Paulo Teixeira (INESC, Lisbon), Hans-Joachim Wunderlich (U Stuttgart) and Yervant Zorian (LogicVision, San Jose), the first European Test Workshop started in Sète, close to Montpellier.

Just two months later, the Computer Architecture and Computer Engineering Group was established at the University of Stuttgart. Its most prominent research topics will be described below.

The first three editions of the ETW did not publish formal proceedings but only informal handouts. However, these handouts contained many seminal contributions that were later published at renowned conferences and in journals. The first formal proceedings appeared in 1999 at the ETW in Constance (Germany), chaired by H.-J. Wunderlich. Even after ETW was elevated to a symposium in 2004, ETS retained a workshop track in which early results were printed only in the informal handouts, while the final versions of the papers were published elsewhere. In addition, the best papers of ETW and ETS were selected for publication in the Kluwer (later Springer) Journal JETTA. The Stuttgart Computer Architecture group made extensive use of the option of presenting early results at ETW and ETS.

### 22.B. Highlights in research and collaboration

#### 1. Deterministic Logic BIST

From the very beginning, deterministic logic BIST (DLBIST) has been at the center of the activities of the Computer Architecture group. Today, DLBIST is supported by the major CAD vendors, since it has become mandatory in safety-critical systems such as automotive, air or space applications and even in high-performance computing. One of the most-cited papers on this topic was presented at the first edition of ETW<sup>1</sup>; later ETW presentations increased the efficiency of DLBIST<sup>2</sup> and proved its applicability to industrial circuits<sup>3</sup>. While the concept of combining test point insertion with DLBIST has been commercialized only recently, the first feasibility studies were published in the formal proceedings of ETW twenty years earlier<sup>4</sup>. Today, commercial EDA vendors are extending DLBIST to cover more sophisticated fault models as well, while the Computer Architecture group in Stuttgart published case studies of transition fault testing with DLBIST together with an industrial partner at ETS also twenty years ago<sup>5</sup>.

<sup>1</sup>This work was first orally presented at ETW, but published in the same year at ICCAD: H.-J. Wunderlich, G. Kiefer, "Bit-flipping BIST", International Conference on Computer Aided Design (ICCAD), 1996.

<sup>2</sup>Originally presented at ETW: S. Hellebrand, H.-G. Liang, and H.-J. Wunderlich: "A Mixed Mode BIST Scheme Based on Reseeding of Folding Counters", Journal of Electronic Testing 17, 341–349 (2001).

<sup>3</sup>Originally presented at ETW: G. Kiefer, H. Vranken, E. J. Marinissen, H.J. Wunderlich: "Application of Deterministic Logic BIST on Industrial Circuits", Journal of Electronic Testing 17, 351–362 (2001).

All these papers laid the basis for further adaptations and optimizations to consider power, variability, temperature and impact on speed and hardware. They are still the subject of current research in the community<sup>6</sup> [102].

#### 2. Power-aware Test

At the end of the 1990s, power consumption of semiconductor systems became a major concern, and during testing, power consumption increases even further. Again, one of the most referenced papers in the literature on power-aware test and design-for-test was first presented in the workshop track of ETW 1999 in Constance, and later at ITC and in an extended journal version<sup>7</sup>.

At that time, a lively dispute followed over whether there is really a need to be power conscious during test, but today it is mainstream in order to keep temperature within the specification and to ensure signal integrity and hence yield. Any new technology and any new design-for-test scheme have power-aware test as an ongoing research and development goal.

#### 3. Logic Diagnosis and Automotive Systems

Especially in automotive systems with their extremely high safety and reliability requirements, lifetime silicon management and diagnosis are mandatory<sup>8,9</sup>. In the automotive area, the test community could see that design-for-test and design-for-diagnosis are a need of the end users. The OEMs no longer consider these features an expensive overhead but an added value. The Computer Architecture group made serious efforts to combine embedded test and DLBIST with high-resolution diagnosis and presented a comprehensive overview in a tutorial at ETS<sup>1</sup>. Eventually, they created the notion of Built-in Self-diagnosis (BISD)<sup>2</sup> [103], which is now a recognized part of silicon lifecycle management schemes.

<sup>4</sup>H. Vranken, F. Meister, H.-J. Wunderlich: "Combining Deterministic Logic BIST with Test Point Insertion", Proceedings of the 7th European Test Workshop (ETW'02), Korfu, Greece, 26-29 May 2002, pp. 105-110.

<sup>5</sup>V. Gherman, H.-J. Wunderlich, J. Schloeffel, M. Garbers: "Deterministic Logic BIST for Transition Fault Testing", Proceedings of the 11th European Test Symposium (ETS'06), Southampton, United Kingdom, 21-24 May 2006, pp. 123-130.

<sup>6</sup>At ETS 2023 in Venice, some of these challenges were described in a keynote which was later published in a journal: J. Rajski, V. Chickermane, J.-F. Côté, S. Eggersglüß, N. Mukherjee, J. Tyszer, "The Future of Design for Test and Silicon Lifecycle Management," IEEE Design & Test, vol. 41, no. 4, pp. 35-49, Aug. 2024.

<sup>7</sup>Originally presented at ETW: S. Gerstendorfer, H.-J. Wunderlich: "Minimized Power Consumption for Scan-Based BIST", Journal of Electronic Testing 16, 203–212 (2000).

<sup>8</sup>F. Reimann, M. Glaß, J. Teich, A. Cook, L. Rodríguez Gómez, D. Ull, H.-J. Wunderlich, P. Engelke, and U. Abelein: "Advanced Diagnosis: SBST and BIST Integration in Automotive E/E Architectures", 51st Annual Design Automation Conference (DAC '14), 2014.

<sup>9</sup>U. Abelein, A. Cook, P. Engelke, M. Glaß, F. Reimann, L.R. Gomez, T. Russ, J. Teich, D. Ull: "Non-intrusive integration of advanced diagnosis features in automotive E/E-architectures", 2014 Design, Automation & Test in Europe, Dresden, Germany, 2014.

In modern technologies, the determination of fault models and the restriction to certain fault dictionaries can limit the resolution and accuracy of diagnostic algorithms. The first algorithm to overcome these limitations received the ETS Best Paper Award in 2007 [104].

#### 4. ETS Steering Committee and Test Spring School

In 2008, H.-J. Wunderlich succeeded Christian Landrault as chair of the ETS steering committee. One major action was the creation of the ETS Test Spring School (TSS), which took place for the first time in Prague in 2010. Renowned researchers from industry and academia introduce young researchers to the most advanced topics of semiconductor test and reliability. Up to now, approximately 450 participants, mostly young PhD students, have received a certificate for attending the school or even for successfully passing the final online exam.

#### 5. Reliability and Early Life Failures

The further scaling of semiconductors and the increasing complexity of the manufacturing process have raised reliability concerns, especially with respect to early life failures. These failures are caused by marginalities which are not observable at the time of manufacturing but grow soon afterwards. Fault tolerance and repair help to overcome these reliability threats, and test generation and fault simulation algorithms have to be robust against variations and exploit non-functional observables.

In random logic, small delay faults caused by resistive opens or bridges are such a reliability threat even if they do not change the system speed at the time of manufacturing. These hidden delay faults are only detectable if the circuit is operated at a higher frequency than the specified one. This leads to a faster-than-at-speed test which poses additional challenges due to variations, signal stability or X-masking of undefined outputs. Various BIST schemes to master these challenges have been developed<sup>3</sup>.
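The detection condition for such hidden delay faults can be illustrated with a small numerical sketch. It assumes a simplified single-path model in which a defect adds a fixed extra delay to one path; the function names, the path delay and the defect size are hypothetical values chosen for illustration and are not taken from the cited work.

```python
def is_detected(path_delay_ns, defect_delay_ns, clock_period_ns):
    """A delay fault is observed when the faulty path misses the capture clock edge."""
    return path_delay_ns + defect_delay_ns > clock_period_ns


def min_fast_frequency_mhz(path_delay_ns, defect_delay_ns):
    """Test frequency (MHz) above which the faulty path starts to fail."""
    return 1000.0 / (path_delay_ns + defect_delay_ns)


nominal_period_ns = 10.0  # assumed specified clock: 100 MHz
path_delay_ns = 6.0       # assumed fault-free delay of a short path
defect_delay_ns = 1.5     # assumed extra delay caused by a resistive open

# At the specified frequency the faulty path still meets timing: the fault is hidden.
hidden_at_speed = not is_detected(path_delay_ns, defect_delay_ns, nominal_period_ns)

# Raising the test frequency beyond this threshold exposes the fault.
threshold_mhz = min_fast_frequency_mhz(path_delay_ns, defect_delay_ns)

# With a 7 ns test period the faulty path fails, while the fault-free path (6 ns) passes.
detected_fast = is_detected(path_delay_ns, defect_delay_ns, 7.0)
fault_free_passes = not is_detected(path_delay_ns, 0.0, 7.0)
```

In this simplified model the usable test window lies between the fault-free and the faulty path delay; process variations narrow that window, which is one reason why faster-than-at-speed test is challenging.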

The developed test and BIST schemes had to be complemented by online monitoring techniques. Cost and efficiency of these techniques have been significantly improved by reusing them for manufacturing test, early life failure detection and aging monitoring<sup>4</sup>. To validate and extend the methods for deterministic testing, precise and timing-aware fault simulation and test generation are required. A corresponding paper on variation-aware ATPG received the ETS Best Paper Award in 2014 [105].

<sup>1</sup>H.-J. Wunderlich, "From Embedded Test to Embedded Diagnosis," European Test Symposium (ETS'05), Tallinn, Estonia, 2005, pp. 216-221.

<sup>2</sup>M. Elm, H.-J. Wunderlich: "BISD: Scan-Based Built-In Self-Diagnosis", ACM/IEEE Design Automation and Test in Europe (DATE'10), Dresden, Germany, 2010.

<sup>3</sup>M. Kampmann, M. A. Kochte, C. Liu, E. Schneider, S. Hellebrand and H.-J. Wunderlich, "Built-In Test for Hidden Delay Faults," in IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 38, no. 10, pp. 1956-1968, Oct. 2019.

<sup>4</sup>C. Liu, E. Schneider and H.-J. Wunderlich, "Using Programmable Delay Monitors for Wear-Out and Early Life Failure Prediction," 2020 Design, Automation & Test in Europe (DATE), Grenoble, France, 2020, pp. 804-809.

Highly recognized are the results obtained for online repair of embedded memories which do not require storing large failure bit maps in the system but intertwine test and repair on the fly and still obtain an optimal solution [106]. The corresponding search procedure is combined with an efficient technique to continuously reduce the problem complexity and keep the test and analysis time low.
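As an illustration of how a repair problem with 2D redundancy can be reduced step by step, the sketch below applies the classic must-repair rule from the redundancy analysis literature. It is not the procedure of [106]; the function name, the data layout and the example fault set are assumptions made for this sketch.

```python
from collections import Counter


def must_repair(faults, spare_rows, spare_cols):
    """Apply the must-repair rule until no forced choice remains.

    faults: iterable of (row, col) positions of faulty cells.
    Returns (remaining_faults, rows_used, cols_used, spare_rows, spare_cols).
    """
    faults = set(faults)
    rows_used, cols_used = [], []
    changed = True
    while changed:
        changed = False
        row_counts = Counter(r for r, _ in faults)
        col_counts = Counter(c for _, c in faults)
        # A row with more faults than spare columns left can only be fixed by a spare row.
        forced_row = next(
            (r for r, n in sorted(row_counts.items()) if n > spare_cols), None
        )
        if forced_row is not None and spare_rows > 0:
            faults = {(r, c) for r, c in faults if r != forced_row}
            rows_used.append(forced_row)
            spare_rows -= 1
            changed = True
            continue
        # Symmetric rule: a column with more faults than spare rows left needs a spare column.
        forced_col = next(
            (c for c, n in sorted(col_counts.items()) if n > spare_rows), None
        )
        if forced_col is not None and spare_cols > 0:
            faults = {(r, c) for r, c in faults if c != forced_col}
            cols_used.append(forced_col)
            spare_cols -= 1
            changed = True
    return faults, rows_used, cols_used, spare_rows, spare_cols


# Hypothetical fault set for a memory with 2 spare rows and 1 spare column.
example_faults = {(0, 0), (0, 1), (0, 2), (3, 5), (4, 5), (6, 7)}
remaining, rows, cols, rows_left, cols_left = must_repair(example_faults, 2, 1)
```

In this example every fault is covered by forced choices alone. In general a residual fault set remains and has to be handed to a search procedure, and faults that remain with no spares left indicate an unrepairable memory.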

#### 6. Reconfigurable Scan Networks

Test, debug and diagnosis require easy access to internal instruments, sensors, monitors and also signature registers and pattern generators for DLBIST. An important means for this are reconfigurable scan networks (RSNs), as standardized for instance in IEEE 1687. The Computer Architecture group published an early paper in this area on optimized retargeting<sup>5</sup> and analyzed the use of RSNs offline and also online in the field for health monitoring and silicon lifecycle management. Because RSNs give access to internal structures, special effort is needed to ensure security. Several techniques have been developed for analyzing and preserving security properties during RSN integration<sup>6</sup>.
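How a reconfigurable scan network trades access time against reachability can be sketched with a strongly simplified model built from scan registers and segment insertion bits (SIBs). The classes, register names and lengths below are hypothetical; real IEEE 1687 networks contain further elements and are not described in this form.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Register:
    name: str
    length: int


@dataclass
class SIB:
    name: str
    children: List["Node"] = field(default_factory=list)
    is_open: bool = False


Node = Union[Register, SIB]


def active_path_length(nodes):
    """Number of scan bits currently on the active scan path."""
    total = 0
    for node in nodes:
        if isinstance(node, Register):
            total += node.length
        else:
            total += 1  # the SIB's own control bit is always on the path
            if node.is_open:
                total += active_path_length(node.children)
    return total


def sibs_to_open(nodes, target, prefix=()):
    """Names of the SIBs that must be open to reach register `target`, or None."""
    for node in nodes:
        if isinstance(node, Register):
            if node.name == target:
                return list(prefix)
        else:
            found = sibs_to_open(node.children, target, prefix + (node.name,))
            if found is not None:
                return found
    return None


def open_sibs(nodes, names):
    """Open every SIB whose name is listed in `names`."""
    for node in nodes:
        if isinstance(node, SIB):
            if node.name in names:
                node.is_open = True
            open_sibs(node.children, names)


network = [
    Register("id", 8),
    SIB("sib_a", [Register("temp_sensor", 12),
                  SIB("sib_b", [Register("bist_sig", 32)])]),
]

closed_length = active_path_length(network)   # only "id" and the bit of sib_a
needed = sibs_to_open(network, "bist_sig")    # SIBs on the way to the signature register
open_sibs(network, needed)
open_length = active_path_length(network)     # path after reconfiguration
```

The sketch also hints at the security concern: once the SIBs on the way to an instrument are open, everything in the same segment becomes part of the scan path.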

#### 7. Test under Variations

With the appearance of the FinFET and Gate-All-Around technologies, the effects of process variations were only partly and temporarily mitigated. Process variations are again increasing and have a severe impact on the circuit timing. They must be differentiated from circuit marginalities and defects which may grow during the lifetime and lead to a complete circuit failure; a first classification scheme was published at ETS<sup>7</sup>. Process variations and temperature and voltage fluctuations aggravate this problem, and DLBIST schemes have to be adapted [102].

#### 22.C. Outlook

After 29 years, the Computer Architecture Group of the University of Stuttgart was closed when the last PhD student left in February 2025. In total, more than 400 contributions to books, journals, conferences or workshops have been published. Of these, 40 papers were presented at ETW or ETS. The group received 11 best paper awards at various conferences, 2 of them from ETS.

While the group is closed, the authors still offer to share their expertise on request.

<sup>5</sup>R. Baranowski, M. A. Kochte and H. -J. Wunderlich, "Scan pattern retargeting and merging with reduced access time," European Test Symposium (ETS), Avignon, France, 2013.

<sup>6</sup>N. Lylina, C.-H. Wang and H.-J. Wunderlich, "SCAR: Security Compliance Analysis and Resynthesis of Reconfigurable Scan Networks," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 41, no. 12, pp. 5644-5656, Dec. 2022.

<sup>7</sup>Z. P. Najafi-Haghi, M. Hashemipour-Nazari and H.-J. Wunderlich, "Variation-Aware Defect Characterization at Cell Level", IEEE European Test Symposium (ETS), Tallinn, Estonia, 2020.

## REFERENCES

### INTRODUCTION

[1] S. Eggersglüß, S. Hamdioui, A. Jutman, M. K. Michael, J. Raik, M. S. Reorda, M. Tahoori, and E.-I. Vatajelu. “IEEE European Test Symposium (ETS)”. In: *2019 IEEE International Test Conference (ITC)*. 2019, pp. 1–4. doi: 10.1109/ITC44170.2019.9000148.

### TALTECH: FROM DECISION DIAGRAMS TO EDGE AI RELIABILITY

[2] R. Ubar, J. Raik, M. Jenihhin, and A. Jutman. *Structural Decision Diagrams in Digital Test: Theory and Applications*. Springer Basel AG, 2024.

[3] A. Tsertov, A. Jutman, S. Devadze, M. S. Reorda, E. Larsson, F. G. Zadegan, R. Cantoro, M. Montazeri, and R. Krenz-Baath. “A suite of IEEE 1687 benchmark networks”. In: *2016 IEEE International Test Conference (ITC)*. 2016, pp. 1–10. doi: 10.1109/TEST.2016.7805840.

[4] M. Jenihhin et al. “Identification and Rejuvenation of NBTI-Critical Logic Paths in Nanoscale Circuits”. In: *Journal of Electronic Testing* 32 (2016), pp. 273–289.

[5] M. H. Ahmadilivani, M. Taheri, J. Raik, M. Daneshtalab, and M. Jenihhin. “DeepVigor: Vulnerability Value RanGes and FactORs for DNNs’ Reliability Assessment”. In: *2023 IEEE European Test Symposium (ETS)*. 2023, pp. 1–6. doi: 10.1109/ETS56758.2023.10174133.

[6] M. Taheri, N. Cherezova, S. Nazari, A. Rafiq, A. Azarpeyvand, T. Ghasempouri, M. Daneshtalab, J. Raik, and M. Jenihhin. “AdAM: Adaptive Fault-Tolerant Approximate Multiplier for Edge DNN Accelerators”. In: *2024 IEEE European Test Symposium (ETS)*. 2024, pp. 1–4. doi: 10.1109/ETS61313.2024.10567161.

### A LEGACY OF DEPENDABILITY IN CLUJ-NAPOCA, ROMANIA

[7] I. Stoian. *IPA CLUJ 1976-2021, ACTA AUTOMATICA. SCIENTIA*. Cluj-Napoca, Romania: Risoprint, 2024. 508 pp. ISBN: 978-973-53-3189-4.

[8] S. Enyedi and L. Miclea. “IEEE International Conference on Automation, Quality and Testing, Robotics (AQTR)”. In: *2019 IEEE International Test Conference (ITC)*. Washington, DC, USA: IEEE, Nov. 2019, pp. 1–4. doi: 10.1109/ITC44170.2019.9000144.

[9] T. Sanislav and L. Miclea. “Cyber-Physical Systems - Concept, Challenges and Research Areas”. In: *Control Engineering and Applied Informatics* 14.2 (June 2012), pp. 28–33.

[10] S. Folea, D. Bordencea, C. Hotea, and H. Valean. “Smart home automation system using Wi-Fi low power devices”. In: *Proceedings of 2012 IEEE International Conference on Automation, Quality and Testing, Robotics*. Cluj-Napoca, Romania: IEEE, May 2012, pp. 569–574.

[11] D. G. Mois, S. Flonta, I. Stefan, S. Enyedi, and L. C. Miclea. “Reconfiguration security for hardware agents in testing”. In: *2010 IEEE International Conference on Automation, Quality and Testing, Robotics (AQTR)*. Cluj-Napoca, Romania: IEEE, May 2010, pp. 1–5.

### ESLAB/LINKÖPING U: SOC TESTING AND DFT

[12] J. Pouget, E. Larsson, Z. Peng, M.-L. Flottes, and B. Rouzeyre. “An Efficient Approach to SoC Wrapper Design, TAM Configuration and Test Scheduling”. In: *Proc. IEEE European Test Workshop (ETW)*. Maastricht, The Netherlands, May 2003, pp. 51–56.

[13] G. Jervan, R. Ubar, T. Shchenova, and Z. Peng. “Energy Minimization for Hybrid BIST in a System-on-Chip Test Environment”. In: *Proc. IEEE European Test Symposium*. Tallinn, Estonia, May 2005, pp. 2–7.

[14] V. Izosimov, M. Lora, G. Pravadelli, F. Fummi, Z. Peng, G. D. Guglielmo, and M. Fujita. “Optimization of Assertion Placement in Time-Constrained Embedded Systems”. In: *Proc. IEEE European Test Symposium*. Trondheim, Norway, May 2011, pp. 171–176.

[15] N. Aghaee, Z. Peng, and P. Eles. “Temperature-Gradient Based Burn-In and Test Scheduling for 3D Stacked ICs”. In: *IEEE Transactions on Very Large Scale Integration (VLSI) Systems* 23.12 (Dec. 2015), pp. 2992–3005.

[16] Y. Zhang, Y. Ding, Z. Peng, H. Li, M. Fujita, and J. Jiang. “BMC-Based Temperature-Aware SBST for Worst-Case Delay Fault Testing Under High Temperature”. In: *IEEE Transactions on Very Large Scale Integration (VLSI) Systems* 30.11 (Nov. 2022), pp. 1677–1690.

### WHEN SAT-BASED ATPG BECAME PRACTICAL

[17] R. Drechsler, S. Eggersglüß, G. Fey, A. Glowatz, F. Hapke, J. Schlöffel, and D. Tille. “On Acceleration of SAT-Based ATPG for Industrial Designs”. In: *IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems* 27.7 (2008), pp. 1329–1333.

[18] J. Shi, G. Fey, R. Drechsler, A. Glowatz, F. Hapke, and J. Schlöffel. “PASSAT: Efficient SAT-Based Test Pattern Generation for Industrial Circuits”. In: *IEEE Computer Society Annual Symposium on VLSI: New Frontiers in VLSI Design (ISVLSI)*. 2005, pp. 212–217.

[19] S. Eggersglüß, G. Fey, A. Glowatz, F. Hapke, J. Schlöffel, and R. Drechsler. “MONSOON: SAT-Based ATPG for Path Delay Faults Using Multiple-Valued Logics”. In: *Journal of Electronic Testing* 26.3 (2010), pp. 307–322.

[20] D. Tille, S. Eggersglüß, and R. Drechsler. “Incremental Solving Techniques for SAT-based ATPG”. In: *IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems* 29.7 (2010), pp. 1125–1130.

[21] D. Tille, S. Eggersglüß, R. Krenz-Baath, J. Schlöffel, and R. Drechsler. “Improving CNF representations in SAT-based ATPG for industrial circuits using BDDs”. In: *IEEE European Test Symposium*. 2010, pp. 176–181.

### ONSEMI-KU LEUVEN

[22] B. Esen, A. Coyette, G. Gielen, W. Dobbelaere, and R. Vanhooren. “Effective DC fault models and testing approach for open defects in analog circuits”. In: *2016 IEEE International Test Conference (ITC)*. 2016, pp. 1–9. doi: 10.1109/TEST.2016.7805830.

[23] J. Gomez, N. Xama, D. Lootens, A. Coyette, R. Vanhooren, W. Dobbelaere, and G. Gielen. “Experimental validation of a compact pinhole latent defect model for MOS transistors”. In: *IEEE Transactions on Electron Devices* 69.9 (2022), pp. 4796–4802.

[24] N. Xama, A. Coyette, B. Esen, W. Dobbelaere, R. Vanhooren, and G. Gielen. “Automatic testing of analog ICs for latent defects using topology modification”. In: *2017 22nd IEEE European Test Symposium (ETS)*. 2017, pp. 1–6. doi: 10.1109/ETS.2017.7968215.

[25] A. Coyette, B. Esen, R. Vanhooren, W. Dobbelaere, and G. Gielen. “Automatic generation of autonomous built-in observability structures for analog circuits”. In: *2015 20th IEEE European Test Symposium (ETS)*. 2015, pp. 1–6. doi: 10.1109/ETS.2015.7138754.

[26] A. Coyette, W. Dobbelaere, R. Vanhooren, N. Xama, J. Gomez, and G. Gielen. “Latent Defect Screening with Visually-Enhanced Dynamic Part Average Testing”. In: *2020 IEEE European Test Symposium (ETS)*. 2020, pp. 1–6. doi: 10.1109/ETS48528.2020.9131593.

### IMEC IN TEST

[27] E. J. Marinissen et al. “Solutions to Multiple Probing Challenges for Test Access to Multi-Die Stacked Integrated Circuits”. In: *Proceedings IEEE International Test Conference (ITC)*. Oct. 2018. doi: 10.1109/TEST.2018.8624731.

[28] E. J. Marinissen, T. McLaurin, and H. Jiao. “IEEE Std P1838: DfT Standard-under-Development for 2.5D-, 3D-, and 5.5D-SICs”. In: *Proceedings IEEE European Test Symposium (ETS)*. May 2016. doi: 10.1109/ETS.2016.7519330.

[29] P.-Y. Chuang. *Chiplet Interconnect Test and Repair*. PhD thesis: National Tsing-Hua University, Hsinchu, Taiwan, Dec. 2024, pp. 1–108. URL: <https://etd.lib.nycu.edu.tw/cgi-bin/gs32/hugsweb.cgi/ced=bhlUZa/record;r1=2&h1=1>.

[30] Z. Gao. *Optimization of Cell-Aware Test*. PhD thesis: TU Eindhoven, the Netherlands, Mar. 2023. URL: <https://www.tue.nl/en/our-university/calendar-and-events/10-03-2023-phd-defense-zhan-gao>.

[31] L. Wu. *Testing STT-MRAM: Manufacturing Defects, Fault Models, and Test Solutions*. PhD thesis: TU Delft, the Netherlands, Feb. 2021. doi: 10.4233/uuid:088a3991-4ea9-48a0-9b92-cc763748868c.

### TIMA: TEST, RELIABILITY AND SECURITY

[32] E. I. Vatajelu, R. Rodriguez-Montañés, M. Renovell, and J. Figueras. “Mitigating read & write errors in STT-MRAM memories under DVS”. In: *2017 22nd IEEE European Test Symposium (ETS)*. 2017, pp. 1–2. doi: 10.1109/ETS.2017.7968209.

[33] F. Cilici, M. J. Barragan, S. Mir, E. Lauga-Larroze, and S. Bourdel. “Assisted test design for non-intrusive machine learning indirect test of millimeter-wave circuits”. In: *2018 23rd European Test Symposium (ETS)*. 2018, pp. 1–6. doi: 10.1109/ETS.2018.8400689.

[34] M. Portolan, A. Savino, R. Leveugle, S. Di Carlo, A. Bosio, and G. Di Natale. “Alternatives to Fault Injections for Early Safety/Security Evaluations”. In: *2019 IEEE European Test Symposium (ETS)*. 2019, pp. 1–10. doi: 10.1109/ETS.2019.8791555.

[35] J. Suzano, A. Chastand, E. Valea, G. Di Natale, A. Philippe, F. Abouzeid, and P. Roche. “IEEE 1838 compliant scan encryption and integrity for 2.5/3D ICs”. In: *2024 IEEE European Test Symposium (ETS)*. 2024, pp. 1–6. doi: 10.1109/ETS61313.2024.10567195.

[36] A. Douadi, E.-I. Vatajelu, P. Maistri, D. Hely, V. Beroule, and G. Di Natale. “Modeling Thermal Effects For Biasing PUFs”. In: *2024 IEEE European Test Symposium (ETS)*. 2024, pp. 1–4. doi: 10.1109/ETS61313.2024.10567656.

#### POLITO’S CONTINUOUS CONTRIBUTION TO THE EUROPEAN COMMUNITY OF TEST AND RELIABILITY

[37] M. Sonza Reorda, L. Sterpone, and M. Violante. “Multiple errors produced by single upsets in FPGA configuration memory: a possible solution”. In: *European Test Symposium (ETS’05)*. 2005, pp. 136–141. doi: 10.1109/ETS.2005.29.

[38] P. Bernardi, L. Ciganda, M. de Carvalho, M. Grossi, J. Lagos-Benites, E. Sanchez, M. S. Reorda, and O. Ballan. “On-line software-based self-test of the Address Calculation Unit in RISC processors”. In: *2012 17th IEEE European Test Symposium (ETS)*. 2012, pp. 1–6. doi: 10.1109/ETS.2012.6233004.

[39] F. Angione, P. Bernardi, G. Filipponi, M. S. Reorda, D. Appello, V. Tancorre, and R. Ugioli. “An Optimized Burn-In Stress Flow targeting Interconnections logic to Embedded Memories in Automotive Systems-on-Chip”. In: *2022 IEEE European Test Symposium (ETS)*. 2022, pp. 1–6. doi: 10.1109/ETS54262.2022.9810396.

[40] A. Vallero, A. Savino, S. Tselonis, N. Foutris, M. Kaliorakis, G. Politano, D. Gizopoulos, and S. Di Carlo. “A Bayesian model for system level reliability estimation”. In: *2015 20th IEEE European Test Symposium (ETS)*. 2015, pp. 1–2. doi: 10.1109/ETS.2015.7138745.

[41] M. Portolan, A. Savino, R. Leveugle, S. Di Carlo, A. Bosio, and G. Di Natale. “Alternatives to Fault Injections for Early Safety/Security Evaluations”. In: *2019 IEEE European Test Symposium (ETS)*. 2019, pp. 1–10. doi: 10.1109/ETS.2019.8791555.

#### RESEARCH AT SORBONNE UNIVERSITÉ, CNRS, LIP6: ANALOG AND MIXED-SIGNAL ICS TESTING AND SECURITY, AI FOR ICS TESTING, TESTING ICS FOR AI, AND AI SECURITY

[42] A. Pavlidis, M.-M. Louérat, E. Faehn, A. Kumar, and H.-G. Stratigopoulos. “SymBIST: Symmetry-Based Analog and Mixed-Signal Built-In Self-Test for Functional Safety”. In: *IEEE Trans. Circuits Syst. I, Reg. Papers* 68.6 (2021), pp. 2580–2593.

[43] A. R. Díaz-Rizo, H. Aboushady, and H.-G. Stratigopoulos. “Anti-Piracy Design of RF Transceivers”. In: *IEEE Trans. Circuits Syst. I, Reg. Papers* 70.1 (2023), pp. 492–505.

[44] H. Stratigopoulos, T. Spyrou, and S. Raptis. “Testing and Reliability of Spiking Neural Networks: A Review of the State-of-the-Art”. In: *Proc. IEEE Int. Symp. Defect Fault Toler. VLSI Nanotechnol. Syst. (DFT)*. 2023.

[45] T. Spyrou, S. Hamdioui, and H.-G. Stratigopoulos. “SpikeFI: A Fault Injection Framework for Spiking Neural Networks”. In: *arXiv:2412.06795* (2024). URL: <https://arxiv.org/abs/2412.06795>.

[46] S. Raptis and H.-G. Stratigopoulos. “Minimum Time Maximum Fault Coverage Testing of Spiking Neural Networks”. In: *Proc. Design, Automat. Test Eur. Conf. Exhib. (DATE)*. 2025.

#### UCY: TEST AND RELIABILITY RESEARCH

[47] S. Neophytou and M. K. Michael. “On the Relaxation of n-detect Test Sets”. In: *26th IEEE VLSI Test Symposium (VTS 2008)*. 2008, pp. 187–192. doi: 10.1109/VTS.2008.14.

[48] S. Hadjitheophanous, S. N. Neophytou, and M. K. Michael. “Utilizing shared memory multi-cores to speed-up the ATPG process”. In: *21st IEEE European Test Symposium (ETS 2016)*. 2016, pp. 1–6. doi: 10.1109/ETS.2016.7519328.

[49] K. Christou, M. K. Michael, and S. Neophytou. “Identification of critical primitive path delay faults without any path enumeration”. In: *28th VLSI Test Symposium (VTS 2010)*. 2010, pp. 9–14. doi: 10.1109/VTS.2010.5469629.

[50] S. Neophytou, K. Christou, and M. K. Michael. “An Approach for Quantifying Path Correlation in Digital Circuits without any Path or Segment Enumeration”. In: *16th IEEE European Test Symposium (ETS 2011)*. 2011, pp. 141–146. doi: 10.1109/ETS.2011.44.

[51] M. A. Skitsas, C. A. Nicopoulos, and M. K. Michael. “DaemonGuard: Enabling O/S-Orchestrated Fine-Grained Software-Based Selective-Testing in Multi-/Many-Core Microprocessors”. In: *IEEE Transactions on Computers* 65.5 (2016), pp. 1453–1466. doi: 10.1109/TC.2015.2449840.

#### WHEN ETS BECAME APPROXIMATED: THE FRENCH CONNECTION

[52] M. Traiola, A. Virazel, P. Girard, M. Barbareschi, and A. Bosio. “A Survey of Testing Techniques for Approximate Integrated Circuits”. In: *Proceedings of the IEEE* 108.12 (2020), pp. 2178–2194. doi: 10.1109/JPROC.2020.2999613.

[53] L. Anghel, M. Benabdenbi, A. Bosio, M. Traiola, and E. I. Vatajelu. “Test and Reliability in Approximate Computing”. In: *Journal of Electronic Testing* 34.4 (May 2018), pp. 375–387. ISSN: 1573-0727. doi: 10.1007/s10836-018-5734-9. URL: <http://dx.doi.org/10.1007/s10836-018-5734-9>.

[54] G. Rodrigues, F. Lima Kastensmidt, and A. Bosio. “Survey on Approximate Computing and Its Intrinsic Fault Tolerance”. In: *Electronics* 9.4 (Mar. 2020), p. 557. ISSN: 2079-9292. doi: 10.3390/electronics9040557. URL: <http://dx.doi.org/10.3390/electronics9040557>.

[55] B. Deveautour, M. Traiola, A. Virazel, and P. Girard. “QAMR: an Approximation-Based Fully Reliable TMR Alternative for Area Overhead Reduction”. In: *2020 IEEE European Test Symposium (ETS)*. 2020, pp. 1–6. doi: 10.1109/ETS48528.2020.9131574.

[56] M. Traiola, J. Echavarria, A. Bosio, J. Teich, and I. O’Connor. “Design Space Exploration of Approximation-Based Quadruple Modular Redundancy Circuits”. In: *2021 IEEE/ACM International Conference On Computer Aided Design (ICCAD)*. 2021, pp. 1–9. doi: 10.1109/ICCAD51958.2021.9643561.

#### INRIA, TARAN: FROM RADIATION EXPERIMENTS TO MICROARCHITECTURE AND SOFTWARE: AN HOLISTIC PERSPECTIVE ON RELIABILITY RESEARCH

[57] A. Kritikakou, O. Sentieys, G. Hubert, Y. Helen, J.-F. Coulon, and P. Deroux-Dauphin. “FLODAM: Cross-Layer Reliability Analysis Flow for Complex Hardware Designs”. In: *25th IEEE/ACM Design, Automation and Test in Europe (DATE)*. Antwerp, Belgium: IEEE, Mar. 2022, pp. 1–6. URL: <https://hal.science/hal-03485386>.

[58] L. Roquet, F. Fernandes dos Santos, P. Rech, M. Traiola, O. Sentieys, and A. Kritikakou. “Cross-Layer Reliability Evaluation and Efficient Hardening of Large Vision Transformers Models”. In: *IEEE ACM Design Automation Conference (DAC)*. San Francisco, United States, June 2024. URL: <https://hal.science/hal-04456702>.

[59] M. Traiola, A. Kritikakou, and O. Sentieys. “harDNNing: a machine-learning-based framework for fault tolerance assessment and protection of DNNs”. In: *IEEE European Test Symposium (ETS)*. Venice, Italy: IEEE, May 2023, pp. 1–6. URL: <https://hal.science/hal-04087375>.

[60] M. Traiola, F. F. dos Santos, P. Rech, C. Cazzaniga, O. Sentieys, and A. Kritikakou. “Impact of High-Level-Synthesis on Reliability of Artificial Neural Network Hardware Accelerators”. In: *IEEE Transactions on Nuclear Science* (2024), pp. 1–9. URL: <https://inria.hal.science/hal-04514579>.

[61] P. R. Nikiema, A. Kritikakou, M. Traiola, and O. Sentieys. “Impact of Transient Faults on Timing Behavior and Mitigation with Near-Zero WCET Overhead”. In: *35th Euromicro Conference on Real-Time Systems (ECRTS)*. Vienna, Austria: Schloss Dagstuhl – Leibniz-Zentrum für Informatik, July 2023, pp. 1–22. URL: <https://hal.science/hal-04397374>.

#### HICREST: RELIABILITY IN THE AI AND QUANTUM COMPUTING ERA

[62] P. Rech. “Artificial Neural Networks for Space and Safety-Critical Applications: Reliability Issues and Potential Solutions”. In: *IEEE Transactions on Nuclear Science* 71.4 (2024), pp. 377–404. doi: 10.1109/TNS.2024.3349956.

- [63] D. Sabena, M. S. Reorda, L. Sterpone, P. Rech, and L. Carro. “On the evaluation of soft-errors detection techniques for GPGPUs”. In: *2013 8th IEEE Design and Test Symposium*. 2013, pp. 1–6. DOI: 10.1109/IDT.2013.6727092.
- [64] F. Libano, B. Wilson, J. Anderson, M. J. Wirthlin, C. Cazzaniga, C. Frost, and P. Rech. “Selective Hardening for Neural Networks in FPGAs”. In: *IEEE Transactions on Nuclear Science* 66.1 (2019), pp. 216–222. DOI: 10.1109/TNS.2018.2884460.
- [65] D. Oliveira, E. Giusto, B. Baheri, Q. Guan, B. Montrucchio, and P. Rech. “A Systematic Methodology to Compute the Quantum Vulnerability Factors for Quantum Circuits”. In: *IEEE Transactions on Dependable and Secure Computing* 21.4 (2024), pp. 2631–2644. DOI: 10.1109/TDSC.2023.3313934.
- [66] M. Vallero, G. Casagranda, F. Vella, and P. Rech. “On the Efficacy of Surface Codes in Compensating for Radiation Events in Superconducting Devices”. In: *SC24: International Conference for High Performance Computing, Networking, Storage and Analysis*. 2024, pp. 1–15. DOI: 10.1109/SC41406.2024.00075.

#### IHP: RELIABILITY-AWARE HARDWARE DESIGN FOR EMERGING APPLICATIONS

- [67] A. Simevski, O. Schrape, C. Benito, M. Krstic, and M. Andjelkovic. “PISA: Power-robust Multiprocessor Design for Space Applications”. In: *2020 IEEE 26th International Symposium on On-Line Testing and Robust System Design (IOLTS)*. 2020, pp. 1–6. DOI: 10.1109/IOLTS50870.2020.9159716.
- [68] M. Ulbricht, L. Lu, J. Chen, and M. Krstic. “The TETRISC SoC—A resilient quad-core system based on the ResiliCell approach”. In: *Microelectronics Reliability* 148 (2023), p. 115173. ISSN: 0026-2714. DOI: <https://doi.org/10.1016/j.microrel.2023.115173>. URL: <https://www.sciencedirect.com/science/article/pii/S0026271423002731>.
- [69] L. Lu, J. Chen, A. Balakrishnan, M. Ulbricht, and M. Krstic. “Accelerate SEU Simulation-Based Fault Injection With Spatio-Temporal Graph Convolutional Networks”. In: *IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems* (2025), pp. 1–1. DOI: 10.1109/TCAD.2025.3526748.
- [70] G. C. Medeiros, M. Taouil, M. Fieback, L. B. Poehls, and S. Hamdioui. “DFT Scheme for Hard-to-Detect Faults in FinFET SRAMs”. In: *2019 IEEE European Test Symposium (ETS)*. 2019, pp. 1–2. DOI: 10.1109/ETS.2019.8791517.
- [71] M. Fieback and L. M. B. Pöhls. “Lifecycle Management of Emerging Memories”. In: *2024 IEEE European Test Symposium (ETS)*. 2024, pp. 1–6. DOI: 10.1109/ETS61313.2024.10567697.

#### SYNOPSYS: ENSURING QUALITY, SAFETY AND RELIABILITY IN THE EVOLUTION FROM MONOLITHIC SOCS TO MULTI-CHIPLET DESIGNS

- [72] G. Harutyunyan, V. Vardanian, and Y. Zorian. “Minimal March Tests for Dynamic Faults in Random Access Memories”. In: *IEEE European Test Symposium (ETS'06)*. 2006, pp. 43–48. DOI: 10.1109/ETS.2006.32.
- [73] S. Shoukourian, V. Vardanian, and Y. Zorian. “SoC yield optimization via an embedded-memory test and repair infrastructure”. In: *IEEE Design & Test of Computers* 21.3 (2004), pp. 200–207. DOI: 10.1109/MDT.2004.19.
- [74] G. Harutyunyan, G. Tshagharyan, V. Vardanian, and Y. Zorian. “Fault modeling and test algorithm creation strategy for FinFET-based memories”. In: *2014 IEEE 32nd VLSI Test Symposium (VTS)*. 2014, pp. 1–6. DOI: 10.1109/VTS.2014.6818747.
- [75] G. Tshagharyan, G. Harutyunyan, and Y. Zorian. “An effective functional safety solution for automotive systems-on-chip”. In: *2017 IEEE International Test Conference (ITC)*. 2017, pp. 1–10. DOI: 10.1109/TEST.2017.8242075.
- [76] C. Argyrides, G. Tshagharyan, G. Harutyunyan, and Y. Zorian. “Utilizing ECC Analytics to Improve Memory Lifecycle Management”. In: *2023 IEEE International Test Conference (ITC)*. 2023, pp. 383–387. DOI: 10.1109/ITC51656.2023.00057.

#### SMU & UST: IDENTIFYING PROBLEMS EARLY—EFFICIENT ONLINE TEST AND PREDICTIVE DELAY DETECTION

- [77] S. Shamshiri, H. Esmaeilzadeh, and Z. Navabi. “Instruction-level test methodology for CPU core self-testing”. In: *ACM Transactions on Design Automation of Electronic Systems (TODAES)* 10.4 (2005), pp. 673–689.
- [78] E. Yassien, Y. Xu, H. Jiang, T. Nguyen, J. Dworak, T. Manikas, and K. Nepal. “Harvesting Wasted Clock Cycles for Efficient Online Testing”. In: *2023 IEEE European Test Symposium (ETS)*. IEEE. 2023, pp. 1–6.
- [79] F. Zhang, D. Hwong, Y. Sun, A. Garcia, S. Alhelaly, G. Shofner, L. Winemberg, and J. Dworak. “Putting wasted clock cycles to use: Enhancing fortuitous cell-aware fault detection with scan shift capture”. In: *Intl. Test Conf. (ITC)*. 2016, pp. 1–10. DOI: 10.1109/TEST.2016.7805828.
- [80] H. Jiang, F. Zhang, J. Dworak, K. Nepal, and T. Manikas. “Increased Detection of Hard-to-Detect Stuck-at Faults during Scan Shift”. In: *Journal of Electronic Testing: Theory and Applications (JETTA)* (Apr. 2023). ISSN: 1573-0727. DOI: 10.1007/s10836-023-06060-z.
- [81] A. Coyle, H. Jiang, J. Dworak, T. Manikas, and K. Nepal. “Dual Use Circuitry for Early Failure Warning and Test”. In: *2024 25th International Symposium on Quality Electronic Design (ISQED)*. IEEE. 2024, pp. 1–8.

#### HARDWARE DEPENDABILITY RESEARCH AT TU DELFT

- [82] A. van de Goor. *Testing Semiconductor Memories: Theory and practice*. A.J. van de Goor, Gouda, The Netherlands, 1998. ISBN: 9080427616.
- [83] M. Taouil and S. Hamdioui. “Device aware test for memory units”. Patent NL2023751B1, filed Sept. 3, 2020, granted Jan. 9, 2024. URL: <https://worldwide.espacenet.com/patent/search/family/068073139/publication/NL2023751B1?q=NL2023751B1> (visited on 07/25/2022).
- [84] M. Fieback, S. Nagarajan, R. Bishnoi, M. Tahoori, M. Taouil, and S. Hamdioui. “Testing Scouting Logic-Based Computation-in-Memory Architectures”. In: *European Test Symposium*. Vol. 2020-May. IEEE, May 1, 2020. ISBN: 978-1-72814-312-5. DOI: 10.1109/ETS48528.2020.9131604.
- [85] D. Kraak, M. Taouil, S. Hamdioui, P. Weckx, F. Catthoor, A. Chatterjee, A. Singh, H.-J. Wunderlich, and N. Karimi. “Device aging: A reliability and security concern”. In: *IEEE 23rd European Test Symposium (ETS)*. 2018, pp. 1–10.
- [86] B. Forlin, R. Husemann, L. Carro, C. Reinbrecht, S. Hamdioui, and M. Taouil. “G-PUF: An Intrinsic PUF Based on GPU Error Signatures”. In: *IEEE European Test Symposium (ETS)*. 2020, pp. 1–2. DOI: 10.1109/ETS48528.2020.9131562.

#### MACHINE LEARNING ASSISTED TESTING AND POST-MANUFACTURE TUNING OF MIXED-SIGNAL ELECTRONICS: FROM AMPLIFIERS TO DNNs

- [87] P. N. Variyam, S. Cherubal, and A. Chatterjee. “Prediction of analog performance parameters using fast transient testing”. In: *IEEE Transactions on CAD* 21.3 (2002), pp. 349–361.
- [88] X. Wang, K. Blanchard, S. Estella, and A. Chatterjee. “Alternative “safe” test of hysteretic power converters”. In: *2014 IEEE 32nd VLSI Test Symposium (VTS)*. IEEE. 2014, pp. 1–6.
- [89] S. Devarakond, D. Banerjee, A. Banerjee, S. Sen, and A. Chatterjee. “Efficient system-level testing and adaptive tuning of MIMO-OFDM wireless transmitters”. In: *2013 18th IEEE European Test Symposium (ETS)*. IEEE. 2013, pp. 1–6.
- [90] K. Ma, A. Saha, C. Amarnath, and A. Chatterjee. “Learning Assisted Post-Manufacture Testing and Tuning of RRAM-Based DNNs for Yield Recovery”. In: *2024 Design, Automation & Test in Europe Conference & Exhibition (DATE)*. IEEE. 2024, pp. 1–6.
- [91] A. Saha, K. Ma, C. Amarnath, and A. Chatterjee. “Post-Manufacture Criticality-Aware Gain Tuning of Timing Encoded Spiking Neural Networks for Yield Recovery”. In: *2024 IEEE European Test Symposium (ETS)*. IEEE. 2024, pp. 1–4.

#### KIT: BEYOND CMOS AND CMOS 2.0: RELIABLE, TESTABLE AND SECURE

- [92] D. A. Moussa, M. Hefenbrock, and M. Tahoori. “Compressed Test Pattern Generation for Deep Neural Networks”. In: *IEEE Transactions on Computers* 74.1 (2025), pp. 307–315.

- [93] V. Meyers, M. Hefenbrock, M. Sadeghipourrudsari, D. Gnad, and M. Tahoori. "Towards Functional Safety of Neural Network Hardware Accelerators: Concurrent Out-of-Distribution Detection in Hardware Using Power Side-Channel Analysis". In: *Proceedings of the 30th Asia and South Pacific Design Automation Conference*. Association for Computing Machinery, 2025, pp. 1413–1419.
- [94] S. T. Ahmed, S. Hemaram, and M. B. Tahoori. "NN-ECC: Embedding Error Correction Codes in Neural Network Weight Memories using Multi-task Learning". In: *2024 IEEE 42nd VLSI Test Symposium (VTS)*. 2024, pp. 1–7.
- [95] M. Mayahinia, S. Thomann, P. R. Genssler, C. Münch, H. Amrouch, and M. B. Tahoori. "Algorithm to Technology Co-Optimization for CiM-Based Hyperdimensional Computing". In: *2024 Design, Automation & Test in Europe Conference & Exhibition (DATE)*. 2024, pp. 1–6.
- [96] B. Sapui and M. B. Tahoori. "Side-channel Collision Attacks on Hyper-Dimensional Computing based on Emerging Resistive Memories". In: *Proceedings of the 30th Asia and South Pacific Design Automation Conference*. Association for Computing Machinery, 2025, pp. 447–453.

#### CONNECTING HIGH TO LOW ABSTRACTION LEVELS IN MAKING DESIGN AND TEST DECISIONS IN EMBEDDED SYSTEMS

- [97] M. Rajabalipanah, M. S. Roodsari, Z. Jahanpeima, G. Roascio, P. Prinetto, and Z. Navabi. "AFTAB: A RISC-V Implementation with Configurable Gateways for Security". In: *2021 IEEE East-West Design & Test Symposium (EWDTS)*. Batumi, Georgia, 2021, pp. 1–6.
- [98] M. R. Roshanshah, K. Basharkhah, and Z. Navabi. "Online Testing of a Row-Stationary Convolution Accelerator". In: *2021 IEEE European Test Symposium (ETS)*. Bruges, Belgium, 2021, pp. 1–2.
- [99] K. Basharkhah, R. S. Mirhashemi, N. Nosrati, M.-J. Zare, and Z. Navabi. "Learning Electrical Behavior of Core Interconnects for System-Level Crosstalk Prediction". In: *2023 IEEE European Test Symposium (ETS)*. Venice, Italy, 2023, pp. 1–6.
- [100] F. Mohammadzadeh, K. Basharkhah, and Z. Navabi. *Golden Signature Extraction and Test Configuration for Hardware IP Cores*. Presented at eARTS workshop. Netherlands, May 2024.
- [101] A. Joudi, N. Safari, F. Mohammadzadeh, K. Basharkhah, F. Sheikhshoaei, and Z. Navabi. "An Integrated Framework for Aging Analysis Based on an Age-Aware Cell Library". In: *IEEE European Test Symposium (ETS)*. Tallinn, Estonia, 2025.

#### ETS AND THE COMPUTER ARCHITECTURE GROUP OF THE UNIVERSITY OF STUTTGART

- [102] H. Jafarzadeh, F. Klemme, H. Amrouch, S. Hellebrand, and H.-J. Wunderlich. "Time and Space Optimized Storage-based BIST under Multiple Voltages and Variations". In: *2024 IEEE European Test Symposium (ETS)*. 2024, pp. 1–6.
- [103] A. Cook, S. Hellebrand, and H.-J. Wunderlich. "Built-in self-diagnosis exploiting strong diagnostic windows in mixed-mode test". In: *2012 17th IEEE European Test Symposium (ETS)*. 2012, pp. 1–6.
- [104] S. Holst and H.-J. Wunderlich. "Adaptive Debug and Diagnosis without Fault Dictionaries". In: *12th IEEE European Test Symposium (ETS'07)*. 2007, pp. 7–12.
- [105] M. Sauer, I. Polian, M. E. Imhof, A. Mumtaz, E. Schneider, A. Czutro, H.-J. Wunderlich, and B. Becker. "Variation-aware deterministic ATPG". In: *2014 19th IEEE European Test Symposium (ETS)*. 2014, pp. 1–6.
- [106] P. Ohler, S. Hellebrand, and H.-J. Wunderlich. "An Integrated Built-In Test and Repair Approach for Memories with 2D Redundancy". In: *12th IEEE European Test Symposium (ETS'07)*. 2007, pp. 91–96.