GN

G. Nazarian

info

Please Note

3 records found

Doctoral thesis (2019) - Ghazaleh Nazarian
Microprocessors are used in an expanding range of applications from small embedded system devices to supercomputers and mainframes. Moreover, embedded microprocessor based systems became essential in modern societies. Depending on the application domain, embedded systems have to satisfy different constraints. The major challenges today are cost, performance, energyconsumption, reliability, real-time (reactive-operation) and silicon area. In traditional computer systems some of these constraints can be less crucial than others, while performance, area and power-consumption will always remain valid constraints for embedded systems. However, in modern systems reliability has emerged as a new, highly important requirement. Among all above factors performance, power, reactive-operation and reliability can be addressed by software-only techniques that do not require any hardware modifications or additions. Such optimization techniques, however, may impact the performance and power characteristics of the system. The main goal of this work is to find novel software based reliability techniques with affordable power and performance overheads. For this reason the reliability optimization methods are studied in detail and a diligent categorization of existing software techniques is proposed. The strong and the weak points of each category are carefully studied. Using the information obtained from our categorization, two novel optimization techniques for fault detection and one new optimization technique for fault recovery are proposed. Our optimization techniques minimize the required code instrumentation points while guaranteeing equivalent reliability as compared to state of the art approaches. Moreover, a generic methodology is proposed to help with the process of identifying the minimum set of code instrumentation points. For the evaluation we select a challenging baseline that consists of the best known techniques for fault detection and fault recovery found in the public literature. The experimental results on a set of biomedical benchmarks show that using the proposed design methodology and fault detection and recovery methods, the performance and power overheads are significantly reduced while the fault coverage remains in line with previously proposed and widely used methods. ...
Conference paper (2015) - Diego Rodrigues, Ghazaleh Nazarian, Álvaro Moreira, Luigi Carro, Georgi Gaydadjiev
Software-based methods for the detection of control-flow errors caused by transient fault usually consist in the introduction of protecting instructions both at the beginning and at the end of basic blocks. These methods are conservative in nature, in the sense that they assume that all blocks have the same probability of being the target of control flow errors. Because of that assumption they can lead to a considerable increase both in memory and performance overhead during execution time. In this paper, we propose a static analysis that provide a more refined information about which basic blocks can be the target of control-flow-errors caused by single-bit flips. This information can then be used to guide a program transformation in which only susceptible blocks have to be protected. We implemented the static analysis and program transformation in the context of the LLVM framework and performed an extensive fault injection campaign. Our experiments show that this less conservative approach can potentially lead to gains both in memory usage and in execution time while keeping high fault coverage. ...