Approximation Algorithms for Model-Based Diagnosis

More Info
expand_more

Abstract

Model-based diagnosis is an area of abductive inference that uses a system model, together with observations about system behavior, to isolate sets of faulty components (diagnoses) that explain the observed behavior, according to some minimality criterion. This thesis presents greedy approximation algorithms for three problems closely related to model-based diagnosis: (1) computation of cardinality-minimal diagnoses, (2) computation of max-fault min-cardinality observation vectors, and (3) computation of control assignments that optimally reduce the expected number of remaining cardinality-minimal diagnoses. All three problems are NP-hard or worse (for example, problem (1) is known to be in the second class of the polynomial hierarchy for arbitrary propositional models), thus making deterministic algorithms impractical for solving large models. The three main algorithms of this thesis are SAFARI (StochAstic Fault diagnosis AlgoRIthm) for problem (1), MIRANDA (Max-fault mIn-caRdinAlity observatioN Deduction Algorithm) for problem (2), and FRACTAL-G (FRamework for ACtive Testing ALgorithms - Greedy) for problem (3). These three algorithms, the analysis of their properties, the empirical research about their performance, and the comparison to other approaches are the main theoretical contributions of this thesis. SAFARI, MIRANDA, and FRACTAL-G employ approximation methods such as greedy stochastic search, stochastic sampling (in the case of FRACTAL-G) and, as a result, efficiently solve worst-case NP-hard or worse problems by trading relatively small degradation in the optimality of the diagnostic results for much larger gain in the computational performance (in some cases several orders-of-magnitude improvement in speed). This efficient trade-off is partly due to exploitation of specific properties in the search landscape of the diagnostic problems. All three algorithms achieve good optimality at low computational cost in diagnostic search problems that exhibit certain amount of continuity. We have validated the theoretical claims related to SAFARI, MIRANDA, and FRACTAL-G with extensive experimentation on fault-models of 74XXX/ISCAS85 combinational circuits. These models are of variable size (19 - 3512 components), different weakness (ignorance of abnormal behaviour, stuck-at), and different topology. We show analytically that for a large class of models (weak-fault models), SAFARI can be configured to efficiently compute one subset-minimal diagnosis. In the general case (strong-fault model) and in the case of multiple diagnoses we show topologically-dependent probabilistic properties of SAFARI. For example, in the case of multiple diagnoses, SAFARI computes diagnoses of minimal-cardinality with probability which is negative exponential to the cardinality of the minimal-cardinality diagnosis. SAFARI is computationally very efficient. For example, SAFARI computes a subset-minimal diagnosis in a weak-fault model of a digital circuit having more than 1500 gates in less than 1 s and the memory footprint is less than 24 Mb of RAM. The average degradation in the optimality of the diagnoses computed by SAFARI is 13% only (averaged over all models and observations leading to a single or double-fault). We have compared SAFARI to a range of deterministic and stochastic algorithms for computation of cardinality-minimal diagnoses. State-of-the-art deterministic algorithms such as HA* and CDA* are typically several orders-of-magnitude slower than SAFARI. Furthermore HA* and CDA* often time out with higher cardinality faults (and larger circuits) whereas SAFARI is not sensitive to the cardinality of the cardinality-minimal diagnosis. SAFARI has resemblance to Max-SAT and to further empirically analyze the performance and optimality of SAFARI, we have implemented a method for computing cardinality-minimal diagnoses that uses a deterministic partial Max-SAT solver. In addition to that we have compared SAFARI to a diagnostic algorithm based on SLS Max-SAT. We have established that the performance of the complete Max-SAT and the optimality of the SLS-based Max-SAT degrade when increasing the circuit size or the cardinality of the faults. For example, a diagnosis algorithm based on W-MaxSatz computed diagnoses with only 9.2% of the c880 observation vectors, whereas SAFARI always computes at least one nearly optimal diagnosis. Algorithms based on SLS-based Max-SAT, typically did not compute any diagnoses or computed very suboptimal diagnoses (for example one of the best performing Max-SAT algorithms, SAPS, found a single-fault diagnosis in c7552 after 77264 restarts, while SAFARI computed the same diagnosis with 11 restarts only). In this thesis, we propose an algorithm, MIRANDA, for creating a benchmark of "difficult" observation vectors for testing the performance and optimality of diagnostic algorithms such as SAFARI. MIRANDA computes observation vectors leading to faults of high cardinality. These, so-called MFMC observation vectors, lead to diagnoses of cardinality close to the maximally distinguishable given the fault-model. For small combinational circuits (74XXX) the approximations of MIRANDA coincide with the optimal MFMC observation vectors (in terms of diagnostic cardinality) and these results are computed several orders of magnitude faster than exhaustive algorithms. With large circuits, MIRANDA computes an MFMC observation vector leading to a diagnosis of approximate cardinality of 36 in less than 6 min (c5315). The reason for the good performance of MIRANDA is the fact that it exploits continuity in the space of output assignments in a class of well-formed benchmark circuits. MIRANDA searches the space of all possible observations, and, at each step, uses SAFARI to estimate the cardinality of the cardinality-minimal diagnosis. By varying the number of MIRANDA and SAFARI retries, we have empirically established lower bounds on the approximation error of MIRANDA. The theoretical analysis of the cardinality estimation optimality of SAFARI leads us to believe that the small lower bound on the error of MIRANDA is a good approximation of the overall approximation error and hence, the MFMC values and observation vectors computed by MIRANDA are nearly optimal. FRACTAL-G is an algorithm that, given a model and an observation leading to multiple minimal-cardinality diagnoses, computes a control assignment (a test) that optimally reduces the set of diagnoses. As the problem is computationally hard, we use greedy stochastic search and stochastic sampling. The result is a fast algorithm (computing a whole FRACTAL scenario takes between 1 s for 74182 and 15 min for c7552) that decreases the diagnostic uncertainty according to a near-geometric decay curve. The reason for the efficiency of FRACTAL-G is that it exploits continuity in the space of control assignments. Experimentation shows that this is true even for a small number of control inputs. FRACTAL-G has achieved better exponential decay compared to alternative approaches, except exhaustive control search. The difference in the decay rate between FRACTAL-G and exhaustive search for 74182 is 5.4%. The exhaustive control approach, however, takes minutes to complete even for 74182 and times out with any model having more than 20 controls. We have implemented a tool kit for system modeling called LYDIA (Language for sYstem DIAgnosis). LYDIA is also the framework in which we have developed all the algorithms in this thesis. In addition to the implementations of SAFARI, MIRANDA, and FRACTAL-G, the LYDIA tool kit also contains model translators (e.g., to CNF, DNF, OBDDs, etc), utilities, and all reference algorithms. We have applied LYDIA to, amongst others, (1) a model of the Electrical Power System (EPS) testbed in the ADAPT lab at NASA Ames Research Center, as a participant in the First International Diagnostic Competition (DXC'09), and (2) a Paper Input Module (PIM), part of a heavy-duty printer of Oce Technologies. The ADAPT EPS consists of electrical components only and the PIM model is a combination of electrical, mechanical and pneumatic ones. The results from the applications of LYDIA to models of physical systems such as ADAPT and PIM are consistent with the results from experiments on models of digital circuits. For example, the average running time of SAFARI for diagnosing ADAPT, a system that is modeled with 723 variables and 1567 clauses, was below 6 s which makes SAFARI one of the fastest MBD algorithms today.

Files