Modeling System Behaviour from Log Analysis Using Meta-Heuristic Search

More Info
expand_more

Abstract

Understanding the behaviour of a software system plays a key role in its development and maintenance processes. Unfortunately, accurate and concise models are not always available during development, due to the rapid changes in the structure and scale systems may undergo during this phase. Finite State Machines (FSM) are a natural and prevalent template for modeling system behaviour, and inferring such models through profiling, source code analysis, and log analysis has been an active area of research over the last decades. However, the applicability of existing techniques is restricted by different factors: profiling introduces significant overhead, which is infeasible for many real-time systems; the source code of a system may not always be available for analysis; and current log analysis approaches do not scale well with the number of logs produced by the system. In this paper, four meta-heuristic search approaches are used to create FSM models for the XRP Ledger consensus algorithm from its logs, addressing the scalability problem by approximating a FSM while only exploring a small portion of the search space. Random Selection Stochastic Hill Climber (RSSHC), Tournament Selection Stochastic Hill Climber (TSSHC), Simulated Annealing (SA), and Pareto Simulated Annealing (PSA) are considered. The performance of the model produced by each algorithm is quantified using an accuracy-based evaluation technique, and the accuracy, conciseness, and runtime of the approaches are compared. The results indicate that PSA produces the most accurate and least concise models with a consistent accuracy of over 92%. RSSHC and TSSHC obtain solutions with a slightly lower accuracy, but significantly better conciseness, while SA produces the most compact and least accurate models. All four algorithms produce results within minutes and scale linearly with the size of the problem.