CG
C.A. Georgescu
info
Please Note
<p>This page displays the records of the person named above and is not linked to a unique person identifier. This record may need to be merged to a profile.</p>
2 records found
1
Kotlin is a programming language best known for its interoperability with Java, as well as the measurable improvements it offers over it. Since it became Android’s go-to language in 2019, the popularity and impact of Kotlin have risen greatly. Amidst this surge in popularity, the Kotlin developer team is working on a new version of the compiler that introduces sweeping changes to the ecosystem. Traditional compiler testing is a manual and laborious task that requires extensive developer effort and expertise. In an attempt to mitigate this, researchers have invested great resources in developing and perfecting automated compiler testing tools over the last decades. These approaches generate new pieces of code to test the behavior of compilers, which is assessed through differential testing. However, the usage of heuristics as guidance for the generative process is not well understood, and no approach that generates Kotlin code from scratch currently exists. In this thesis, we propose a novel method of enriching standard grammar specifications with language-targeted semantic context that is integrated in the sampling process. We structure generated code hierarchically and use it as the base of an evolutionary computation framework. Within this framework, we introduce two classes of algorithms that are novel to the field of compiler fuzzing, based on syntactic diversity and semantic proximity, respectively. We carry out an empirical analysis spanning 200K generated Kotlin files, which we analyzed through different Kotlin compiler versions. Our results uncovered five previously unreported categories of bugs, which we reported to the Kotlin compiler developer team. The developers verified and replicated our instances on the current re- lease of the Kotlin compiler, and have assigned target release dates for fixes within the current major version of the compiler. The study also provides new insight into the effects of heuristic-specific hyperparameters such as expression simplicity, dissimilarity measurements, and target selection.
...
Kotlin is a programming language best known for its interoperability with Java, as well as the measurable improvements it offers over it. Since it became Android’s go-to language in 2019, the popularity and impact of Kotlin have risen greatly. Amidst this surge in popularity, the Kotlin developer team is working on a new version of the compiler that introduces sweeping changes to the ecosystem. Traditional compiler testing is a manual and laborious task that requires extensive developer effort and expertise. In an attempt to mitigate this, researchers have invested great resources in developing and perfecting automated compiler testing tools over the last decades. These approaches generate new pieces of code to test the behavior of compilers, which is assessed through differential testing. However, the usage of heuristics as guidance for the generative process is not well understood, and no approach that generates Kotlin code from scratch currently exists. In this thesis, we propose a novel method of enriching standard grammar specifications with language-targeted semantic context that is integrated in the sampling process. We structure generated code hierarchically and use it as the base of an evolutionary computation framework. Within this framework, we introduce two classes of algorithms that are novel to the field of compiler fuzzing, based on syntactic diversity and semantic proximity, respectively. We carry out an empirical analysis spanning 200K generated Kotlin files, which we analyzed through different Kotlin compiler versions. Our results uncovered five previously unreported categories of bugs, which we reported to the Kotlin compiler developer team. The developers verified and replicated our instances on the current re- lease of the Kotlin compiler, and have assigned target release dates for fixes within the current major version of the compiler. The study also provides new insight into the effects of heuristic-specific hyperparameters such as expression simplicity, dissimilarity measurements, and target selection.
Understanding the behaviour of a software system plays a key role in its development and maintenance processes. Unfortunately, accurate and concise models are not always available during development, due to the rapid changes in the structure and scale systems may undergo during this phase. Finite State Machines (FSM) are a natural and prevalent template for modeling system behaviour, and inferring such models through profiling, source code analysis, and log analysis has been an active area of research over the last decades. However, the applicability of existing techniques is restricted by different factors: profiling introduces significant overhead, which is infeasible for many real-time systems; the source code of a system may not always be available for analysis; and current log analysis approaches do not scale well with the number of logs produced by the system. In this paper, four meta-heuristic search approaches are used to create FSM models for the XRP Ledger consensus algorithm from its logs, addressing the scalability problem by approximating a FSM while only exploring a small portion of the search space. Random Selection Stochastic Hill Climber (RSSHC), Tournament Selection Stochastic Hill Climber (TSSHC), Simulated Annealing (SA), and Pareto Simulated Annealing (PSA) are considered. The performance of the model produced by each algorithm is quantified using an accuracy-based evaluation technique, and the accuracy, conciseness, and runtime of the approaches are compared. The results indicate that PSA produces the most accurate and least concise models with a consistent accuracy of over 92%. RSSHC and TSSHC obtain solutions with a slightly lower accuracy, but significantly better conciseness, while SA produces the most compact and least accurate models. All four algorithms produce results within minutes and scale linearly with the size of the problem.
...
Understanding the behaviour of a software system plays a key role in its development and maintenance processes. Unfortunately, accurate and concise models are not always available during development, due to the rapid changes in the structure and scale systems may undergo during this phase. Finite State Machines (FSM) are a natural and prevalent template for modeling system behaviour, and inferring such models through profiling, source code analysis, and log analysis has been an active area of research over the last decades. However, the applicability of existing techniques is restricted by different factors: profiling introduces significant overhead, which is infeasible for many real-time systems; the source code of a system may not always be available for analysis; and current log analysis approaches do not scale well with the number of logs produced by the system. In this paper, four meta-heuristic search approaches are used to create FSM models for the XRP Ledger consensus algorithm from its logs, addressing the scalability problem by approximating a FSM while only exploring a small portion of the search space. Random Selection Stochastic Hill Climber (RSSHC), Tournament Selection Stochastic Hill Climber (TSSHC), Simulated Annealing (SA), and Pareto Simulated Annealing (PSA) are considered. The performance of the model produced by each algorithm is quantified using an accuracy-based evaluation technique, and the accuracy, conciseness, and runtime of the approaches are compared. The results indicate that PSA produces the most accurate and least concise models with a consistent accuracy of over 92%. RSSHC and TSSHC obtain solutions with a slightly lower accuracy, but significantly better conciseness, while SA produces the most compact and least accurate models. All four algorithms produce results within minutes and scale linearly with the size of the problem.