X.D.M. Devroey
Researchers and practitioners have designed and implemented various automated test case generators to support effective software testing. Such generators exist for various languages (e.g., Java, C#, or Python) and various platforms (e.g., desktop, web, or mobile applications). The generators exhibit varying effectiveness and efficiency, depending on the testing goals they aim to satisfy (e.g., unit-testing of libraries versus system-testing of entire applications) and the underlying techniques they implement. In this context, practitioners need to be able to compare different generators to identify the one best suited to their requirements, while researchers seek to identify future research directions. This can be achieved by systematically executing large-scale evaluations of different generators. However, executing such empirical evaluations is not trivial and requires substantial effort to select appropriate benchmarks, set up the evaluation infrastructure, and collect and analyse the results. In this Software Note, we present our JUnit Generation Benchmarking Infrastructure (JUGE) supporting generators (search-based, random-based, symbolic execution, etc.) seeking to automate the production of unit tests for various purposes (validation, regression testing, fault localization, etc.). The primary goal is to reduce the overall benchmarking effort, ease the comparison of several generators, and enhance the knowledge transfer between academia and industry by standardizing the evaluation and comparison process. Since 2013, several editions of a unit testing tool competition, co-located with the Search-Based Software Testing Workshop, have taken place where JUGE was used and evolved. As a result, an increasing number of tools (over 10) from academia and industry have been evaluated on JUGE, matured over the years, and allowed the identification of future research directions.
Based on the experience gained from the competitions, we discuss the expected impact of JUGE in improving the knowledge transfer on tools and approaches for test generation between academia and industry. Indeed, the JUGE infrastructure demonstrated an implementation design that is flexible enough to enable the integration of additional unit test generation tools, which is practical for developers and allows researchers to experiment with new and advanced unit testing tools and approaches.
Search-based techniques have been widely used for white-box test generation. Many of these approaches rely on the approach level and branch distance heuristics to guide the search process and generate test cases with high line and branch coverage. Despite the positive results achieved by these two heuristics, they only use the information related to the coverage of explicit branches (e.g., indicated by conditional and loop statements), but ignore potential implicit branching within basic blocks of code. If such implicit branching happens at runtime (e.g., if an exception is thrown in a branchless method), the existing fitness functions cannot guide the search process. To address this issue, we introduce a new secondary objective, called Basic Block Coverage (BBC), which takes into account the coverage level of relevant basic blocks in the control flow graph. We evaluated the impact of BBC on search-based unit test generation (using the DynaMOSA algorithm) and search-based crash reproduction (using the STDistance and WeightedSum fitness functions). Our results show that for unit test generation, BBC improves the branch coverage of the generated tests. Although small (∼1.5%), this improvement in the branch coverage is systematic and leads to an increase of the output domain coverage and implicit runtime exception coverage, and of the diversity of runtime states. In terms of crash reproduction, in combination with STDistance and WeightedSum, BBC helps reproduce 3 new crashes for each fitness function. BBC significantly decreases the time required to reproduce 43.5% and 45.1% of the crashes using STDistance and WeightedSum, respectively. For these crashes, BBC reduces the consumed time by 71.7% (for STDistance) and 68.7% (for WeightedSum) on average.
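To make the two heuristics named in this abstract concrete, the sketch below shows the textbook form of branch distance and approach level, followed by a tie-breaking comparison that stands in for the idea of a secondary objective. It is a minimal illustration under stated assumptions, not the BBC definition from the paper: the constant `K`, the function names, and the use of a single block-coverage ratio as the tie-breaker are all hypothetical choices made for this example.

```python
K = 1.0  # assumed penalty added when a comparison is not satisfied


def branch_distance(op: str, lhs: float, rhs: float) -> float:
    """Return 0 if `lhs op rhs` holds, else how far the operands are from making it hold."""
    if op == "==":
        return abs(lhs - rhs)
    if op == "!=":
        return 0.0 if lhs != rhs else K
    if op == "<":
        return 0.0 if lhs < rhs else (lhs - rhs) + K
    if op == "<=":
        return 0.0 if lhs <= rhs else (lhs - rhs)
    raise ValueError(f"unsupported operator: {op}")


def normalize(distance: float) -> float:
    """Map a non-negative distance into [0, 1) so it never outweighs one approach level."""
    return distance / (distance + 1.0)


def fitness(approach_level: int, distance: float) -> float:
    """Classic minimisation fitness: approach level plus normalised branch distance."""
    return approach_level + normalize(distance)


def better(a: tuple, b: tuple) -> bool:
    """Compare (fitness, block_coverage_ratio) pairs.

    Lower fitness wins; on a tie, the candidate covering more of the relevant
    basic blocks wins. This tie-break is a simplified stand-in for a secondary
    objective and is an assumption of this sketch.
    """
    return (a[0], -a[1]) < (b[0], -b[1])


# A test that is one control dependency away from the target and misses
# the condition `x == 7` with x = 4:
example = fitness(1, branch_distance("==", 4, 7))  # 1 + 3/4 = 1.75
```

The tie-break only matters when the primary fitness cannot distinguish two candidates, which is the situation the abstract describes for implicit branching inside a basic block.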
This is an extended abstract of the article: Pouria Derakhshanfar, Xavier Devroey, Gilles Perrouin, Andy Zaidman and Arie van Deursen. 2019. Search-based crash reproduction using behavioural model seeding. In: Software Testing, Verification and Reliability (May 2020). http://doi.org/10.1002/stvr.1733.
Business processes have to manage variability in their execution, e.g., to deliver the correct building permit in different municipalities. This variability is visible in event logs, where sequences of events are shared by the core process (building permit authorisation) but may also be specific to each municipality. To rationalise resources (e.g., derive a configurable business process capturing all municipalities' permit variants) or to debug anomalous behaviour, it is necessary to identify the variant to which a given trace belongs. This paper supports this task by training Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) networks on two datasets: a configurable municipality and a travel expenses workflow. We demonstrate that variability can be identified accurately (>87%) and discuss the challenges of learning highly entangled variants.
This study presents the initial step towards a thorough analysis of the difficulty of reproducing a crash using search-based crash reproduction. Traditionally, code size and complexity are considered representative indicators of the difficulty for search-based approaches, like search-based unit test generation, to generate tests. However, unlike unit test generation, crash reproduction does not seek to cover a set of behaviors but instead to generate one or more tests exercising a specific behavior reproducing a given crash. In this context, there is no guarantee that the indicators used for unit testing are still valid for crash reproduction. In this study, we seek to identify such indicators by considering various code metrics, code smells, and change metrics. We report our effort to collect those metrics for JCRASHPACK, a state-of-the-art crash reproduction benchmark, and an initial assessment by considering metrics individually. Our results show that although JCRASHPACK is larger than benchmarks used in previous studies, additional crashes should be added to improve its diversity and representativeness, and that no individual metric can be used to characterize the difficulty of reproducing a crash.
Objective: This study investigates the effects of the pandemic on developers’ wellbeing and productivity.
Method: A questionnaire survey was created mainly from existing, validated scales and translated into 12 languages. The data was analyzed using non-parametric inferential statistics and structural equation modeling.
Results: The questionnaire received 2225 usable responses from 53 countries. Factor analysis supported the validity of the scales and the structural model achieved a good fit (CFI = 0.961, RMSEA = 0.051, SRMR = 0.067). Confirmatory results include: (1) the pandemic has had a negative effect on developers’ wellbeing and productivity; (2) productivity and wellbeing are closely related; (3) disaster preparedness, fear related to the pandemic and home office ergonomics all affect wellbeing or productivity. Exploratory analysis suggests that: (1) women, parents and people with disabilities may be disproportionately affected; (2) different people need different kinds of support.
Conclusions: To improve employee productivity, software companies should focus on maximizing employee wellbeing and improving the ergonomics of employees’ home offices. Women, parents and disabled persons may require extra support.
Demo. video: https://www.youtube.com/watch?v=k6XaQjHqe48
Botsing website: https://stamp-project.github.io/botsing/
through Advanced Software Engineering and Artificial Intelligence (EASEAI 2020)
to be held virtually, November 9, 2020, co-located with ESEC/FSE 2020.
In recent years, with the development and spread of digital technologies, everyday life has been profoundly transformed. The general public, as well as specialized audiences, have to face an ever-increasing amount of knowledge and learn new abilities. The EASEAI workshop addresses that challenge by looking at software engineering, education, and artificial intelligence research fields to explore how they can be combined. Specifically, this workshop brings together researchers, teachers, and practitioners who use advanced software engineering tools and artificial intelligence techniques in the education field and through a transgenerational and transdisciplinary range of students to discuss the current state of the art and practices, and establish new future directions. In total, EASEAI 2020 received 13 submissions, out of which 5 papers were accepted after a thorough review process. Three members of the program committee reviewed each submission. We also received two presentation abstracts, both selected for presentation during the workshop. We sincerely thank the program committee members, authors, and participants who will make EASEAI an exciting and successful event!