LogChunks: A Data Set for Build Log Analysis

Conference Paper (2020)
Author(s)

C.E. Brandt (TU Delft - Software Engineering)

Annibale Panichella (TU Delft - Software Engineering)

Andy Zaidman (TU Delft - Software Engineering)

M.M. Beller (TU Delft - Software Engineering)

Research Group
Software Engineering
Copyright
© 2020 C.E. Brandt, A. Panichella, A.E. Zaidman, M.M. Beller
DOI related publication
https://doi.org/10.1145/3379597.3387485
Publication Year
2020
Language
English
Pages (from-to)
583-587
ISBN (electronic)
9781450379571
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Build logs are textual by-products that a software build process creates, often as part of its Continuous Integration (CI) pipeline. Build logs are a paramount source of information for developers when debugging and understanding a build failure. Recently, attempts to partly automate this time-consuming, purely manual activity have emerged, such as rule-based or information-retrieval-based techniques. We believe that a common data set for comparing different build log analysis techniques will advance the research area and ultimately increase our understanding of CI build failures. In this paper, we present LogChunks, a collection of 797 annotated Travis CI build logs from 80 GitHub repositories in 29 programming languages. For each build log, LogChunks contains a manually labeled log part (chunk) describing why the build failed. We externally validated the data set with the developers who caused the original build failure. The breadth and depth of the LogChunks data set are intended to make it the default benchmark for automated build log analysis techniques.

Files

3379597.3387485.pdf
(pdf | 0.259 Mb)
- Embargo expired on 01-07-2022
License info not available