Title: LogChunks: A Data Set for Build Log Analysis
Author: Brandt, C.E. (TU Delft Software Engineering); Panichella, A. (TU Delft Software Engineering); Zaidman, A.E. (TU Delft Software Engineering); Beller, M.M. (TU Delft Software Engineering)
Date: 2020
Abstract: Build logs are textual by-products that a software build process creates, often as part of its Continuous Integration (CI) pipeline. Build logs are a paramount source of information for developers when debugging and understanding a build failure. Recently, attempts to partly automate this time-consuming, purely manual activity have come up, such as rule-based or information-retrieval-based techniques. We believe that having a common data set to compare different build log analysis techniques will advance the research area. It will ultimately increase our understanding of CI build failures. In this paper, we present LogChunks, a collection of 797 annotated Travis CI build logs from 80 GitHub repositories in 29 programming languages. For each build log, LogChunks contains a manually labeled log part (chunk) describing why the build failed. We externally validated the data set with the developers who caused the original build failure. The breadth and depth of the LogChunks data set are intended to make it the default benchmark for automated build log analysis techniques.
Subject: Continuous Integration; Build Log Analysis; Build Failure; Chunk Retrieval; CI
To reference this document use: http://resolver.tudelft.nl/uuid:7f7f9d19-339f-4f59-b6ba-f92bae1fa447
DOI: https://doi.org/10.1145/3379597.3387485
Embargo date: 2022-07-01
ISBN: 9781450379571
Source: Proceedings - 2020 IEEE/ACM 17th International Conference on Mining Software Repositories, MSR 2020
Event: 17th International Conference on Mining Software Repositories, 2020-10-05 → 2020-10-06, Seoul, Republic of Korea
Series: Proceedings - 2020 IEEE/ACM 17th International Conference on Mining Software Repositories, MSR 2020
Bibliographical note: Virtual/online event due to COVID-19. Green Open Access added to TU Delft Institutional Repository ‘You share, we take care!’ – Taverne project https://www.openaccess.nl/en/you-share-we-take-care. Otherwise as indicated in the copyright section: the publisher is the copyright holder of this work and the author uses the Dutch legislation to make this work public.
Part of collection: Institutional Repository
Document type: conference paper
Rights: © 2020 C.E. Brandt, A. Panichella, A.E. Zaidman, M.M. Beller
Files: PDF, 3379597.3387485.pdf, 265.5 KB