Recommending Log Placement Based on Code Vocabulary

Bachelor Thesis (2021)

Author(s)

K. Lyrakis (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

J. Cândido – Mentor (TU Delft - Software Engineering)

Mauricio Aniche – Mentor (TU Delft - Software Engineering)

A. Katsifodimos – Graduation committee member (TU Delft - Web Information Systems)

Faculty

Electrical Engineering, Mathematics and Computer Science

Keywords

Machine Learning, Log Recommendation, Code Vocabulary, Empirical Software Engineering

To reference this document use:

https://resolver.tudelft.nl/uuid:6c76c94e-2b89-4712-9125-ccc100f764b5

Publication Year

2021

Language

English

Graduation Date

02-07-2021

Awarding Institution

Delft University of Technology

Project

CSE3000 Research Project

Programme

Computer Science and Engineering

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Logging is a common and vitally important practice that lets developers collect runtime information from a system. This information is used to monitor the system's performance in production and to diagnose the causes of failures. Despite its importance, logging remains a manual and difficult process: developers rely on their experience and domain expertise to decide where to place log statements. In this paper, we attempt to suggest log placement automatically by treating code as plain text drawn from a vocabulary. Our hypothesis is that this Code Vocabulary can indicate whether a code snippet should be logged. To test it, we trained machine learning models solely on the Code Vocabulary to suggest log placement at method level. We also studied which words of the Code Vocabulary matter most when deciding where to place log statements. We evaluated our approach on three open source systems and found that (i) the Code Vocabulary is a strong source of training data for suggesting log placement at method level, and (ii) classifiers trained solely on vocabulary data are hard to interpret, because no words in the Code Vocabulary are significantly more valuable than others.
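
To make the idea of classifying methods from their vocabulary concrete, the following is a minimal sketch in Python. It is not the thesis's implementation: the abstract does not name the models, the tokenization rule, or the data used, so the identifier-splitting rule, the multinomial Naive Bayes classifier, the names `tokenize` and `VocabularyNaiveBayes`, and the four toy Java-like methods are all assumptions chosen for illustration.

```python
import math
import re
from collections import Counter


def tokenize(code):
    """Split source text into lowercase vocabulary words.

    Identifiers are split on camelCase and underscores. This splitting
    rule is an assumption, not taken from the thesis.
    """
    words = []
    for ident in re.findall(r"[A-Za-z_][A-Za-z0-9_]*", code):
        for part in re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+", ident):
            words.append(part.lower())
    return words


class VocabularyNaiveBayes:
    """Multinomial Naive Bayes over method vocabulary (illustrative only)."""

    def fit(self, methods, labels):
        # One word-frequency table per class: logged (True) / not logged (False).
        self.word_counts = {True: Counter(), False: Counter()}
        self.class_counts = Counter(labels)
        for code, label in zip(methods, labels):
            self.word_counts[label].update(tokenize(code))
        self.vocab = set(self.word_counts[True]) | set(self.word_counts[False])
        return self

    def log_score(self, code, label):
        counts = self.word_counts[label]
        # Laplace (add-one) smoothing over the training vocabulary.
        total = sum(counts.values()) + len(self.vocab)
        score = math.log(self.class_counts[label] / sum(self.class_counts.values()))
        for word in tokenize(code):
            if word in self.vocab:  # words never seen in training are ignored
                score += math.log((counts[word] + 1) / total)
        return score

    def predict(self, code):
        """Return True if the method is predicted to need a log statement."""
        return self.log_score(code, True) > self.log_score(code, False)


# Toy training data, invented for this sketch (not from the evaluated systems).
train_methods = [
    "void connect() { try { socket.open(); } catch (IOException e) { retry(); } }",
    "void loadConfig() { try { readFile(path); } catch (IOException e) { useDefaults(); } }",
    "int getSize() { return size; }",
    "String getName() { return name; }",
]
train_labels = [True, True, False, False]

model = VocabularyNaiveBayes().fit(train_methods, train_labels)
print(model.predict("void save() { try { writeFile(path); } catch (IOException e) { abort(); } }"))  # True
print(model.predict("int getCount() { return count; }"))  # False
```

In this sketch the classifier sees only word counts, with no syntactic or structural features, which mirrors the "solely on the Code Vocabulary" setup described above. The per-word smoothed probabilities are also the natural place to look when asking which words influence a prediction most.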

Files

Log_recommendation_paper.pdf

(pdf | 0.283 Mb)

License info not available