Analysing the Impact of Inline Comments for the Task of Code Captioning

Bachelor Thesis (2022)
Author(s)

V. Bacevičius (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

Annibale Panichella – Mentor (TU Delft - Software Engineering)

L.H. Applis – Mentor (TU Delft - Software Engineering)

B.H.M. Gerritsen – Graduation committee member (TU Delft - Computer Science & Engineering-Teaching Team)

Faculty
Electrical Engineering, Mathematics and Computer Science
Copyright
© 2022 Vidas Bacevičius
More Info
Publication Year
2022
Language
English
Graduation Date
24-06-2022
Awarding Institution
Delft University of Technology
Project
CSE3000 Research Project
Programme
Computer Science and Engineering
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

AI-assisted development tools use machine learning models to help developers with tasks such as Method Name Generation, Code Captioning, and Smart Bug Finding. A common practice among data scientists training these models is to omit inline code comments from the training data. We hypothesize that including inline comments in the training code gives the model more information and improves its performance on natural-language-related tasks, specifically Code Captioning. We adapt one such model, code2seq, to include inline comments in its data processing, then train it and compare it against a version trained without comments. We find that including inline comments tends to improve the model: it becomes faster and produces more verbose results. Finally, we reflect on these results to formulate suggestions for improving upon this body of research.
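
To make the two training conditions concrete, the sketch below derives a commentless variant of a Java method from a commented one. It is an illustration only, not the thesis's preprocessing or the code2seq extractor: the helper `strip_inline_comments` and the sample `METHOD` are hypothetical names, and the scan assumes that only `//` comments matter (block comments and text blocks are not handled).

```python
def strip_inline_comments(java_source: str) -> str:
    """Remove `//` comments that sit outside string and char literals.

    Hypothetical helper for illustration; block comments are left untouched.
    """
    cleaned_lines = []
    for line in java_source.splitlines():
        quote = None      # quote character of the literal we are inside, if any
        cut = len(line)   # index where the inline comment starts
        i = 0
        while i < len(line):
            ch = line[i]
            if quote:
                if ch == "\\":
                    i += 1            # skip the escaped character
                elif ch == quote:
                    quote = None      # literal closed
            elif ch in "\"'":
                quote = ch            # literal opened
            elif line.startswith("//", i):
                cut = i               # comment begins here
                break
            i += 1
        cleaned_lines.append(line[:cut].rstrip())
    return "\n".join(cleaned_lines)


# Assumed sample input: one Java method with two inline comments.
METHOD = '''int sum(int[] xs) {
    int total = 0; // running sum
    String url = "http://x"; // literal stays intact
    return total;
}'''

with_comments = METHOD                            # variant that keeps comments
without_comments = strip_inline_comments(METHOD)  # commentless baseline variant
```

Feeding `with_comments` to one training run and `without_comments` to the other mirrors the comparison described above; how the comments are actually represented in the model input is not specified here.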
