An Empirical Analysis of InCoder on the Statement Prediction Task
F.N.M. van der Heijden (TU Delft - Electrical Engineering, Mathematics and Computer Science)
Maliheh Izadi – Mentor (TU Delft - Software Engineering)
A. van Deursen – Mentor (TU Delft - Software Technology)
A. Lukina – Graduation committee member (TU Delft - Algorithmics)
Code used to collect the P1K-22 and TS1K-22 datasets and to compute the predictions and metrics in the InCoder analysis:
https://github.com/FrankHeijden/incoder-analysis
Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.
Abstract
Automatic code completion is a widely used feature that helps developers write code efficiently. Completions can be produced by various code language models and fall into three categories: single-token completion, statement (line) completion, and block completion. Completions, and statement predictions in particular, are usually generated from the left context only, which ignores key information in the right context. InCoder, a novel state-of-the-art model, can use both contexts. In this study we aim to show the impact of using both contexts for statement completion. The results show that using both contexts instead of only the left context improves exact match by 9.9% on average, with similar improvements for Edit Similarity, BLEU-4, ROUGE-L F1, and METEOR.
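To make two of the reported metrics concrete, the following is a minimal sketch of how exact match and Edit Similarity could be computed for a single predicted statement. It is not taken from the thesis code: the function names are hypothetical, whitespace stripping is an assumed normalisation, and Edit Similarity is assumed here to be one minus the Levenshtein distance divided by the length of the longer string. The example strings are purely illustrative and are not results from the study.

```python
def levenshtein(a: str, b: str) -> int:
    """Character-level edit distance (insertions, deletions, substitutions)."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def exact_match(prediction: str, reference: str) -> bool:
    """True when the prediction equals the reference after stripping whitespace
    (the stripping is an assumption, not necessarily the thesis' choice)."""
    return prediction.strip() == reference.strip()


def edit_similarity(prediction: str, reference: str) -> float:
    """Assumed definition: 1 - levenshtein / length of the longer string."""
    pred, ref = prediction.strip(), reference.strip()
    longest = max(len(pred), len(ref))
    if longest == 0:
        return 1.0  # two empty statements are identical
    return 1.0 - levenshtein(pred, ref) / longest


# Illustrative strings only: a reference statement and two hypothetical predictions.
reference = "return total / count"
left_only_prediction = "return total"
both_contexts_prediction = "return total / count"

print(exact_match(left_only_prediction, reference))
print(exact_match(both_contexts_prediction, reference))
print(edit_similarity(left_only_prediction, reference))
```

Averaging these per-statement scores over a test set would give dataset-level figures of the kind the abstract reports, under the stated assumptions.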