A User Evaluation of UniXcoder Using Statement Completion in a Real-World Setting

Bachelor Thesis (2022)
Author(s)

J.C.H.P. de Weerdt (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

M. Izadi – Mentor (TU Delft - Software Engineering)

A. van Deursen – Mentor (TU Delft - Software Technology)

A. Lukina – Graduation committee member (TU Delft - Algorithmics)

Faculty
Electrical Engineering, Mathematics and Computer Science
Copyright
© 2022 Jorit de Weerdt
Publication Year
2022
Language
English
Graduation Date
24-06-2022
Awarding Institution
Delft University of Technology
Project
CSE3000 Research Project
Programme
Computer Science and Engineering
Related content

The repository that was used for and during the study.

https://github.com/code4me-me/code4me
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

State-of-the-art machine learning-based models provide automatic, intelligent code completion based on large pre-trained language models. The theoretical accuracy of these models reaches 70%. However, research on their practicality is limited. This paper discusses the usefulness of UniXcoder, a machine learning-based cross-modal auto-completion model, in an everyday development environment by means of a user evaluation. Such models take the context around the requested completion and return a code prediction based on that context. To carry out the evaluation, two plugins named 'Code4Me' were built: one for Visual Studio Code and one for PyCharm. These plugins work with a remote API that requires a segment of 3966 characters of the left and right context at the trigger point. The data collected consists of the inserted code completion, verification of the code completion, the IDE used, the trigger point, and the inference time. To evaluate the data, the following metrics are used: BLEU-4, ROUGE-L, Exact Match, Edit Similarity, and METEOR. The results indicate that developers accept one in every 8 suggestions, with an Exact Match of 62.5%, and that the users surveyed, albeit with limited responses, are favourable towards the model and Code4Me. The accuracy of UniXcoder is lower in a real-world setting than when it is predicting source code. Nevertheless, the usefulness of UniXcoder as an auto-completion model is apparent.
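
As an illustration of two of the metrics named in the abstract, the following is a minimal Python sketch of Exact Match and Edit Similarity. The abstract does not give the formulas used in the thesis, so this sketch assumes a common convention: Edit Similarity as one minus the Levenshtein distance divided by the length of the longer string, with surrounding whitespace stripped before comparison. The function names are hypothetical and are not taken from the Code4Me repository.

```python
def levenshtein(a: str, b: str) -> int:
    """Character-level edit distance (insert, delete, substitute)."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def exact_match(prediction: str, ground_truth: str) -> bool:
    """True when both strings are identical after stripping whitespace.

    Stripping is an assumption made for this sketch.
    """
    return prediction.strip() == ground_truth.strip()


def edit_similarity(prediction: str, ground_truth: str) -> float:
    """Assumed definition: 1 - distance / length of the longer string."""
    p, g = prediction.strip(), ground_truth.strip()
    longest = max(len(p), len(g))
    if longest == 0:
        return 1.0  # two empty strings are treated as identical
    return 1.0 - levenshtein(p, g) / longest
```

Under these assumptions, a suggestion that differs from the code the developer finally kept by one character out of four would score an Edit Similarity of 0.75 and an Exact Match of false.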
