Improvement of Source Code Conversion for Code Completion

Bachelor Thesis (2022)
Authors

M.J. Turk (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Supervisors

Maliheh Izadi (TU Delft - Software Engineering)

Arie van Deursen (TU Delft - Software Technology)

Faculty
Electrical Engineering, Mathematics and Computer Science
Copyright
© 2022 Mika Turk
Publication Year
2022
Language
English
Graduation Date
24-06-2022
Awarding Institution
Delft University of Technology
Project
CSE3000 Research Project
Programme
Computer Science and Engineering
Related content

The code produced during the research

https://github.com/mikaturk/codefill-conversion-improvement
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Code completion is advancing rapidly, with new research appearing all the time. One such advancement is CodeFill, which converts source files into token sequences for type prediction. Training the CodeFill model requires a large number of source files, which take a long time to convert before training can begin. The file the end user is working on must also be converted before completions can be produced, so conversion time contributes directly to total latency, and longer files can degrade the experience of using the model. In this study, we aimed to improve both the performance and the success rate of this conversion. Our results indicate that we improved performance by a factor of 83 or more, depending on the input file length, and increased the success rate by up to 45%.
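To illustrate what converting a source file into a token sequence can look like, here is a minimal sketch that uses only Python's standard `tokenize` module. It is not CodeFill's conversion or the implementation from this thesis; the function name `to_token_types` and the choice of which layout tokens to skip are assumptions made for illustration.

```python
import io
import token
import tokenize

# Tokens that carry only layout information; skipping them is an
# illustrative choice, not something taken from the thesis.
LAYOUT_TOKENS = {
    token.NEWLINE,
    token.NL,
    token.INDENT,
    token.DEDENT,
    token.COMMENT,
    token.ENDMARKER,
}


def to_token_types(source: str) -> list[tuple[str, str]]:
    """Return (token type name, token text) pairs for a Python source string.

    Hypothetical helper: it pairs every remaining token with the name of
    its exact type, giving a value sequence and a parallel type sequence.
    """
    pairs = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in LAYOUT_TOKENS:
            continue
        # exact_type distinguishes operators (e.g. EQUAL) from the generic OP.
        pairs.append((token.tok_name[tok.exact_type], tok.string))
    return pairs


if __name__ == "__main__":
    for type_name, text in to_token_types("x = 1\n"):
        print(type_name, text)
```

A sketch like this raises an exception on input the tokenizer cannot process, which is one way a conversion can fail for a given file; a real pipeline would need to decide how to handle such files.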

Files

CSE3000_Paper_Mika_Turk.pdf
(pdf | 0.427 MB)
License info not available