Cross-lingual Performance of CodeGPT on the Code Completion Task

Author: Kuo, Nadine (TU Delft Electrical Engineering, Mathematics and Computer Science)
Contributor: Izadi, M. (mentor); Katzy, J.B. (mentor); van Deursen, A. (mentor); Nadeem, A. (graduation committee)
Degree granting institution: Delft University of Technology
Programme: Computer Science and Engineering
Project: CSE3000 Research Project
Date: 2023-06-28

Abstract: The development of contemporary source code auto-completion tools has significantly boosted the productivity and efficiency of developers. In 2021, the GPT-2-based Transformer CodeGPT was developed to support code completion and text-to-code generation. Like most code models, however, CodeGPT was trained on a limited set of widely used languages (Java, Python), leading to constrained efficacy in lower-resource languages. This motivated us to research CodeGPT's performance on the token-level code completion task across high- and low-resource languages. We use a tuned lens to investigate the scenarios in which CodeGPT predicts incorrect tokens with high certainty, and then study the attention patterns that underlie the observed behaviour. Our findings indicate that CodeGPT is most competent in Java and Python code (Top-1 accuracies: 69.2% and 68.2%, respectively). It generates false predictions with the highest confidence when it encounters unfamiliar constructs in low-resource languages, or code structures that cannot be predicted from the left context alone. Moreover, we find a positive correlation between null attention and model confidence.
Subject: Code completion; Large Language Models (LLMs); Transformers; CodeGPT; Self-attention; Autocompletion
To reference this document use: http://resolver.tudelft.nl/uuid:2b2386e8-f9a9-4d77-9a4a-a4e7a5208d38
Bibliographical note: https://github.com/AISE-TUDelft/CodeShop
Part of collection: Student theses
Document type: bachelor thesis
Rights: © 2023 Nadine Kuo
Files: CodeShop_Nadine.pdf (PDF, 1.38 MB)