A Transformer-Based Approach for Smart Invocation of Automatic Code Completion

Conference Paper (2024)
Author(s)

A.D. de Moor (TU Delft - Software Engineering)

Arie van Deursen (TU Delft - Software Engineering)

M. Izadi (TU Delft - Software Engineering)

Research Group
Software Engineering
DOI related publication
https://doi.org/10.1145/3664646.3664760
Publication Year
2024
Language
English
Pages (from-to)
28-37
ISBN (electronic)
979-8-4007-0685-1
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Transformer-based language models are highly effective for code completion, with much research dedicated to enhancing the content of these completions. Despite their effectiveness, these models come with high operational costs and can be intrusive, especially when they suggest too often and interrupt developers who are concentrating on their work. Current research largely overlooks how these models interact with developers in practice and neglects to address when a developer should receive completion suggestions. To tackle this issue, we develop a machine learning model that accurately predicts when to invoke a code completion tool given the code context and available telemetry data. To do so, we collect a dataset of 200k developer interactions with our cross-IDE code completion plugin and train several invocation filtering models. Our results indicate that our small-scale transformer model significantly outperforms the baseline while maintaining sufficiently low latency. We further explore the search space for integrating additional telemetry data directly into a pre-trained transformer and obtain promising results. To further demonstrate our approach’s practical potential, we deploy the model in an online environment with 34 developers and provide real-world insights based on 74k actual invocations.