Reducing Carbon Emissions of Code Generation in Large Language Models with Line-level Completions
T.J. Nulle (TU Delft - Electrical Engineering, Mathematics and Computer Science)
Arie van Deursen – Mentor (TU Delft - Software Engineering)
Luis Cruz – Mentor (TU Delft - Software Engineering)
J Yang – Graduation committee member (TU Delft - Web Information Systems)
Abstract
This thesis investigates reducing the carbon emissions of code generation with large language models (LLMs) by comparing function-level and line-level code completions across models of two sizes (1.5B and 9B parameters). The study uses the BigCodeBench dataset, comprising 1,140 Python programming problems, to evaluate the energy consumption, test accuracy, and time efficiency of code completions. The models, 4-bit quantised and run on a CPU, generated 30 function-level completions per problem and 30 line-level completions per line, all of which were tested for correctness. Results indicate that, while line-level completions require slightly more energy per token, they are more efficient overall in terms of total energy consumption and token usage. The smaller model with line-level completions reduced carbon emissions by an average factor of ten compared to the larger model with function-level completions; with the larger model, line-level completions achieved a $4.5\times$ reduction compared to function-level completions. Line-level completions were also more token-efficient, wasting less than 1\% of energy on discarded completions, compared to 20\% for function-level completions. From a sustainability perspective, line-level completions therefore offer a practical strategy for reducing the environmental impact of code generation while maintaining strong performance, and optimising completion strategies could help balance energy consumption, test accuracy, and time efficiency. Future research could explore a broader range of model sizes, fine-tuning models specifically for line-level completions, the performance decrease observed at longer solution lengths, and alternative validation metrics for assessing code generation performance.
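The trade-off the abstract describes can be made concrete with a small back-of-the-envelope calculation: line-level completions cost slightly more energy per token but emit far fewer tokens, and waste a much smaller share of that energy on discarded completions. The sketch below illustrates this accounting; all numeric values are hypothetical placeholders, not measurements from the thesis.

```python
# Illustrative energy accounting for completion strategies.
# NOTE: token counts, per-token energies, and waste fractions below are
# hypothetical assumptions chosen to mirror the qualitative findings
# (line-level: slightly higher energy per token, far fewer tokens,
# <1% wasted; function-level: ~20% wasted), not the thesis's data.

def completion_energy_j(n_tokens: int, energy_per_token_j: float) -> float:
    """Total generation energy in joules for a completion."""
    return n_tokens * energy_per_token_j

# Function-level: a whole function body per suggestion (assumed 120 tokens).
func_energy = completion_energy_j(120, 0.5)
func_wasted = 0.20 * func_energy   # energy spent on completions that fail tests

# Line-level: one line per suggestion (assumed 12 tokens, costlier per token).
line_energy = completion_energy_j(12, 0.6)
line_wasted = 0.01 * line_energy

print(f"function-level: {func_energy:.1f} J total, {func_wasted:.1f} J wasted")
print(f"line-level:     {line_energy:.1f} J total, {line_wasted:.2f} J wasted")
print(f"total-energy ratio: {func_energy / line_energy:.1f}x")
```

Under these assumed numbers, line-level completion uses over 8× less energy per accepted suggestion despite the higher per-token cost, which is the mechanism behind the reductions reported above.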