Reducing LLM Hallucinations with Retrieval Prompt Engineering

Minimising the Need for Re-prompting in Automatic Understandable Test Generation

Bachelor Thesis (2024)
Author(s)

A. Mentzelopoulou (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

A. Deljouyi – Mentor (TU Delft - Software Engineering)

A. Zaidman – Mentor (TU Delft - Software Technology)

A. Katsifodimos – Graduation committee member (TU Delft - Data-Intensive Systems)

Faculty
Electrical Engineering, Mathematics and Computer Science
Publication Year
2024
Language
English
Graduation Date
25-06-2024
Awarding Institution
Delft University of Technology
Project
CSE3000 Research Project
Programme
Computer Science and Engineering
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Automated test generation helps produce correct and usable code while keeping the development process efficient and effective. UTGen is a tool that uses a Large Language Model (LLM) to improve the understandability of test suites generated by a Search-Based Software Testing tool, namely EvoSuite. While attempting to improve a given test case, the LLM often generates code that strays too far from the original, changing the test's purpose, or code that does not compile at all. Such behaviour is called "LLM hallucination".

UTGen's current hallucination handling is time-consuming and resource-intensive. To address this, we propose two alternative approaches that use information-retrieval prompt engineering techniques to minimise hallucinations: incorporating the source code under test into the LLM prompt, and incorporating the errors thrown by the most recently generated test case. We assess both methods in a comparison study against the base version of UTGen. We observe that source-code retrieval improves the generation of compilable test cases for complex classes. Error retrieval shows hallucination performance similar to base UTGen, with fewer re-prompts for classes with a high normalised Lack of Cohesion of Methods (LCOM*).
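To make the two retrieval techniques concrete, the Python sketch below shows one way such a prompt could be assembled. It is an illustration under assumed names only: build_prompt, TestAttempt, and the exact prompt wording are hypothetical and do not come from UTGen's code base.

from typing import Optional
from dataclasses import dataclass

@dataclass
class TestAttempt:
    test_code: str        # last test case produced by the LLM
    compiler_errors: str  # errors raised when compiling/running it, if any

def build_prompt(original_test: str,
                 source_under_test: str,
                 last_attempt: Optional[TestAttempt] = None) -> str:
    """Assemble an improvement prompt grounded in retrieved context.
    Hypothetical sketch; not UTGen's actual implementation."""
    parts = [
        "Improve the readability of the following test case without changing its purpose:",
        original_test,
        # Source-code retrieval: showing the class under test discourages
        # the model from inventing methods or fields that do not exist.
        "The class under test is:",
        source_under_test,
    ]
    if last_attempt is not None and last_attempt.compiler_errors:
        # Error retrieval: feeding back the latest failure lets the model
        # repair it instead of repeating the same hallucination.
        parts += [
            "Your previous attempt failed to compile:",
            last_attempt.test_code,
            "with the following errors:",
            last_attempt.compiler_errors,
            "Fix these errors while preserving the test's original behaviour.",
        ]
    return "\n\n".join(parts)

On the first attempt the error section is simply omitted; on each re-prompt the latest compiler output would be appended, which corresponds to the error-retrieval mechanism the study evaluates.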

Index Terms - Automated Test Generation, Large Language Models (LLMs), LLM Hallucination, Prompt Engineering

Files

FinalPaper.pdf
(PDF, 0.832 MB)
License info not available