Enhancing Sentence Decomposition in Large Language Models Through Linguistic Features
X. XU (TU Delft - Electrical Engineering, Mathematics and Computer Science)
M.S. Pera – Mentor (TU Delft - Web Information Systems)
J. Yang – Mentor (TU Delft - Web Information Systems)
Sebastijan Dumančić – Graduation committee member (TU Delft - Algorithmics)
G. He – Mentor (TU Delft - Web Information Systems)
Abstract
This thesis investigates enhancing sentence decomposition in Large Language Models (LLMs) through the integration of linguistic features, including constituency parsing, dependency parsing, and Abstract Meaning Representation (AMR). Traditional decomposition methods, which often rely on rule-based approaches, struggle with highly intricate sentences. By incorporating detailed linguistic features and employing reasoning steps and supervision in prompts, we aim to improve the comprehension and decomposition capabilities of LLMs. Experimental results show that integrating these linguistic features has the potential to improve decomposition performance. However, these enhancements also introduce new error types and increase computational costs. This study identifies several error types in the programs generated by LLMs, including format, decomposition, and conversion errors, emphasizing the need for further refinement in model training and prompt design. This research takes a step toward more accurate and efficient processing of complex sentences in LLMs, encouraging ongoing development and optimization in this field.
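As a rough illustration of the prompting approach described above, the sketch below embeds dependency-parse features alongside a sentence in a decomposition prompt. The parse triples are hand-written for this toy example (in practice they would come from a dependency parser), and the `build_prompt` function and template wording are assumptions for illustration, not the exact prompts used in the thesis.

```python
def build_prompt(sentence, dep_triples):
    """Embed (head, relation, dependent) triples in a decomposition prompt.

    dep_triples: list of (head, relation, dependent) tuples describing the
    sentence's dependency structure; here they are supplied by hand.
    """
    feature_lines = "\n".join(
        f"- {head} --{rel}--> {dep}" for head, rel, dep in dep_triples
    )
    return (
        "Decompose the following sentence into simpler sentences.\n"
        f"Sentence: {sentence}\n"
        "Dependency features:\n"
        f"{feature_lines}\n"
        "Reason step by step, then list each simple sentence."
    )


sentence = "The cat that chased the mouse slept."
triples = [
    ("slept", "nsubj", "cat"),       # "cat" is the subject of "slept"
    ("chased", "acl:relcl", "cat"),  # relative clause modifying "cat"
    ("chased", "obj", "mouse"),      # "mouse" is the object of "chased"
]

prompt = build_prompt(sentence, triples)
print(prompt)
```

The explicit relative-clause feature (`acl:relcl`) is what signals to the model that "the cat chased the mouse" can be split off as its own simple sentence.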