Augmenting Program Synthesis with Large Language Models

Incorporating Natural Language Understanding for Efficient Program Synthesis


Abstract

This research introduces a Language Model Augmented Program Synthesis (LMAPS) workflow to enhance traditional Programming by Example (PBE). PBE automatically generates a program that satisfies a specification consisting of a set of input-output examples. Because these specifications are often defined by only a few examples, multiple programs can satisfy them, which makes the specification ambiguous. In addition, PBE synthesisers have to explore a very large search space to solve these problems, which makes synthesis inefficient. The LMAPS workflow incorporates three components that overcome these limitations of PBE by using the language understanding capabilities of Large Language Models (LLMs). First, LLMs can assist in generating a well-defined specification, which mitigates the ambiguity inherent in PBE. Second, the core component of LMAPS uses LLMs to generate candidate programs. These programs can be decomposed into building blocks to create a concise grammar for an inductive program synthesiser. This optimised grammar allows the synthesiser to find correct programs at lower search depths, making the workflow more efficient. Third, LLMs can aid in understanding the automatically generated programs, which can be hard for humans to interpret. We compare LMAPS to a traditional PBE workflow on the task of synthesising regular expressions across four data sets. The results demonstrate that LMAPS can significantly reduce the search space for program synthesis and achieve up to 40% higher accuracy than PBE-only systems. Our research indicates that integrating LLMs into a typical PBE workflow yields significant improvements by combining the strengths of both approaches, resulting in a more accurate, efficient, and human-aligned workflow.
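
The grammar-reduction idea described in the abstract can be illustrated with a minimal sketch. It is not the LMAPS implementation: the enumerative synthesiser, the `decompose` tokeniser, the example strings, and the hard-coded `llm_candidate` (standing in for a real LLM call) are all assumptions made for illustration. The sketch enumerates concatenations of grammar atoms by increasing depth and compares a generic grammar with a grammar built from the blocks of an LLM-proposed regular expression.

```python
import re
from itertools import product

# Hypothetical input-output specification: strings to accept and to reject.
POSITIVES = ["123-456", "907-112"]
NEGATIVES = ["12-3456", "abc-def", "123456"]

# A generic grammar of single-character building blocks (assumed for this sketch).
GENERIC_ATOMS = [r"\d", r"[a-z]", "-"]


def decompose(candidate):
    """Split a regex into unique building blocks, keeping first-seen order.

    Handles escapes and character classes with an optional {n} quantifier;
    this is a simplification, not a full regex parser.
    """
    tokens = re.findall(r"\\.(?:\{\d+\})?|\[[^\]]*\](?:\{\d+\})?|.", candidate)
    return list(dict.fromkeys(tokens))


def synthesise(atoms, positives, negatives, max_depth):
    """Enumerate concatenations of atoms by increasing depth.

    Returns (pattern, depth, candidates_tried) for the first pattern that
    accepts every positive and rejects every negative example, or
    (None, None, candidates_tried) if none is found within max_depth.
    """
    tried = 0
    for depth in range(1, max_depth + 1):
        for parts in product(atoms, repeat=depth):
            tried += 1
            pattern = "".join(parts)
            accepts_all = all(re.fullmatch(pattern, s) for s in positives)
            rejects_all = not any(re.fullmatch(pattern, s) for s in negatives)
            if accepts_all and rejects_all:
                return pattern, depth, tried
    return None, None, tried


# Stand-in for an LLM-generated candidate program (assumption: no real LLM call).
llm_candidate = r"\d{3}-\d{3}"
concise_atoms = decompose(llm_candidate)

generic_result = synthesise(GENERIC_ATOMS, POSITIVES, NEGATIVES, max_depth=7)
concise_result = synthesise(concise_atoms, POSITIVES, NEGATIVES, max_depth=7)

print("generic grammar:", generic_result)
print("concise grammar:", concise_result)
```

In this toy setting both grammars reach a consistent pattern, but the concise grammar does so at a lower depth and after far fewer candidates, because each of its atoms already encodes a larger piece of the target expression.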
