Natural Language Processing Techniques for Code Generation

Abstract

Introduction: Software development is difficult and requires knowledge at many levels, including an understanding of algorithms, programming languages, and frameworks. In addition, before any code is written, the system requirements and functionality are first discussed in natural language, after which they are sometimes visualized for the developers in a more formal language such as the Unified Modeling Language. Recently, researchers have tried to close the gap between the natural language description of a system and its actual implementation in code using natural language processing (NLP) techniques. NLP techniques have also proven useful for generating code snippets while developers work on source code. This literature survey aims to present an overview of the field of code generation using NLP techniques. Method: The Google Scholar search engine was used to search for papers on code generation using NLP. Results: A total of 428 abstracts were screened, yielding 36 papers suitable for the survey. The selected papers were categorized into 6 groups by application type. Conclusion: Source code has similarities to natural language; hence, NLP techniques have been successfully used to generate code. The area has also benefited from recent deep-learning-based advances in NLP.