AI for GovTech

Exploring the use of LLMs for GovTech Benchmark Operationalization


Abstract

This research explores the integration of Artificial Intelligence (AI), specifically Large Language Models (LLMs), into the operationalization of Government Technology (GovTech) benchmarks to increase their utility for policymakers. Research and practice consistently highlight persistent challenges in GovTech benchmarking: resource-intensive methodologies that deliver retrospective rather than real-time analysis, oversimplified metrics that overlook digital infrastructures and emerging technologies, and inappropriate levels of aggregation that render results less useful. Because benchmarks can significantly influence political outcomes and shape the development of GovTech services, refining benchmarking methodologies with LLMs can mitigate these issues and potentially improve the responsiveness and relevance of government action in serving societal needs. Using Design Science Research Methodology and Activity Theory, an artefact is developed that combines an LLM with Retrieval-Augmented Generation (RAG), fine-tuning, and prompt engineering. The artefact is used to operationalize the World Bank's GovTech Maturity Index (GTMI) benchmark. Finally, the development of a benchmark specifically tailored for operationalization by LLMs is proposed, with a preliminary design for an AI-Supported GovTech Index (AGTI) outlined.
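As a rough illustration of the retrieval-augmented setup the abstract describes, the sketch below pairs a toy retriever with an engineered prompt template and a stubbed model call. All names, the corpus, and the scoring method are hypothetical; the actual artefact additionally relies on a fine-tuned LLM and an embedding-based retriever rather than keyword overlap.

```python
# Minimal sketch of the retrieve-then-generate loop (hypothetical, not the
# paper's implementation). A production RAG system would use an embedding
# index and a real LLM endpoint in place of the stubs below.
from collections import Counter

# Hypothetical corpus: indicator descriptions from a GovTech benchmark.
gtmi_documents = [
    "Indicator I-1: existence of a whole-of-government digital strategy.",
    "Indicator I-5: availability of a shared cloud platform for agencies.",
    "Indicator I-12: online portal offering transactional public services.",
]

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Rank documents by naive token overlap with the query (stands in
    for the embedding-based retriever a real RAG pipeline would use)."""
    q_tokens = Counter(query.lower().split())
    def score(doc: str) -> int:
        return sum((Counter(doc.lower().split()) & q_tokens).values())
    return sorted(docs, key=score, reverse=True)[:k]

def build_prompt(query: str, context: list[str]) -> str:
    """Prompt-engineering step: ground the model in retrieved evidence."""
    evidence = "\n".join(f"- {c}" for c in context)
    return (
        "Answer using only the benchmark evidence below.\n"
        f"Evidence:\n{evidence}\n\nQuestion: {query}\nAnswer:"
    )

def query_llm(prompt: str) -> str:
    """Placeholder for a call to a (fine-tuned) LLM."""
    return f"[LLM response for prompt of {len(prompt)} characters]"

query = "Does the country have a national digital government strategy?"
print(query_llm(build_prompt(query, retrieve(query, gtmi_documents))))
```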