Creating a Retrieval-Augmented Generation Pipeline for the Guidelines of the Dutch College of General Practitioners
L. Bindt (TU Delft - Electrical Engineering, Mathematics and Computer Science)
Yannick ter Heerdt – Mentor
J. Yang – Mentor (TU Delft - Electrical Engineering, Mathematics and Computer Science)
P.K. Murukannaiah – Graduation committee member (TU Delft - Electrical Engineering, Mathematics and Computer Science)
Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.
Abstract
General practitioners currently face high workloads. Large Language Models (LLMs) offer a promising way to relieve part of this burden by enabling faster searching of medical guidelines, which saves doctors time and allows them to deliver better care. This research addresses the primary research question: how can a Retrieval-Augmented Generation (RAG) pipeline be constructed for the guidelines of the Dutch College of General Practitioners (NHG)? The problem is broken down into four sub-questions covering data processing, retrieval optimization, storage scalability, and model grounding, and the research demonstrates a complete, factually correct, and scalable system for general practitioners in the Netherlands. Specifically, the findings show that context-aware data splitting with minimized block sizes preserves clinical cohesion while keeping costs low. For retrieval optimization, combining a traditional BM25 keyword search with a semantic (embedding-based) vector search via Reciprocal Rank Fusion captures edge-case guidelines more effectively than either method alone. Storage scalability is achieved by pairing a Hierarchical Navigable Small World (HNSW) graph index with memory-mapped storage, which allows the system to offload data to disk while maintaining high throughput and low latency. Finally, prompt instructions enforce grounded refusal, preventing the model from falling back on its internal training data when valid clinical context is missing.
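To illustrate the retrieval step named in the abstract, the following is a minimal sketch of Reciprocal Rank Fusion in Python. It is not the thesis implementation: the function name, the chunk identifiers, and the smoothing constant k = 60 (a commonly used default) are assumptions made for illustration. Each ranked list contributes 1 / (k + rank) to a document's score, and the fused ranking sorts documents by the summed score.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one ranking.

    rankings: list of lists, each ordered from most to least relevant.
    k: smoothing constant (60 is a common default, assumed here).
    """
    scores = defaultdict(float)
    for ranking in rankings:
        # Ranks are 1-based: the top hit of each list contributes 1 / (k + 1).
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by id for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists, standing in for BM25 and vector search output.
bm25_ranking = ["chunk_a", "chunk_b", "chunk_c"]
vector_ranking = ["chunk_c", "chunk_a", "chunk_d"]

fused = reciprocal_rank_fusion([bm25_ranking, vector_ranking])
print(fused)
```

Because the fusion uses only rank positions, the BM25 scores and the vector similarities never need to be normalized to a common scale. A chunk that appears in just one of the two lists (here `chunk_b` or `chunk_d`) still enters the fused ranking, which is consistent with the abstract's point that the combination can surface guidelines that a single method would miss.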