Creating a Retrieval-Augmented Generation Pipeline for the Guidelines of the Dutch College of General Practitioners

Bachelor Thesis (2026)

Author(s)

L. Bindt (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

Yannick ter Heerdt – Mentor

J. Yang – Mentor (TU Delft - Electrical Engineering, Mathematics and Computer Science)

P.K. Murukannaiah – Graduation committee member (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Faculty

Electrical Engineering, Mathematics and Computer Science

Medical AI, Large language model, RAG-pipeline

To reference this document use

https://resolver.tudelft.nl/uuid:2109847d-2218-47ae-9648-316151e0b0c6

More Info

Publication Year

2026

Language

English

Graduation Date

25-06-2026

Awarding Institution

Delft University of Technology

Project

CSE3000 Research Project

Programme

Computer Science and Engineering

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

General practitioners currently face high workloads. Large Language Models (LLMs) offer a promising way to relieve part of this burden by enabling faster searching of medical guidelines, which saves doctors time and allows them to deliver better care. This research addresses the primary research question: how can a Retrieval-Augmented Generation (RAG) pipeline be constructed for the guidelines of the Dutch College of General Practitioners (NHG)? The problem is broken down into four sub-questions, covering data processing, retrieval optimization, storage scalability, and model grounding. Together, the answers demonstrate a complete, factually correct, and scalable system for general practitioners in the Netherlands. For data processing, context-aware splitting with minimized block sizes best preserves clinical cohesion while keeping costs low. For retrieval optimization, combining a traditional BM25 keyword search with a meaning-based (semantic) vector search via Reciprocal Rank Fusion captures edge-case guidelines more effectively than either method alone. Storage scalability is achieved by pairing a Hierarchical Navigable Small World (HNSW) graph index with memory-mapped storage, which lets the system offload data to disk while maintaining high throughput and low latency. Finally, prompt instructions enforce grounded refusal: when no valid clinical context is retrieved, the model declines to answer instead of falling back on its internal training data.
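The fusion step named in the abstract can be illustrated with a minimal sketch of Reciprocal Rank Fusion. This is not the thesis implementation: the function name, the chunk identifiers, and the smoothing constant k = 60 (a commonly used default) are assumptions made for illustration. Each document receives a score of 1 / (k + rank) from every ranking in which it appears, and the scores are summed.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one ranking.

    rankings: list of lists, each ordered from best to worst match.
    k: smoothing constant (60 is an assumed, commonly used default).
    """
    scores = defaultdict(float)
    for ranking in rankings:
        # Ranks start at 1, so the top hit contributes 1 / (k + 1).
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by id for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from a keyword search and a vector search.
bm25_ranking = ["chunk_a", "chunk_b", "chunk_c"]
vector_ranking = ["chunk_c", "chunk_a", "chunk_d"]

fused = reciprocal_rank_fusion([bm25_ranking, vector_ranking])
print(fused)
```

In this example, chunks found by both retrievers rise above chunks found by only one, while a chunk that only one retriever returns (such as `chunk_d`) still stays in the fused list.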

Files

Research_Paper_Leander_Bindt.p... (pdf)

(pdf | 1.08 Mb)

License info not available