Design and Evaluation of a Context-Aware Orchestration Framework for Vulnerability Prioritization in Financial Infrastructure
A.T. Meulien (TU Delft - Electrical Engineering, Mathematics and Computer Science)
G. Smaragdakis – Graduation committee member (TU Delft - Electrical Engineering, Mathematics and Computer Science)
K. Liang – Mentor (TU Delft - Electrical Engineering, Mathematics and Computer Science)
Jovan Aleksov – Mentor (ABN AMRO Bank N.V.)
Jérémie Decouchant – Graduation committee member (TU Delft - Electrical Engineering, Mathematics and Computer Science)
Rui Wang – Mentor
Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.
Abstract
Vulnerability prioritization in financial infrastructure depends on both public vulnerability signals and organization-specific context, such as affected assets, ownership, exposure, compliance status, and remediation constraints. Generic LLM-based systems can help analysts summarize and explain such information, but their direct use in regulated security workflows raises concerns about grounding, traceability, reliability, and control.
This thesis designs and evaluates a context-aware orchestration framework for LLM-supported vulnerability prioritization. The framework separates deterministic enterprise-data retrieval from LLM-based synthesis. For each analyst query, predefined workflows retrieve and join relevant vulnerability, asset, ownership, exposure, and compliance records, prune unnecessary metadata, and validate the generated output before presenting it as analyst-facing decision support.
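The separation described above can be illustrated with a minimal sketch. It is not the thesis implementation: the CSV columns, the placeholder CVE identifiers, the `KEEP_FIELDS` allow-list, and the functions `retrieve_context`, `synthesize`, and `validate` are all hypothetical names chosen for illustration, and `synthesize` is a plain string template standing in for the LLM call.

```python
import csv
import io

# Hypothetical CSV extracts; column names and values are illustrative assumptions.
VULNS_CSV = """cve_id,asset_id,cvss,scanner_build
CVE-0000-0001,A1,9.8,x1
CVE-0000-0002,A2,5.3,x2
"""
ASSETS_CSV = """asset_id,owner,internet_exposed,compliant,rack_label
A1,payments-team,yes,no,r12
A2,hr-team,no,yes,r07
"""

# Assumed allow-list: only these fields are passed on to the synthesis step.
KEEP_FIELDS = ("cve_id", "asset_id", "cvss", "owner", "internet_exposed", "compliant")


def load(text):
    """Parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def retrieve_context(cve_id, vulns, assets):
    """Deterministic step: filter by CVE, join on asset_id, prune metadata."""
    assets_by_id = {a["asset_id"]: a for a in assets}
    rows = []
    for v in vulns:
        if v["cve_id"] != cve_id or v["asset_id"] not in assets_by_id:
            continue
        joined = {**v, **assets_by_id[v["asset_id"]]}
        rows.append({k: joined[k] for k in KEEP_FIELDS})
    return rows


def synthesize(rows):
    """Stand-in for the LLM synthesis step (a fixed template, not a model call)."""
    return "; ".join(
        f"{r['asset_id']} (owner {r['owner']}, CVSS {r['cvss']})" for r in rows
    )


def validate(answer, rows):
    """Simple grounding check: every retrieved asset must appear in the answer."""
    return all(r["asset_id"] in answer for r in rows)


context = retrieve_context("CVE-0000-0001", load(VULNS_CSV), load(ASSETS_CSV))
answer = synthesize(context)
if validate(answer, context):
    print(answer)
```

In this sketch the model-facing step only ever sees the pruned, joined rows, so any answer can be checked against the exact records that were retrieved for the query.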
The framework was evaluated against an autonomous-agent baseline in a controlled, offline proof-of-concept environment, using an approved, static CSV snapshot derived from production data in selected operational security systems. The benchmark covered 14 questions across three vulnerability-management use cases: contextual asset triage, risk justification, and remediation guidance. The parametric framework achieved a mean weighted accuracy of 98.4%, compared with 90.0% for the autonomous baseline, with the largest difference in contextual asset triage. It also consumed fewer tokens and showed less degradation under stochastic workflow noise.
The results suggest that constrained, context-aware orchestration can improve grounding, traceability, robustness, and operational efficiency for structured vulnerability-prioritization tasks. The findings should be interpreted as proof-of-concept evidence for the selected benchmark, not as evidence of production-scale deployment performance.