CoCA: Extending GitHub Copilot with a Context-Aware Agentic Framework for Large and Domain-Specific Repositories at ASML

Master Thesis (2026)
Author(s)

K. Hoxha (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

M. Izadi – Mentor (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Oguzhan Yildiz – Mentor (ASML)

B. Özkan – Graduation committee member (TU Delft - Electrical Engineering, Mathematics and Computer Science)

P.K. Murukannaiah – Graduation committee member (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Faculty
Electrical Engineering, Mathematics and Computer Science
Publication Year
2026
Language
English
Graduation Date
24-06-2026
Awarding Institution
Delft University of Technology
Programme
Computer Science, Data Science and Technology, Computer Science, Artificial Intelligence
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Repository-level code generation remains difficult in industrial systems because tasks span multiple files, internal APIs, architectural conventions, tests, and quality constraints. We present CoCA (Copilot-Orchestrated Contextual Agents), an IDE-constrained framework, currently instantiated for Java repositories, that extends GitHub Copilot Chat with task decomposition, deterministic repository-context retrieval, optional Test-Driven Generation, and persistent domain-context injection. It targets enterprise settings where external embeddings, fine-tuning, and third-party LLM services are not permitted.

We evaluate CoCA at ASML using CoCABench, an internal suite focused on long-horizon tasks, comprising 5 epics from 2 proprietary Java repositories with 44 developer-identified subtasks and ranging from a 2-day bug fix to 3 months of feature work. Full CoCA is associated with higher ground-truth alignment than the single-agent baseline (0.44 versus 0.25) on the LLM-judge metric with the strongest inter-rater reliability (Krippendorff's α=0.46). However, it achieves only 0.20 pass@1 despite 0.60 build@1, while the single-agent baseline achieves the highest pass@1.

These findings suggest that IDE-constrained agentic workflows can move generated implementations closer to the intended developer solution, but do not yet deliver reliable executable integration. CoCA is therefore best understood as a developer-in-the-loop assistance workflow rather than a fully autonomous implementation system or a replacement for direct Copilot prompting. It appears most appropriate for long, integration-heavy feature epics where planning, context continuity, and repository awareness are valuable. For small, localized fixes, the orchestration overhead may outweigh these gains.

Files

License info not available

File under embargo until 01-01-2027