K. Hoxha
2 records found
Repository-level code generation remains difficult in industrial systems because tasks span multiple files, internal APIs, architectural conventions, tests, and quality constraints. We present CoCA (Copilot-Orchestrated Contextual Agents), an IDE-constrained framework currently instantiated for Java repositories that extends GitHub Copilot Chat with task decomposition, deterministic repository-context retrieval, optional Test-Driven Generation, and persistent domain-context injection for enterprise settings where external embeddings, fine-tuning, and third-party LLM services are not permitted.
We evaluate CoCA at ASML using CoCABench, an internal benchmark focused on long-horizon tasks. It comprises 5 epics from 2 proprietary Java repositories, with 44 developer-identified subtasks, and ranges from a 2-day bug fix to 3-month feature work. On the LLM-judge metric with the strongest inter-rater reliability (Krippendorff's α=0.46), full CoCA is associated with higher ground-truth alignment than the single-agent baseline (0.44 versus 0.25). However, it achieves only 0.20 pass@1 despite 0.60 build@1, while the single-agent baseline achieves the highest pass@1.
These findings suggest that IDE-constrained agentic workflows can move generated implementations closer to the intended developer solution, but do not yet solve reliable executable integration. CoCA is therefore best understood as a developer-in-the-loop assistance workflow rather than a fully autonomous implementation system or a replacement for direct Copilot prompting. It appears most appropriate for long, integration-heavy feature epics where planning, context continuity, and repository awareness are valuable. For small localized fixes, the orchestration overhead may outweigh these gains.
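The gap between build@1 and pass@1 reported in this abstract can be illustrated with a minimal sketch. It assumes, without confirmation from the abstract, that both rates are computed per epic from a single generation attempt. The `EpicOutcome` record, the `rate_at_1` helper, and the five outcomes are hypothetical, chosen only so that the arithmetic reproduces the reported 0.60 and 0.20; they are not the study's data.

```python
from dataclasses import dataclass


@dataclass
class EpicOutcome:
    """Result of one generation attempt on one epic (hypothetical record type)."""
    name: str
    builds: bool        # the generated change compiles
    passes_tests: bool  # the generated change also passes the epic's tests


def rate_at_1(outcomes, predicate):
    """Fraction of epics whose single attempt satisfies `predicate`."""
    if not outcomes:
        raise ValueError("need at least one outcome")
    return sum(1 for o in outcomes if predicate(o)) / len(outcomes)


# Illustrative outcomes only (assumption): 3 of 5 epics build, 1 of 5 passes.
outcomes = [
    EpicOutcome("epic-1", builds=True, passes_tests=True),
    EpicOutcome("epic-2", builds=True, passes_tests=False),
    EpicOutcome("epic-3", builds=True, passes_tests=False),
    EpicOutcome("epic-4", builds=False, passes_tests=False),
    EpicOutcome("epic-5", builds=False, passes_tests=False),
]

build_at_1 = rate_at_1(outcomes, lambda o: o.builds)
# A change can only pass its tests if it builds first.
pass_at_1 = rate_at_1(outcomes, lambda o: o.builds and o.passes_tests)
print(f"build@1 = {build_at_1:.2f}, pass@1 = {pass_at_1:.2f}")
```

Under this reading, the difference between the two rates corresponds to epics whose generated code compiles but fails its tests, which is one way to interpret the "reliable executable integration" problem named above.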
Exploring the Spatial Characteristics of MARS
Assessing the Impact of Neural Net Depth Increase and PointNet Architecture Integration on MARS Performance
The modern workplace often exposes individuals to privacy risks, such as the unauthorised visibility of their computer screens. MARS (mmWave-based Assistive Rehabilitation System for Smart Healthcare), coupled with VideowindoW screens, offers an innovative solution to these threats by using mmWave radar to reconstruct human poses and estimate the position of 19 key joints. This enables the screens to become opaque based on the viewer's position, ensuring privacy. Although originally designed as a rehabilitation system, MARS can be utilised for its pose estimation capabilities to enhance workplace privacy. This research explores two modifications to the MARS architecture to assess their impact on the system's accuracy and performance: increasing the depth of its convolutional neural network (CNN) and integrating the PointNet architecture. Results show that the best-performing CNN configuration, with two convolutional and two dense layers followed by the output layer, modestly improves joint location estimation. However, integrating PointNet does not improve performance, likely due to PointNet's limitations in capturing the necessary local structural details of point clouds. These findings inform future research on possible improvements when leveraging the MARS dataset in the fields of privacy enhancement and smart healthcare applications.
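One way to quantify "joint location estimation" for a 19-joint pose is a mean per-joint Euclidean error, sketched below. The abstract does not state the metric used, so this is an assumption. The sketch also assumes that each joint is a 3D (x, y, z) position and that the network output is a flat vector stored joint by joint; the names `to_joints` and `mean_joint_error`, the units, and the sample values are hypothetical.

```python
import math

NUM_JOINTS = 19  # from the abstract
COORDS = 3       # assumption: each joint is an (x, y, z) position


def to_joints(flat):
    """Split a flat vector into NUM_JOINTS (x, y, z) tuples (assumed layout)."""
    if len(flat) != NUM_JOINTS * COORDS:
        raise ValueError(f"expected {NUM_JOINTS * COORDS} values, got {len(flat)}")
    return [tuple(flat[i:i + COORDS]) for i in range(0, len(flat), COORDS)]


def mean_joint_error(predicted, target):
    """Average Euclidean distance between predicted and target joints."""
    pairs = zip(to_joints(predicted), to_joints(target))
    return sum(math.dist(p, t) for p, t in pairs) / NUM_JOINTS


# Hypothetical example in centimetres: every joint is off by (3, 4, 0).
target = [0.0] * (NUM_JOINTS * COORDS)
predicted = [3.0, 4.0, 0.0] * NUM_JOINTS
error = mean_joint_error(predicted, target)
print(f"mean per-joint error: {error:.1f} cm")
```

A per-joint average like this makes it straightforward to compare architectural variants, such as a deeper CNN against a PointNet-based model, on the same held-out poses.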