H. Vanhuynegem

Please Note

This page displays the records of the person named above and is not linked to a unique person identifier. This record may need to be merged to a profile.

Bachelor thesis (1)

Master thesis (1)

2 records found

Visual Question Answering in Mobile AI Assistants: A Benchmark of Proprietary cloud-based multimodal LLMs

Evaluating Monetary Cost, Accuracy, Token Usage, Payload, and Latency

Master thesis (2026) - H. Vanhuynegem, G. Lan

Mobile AR assistants must offload visual queries to cloud multimodal large language models (MLLMs): on-device inference exceeds the power, memory, and thermal budgets of wearable hardware. This thesis measures how image preprocessing, applied to the captured frame before transmission, affects pipeline latency, payload size, token usage, cost, and visual question answering (VQA) accuracy.

A controlled paired experiment—pairing each preprocessed sample with its unprocessed counterpart to eliminate confounding from per-image difficulty variance—compared 12 techniques across four provider-model configurations and three VQA datasets, logging all five dimensions per sample. The techniques span JPEG compression, downsampling, grayscale conversion, gaze-based region-of-interest (ROI) cropping, and saliency- and YOLO-based cropping. Relative to unprocessed images, JPEG at quality 85 reduces latency by 25% and payload by 50% with no detectable accuracy loss; gaze-based ROI cropping reduces latency by 38% and payload by over 85% at a 3-percentage-point accuracy cost, provided eye-tracking data are available. On Realtime-class streaming models, both techniques are recommended as deployment defaults.

This thesis introduces a principled taxonomy distinguishing compression-only preprocessing—which reduces payload without altering image geometry and therefore cannot discard task-relevant content—from geometry-changing preprocessing, which crops or resizes the image and can remove information the model would otherwise receive; this distinction predicts which technique classes incur accuracy costs and which do not. The open benchmark VQABench supports replication across additional providers, models, and strategies; the results are limited to still-frame VQA and should be validated separately for video or streaming queries. The findings extend beyond AR: any multimodal pipeline that transmits images to a cloud model can apply preprocessing-first optimisations before investing in prompt compression or model-level architectural changes. ...
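The paired design described in the abstract can be illustrated with a minimal sketch: each preprocessed sample is compared only with its own unprocessed counterpart, so per-image difficulty cancels out of the comparison. The function name `paired_relative_reduction` and the latency values below are hypothetical, invented purely for illustration; they are not taken from the thesis or from VQABench.

```python
from statistics import mean


def paired_relative_reduction(baseline, treated):
    """Mean per-sample relative reduction of `treated` versus `baseline`.

    Both sequences must be aligned so that index i refers to the same image.
    """
    if len(baseline) != len(treated):
        raise ValueError("paired samples must have equal length")
    if not baseline:
        raise ValueError("need at least one pair")
    # Compare each sample only with its own counterpart, then average.
    return mean((b - t) / b for b, t in zip(baseline, treated))


# Hypothetical latencies in milliseconds (illustrative values, not thesis data).
baseline_ms = [800.0, 1000.0, 1200.0]
preprocessed_ms = [600.0, 800.0, 900.0]

reduction = paired_relative_reduction(baseline_ms, preprocessed_ms)
print(f"mean paired latency reduction: {reduction:.1%}")
```

The same helper could be applied to any of the other logged dimensions (payload, token usage, cost), as long as the two sequences stay aligned per image.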

Rover Deployment Software System

The Brains Behind the Rover's pod

Bachelor thesis (2024) - H. Vanhuynegem, D.Y. Aris, C.J.M. Verhoeven

The Rover Deployment Software System (RDSS) is a critical component designed to ensure the successful deployment of the Lunar Zebro rover onto the lunar surface. This thesis presents the design, implementation, and testing of the RDSS, which consists of three primary subsystems: a communication system between the lander and the RDSS, an electronic control system, and an integration with an existing rover communication system. The third subsystem is not covered in this thesis, because the Lunar Zebro team will implement it at a later stage. The RDSS is tasked with managing the deployment sequence, providing power during transit, and facilitating communication between the rover and the lander. Key challenges addressed include handling the harsh lunar environment, ensuring reliable communication, and adhering to strict weight constraints. Extensive testing, including unit, integration, system, and performance tests, validated the system’s robustness and reliability. The insights and methodologies developed are intended to support the Lunar Zebro mission and inform future projects involving space deployment systems. ...