Exploiting the Test-time Reference Map for Visual Place Recognition

doi:10.4233/uuid:23100238-1ab6-40a6-8012-ccca0473d230

Doctoral Thesis (2026)

Author(s)

M. Zaffar (TU Delft - Aerospace Engineering)

Contributor(s)

J.F.P. Kooij – Promotor (TU Delft - Mechanical Engineering)

L. Nan – Copromotor (TU Delft - Architecture and the Built Environment)

Research Group

Control & Simulation

Localization, Feature extraction, Domain adaptation, Representation learning, Visual place recognition, Uncertainty estimation, Description and matching

DOI related publication

https://doi.org/10.4233/uuid:23100238-1ab6-40a6-8012-ccca0473d230 (Final published version)

To reference this document use

https://doi.org/10.4233/uuid:23100238-1ab6-40a6-8012-ccca0473d230

More Info

Publication Year

2026

Language

English

Defense Date

06-05-2026

Awarding Institution

Delft University of Technology

ISBN (print)

978-94-6518-287-2

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Visual Place Recognition (VPR) is a key task in computer vision and robotics, enabling loop closure in SLAM, image-based localization, landmark retrieval, and navigation. While deep-learning approaches have improved robustness to viewpoint, illumination, seasonal, and dynamic changes, three less-explored challenges remain critical: domain generalization, uncertainty estimation, and localization accuracy. This thesis demonstrates that test-time reference maps, typically used only for retrieval, can be exploited to address these challenges without additional sensors or retraining.

A unified evaluation framework, VPR-Bench, is introduced to standardize datasets, metrics, and evaluation practices across the robotics and vision communities. VPR-Bench enables meta-analyses of descriptor size, runtime trade-offs, viewpoint and illumination invariance, and retrieval efficiency, highlighting that no single VPR method is universally best.

To improve cross-domain robustness, Reference-Set Finetuning (RSF) is proposed: a self-supervised finetuning strategy using test-time reference images to reduce train-test domain gaps. For reliability, Spatial Uncertainty Estimation (SUE) leverages reference map metadata to quantify the spatial spread of top-ranked poses, outperforming lightweight methods and complementing geometric verification. Finally, Continuous Place-descriptor Regression (CoPR) densifies the feature space by regressing descriptors at novel poses, reducing localization errors caused by map quantization and enhancing accuracy when combined with viewpoint-variant encoders.
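The idea behind SUE, quantifying how far apart the top-ranked reference poses lie, can be illustrated with a minimal Python sketch. This is not the thesis's formulation: the function name `spatial_spread`, the choice of `k`, the Euclidean descriptor distance, and the root-mean-square spread measure are all assumptions made for illustration, and the toy descriptors and positions are invented.

```python
import numpy as np

def spatial_spread(query_desc, ref_descs, ref_positions, k=3):
    """Return (best-match position, spatial spread of the top-k matches).

    Hypothetical helper: the spread is the RMS distance of the top-k
    reference positions from their centroid (an assumed measure).
    """
    # Euclidean distance between the query and every reference descriptor.
    dists = np.linalg.norm(ref_descs - query_desc, axis=1)
    # Indices of the k most similar references, best first.
    top_k = np.argsort(dists)[:k]
    top_positions = ref_positions[top_k]
    # RMS distance of the retrieved positions from their centroid.
    centroid = top_positions.mean(axis=0)
    spread = float(np.sqrt(((top_positions - centroid) ** 2).sum(axis=1).mean()))
    return ref_positions[top_k[0]], spread

# Toy reference map: one descriptor and one 2-D position per reference image.
ref_descs = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [10.0, 10.0], [20.0, 20.0]])
ref_positions = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [100.0, 0.0], [200.0, 0.0]])

# Query whose top matches are spatial neighbours: small spread.
_, low = spatial_spread(np.array([1.0, 0.1]), ref_descs, ref_positions, k=3)
# Query whose top matches are scattered across the map: large spread.
_, high = spatial_spread(np.array([10.0, 10.0]), ref_descs, ref_positions, k=3)
print(f"clustered matches: {low:.2f}, scattered matches: {high:.2f}")
```

In this sketch a small spread means the top-ranked references agree on where the query was taken, while a large spread flags a retrieval that may be unreliable. The only map metadata it needs is the stored reference positions.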

Overall, this thesis reframes the reference map from a passive database to an active, exploitable resource. By systematically leveraging map information through RSF, SUE, and CoPR, it delivers measurable improvements in robustness, reliability, and localization accuracy, advancing map-aware VPR for real-world robotics and autonomous systems.

Files

Phd_thesis_mzaffar_final_libra... (pdf)

(pdf | 67.3 MB)

License info not available