The Impact of the Retrieval Stage in Interpolation-based Re-Ranking
D.C. Ciacu (TU Delft - Electrical Engineering, Mathematics and Computer Science)
L.J.L. Leonhardt – Mentor (TU Delft - Web Information Systems)
A. Anand – Graduation committee member (TU Delft - Web Information Systems)
A. Hanjalic – Coach (TU Delft - Intelligent Systems)
Abstract
Efficient and effective information retrieval (IR) systems must fetch a large number of relevant documents and rank them by their relevance to the input query. Previous work has relied on sparse and dense retrievers. Sparse retrievers offer low latency but suffer from term mismatch, while dense retrievers improve effectiveness at the cost of higher processing times. Fast-Forward Indexes, an interpolation-based re-ranking framework proposed in the literature, leverage the benefits of both sparse and dense retrievers.
Although much work has been done in the field, most studies evaluate the proposed models only on the MS MARCO dataset and neglect other datasets. This study extends previous work by exploring how different sparse retrievers, employing no-encoder, uni-encoder, and bi-encoder architectures, perform in an interpolation-based re-ranking setting on datasets from various domains. Results show that bi-encoder-based retrievers outperform the other sparse retrievers in terms of recall, but at substantially higher latency than simpler retrievers, which generally also performed well. Furthermore, when used in an interpolation-based re-ranking setting, the retrievers performed similarly in terms of ranking quality.
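
As an illustration of the interpolation-based re-ranking setting described above, the following is a minimal sketch and not the implementation used in this work. It assumes that each candidate document returned by the sparse retriever already has a dense (semantic) score available, and that the final score is a linear combination of the two, weighted by a parameter alpha. The function name, the document IDs, and the score values are hypothetical.

```python
from typing import Dict, List, Tuple


def interpolate_rerank(
    sparse_scores: Dict[str, float],
    dense_scores: Dict[str, float],
    alpha: float = 0.5,
) -> List[Tuple[str, float]]:
    """Re-rank first-stage candidates with a linear score interpolation.

    Assumed scoring rule: alpha * sparse + (1 - alpha) * dense.
    Only documents retrieved by the sparse stage are re-ranked, and every
    one of them must have a dense score (a KeyError is raised otherwise).
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")

    reranked = []
    for doc_id, sparse in sparse_scores.items():
        dense = dense_scores[doc_id]
        reranked.append((doc_id, alpha * sparse + (1.0 - alpha) * dense))

    # Highest interpolated score first.
    reranked.sort(key=lambda pair: pair[1], reverse=True)
    return reranked


# Hypothetical scores for three candidate documents of one query.
sparse = {"d1": 3.0, "d2": 2.0, "d3": 1.0}
dense = {"d1": 0.0, "d2": 4.0, "d3": 3.0}

for doc_id, score in interpolate_rerank(sparse, dense, alpha=0.5):
    print(doc_id, score)
```

With alpha set to 1 the ranking falls back to the sparse retriever's order, and with alpha set to 0 it is determined by the dense scores alone, so alpha controls how much the retrieval stage influences the final ranking.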