Quantization for compact neural passage re-ranking

Master's Thesis (2024)
Author(s)

C.P. Lupău (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

L.J.L. Leonhardt – Mentor (TU Delft - Web Information Systems)

A. Anand – Mentor (TU Delft - Web Information Systems)

Kubilay Atasu – Graduation committee member (TU Delft - Data-Intensive Systems)

Faculty
Electrical Engineering, Mathematics and Computer Science
Publication Year
2024
Language
English
Graduation Date
24-06-2024
Awarding Institution
Delft University of Technology
Programme
Computer Science
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Passage re-ranking, a fundamental problem in information retrieval, deals with reordering a small set of candidate passages by their relevance to a query. It is a crucial component in various web information systems, such as search engines and question-answering systems. Modern approaches to building re-ranking systems rely on neural language models such as BERT or its derivatives to create dense indexes for the target document corpus. While such approaches bring significant performance gains compared to classical lexical re-rankers, they come at the cost of increased memory consumption.

Vector quantization is a family of methods that can be used to reduce the memory footprint of a dense index. Vector quantization algorithms usually rely on a combination of clustering and space manipulation operations to perform lossy compression of the dense index at the expense of index performance. While vector quantization is widely used for first-stage retrieval, its use in the context of re-ranking is underexplored. To address this gap, this thesis evaluates the effectiveness of product quantization, a well-known vector quantization method, on single-vector dual-encoders, specifically TCT-ColBERT and Aggretriever. In addition, we show how linear interpolation of sparse scores can be leveraged to improve the performance of quantized dense indexes with negligible costs to the memory footprint or speed. Finally, we propose WolfPQ, a learnable quantization method aimed at further improving quantization for re-ranking by bridging the gap between the objective functions used to train product quantization and re-ranking systems, respectively.
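To make the two core ideas in the abstract concrete, the sketch below illustrates (1) product quantization, which splits each dense vector into subspaces and replaces each sub-vector with the index of its nearest learned centroid, and (2) linear interpolation of dense and sparse relevance scores. This is an illustrative toy implementation using only NumPy, not the thesis's actual pipeline; all function names, the number of subspaces, codebook sizes, and the interpolation weight `alpha` are our own assumptions.

```python
import numpy as np


def kmeans(x, k, iters=20, seed=0):
    # Plain Lloyd's k-means; returns a (k, sub_dim) codebook of centroids.
    rng = np.random.default_rng(seed)
    centroids = x[rng.choice(len(x), size=k, replace=False)]
    for _ in range(iters):
        # Assign each point to its nearest centroid (squared L2 distance).
        dists = ((x[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
        assign = dists.argmin(1)
        for j in range(k):
            members = x[assign == j]
            if len(members):
                centroids[j] = members.mean(0)
    return centroids


def pq_train(vectors, n_sub, k):
    # Learn one codebook per subspace; dim must divide evenly by n_sub.
    return [kmeans(s, k) for s in np.split(vectors, n_sub, axis=1)]


def pq_encode(vectors, codebooks):
    # Each vector is compressed to n_sub small integer codes.
    codes = []
    for s, cb in zip(np.split(vectors, len(codebooks), axis=1), codebooks):
        d = ((s[:, None, :] - cb[None, :, :]) ** 2).sum(-1)
        codes.append(d.argmin(1))
    return np.stack(codes, axis=1)


def pq_decode(codes, codebooks):
    # Lossy reconstruction: concatenate the chosen centroid per subspace.
    return np.concatenate(
        [cb[codes[:, i]] for i, cb in enumerate(codebooks)], axis=1
    )


def interpolated_score(dense_score, sparse_score, alpha=0.5):
    # Linear interpolation of dense (quantized) and sparse lexical scores.
    return alpha * dense_score + (1 - alpha) * sparse_score
```

Storing 8-bit codes instead of 32-bit floats gives roughly a `sub_dim * 4x` compression per subspace; the interpolation step then recovers some of the effectiveness lost to quantization by mixing in an uncompressed lexical signal (e.g. BM25) at negligible extra cost.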

Files

License info not available