Quantization for compact neural passage re-ranking

Abstract

Passage re-ranking is a fundamental problem in information retrieval: the task of reordering a small set of candidate passages by their relevance to a query. It is a crucial component of many web information systems, such as search engines and question-answering systems. Modern re-ranking systems rely on neural language models such as BERT and its derivatives to create dense indexes for the target document corpus. While such approaches bring significant effectiveness gains over classical lexical re-rankers, they come with the disadvantage of a much larger memory footprint.
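As a concrete, purely illustrative example of this setup, the sketch below re-ranks candidate passages with an off-the-shelf single-vector dual encoder via the sentence-transformers library; the model name is an assumption chosen for illustration, not one of the encoders studied in this thesis.

```python
# Minimal sketch of dense re-ranking with a single-vector dual encoder.
# The model is illustrative; any BERT-derived dual encoder works the same way.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("sentence-transformers/msmarco-distilbert-base-v4")

# Candidate passages, e.g. the top results from a first-stage retriever.
passages = [
    "Product quantization compresses vectors into short codes.",
    "BM25 is a classical lexical ranking function.",
]
index = model.encode(passages)          # dense index: one float32 vector per passage

query_vec = model.encode(["how does vector compression work"])[0]
scores = index @ query_vec              # dot-product relevance scores
reranked = np.argsort(-scores)          # passage order, best first
```

Storing one float32 vector per passage is exactly what makes such indexes memory-hungry, which motivates the quantization methods discussed next.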

Vector quantization is a family of methods for reducing the memory footprint of a dense index. Vector quantization algorithms usually combine clustering with space-manipulation operations to compress the dense index lossily, at the expense of retrieval effectiveness. While vector quantization is widely used for first-stage retrieval, its use in the context of re-ranking remains underexplored. To address this gap, this thesis evaluates the effectiveness of product quantization, a well-known vector quantization method, on single-vector dual-encoders, specifically TCT-ColBERT and Aggretriever. In addition, we show how linear interpolation with sparse scores can be leveraged to improve the effectiveness of quantized dense indexes at negligible cost in memory footprint or speed. Finally, we propose WolfPQ, a learnable quantization method aimed at further improving quantization for re-ranking by bridging the gap between the objective functions used to train the product quantizer and the re-ranking system.
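For intuition, here is a minimal sketch of product quantization (the subspace count and codebook size are hypothetical choices, not the configuration evaluated in the thesis): each vector is split into subvectors, a k-means codebook is learned per subspace, and each subvector is then stored as a one-byte centroid index.

```python
# Minimal product-quantization sketch; parameters are illustrative.
import numpy as np
from sklearn.cluster import KMeans

def train_pq(vectors, num_subspaces=8, codebook_size=256):
    """Learn one k-means codebook per subspace of the vector."""
    d = vectors.shape[1]
    assert d % num_subspaces == 0
    sub_dim = d // num_subspaces
    codebooks = []
    for m in range(num_subspaces):
        sub = vectors[:, m * sub_dim:(m + 1) * sub_dim]
        km = KMeans(n_clusters=codebook_size, n_init=4).fit(sub)
        codebooks.append(km.cluster_centers_)
    return codebooks

def encode_pq(vectors, codebooks):
    """Replace each subvector with the index of its nearest centroid,
    compressing d float32 values down to num_subspaces uint8 codes."""
    sub_dim = codebooks[0].shape[1]
    codes = []
    for m, cb in enumerate(codebooks):
        sub = vectors[:, m * sub_dim:(m + 1) * sub_dim]
        dists = np.linalg.norm(sub[:, None, :] - cb[None, :, :], axis=-1)
        codes.append(dists.argmin(axis=1).astype(np.uint8))
    return np.stack(codes, axis=1)

def decode_pq(codes, codebooks):
    """Approximate reconstruction: concatenate the selected centroids."""
    return np.hstack([cb[codes[:, m]] for m, cb in enumerate(codebooks)])
```

The linear interpolation mentioned above can likewise be sketched as a convex combination of a sparse score (e.g., from BM25) and the score from the quantized dense index; the weight alpha and its placement are assumptions here and would be tuned on held-out queries.

```python
def interpolated_score(dense_score, sparse_score, alpha=0.1):
    # alpha is a hypothetical value; tune it on a development set.
    return alpha * sparse_score + (1.0 - alpha) * dense_score
```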