Exploring methods to improve effectiveness of ad-hoc retrieval systems for long and complex queries

Abstract

Ad-hoc retrieval involves ranking documents from a large collection by their relevance to a given input query. Such retrieval systems often perform worse on longer and more complex queries. This paper explores methods for improving retrieval effectiveness on these queries across different information retrieval (IR) tasks, within the context of Fast-Forward indexes. An analysis is first conducted to determine the actual impact of query length and complexity. Interestingly, the hypothesis that longer queries are more challenging does not hold in all cases, and in some datasets the opposite is true. To improve effectiveness on long and complex queries, two approaches are explored: using multiple dense models during the re-ranking stage instead of the traditional single model, and reducing the queries with large language models (LLMs). Using multiple dense models for re-ranking proves effective, with two models providing the best balance between performance and ranking quality. Using LLMs for query reduction achieves performance similar to that of the original queries but fails to improve their ranking scores.
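
To make the multi-model re-ranking idea concrete, the following is a minimal sketch rather than the paper's implementation. It assumes that first-stage sparse scores are interpolated with dot-product scores from several dense models, generalising the single-model interpolation used by Fast-Forward indexes. The function name `rerank_multi_dense`, the weighting scheme, and the toy vectors are illustrative assumptions and do not come from the paper.

```python
from typing import Dict, List, Sequence, Tuple


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def rerank_multi_dense(
    sparse_scores: Dict[str, float],
    query_vecs: Sequence[Sequence[float]],
    doc_vecs: Sequence[Dict[str, Sequence[float]]],
    weights: Sequence[float],
) -> List[Tuple[str, float]]:
    """Re-rank first-stage candidates with several dense models.

    Hypothetical scheme (an assumption, not the paper's exact method):
    weights[0] scales the sparse score and weights[i + 1] scales the
    dot-product score of dense model i. query_vecs[i] is the query
    encoded by model i; doc_vecs[i] maps document ids to the
    pre-computed vectors of model i, as looked up from an index.
    """
    if len(weights) != len(query_vecs) + 1 or len(query_vecs) != len(doc_vecs):
        raise ValueError("need one weight for sparse plus one per dense model")

    scored = []
    for doc_id, sparse in sparse_scores.items():
        score = weights[0] * sparse
        for w, q_vec, index in zip(weights[1:], query_vecs, doc_vecs):
            score += w * dot(q_vec, index[doc_id])
        scored.append((doc_id, score))

    # Highest interpolated score first.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Toy data: two candidate documents and two dense models.
    sparse = {"d1": 10.0, "d2": 8.0}
    queries = [[1.0, 0.0], [0.0, 2.0]]
    docs = [
        {"d1": [1.0, 0.0], "d2": [5.0, 0.0]},
        {"d1": [0.0, 1.0], "d2": [0.0, 3.0]},
    ]
    for doc_id, score in rerank_multi_dense(sparse, queries, docs, (0.2, 0.4, 0.4)):
        print(doc_id, round(score, 3))
```

With a single dense model and weights `(alpha, 1 - alpha)`, this reduces to the usual two-way interpolation of sparse and dense scores. Each additional model adds one vector look-up and one dot product per candidate document, which is one way to picture the trade-off between cost and ranking quality that the abstract describes.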