Performance Comparison of Different Query Expansion and Pseudo-Relevance Feedback Methods

A comparison of Bo1, KL, RM3, and Axiomatic Query Expansion against BM25

Abstract

This paper analyses the performance of, and the logic behind, different query expansion models. Query expansion and pseudo-relevance feedback are techniques for adding terms to a query based on the results of an initial query and the contents of the document collection. Four query expansion models provided by the PyTerrier Python library and its extensions are analysed: Bo1, KL, RM3, and Axiomatic query expansion. Axiomatic query expansion was found to often perform no expansion at all, and when it does expand the query, performance does not improve. Bo1 and KL, although they differ in their exact logic, produce similar results most of the time. The most significant difference between them is execution time: Bo1 is faster on larger datasets, while KL is faster with many documents on smaller datasets. Lastly, RM3, while not the dominant performer, has considerable potential for good results with the right combination of parameters.
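As a rough illustration of the pseudo-relevance feedback loop described above, the following minimal sketch runs an initial query, treats the top-ranked documents as relevant, and appends their most frequent new terms to the query. It is not the PyTerrier implementation and does not reproduce Bo1, KL, RM3, or Axiomatic term weighting: the toy corpus, the term-count ranker standing in for BM25, and the names `retrieve`, `expand_query`, `fb_docs`, and `fb_terms` are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical toy corpus, for illustration only.
DOCS = {
    "d1": "relevance feedback improves query expansion",
    "d2": "pseudo relevance feedback expands the query with feedback terms",
    "d3": "bm25 ranks documents by term frequency",
    "d4": "pasta needs boiling water and salt",
}


def tokenize(text):
    """Lowercase and split on whitespace."""
    return text.lower().split()


def retrieve(query_terms, docs):
    """Rank documents by how often the query terms occur in them.

    This simple count is a stand-in for a real ranker such as BM25.
    Documents with no matching term are left out; ties are broken by id.
    """
    scores = {}
    for doc_id, text in docs.items():
        tf = Counter(tokenize(text))
        score = sum(tf[term] for term in query_terms)
        if score > 0:
            scores[doc_id] = score
    return sorted(scores, key=lambda d: (-scores[d], d))


def expand_query(query, docs, fb_docs=2, fb_terms=2):
    """Append the most frequent new terms from the top fb_docs documents."""
    original = tokenize(query)
    feedback = retrieve(original, docs)[:fb_docs]
    counts = Counter()
    for doc_id in feedback:
        # Only count terms that are not already in the query.
        counts.update(t for t in tokenize(docs[doc_id]) if t not in original)
    # Most frequent first; ties are broken alphabetically.
    ranked = sorted(counts, key=lambda t: (-counts[t], t))
    return original + ranked[:fb_terms]


if __name__ == "__main__":
    print(expand_query("query expansion", DOCS))
```

The models compared in this paper differ mainly in how they score candidate terms from the feedback documents; the raw frequency count used here is the simplest possible choice and serves only to show the overall flow.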