Do the Findings of Document and Passage Retrieval Generalize to the Retrieval of Responses for Dialogues?

Conference Paper (2023)
Author(s)

Gustavo Penha (TU Delft - Web Information Systems)

C. Hauff (TU Delft - Web Information Systems)

Research Group
Web Information Systems
Copyright
© 2023 G. Penha, C. Hauff
DOI (related publication)
https://doi.org/10.1007/978-3-031-28241-6_9
Publication Year
2023
Language
English
Bibliographical Note
Green Open Access added to TU Delft Institutional Repository ‘You share, we take care!’ – Taverne project, https://www.openaccess.nl/en/you-share-we-take-care. Otherwise, as indicated in the copyright section: the publisher is the copyright holder of this work and the author uses the Dutch legislation to make this work public.
Pages (from-to)
132-147
ISBN (print)
978-3-031-28240-9
ISBN (electronic)
978-3-031-28241-6
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

A number of learned sparse and dense retrieval approaches have recently been proposed and shown to be effective in tasks such as passage retrieval and document retrieval. In this paper we analyze, via a replicability study, whether the lessons learned generalize to the retrieval of responses for dialogues, an important task for the increasingly popular field of conversational search. Unlike passage and document retrieval, where documents are usually longer than queries, in response ranking for dialogues the queries (dialogue contexts) are often longer than the documents (responses). Additionally, dialogues have a particular structure, i.e. multiple utterances by different users. With these differences in mind, we evaluate how generalizable the following major findings from previous work are: (F1) query expansion outperforms a no-expansion baseline; (F2) document expansion outperforms a no-expansion baseline; (F3) zero-shot dense retrieval underperforms sparse baselines; (F4) dense retrieval outperforms sparse baselines; (F5) hard negative sampling is better than random sampling for training dense models. Our experiments (https://github.com/Guzpenha/transformer_rankers/tree/full_rank_retrieval_dialogues), based on three different information-seeking dialogue datasets, reveal that four out of five findings (F2–F5) generalize to our domain.
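To make the task setup concrete, the following is a minimal illustrative sketch, not the authors' code: the model name, example dialogue, and candidate responses are invented for illustration. It scores candidate responses for a dialogue context with a sparse BM25 baseline and an off-the-shelf zero-shot dense bi-encoder, the two retriever families compared in findings F3–F5, treating the concatenated dialogue context as the query.

```python
# Illustrative sketch only (not the authors' code): ranking candidate
# responses for a dialogue context with a sparse BM25 baseline and a
# zero-shot dense bi-encoder.
# Requires: pip install rank-bm25 sentence-transformers
from rank_bm25 import BM25Okapi
from sentence_transformers import SentenceTransformer, util

# The "query" is the dialogue context (concatenated utterances); the
# "documents" are short candidate responses -- note the length inversion
# relative to passage/document retrieval.
context = ("USER: my laptop won't boot after the update "
           "AGENT: does it reach the BIOS screen? "
           "USER: yes, then it hangs")
candidates = [
    "Try booting into safe mode and rolling back the update.",
    "I love that movie too!",
    "Check whether the boot order still lists your SSD first.",
]

# Sparse baseline: BM25 over whitespace-tokenized responses.
bm25 = BM25Okapi([c.lower().split() for c in candidates])
sparse_scores = bm25.get_scores(context.lower().split())

# Zero-shot dense retrieval: an off-the-shelf MS MARCO bi-encoder
# (example model choice), used without in-domain fine-tuning (cf. F3).
encoder = SentenceTransformer("msmarco-distilbert-base-v4")
ctx_emb = encoder.encode(context, convert_to_tensor=True)
cand_emb = encoder.encode(candidates, convert_to_tensor=True)
dense_scores = util.cos_sim(ctx_emb, cand_emb)[0].tolist()

for response, s_sparse, s_dense in zip(candidates, sparse_scores, dense_scores):
    print(f"BM25={s_sparse:5.2f}  dense={s_dense:.3f}  {response}")
```

Fine-tuning the bi-encoder on in-domain dialogue data, with hard rather than random negatives, corresponds to the setups probed by findings F4 and F5.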

Files

978_3_031_28241_6_9.pdf
(pdf | 0.452 MB)
- Embargo expired on 16-09-2023
License info not available