Do the Findings of Document and Passage Retrieval Generalize to the Retrieval of Responses for Dialogues?

Conference Paper (2023)
Author(s)

Gustavo Penha (TU Delft - Web Information Systems)

C. Hauff (TU Delft - Web Information Systems)

Research Group
Web Information Systems
Copyright
© 2023 G. Penha, C. Hauff
DOI (related publication)
https://doi.org/10.1007/978-3-031-28241-6_9
Publication Year
2023
Language
English
Bibliographical Note
Green Open Access added to TU Delft Institutional Repository ‘You share, we take care!’ – Taverne project, https://www.openaccess.nl/en/you-share-we-take-care. Otherwise, as indicated in the copyright section: the publisher is the copyright holder of this work and the author uses the Dutch legislation to make this work public.
Pages (from-to)
132-147
ISBN (print)
978-3-031-28240-9
ISBN (electronic)
978-3-031-28241-6
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

A number of learned sparse and dense retrieval approaches have recently been proposed and shown to be effective in tasks such as passage retrieval and document retrieval. In this paper we analyze, via a replicability study, whether the lessons learned generalize to the retrieval of responses for dialogues, an important task for the increasingly popular field of conversational search. Unlike passage and document retrieval, where documents are usually longer than queries, in response ranking for dialogues the queries (dialogue contexts) are often longer than the documents (responses). Additionally, dialogues have a particular structure, i.e. multiple utterances by different users. With these differences in mind, we evaluate how generalizable the following major findings from previous work are: (F1) query expansion outperforms a no-expansion baseline; (F2) document expansion outperforms a no-expansion baseline; (F3) zero-shot dense retrieval underperforms sparse baselines; (F4) dense retrieval outperforms sparse baselines; (F5) hard negative sampling is better than random sampling for training dense models. Our experiments (https://github.com/Guzpenha/transformer_rankers/tree/full_rank_retrieval_dialogues), based on three different information-seeking dialogue datasets, reveal that four out of five findings (F2–F5) generalize to our domain.
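To make the task setup concrete, the following is a minimal illustrative sketch, not the authors' code: the model name, example dialogue, and candidate responses are invented for illustration. It scores candidate responses for a dialogue context with a sparse BM25 baseline and an off-the-shelf zero-shot dense bi-encoder, the two retriever families compared in findings F3–F5, treating the concatenated dialogue context as the query.

```python
# Illustrative sketch only (not the authors' code): ranking candidate
# responses for a dialogue context with a sparse BM25 baseline and a
# zero-shot dense bi-encoder.
# Requires: pip install rank-bm25 sentence-transformers
from rank_bm25 import BM25Okapi
from sentence_transformers import SentenceTransformer, util

# The "query" is the dialogue context (concatenated utterances); the
# "documents" are short candidate responses -- note the length inversion
# relative to passage/document retrieval.
context = ("USER: my laptop won't boot after the update "
           "AGENT: does it reach the BIOS screen? "
           "USER: yes, then it hangs")
candidates = [
    "Try booting into safe mode and rolling back the update.",
    "I love that movie too!",
    "Check whether the boot order still lists your SSD first.",
]

# Sparse baseline: BM25 over whitespace-tokenized responses.
bm25 = BM25Okapi([c.lower().split() for c in candidates])
sparse_scores = bm25.get_scores(context.lower().split())

# Zero-shot dense retrieval: an off-the-shelf MS MARCO bi-encoder
# (example model choice), used without in-domain fine-tuning (cf. F3).
encoder = SentenceTransformer("msmarco-distilbert-base-v4")
ctx_emb = encoder.encode(context, convert_to_tensor=True)
cand_emb = encoder.encode(candidates, convert_to_tensor=True)
dense_scores = util.cos_sim(ctx_emb, cand_emb)[0].tolist()

for response, s_sparse, s_dense in zip(candidates, sparse_scores, dense_scores):
    print(f"BM25={s_sparse:5.2f}  dense={s_dense:.3f}  {response}")
```

Fine-tuning the bi-encoder on in-domain dialogue data, with hard rather than random negatives, corresponds to the setups probed by findings F4 and F5.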

Files

978_3_031_28241_6_9.pdf
(pdf | 0.452 MB)
- Embargo expired on 16-09-2023
License info not available