MANtIS: a novel information seeking dialogues dataset

Master thesis (2019)

Authors

A. Bălan, Electrical Engineering, Mathematics and Computer Science

Contributors

C. Hauff, Web Information Systems (mentor)

N. Tintarev, Web Information Systems (graduation committee member)

Z. Al-Ars, Computer Engineering (graduation committee member)

Faculty

Electrical Engineering, Mathematics and Computer Science

Ranking, Conversational Agent, Information Retrieval, Conversational search, Conversation

To reference this document use:

http://resolver.tudelft.nl/uuid:0ab2d1e4-385e-43cf-9883-cfc6c2f3f19c

Published Date

09-12-2019

Language

English

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Nowadays, most users access the web through search engine portals. However, information needs are often ill-defined or too broad to be satisfied by a list of results that the user has to scroll through, so users typically have to refine the need themselves to reach the desired result. In recent years, researchers have attempted to tackle these issues through conversations, more specifically through conversational search. The topic has attracted growing interest from the research community, as shown by the appearance of specialized workshops and seminars. The general public has also started to show interest, as shown by the emergence of a wide range of virtual assistants, such as Google Assistant, Microsoft Cortana, and Amazon Alexa. Because such conversational systems seek to fulfill a user's information need, they should be able to elicit and fully understand the user's requirements regardless of the domain, track the conversation as it evolves while attempting to clarify the initial information need, and provide suggestions and answers that are grounded in concrete knowledge sources. Although various developments in domains adjacent to conversational search have enabled us to better understand natural language, there is a lack of large-scale datasets suitable for training models to perform conversational search tasks. Through our research, we have built a collection of over 80,000 conversations that fulfill the requirements of a conversational search dataset. We have benchmarked this dataset on three distinct tasks using multiple baselines.

Files

Alexandru_Balan_Masters_Thesis... (.pdf)

(.pdf | 4.01 Mb)