MANtIS: a novel information seeking dialogues dataset

Master Thesis (2019)

Author(s)

A. Bălan (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

C. Hauff – Mentor (TU Delft - Web Information Systems)

Nava Tintarev – Graduation committee member (TU Delft - Web Information Systems)

Zaid Al-Ars – Graduation committee member (TU Delft - Computer Engineering)

Faculty

Electrical Engineering, Mathematics and Computer Science

Keywords

Ranking, Conversational Agent, Information Retrieval, Conversational search, Conversation

To reference this document use:

https://resolver.tudelft.nl/uuid:0ab2d1e4-385e-43cf-9883-cfc6c2f3f19c

More Info

Publication Year

2019

Language

English

Graduation Date

09-12-2019

Awarding Institution

Delft University of Technology

Programme

Computer Science | Data Science and Technology

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Nowadays, most users access the web through search engine portals. However, information needs are often ill-defined or too broad to be resolved by a list of results that the user has to scroll through, which means that users most likely have to refine the need themselves to reach the desired result. In recent years, researchers have attempted to tackle these issues through conversations, more specifically through conversational search. This topic has seen increased interest from the research community, as shown by the appearance of specialized workshops and seminars. The general public has also started to show interest, as shown by the emergence of a wide range of virtual assistants, such as Google Assistant, Microsoft Cortana, and Amazon Alexa. Because such conversational systems seek to fulfill a user's information need, they should be able to elicit and fully understand the user's requirements regardless of the domain, track the conversation as it evolves while attempting to clarify the initial information need, and provide suggestions and answers that are based on concrete knowledge sources. Although various developments in domains adjacent to conversational search have enabled us to better understand natural language, there is a lack of large-scale datasets that are appropriate for training models to perform conversational search tasks. Through our research, we have built a collection of over 80,000 conversations that fulfill the requirements of a conversational search dataset. We have benchmarked this dataset on three distinct tasks using multiple baselines.

Files

Alexandru_Balan_Masters_Thesis... (pdf)

(pdf | 4.01 Mb)

License info not available