Query Answerability Classifier for Direct Answer Module in Web Search Engines

Bachelor Thesis (2021)
Author(s)

Y. Wang (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

C Hauff – Mentor (TU Delft - Web Information Systems)

George Iosifidis – Graduation committee member (TU Delft - Embedded Systems)

Faculty
Electrical Engineering, Mathematics and Computer Science
Copyright
© 2021 Yiran Wang
Publication Year
2021
Language
English
Graduation Date
30-06-2021
Awarding Institution
Delft University of Technology
Project
CSE3000 Research Project
Programme
Computer Science and Engineering
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

To determine when a web search engine can show a direct answer module for a user query, this study designs an independent classifier that assesses the answerability of each query. Real user queries are sampled from the MS MARCO Question Answering and Natural Language Generation dataset \cite{MSMARCO} and manually labelled with query answerability to train and evaluate the classifier. In the evaluation, the XGBoost model performs better overall than the random forest model, achieving a prediction accuracy of 0.83 and an F1 score of 0.89. Once the classifier determines that a user query is answerable, an MRC (machine reading comprehension) model may be used to find the direct answer within the provided passages. Otherwise, no direct answer is provided for the query.
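The gating step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the thesis implementation: `direct_answer`, `toy_classifier`, and `toy_mrc` are hypothetical names, and the keyword rule merely stands in for the trained XGBoost or random forest classifier.

```python
from typing import Callable, Optional, Sequence


def direct_answer(
    query: str,
    passages: Sequence[str],
    is_answerable: Callable[[str], bool],
    extract_answer: Callable[[str, Sequence[str]], Optional[str]],
) -> Optional[str]:
    """Return a direct answer, or None when no direct answer should be shown."""
    if not is_answerable(query):
        # Classifier says "not answerable": skip the MRC model entirely.
        return None
    return extract_answer(query, passages)


# Hypothetical stand-ins, for illustration only.
def toy_classifier(query: str) -> bool:
    # Placeholder rule (assumption); a trained model would be used in practice.
    prefixes = ("what", "who", "when", "where", "how many")
    return query.lower().startswith(prefixes)


def toy_mrc(query: str, passages: Sequence[str]) -> Optional[str]:
    # Placeholder (assumption): return the first passage if one exists.
    return passages[0] if passages else None
```

Keeping the classifier separate from the MRC model means the answer extractor only runs on queries that pass the gate, and either component can be swapped without touching the other.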

Files

Research_Project_19_3_.pdf
(pdf | 0.548 Mb)
License info not available