Vertical Selection for Heterogeneous Search Engine Result Pages

Bachelor Thesis (2021)

Author(s)

A. Vilčinskas (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

C. Hauff – Mentor (TU Delft - Electrical Engineering, Mathematics and Computer Science)

G. Iosifidis – Graduation committee member (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Faculty

Electrical Engineering, Mathematics and Computer Science

Keywords

Aggregated search, Vertical selection, Heterogeneous search engine result page

To reference this document use

https://resolver.tudelft.nl/uuid:d72f2b0e-ad8a-4337-8649-80be3df405b8

Publication Year

2021

Language

English

Graduation Date

01-07-2021

Awarding Institution

Delft University of Technology

Project

CSE3000 Research Project

Programme

Computer Science and Engineering

Downloads counter

183

Collections

thesis

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

The items a user sees on the general result page of a modern search engine can be grouped into verticals, such as images, videos, news, and shopping. A heterogeneous search engine result page is a result page that combines results from several verticals. Such pages are widely used and have been shown to improve the user experience compared with result pages that contain only a list of websites. Which verticals are appropriate depends on the query. We study how to define, develop, and evaluate a vertical selection model that selects and presents the appropriate verticals for a given query. We describe an approach for collecting a corpus of documents that represent the different verticals. The corpus documents are then used as training data for query result classification: features are extracted from the documents to train a classifier. The model that uses a Random Forest classifier and features extracted from the query itself achieved an F-score of 0.4921 on the TREC 2014 dataset. This score and the analysis of the results show that the proposed vertical selection methodology is viable. To better capture the differences between documents in different verticals, the corpus collection approach should be improved.
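As an illustration of how an F-score for vertical selection can be computed, the following is a minimal sketch and not the thesis's evaluation code. It assumes the score is the F1 measure between the set of verticals predicted for a query and the set judged relevant, averaged over all queries; the abstract does not state the exact variant. The function names, query identifiers, and vertical labels are hypothetical.

```python
def f1_for_query(predicted, relevant):
    """F1 between predicted and relevant vertical sets for one query."""
    predicted, relevant = set(predicted), set(relevant)
    true_positives = len(predicted & relevant)
    # Assumption: a query with no correctly selected vertical scores 0.0,
    # including the case where both sets are empty.
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(relevant)
    return 2 * precision * recall / (precision + recall)


def mean_f1(predictions, ground_truth):
    """Average the per-query F1 over every query in the ground truth."""
    scores = [
        f1_for_query(predictions.get(query_id, ()), relevant)
        for query_id, relevant in ground_truth.items()
    ]
    return sum(scores) / len(scores)


# Hypothetical judgments and predictions for two queries.
ground_truth = {"q1": {"news", "video"}, "q2": {"images"}}
predictions = {"q1": {"news", "images"}, "q2": {"images"}}

score = mean_f1(predictions, ground_truth)
```

For `q1` one of the two predicted verticals is relevant and one of the two relevant verticals is found, so precision and recall are both 0.5 and the F1 is 0.5. For `q2` the prediction is exact, so the F1 is 1.0, and the mean over both queries is 0.75.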

Files

Research_Paper.pdf

(pdf | 1.28 Mb)

License info not available