Document performance prediction for automatic text classification

Conference Paper (2019)
Author(s)

Gustavo Penha (TU Delft - Web Information Systems)

Raphael Campos (Universidade Federal de Minas Gerais)

Sérgio Canuto (Universidade Federal de Minas Gerais)

Marcos André Gonçalves (Universidade Federal de Minas Gerais)

Rodrygo L.T. Santos (Universidade Federal de Minas Gerais)

Research Group
Web Information Systems
DOI related publication
https://doi.org/10.1007/978-3-030-15719-7_17
Publication Year
2019
Language
English
Pages (from-to)
132-139
ISBN (print)
978-3-030-15718-0
ISBN (electronic)
978-3-030-15719-7

Abstract

Query performance prediction (QPP) is a fundamental task in information retrieval that concerns predicting the effectiveness of a ranking model for a given query in the absence of relevance information. Despite being an active research area, this task has not yet been explored in the context of automatic text classification. In this paper, we study the task of predicting the effectiveness of a classifier for a given document, which we refer to as document performance prediction (DPP). Our experiments on several text classification datasets for both categorization and sentiment analysis attest to the effectiveness and complementarity of several DPP predictors inspired by related QPP approaches. Finally, we also explore the usefulness of DPP predictors for improving the classification itself, by using them as additional features in a classification ensemble.

Metadata only record. There are no files for this record.