Searched for: subject:"statistics"
(1 - 2 of 2)
Urbano, Julián (author), De Lima, H.A. (author), Hanjalic, A. (author)
Statistical significance testing is widely accepted as a means to assess how well a difference in effectiveness reflects an actual difference between systems, as opposed to random noise due to the selection of topics. According to recent surveys of SIGIR, CIKM, ECIR and TOIS papers, the t-test is the most popular choice among IR researchers....
conference paper 2019
Urbano, Julián (author), De Lima, H.A. (author), Hanjalic, A. (author)
In test-collection-based evaluation of IR systems, score standardization has been proposed to compare systems across collections and minimize the effect of outlier runs on specific topics. The underlying idea is to account for the difficulty of topics, so that systems are scored relative to it. Webber et al. first proposed standardization...
conference paper 2019