Rank similarity measures quantify how closely two ordered sets of items agree. Rank-Biased Overlap (RBO) is a top-weighted measure of rank similarity that can be applied to a pair of indefinite rankings, that is, rankings for which only a prefix is known and whose items need not be present in both rankings. The measure is frequently used in Information Retrieval (IR), for example to compare search engine results. RBO defines tight lower and upper bounds, RBO_min and RBO_max, which capture the uncertainty due to items in the unseen suffix. Another source of uncertainty is ties: two items are tied in a ranking if their true order is not known. Recent work on the treatment of ties has made RBO a tie-aware measure. However, unlike the uncertainty due to unseen items, the uncertainty due to ties does not disappear for longer prefixes. Determining the distribution of possible scores is O((n!)^2) if all arrangements of ties are considered, and existing methods only find the lower and upper bounds of RBO with respect to ties. We investigate whether a probabilistic estimator for the uncertainty distribution can be constructed. We use an iterative convolution method to compose the marginal PMFs of each item. By evaluating against synthetic data, we show that the estimated distribution can be used to reliably compute confidence intervals, the mean, and the variance. We conclude that a probabilistic method is a viable solution when deterministic results and fast computation are required.
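
To illustrate the general idea of composing marginal PMFs by iterative convolution, the following minimal Python sketch may help. It is not the estimator described above: it assumes that each item's contribution is an independent discrete random variable on an integer grid 0, 1, 2, ... (real-valued scores would first have to be discretised), and the names `convolve`, `compose`, `mean_var`, and `central_interval` are hypothetical helpers introduced only for this example.

```python
from itertools import accumulate


def convolve(p, q):
    """PMF of X + Y for independent X ~ p and Y ~ q.

    Assumption: p[k] is the probability of the value k (integer grid).
    """
    out = [0.0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out


def compose(marginals):
    """Fold the marginal PMFs together, one convolution per item."""
    total = [1.0]  # point mass at 0: the empty sum
    for pmf in marginals:
        total = convolve(total, pmf)
    return total


def mean_var(pmf):
    """Mean and variance read directly off the composed PMF."""
    mean = sum(k * p for k, p in enumerate(pmf))
    var = sum((k - mean) ** 2 * p for k, p in enumerate(pmf))
    return mean, var


def central_interval(pmf, level=0.95):
    """Smallest grid values whose CDF reaches each tail quantile."""
    tail = (1.0 - level) / 2.0
    cdf = list(accumulate(pmf))
    lo = next(k for k, c in enumerate(cdf) if c >= tail)
    hi = next(k for k, c in enumerate(cdf) if c >= 1.0 - tail)
    return lo, hi


# Toy marginals (made up for illustration, not taken from any experiment).
marginals = [[0.5, 0.5], [0.5, 0.5], [0.25, 0.5, 0.25]]
total = compose(marginals)
mean, var = mean_var(total)
interval = central_interval(total, level=0.5)
```

Each convolution costs time proportional to the product of the two support sizes, so composing the items one by one remains polynomial, in contrast to enumerating every arrangement of ties. The output is also deterministic, since no sampling is involved.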