Towards Minimal Necessary Data
The Case for Analyzing Training Data Requirements of Recommender Algorithms
M.A. Larson (TU Delft - Multimedia Computing, Radboud Universiteit Nijmegen)
Alessandro Zito (Politecnico di Milano)
B. Loni (TU Delft - Multimedia Computing)
Paolo Cremonesi (Politecnico di Milano)
Abstract
This paper states the case for the principle of minimal necessary data: If two recommender algorithms achieve the same effectiveness, the better algorithm is the one that requires less user data. Applying this principle involves carrying out training data requirements analysis, which we argue should be adopted as best practice for the development and evaluation of recommender algorithms. We take the position that responsible recommendation is recommendation that serves the people whose data it uses. To minimize the imposition on users’ privacy, it is important that a recommender system does not collect or store more user information than it absolutely needs. Further, algorithms using minimal necessary data reduce training time and address the cold start problem. To illustrate the trade-off between training data volume and accuracy, we carry out a set of classic recommender system experiments. We conclude that consistently applying training data requirements analysis would represent a relatively small change in researchers’ current practices, but a large step towards more responsible recommender systems.
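
As a rough illustration of what a training data requirements analysis could look like in practice, the sketch below trains a recommender on growing fractions of the training set and scores each fraction against the same held-out test set. It is not the experimental setup of the paper. The synthetic ratings, the bias-only baseline predictor, the choice of RMSE as the accuracy measure, and all function names are assumptions made for this example.

```python
import random
from collections import defaultdict
from math import sqrt


def make_synthetic_ratings(n_users=50, n_items=40, density=0.3, seed=0):
    """Generate synthetic (user, item, rating) triples.

    Assumption: a stand-in for a real dataset, used only to make the
    sketch self-contained.
    """
    rng = random.Random(seed)
    user_bias = [rng.gauss(0, 0.5) for _ in range(n_users)]
    item_bias = [rng.gauss(0, 0.5) for _ in range(n_items)]
    ratings = []
    for u in range(n_users):
        for i in range(n_items):
            if rng.random() < density:
                r = 3.0 + user_bias[u] + item_bias[i] + rng.gauss(0, 0.3)
                ratings.append((u, i, min(5.0, max(1.0, r))))
    return ratings


def fit_bias_model(train):
    """Fit global mean plus per-user and per-item offsets (simple baseline)."""
    mu = sum(r for _, _, r in train) / len(train)
    user_res = defaultdict(list)
    for u, _, r in train:
        user_res[u].append(r - mu)
    bu = {u: sum(v) / len(v) for u, v in user_res.items()}
    item_res = defaultdict(list)
    for u, i, r in train:
        item_res[i].append(r - mu - bu[u])
    bi = {i: sum(v) / len(v) for i, v in item_res.items()}
    return mu, bu, bi


def rmse(model, test):
    """Root mean squared error; unseen users or items get a zero offset."""
    mu, bu, bi = model
    se = sum((r - (mu + bu.get(u, 0.0) + bi.get(i, 0.0))) ** 2
             for u, i, r in test)
    return sqrt(se / len(test))


def data_requirements_curve(ratings, fractions=(0.1, 0.25, 0.5, 1.0),
                            test_share=0.2, seed=0):
    """Return (fraction, n_train_ratings, rmse) for each training fraction."""
    rng = random.Random(seed)
    shuffled = ratings[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_share)
    test, train = shuffled[:n_test], shuffled[n_test:]
    curve = []
    for f in fractions:
        k = max(1, int(len(train) * f))  # size of the training subset
        curve.append((f, k, rmse(fit_bias_model(train[:k]), test)))
    return curve


if __name__ == "__main__":
    for f, n, err in data_requirements_curve(make_synthetic_ratings()):
        print(f"{f:.2f} of training data ({n} ratings): RMSE {err:.3f}")
```

Under the principle of minimal necessary data, the point of interest on such a curve is the smallest fraction at which the error stops improving meaningfully. The test set is held fixed across fractions so that the scores remain comparable.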