Evaluating Recommendation Algorithms Based on U-I Matrix Property Analysis

Abstract

In recent years, recommender systems have become increasingly important for addressing the information overload problem. They sort through massive amounts of data to provide users with personalized content and services. Most researchers focus on designing new algorithms to improve the performance of recommender systems. However, some open challenges remain: Why does the performance of an algorithm vary so much across data sets? Which properties of a data set influence the accuracy of the algorithms? In this thesis, we introduce methodologies to investigate the impact of the properties of user-item (U-I) interactions on the accuracy of classical collaborative filtering recommendation algorithms. First, we propose to characterize U-I matrix properties in three domains: the network topology, spectrum, and information domains. Next, we design several network modification algorithms that systematically modify basic topological properties of a given U-I matrix in order to create additional U-I matrices. As the topological features are modified, the properties in the spectrum and information domains change as well. Finally, we evaluate several classical collaborative filtering algorithms on a large number of U-I matrices and explore which properties in the three domains influence, or better explain, the accuracy of the algorithms. We find that the effect of U-I matrix properties on the accuracy of recommendation algorithms is approximately consistent across data sets. We identify two properties, from the network topology and information domains respectively, that better explain the accuracy of the algorithms. Understanding how U-I matrix properties affect the accuracy of algorithms has practical significance: designers of recommender systems can estimate and explain the accuracy of their systems, and can draw on these findings to design policies that steer user-item interactions so that the accuracy of their recommendation algorithms improves.
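As a rough illustration of what characterizing a U-I matrix in three domains can look like, the following minimal sketch computes one example property per domain for a small binary U-I matrix. The abstract does not name the specific properties used in the thesis, so the choices here (density and mean degrees for topology, largest singular value for the spectrum, Shannon entropy of item popularity for information) are assumptions made for illustration only, as is the function name `ui_matrix_properties`.

```python
import numpy as np


def ui_matrix_properties(R):
    """Compute one illustrative property per domain for a U-I matrix.

    R is a (users x items) array; any positive entry counts as an interaction.
    The properties below are assumed examples, not the thesis's actual set.
    """
    R = (np.asarray(R, dtype=float) > 0).astype(float)
    n_users, n_items = R.shape
    n_links = R.sum()

    # Topology domain: density and mean degrees of the bipartite network.
    user_degrees = R.sum(axis=1)
    item_degrees = R.sum(axis=0)
    density = n_links / (n_users * n_items)

    # Spectrum domain: largest singular value of the U-I matrix.
    largest_singular_value = np.linalg.svd(R, compute_uv=False)[0]

    # Information domain: Shannon entropy (in bits) of item popularity.
    if n_links > 0:
        p = item_degrees / n_links
        p = p[p > 0]  # skip items with no interactions (0 * log 0 = 0)
        item_entropy = float(-(p * np.log2(p)).sum())
    else:
        item_entropy = 0.0

    return {
        "density": float(density),
        "mean_user_degree": float(user_degrees.mean()),
        "mean_item_degree": float(item_degrees.mean()),
        "largest_singular_value": float(largest_singular_value),
        "item_popularity_entropy": item_entropy,
    }


if __name__ == "__main__":
    # Hypothetical toy matrix: 4 users, 3 items.
    R = [[1, 1, 0],
         [1, 0, 0],
         [1, 0, 1],
         [0, 1, 0]]
    for name, value in ui_matrix_properties(R).items():
        print(f"{name}: {value:.4f}")
```

Running such a function before and after a topology modification (for example, adding or rewiring interactions) would show how properties in the other two domains shift along with the topological ones.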