Collaborative recommender systems often give disappointing suggestions to new users who have supplied few or no ratings of their own: this is known as the cold-start problem. Cross-domain information helps mitigate this shortcoming; this paper builds on the authors' previous work to offer an incremental improvement in handling cold-start users.
The authors previously published an approach to single-domain recommendations based on clustering latent factors, using matrix factorization and k-means. Here, they apply the same principle to seeding cold-start recommendations with cross-domain information, exploiting multiple domains that share some of their users. Validation was carried out using two datasets, one from Amazon comprising ratings for media (video, music, DVD) and goods (electronics, kitchen, toys) and another from Epinions comprising ten disparate categories of items. The authors compared single-domain, traditional cross-domain, and their new clustered cross-domain top-N recommendations, using recall for N of 5, 10, 15, or 20 as the metric. Cross-domain cold-start recommendation performed better than single-domain; clustered cross-domain performed as well as traditional cross-domain for low N, and better than traditional for larger N.
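For readers unfamiliar with the metric, recall at N can be sketched as follows. This is a minimal illustration, assuming the common definition (the fraction of a user's held-out items that appear among the top N recommendations, averaged over users); the paper's exact evaluation protocol may differ. The function names and the toy data are hypothetical and do not come from the paper.

```python
from typing import Dict, List, Set


def recall_at_n(ranked: List[str], relevant: Set[str], n: int) -> float:
    """Fraction of the held-out relevant items found in the top n of a ranking.

    Assumes `ranked` contains no duplicate items.
    """
    if not relevant:
        return 0.0
    hits = sum(1 for item in ranked[:n] if item in relevant)
    return hits / len(relevant)


def mean_recall_at_n(
    rankings: Dict[str, List[str]],
    held_out: Dict[str, Set[str]],
    n: int,
) -> float:
    """Average recall at n over users who have at least one held-out item."""
    users = [u for u, items in held_out.items() if items]
    if not users:
        return 0.0
    total = sum(recall_at_n(rankings.get(u, []), held_out[u], n) for u in users)
    return total / len(users)


# Toy data (hypothetical): ranked recommendations and held-out items per user.
rankings = {
    "u1": ["a", "b", "c", "d", "e", "f"],
    "u2": ["x", "y", "z", "a", "b", "c"],
}
held_out = {"u1": {"b", "f"}, "u2": {"c"}}

# The same values of N as in the reviewed evaluation.
for n in (5, 10, 15, 20):
    print(n, mean_recall_at_n(rankings, held_out, n))
```

Under this definition recall can only stay level or rise as N grows, so the informative comparison is between methods at the same N, as in the results summarized above.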
The main contribution of this paper is a relatively simple implementation of an underlying idea that is not entirely original but is new in this form. Its main limitation is the difficulty of assessing impact on the basis of a single machine-driven metric on only two datasets; although this is traditionally acceptable, validation would be more informative if carried out with more data, multiple metrics, and ideally with live users.