Recommender systems are widely used to recommend products such as music, movies, books, news, research articles, and restaurants [1][5].
There are two popular methods for building recommender systems:
The collaborative filtering method [5] predicts (filters) a user's interest in a product by collecting preference information from many other users (collaborating). The assumption behind the collaborative filtering method is that if a person P1 has the same opinion as another person P2 on an issue, P1 is more likely to share P2’s opinion on a different issue than that of a randomly chosen person [5].
The content-based filtering method [6] uses product features/attributes to recommend other products similar to what the user likes, based on that user's own previous actions or explicit feedback such as product ratings.
A recommender system may use either or both of these two methods.
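To make the content-based idea concrete, here is a minimal sketch in plain Python. It assumes each product is described by a small hand-made feature vector (the `ITEM_FEATURES` dictionary, the movie names, and the genre weights are all hypothetical) and that cosine similarity is a reasonable way to compare products; other feature representations and similarity measures are equally possible.

```python
import math

# Hypothetical item features: weights for (action, comedy, drama).
ITEM_FEATURES = {
    "Movie A": [1.0, 0.0, 0.0],
    "Movie B": [0.9, 0.1, 0.0],
    "Movie C": [0.0, 1.0, 0.0],
    "Movie D": [0.0, 0.2, 0.8],
}


def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def recommend_similar(liked_item, item_features, top_n=2):
    """Return the top_n items whose features are closest to liked_item."""
    liked_vec = item_features[liked_item]
    scored = [
        (name, cosine(liked_vec, vec))
        for name, vec in item_features.items()
        if name != liked_item
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [name for name, _ in scored[:top_n]]
```

With this toy data, `recommend_similar("Movie A", ITEM_FEATURES, top_n=1)` returns `["Movie B"]`, because only "Movie B" shares the action feature. Note that no other user's ratings are involved, which is the key difference from collaborative filtering.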
In this article, I use the Kaggle Netflix prize data [2] to demonstrate how to use the model-based collaborative filtering method to build a recommender system in Python.
The rest of the article is arranged as follows:
As described in [5], the main idea behind collaborative filtering is that one person often gets the best recommendations from another with similar interests. Collaborative filtering uses various techniques to match people with similar interests and make recommendations based on shared interests.
The high-level workflow of a collaborative filtering system can be described as follows:
Typically a collaborative filtering system recommends products to a given user in two steps [5]:
This is called user-based collaborative filtering. One specific implementation of this method is the user-based Nearest Neighbor algorithm.
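A user-based nearest-neighbor predictor can be sketched in a few lines of plain Python. The sketch below is only an illustration under stated assumptions: the `RATINGS` dictionary and user names are hypothetical, similarity is cosine similarity over the items two users have both rated, and the prediction is a similarity-weighted average of the neighbours' ratings. Real systems often use Pearson correlation or mean-centred ratings instead.

```python
import math

# Hypothetical toy data: user -> {item: rating on a 1-5 scale}.
RATINGS = {
    "alice": {"m1": 5, "m2": 4, "m3": 1},
    "bob": {"m1": 5, "m2": 5, "m3": 1, "m4": 5},
    "carol": {"m1": 1, "m2": 2, "m3": 5, "m4": 1},
}


def user_similarity(a, b):
    """Cosine similarity over the items both users have rated."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[i] * b[i] for i in common)
    norm_a = math.sqrt(sum(a[i] ** 2 for i in common))
    norm_b = math.sqrt(sum(b[i] ** 2 for i in common))
    return dot / (norm_a * norm_b)  # ratings are positive, so norms are nonzero


def predict_rating(user, item, ratings, k=2):
    """Predict `user`'s rating of `item` from the k most similar users."""
    # Find other users who rated the item, with their similarity to `user`.
    neighbours = [
        (user_similarity(ratings[user], other_ratings), other_ratings[item])
        for other, other_ratings in ratings.items()
        if other != user and item in other_ratings
    ]
    neighbours.sort(key=lambda pair: pair[0], reverse=True)
    top = neighbours[:k]
    # Combine the neighbours' ratings, weighted by similarity.
    total_sim = sum(sim for sim, _ in top)
    if total_sim == 0:
        return None
    return sum(sim * rating for sim, rating in top) / total_sim
```

Calling `predict_rating("alice", "m4", RATINGS)` gives a value pulled toward bob's rating of 5 rather than carol's rating of 1, because alice's past ratings look much more like bob's.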
As an alternative, item-based collaborative filtering (e.g., users who are interested in x are also interested in y) works in an item-centric manner:
There are two types of collaborative filtering systems:
In a model-based system, we develop models using different machine learning algorithms to predict users’ ratings of unrated items [5]. There are many model-based collaborative filtering algorithms, such as singular value decomposition (SVD), Bayesian networks, and clustering models [5].
A memory-based system uses users’ rating data to compute the similarity between users or items. Typical examples of this type of system are neighbourhood-based methods and item-based or user-based top-N recommendations [5].
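As a rough illustration of the memory-based approach applied to items, the sketch below computes item-to-item similarity directly from stored ratings, with no trained model. The `RATINGS` data, the item names, and the choice of cosine similarity over users who rated both items are assumptions made for this example only.

```python
import math
from collections import defaultdict

# Hypothetical toy data: user -> {item: rating on a 1-5 scale}.
RATINGS = {
    "u1": {"x": 5, "y": 5, "z": 1},
    "u2": {"x": 4, "y": 5, "z": 2},
    "u3": {"x": 1, "y": 2, "z": 5},
    "u4": {"x": 2, "y": 1, "z": 4},
}


def ratings_by_item(ratings):
    """Turn user -> {item: rating} into item -> {user: rating}."""
    by_item = defaultdict(dict)
    for user, user_ratings in ratings.items():
        for item, rating in user_ratings.items():
            by_item[item][user] = rating
    return dict(by_item)


def item_similarity(a, b):
    """Cosine similarity over the users who rated both items."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[u] * b[u] for u in common)
    norm_a = math.sqrt(sum(a[u] ** 2 for u in common))
    norm_b = math.sqrt(sum(b[u] ** 2 for u in common))
    return dot / (norm_a * norm_b)  # ratings are positive, so norms are nonzero


def most_similar_items(item, ratings, top_n=2):
    """Return the top_n items rated most similarly to `item`."""
    by_item = ratings_by_item(ratings)
    target = by_item[item]
    scored = [
        (other, item_similarity(target, vec))
        for other, vec in by_item.items()
        if other != item
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [name for name, _ in scored[:top_n]]
```

Here `most_similar_items("x", RATINGS, top_n=1)` returns `["y"]`: the users who rated x highly also rated y highly, which is the "users who are interested in x are also interested in y" pattern.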
This article describes how to build a model-based collaborative filtering system using the SVD model.
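To give a feel for what an SVD-style model does, here is a minimal sketch in plain Python. It is an assumption-laden illustration, not necessarily the implementation used later in the article: in recommender systems "SVD" often refers to a matrix factorization fitted by stochastic gradient descent on the observed ratings only, and that is what this sketch shows. The toy `RATINGS` triples, the hyperparameters (`lr`, `reg`, `n_epochs`), and all function names are hypothetical.

```python
import random

# Hypothetical toy data: (user index, item index, rating on a 1-5 scale).
RATINGS = [
    (0, 0, 5.0), (0, 1, 4.0), (0, 2, 1.0),
    (1, 0, 5.0), (1, 1, 5.0), (1, 3, 4.0),
    (2, 0, 1.0), (2, 2, 5.0), (2, 3, 2.0),
    (3, 1, 1.0), (3, 2, 4.0), (3, 3, 1.0),
]


def dot(p, q):
    return sum(a * b for a, b in zip(p, q))


def predict(mu, P, Q, u, i):
    """Predicted rating: global mean plus user-factor . item-factor."""
    return mu + dot(P[u], Q[i])


def rmse(ratings, mu, P, Q):
    sq_err = sum((r - predict(mu, P, Q, u, i)) ** 2 for u, i, r in ratings)
    return (sq_err / len(ratings)) ** 0.5


def init_factors(n_rows, n_factors, rng):
    """Small random starting values for the latent factors."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(n_factors)]
            for _ in range(n_rows)]


def train(ratings, mu, P, Q, lr=0.02, reg=0.02, n_epochs=500):
    """Fit P and Q in place with regularized stochastic gradient descent."""
    n_factors = len(P[0])
    for _ in range(n_epochs):
        for u, i, r in ratings:
            err = r - predict(mu, P, Q, u, i)
            for f in range(n_factors):
                pu, qi = P[u][f], Q[i][f]
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)


rng = random.Random(0)  # fixed seed so the run is repeatable
mu = sum(r for _, _, r in RATINGS) / len(RATINGS)
P = init_factors(4, 2, rng)  # 4 users, 2 latent factors
Q = init_factors(4, 2, rng)  # 4 items, 2 latent factors

rmse_before = rmse(RATINGS, mu, P, Q)
train(RATINGS, mu, P, Q)
rmse_after = rmse(RATINGS, mu, P, Q)
unseen = predict(mu, P, Q, 0, 3)  # user 0 never rated item 3
```

After training, `rmse_after` should be lower than `rmse_before`, and `unseen` is the model's estimate for a rating that is absent from the data. Predicting such unrated items is the point of a model-based system.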