Model-Based Collab Filtering

Introducing Recommendation Systems

  • The goal of any recommender system is to do the following:

    • Predict the ratings that a certain user would give to different catalog items
    • Create a list of recommendations by selecting and ranking the items with the highest predicted ratings
  • Collaborative filtering is used for creating recommendation systems

Introducing Collaborative Filtering

  • Under the hood, collaborative filtering uses a predictive model to predict a rating for a given pair of user and item
  • These predictions are based on how similar users rate similar items
  • Specifically, the model captures interactions between known users and items from the rating matrix
  • The benefit of collaborative filtering is that it can make recommendations based only on the patterns and similarities available in the rating matrix
  • It has the following disadvantages:

    • Difficult to build reliable rating predictions if the rating matrix is too sparse
    • Difficult to handle new users or items (cold start problem)
    • Somewhat biased towards popular items (difficult to pick up on unusual tastes)

Defining Types of Collaborative Filtering

  • Collaborative filtering algorithms are usually categorized into two subgroups:

    • Neighborhood-based methods
    • Model-based methods
  • Neighborhood-based methods predict unknown ratings for a given user or item by doing the following:

    • Finding the most similar known users or items
    • Averaging ratings from their records
  • Model-based methods go beyond the nearest neighbor approach

    • Specifically, they use more sophisticated, predictive models

Introducing Model-Based Collaborative Filtering

  • Often, a neighborhood-based algorithm does the following:

    • Optionally, imputes missing data (using neighborhood-based or model-based methods)
    • Computes the similarity between each user and item
    • Optionally, groups similar users beforehand using locality-sensitive hashing
    • Predicts the rating of a user-item pair by calculating the weighted average of the k nearest neighbors
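The weighted-average step above can be sketched in Python. The toy rating matrix, the cosine similarity over co-rated items, and the `predict_knn` helper are illustrative assumptions, not a reference implementation:

```python
import numpy as np

# Toy rating matrix: rows = users, columns = items, 0 = unrated.
# The values are made up purely for illustration.
R = np.array([
    [5.0, 3.0, 0.0, 1.0],
    [4.0, 0.0, 0.0, 1.0],
    [1.0, 1.0, 0.0, 5.0],
    [1.0, 0.0, 5.0, 4.0],
])

def cosine_sim(a, b):
    """Cosine similarity computed over co-rated entries only."""
    mask = (a > 0) & (b > 0)
    if not mask.any():
        return 0.0
    a, b = a[mask], b[mask]
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def predict_knn(R, user, item, k=2):
    """Weighted average of the k most similar users who rated `item`."""
    sims = []
    for other in range(R.shape[0]):
        if other != user and R[other, item] > 0:
            sims.append((cosine_sim(R[user], R[other]), R[other, item]))
    sims.sort(reverse=True)              # most similar users first
    top = sims[:k]
    num = sum(s * r for s, r in top)
    den = sum(abs(s) for s, r in top)
    return num / den if den else 0.0

print(predict_knn(R, user=0, item=2, k=2))
```

Here, unrated entries are stored as zeros, and only users who actually rated the target item are eligible neighbors.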
  • Often, a model-based algorithm does the following instead:

    • Optionally, imputes missing data (using neighborhood-based or model-based methods)
    • Computes latent factors by performing matrix factorization (using SVD or PCA)

      • Naturally, this acts as a form of dimensionality reduction
    • Predicts the rating of a user-item pair with matrix multiplication of the latent factors
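As a minimal sketch of the model-based prediction step, assuming the latent factor matrices have already been learned (the numbers below are invented for illustration):

```python
import numpy as np

# Hypothetical learned factors for 3 users and 4 items with k = 2
# latent factors; the values are made up for illustration.
U = np.array([[1.2, 0.4],
              [0.3, 1.1],
              [0.9, 0.8]])              # user matrix, shape (u, k)
I = np.array([[1.0, 0.2, 0.5, 1.3],
              [0.1, 1.4, 0.9, 0.2]])    # item matrix, shape (k, i)

# Predicted ratings for every (user, item) pair come from one matrix product.
R_hat = U @ I                            # shape (u, i)

# A single prediction is the dot product of a user row and an item column.
print(R_hat[0, 1])                       # predicted rating for user 0, item 1
```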

Introducing Matrix Factorization

  • Matrix factorization models (or latent factor models) are the most popular model-based methods
  • The following are popular forms of matrix factorization:

    • Unconstrained matrix factorization (or Funk MF)

      • Optimization using stochastic gradient descent (SGD)
      • Optimization using alternating least squares (ALS)
    • Singular value decomposition (SVD)
    • Non-negative matrix factorization
  • Most matrix factorization methods follow these steps:

    • Define the model as a product of factor matrices
    • Define the objective function
    • Optimize with gradient descent
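These steps can be sketched as plain-NumPy Funk MF trained with stochastic gradient descent; the toy ratings, learning rate, regularization strength, and epoch count are all illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Observed ratings as (user, item, rating) triples -- toy data.
ratings = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0),
           (1, 2, 1.0), (2, 1, 1.0), (2, 2, 5.0)]
n_users, n_items, k = 3, 3, 2

# Model: R is approximated by U @ I, initialized with small random values.
U = 0.1 * rng.standard_normal((n_users, k))
I = 0.1 * rng.standard_normal((k, n_items))

lr, reg = 0.05, 0.02                     # learning rate, L2 regularization
for _ in range(500):
    for u, i, r in ratings:
        err = r - U[u] @ I[:, i]         # prediction error on one rating
        # Gradient step on the regularized squared error.
        U[u] += lr * (err * I[:, i] - reg * U[u])
        I[:, i] += lr * (err * U[u] - reg * I[:, i])

# Mean squared error over the observed ratings after training.
mse = np.mean([(r - U[u] @ I[:, i]) ** 2 for u, i, r in ratings])
print(mse)
```

Each SGD update touches only the factor vectors of the one user and one item involved, which is what makes this approach scale to large, sparse rating matrices.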

Advantages of Model-Based Collaborative Filtering

  • This approach offers the following advantages over neighborhood-based methods:

    • Accuracy

      • These models tend to be more accurate than kNN
    • Stability

      • Offers dimensionality reduction methods
      • Thus, sparse matrices can be transformed into more condensed representations
      • This improves the stability of predictions
    • Scalability

      • Models can be trained offline
      • Then, these models can be evaluated for online requests
  • Some model-based methods can provide all of these improvements

    • Others can only provide some of them

Defining Singular Value Decomposition

  • Again, the rating matrix R consists of:

    • A u number of rows representing individual users
    • An i number of columns representing individual items
    • Each entry represents a rating
  • Latent factor models decompose the ratings matrix into:

    • A u × k user matrix U
    • A k × i item matrix I
    • A k × k weight matrix Σ

      • Essentially, this is a diagonal matrix denoting which latent factors carry most of the information
      • Or, each value located on the diagonal of the matrix represents the strength of the k-th latent factor
  • Here, u is the number of users in our rating matrix
  • Here, i is the number of items in our rating matrix
  • Here, k is the number of latent factors

    • Each latent factor represents some learned context
    • Increasing the number of latent factors improves personalization
    • Increasing the number of latent factors by too much harms performance and leads to overfitting
    • Typically, a regularization term is introduced to avoid overfitting
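A minimal NumPy sketch of this decomposition, keeping only the k strongest latent factors (zeros are treated as observed ratings purely to keep the example short; real systems handle missing entries separately):

```python
import numpy as np

# Toy ratings matrix (users x items); the values are illustrative.
R = np.array([[5, 3, 0, 1],
              [4, 0, 0, 1],
              [1, 1, 0, 5],
              [1, 0, 5, 4]], dtype=float)

# Full SVD: R = U Σ V^T
U, s, Vt = np.linalg.svd(R, full_matrices=False)

# Keep only the k strongest latent factors (the largest singular values).
k = 2
R_hat = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# Reconstruction error shrinks as k grows toward full rank.
print(np.linalg.norm(R - R_hat))
```

Truncating Σ to its k largest entries is exactly the dimensionality reduction described above: the dense factors U, Σ, and Vt replace the sparse rating matrix.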

(Image: singular value decomposition of the rating matrix)

Illustrating Latent Factor Models

  • Suppose we have a ratings matrix R consisting of customers and movies
  • For this example, we'll assign the number of latent factors k to equal 2
  • Thus, we'll learn two different types of context from our customers U and movies I
  • Notice, our optimization algorithm will learn 2 different contexts:

    • The degree to which a movie is meant for children or a customer is a child
    • The degree to which a movie is a blockbuster or a customer enjoys blockbuster movies
  • The following image illustrates the learned latent factors:

(Image: learned latent factors for the customers and movies example)
