@nrailgun 2015-11-10T16:28:04.000000Z

Recommender System

Machine Learning


Key problems:

  1. Collecting the data (the known ratings).
  2. Extrapolating unknown ratings from the known ones.
  3. Evaluating extrapolation methods.

Three approaches to recommender systems:

  1. Content-based
  2. Collaborative
  3. Latent factor based

Content-based Recommender System

Recommend to customer x items similar to those that x has rated highly.

TF-IDF

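$$\mathrm{TF}_{ij} = \frac{f_{ij}}{\max_k f_{kj}}$$

where $f_{ij}$ is the frequency of term (feature) $i$ in doc (item) $j$.

$$\mathrm{IDF}_i = \log\frac{N}{n_i}$$

where $N$ is the number of docs and $n_i$ is the number of docs that mention term $i$.

TF-IDF score: $w_{ij} = \mathrm{TF}_{ij} \times \mathrm{IDF}_i$. Doc profile = set of words with the highest TF-IDF scores.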
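
A minimal sketch of the TF-IDF scoring above, in plain Python. The function names `tf_idf` and `profile`, the toy documents, and the choice of the natural logarithm are assumptions made for illustration; the notes do not fix a log base.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document.

    docs is a list of token lists. TF is normalised by the count of the
    most frequent term in the same document; IDF uses the natural log
    (the base is an assumption of this sketch).
    """
    n_docs = len(docs)
    # n_i: number of documents that mention term i at least once
    doc_freq = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        max_count = max(counts.values()) if counts else 1
        weights.append({
            term: (count / max_count) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


def profile(doc_weights, k):
    """Doc profile: the k terms with the highest TF-IDF score."""
    return sorted(doc_weights, key=doc_weights.get, reverse=True)[:k]


# Hypothetical toy corpus: each document is a list of tokens.
docs = [
    ["space", "ship", "space", "alien"],
    ["love", "story", "ship"],
    ["alien", "war", "war", "space"],
]
scores = tf_idf(docs)
print(profile(scores[1], 2))
```

Terms that occur in every document get a weight of zero, since $\log(N/n_i) = 0$ when $n_i = N$.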

Pros:

Cons:


Collaborative Filtering

User-user collaborative filtering: Find the set N of users whose ratings are similar to user x's, and estimate user x's ratings based on the ratings of the users in N.
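
One way the user-user idea could be sketched in Python. The notes do not say which similarity measure to use; centered cosine similarity, the similarity-weighted average, the parameter `k`, and the toy ratings are all assumptions of this example.

```python
import math


def centered_cosine(a, b):
    """Cosine similarity of two {item: rating} dicts after subtracting
    each user's own mean rating (missing ratings count as 0)."""
    mean_a = sum(a.values()) / len(a)
    mean_b = sum(b.values()) / len(b)
    dot = sum((a[i] - mean_a) * (b[i] - mean_b) for i in a.keys() & b.keys())
    norm_a = math.sqrt(sum((r - mean_a) ** 2 for r in a.values()))
    norm_b = math.sqrt(sum((r - mean_b) ** 2 for r in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def predict_user_user(ratings, x, item, k=2):
    """Estimate x's rating of item from the k most similar users who rated it."""
    candidates = [
        (centered_cosine(ratings[x], ratings[y]), ratings[y][item])
        for y in ratings
        if y != x and item in ratings[y]
    ]
    # Keep the k most similar users, then drop any with non-positive similarity.
    neighbours = sorted(candidates, reverse=True)[:k]
    neighbours = [(s, r) for s, r in neighbours if s > 0]
    total = sum(s for s, _ in neighbours)
    if total == 0:
        return None  # no usable neighbour
    return sum(s * r for s, r in neighbours) / total


# Hypothetical ratings: user -> {item: rating}
ratings = {
    "ann": {"m1": 5, "m2": 4, "m3": 1},
    "bob": {"m1": 5, "m2": 5, "m3": 1, "m4": 5},
    "cat": {"m1": 1, "m2": 2, "m3": 5, "m4": 1},
}
print(predict_user_user(ratings, "ann", "m4"))
```

In this toy data only bob has tastes positively correlated with ann's, so the estimate for ann follows bob's rating of `m4`.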

Item-item collaborative filtering:

  1. For item i, find other similar items.
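  2. Estimate the rating for item i based on the ratings of similar items:

    $$r_{xi} = \frac{\sum_{j \in N(i;x)} S_{ij} \, r_{xj}}{\sum_{j \in N(i;x)} S_{ij}}$$

    where $S_{ij}$ is the similarity of items $i$ and $j$, $r_{xj}$ is user $x$'s rating of item $j$, and $N(i;x)$ is the set of items similar to $i$ that user $x$ has rated.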

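In practice, estimate $r_{xi}$ as a baseline plus the weighted average of the deviations from the baseline:

$$r_{xi} = b_{xi} + \frac{\sum_{j \in N(i;x)} S_{ij} \, (r_{xj} - b_{xj})}{\sum_{j \in N(i;x)} S_{ij}}$$

where $b_{xj} = \mu + b_x + b_j$ is the baseline estimate, $\mu$ is the overall mean movie rating, $b_x = \mu_x - \mu$ is the rating deviation of user $x$ ($\mu_x$ is the mean rating given by $x$), and $b_j = \mu_j - \mu$ is the rating deviation of item $j$ ($\mu_j$ is the mean rating of $j$).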
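
The baseline-corrected estimate can be sketched as follows. This is an illustration under stated assumptions: the similarity function `sim` is passed in by the caller (here a hypothetical lookup table), `k` limits the neighbourhood size, and the means are plain averages over the known ratings.

```python
def baseline(ratings, x, i):
    """b_xi = mu + b_x + b_i, using plain averages of the known ratings."""
    all_r = [r for user in ratings.values() for r in user.values()]
    mu = sum(all_r) / len(all_r)
    user_r = list(ratings[x].values())
    item_r = [user[i] for user in ratings.values() if i in user]
    b_x = sum(user_r) / len(user_r) - mu if user_r else 0.0
    b_i = sum(item_r) / len(item_r) - mu if item_r else 0.0
    return mu + b_x + b_i


def predict_item_item(ratings, sim, x, i, k=2):
    """Baseline plus the similarity-weighted average of x's deviations
    on the k items most similar to i that x has rated."""
    rated = [j for j in ratings[x] if j != i and sim(i, j) > 0]
    neighbours = sorted(rated, key=lambda j: sim(i, j), reverse=True)[:k]
    b_xi = baseline(ratings, x, i)
    total = sum(sim(i, j) for j in neighbours)
    if total == 0:
        return b_xi  # no similar rated item: fall back to the baseline
    dev = sum(
        sim(i, j) * (ratings[x][j] - baseline(ratings, x, j))
        for j in neighbours
    )
    return b_xi + dev / total


# Hypothetical data: user -> {item: rating}, plus a made-up similarity table.
ratings = {"x": {"a": 4, "b": 2}, "y": {"a": 5, "b": 3, "c": 4}}
sim_table = {("c", "a"): 0.8, ("c", "b"): 0.2}


def sim(i, j):
    return sim_table.get((i, j), sim_table.get((j, i), 0.0))


print(predict_item_item(ratings, sim, "x", "c"))
```

Subtracting the baseline first means a harsh rater and a generous rater contribute comparable deviations, which is the motivation for the corrected formula.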

Pros:

Cons:


Practical Tips

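  1. Compare predictions with known ratings:
    • Root mean square error (RMSE): $\sqrt{\frac{1}{|R|}\sum_{(i,x) \in R} (\hat{r}_{xi} - r_{xi})^2}$
    • Precision at top 10: % of those in the top 10
  2. In practice, we care only about predicting high ratings, since those are what a recommender shows.
  3. Finding the k most similar items is expensive; locality-sensitive hashing (LSH) can speed it up.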
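
The two evaluation measures in the first tip can be sketched like this. The dictionary layout keyed by `(item, user)` and the function names are assumptions, and `top_k_overlap` is only one possible reading of the "top 10" measure. Both functions assume at least one known rating.

```python
import math


def rmse(predicted, actual):
    """Root mean square error over the known ratings.

    Both arguments map (item, user) pairs to ratings; every key of
    `actual` is assumed to be present in `predicted`.
    """
    squared = [(predicted[key] - r) ** 2 for key, r in actual.items()]
    return math.sqrt(sum(squared) / len(squared))


def top_k_overlap(predicted, actual, k=10):
    """Fraction of the k best items by true rating that also rank in the
    top k by predicted rating (one possible reading of 'top 10')."""
    by_pred = sorted(actual, key=predicted.get, reverse=True)[:k]
    by_true = sorted(actual, key=actual.get, reverse=True)[:k]
    return len(set(by_pred) & set(by_true)) / len(by_true)


# Hypothetical held-out ratings and predictions for one user.
actual = {("m1", "ann"): 4, ("m2", "ann"): 3, ("m3", "ann"): 1}
predicted = {("m1", "ann"): 3, ("m2", "ann"): 5, ("m3", "ann"): 1}
print(rmse(predicted, actual))
print(top_k_overlap(predicted, actual, k=1))
```

RMSE penalises errors on low ratings as much as errors on high ones, which is why a top-k measure is often reported alongside it.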

Interpolation Weights

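$$\hat{r}_{xi} = b_{xi} + \sum_{j \in N(i;x)} w_{ij} \, (r_{xj} - b_{xj})$$

Learn the weights $w_{ij}$ that minimize the SSE $\sum_{(i,x) \in R} (\hat{r}_{xi} - r_{xi})^2$ on training data. That is, minimize

$$J(W) = \sum_{(i,x) \in R} \Big( \Big[ b_{xi} + \sum_{j \in N(i;x)} w_{ij} \, (r_{xj} - b_{xj}) \Big] - r_{xi} \Big)^2$$

by gradient descent. The gradient is

$$\frac{\partial J(W)}{\partial w_{ij}} = 2 \sum_{x} \Big( \Big[ b_{xi} + \sum_{k \in N(i;x)} w_{ik} \, (r_{xk} - b_{xk}) \Big] - r_{xi} \Big) (r_{xj} - b_{xj})$$

where the sum runs over the users $x$ who rated item $i$ and for whom $j \in N(i;x)$.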
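
A small batch gradient descent sketch for the interpolation weights. To keep it short it works directly on precomputed deviations `dev[x][j]` $= r_{xj} - b_{xj}$, and it takes $N(i;x)$ to be every other item that $x$ rated. Both choices, along with the learning rate, step count, and toy numbers, are assumptions of this example.

```python
def sse(dev, w):
    """J(W) written in terms of deviations dev[x][j] = r_xj - b_xj."""
    total = 0.0
    for x_dev in dev.values():
        for i in x_dev:
            pred = sum(w[i, j] * d for j, d in x_dev.items() if j != i)
            total += (pred - x_dev[i]) ** 2
    return total


def fit_weights(dev, items, lr=0.01, steps=500):
    """Batch gradient descent on J(W), starting from all-zero weights."""
    w = {(i, j): 0.0 for i in items for j in items if i != j}
    for _ in range(steps):
        grad = dict.fromkeys(w, 0.0)
        for x_dev in dev.values():
            for i in x_dev:
                nbrs = [j for j in x_dev if j != i]
                # error term: predicted deviation minus observed deviation
                err = sum(w[i, j] * x_dev[j] for j in nbrs) - x_dev[i]
                for j in nbrs:
                    grad[i, j] += 2 * err * x_dev[j]
        for key in w:
            w[key] -= lr * grad[key]
    return w


# Hypothetical deviations from the baseline: user -> {item: r_xj - b_xj}
dev = {
    "x": {"a": 1.0, "b": 0.5},
    "y": {"a": -1.0, "b": -0.4, "c": 0.8},
    "z": {"a": 0.5, "b": 0.2, "c": -0.5},
}
items = ["a", "b", "c"]
w = fit_weights(dev, items)
print(sse(dev, fit_weights(dev, items, steps=0)), sse(dev, w))
```

Unlike the similarity-based formula, the learned weights are not constrained to sum to one and may be negative.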


Latent Factor Models

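$R$ is the rating matrix, where $R_{ix}$ is user $x$'s rating of item $i$. Applying SVD ($A = U \Sigma V^T$) to $R$ gives $R = Q P^T$ with $Q = U$ and $P^T = \Sigma V^T$, so that $\hat{r}_{xi} = q_i \cdot p_x$, where $q_i$ is row $i$ of $Q$ and $p_x$ is row $x$ of $P$.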

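SVD isn't defined when entries are missing! Use specialized methods to find $P$ and $Q$:

$$\min_{P,Q} \sum_{(i,x) \in R} (r_{xi} - q_i \cdot p_x)^2$$

Introducing regularization:

$$\min_{P,Q} \sum_{(i,x) \in R} (r_{xi} - q_i \cdot p_x)^2 + \Big[ \lambda_1 \sum_x \|p_x\|^2 + \lambda_2 \sum_i \|q_i\|^2 \Big]$$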
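
The notes do not name a specific optimization method, so the sketch below uses plain batch gradient descent on the regularized objective as one possible choice. The function names, hyperparameters, random initialisation, and toy ratings are all assumptions made for illustration.

```python
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def objective(ratings, Q, P, lam1, lam2):
    """Squared error on known ratings plus the two regularization terms."""
    err = sum((r - dot(Q[i], P[x])) ** 2 for (i, x), r in ratings.items())
    reg = lam1 * sum(dot(p, p) for p in P.values())
    reg += lam2 * sum(dot(q, q) for q in Q.values())
    return err + reg


def factorize(ratings, n_factors=2, lam1=0.1, lam2=0.1,
              lr=0.01, steps=500, seed=0):
    """ratings maps (item, user) -> rating. Returns (Q, P) as dicts of lists."""
    rng = random.Random(seed)
    items = sorted({i for i, _ in ratings})
    users = sorted({x for _, x in ratings})
    Q = {i: [rng.uniform(-0.1, 0.1) for _ in range(n_factors)] for i in items}
    P = {x: [rng.uniform(-0.1, 0.1) for _ in range(n_factors)] for x in users}
    for _ in range(steps):
        # gradient of the regularization terms
        gQ = {i: [2 * lam2 * v for v in q] for i, q in Q.items()}
        gP = {x: [2 * lam1 * v for v in p] for x, p in P.items()}
        # gradient of the squared error, summed over the known ratings only
        for (i, x), r in ratings.items():
            err = r - dot(Q[i], P[x])
            for f in range(n_factors):
                gQ[i][f] -= 2 * err * P[x][f]
                gP[x][f] -= 2 * err * Q[i][f]
        Q = {i: [v - lr * g for v, g in zip(Q[i], gQ[i])] for i in Q}
        P = {x: [v - lr * g for v, g in zip(P[x], gP[x])] for x in P}
    return Q, P


# Hypothetical sparse ratings: (item, user) -> rating
ratings = {
    ("m1", "ann"): 5, ("m2", "ann"): 4,
    ("m1", "bob"): 4, ("m3", "bob"): 1,
    ("m2", "cat"): 2, ("m3", "cat"): 5,
}
Q, P = factorize(ratings)
print(dot(Q["m3"], P["ann"]))  # estimate for a rating that is missing
```

Only the known entries of $R$ enter the error term, which is what lets this approach sidestep the missing-entry problem of a direct SVD.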
