Wednesday, October 17, 2012

Movie recommendation demo with matrix factorization

I have been experimenting with GraphLab and Mahout for matrix factorization lately.

Matrix factorization maps both items and users into the same latent factor space, so they can be compared directly.
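To make "compared directly" concrete, here is a minimal sketch of scoring a user against a movie with a dot product of their latent vectors. The names and the 2-dimensional factor values are made up for illustration; they are not taken from the demo.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

# Hypothetical latent factors with 2 dimensions (illustrative numbers only).
user_factors = {"alice": [1.0, 0.0]}
movie_factors = {
    "action_movie": [2.0, 0.5],
    "romance_movie": [0.5, 2.0],
}

def predict(user, movie):
    """Predicted affinity is the dot product of the user and movie factors."""
    return dot(user_factors[user], movie_factors[movie])

print(predict("alice", "action_movie"))
print(predict("alice", "romance_movie"))
```

Because both vectors live in the same space, a larger dot product simply means the movie points in a direction the user prefers.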

Even though Mahout and GraphLab are great tools for matrix factorization, they are designed for batch processing. To get recommendations for a new user who rates movies that already exist in the rating matrix, the following two steps are necessary.

1) Transform the user's rating vector into a user latent feature vector.
2) Compare all movie latent feature vectors with the vector from 1) and calculate scores.
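The two steps can be sketched roughly as follows. This is one common way to do it (a regularized least-squares "fold-in" against fixed movie factors), not necessarily what the demo or Mahout does internally. The function names, the `reg` parameter, and the toy factors are all assumptions for illustration.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination
    with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def fold_in_user(ratings, movie_factors, reg=0.1):
    """Step 1: map {movie: rating} to a latent vector x by solving
    (Y^T Y + reg * I) x = Y^T r over the rated movies only."""
    k = len(next(iter(movie_factors.values())))
    a = [[reg if i == j else 0.0 for j in range(k)] for i in range(k)]
    b = [0.0] * k
    for movie, rating in ratings.items():
        y = movie_factors[movie]
        for i in range(k):
            b[i] += rating * y[i]
            for j in range(k):
                a[i][j] += y[i] * y[j]
    return solve(a, b)

def recommend(ratings, movie_factors, top_n=10, reg=0.1):
    """Step 2: score every unrated movie against the new user vector."""
    user = fold_in_user(ratings, movie_factors, reg)
    scores = [
        (sum(u * y for u, y in zip(user, factors)), movie)
        for movie, factors in movie_factors.items()
        if movie not in ratings
    ]
    scores.sort(reverse=True)
    return [movie for _, movie in scores[:top_n]]

# Hypothetical movie factors, as if produced by a batch ALS job.
toy_factors = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 0.1], "d": [0.1, 1.0]}
print(recommend({"a": 5.0, "b": 1.0}, toy_factors))
```

The movie factors stay fixed, so only a small k-by-k system is solved per request, which is what makes this usable outside the batch job.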

This demo asks the user to rate movies and then performs steps 1) and 2).

Most of the work is just glue code connecting Mahout with Jetty.

Check this out and feel free to give me any feedback.

Update: added label propagation to find serendipitous recommendations. Since the training data is small enough (1.7 million users, 40K movies, 19 million edges), the demo just loads the training data into memory.

Todo: I will update this post with evaluation metrics (RMSE, MAP, precision-recall) after running batch jobs on this dataset, using Mahout/GraphLab for ALS and Giraph for label propagation.
I will also add item-based CF as a baseline to compare the results against.
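
For reference, two of the planned metrics can be computed along these lines. This is a minimal sketch with hypothetical function names, using precision at k as a stand-in for the precision side of precision-recall; it reports no actual results for the dataset.

```python
import math

def rmse(predicted, actual):
    """Root mean squared error between predicted and actual ratings."""
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))

def precision_at_k(recommended, relevant, k):
    """Fraction of the top-k recommended items that are in the relevant set."""
    top_k = recommended[:k]
    return sum(1 for item in top_k if item in relevant) / k

# Toy inputs for illustration only.
print(rmse([3.0, 5.0], [4.0, 4.0]))
print(precision_at_k(["a", "b", "c"], {"a", "c"}, 2))
```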