Recommender engines and inner product

Last week, I introduced the inner product of vectors as an essential tool for statistical models.  This week, let us apply the inner product to recommender engines.

Do you remember the utility function?  Let me review it briefly here. The utility function is expressed as follows.

U:θ*x→R

U: utility of customers,  θ: customers’ preferences,  x: item features,  R: ratings of the items by the customers

As you know, both θ (customers’ preferences) and x (item features) are vectors.  Let us take movies as an example. Movie features are expressed as follows.

A1: Science fiction movie

A2: Love romance movie

A3: Historical movie

A4: US movie

A5: Japanese movie

A6: Hong Kong movie

θ,x=[A1,A2,A3,A4,A5,A6]

First, let us consider a customer’s preferences. If you like a feature, assign 1 to it.  If you like it very much, assign 2, and if you do not like it, assign 0.  I like science fiction movies and US movies very much, and I like Japanese and Hong Kong movies, while I do not like love romance movies or historical movies. These preferences can be expressed as a vector. My preference vector θ is [2,0,0,2,1,1] because A1=2, A2=0, A3=0, A4=2, A5=1, A6=1. I recommend that you make your own preference vector in the same way.
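If you would like to build your preference vector in code, the following is a minimal sketch. The feature names and the helper make_preference_vector are hypothetical and only for illustration; the 0/1/2 scoring and the A1 to A6 order follow the text above.

```python
# Feature names in A1..A6 order (the names themselves are assumptions).
FEATURES = ["science_fiction", "love_romance", "historical",
            "us", "japanese", "hong_kong"]


def make_preference_vector(ratings):
    """Return scores in A1..A6 order; features not mentioned default to 0."""
    for name, score in ratings.items():
        if name not in FEATURES:
            raise KeyError(f"unknown feature: {name}")
        if score not in (0, 1, 2):
            raise ValueError(f"score for {name} must be 0, 1 or 2")
    return [ratings.get(name, 0) for name in FEATURES]


# My preferences from the text: 2 = like very much, 1 = like, 0 = do not like.
theta = make_preference_vector(
    {"science_fiction": 2, "us": 2, "japanese": 1, "hong_kong": 1}
)
print(theta)  # [2, 0, 0, 2, 1, 1]
```

Replace the dictionary with your own scores to get your own θ.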

Then let us move on to item features.  StarWars, A Chinese ghost story, Seven samurai and Titanic are our selection of movies. Which of these movies should be recommended to me?

OK, let us make an item feature vector for each movie. Each feature is set to 1 if it applies to the movie and 0 otherwise. For example, if the movie is a US movie, A4=1, A5=0, A6=0.

StarWars : x=[1,0,0,1,0,0]

A Chinese ghost story : x=[0,1,0,0,0,1]

Seven samurai : x=[0,0,1,0,1,0]

Titanic : x=[0,1,0,1,0,0]

Finally, let us calculate the value of the utility function for each movie. The bigger the value, the more I am expected to like the movie, so movies with higher values should be recommended to me.  The value is obtained by calculating the inner product of θ (customers’ preferences) and x (item features).  In the StarWars case, the value of the utility function is [2,0,0,2,1,1]*[1,0,0,1,0,0]’ = 4.
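Written out term by term, the StarWars calculation looks like this. The index notation θ_i and x_i for the i-th entries of the two vectors is introduced here only for illustration.

```latex
\[
U = \theta \cdot x = \sum_{i=1}^{6} \theta_i x_i
  = 2\cdot 1 + 0\cdot 0 + 0\cdot 0 + 2\cdot 1 + 1\cdot 0 + 1\cdot 0 = 4
\]
```

Only the features that the movie actually has (A1 and A4 here) contribute to the sum.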

StarWars : U=4

A Chinese ghost story : U=1

Seven samurai : U=1

Titanic : U=2
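The four values above can be reproduced with a few lines of Python. This is only a sketch using plain lists; the helper inner_product and the variable names are my own choices for illustration, not part of any library.

```python
# Preference vector and item feature vectors, copied from the text (A1..A6 order).
theta = [2, 0, 0, 2, 1, 1]
movies = {
    "StarWars": [1, 0, 0, 1, 0, 0],
    "A Chinese ghost story": [0, 1, 0, 0, 0, 1],
    "Seven samurai": [0, 0, 1, 0, 1, 0],
    "Titanic": [0, 1, 0, 1, 0, 0],
}


def inner_product(a, b):
    """Sum of element-wise products of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(ai * bi for ai, bi in zip(a, b))


# Utility of each movie for this customer.
utilities = {title: inner_product(theta, x) for title, x in movies.items()}
print(utilities)
# {'StarWars': 4, 'A Chinese ghost story': 1, 'Seven samurai': 1, 'Titanic': 2}

# Movies ordered from highest to lowest utility.
ranking = sorted(utilities, key=utilities.get, reverse=True)
print(ranking[:2])  # ['StarWars', 'Titanic']
```

Swapping in your own theta gives the ranking for your preferences.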

The highest value goes to StarWars, so it should be recommended to me. Titanic is second, so it may be recommended as well. If you have prepared your own preference vector, you can calculate the values of your utility function and find which movie should be recommended to you!

This is one of the simplest models for calculating the utility of each movie. As I said before, it uses the inner product of vectors. In this case, only six features are selected, but even when the number of features is far larger than six, the inner product still reduces all of that data to a single number, which can be used for better business decisions!