Software Programming

Kunuk Nykjaer

How does the Amazon recommendation system work? – Analyze the algorithm and make a prototype that visualizes the algorithm



I think recommendation systems are interesting.

I decided to learn how the Amazon recommendation system works in theory and then implement a demo site that visualizes the algorithm. My implementation is not identical to Amazon’s version, but it follows the same principle. My main focus is to illustrate and visualize the concept. I used an item-to-item matrix table for simplicity.

Reference reading materials:
(1) http://www.cs.umd.edu/~samir/498/Amazon-Recommendations.pdf
(2) Amazon Recommendation Patent
(3) http://stackoverflow.com/questions/2323768/how-does-the-amazon-recommendation-feature-work
(4) http://maya.cs.depaul.edu/mobasher/papers/ewmf04-web/node1.html
(5) http://blog.echen.me/2011/02/15/an-overview-of-item-to-item-collaborative-filtering-with-amazons-recommendation-system/

I recommend reading (1). It is only 5 pages long and easy to read. Amazon’s algorithm is based on item-to-item collaborative filtering. In short: Amazon developed its own recommendation system based on items rather than users, because there are fewer items than users (scalability). The items visited by each user are stored in an item-to-item matrix table. The recommendations are calculated by applying the cosine similarity function to the vectors from the matrix table.

I recommend reading (2) for technical insight and a design overview of how Amazon has implemented it.

For the demo site I wrote a short abstract. The demo site can be seen here:
http://jory.dk/AreaRecommendation/
The demo project can be downloaded here

Abstract: 
How does the Amazon recommendation system work?

This is about visualizing the item-to-item collaborative filtering mechanism using an item-to-item matrix table.
The item-to-item matrix, the vectors and the calculated data values are displayed.

There are n different items and the item recommendation can display up to m items.

Several item-to-item neighborhood functions are implemented:
a simple max count of seen neighbor items, the cosine similarity and the Jaccard index.
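
The three neighborhood functions could be sketched as follows. This is a minimal illustration and not the demo site's actual code: the function names are hypothetical, vectors are assumed to be plain lists of view counts, and the Jaccard index is assumed to treat any nonzero count as "seen".

```python
import math


def max_count_ranking(row):
    """Rank neighbor positions by raw count, highest first (zero counts are skipped)."""
    seen = [i for i, count in enumerate(row) if count > 0]
    return sorted(seen, key=lambda i: row[i], reverse=True)


def cosine_similarity(u, v):
    """Cosine of the angle between two count vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def jaccard_index(u, v):
    """Size of the shared nonzero positions divided by the size of all nonzero positions."""
    set_u = {i for i, a in enumerate(u) if a}
    set_v = {i for i, b in enumerate(v) if b}
    union = set_u | set_v
    if not union:
        return 0.0
    return len(set_u & set_v) / len(union)
```

The max count looks only at the current item's own row, while the cosine similarity and the Jaccard index compare two item vectors with each other.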

A tracker records the items visited by each user and saves them to a matrix table.
To keep it simple, only the relation between the previously viewed item and the currently viewed item is tracked in this example.
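
A tracker of this kind could look roughly like the sketch below. The class name `Tracker` and its methods are hypothetical, and recording each relation in both directions (a symmetric matrix) is an assumption; the demo site may store it differently.

```python
from collections import defaultdict


class Tracker:
    """Hypothetical tracker: counts previous -> current item views per user."""

    def __init__(self):
        # matrix[a][b] = how often a and b were viewed directly after each other
        self.matrix = defaultdict(lambda: defaultdict(int))
        # last item viewed by each user
        self.last_seen = {}

    def view(self, user, item):
        previous = self.last_seen.get(user)
        if previous is not None and previous != item:
            # Assumption: the relation is stored symmetrically.
            self.matrix[previous][item] += 1
            self.matrix[item][previous] += 1
        self.last_seen[user] = item


tracker = Tracker()
tracker.view("user1", "A")
tracker.view("user1", "B")  # records the relation A <-> B
tracker.view("user2", "A")
tracker.view("user2", "C")  # records the relation A <-> C
```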

Design

The demo site has two pages: a home page and an item page. The item page shows specific information about the viewed item and displays the recommendations.

There are 5 different components:

  • Multiple view – shows multiple components, all the items are displayed here
  • User view – shows specific information about the current user in the session
  • Item view – shows detailed information about the current item
  • Recommendation view – shows recommended items based on the current item
  • Data view – visualizes the data structure used by the recommendation algorithm

Interactions


This shows how the tracker collects the data from the users into the matrix table. (The illustrated tracking method is a simplified version. You could also iterate over the previously viewed items when a user views a new item, to save all the item-to-item relations for the viewed items.)
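
The alternative mentioned in the parentheses, relating a newly viewed item to everything the user has viewed so far, could be sketched like this. The helper names `new_matrix` and `track_full_history` are hypothetical, and the symmetric storage is again an assumption.

```python
from collections import defaultdict


def new_matrix():
    """Empty item-to-item count table; missing cells read as 0."""
    return defaultdict(lambda: defaultdict(int))


def track_full_history(matrix, history, new_item):
    """Relate new_item to every item this user has viewed before, then remember it."""
    for seen in history:
        if seen != new_item:
            matrix[seen][new_item] += 1
            matrix[new_item][seen] += 1
    if new_item not in history:
        history.append(new_item)


matrix = new_matrix()
history = []  # items viewed by one user, in order
for item in ["A", "B", "C"]:
    track_full_history(matrix, history, item)
# A-B, A-C and B-C are now all recorded, not only the consecutive pairs.
```

Compared with tracking only the previous item, this fills the matrix faster, but the work per view grows with the length of the user's history.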

When a new visitor, user3, views item A,
the recommendation system finds that the closest matches are items B and C.

The general idea of the Amazon recommendation engine is to locate item vectors whose pattern is similar to the vector of the currently viewed item.
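
As a rough sketch of that idea, the lookup can be written as "score every other item's vector against the current item's vector and keep the top m". The function `recommend` and the numbers in `example_matrix` are made up for illustration and are not the demo site's data; cosine similarity is used here as one of several possible scoring functions.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two count vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def recommend(matrix, current, m):
    """Return up to m item names whose vectors are most similar to the current item's."""
    current_vector = matrix[current]
    scored = []
    for item, vector in matrix.items():
        if item == current:
            continue
        score = cosine_similarity(current_vector, vector)
        if score > 0:
            scored.append((score, item))
    # Highest score first; ties are broken alphabetically by item name.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [item for _, item in scored[:m]]


# Hypothetical item vectors (one row per item).
example_matrix = {
    "A": [2, 1, 1, 0],
    "B": [1, 2, 0, 0],
    "C": [1, 0, 2, 0],
    "D": [0, 0, 0, 3],
}
print(recommend(example_matrix, "A", 2))  # ['B', 'C']
```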

E.g., a vector with the pattern [1,1,1,0,0,0,0] is more similar to the vector [0,1,1,0,1,0,0] than to the vector [0,0,0,1,0,1,1].
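
Worked out with the cosine similarity, and labeling the three vectors u, v and w (labels chosen here only for the calculation), the comparison looks like this:

```latex
\[
\cos(u, v) = \frac{u \cdot v}{\lVert u \rVert \, \lVert v \rVert}
\]
With $u = [1,1,1,0,0,0,0]$, $v = [0,1,1,0,1,0,0]$ and $w = [0,0,0,1,0,1,1]$:
\[
\cos(u, v) = \frac{2}{\sqrt{3}\,\sqrt{3}} = \frac{2}{3} \approx 0.67,
\qquad
\cos(u, w) = \frac{0}{\sqrt{3}\,\sqrt{3}} = 0 .
\]
```

u and v share two positions, so their similarity is about 0.67. u and w share none, so their similarity is 0.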


Written by kunuk Nykjaer

March 4, 2012 at 6:43 pm

6 Responses


  1. [...] How does the Amazon recommendation system work? – Analyze the algorithm and make a prototype that … [...]

    • Hi. This is some cool stuff.

      I wish you had a more detailed explanation though:)

      Dee

      August 17, 2014 at 8:50 am

  2. Very good.

    Thomas Packer

    October 18, 2012 at 11:09 pm

  3. Hi. I just saw your blog. I loved the explanation part.
    Just one thing: Why do we need to calculate the similarity matrix? Can’t we just select the items from item-item matrix with maximum value? For example: in your matrix, if user is looking at product A then B and C would be recommended because those products have maximum value.

    adivvy

    September 24, 2013 at 10:31 pm

    • Here we are tracking data and behavior from the users. Based on that, we calculate information that we can apply to new visitors.

      It depends on what you want to achieve. There are different algorithms you can apply.
      This is just to show the general concept. I don’t think one algorithm is more correct than another. It depends on what you want (but usually it is about earning more income by giving people what you think they want).

      The similarity calculation is based on the idea that some people have identical interests, and by tracking their behavior you get a DNA or some unique identifier for that group. Then, when a new visitor arrives, you try to identify which group that visitor belongs to and display the information that would be interesting for the group.

      By taking the maximum value, how do you know you are displaying the most interesting item for the visitor?
      Maybe you are, maybe not. You will have to investigate how the max-value strategy works compared to the other options.

      I am not sure, but I think taking the max value is about taking the most interesting item regardless of which group the visitor might belong to, and is not so much about identifying similar behavior. I could be wrong. I have just played with the concept lightly and implemented various algorithms. For a more in-depth answer, you might try some recommendation system forums.

      The concept is, as I said, to collect data and behavior. Then, based on that, implement a strategy that you think will give the visitors the items they are interested in by tracking the visitors’ behavior.

      kunuk Nykjaer

      September 29, 2013 at 1:15 pm

      • What is the cost function of your model?

        adivvy

        March 23, 2014 at 10:16 pm

