Tuesday, October 9, 2012

An alternative framework to Mahout: CRAB

Every time you purchase an item from an online site such as Amazon or BestBuy, you have probably noticed other items displayed as recommendations for you. For example, when I buy a book from Amazon, it recommends a list of books purchased by other shoppers with similar interests. This is possible because the online stores process millions of data records and find the items purchased by users who share a common buying behaviour. This helps the online sites sell more to their users based on user preferences and online behaviour. Most of the time users do not know exactly what items they are looking for on the net. The recommendation system helps them discover similar items based on their interests.

In today’s overcrowded world with millions of items, it is very difficult to search and narrow down our requirements. In that context, online stores filter the data and present it in the most pleasant way. At times we discover items which we might not have heard of.

This is not only true for online retailers. On most social sites, we discover friends and people with similar interests. This is done by processing all the social interests expressed over the net and finding similarities between them. On LinkedIn you will find jobs, professionals and groups with similar interests. This is facilitated by an underlying recommendation system infrastructure. Building a recommendation system can become a fairly complex process as the number of variables increases. Of course, the most important variable is the amount of data to be processed.

Mahout has played a very critical role in solving this problem. Although it provides a comprehensive set of tools for machine learning, it is not trivial to build applications with Mahout. This is where CRAB fits the bill. The main objective of CRAB is to provide a very simple way to build a recommendation engine.

Crab is a flexible, fast recommender engine for Python that integrates classic information filtering recommendation algorithms into the world of scientific Python packages (NumPy, SciPy, Matplotlib).

The project was started in 2010 by Muricoca incorporated as an alternative to Mahout. It is developed in Python, so it is much easier for an average programmer to work with than Mahout, which is built in Java. It implements user-based, item-based and Slope One collaborative filtering algorithms.
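To give a feel for what user-based collaborative filtering does, here is a minimal sketch in plain Python. It does not use Crab's own API. The toy `ratings` data and the `cosine_similarity` and `recommend` functions are hypothetical names made up for this illustration, and cosine similarity is just one of several similarity measures that could be used.

```python
from math import sqrt

# Hypothetical toy data for illustration: user -> {item: rating}
ratings = {
    "alice": {"book_a": 5.0, "book_b": 3.0, "book_c": 4.0},
    "bob": {"book_a": 5.0, "book_b": 3.0, "book_d": 5.0},
    "carol": {"book_a": 1.0, "book_b": 5.0, "book_e": 4.0},
}


def cosine_similarity(a, b):
    """Cosine similarity over the items that both users have rated."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[i] * b[i] for i in common)
    norm_a = sqrt(sum(a[i] ** 2 for i in common))
    norm_b = sqrt(sum(b[i] ** 2 for i in common))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(user, ratings, top_n=3):
    """Rank items the user has not rated by the similarity-weighted
    average of the ratings given by other users."""
    totals, weights = {}, {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        sim = cosine_similarity(ratings[user], other_ratings)
        if sim <= 0:
            continue  # ignore users with no overlap in taste
        for item, rating in other_ratings.items():
            if item in ratings[user]:
                continue  # only recommend unseen items
            totals[item] = totals.get(item, 0.0) + sim * rating
            weights[item] = weights.get(item, 0.0) + sim
    scored = [(item, totals[item] / weights[item]) for item in totals]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_n]


if __name__ == "__main__":
    for item, score in recommend("alice", ratings):
        print(item, round(score, 2))
```

With this toy data, alice's tastes match bob's more closely than carol's, so bob's book_d is ranked ahead of carol's book_e. A library like Crab wraps the same idea behind ready-made data models, similarity measures and recommenders, so you do not have to write these loops yourself.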

Demo Example can be found at:
