Simple co-occurrence-based recommendation on Hadoop

Tue, 2010-06-08 15:50 - 16:35
Speaker: Sean Owen

Recommender engines thrive on data -- lots of data. As such, scale inevitably becomes a challenge for recommenders. Distributed computing frameworks like Hadoop offer the infrastructure for applying many machines to such problems, and Apache Mahout has recently added its first truly distributed recommender algorithms, built on Hadoop. This talk explores the first such implementation, a simple algorithm based on item co-occurrence. We focus on how the algorithm fits into the MapReduce paradigm, and how issues of scale inform the implementation.
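
To make the idea of item co-occurrence concrete, here is a minimal single-machine sketch in Python. It is not Mahout's implementation: the function names (`cooccurrence_counts`, `recommend`) and the toy data are assumptions made for illustration only. The first function mirrors a map-reduce shape, emitting item pairs per user and summing them per pair; the second scores unseen items by how often they co-occur with items the user already has.

```python
from collections import defaultdict
from itertools import permutations


def cooccurrence_counts(user_items):
    """Count how many users have each ordered pair of distinct items.

    Conceptually: a "map" step emits (item_a, item_b) for every pair in one
    user's item set, and a "reduce" step sums the emitted pairs.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for items in user_items.values():
        for a, b in permutations(set(items), 2):
            counts[a][b] += 1
    return counts


def recommend(user, user_items, counts, top_n=3):
    """Rank items the user has not seen by summed co-occurrence with seen items."""
    seen = set(user_items[user])
    scores = defaultdict(int)
    for item in seen:
        for other, count in counts.get(item, {}).items():
            if other not in seen:
                scores[other] += count
    # Highest score first; ties broken by item name for a stable order.
    return sorted(scores, key=lambda i: (-scores[i], i))[:top_n]


# Hypothetical toy data: user -> items that user interacted with.
prefs = {
    "alice": ["A", "B", "C"],
    "bob": ["A", "B"],
    "carol": ["B", "C", "D"],
    "dave": ["A", "D"],
}
counts = cooccurrence_counts(prefs)
print(recommend("bob", prefs, counts))
```

On a cluster, the pair counting would be spread across many machines and the per-user scoring would become a distributed join against the co-occurrence data; this sketch keeps everything in memory purely to show the logic.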