Salford Analytics and Data Mining Conference 2012

Insight For Data Enthusiasts • San Diego, CA • May 24-25
Training May 21-23 • Welcome Reception May 23

Machine Learned Relevance at a Large Scale Search Engine

Learning to rank, or machine-learned ranking (MLR), is a type of supervised or semi-supervised machine learning problem in which the goal is to automatically construct a ranking model from training data. In web search, MLR models are typically used to score billions of objects (documents) with respect to a user's information need (query); this score is then used to rank the documents and select the top N results to present to the user. At Quixey, a large-scale functional application search engine, TreeNet forms an integral part of the MLR framework. It is used to generate meta-features as well as overall scores, and it is an effective way to reduce costs and speed up the learning process. This presentation describes the process of building an MLR-based search engine and the role TreeNet plays in that process. Drawing on my experience at Quixey and at other search companies, I walk through the process step by step, covering data collection, evaluation, feature engineering, and model generation, and go in depth into how a large-scale search engine works and how an MLR system is built and evaluated.
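
As a rough illustration of the score-then-rank step described above, the following Python sketch scores a handful of candidate documents for one query and keeps the top N. Everything in it is an assumption made for illustration: the feature names (`text_match`, `popularity`), the document ids, and the hand-written `toy_score` function, which merely stands in for a learned model such as a TreeNet ensemble. It is not Quixey's implementation.

```python
import heapq
from typing import Callable, Dict, List, Tuple

# Hypothetical representation: each candidate document is described by
# a dictionary of query-dependent feature values.
Features = Dict[str, float]


def rank_top_n(
    candidates: Dict[str, Features],
    score_fn: Callable[[Features], float],
    n: int,
) -> List[Tuple[str, float]]:
    """Score every candidate and return the n highest-scoring (doc_id, score) pairs."""
    scored = ((score_fn(feats), doc_id) for doc_id, feats in candidates.items())
    # heapq.nlargest avoids sorting the full candidate set when n is small.
    top = heapq.nlargest(n, scored)
    return [(doc_id, score) for score, doc_id in top]


def toy_score(feats: Features) -> float:
    """Stand-in for a learned ranking model (assumed weights, not learned)."""
    return 2.0 * feats.get("text_match", 0.0) + 1.0 * feats.get("popularity", 0.0)


# Made-up candidates for a single query.
candidates = {
    "app_a": {"text_match": 0.9, "popularity": 0.2},
    "app_b": {"text_match": 0.4, "popularity": 0.9},
    "app_c": {"text_match": 0.1, "popularity": 0.1},
}

top2 = rank_top_n(candidates, toy_score, 2)
print(top2)
```

In an MLR system the scoring function would be fitted from labeled training data rather than written by hand; the ranking and top-N selection around it keep the same shape.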