Exploring Vowpal Wabbit with the Avazu Clickthrough Prediction Challenge

Dec 01, 2014

∙ Paid

In online advertising, click-through rate (CTR) is a very important metric for evaluating ad performance. As a result, click prediction systems are essential and widely used for sponsored search and real-time bidding.

For this competition, we have provided 11 days worth of Avazu data to build and test prediction models. Can you find a strategy that beats standard classification algorithms? The winning models from this competition will be released under an open-source license.

Data Fields

id: ad identifier
click: 0/1 for non-click/click
hour: format is YYMMDDHH, so 14091123 means 23:00 on Sept. 11, 2014 UTC.
C1 -- anonymized categorical variable
banner_pos
site_id
site_domain
site_category
app_id
app_domain
app_category
device_id
device_ip
device_model
device_type
device_conn_type
C14-C21 -- anonymized categorical variables

Loading Data

## Loading the data

import pandas as pd
import numpy as np
import string as stri

#too large data not keeping it in memory.
# will be using line by line s…

Keep reading with a 7-day free trial

Subscribe to MLWhiz | AI Unwrapped to keep reading this post and get 7 days of free access to the full post archives.