Beginning steps
In this exercise, you will get a quick look at sample data using some basic DataFrame operations and take a first look at click-through rate (CTR). The data comes from Avazu, a leading global advertising platform, and captures user interactions on various device types for different websites and apps.
The target variable will be in the click column. The hour is in a YYMMDDHH format, and there are a few integer columns: device_type for the type of device, banner_pos for the position of a banner ad (also known as a display ad), etc. There will also be other variables discussed in later chapters.
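As a rough illustration of the YYMMDDHH layout, the sketch below parses a few values into timestamps. The values are made up for the example and are not taken from the exercise data; converting to strings before parsing is one possible approach, not necessarily the one used later in the course.

```python
import pandas as pd

# Illustrative values only, written in the YYMMDDHH layout described above.
hours = pd.Series([14102100, 14102113, 14102223], name="hour")

# Convert to strings first so the format string is matched against the digits.
parsed = pd.to_datetime(hours.astype(str), format="%y%m%d%H")

# Calendar date and hour of day recovered from each value.
print(parsed.dt.date.tolist())
print(parsed.dt.hour.tolist())
```

Once parsed, the usual .dt accessors (for example .dt.hour or .dt.dayofweek) can be used to derive time-based features.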
Sample data in DataFrame form is loaded as df, and pandas is available in your workspace as pd.
This exercise is part of the course Predicting CTR with Machine Learning in Python.
Exercise instructions
- Define variable X using .isin(). X will be all of the columns except for the click column.
- Define variable y, which can be accessed using df.click.
- Print out the proportion of rows of y that are 1, using y.sum(). This proportion is the sample CTR.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Look at the basics of the DataFrame
print(df.head(5))
print(df.columns)
# Define X and y
X = df.____[:, ~df.columns.____(['click'])]
y = df.____
# Sample CTR
print("Sample CTR :\n",
y.____/len(y))
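To see the same pattern end to end without the exercise data, here is a minimal sketch on a small hand-made DataFrame. The name toy and all of its values are invented for illustration; only the column names come from the description above. It shows one way to select every column except the target with a boolean mask and to compute the CTR from a 0/1 column.

```python
import pandas as pd

# Toy stand-in for the exercise's df; the values are invented for illustration.
toy = pd.DataFrame({
    "click": [0, 1, 0, 0, 1],
    "hour": [14102100, 14102100, 14102101, 14102101, 14102102],
    "device_type": [1, 1, 0, 1, 4],
    "banner_pos": [0, 1, 0, 0, 1],
})

# Boolean mask over the column labels: True where the label is 'click'.
is_target = toy.columns.isin(["click"])

# Invert the mask to keep every column except the target.
X_toy = toy.loc[:, ~is_target]
y_toy = toy["click"]

# Because click is 0/1, the sum counts the clicks, so sum / length is the CTR.
ctr = y_toy.sum() / len(y_toy)

print(list(X_toy.columns))
print("Toy CTR:", ctr)
```

Since the target is binary, y_toy.mean() would give the same number; the sum-over-length form simply mirrors the template above.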