A first look
Using the full Avazu dataset, you will explore several new features by looking at the data types of the columns. The new data includes categorical columns such as site_id, app_id, and device_id, which are identifiers for a given site, app, and user, respectively. To start off, you will identify and print out the numerical and categorical columns.
Sample data in DataFrame form is loaded as df. pandas is also available in your workspace as pd.
This exercise is part of the course Predicting CTR with Machine Learning in Python.
Exercise instructions
- Print the columns of df using .columns.
- Print the corresponding data types of df using .dtypes.
- Select the subset of df with numerical columns (by using include=['int', 'float']) and print those columns.
- Select the subset of df with categorical columns (by using include=['object']) and print those columns.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Print columns
print(df.____)
# Print data types of columns
print(df.____)
# Select and print numeric columns
numeric_df = df.____(include=['____', 'float'])
print(numeric_df.____)
# Select and print categorical columns
categorical_df = df.____(include=['____'])
print(categorical_df.____)
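For reference, the same steps can be sketched on a small stand-in DataFrame. This is only an illustration, not the exercise's df: toy_df and its values are made up, the score column is hypothetical and not part of Avazu, and the dtypes are set explicitly so the result does not depend on pandas' default string handling.

```python
import pandas as pd

# Hypothetical toy data: values are invented, and "score" is not an Avazu column.
toy_df = pd.DataFrame({
    "click": pd.Series([0, 1, 0], dtype="int64"),
    "hour": pd.Series([14102100, 14102100, 14102101], dtype="int64"),
    "score": pd.Series([0.1, 0.5, 0.9], dtype="float64"),
    "site_id": pd.Series(["1fbe01fe", "85f751fd", "1fbe01fe"], dtype="object"),
    "app_id": pd.Series(["ecad2386", "ecad2386", "febd1138"], dtype="object"),
})

# Column names and their data types
print(toy_df.columns)
print(toy_df.dtypes)

# Keep only integer and float columns
numeric_cols = toy_df.select_dtypes(include=["int", "float"]).columns.tolist()
print(numeric_cols)

# Keep only object (string-like) columns
categorical_cols = toy_df.select_dtypes(include=["object"]).columns.tolist()
print(categorical_cols)
```

select_dtypes returns a new DataFrame with the matching columns in their original order, so calling .columns on the result lists exactly the columns of that type.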