Distributions by CTR
For any individual feature, it is useful to look at both its distribution and how it varies with the variable of interest. In this exercise, you will explore the search_engine_type feature, an integer code denoting the search engine, such as Google or Bing, through which the user expressed intent leading up to the ad. For privacy reasons, these categories are anonymized. First you will construct and look at the distribution of search_engine_type. Then you will look at how CTR varies with the search_engine_type value, similar to how you looked at the CTR breakdown by device type and banner position in the previous chapter.
Sample data in DataFrame form is loaded as df. pandas as pd is also available in your workspace.
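The two views described above, the distribution of a feature and its CTR per value, can be sketched on a small stand-in table. This is only an illustration: toy_df and its values are hypothetical and are not the exercise data, and the sketch assumes click is a 0/1 column, so that its mean per group is the click-through rate.

```python
import pandas as pd

# Hypothetical toy data standing in for df; the category codes are made up.
toy_df = pd.DataFrame({
    "search_engine_type": [1002, 1002, 1005, 1005, 1005, 1010, 1010, 1010],
    "click":              [0,    1,    0,    0,    1,    0,    0,    0],
})

# Distribution: number of impressions recorded for each search engine type.
type_counts = toy_df["search_engine_type"].value_counts().sort_index()

# CTR per type: with a 0/1 click column, the group mean is the share of clicks.
ctr_by_type = toy_df.groupby("search_engine_type")["click"].mean()

print(type_counts)
print(ctr_by_type)
```

Looking at the counts next to the rates helps when a category has very few rows, since its CTR is then a noisy estimate.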
This exercise is part of the course Predicting CTR with Machine Learning in Python.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
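# plt is used below; imported here in case it is not preloaded in the workspace
import matplotlib.pyplot as plt

# Construct bar chart for clicks by search engine type
se_df = df.____(['search_engine_type', 'click']).size().unstack()
se_df.plot(kind='bar', title='Value frequency for search engine type')
plt.show()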
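To see what kind of table the .size().unstack() chain produces, here is a minimal sketch on hypothetical data. It assumes that grouping with groupby is one reasonable way to build the counts. The toy_df values are made up, and fill_value=0 is an addition not present in the exercise code, used so that a type with no clicks shows 0 rather than NaN.

```python
import pandas as pd

# Hypothetical toy data; not the exercise's df.
toy_df = pd.DataFrame({
    "search_engine_type": [1002, 1002, 1005, 1005, 1005, 1010, 1010, 1010],
    "click":              [0,    1,    0,    0,    1,    0,    0,    0],
})

# Count rows per (search_engine_type, click) pair, then pivot the click
# level into columns: one row per type, one column per click value (0, 1).
counts = (
    toy_df.groupby(["search_engine_type", "click"])
    .size()
    .unstack(fill_value=0)
)

# CTR per type from the counts: clicks divided by total impressions.
ctr = counts[1] / counts.sum(axis=1)

print(counts)
print(ctr)
```

Calling counts.plot(kind='bar') on a table shaped like this draws one group of bars per search engine type, with one bar for each click value.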