Analyzing datetime columns
Feature engineering is an important step in machine learning workflows because features of different data types must be processed before modeling. Datetime columns in particular are common in many datasets. In this exercise, you will explore the hour column in the dataset, which is stored as an integer but represents a datetime. First, you will parse the hour column to convert it into a datetime column. Then you will extract the hour of the day from that datetime column and calculate the total number of clicks for each hour of the day.
The pandas module is available as pd in your workspace and the sample DataFrame is loaded as df.
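As a rough illustration of the parsing step, the sketch below uses a small made-up DataFrame. The name toy and its values are hypothetical and do not come from the course data. It assumes the integers follow a YYMMDDHH layout, and it casts them to strings before parsing as a defensive choice rather than as a requirement of the exercise.

```python
import pandas as pd

# Hypothetical sample values in YYMMDDHH form (not taken from the course dataset)
toy = pd.DataFrame({'hour': [14102100, 14102113, 14102223]})

# Cast to string first so each pair of digits lines up with a format code:
# %y = two-digit year, %m = month, %d = day, %H = hour (24-hour clock)
parsed = pd.to_datetime(toy['hour'].astype(str), format='%y%m%d%H')

# The .dt accessor exposes datetime fields; .hour gives the hour of day (0 to 23)
hours = parsed.dt.hour.tolist()

print(parsed)
print(hours)
```

Under these assumptions, the first value 14102100 is read as midnight on 21 October 2014, so its hour of day is 0.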
This exercise is part of the course
Predicting CTR with Machine Learning in Python
Exercise instructions
- Convert the hour column from an integer to a datetime column using pd.to_datetime().
- Using the datetime accessor .dt, extract the hour field from the converted column using .hour.
- Compute total clicks by the extracted hour of day using .sum().
Hands-on interactive exercise
Try this exercise by completing the sample code.
# Change the hour column to a datetime and extract hour of day
df['hour'] = pd.____(df['hour'], format = '%y%m%d%H')
df['hour_of_day'] = df['hour'].____.____
print(df.head(5))
# Get and plot total clicks by hour of day
df.____('hour_of_day')['click'].____.plot.bar(figsize=(12,6))
plt.ylabel('Number of clicks')
plt.title('Number of clicks by hour of day')
plt.show()
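For intuition about the aggregation step, here is a minimal sketch on made-up data. The DataFrame toy and its values are hypothetical, and it assumes click is a 0/1 indicator, so that summing it within each group counts the clicks. Plotting is left out so the snippet runs without matplotlib.

```python
import pandas as pd

# Hypothetical rows: one row per ad impression, click assumed to be 0 or 1
toy = pd.DataFrame({
    'hour_of_day': [0, 0, 13, 13, 13, 23],
    'click':       [1, 0, 1, 1, 0, 0],
})

# Group rows by hour of day, then add up the click indicator within each group
clicks_by_hour = toy.groupby('hour_of_day')['click'].sum()

print(clicks_by_hour)
```

The result is a Series indexed by hour of day, which is the shape that a bar plot of clicks per hour would consume.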