Understand differences in variables
Now, you will analyze the averages and standard deviations of each variable by plotting them in a bar plot. This step complements the previous one: you will visually explore the differences in variable scales and variances.
The pandas library is loaded as pd and matplotlib.pyplot as plt. Also, the wholesale dataset has been loaded as a pandas DataFrame, while the averages and standard deviations for each column of the wholesale dataset are loaded as pandas Series named averages and std_devs respectively. Make sure you explore them in the console.
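The exercise does not show how averages and std_devs were built, so the following is only a minimal sketch of how such per-column Series could be computed. The toy DataFrame, its column names, and its values are hypothetical stand-ins for wholesale.

```python
import pandas as pd

# Hypothetical stand-in for the wholesale DataFrame (names and values are illustrative only)
toy = pd.DataFrame({
    "Fresh": [10, 20, 30],
    "Milk": [1, 2, 3],
    "Grocery": [5, 5, 5],
})

# One value per column; both results are pandas Series indexed by column name
averages = toy.mean()   # column means
std_devs = toy.std()    # sample standard deviations (pandas default, ddof=1)

print(averages)
print(std_devs)
```

Printing the two Series side by side already hints at which columns sit on a larger scale or vary more, which is what the bar plot makes visible.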
This exercise is part of the course
Machine Learning for Marketing in Python
Exercise instructions
- Create a list with wholesale's column names and another one with sorted values from 0 to the number of columns in wholesale.
- Plot averages in grey and std_devs in orange, shifting each set of bars by 0.2 along the x-axis.
- Add x_ix as ticks and x_names as labels and make sure you rotate them by 90 degrees.
- Add the legend and display the chart.
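The 0.2 shift pairs with a bar width of 0.4: bars centered at x - 0.2 and x + 0.2 sit side by side without overlapping. Below is a rough sketch of that layout on hypothetical toy data, not the wholesale dataset. It uses the Axes interface and a non-interactive backend, both of which are choices made for this sketch rather than part of the exercise.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for wholesale
toy = pd.DataFrame({
    "Fresh": [10, 20, 30],
    "Milk": [1, 2, 3],
    "Grocery": [5, 5, 5],
})
toy_avg = toy.mean()
toy_std = toy.std()

x_names = list(toy.columns)      # tick labels
x_ix = np.arange(toy.shape[1])   # integer positions 0, 1, 2

offset, width = 0.2, 0.4         # half the bar width on each side of the tick
fig, ax = plt.subplots()
left = ax.bar(x=x_ix - offset, height=toy_avg, color="grey", label="Average", width=width)
right = ax.bar(x=x_ix + offset, height=toy_std, color="orange", label="Standard Deviation", width=width)

# Put the ticks at the integer positions and rotate the labels
ax.set_xticks(x_ix)
ax.set_xticklabels(x_names, rotation=90)
ax.legend()
fig.canvas.draw()  # render; in an interactive session plt.show() would display the chart
```

Because each bar is centered on its x value by default, the grey bar ends exactly where the orange bar begins, and the tick lands on the boundary between the two.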
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Create column names list and same length integer list
x_names = wholesale.___
# np is assumed to be NumPy, preloaded alongside pd and plt
x_ix = np.arange(wholesale.shape[1])
# Plot the averages data in grey and standard deviations in orange
plt.bar(x=x_ix-___, height=averages, color='grey', label='Average', width=0.4)
plt.bar(x=x_ix+___, height=std_devs, color='orange', label='Standard Deviation', width=0.4)
# Add x-axis labels and rotate
plt.xticks(ticks=___, labels=x_names, rotation=90)
# Add the legend and display the chart
plt.legend()
plt.___()