Understand differences in variables
Now you will analyze the average and standard deviation of each variable by plotting them in a bar plot. This complements the previous step: you will explore the differences in variable scales and variances visually.
The pandas library is loaded as pd and matplotlib.pyplot as plt. The sample code also calls np.arange, so NumPy is expected to be available as np. The wholesale dataset has been loaded as a pandas DataFrame, and the averages and standard deviations of its columns are loaded as pandas Series named averages and std_devs, respectively. Make sure you explore them in the console.
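To make the setup concrete, here is a minimal sketch of how two such Series could be built and inspected. It assumes they come from the DataFrame's mean() and std() methods, which the exercise does not state explicitly, and it uses a small made-up DataFrame with hypothetical column names in place of the real wholesale data.

```python
import pandas as pd

# Hypothetical stand-in for the wholesale DataFrame (made-up values and
# column names); the real dataset is preloaded by the exercise.
wholesale = pd.DataFrame({
    "Fresh": [100, 200, 300],
    "Milk": [10, 20, 30],
    "Grocery": [4, 5, 6],
})

# Assumption: the preloaded Series are per-column means and standard deviations.
# pandas' std() uses the sample standard deviation (ddof=1) by default.
averages = wholesale.mean()
std_devs = wholesale.std()

# Explore both Series, as you would in the console.
print(averages)
print(std_devs)
```

Printing the two Series side by side already hints at how differently the columns are scaled, which is what the bar plot makes visible.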
This exercise is part of the course
Machine Learning for Marketing in Python
Exercise instructions
- Create a list with wholesale's column names and another one with sorted values from 0 to the number of columns in wholesale.
- Plot averages in grey and std_devs in orange, adjust the x-axis by 0.2.
- Add x_ix as ticks and x_names as labels, and make sure you rotate them by 90 degrees.
- Add the legend and display the chart.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Create column names list and same length integer list
x_names = wholesale.___
x_ix = np.arange(wholesale.shape[1])
# Plot the averages data in gray and standard deviations in orange
plt.bar(x=x_ix-___, height=averages, color='grey', label='Average', width=0.4)
plt.bar(x=x_ix+___, height=std_devs, color='orange', label='Standard Deviation', width=0.4)
# Add x-axis labels and rotate
plt.xticks(ticks=___, labels=x_names, rotation=90)
# Add the legend and display the chart
plt.legend()
plt.___()
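For reference, one way the blanks might be filled in is sketched below. This is not the official solution: the values are inferred from the instructions (columns for the names, an offset of 0.2, x_ix as ticks), and the DataFrame is a made-up stand-in with hypothetical column names so that the snippet runs on its own.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical stand-in data; in the exercise, wholesale, averages and
# std_devs are already loaded.
wholesale = pd.DataFrame({
    "Fresh": [100, 200, 300],
    "Milk": [10, 20, 30],
    "Grocery": [4, 5, 6],
})
averages = wholesale.mean()
std_devs = wholesale.std()

# Column names and a same-length integer index (0, 1, ..., n_columns - 1)
x_names = wholesale.columns
x_ix = np.arange(wholesale.shape[1])

# Shift each series by 0.2 so the two 0.4-wide bars sit side by side
plt.bar(x=x_ix - 0.2, height=averages, color='grey', label='Average', width=0.4)
plt.bar(x=x_ix + 0.2, height=std_devs, color='orange', label='Standard Deviation', width=0.4)

# Put the column names on the x-axis, rotated by 90 degrees
plt.xticks(ticks=x_ix, labels=x_names, rotation=90)

# Add the legend
plt.legend()
```

Calling plt.show() afterwards displays the chart. The 0.2 offset is half of the 0.4 bar width, so each pair of bars is centered on its tick without overlapping.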