Understand differences in variables

Now you will plot the average and standard deviation of each variable side by side in a bar plot. This complements the previous step by letting you compare the variables' scales and variances visually.

The pandas library is loaded as pd and matplotlib.pyplot as plt. The wholesale dataset has been loaded as a pandas DataFrame, and the averages and standard deviations of its columns are available as pandas Series named averages and std_devs, respectively. The sample code also uses NumPy under the alias np. Explore these objects in the console before you start.
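
The exercise does not show how averages and std_devs were built. As a rough sketch, assuming they are plain column-wise summaries, they could be reproduced with DataFrame.mean() and DataFrame.std(). The DataFrame toy and its values below are made up for illustration and are not the wholesale data.

```python
import pandas as pd

# Hypothetical toy data standing in for the wholesale dataset (values are made up)
toy = pd.DataFrame({
    "Fresh": [100, 200, 300, 400],
    "Milk": [10, 20, 30, 40],
    "Grocery": [5, 5, 5, 5],
})

# Column-wise mean and sample standard deviation; each result is a Series
# indexed by column name (assumed to mirror averages and std_devs)
toy_averages = toy.mean()
toy_std_devs = toy.std()

# Show both summaries side by side to compare scales across columns
summary = pd.DataFrame({"average": toy_averages, "std_dev": toy_std_devs})
print(summary.round(2))
```

Even in this small example the columns sit on very different scales, which is the pattern the bar plot is meant to expose.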

This exercise is part of the course

Machine Learning for Marketing in Python


Exercise instructions

Create a list with wholesale's column names and an integer array running from 0 up to, but not including, the number of columns in wholesale.
Plot averages in grey and std_devs in orange, shifting the two sets of bars 0.2 to the left and to the right of each x position, respectively.
Add x_ix as ticks and x_names as labels and make sure you rotate them by 90 degrees.
Add the legend and display the chart.

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Create column names list and same length integer list
x_names = wholesale.___
x_ix = np.arange(wholesale.shape[1])

# Plot the averages data in grey and standard deviations in orange
plt.bar(x=x_ix-___, height=averages, color='grey', label='Average', width=0.4)
plt.bar(x=x_ix+___, height=std_devs, color='orange', label='Standard Deviation', width=0.4)

# Add x-axis labels and rotate
plt.xticks(ticks=___, labels=x_names, rotation=90)

# Add the legend and display the chart
plt.legend()
plt.___()
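
The grouped-bar pattern in the sample code can be tried outside the exercise with a small stand-in dataset. The sketch below is one possible way to do it, not the course's reference solution: the DataFrame toy, its values, and the names bar_width and offset are assumptions made for illustration. It uses the Agg backend so that it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no window is needed
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for the wholesale dataset (values are made up)
toy = pd.DataFrame({
    "Fresh": [100, 200, 300, 400],
    "Milk": [10, 20, 30, 40],
    "Grocery": [5, 5, 5, 5],
})
toy_averages = toy.mean()
toy_std_devs = toy.std()

# Column names for the labels and integer positions 0..n-1 for the bars
x_names = list(toy.columns)
x_ix = np.arange(toy.shape[1])

# Shifting each series by half the bar width places the two bars
# side by side, centred on the integer tick position
bar_width = 0.4
offset = bar_width / 2

fig, ax = plt.subplots()
ax.bar(x=x_ix - offset, height=toy_averages, width=bar_width,
       color="grey", label="Average")
ax.bar(x=x_ix + offset, height=toy_std_devs, width=bar_width,
       color="orange", label="Standard Deviation")

# Put one tick per column at the group centre and rotate the labels
ax.set_xticks(x_ix)
ax.set_xticklabels(x_names, rotation=90)
ax.legend()

# Render the figure; in an interactive session plt.show() would display it
fig.canvas.draw()
```

With a bar width of 0.4, an offset of 0.2 makes the grey and orange bars touch at the tick without overlapping; a larger offset would leave a gap between them.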
