Variance and standard deviation
Variance and standard deviation are two of the most common ways to measure the spread of a variable, and you'll practice calculating these in this exercise. Spread is important since it can help inform expectations. For example, if a salesperson sells a mean of 20 products a day, but has a standard deviation of 10 products, there will probably be days where they sell 40 products, but also days where they only sell one or two. Information like this is important, especially when making predictions.
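To make the salesperson example concrete, here is a minimal sketch that computes both measures with numpy. The daily_sales figures are hypothetical, chosen only so that the mean comes out to 20; they are not part of the course data. The sketch assumes the sample versions of the formulas (ddof=1), which is also what pandas uses by default.

```python
import numpy as np

# Hypothetical daily sales for one salesperson (made-up values, mean is 20)
daily_sales = np.array([20, 35, 8, 22, 15, 30, 10])

mean_sales = daily_sales.mean()

# ddof=1 divides by n - 1 (sample variance), matching the pandas default
variance = daily_sales.var(ddof=1)

# Standard deviation is the square root of the variance,
# so it is expressed in the same units as the data (products per day)
std_dev = daily_sales.std(ddof=1)

print("mean:", mean_sales)
print("variance:", variance)
print("standard deviation:", std_dev)
```

Because variance is in squared units, the standard deviation is usually the easier of the two to interpret next to the mean.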
pandas has been imported as pd, numpy as np, and matplotlib.pyplot as plt; the food_consumption DataFrame is also available.
This is a part of the course “Introduction to Statistics in Python”.
Exercise instructions
- Calculate the variance and standard deviation of co2_emission for each food_category with the .groupby() and .agg() methods; compare the values of variance and standard deviation.
- Create a histogram of co2_emission for the beef food_category and show the plot.
- Create a histogram of co2_emission for the eggs food_category and show the plot.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Print variance and sd of co2_emission for each food_category
print(food_consumption.____(____)[____].agg([____]))
# Create histogram of co2_emission for food_category 'beef'
food_consumption[____]['____'].____()
plt.show()
# Create histogram of co2_emission for food_category 'eggs'
food_consumption[____]['____'].____()
plt.show()
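The following is one possible way the blanks could be filled in, shown as a sketch rather than the official solution. Since the course's food_consumption data is not reproduced here, the DataFrame below is a small hypothetical stand-in with made-up values; only the column names food_category and co2_emission are taken from the exercise.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for food_consumption (values are made up for illustration)
food_consumption = pd.DataFrame({
    "food_category": ["beef", "beef", "beef", "eggs", "eggs", "eggs"],
    "co2_emission": [1712.0, 1044.9, 37.0, 10.2, 7.8, 12.4],
})

# Variance and standard deviation of co2_emission for each food_category
spread = food_consumption.groupby("food_category")["co2_emission"].agg(["var", "std"])
print(spread)

# Histogram of co2_emission for food_category 'beef'
plt.figure()  # start a fresh figure so the two histograms do not overlap
food_consumption[food_consumption["food_category"] == "beef"]["co2_emission"].hist()
plt.show()

# Histogram of co2_emission for food_category 'eggs'
plt.figure()
food_consumption[food_consumption["food_category"] == "eggs"]["co2_emission"].hist()
plt.show()
```

Passing the list ["var", "std"] to .agg() returns one column per statistic, which makes it easy to compare the two measures side by side for each category.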