
Variance and standard deviation

Variance and standard deviation are two of the most common measures of a variable's spread, and you'll practice calculating both in this exercise. Spread matters because it helps inform expectations. For example, if a salesperson sells a mean of 20 products a day with a standard deviation of 10 products, there will probably be days when they sell 40 products, but also days when they sell only one or two. Information like this is important, especially when making predictions.
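To make the salesperson example concrete, here is a minimal sketch using made-up daily sales figures; the numbers are purely illustrative and do not come from the course data. It assumes the sample versions of both statistics (dividing by n - 1), which is also the pandas default.

```python
import numpy as np

# Hypothetical daily sales counts for one salesperson (made-up numbers)
daily_sales = np.array([20, 35, 8, 27, 2, 40, 15, 13])

mean_sales = daily_sales.mean()

# ddof=1 gives the sample variance (divide by n - 1), matching pandas' default
variance = daily_sales.var(ddof=1)

# The standard deviation is the square root of the variance
std_dev = np.sqrt(variance)

print(f"mean = {mean_sales:.1f}, variance = {variance:.1f}, sd = {std_dev:.1f}")
```

Because the standard deviation is in the same units as the data (products per day), it is usually easier to interpret than the variance, which is in squared units.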

pandas has been imported as pd, numpy as np, and matplotlib.pyplot as plt; the food_consumption DataFrame is also available.

This is a part of the course

“Introduction to Statistics in Python”


Exercise instructions

  • Calculate the variance and standard deviation of co2_emission for each food_category with the .groupby() and .agg() methods; compare the values of variance and standard deviation.
  • Create a histogram of co2_emission for the rows where food_category is 'beef' and show the plot.
  • Create a histogram of co2_emission for the rows where food_category is 'eggs' and show the plot.
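The group-then-aggregate pattern from the first instruction can be sketched on a small stand-in table. The DataFrame `toy` and its values are hypothetical and only mimic the two columns named above; passing the string names "var" and "std" to .agg() is one of several accepted ways to request these statistics.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for food_consumption (values are made up)
toy = pd.DataFrame({
    "food_category": ["beef", "beef", "beef", "eggs", "eggs", "eggs"],
    "co2_emission": [1712.0, 1044.9, 694.3, 10.4, 14.1, 7.5],
})

# Group by category, select the column, then compute variance and sd per group
spread = toy.groupby("food_category")["co2_emission"].agg(["var", "std"])
print(spread)

# Within each group, the sd is the square root of the variance
print(np.allclose(np.sqrt(spread["var"]), spread["std"]))
```

Comparing the two columns shows why the variance values look so much larger: they are in squared units, while the standard deviation stays on the scale of co2_emission itself.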

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Print variance and sd of co2_emission for each food_category
print(food_consumption.____(____)[____].agg([____]))

# Create histogram of co2_emission for food_category 'beef'
food_consumption[____]['____'].____()
plt.show()

# Create histogram of co2_emission for food_category 'eggs'
food_consumption[____]['____'].____()
plt.show()
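The filter-then-plot pattern used in the scaffold above can be sketched on a small made-up table. The DataFrame `toy` and its values are hypothetical, not the real food_consumption data, and this is only one way to subset the rows before plotting.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for food_consumption (values are made up)
toy = pd.DataFrame({
    "food_category": ["beef", "beef", "beef", "eggs", "eggs", "eggs"],
    "co2_emission": [1712.0, 1044.9, 694.3, 10.4, 14.1, 7.5],
})

# A boolean mask keeps only the rows for one category
beef = toy[toy["food_category"] == "beef"]

# Series.hist() draws a histogram of the selected column on a matplotlib Axes
ax = beef["co2_emission"].hist()
plt.show()
```

Calling plt.show() after each histogram keeps the plots separate, so the spread of one category can be compared against the other.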

