Variance and standard deviation
Variance and standard deviation are two of the most common ways to measure the spread of a variable, and you'll practice calculating these in this exercise. Spread is important since it can help inform expectations. For example, if a salesperson sells a mean of 20 products a day, but has a standard deviation of 10 products, there will probably be days where they sell 40 products, but also days where they only sell one or two. Information like this is important, especially when making predictions.
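To make the salesperson example concrete, here is a minimal sketch that computes the mean, sample variance, and standard deviation with NumPy. The daily_sales values are made up for illustration and are not taken from the course data.

```python
import numpy as np

# Hypothetical daily sales counts for one salesperson (illustrative only)
daily_sales = np.array([20, 35, 5, 28, 12, 30, 10])

mean_sales = daily_sales.mean()

# ddof=1 requests the sample variance (sum of squared deviations / (n - 1))
variance = daily_sales.var(ddof=1)

# The standard deviation is the square root of the variance,
# so it is expressed in the same units as the data (products per day)
std_dev = np.sqrt(variance)

print(f"mean: {mean_sales:.1f}")
print(f"variance: {variance:.2f}")
print(f"standard deviation: {std_dev:.2f}")
```

Because variance is in squared units, the standard deviation is usually the easier of the two to compare against the mean.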
pandas has been imported as pd, numpy as np, and matplotlib.pyplot as plt; the food_consumption DataFrame is also available.
This exercise is part of the course “Introduction to Statistics in Python”.
Exercise instructions
- Calculate the variance and standard deviation of co2_emission for each food_category with the .groupby() and .agg() methods; compare the values of variance and standard deviation.
- Create a histogram of co2_emission for the rows where food_category is 'beef' and show the plot.
- Create a histogram of co2_emission for the rows where food_category is 'eggs' and show the plot.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Print variance and sd of co2_emission for each food_category
print(food_consumption.____(____)[____].agg([____]))
# Create histogram of co2_emission for food_category 'beef'
food_consumption[____]['____'].____()
plt.show()
# Create histogram of co2_emission for food_category 'eggs'
food_consumption[____]['____'].____()
plt.show()
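One possible way to fill in the blanks is sketched below. Since the course's food_consumption DataFrame is not available outside the exercise, this sketch builds a tiny stand-in with made-up co2_emission values; only the column names food_category and co2_emission come from the exercise. It is an illustration under those assumptions, not necessarily the official solution.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend so the script runs without a display; remove to see plot windows
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for the course's food_consumption DataFrame
food_consumption = pd.DataFrame({
    "food_category": ["beef", "beef", "beef", "eggs", "eggs", "eggs"],
    "co2_emission": [100.0, 300.0, 500.0, 10.0, 12.0, 14.0],
})

# Variance and standard deviation of co2_emission for each food_category
# (pandas uses the sample versions, ddof=1, by default)
spread = food_consumption.groupby("food_category")["co2_emission"].agg(["var", "std"])
print(spread)

# Histogram of co2_emission for food_category 'beef'
food_consumption[food_consumption["food_category"] == "beef"]["co2_emission"].hist()
plt.show()

# Histogram of co2_emission for food_category 'eggs'
food_consumption[food_consumption["food_category"] == "eggs"]["co2_emission"].hist()
plt.show()
```

In this toy data the beef rows vary far more than the egg rows, which shows up both as a larger standard deviation in the table and as a wider histogram.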
This exercise is part of the course
Introduction to Statistics in Python
Grow your statistical skills and learn how to collect, analyze, and draw accurate conclusions from data using Python.
Summary statistics give you the tools you need to boil down massive datasets and reveal the highlights. In this chapter, you'll explore summary statistics including mean, median, and standard deviation, and learn how to interpret them accurately. You'll also develop your critical thinking skills, allowing you to choose the best summary statistics for your data.