Computing the covariance

The covariance may be computed using the NumPy function np.cov(). For example, if we have two sets of data x and y, np.cov(x, y) returns a 2D array in which entries [0,1] and [1,0] both hold the covariance of x and y. Entry [0,0] is the variance of the data in x, and entry [1,1] is the variance of the data in y. This 2D output array is called the covariance matrix, since it organizes the variances and the covariance in one place.
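As a rough illustration of that layout, the sketch below builds a covariance matrix from two small made-up arrays (the values of x and y are assumptions chosen only for the example, not course data) and checks the off-diagonal entry against a manual calculation. One detail that is easy to miss: np.cov() normalizes by N - 1 by default, while np.var() normalizes by N unless ddof=1 is passed.

```python
import numpy as np

# Hypothetical example data, not taken from the course
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.0, 4.0, 6.0, 9.0])

# 2x2 covariance matrix: variances on the diagonal, covariance off it
cov_matrix = np.cov(x, y)

var_x = cov_matrix[0, 0]   # variance of x
var_y = cov_matrix[1, 1]   # variance of y
cov_xy = cov_matrix[0, 1]  # covariance of x and y (same as entry [1, 0])

# Manual covariance using the same N - 1 normalization as np.cov()
manual_cov_xy = np.sum((x - x.mean()) * (y - y.mean())) / (len(x) - 1)

print(cov_matrix)
print(cov_xy, manual_cov_xy)
```

The two printed covariance values should agree, and np.var(x, ddof=1) should match entry [0,0].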

To remind you how the I. versicolor petal length and width are related, we include the scatter plot you generated in a previous exercise.

This is a part of the course

“Statistical Thinking in Python (Part 1)”

Exercise instructions

  • Use np.cov() to compute the covariance matrix for the petal length (versicolor_petal_length) and width (versicolor_petal_width) of I. versicolor.
  • Print the covariance matrix.
  • Extract the covariance from entry [0,1] of the covariance matrix. Note that by symmetry, entry [1,0] is the same as entry [0,1].
  • Print the covariance.

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Compute the covariance matrix: covariance_matrix


# Print covariance matrix


# Extract covariance of length and width of petals: petal_cov


# Print the length/width covariance
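One possible way to fill in the scaffold above is sketched here. In the exercise environment the arrays versicolor_petal_length and versicolor_petal_width are already loaded; the short arrays below are placeholder values assumed only so the snippet runs on its own, and they should not be read as the full dataset.

```python
import numpy as np

# Placeholder measurements in cm (assumed for illustration only);
# in the exercise these arrays are provided for you
versicolor_petal_length = np.array([4.7, 4.5, 4.9, 4.0, 4.6])
versicolor_petal_width = np.array([1.4, 1.5, 1.5, 1.3, 1.5])

# Compute the covariance matrix: covariance_matrix
covariance_matrix = np.cov(versicolor_petal_length, versicolor_petal_width)

# Print covariance matrix
print(covariance_matrix)

# Extract covariance of length and width of petals: petal_cov
petal_cov = covariance_matrix[0, 1]

# Print the length/width covariance
print(petal_cov)
```

A positive value of petal_cov indicates that longer petals tend to be wider, which is what the scatter plot suggests.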

This exercise is part of the course

Statistical Thinking in Python (Part 1)

Build the foundation you need to think statistically and to speak the language of your data.

Chapter 1: Graphical Exploratory Data Analysis

Before diving into sophisticated statistical inference techniques, you should first explore your data by plotting them and computing simple summary statistics. This process, called exploratory data analysis, is a crucial first step in statistical analysis of data.

  • Exercise 1: Introduction to Exploratory Data Analysis
  • Exercise 2: What is the goal of statistical inference?
  • Exercise 3: Advantages of graphical EDA
  • Exercise 4: Plotting a histogram
  • Exercise 5: Plotting a histogram of iris data
  • Exercise 6: Axis labels!
  • Exercise 7: Adjusting the number of bins in a histogram
  • Exercise 8: Plot all of your data: Bee swarm plots
  • Exercise 9: Bee swarm plot
  • Exercise 10: Interpreting a bee swarm plot
  • Exercise 11: Plot all of your data: ECDFs
  • Exercise 12: Computing the ECDF
  • Exercise 13: Plotting the ECDF
  • Exercise 14: Comparison of ECDFs
  • Exercise 15: Onward toward the whole story!

Chapter 2: Quantitative Exploratory Data Analysis

In this chapter, you will compute useful summary statistics, which serve to concisely describe salient features of a dataset with a few numbers.

  • Exercise 1: Introduction to summary statistics: The sample mean and median
  • Exercise 2: Means and medians
  • Exercise 3: Computing means
  • Exercise 4: Percentiles, outliers, and box plots
  • Exercise 5: Computing percentiles
  • Exercise 6: Comparing percentiles to ECDF
  • Exercise 7: Box-and-whisker plot
  • Exercise 8: Variance and standard deviation
  • Exercise 9: Computing the variance
  • Exercise 10: The standard deviation and the variance
  • Exercise 11: Covariance and the Pearson correlation coefficient
  • Exercise 12: Scatter plots
  • Exercise 13: Variance and covariance by looking
  • Exercise 14: Computing the covariance
  • Exercise 15: Computing the Pearson correlation coefficient

Chapter 3: Thinking Probabilistically-- Discrete Variables

Chapter 4: Thinking Probabilistically-- Continuous Variables
