Get startedGet started for free

Summarize group statistics

Sometimes you want to understand how a value varies across groups. For example, how does the maximum value per group vary across groups?

To find out, first summarize by group, and then compute summary statistics of the group results. One way to do this is to compute group values in a subquery, and then summarize the results of the subquery.

For this exercise, what is the standard deviation across tags in the maximum number of Stack Overflow questions per day? What about the mean, min, and max of the maximums as well?

This exercise is part of the course

Exploratory Data Analysis in SQL

View Course

Exercise instructions

  • Start by writing a subquery to compute the max() of question_count per tag; alias the subquery result as maxval.
  • Then compute the standard deviation of maxval with stddev().
  • Compute the min(), max(), and avg() of maxval too.

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

-- Compute standard deviation of maximum values
SELECT ___(___),
	   -- min
       ___(___),
       -- max
       ___(___),
       -- avg
       ___(___)
  -- Subquery to compute max of question_count by tag
  FROM (SELECT ___(___) AS ___
          FROM stackoverflow
         -- Compute max by...
         GROUP BY ___) AS max_results; -- alias for subquery
Edit and Run Code