Showing relevant statistics

1. Showing relevant statistics

Great job identifying and profiling different types of audiences.

2. Data storytelling road

We now know what selecting the right finding means, and why it is important for storytelling. Now let's cover how specific metrics help convey different messages.

3. Variations of data

Sometimes we want to compare one or several specific variables over time. The difference can be expressed as an absolute number (3,000 more units sold) or as a growth rate (10% more units sold). If we focus on one variable, an absolute number is OK, but when looking at several variables, the growth rate tends to be more informative because it allows comparing different scales (products sold in volumes of thousands and millions, for example). For absolute change, we care about the difference between 2018 and 2017 sales. When we focus on the percentage change from 2017 to 2018, we use relative change. An absolute change for a small number can be small even if the relative change is large. Conversely, an absolute change for a large number can be large even if its relative change is small. Relative changes on small numbers can look more dramatic than they are, because a small absolute change can result in a large percentage change. Which one we use depends on the question we want to answer.
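To make the difference concrete, here is a minimal sketch of both calculations. The function names and the sales figures are hypothetical, chosen only to match the 3,000-unit and 10% example; they are not the course data.

```python
def absolute_change(old, new):
    """Difference between the new and the old value, in the original units."""
    return new - old


def relative_change(old, new):
    """Percentage change from old to new; undefined when old is zero."""
    if old == 0:
        raise ValueError("relative change is undefined for a zero baseline")
    return (new - old) * 100 / old


# Hypothetical unit sales for 2017 and 2018 (not taken from the course data).
sales = {
    "product_a": (30_000, 33_000),
    "product_b": (2_000_000, 2_003_000),
}

for product, (units_2017, units_2018) in sales.items():
    diff = absolute_change(units_2017, units_2018)
    pct = relative_change(units_2017, units_2018)
    print(f"{product}: {diff:+,} units, {pct:+.2f}%")
```

Both hypothetical products sold 3,000 more units, but that is a 10% increase for the first and only a 0.15% increase for the second.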

4. Variations of data

Say we were asked to explore proportional changes in sales volumes between years. In the graph, we see the total units sold in 2017 and 2018. It's a good plot to show that more chocolate is sold than chips, or that chocolate sales decreased while chips sales increased. But since we're interested in proportions, it is better to show the percentage change. Now we have the respective percentage changes for each product. So let's look at different situations and which type of metric we should use.

5. Ratio

One way to overcome the issue of comparing values on different scales is to calculate a ratio. It is a comparison of two variables expressed as a quotient, such as the revenue per customer, calculated as total product revenue in dollars divided by the number of customers, as we can see in the graph. Ratios help normalize values, which in turn helps compare the distribution of data that were originally on different scales.
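As a rough illustration, revenue per customer could be computed like this. The store names and figures are made up for the example, and `revenue_per_customer` is a hypothetical helper, not something from the course.

```python
def revenue_per_customer(total_revenue, n_customers):
    """Ratio of total revenue (in dollars) to the number of customers."""
    if n_customers == 0:
        raise ValueError("cannot compute a ratio with zero customers")
    return total_revenue / n_customers


# Hypothetical stores of very different sizes: (revenue in dollars, customers).
stores = {
    "small_store": (12_000.0, 300),
    "large_store": (900_000.0, 30_000),
}

ratios = {
    name: revenue_per_customer(revenue, customers)
    for name, (revenue, customers) in stores.items()
}

for name, ratio in ratios.items():
    print(f"{name}: ${ratio:.2f} per customer")
```

The raw revenues differ by a factor of 75, but the ratio puts both stores on the same scale: $40 per customer for the small store against $30 for the large one.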

6. Aggregates

Sometimes we need to summarize numerical data into an aggregate: a number that gives an idea of an overall or representative value. It can be a simple total or count, like total sales or how long a campaign will last.

7. Aggregates

Another common aggregate is the mean, for example the average number of chocolates or chips sold per year, as we see in the graph.

8. Aggregates

Or we can use the median, such as the median price for each product.

9. Aggregates

The mean can be misleading, particularly if there are outliers (extreme values) or the data is not normally distributed. In these cases, the median is a more robust metric. For example, in 2019 the average annual wage in the US was $51,916.27, and the median annual wage was $34,248.45. Using the mean, we'd think the typical wage is about $52K, when actually half the population earns less than about $34K.
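Here is a small sketch of how a single outlier pulls the mean away from the median, using Python's standard `statistics` module. The wage list is invented for illustration and is not the 2019 US data.

```python
from statistics import mean, median

# Hypothetical annual wages in dollars; the last value is an outlier.
wages = [28_000, 30_000, 32_000, 34_000, 36_000, 38_000, 250_000]

mean_wage = mean(wages)
median_wage = median(wages)

print(f"mean:   ${mean_wage:,.0f}")
print(f"median: ${median_wage:,.0f}")
```

With these numbers the mean is $64,000 while the median is $34,000, even though six of the seven wages are below $40,000.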

10. p-value

When communicating our results, we often need to show that they are statistically significant (i.e., that they are unlikely to be due to chance alone). A p-value lower than 0.05 is considered an indicator of significance by convention. The further it falls below 0.05, the stronger the indicator. However, it is not proof: it only tells us whether to reject a hypothesis, without saying anything about whether that hypothesis is true. Because p-values are so often misunderstood and can confuse the audience, consider using alternative metrics or adding others in support.
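To give a feel for where a p-value comes from, here is a minimal sketch of a permutation test on the difference in means, using only the standard library. This is one of several ways to obtain a p-value and is not necessarily the method used in the course; the function name, the data, and the default parameters are all assumptions made for the example.

```python
import random


def permutation_p_value(group_a, group_b, n_permutations=10_000, seed=0):
    """Two-sided permutation test for a difference in means.

    Returns the share of random relabelings whose mean difference is at
    least as large as the observed one (with a +1 correction so the
    estimate is never exactly zero).
    """
    rng = random.Random(seed)
    n_a = len(group_a)
    observed = abs(sum(group_a) / n_a - sum(group_b) / len(group_b))
    pooled = list(group_a) + list(group_b)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        perm_a, perm_b = pooled[:n_a], pooled[n_a:]
        diff = abs(sum(perm_a) / len(perm_a) - sum(perm_b) / len(perm_b))
        if diff >= observed:
            hits += 1
    return (hits + 1) / (n_permutations + 1)


# Hypothetical daily units sold before and after a campaign.
before = [12, 15, 11, 14, 13, 16, 12, 15]
after = [18, 21, 17, 20, 19, 22, 18, 20]

p_value = permutation_p_value(before, after)
print(f"p-value: {p_value:.4f}")
print("significant at 0.05" if p_value < 0.05 else "not significant at 0.05")
```

A small p-value here only says that a difference this large would be rare if the campaign had no effect; it does not prove that the campaign caused the change.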

11. Let's practice!

Now, it's your turn to choose relevant metrics.