1. P-values, R-squared values, and moving averages
Welcome back. In this demo, we’re going to dive deeper into specific analytics tools in Tableau including trend lines and moving averages. As mentioned in the conceptual video, organic traffic is believed by many marketers to be related, to some extent, to paid traffic. Therefore, in today’s lesson, we’re going to look at two key statistical metrics, p-values and R-squared values, that describe relationships between data as well as trend-line performance. While you don’t need to be able to calculate these metrics or build models to be a data analyst, understanding what they mean is important.
Let’s start by returning to our paid social data set. Imagine we want to understand whether there’s a relationship between two variables. For example, do we see a relationship between a page’s engagement rate and whether an ad generates a lot of account follows? Let’s quickly test using the analytics capacity in Tableau. I’m first going to build a chart of engagement rate and account follows, broken out by day.
Before I add a trend line, let’s look at the data quickly. We can see two clear clusters, one here, and the other, here, with a few outliers as well. What we do see generally is that as engagement rate grows, so does the number of followers the account gains from the ad. This makes sense. If an ad resonates with the audience, they’d be more likely to both engage with it and follow the account. We’d also expect a linear relationship; as engagement rate increases at a steady pace, the number of account follows would likely increase at a steady pace too.
Let’s now add a linear trend line from the “Trend Line” menu in the Analytics tab.
Now, when I hover over the trend line itself, I can learn more about it, and specifically see the p-value and R-squared value.
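As a refresher, a very, very small p-value like the one shown here, < 0.0001, means that if there were truly no relationship between the engagement rate and the number of account follows, there would be less than a 0.01% chance of seeing a trend at least this strong. In other words, the relationship we see is very unlikely to be due to chance alone.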
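To make the idea behind a p-value concrete, here is a minimal Python sketch. It uses a permutation test, which is a swapped-in technique for illustration and not the calculation Tableau performs for its trend lines. The daily values and the function names are hypothetical, invented for this example. The sketch shuffles the account follows many times, breaking any real link to engagement rate, and counts how often the shuffled data shows a slope at least as steep as the real one.

```python
import random

# Hypothetical daily values, invented for illustration (not the course data).
engagement_rate = [0.010, 0.012, 0.015, 0.018, 0.020, 0.024, 0.027, 0.030, 0.033, 0.035]
account_follows = [12, 15, 14, 20, 22, 21, 27, 30, 29, 35]


def slope(xs, ys):
    """Least-squares slope of ys regressed on xs."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def permutation_p_value(xs, ys, n_shuffles=5000, seed=0):
    """Share of shuffles whose slope is at least as steep as the observed one."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observed = abs(slope(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)  # shuffling removes any real x-y relationship
        if abs(slope(xs, shuffled)) >= observed:
            hits += 1
    # Add 1 to both counts so the estimate is never exactly zero.
    return (hits + 1) / (n_shuffles + 1)


p_value = permutation_p_value(engagement_rate, account_follows)
print(f"p-value: {p_value:.4f}")
```

A small result here says the same thing as the small p-value on the trend line: a trend this steep would rarely show up if the two variables were unrelated.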
The second value we can look at is the R-squared value. As a reminder, the R-squared value quantifies how well the entire model explains changes in the output variable (account follows) given the input variables (in this case, just engagement rate).
Our value here is 0.64, which is pretty good. It means almost two-thirds of the variation in account follows can be explained by variation in engagement rate.
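If you are curious how an R-squared value is computed for a simple linear trend line, here is a minimal Python sketch. The four data points and the function name are hypothetical, chosen only so the arithmetic is easy to follow; they are not the course data. R-squared is one minus the ratio of the variation the line leaves unexplained to the total variation in the output.

```python
def r_squared(xs, ys):
    """R-squared of a least-squares straight line fitted to (xs, ys)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)

    # Fit the line y = intercept + slope * x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Variation left unexplained by the line vs. total variation in ys.
    ss_residual = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_residual / ss_total


# Hypothetical toy data, invented for illustration.
xs = [1, 2, 3, 4]
ys = [1, 3, 2, 4]

value = r_squared(xs, ys)
print(f"R-squared: {value:.2f}")  # prints R-squared: 0.64
```

With this toy data, the line leaves 1.8 of the total 5.0 units of variation unexplained, so 64% is explained. Points that fall exactly on a straight line would give an R-squared of 1.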
Sometimes, data needs to be altered in order to uncover relationships between variables. In some cases, using a moving average can be helpful to more clearly identify trends as the moving average smooths out dips and spikes by taking an average across a set of data points.
Let’s look at an example. If I look at a trend of clicks on the ad over time, we see quite a bit of weekly variation in volume.
It’s difficult to see a broader trend. Let’s make a copy of the field by right-clicking and selecting "Duplicate", then add that copy so we have a duplicate chart.
I’m now going to change the copy of the click field to a moving average, by selecting "Quick table calculation" and "Moving average".
Now, in comparison to the top chart of just the total clicks, using the moving average we see more of a trend, with clicks decreasing from July, a bit of a spike or increase around the end of August, before continued decline. Then we see in mid October there’s more of a reversal, and the click volume picks back up.
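To see what a moving average does to the numbers themselves, here is a minimal Python sketch. It assumes a trailing window made up of the current value and the two values before it, with shorter windows at the very start; the window Tableau applies may be configured differently. The click counts and the function name are hypothetical, invented for illustration.

```python
def moving_average(values, window=3):
    """Trailing moving average; early points average whatever values exist so far."""
    smoothed = []
    for i in range(len(values)):
        start = max(0, i - window + 1)  # do not reach back before the first point
        chunk = values[start:i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


# Hypothetical daily click counts, invented for illustration.
clicks = [120, 80, 100, 60, 90, 30]

smoothed_clicks = moving_average(clicks)
print([round(v, 1) for v in smoothed_clicks])
# prints [120.0, 100.0, 100.0, 80.0, 83.3, 60.0]
```

The raw clicks jump up and down from day to day, while the smoothed values drift downward more steadily, which is the same effect we saw in the lower chart.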
Ok! Now that you’ve had a refresher on p-values and R-squared values, and are more familiar with using a moving average, we’ll see how they can be used in marketing analytics in the next series of exercises.
2. Let's practice!