Tableau: describing trend models

1. Tableau: describing trend models

In the previous demo, you learned how to create scatter plots and add a trend line on the windmill dataset. You also saw that the third degree polynomial trend line looked like the best fit for the data. In this video, you'll learn how to quantify a good fit in Tableau.

Let's start with the linear trend line again. When you hover over the trend line, a pop-up shows the model coefficients, the R-squared and the p-value. The model coefficients give you the exact formula to make predictions, and tell you that for each increase in wind speed of one meter per second, power output goes up by 2.24 megawatts on average. The R-squared value is 0.96, which is a very high coefficient of determination. In other words, 96 percent of the variation in power output is explained by wind speed alone. The p-value tells you that the model is statistically significant, meaning the relationship between wind speed and power output is unlikely to be due to chance.

To show the full description of the model, right-click the trend line and select Describe Trend Model. A pop-up window appears with a summary and many more metrics of your model. You see the same model coefficients, R-squared and p-value. In Tableau, the residual standard error (RSE) is just called standard error and can be found here as well. In this case, it tells you that the observed values differ from the trend line by about 1.72 megawatts on average. The lower this number, the better, and it lets you compare the accuracy of different models fitted to the same data. You can also copy this summary if you want to paste the model description somewhere else.

Finally, we can add confidence intervals around the trend line. You can find this option when you right-click the trend line and select Edit All Trend Lines. Let's compare with our polynomial model. By duplicating the worksheet, we preserve the current trend model and can easily replace it with another one.
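To make the metrics in the model description more concrete, here is a minimal Python sketch of how a slope, an intercept, R-squared and a residual standard error can be computed for a linear trend line. The data points are synthetic and made up for illustration (they are not the windmill dataset), the function names `fit_line` and `describe_fit` are hypothetical, and the sketch assumes an ordinary least squares fit with n - 2 degrees of freedom; it is not Tableau's own implementation. The p-value is left out because it requires a statistical distribution function that the standard library does not provide.

```python
import math

# Synthetic (wind speed in m/s, power output in MW) values, made up for illustration.
wind_speed = [3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
power = [1.1, 3.0, 5.6, 7.4, 10.2, 12.1, 14.9, 16.3]


def fit_line(x, y):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def describe_fit(x, y, slope, intercept):
    """Return (R-squared, residual standard error) for a fitted line."""
    n = len(x)
    mean_y = sum(y) / n
    residuals = [yi - (slope * xi + intercept) for xi, yi in zip(x, y)]
    sse = sum(r ** 2 for r in residuals)        # variation the line does not explain
    sst = sum((yi - mean_y) ** 2 for yi in y)   # total variation in y
    r_squared = 1 - sse / sst
    # Two parameters (slope, intercept) were estimated: n - 2 degrees of freedom.
    rse = math.sqrt(sse / (n - 2))
    return r_squared, rse


slope, intercept = fit_line(wind_speed, power)
r_squared, rse = describe_fit(wind_speed, power, slope, intercept)
print(f"power = {slope:.2f} * wind_speed + {intercept:.2f}")
print(f"R-squared = {r_squared:.3f}, standard error = {rse:.2f} MW")
```

The slope plays the role of the "megawatts per meter per second" coefficient from the pop-up, and the standard error is in the same unit as the power output, which is what makes it easy to interpret.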
Choosing the third degree polynomial model, you can see that the confidence intervals are really narrow across most of your data. The model description also reveals that the R-squared has increased from 96 to 98 percent, and that the standard error has decreased to 1.14 megawatts. The p-value shows that this model is also statistically significant. Combining all visual and calculated metrics shows that this model is the better fit for this dataset, and that you can predict power output from wind speed alone with a low error. Time to apply this to the dinosaur dataset!
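The comparison between the two models can also be sketched in Python. The example below fits a linear and a third degree polynomial model to the same data and prints R-squared and the residual standard error for each. The data is a synthetic S-shaped curve made up for illustration (not the windmill dataset), the helpers `polyfit`, `predict` and `fit_metrics` are hypothetical names, and the fit uses plain least squares via the normal equations, which is an assumption and not necessarily how Tableau computes its trend models.

```python
import math


def polyfit(x, y, degree):
    """Least-squares polynomial fit via the normal equations.

    Returns coefficients [c0, c1, ..., c_degree] for c0 + c1*x + c2*x^2 + ...
    """
    m = degree + 1
    # Augmented normal-equation matrix [A^T A | A^T y].
    rows = [
        [sum(xi ** (i + j) for xi in x) for j in range(m)]
        + [sum((xi ** i) * yi for xi, yi in zip(x, y))]
        for i in range(m)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(rows[r][col]))
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(m):
            if r != col:
                factor = rows[r][col] / rows[col][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[col])]
    return [rows[i][m] / rows[i][i] for i in range(m)]


def predict(coeffs, xi):
    """Evaluate the fitted polynomial at a single x value."""
    return sum(c * xi ** k for k, c in enumerate(coeffs))


def fit_metrics(x, y, degree):
    """Return (R-squared, residual standard error) for a polynomial fit."""
    coeffs = polyfit(x, y, degree)
    n = len(x)
    mean_y = sum(y) / n
    sse = sum((yi - predict(coeffs, xi)) ** 2 for xi, yi in zip(x, y))
    sst = sum((yi - mean_y) ** 2 for yi in y)
    # degree + 1 coefficients were estimated from the data.
    rse = math.sqrt(sse / (n - (degree + 1)))
    return 1 - sse / sst, rse


# Synthetic S-shaped power curve (wind speed in m/s, power in MW), made up for illustration.
wind_speed = [2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 11.0, 12.0]
power = [0.2, 0.5, 1.4, 3.1, 5.8, 9.0, 12.3, 15.1, 17.0, 18.1, 18.6]

for degree, label in [(1, "linear"), (3, "third degree polynomial")]:
    r2, rse = fit_metrics(wind_speed, power, degree)
    print(f"{label}: R-squared = {r2:.3f}, standard error = {rse:.2f} MW")
```

Because the standard error divides by the remaining degrees of freedom, a more flexible model only scores better on it when the reduction in unexplained variation outweighs the extra coefficients it uses.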

2. Let's practice!
