Get startedGet started for free

Analyzing Glassdoor reviews over time

1. Analyzing Glassdoor reviews over time

Now that we have explored how the number of reviews changes by different date grains and between time periods, let’s dive deep into the time series trend. We will do so using run charts, aka line charts. I’ll create the first one by choosing “Line chart” in the Visualizations pane, then dragging the ReviewDate to Axis and ReviewId as a distinct count to Values. I’ll drill down to the lowest date level to see the complete trend between March through May. To quickly visualize the trend in count of reviews, let’s use the native Trend line in Power BI. Going to the Analytics tab, expand “Trend line” then click “+ Add”. I’ll update the color of the time series from blue to light grey by going into the Format tab, expanding Data colors and selecting light grey. The trend line shows a straight, linear increase from the beginning of March through the end of May. However, we can see there is more variability in the light-grey time series. We can generate a more accurate trend line that follows the nonlinear pattern seen in the underlying data while also “smoothing” using rolling averages. This first requires creating a new quick measure. In the pop-up window, under calculation, time intelligence options, select “rolling average”. Use ReviewId as a distinct count for the Base value, ReviewDate for Date. I’ll experiment with a 7-day rolling average by changing the Period before input to be 7 and period after to be 0. Let’s rename the new measure to “7RollingAverage”. Now, I’ll add it to the Values of the existing time series. The new rolling average smooths out the trend seen in the underlying data. Though it won’t showcase the sudden spikes in the number of reviews, it still exposes the general increasing and decreasing trends. Power BI also offers advanced analytics features, specifically anomaly detection, for line charts that help flag potential data points that are abnormal. If you remember from Chapter 1, these data points are like outliers in that they are outside of expected values. To add anomaly detection to our time series, go to the Analytics tab. I’ll remove the trend line. Because the visualization has both 7-day rolling average and the count of distinct reviews, Power BI will not allow for anomaly detection. This is an easy fix by removing the distinct count of ReviewId, leaving the rolling average measure. Now, we’ll go back to the Analytics tab, expand “Find anomalies”, click “+ Add”, then change the “Sensitivity” to 98%. Note: the higher the sensitivity, the more abnormal the data points have to be to be flagged as anomalies. The Power BI anomaly algorithm detected 3 abnormal points. Clicking on the dark grey circles highlighting anomalies will bring up “auto-generated explanations” for why these might be occurring. Dragging other fields, such as Gender, into the “Explain by” section tells Power BI to use the changes or values of that variable for why the anomaly may exist. It seems the number of reviews from females was higher than usual at this anomaly at the end of May. You can see Power BI creates a separate chart with the female only time series for 7-day rolling average review count. We can create our own, showing also that for males, with small multiples. I’ll duplicate the existing chart, then drag Gender to “Small multiples.” I don’t like how the small multiples are in a 2-by-2 grid, so I’ll go to the Format tab, expand “Grid layout” and decrease the number of “Rows” to 1. Compared to males, females had more of a dramatic decrease at the beginning of May then a sharp increase a week or so after that. Now it’s your turn to build time series charts, explore rolling averages, and detect anomalies.

2. Let's practice!

Create Your Free Account

or

By continuing, you accept our Terms of Use, our Privacy Policy and that your data is stored in the USA.