1. Events and releases
Great work on the exercises! You have a real mastery of the tools used for evaluating trends in user data. These skills are highly transferable to any scenario with customer data.
2. Exploratory analysis - issues in our ecosystem
Now, we will build on these skills and apply them to discover the cause of an issue in our app ecosystem.
3. Visualizing the drop in conversion rate (3 Years)
Over the course of our monitoring, we have noticed a concerning dip in new user retention. Calculating this metric and plotting the data as we have done before reveals how alarming this trend is.
We want to investigate what is causing this and determine whether it is something we can solve.
4. Visualizing the drop in conversion rate (6 Months)
First, let us limit our graph so that we are looking not at 3 years of data but only at the most recent 6 months. This will give us the resolution to notice subtler changes.
To do this, we filter to dates between the current date and 6 months prior. Next, we execute the same steps we saw before to plot the results.
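The date filter described here might look roughly like the sketch below. The DataFrame, its column names (reg_date, conv_rate), and the placeholder values are all assumptions made for illustration; they are not the course's actual data.

```python
import pandas as pd

# Hypothetical daily conversion-rate table; column names are assumptions.
daily = pd.DataFrame({
    "reg_date": pd.date_range("2017-01-01", "2018-03-13", freq="D"),
})
daily["conv_rate"] = 0.10  # placeholder values, for illustration only

# Treat the latest date in the data as the "current" date.
current_date = daily["reg_date"].max()
start_date = current_date - pd.DateOffset(months=6)

# Keep only the rows inside the six-month window (inclusive at both ends).
in_window = (daily["reg_date"] >= start_date) & (daily["reg_date"] <= current_date)
recent = daily[in_window]

print(recent["reg_date"].min().date(), "to", recent["reg_date"].max().date())
```

From here, the same plotting steps used for the 3-year view can be applied to recent instead of the full table.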
Wow! It seems that our drop happened right around the end of February or beginning of March.
5. Investigating the conversion rate drop
One revealing factor would be if this trend has impacted one group of users and not others. This could point to a specific change or event being the cause. The two biggest segmentors of our user base are country and device, as each defines a somewhat independent ecosystem.
6. Splitting our data by country and device
As we have done before, let us segment the data first by country and then by device.
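One way to sketch this segmentation is with a pandas groupby on the date plus the segment column. The per-user table below, including the column names (reg_date, country, device, converted) and every value in it, is made up for illustration.

```python
import pandas as pd

# Hypothetical per-user records; all names and values here are assumptions.
users = pd.DataFrame({
    "reg_date": pd.to_datetime([
        "2018-03-01", "2018-03-01", "2018-03-01",
        "2018-03-01", "2018-03-02", "2018-03-02",
    ]),
    "country": ["BRA", "BRA", "USA", "USA", "BRA", "USA"],
    "device": ["android", "android", "ios", "android", "ios", "ios"],
    "converted": [0, 1, 1, 0, 1, 1],
})

# Mean of the 0/1 flag is the conversion rate; unstack gives one column per segment.
by_country = users.groupby(["reg_date", "country"])["converted"].mean().unstack("country")
by_device = users.groupby(["reg_date", "device"])["converted"].mean().unstack("device")

print(by_country)
print(by_device)
```

Each resulting table has one row per date and one column per segment, so it can be plotted directly as one line per country or per device.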
7. Breaking out by Country
Looking by country, we see that while each country is experiencing a drop during that time, it is certainly most pronounced in Brazil and Turkey.
Interestingly, we know that these are the two most Android-heavy countries where we have a user presence.
8. Breaking out by Device
Looking by device confirms the hypothesis we were forming. The dip is only manifesting itself on Android devices. At this point we have really homed in on what the issue might be.
A final step is to see whether any changes or events occurred that may be related to this issue.
9. Annotating datasets
Here we have two datasets, events.csv and releases.csv, which contain the date and type of each event or release. Events primarily includes holidays, and releases includes both Android and iOS software releases.
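As a rough sketch, loading such annotation files could look like the following. The in-memory CSV text stands in for the real files, and the column names (Date, Event) and all rows are assumptions, since the actual file layout is not shown here.

```python
import io

import pandas as pd

# Stand-ins for events.csv and releases.csv; contents and column names are hypothetical.
events_csv = io.StringIO("Date,Event\n2018-01-01,Holiday A\n2018-02-14,Holiday B\n")
releases_csv = io.StringIO("Date,Event\n2018-01-20,iOS Release\n2018-03-01,Android Release\n")

# parse_dates turns the Date column into real datetimes so it lines up with a time axis.
events = pd.read_csv(events_csv, parse_dates=["Date"])
releases = pd.read_csv(releases_csv, parse_dates=["Date"])

print(events.dtypes)
print(releases)
```

With real files, the StringIO objects would simply be replaced by the file paths.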
10. Plotting annotations - events
We can plot these dates overlaid on our graph of data broken out by device.
First we must iterate through the rows in our annotation DataFrames. This syntax of iteration should be familiar to you. Then, for each row, we generate a line to plot with plt.axvline, which creates a vertical line at the x-value we pass in. In this case we pass in our date, as well as a color and line style to use.
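A minimal sketch of that loop is shown below. The events table, the placeholder conversion-rate series, and the chosen color and line style are all assumptions for illustration; only plt.axvline and DataFrame.iterrows are the real library calls being demonstrated.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for the events data; column names are assumptions.
events = pd.DataFrame({
    "Date": pd.to_datetime(["2018-01-01", "2018-02-14"]),
    "Event": ["Holiday A", "Holiday B"],
})

# Placeholder conversion-rate series to draw the annotations over.
dates = pd.date_range("2017-12-15", "2018-03-15", freq="D")
conv = pd.Series(0.10, index=dates)

fig, ax = plt.subplots()
ax.plot(conv.index, conv.values, label="conversion rate")

# One dashed black vertical line per event date, drawn on the current axes.
for _, row in events.iterrows():
    plt.axvline(x=row["Date"], color="k", linestyle="--")

print(len(ax.lines), "lines on the axes")
```

In an interactive session, calling plt.show() afterwards would display the series with the event lines on top.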
11. Plotting annotations - releases
We can repeat the same plotting with the release annotations.
Here we can additionally check whether the release is on iOS or Android, and specify a different line color in each case to make our graph clearer. Then, after plotting our time series, we can show the plot to see the annotations overlaid.
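The platform check could be sketched as follows. The releases table, the assumption that the platform name appears in an Event column, the helper name release_color, and the blue/red color choice are all hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for the releases data; names, dates, and labels are assumptions.
releases = pd.DataFrame({
    "Date": pd.to_datetime(["2018-01-20", "2018-03-01"]),
    "Event": ["iOS Release", "Android Release"],
})

def release_color(event_name):
    """Hypothetical helper: blue for iOS releases, red for everything else."""
    return "b" if "iOS" in event_name else "r"

# Placeholder conversion-rate series to draw the annotations over.
dates = pd.date_range("2017-12-15", "2018-03-15", freq="D")
conv = pd.Series(0.10, index=dates)

fig, ax = plt.subplots()
ax.plot(conv.index, conv.values, label="conversion rate")

# One dotted vertical line per release, colored by platform.
for _, row in releases.iterrows():
    plt.axvline(x=row["Date"], color=release_color(row["Event"]), linestyle=":")

print([release_color(name) for name in releases["Event"]])
```

Keeping the color decision in a small function makes it easy to add further platforms later without touching the plotting loop.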
12. Annotated conversion rate graphs
Looking at this graph, it is clear that we had an Android release on or around the day our dip started. Now all that is left is to find out what in that release might be impacting the new user experience.
13. Power and limitations of exploratory analysis
While this is a simple case, it shows the power of visualizing data to uncover trends. Note that this can only take you so far. It can reveal obvious potential relationships but cannot allow for the scientific testing of different ideas or show causation. That is where exploratory data analysis ends and A/B testing begins.
14. Let's practice!
Now, let’s practice!