Plotting campaign results (II)

1. Plotting campaign results (II)

In this lesson, we will combine the concepts covered so far in this chapter by grouping by multiple columns and plotting the results.

2. Grouping by multiple columns

You will often want to group the data by multiple dimensions. For instance, say you want to count the number of users for each preferred language on each date. To do this, we pass a list of columns, date_served and language_preferred, to the groupby() method and then count the number of users. The result is a Series with a two-level index (date and language) whose values are the user counts.
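A minimal sketch of this grouping step follows. The DataFrame name marketing, the user_id column, and the sample rows are assumptions made up for illustration; only date_served and language_preferred come from the lesson.

```python
import pandas as pd

# Hypothetical sample data; the lesson's real dataset is not shown here.
marketing = pd.DataFrame({
    "user_id": ["a1", "a2", "a3", "a4", "a5", "a6"],  # assumed column name
    "date_served": pd.to_datetime(
        ["2018-01-01", "2018-01-01", "2018-01-01",
         "2018-01-02", "2018-01-02", "2018-01-02"]),
    "language_preferred": ["English", "English", "Spanish",
                           "English", "German", "German"],
})

# Pass a list of columns to groupby(), then count users in each
# (date, language) combination.
language = marketing.groupby(
    ["date_served", "language_preferred"])["user_id"].count()

print(language)
```

Only combinations that actually occur in the data appear in the result, so this sample yields four rows rather than six.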

3. Unstacking after groupby

Sometimes it can be easier to manipulate the data when we have a DataFrame. We use the unstack() method to transform our data such that each preferred language becomes a column. Since language_preferred is the second index level, we set the level argument to 1. Remember, index levels are zero-based: the first level is 0, and the second is 1.
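Here is a small sketch of the unstacking step, again on made-up data (the marketing DataFrame and the user_id column are assumptions). It shows the second index level moving into the columns.

```python
import pandas as pd

# Hypothetical sample data for illustration only.
marketing = pd.DataFrame({
    "user_id": ["a1", "a2", "a3", "a4", "a5", "a6"],  # assumed column name
    "date_served": pd.to_datetime(
        ["2018-01-01", "2018-01-01", "2018-01-01",
         "2018-01-02", "2018-01-02", "2018-01-02"]),
    "language_preferred": ["English", "English", "Spanish",
                           "English", "German", "German"],
})

language = marketing.groupby(
    ["date_served", "language_preferred"])["user_id"].count()

# level=1 is the second index level (language_preferred);
# its values become the column names.
language_df = language.unstack(level=1)

print(language_df)
```

Date and language pairs with no users come out as NaN after unstacking. If zeros suit your analysis better, one option is to call fillna(0) on the result.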

4. Plotting preferred language over time

Plotting these results is very similar to what we've done in previous lessons. Calling the plot() method on the DataFrame draws a line plot by default, and since the index is a date, the dates run along the x-axis. Because there's one line for each language, it's crucial to include a legend so you know which language each line represents. We can use matplotlib's legend() function to add one. The loc argument determines the location of the legend, and to get the correct labels, we set the labels argument to the column names. The column names can be obtained by chaining the columns and values attributes.
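The plotting step might look like the sketch below. The sample data, the marketing and user_id names, the title and axis labels, and the 'upper right' legend location are all assumptions. The Agg backend line is only there so the sketch runs without a display; drop it in a notebook and call plt.show() at the end instead.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove when working in a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical sample data for illustration only.
marketing = pd.DataFrame({
    "user_id": [f"u{i}" for i in range(9)],  # assumed column name
    "date_served": pd.to_datetime(
        ["2018-01-01"] * 4 + ["2018-01-02"] * 5),
    "language_preferred": ["English", "English", "Spanish", "German",
                           "English", "English", "English", "Spanish", "German"],
})

language_df = (marketing
               .groupby(["date_served", "language_preferred"])["user_id"]
               .count()
               .unstack(level=1))

# plot() draws one line per column, with the date index on the x-axis.
ax = language_df.plot()
plt.title("Daily language preferences")  # assumed title
plt.xlabel("Date")
plt.ylabel("Users")

# Label the legend with the column names (one per language).
plt.legend(loc="upper right", labels=language_df.columns.values)
```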

5. Daily language preferences plot

And here's our daily language preference chart. As we can see, by far the most popular language is English.

6. Creating grouped bar charts

Let's say that we follow the same groupby process as before, but this time we group by age group and preferred language to count the number of users. The code is very similar to what we wrote earlier.
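As a rough sketch, the same pattern with age group in place of date could look like this. The age_group and user_id column names, the age ranges, and the sample rows are assumptions for illustration.

```python
import pandas as pd

# Hypothetical sample data for illustration only.
marketing = pd.DataFrame({
    "user_id": ["a1", "a2", "a3", "a4", "a5", "a6"],          # assumed column name
    "age_group": ["19-24", "19-24", "19-24",
                  "24-30", "24-30", "24-30"],                  # assumed column name
    "language_preferred": ["English", "English", "Spanish",
                           "English", "Spanish", "Spanish"],
})

# Same process as before: group by two columns, count users,
# then unstack the language level into columns.
language_age = marketing.groupby(
    ["age_group", "language_preferred"])["user_id"].count()
language_age = language_age.unstack(level=1)

print(language_age)
```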

7. Plotting language preferences by age group

In this case, a line plot is no longer the right way to show the data, because age groups are categories rather than points in time. Instead, when we plot, we set the kind argument to 'bar'. Once again, we must remember to include a legend using the column names.
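A grouped bar chart along these lines can be sketched as follows. As before, the data, the age_group and user_id names, the title and labels, and the legend location are assumptions. The Agg backend line only makes the sketch runnable without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove when working in a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical sample data for illustration only.
marketing = pd.DataFrame({
    "user_id": ["a1", "a2", "a3", "a4", "a5", "a6"],          # assumed column name
    "age_group": ["19-24", "19-24", "19-24",
                  "24-30", "24-30", "24-30"],                  # assumed column name
    "language_preferred": ["English", "English", "Spanish",
                           "English", "Spanish", "Spanish"],
})

language_age = (marketing
                .groupby(["age_group", "language_preferred"])["user_id"]
                .count()
                .unstack(level=1))

# kind='bar' draws one cluster of bars per age group,
# with one bar per language inside each cluster.
ax = language_age.plot(kind="bar")
plt.title("Language preferences by age group")  # assumed title
plt.xlabel("Age group")
plt.ylabel("Users")
plt.legend(loc="upper right", labels=language_age.columns.values)
```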

8. Language preferences by age group

And here's the plot. See how easy it is to slice and dice your data in order to obtain various insights?

9. Let's practice!

Now, it's time for you to group by multiple columns.