Get startedGet started for free

Absolute difference between two years

As an other variable, you might want to add the absolute difference in donations made in the last year (2017) and the number of donations made in the year before that (2016). To add this variable to the basetable, two pandas dataframes gifts_2016 and gifts_2017 with donations made in 2016 and 2017 respectively are constructed for you.

Recall that you can count the occurances in a pandas dataframe df for each id in a column id in a pandas dataframe df using the groupby function:

df_grouped = df.groupby("id")["value"].count().reset_index()

and that you can left join two dataframes df1 and df2 on a given merge_id using the following code:

merged_df = pd.merge(df1, df2, on=["merge_id"], how="left")

This exercise is part of the course

Intermediate Predictive Analytics in Python

View Course

Exercise instructions

  • Calculate for each donor in gifts_2016 the number of donations made in 2016.
  • Calculate for each donor in gifts_2017 the number of donations made in 2017.
  • Add the number of donations made in 2016 and 2017 to the basetable.
  • Calculate the difference between the number of donations made in 2017 and 2016 to the basetable.

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Number of gifts in 2016 and 2017 for each donor
gifts_2016_bydonor = gifts_2016.____("____").____().reset_index()
gifts_2016_bydonor.columns = ["donor_ID", "donations_2016"]
gifts_2017_bydonor = gifts_2017.____("____").____().reset_index()
gifts_2017_bydonor.columns = ["donor_ID", "donations_2017"]

# Add number of gifts in 2016 and 2017 to the basetable
basetable = pd.merge(____, ____, on="donor_ID", how="____")
basetable = pd.merge(____, ____, on="donor_ID", how="____")

# Calculate the number of gifts in 2017 minus number of gifts in 2016
basetable.fillna(0)
basetable["gifts_2017_min_2016"] = ____["____"] - ____["____"]
print(basetable.head())
Edit and Run Code