Exercise

User-defined function to make plots

In the last exercise, you had to write this code to make the plot:

# Subset tech and fmcg companies
subset_dat = dataset.loc[dataset["comp_type"].isin(["tech", "fmcg"])]

# Compute yearly average gross margin ratio of tech and fmcg companies
subset_dat_avg = subset_dat.pivot_table(index=["Year", "comp_type"], values = "gross_margin").reset_index()

# Add column company
subset_dat_avg["company"] = np.where(subset_dat_avg["comp_type"]=="tech", "Avg tech", "Avg fmcg")

# Concat the DataFrames
plot_df = pd.concat([subset_dat, subset_dat_avg], axis=0)

# Make the plot
sns.relplot(data=plot_df.reset_index(drop=True), x="Year", y="gross_margin", hue="company", col="comp_type", kind="line")
plt.show()
plt.close()

Notice that we perform the same actions on the tech and FMCG DataFrames in this exercise. This is repetitive and goes against a coding principle called DRY - Don't repeat yourself. Repetitive code is bad since it increases your work and makes your code more prone to mistakes. In this exercise, you will define your function to process data and plot figures.

Instructions 1/2

undefined XP
    1
    2

Question

Look at this function - this function can make the same plot that you did in the last exercise but with the repetitiveness stripped out.

def make_plot(dataset, ratio, comp_type):
  whole_dat = []
  for industry in comp_type:
    dat = dataset.loc[dataset["comp_type"]==industry]
    dat_avg = dat.pivot_table(index="Year", values=ratio).reset_index()
    dat_avg["company"], dat_avg["comp_type"] = f"Avg {type}", industry"
    whole_dat.append(pd.concat([dat, dat_avg]))

  plot_df = pd.concat(whole_dat).reset_index(drop=True)
  sns.relplot(data=plot_df, x="Year", y="gross_margin", hue="company", col="comp_type", kind="line")
  plt.show()
  plt.close()

This function uses an f-string, with f"Avg {industry}". This f-string will return Avgand the value of industry as an output. Also notice a drop=True in reset_index(). This will reset the index of the DataFrame but not add the old index as a column.

How many comp_types can this function take? The function is loaded in the console for you to test it. The pandas DataFrame dataset is also available with gross margin computed.

Possible answers