Exercise

Preprocess censored data

You are a marine biologist studying the lifespan of spinner dolphins. You have access to historical data detailing their birth and death dates. Some tagged dolphins migrated to a different area, and the lab lost track of them. Some dolphins migrated in from a different pod, and their exact birth dates are unknown. Some dolphins are still alive!

  • If the birth date is NaN, the dolphin is a migrant.
  • If the death date is NaN, the dolphin was either lost to tracking or is still alive.

The DataFrame is called dolphin_df. To create a new column called observed to flag if a dolphin's lifetime is censored, fill out the function check_observed with appropriate values and use .apply() to apply the function to dolphin_df.

pandas and numpy are loaded as pd and np, respectively.
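
The row-wise approach described above can be sketched as follows. This is only an illustration, not the exercise's reference solution: the column names birth_date and death_date and the toy rows are assumptions, since the actual layout of dolphin_df is not shown here.

```python
import pandas as pd

# Toy stand-in for dolphin_df. The column names "birth_date" and
# "death_date" are hypothetical; adjust them to the real DataFrame.
dolphin_df = pd.DataFrame({
    "birth_date": pd.to_datetime(["2001-05-01", None, "2005-03-12", "2010-07-30"]),
    "death_date": pd.to_datetime(["2019-08-15", "2018-01-20", None, None]),
})

def check_observed(row):
    """Return 0 if the lifetime is censored, 1 if it is fully observed."""
    # A missing birth date (migrant) or a missing death date
    # (lost to tracking or still alive) means the lifetime is censored.
    if pd.isna(row["birth_date"]) or pd.isna(row["death_date"]):
        return 0
    return 1

# axis=1 passes each row (not each column) to the function.
dolphin_df["observed"] = dolphin_df.apply(check_observed, axis=1)
print(dolphin_df)
```

Passing axis=1 matters here: without it, .apply() would hand the function whole columns instead of individual dolphins.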

Instructions

  • Create a function check_observed that returns 0 if the data point is censored, and 1 otherwise.
  • Create a censorship flag column called observed using the function check_observed.
  • Print the average value of the observed column to the console.
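
Because observed holds only 0s and 1s, its average is the share of dolphins whose full lifetime was recorded. The sketch below illustrates this on made-up data and builds the flag with a vectorized check rather than check_observed, as a way to cross-check the result; the column names birth_date and death_date are assumptions, not taken from the exercise.

```python
import pandas as pd

# Made-up rows for illustration; column names are hypothetical.
dolphin_df = pd.DataFrame({
    "birth_date": pd.to_datetime(["2001-05-01", None, "2005-03-12", "2003-11-02"]),
    "death_date": pd.to_datetime(["2019-08-15", "2018-01-20", None, "2020-02-09"]),
})

# A lifetime is observed only when both dates are present.
both_known = dolphin_df[["birth_date", "death_date"]].notna().all(axis=1)
dolphin_df["observed"] = both_known.astype(int)

# The mean of a 0/1 flag is the fraction of uncensored rows.
observed_share = dolphin_df["observed"].mean()
print(observed_share)
```

In this toy data two of the four dolphins have both dates, so half of the lifetimes are uncensored.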