Removing duplicate objects

Assume that you want to construct a predictive model to select the donors who are most likely to respond to a letter. The population of the basetable should contain only donors who have an address available and whose privacy settings allow you to send them a letter. All candidate donors are given in a dataframe donors with three columns: donor_id, a flag address that is 1 if the address is available and 0 otherwise, and a flag letter_allowed that is 1 if the donor may be sent a letter and 0 otherwise. In this exercise you will construct a set containing the donors that belong in the population.

This exercise is part of the course

Intermediate Predictive Analytics in Python

Exercise instructions

  • Create a dataframe donors_population only containing observations that have address available and for which a letter is allowed.
  • Create a list containing the donor ids in donors_population.
  • Construct the final population as the set of unique donor ids, then print the number of donors in it.

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Create a dataframe donors_population
donors_population = ____[(____["____"] == ____) & (____["____"] == ____)]

# Create a list of donor IDs
population_list = ____(____["____"])

# Select unique donors in population_list
population = ____(____)
print(len(population))
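
For reference, here is one possible way to carry out the three steps, shown as a minimal sketch rather than the course's official solution. The sample donors dataframe is made up for illustration (the real data is not shown on this page), and it assumes that a donor can appear in more than one row, which is why the list is converted to a set.

```python
import pandas as pd

# Hypothetical sample data; the exercise's real donors dataframe is not shown here.
# Donors 2 and 5 deliberately appear twice to illustrate duplicate removal.
donors = pd.DataFrame({
    "donor_id": [1, 2, 2, 3, 4, 5, 5],
    "address": [1, 1, 1, 0, 1, 1, 1],
    "letter_allowed": [1, 1, 1, 1, 0, 1, 1],
})

# Keep only rows where the address is available AND a letter is allowed.
# The parentheses are required because & binds tighter than ==.
donors_population = donors[(donors["address"] == 1) & (donors["letter_allowed"] == 1)]

# List of donor ids; it can still contain duplicates.
population_list = list(donors_population["donor_id"])

# A set keeps each donor id only once.
population = set(population_list)
print(len(population))  # prints 3 for the sample data above
```

With this sample data, the filter keeps five rows (donor ids 1, 2, 2, 5, 5), and the set reduces them to three unique donors.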