Simple use of .apply()
Let's get some hands-on experience with .apply()!
You are given the full scores dataset containing students' performance as well as their background information. Your task is to define the prevalence() function and apply it to the groups_to_consider columns of the scores DataFrame. This function should retrieve the most prevalent group/category for a given column (e.g. if the most prevalent category in the lunch column is standard, then prevalence() should return standard).
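To make the column-wise behaviour of .apply() concrete, here is a minimal sketch on a toy DataFrame. The names toy and cols and all of the values are hypothetical stand-ins for scores and groups_to_consider; the real dataset is not reproduced here. With the default axis, DataFrame.apply() calls the given function once per selected column and passes that column in as a pd.Series.

```python
import pandas as pd

# Hypothetical toy data standing in for the course's scores dataset
toy = pd.DataFrame({
    "lunch": ["standard", "free/reduced", "standard"],
    "gender": ["female", "female", "male"],
    "math score": [72, 69, 90],
})

# Hypothetical stand-in for groups_to_consider
cols = ["lunch", "gender"]

# apply() hands each selected column to the function as a pd.Series;
# here the function simply counts the distinct values in that column
n_unique = toy[cols].apply(lambda col: len(set(col)))
print(n_unique)
```

Because the function returns one scalar per column, the result is a Series indexed by the column names.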
The reduce() function from the functools module is already imported.
Tip: a pd.Series is an iterable object. Therefore, you can use standard Python operations for iterables on it, such as list() and set().
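As a quick illustration of the tip, the sketch below treats a small, made-up Series like any other Python iterable. The values are hypothetical and only loosely modelled on the lunch column.

```python
import pandas as pd

# Hypothetical values, not taken from the real scores dataset
lunch = pd.Series(["standard", "free/reduced", "standard"])

vals = list(lunch)                        # iterate the Series into a plain list
unique_items = set(vals)                  # distinct categories
standard_count = vals.count("standard")   # occurrences of one category

print(unique_items, standard_count)
```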
This exercise is part of the course Practicing Coding Interview Questions in Python.
Exercise instructions
- Create a tuple list with unique items from the passed series object and their counts.
- Extract a tuple with the highest counts using reduce().
- Return the item with the highest counts.
- Apply the prevalence function on the scores DataFrame using columns specified in groups_to_consider.
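The first three steps can be sketched on a plain Python list, without pandas. This is one possible approach under stated assumptions, not necessarily the course's reference solution: the sample values are hypothetical, and the tie-breaking rule (keep the earlier tuple when counts are equal) is an arbitrary choice made for this illustration.

```python
from functools import reduce

# Hypothetical sample values, not taken from the scores dataset
vals = ["standard", "free/reduced", "standard", "standard"]

# Step 1: pair each unique item with the number of times it occurs
pairs = [(x, vals.count(x)) for x in set(vals)]

# Step 2: fold the list pairwise, keeping the tuple with the larger count
best = reduce(lambda a, b: a if a[1] >= b[1] else b, pairs)

# Step 3: the item is the first element of the winning tuple
most_common = best[0]
print(most_common)  # standard
```

Note that set() has no guaranteed order, so when two categories share the highest count, which one is returned can vary.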
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
def prevalence(series):
    vals = list(series)
    # Create a tuple list with unique items and their counts
    itms = [(____, ____) for x in set(____)]
    # Extract a tuple with the highest counts using reduce()
    res = reduce(lambda x, y: ____, ____)
    # Return the item with the highest counts
    return ____[____]

# Apply the prevalence function on the scores DataFrame
result = scores[groups_to_consider].____
print(result)