Removing titles and taking names

While collecting survey respondent metadata in the airlines DataFrame, the full name of each respondent was saved in the full_name column. However, upon closer inspection, you found that many of the names are prefixed by honorifics such as "Dr.", "Mr.", "Ms." and "Miss".

Your ultimate objective is to create two new columns named first_name and last_name, containing the first and last names of respondents respectively. Before doing so, however, you need to remove the honorifics.

The airlines DataFrame is in your environment, alongside pandas as pd.

This exercise is part of the course “Cleaning Data in Python”.

Exercise instructions

  • Remove "Dr.", "Mr.", "Miss" and "Ms." from full_name by replacing them with an empty string "" in that order.
  • Run the assert statement using .str.contains() that tests whether full_name still contains any of the honorifics.
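As a rough illustration of the two string methods named above, here is a minimal sketch on a small made-up Series. The names variable and its values are hypothetical stand-ins, not the real airlines data, and the final .str.strip() call is an extra step that the exercise does not ask for.

```python
import pandas as pd

# Hypothetical sample data; the real airlines DataFrame is not reproduced here
names = pd.Series(["Dr. Jane Doe", "Mr. John Smith", "Miss Ana Lopez", "Ms. Mei Chen"])

# Remove each honorific as a literal string, in the order the instructions give.
# regex=False makes the "." a plain character regardless of pandas version.
for title in ["Dr.", "Mr.", "Miss", "Ms."]:
    names = names.str.replace(title, "", regex=False)

# Extra step (an assumption, not part of the exercise): drop the leftover leading space
names = names.str.strip()

# .str.contains() returns a boolean Series; .any() collapses it to a single flag
has_title = names.str.contains("Dr.", regex=False).any()
```

Passing regex=False explicitly is a defensive choice here, since the default for .str.replace() has differed between pandas versions.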

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Replace "Dr." with empty string ""
airlines['full_name'] = airlines['full_name'].____.____("____","")

# Replace "Mr." with empty string ""
airlines['full_name'] = ____

# Replace "Miss" with empty string ""
____

# Replace "Ms." with empty string ""
____

# Assert that full_name has no honorifics
assert airlines['full_name'].str.contains('Ms.|Mr.|Miss|Dr.').any() == False
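One caveat about the assert pattern: .str.contains() treats its pattern as a regular expression by default, so each unescaped "." matches any character rather than a literal period. The sketch below uses Python's re module directly, with a hypothetical name, to show how the unescaped pattern can flag a name that contains no honorific at all, and how re.escape() avoids that. This is an optional hardening and not something the exercise requires.

```python
import re

honorifics = ["Ms.", "Mr.", "Miss", "Dr."]

# Pattern as written in the exercise: "." is a wildcard here
loose = "|".join(honorifics)

# Escaped variant: "." only matches a literal period
strict = "|".join(re.escape(h) for h in honorifics)

# Hypothetical name with no honorific; "Dra" in "Drake" satisfies the loose "Dr."
name = "Andrew Drake"

loose_hit = re.search(loose, name) is not None    # True: false positive
strict_hit = re.search(strict, name) is not None  # False: no literal "Dr." present
```

If the assert in the exercise ever fails on data that looks clean, a false positive of this kind is one possible cause worth checking.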

Learn to diagnose and treat dirty data and develop the skills needed to transform your raw data into accurate insights!

Categorical and text data can often be some of the messiest parts of a dataset due to their unstructured nature. In this chapter, you’ll learn how to fix whitespace and capitalization inconsistencies in category labels, collapse multiple categories into one, and reformat strings for consistency.

Exercise 1: Membership constraints
Exercise 2: Members only
Exercise 3: Finding consistency
Exercise 4: Categorical variables
Exercise 5: Categories of errors
Exercise 6: Inconsistent categories
Exercise 7: Remapping categories
Exercise 8: Cleaning text data
Exercise 9: Removing titles and taking names
Exercise 10: Keeping it descriptive
