Word search with dataframes

In this exercise, you'll work with text data: emails from Enron employees. The Enron scandal is a famous fraud case. Enron employees covered up the company's poor financial position, which kept the stock price artificially high. They sold their own stock options, and when the truth came out, Enron investors were left with nothing. The goal is to find all emails that mention specific phrases, such as "sell enron stock".

By using string operations on dataframes, you can easily sift through messy email data and create flags based on word hits. The Enron email data has been loaded into a dataframe called df, so let's search for suspicious terms. Feel free to explore df in the Console before getting started.
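To make the idea of a word-hit flag concrete, here is a minimal sketch on a toy dataframe. The `emails` dataframe, its rows, and the `flag` column name are made up for illustration and are not the exercise's `df`; only the `clean_content` column name and the search phrase come from the exercise. It relies on the pandas `.str` accessor, which is one common way to search text columns.

```python
import pandas as pd

# Hypothetical toy data standing in for the exercise's dataframe
emails = pd.DataFrame({
    "clean_content": [
        "please sell enron stock before friday",
        "lunch meeting moved to noon",
        None,  # a missing email body
        "do not sell enron stock yet",
    ]
})

# str.contains returns a boolean Series; na=False treats missing text as "no match"
mask = emails["clean_content"].str.contains("sell enron stock", na=False)

# Store the hit as a 0/1 flag column (the column name is an assumption)
emails["flag"] = mask.astype(int)

# Keep only the rows that mention the phrase
hits = emails.loc[mask]
print(hits)
```

By default `str.contains` interprets the pattern as a regular expression. For a plain phrase like this one the result is the same, but passing `regex=False` is a reasonable choice when the search term might contain special characters.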

This exercise is part of the course

Fraud Detection in Python

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Find all cleaned emails that contain 'sell enron stock'
mask = df['clean_content'].____.____('____', na=False)