Word search with dataframes
In this exercise you're going to work with text data containing emails from Enron employees. The Enron scandal is a famous fraud case. Enron employees covered up the bad financial position of the company, thereby keeping the stock price artificially high. Enron employees sold their own stock options, and when the truth came out, Enron investors were left with nothing. The goal is to find all emails that mention specific phrases, such as "sell enron stock".
By using string operations on dataframes, you can easily sift through messy email data and create flags based on word hits. The Enron email data has been put into a dataframe called df, so let's search for suspicious terms. Feel free to explore df in the Console before getting started.
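As a rough illustration of flagging rows by word hits, here is a minimal sketch on a small made-up dataframe. The name toy, its four rows, and the search term 'enron stock' are assumptions for demonstration only; they are not the exercise data or its solution. Only the column name clean_content is borrowed from the exercise.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for the real Enron dataframe
toy = pd.DataFrame({
    'clean_content': [
        'please sell enron stock before friday',
        'lunch meeting moved to noon',
        np.nan,  # an email with no body text
        'enron stock price update',
    ]
})

# str.contains returns a boolean Series; na=False counts missing text as "no match"
mask = toy['clean_content'].str.contains('enron stock', na=False)

# Keep the hits as a 0/1 flag column so they can be used as a feature later
toy['flag'] = mask.astype(int)

# Select only the rows that matched
hits = toy.loc[mask]
print(hits)
```

Note that str.contains interprets the search term as a regular expression by default; passing regex=False forces a literal match, which matters if the term contains characters such as '.' or '$'.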
This exercise is part of the course
Fraud Detection in Python
Hands-on interactive exercise
Try this exercise by completing this sample code.
# Find all cleaned emails that contain 'sell enron stock'
mask = df['clean_content'].____.____('____', na=False)