Word search with dataframes
In this exercise you're going to work with text data: emails from Enron employees. The Enron scandal is a famous fraud case. Enron employees covered up the company's bad financial position, thereby keeping the stock price artificially high. They sold their own stock options, and when the truth came out, Enron investors were left with nothing. The goal is to find all emails that mention specific words, such as "sell enron stock".
By using string operations on dataframes, you can easily sift through messy email data and create flags based on word hits. The Enron email data has been put into a dataframe called df, so let's search for suspicious terms. Feel free to explore df in the Console before getting started.
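To make the idea of word-hit flags concrete, here is a minimal sketch on a small made-up dataframe. The name emails, its four rows, and the flag column are assumptions for illustration only; they are not the course's df or its real contents. Only the clean_content column name is taken from the exercise.

```python
import pandas as pd

# Hypothetical toy data standing in for the real Enron dataframe.
emails = pd.DataFrame({
    "clean_content": [
        "please sell enron stock before friday",
        "lunch meeting moved to noon",
        None,  # an email with a missing body
        "do not sell anything yet",
    ]
})

# .str.contains returns a boolean Series, one value per row.
# na=False makes rows with a missing body count as "no match"
# instead of propagating a missing value into the mask.
mask = emails["clean_content"].str.contains("sell enron stock", na=False)

# Turn the boolean mask into a 0/1 flag column (illustrative name).
emails["flag"] = mask.astype(int)

# Keep only the rows where the phrase was found.
hits = emails.loc[mask]
print(hits)
```

Note that str.contains treats its pattern as a regular expression by default; for a plain phrase like this one the result is the same, but a search term containing characters such as "." or "(" may need regex=False.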
This exercise is part of the course
Fraud Detection in Python
Interactive exercise
Try this exercise by completing the sample code below.
# Find all cleaned emails that contain 'sell enron stock'
mask = df['clean_content'].____.____('____', na=False)