Using a list of terms

Often you don't want to search on just one term. You can usually build a full "fraud dictionary" of terms that could flag potentially fraudulent clients and/or transactions, and fraud analysts often have a good idea of what belongs in such a dictionary. In this exercise you're going to flag a multitude of terms, and in the next exercise you'll create a new flag variable out of it. The flag can be used either directly in a machine learning model as a feature, or as an additional filter on top of your machine learning model results. Let's first use a list of terms to filter the data. The dataframe containing the cleaned emails is again available as df.

This exercise is part of the course

Fraud Detection in Python

Exercise instructions

  • Create a list of terms to search for, including 'enron stock', 'sell stock', 'stock bonus', and 'sell enron stock'.
  • Join the terms into a single search condition, separated by '|'.
  • Filter df down to the emails that match any term in the list defined as searchfor.
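
To make the join-and-filter idea concrete, here is a minimal sketch on a toy dataframe. It is not the exercise solution: the dataframe `emails`, the column name `text`, and the two-term list are assumptions made up for illustration, and the course's df may use different names.

```python
import pandas as pd

# Toy stand-in for the course's df; the name "emails" and the column
# "text" are hypothetical and chosen only for this illustration.
emails = pd.DataFrame({
    "text": [
        "please sell stock before friday",
        "lunch meeting moved to noon",
        None,  # a missing email body
        "details on the stock bonus plan",
    ]
})

terms = ["sell stock", "stock bonus"]

# '|' is the regex alternation operator, so the joined pattern
# matches a string that contains any one of the terms.
pattern = "|".join(terms)

# str.contains returns a boolean Series; na=False counts missing
# text as "no match" instead of propagating a missing value.
mask = emails["text"].str.contains(pattern, na=False)

# Keep only the rows where the mask is True.
matches = emails.loc[mask]
print(matches)
```

Because str.contains interprets the pattern as a regular expression by default, terms that contain special characters such as '.' or '+' would need escaping (for example with re.escape) before being joined.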

Interactive exercise

Try this exercise by completing the sample code below.

# Create a list of terms to search for
searchfor = ['____', '____', '____', '____']

# Filter cleaned emails on searchfor list and select from df 
filtered_emails = df.____[____['_____'].____._____('|'.join(____), na=False)]
print(filtered_emails)