Using statistics to define normal behavior

In the previous exercises we saw that fraud is more prevalent in certain transaction categories, but that there is no obvious way to segment the data into, for example, age groups. This time, let's investigate the average amounts spent in normal transactions versus fraudulent transactions. This gives you an idea of how fraudulent transactions differ structurally from normal ones.
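
A minimal sketch of that comparison, assuming a dataframe with an `amount` column and a binary `fraud` column as used in this exercise. The toy values below are made up for illustration and are not the course data.

```python
import pandas as pd

# Hypothetical toy data; the real exercise uses the course's df
df = pd.DataFrame({
    "amount": [20.0, 35.0, 50.0, 15.0, 400.0, 600.0],
    "fraud": [0, 0, 0, 0, 1, 1],
})

# Mean transaction amount per class (0 = normal, 1 = fraud)
mean_amounts = df.groupby("fraud")["amount"].mean()
print(mean_amounts)
```

A large gap between the two group means would suggest that the amount is a useful signal, though a mean alone says nothing about how the distributions overlap.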

This exercise is part of the course

Fraud Detection in Python

Exercise instructions

  • Create two new dataframes from the fraud and non-fraud observations. Select the rows of df with .loc, using the condition "fraud is 1" for the first dataframe and "fraud is 0" for the second.
  • Plot the amount column of each new dataframe with the histogram plot function, and assign the labels fraud and nonfraud respectively to the plots.

Interactive exercise

Try this exercise by completing the sample code.

# Create two dataframes with fraud and non-fraud data 
df_fraud = df.____[df.____ == ____] 
df_non_fraud = df.____[df.____ == ____]

# Plot histograms of the amounts in fraud and non-fraud data 
plt.hist(____.____, alpha=0.5, label='____')
plt.hist(____.____, alpha=0.5, label='____')
plt.legend()
plt.show()
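
For readers without access to the course data, the same pattern can be sketched on synthetic data. The dataframe below is generated randomly and is purely an assumption for illustration (the exponential distributions, sample sizes, and seed are not from the course); only the column names `amount` and `fraud` follow the exercise.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical synthetic data: 950 normal and 50 fraudulent transactions
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "amount": np.concatenate([rng.exponential(30, 950), rng.exponential(300, 50)]),
    "fraud": np.concatenate([np.zeros(950, dtype=int), np.ones(50, dtype=int)]),
})

# Split the rows by class with boolean masks inside .loc
df_fraud = df.loc[df.fraud == 1]
df_non_fraud = df.loc[df.fraud == 0]

# Overlay the two amount histograms; alpha=0.5 keeps both visible
plt.hist(df_fraud.amount, alpha=0.5, label="fraud")
plt.hist(df_non_fraud.amount, alpha=0.5, label="nonfraud")
plt.legend()
plt.show()
```

Because fraud is usually the much smaller class, its histogram can be hard to see on a shared count axis; passing `density=True` to `plt.hist` is one way to compare the shapes of the two distributions instead of their raw counts.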