Exploring your data
In the next exercises, you will be looking at bank payment transaction data. The financial transactions are categorized by type of expense, as well as the amount spent. Moreover, you have some client characteristics available such as age group and gender. Some of the transactions are labelled as fraud; you'll treat these labels as given and will use those to validate the results.
When using unsupervised learning techniques for fraud detection, you want to distinguish normal from abnormal (thus potentially fraudulent) behavior. As a fraud analyst to understand what is "normal", you need to have a good understanding of the data and its characteristics. Let's explore the data in this first exercise.
This exercise is part of the course
Fraud Detection in Python
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Get the dataframe shape
df.____
# Display the first 5 rows
df.____