In this first chapter, we will discuss the concept of credit risk and define how it is calculated. Using cross tables and plots, we will explore a real-world data set. Before applying machine learning, we will process this data by finding and resolving problems.
With the loan data fully prepared, we will discuss the logistic regression model which is a standard in risk modeling. We will understand the components of this model as well as how to score its performance. Once we've created predictions, we can explore the financial impact of utilizing this model.
Decision trees are another standard credit risk model. We will go beyond decision trees by using the trendy XGBoost package in Python to create gradient boosted trees. After developing sophisticated models, we will stress test their performance and discuss column selection in unbalanced data.
After developing and testing two powerful machine learning models, we use key performance metrics to compare them. Using advanced model selection techniques specifically for financial modeling, we will select one model. With that model, we will: develop a business strategy, estimate portfolio value, and minimize expected loss.