1. Regression review
Congratulations on finishing chapter 1! Now that you've learned how to use XGBoost for classification, you'll learn how to use XGBoost for regression in this chapter.
2. Regression basics
Regression problems involve predicting continuous, or real, values. For example, if you're attempting to predict the height in centimeters a given person will be at age 30 given some of their physical attributes at birth, you're solving a regression problem. Evaluating the quality of a regression model involves using a different set of metrics than those we described for classification problems in chapter 1.
3. Common regression metrics
In most cases, we use root mean squared error (RMSE) or the mean absolute error (MAE) to evaluate the quality of a regression model.
4. Computing RMSE
RMSE is computed by
5. Computing RMSE
taking the difference between the actual and the predicted values of the quantity you are trying to predict,
6. Computing RMSE
squaring those differences, computing their mean, and taking that value's square root. This allows us to treat negative and positive differences equally, but tends to punish larger differences between predicted and actual values much more than smaller ones. MAE, on the other hand,
7. Computing MAE
simply averages the absolute differences between predicted and actual values across all of the samples we evaluate our model on. Although MAE isn't affected by large differences as much as RMSE, it lacks some nice mathematical properties, such as being differentiable everywhere, which makes it much less frequently used as an evaluation metric.
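To make the two metrics concrete, here is a minimal sketch in plain Python. The function names `rmse` and `mae` and the height values are made up for illustration; they are not from the course materials. MAE is computed here as the mean of the absolute differences.

```python
import math

def rmse(actual, predicted):
    """Square each difference, average the squares, then take the square root."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))

def mae(actual, predicted):
    """Average the absolute differences."""
    absolute = [abs(a - p) for a, p in zip(actual, predicted)]
    return sum(absolute) / len(absolute)

# Hypothetical heights in centimeters; the last prediction is off by 10 cm.
actual = [170.0, 165.0, 180.0, 175.0]
predicted = [172.0, 163.0, 180.0, 165.0]

print("RMSE:", rmse(actual, predicted))
print("MAE: ", mae(actual, predicted))
```

On this toy data the single 10 cm miss pulls RMSE well above MAE, which is the "punishes larger differences more" behavior described above.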
8. Common regression algorithms
Some common algorithms that are used for regression problems include linear regression and decision trees. It's important to briefly note here that some algorithms,
9. Algorithms for both regression and classification
such as decision trees, can be used for both regression and classification tasks, which, as we will see, is one of the important properties that make them prime candidates to be building blocks for XGBoost models.
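One way to see why a tree can serve both tasks is that only the value stored in each leaf changes: a mean for regression, a majority class for classification. The sketch below is a toy one-split tree written for illustration, not how XGBoost or any library builds trees. The names `fit_stump`, `predict`, and `majority`, the median split rule, and the data are all assumptions.

```python
from collections import Counter
from statistics import mean

def fit_stump(xs, ys, leaf_value):
    """Fit a one-split tree on a single feature.

    leaf_value turns the targets that fall in a leaf into a prediction.
    Assumption: we split at the median of xs; real trees search for the
    split that best reduces error or impurity.
    """
    threshold = sorted(xs)[len(xs) // 2]
    left = [y for x, y in zip(xs, ys) if x < threshold]
    right = [y for x, y in zip(xs, ys) if x >= threshold]
    return threshold, leaf_value(left), leaf_value(right)

def predict(stump, x):
    """Route x to the left or right leaf and return that leaf's value."""
    threshold, left_value, right_value = stump
    return left_value if x < threshold else right_value

def majority(labels):
    """Most frequent label in a leaf."""
    return Counter(labels).most_common(1)[0][0]

xs = [1.0, 2.0, 3.0, 4.0]  # hypothetical feature values

# Regression: leaves store the mean of a continuous target.
reg_stump = fit_stump(xs, [10.0, 12.0, 20.0, 22.0], mean)
print(predict(reg_stump, 1.5))

# Classification: leaves store the majority class label.
clf_stump = fit_stump(xs, ["short", "short", "tall", "tall"], majority)
print(predict(clf_stump, 3.5))
```

The splitting logic is shared; swapping the `leaf_value` function is all that separates the regression stump from the classification one in this sketch.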
10. Let's practice!
Awesome, let's test your regression knowledge with a multiple choice question.