
Regression review

1. Regression review

Congratulations on finishing chapter 1! Now that you've learned how to use XGBoost for classification, you'll learn how to use XGBoost for regression in this chapter.

2. Regression basics

Regression problems involve predicting continuous, or real, values. For example, if you're attempting to predict the height in centimeters a given person will reach at age 30 from some of their physical attributes at birth, you're solving a regression problem. Evaluating the quality of a regression model requires a different set of metrics than those we described for classification problems in chapter 1.

3. Common regression metrics

In most cases, we use root mean squared error (RMSE) or the mean absolute error (MAE) to evaluate the quality of a regression model.

4. Computing RMSE

RMSE is computed by

5. Computing RMSE

taking the difference between the actual and the predicted values for what you are trying to predict,

6. Computing RMSE

squaring those differences, computing their mean, and taking that value's square root. Squaring lets us treat negative and positive differences equally, but it also punishes larger differences between predicted and actual values much more than smaller ones. MAE, on the other hand,

7. Computing MAE

simply averages the absolute differences between predicted and actual values across all of the samples we build our model on. Although MAE is less affected by large differences than RMSE, it lacks some nice mathematical properties (for example, the absolute value is not differentiable at zero), which makes it much less frequently used as an evaluation metric.
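To make the two definitions concrete, here is a minimal sketch in plain Python. The function names `rmse` and `mae` and the height values are made up for illustration and are not taken from the course. RMSE is the square root of the mean squared difference, and MAE is the mean of the absolute differences.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error: square the differences, average, take the root."""
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

def mae(actual, predicted):
    """Mean absolute error: average the absolute differences."""
    absolute_errors = [abs(a - p) for a, p in zip(actual, predicted)]
    return sum(absolute_errors) / len(absolute_errors)

# Hypothetical heights in centimeters (illustrative values only).
actual = [170.0, 165.0, 180.0, 175.0]
predicted_small = [171.0, 164.0, 181.0, 174.0]    # every prediction is off by 1
predicted_outlier = [170.0, 165.0, 180.0, 179.0]  # one prediction is off by 4

print(rmse(actual, predicted_small), mae(actual, predicted_small))
print(rmse(actual, predicted_outlier), mae(actual, predicted_outlier))
```

Both sets of predictions have the same total absolute error, so their MAE is identical. The set with a single large miss has the higher RMSE, which is the "punishes larger differences" behavior described above.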

8. Common regression algorithms

Some common algorithms that are used for regression problems include linear regression and decision trees. It's important to briefly note here that some algorithms,

9. Algorithms for both regression and classification

such as decision trees, can be used for both regression and classification tasks, which, as we will see, is one of the important properties that make them prime candidates to be building blocks for XGBoost models.
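One way to see why a tree can serve both tasks is that only the value stored in each leaf needs to change. The sketch below is a deliberately simplified, hypothetical one-split tree (a "stump") with a fixed threshold. The names `fit_stump` and `predict_stump` and the data are assumptions for illustration; this is not how XGBoost or any particular library builds its trees, and real trees search for the best split rather than being handed one.

```python
from collections import Counter
from statistics import mean

def fit_stump(xs, ys, threshold, task):
    """Fit a one-split tree on a single feature with a given threshold.

    task="regression": each leaf predicts the mean target of its samples.
    task="classification": each leaf predicts its most common label.
    Assumes both sides of the split contain at least one sample.
    """
    left = [y for x, y in zip(xs, ys) if x <= threshold]
    right = [y for x, y in zip(xs, ys) if x > threshold]
    if task == "regression":
        summarize = mean
    elif task == "classification":
        summarize = lambda labels: Counter(labels).most_common(1)[0][0]
    else:
        raise ValueError("task must be 'regression' or 'classification'")
    return threshold, summarize(left), summarize(right)

def predict_stump(stump, x):
    """Route x to the left or right leaf and return that leaf's value."""
    threshold, left_value, right_value = stump
    return left_value if x <= threshold else right_value

# Hypothetical data: one feature, a continuous target, and a class label.
xs = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
heights = [150.0, 152.0, 154.0, 180.0, 182.0, 184.0]
labels = ["short", "short", "short", "tall", "tall", "tall"]

reg_stump = fit_stump(xs, heights, threshold=5.0, task="regression")
clf_stump = fit_stump(xs, labels, threshold=5.0, task="classification")

print(predict_stump(reg_stump, 2.5))   # a continuous value
print(predict_stump(clf_stump, 11.5))  # a class label
```

The splitting logic is shared between the two tasks. Only the way each leaf summarizes its samples differs: a mean for regression, a majority vote for classification.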

10. Let's practice!

Awesome, let's test your regression knowledge with a multiple-choice question.