1. When should I use XGBoost?
So, given everything we've said about XGBoost, when should (and shouldn't) you use it?
2. When to use XGBoost
Given that I've already talked a bit about when and where XGBoost shines, some of this shouldn't come as a surprise to you. You should consider using XGBoost for any supervised machine learning task that fits the following criteria:
You have a large number of training examples. Although definitions of "large" can vary, here I mean a dataset with relatively few features and at least 1000 examples. More generally, as long as the number of features in your training set is smaller than the number of examples you have, you should be fine.
Second, XGBoost tends to do well when you have a mixture of categorical and numeric features, or when you have just numeric features.
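To make these criteria concrete, here is a minimal sketch of fitting XGBoost on a tabular dataset where examples far outnumber features and all features are numeric. The dataset, parameter values, and variable names are illustrative, not part of the course:

```python
import xgboost as xgb
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Illustrative tabular dataset: 5000 examples, 20 numeric features,
# so the number of examples comfortably exceeds the number of features.
X, y = make_classification(n_samples=5000, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Fit a gradient boosted tree classifier with some common starting values.
model = xgb.XGBClassifier(n_estimators=100, max_depth=4, learning_rate=0.1)
model.fit(X_train, y_train)

preds = model.predict(X_test)
print("accuracy:", accuracy_score(y_test, preds))
```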
3. When to NOT use XGBoost
When should you not use XGBoost?
The most important kinds of problems where XGBoost is a suboptimal choice are those where other state-of-the-art algorithms have already found success, and those that suffer from dataset size issues.
Specifically, XGBoost is not ideally suited for image recognition, computer vision, or natural language processing and understanding problems, as those kinds of problems can be much better tackled using deep learning approaches.
In terms of dataset size problems, XGBoost is not suitable when you have very small training sets (less than 100 training examples) or when the number of training examples is significantly smaller than the number of features being used for training.
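As a rough illustration, you could encode these rules of thumb in a quick check before reaching for XGBoost. The thresholds below come straight from the guidelines above, and the helper function is my own invention, not a library API:

```python
import numpy as np

def xgboost_seems_suitable(X):
    """Heuristic check based on the rules of thumb above (not a hard rule)."""
    n_examples, n_features = X.shape
    if n_examples < 100:
        return False  # very small training set
    if n_examples < n_features:
        return False  # more features than examples
    return True

# Example: 5000 rows x 20 columns passes; 50 rows x 500 columns does not.
print(xgboost_seems_suitable(np.zeros((5000, 20))))  # True
print(xgboost_seems_suitable(np.zeros((50, 500))))   # False
```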
4. Let's practice!
Ok, let's finish off what you learned in chapter 1 with one last multiple-choice question!