Exploratory data analysis
Before diving into the nitty-gritty of pipelines and preprocessing, let's do some exploratory analysis of the original, unprocessed Ames housing dataset. When you worked with this data in previous chapters, we preprocessed it for you so you could focus on the core XGBoost concepts. In this chapter, you'll do the preprocessing yourself!
A smaller version of this original, unprocessed dataset has been pre-loaded into a pandas DataFrame called df. Your task is to explore df in the Shell and pick the option that is incorrect. The larger purpose of this exercise is to understand the kinds of transformations you will need to perform in order to be able to use XGBoost.
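If you're not sure where to start in the Shell, here is a minimal sketch of the kind of checks that tend to be useful. It assumes nothing about the real dataset: the small DataFrame below is a hypothetical stand-in with made-up values and only three illustrative columns, so that the snippet runs on its own. In the exercise itself, df is already loaded and you would skip that step.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the pre-loaded df. The values are made up for
# illustration and the real dataset has many more rows and columns.
df = pd.DataFrame({
    "LotFrontage": [65.0, np.nan, 68.0, 60.0],
    "Neighborhood": ["CollgCr", "Veenker", "CollgCr", "Crawfor"],
    "SalePrice": [208500, 181500, 223500, 140000],
})

# Column names, dtypes, and non-null counts in one overview
df.info()

# Summary statistics for the numeric columns
print(df.describe())

# Number of missing values per column
missing_counts = df.isnull().sum()
print(missing_counts)

# Columns that are not numeric and would need encoding before modeling
categorical_columns = df.select_dtypes(exclude="number").columns.tolist()
print(categorical_columns)
```

Checks like these usually point to the two things worth looking for here: columns with missing values and columns holding text categories rather than numbers.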
This exercise is part of the course Extreme Gradient Boosting with XGBoost.
