Decision trees as base learners
It's now time to build an XGBoost model to predict house prices - not in Boston, Massachusetts, as you saw in the video, but in Ames, Iowa! This dataset of housing prices has been pre-loaded into a DataFrame called df. If you explore it in the Shell, you'll see that there are a variety of features about the house and its location in the city.
In this exercise, your goal is to use trees as base learners. By default, XGBoost uses trees as base learners, so you don't have to specify that you want to use trees here with booster="gbtree".
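To see what "default" means in practice, here is a minimal sketch (the variable names and parameter values are illustrative, not part of the exercise): because "gbtree" is the default booster, the two regressors below are configured identically.

import xgboost as xgb

# "gbtree" is the default booster, so omitting it changes nothing.
implicit_reg = xgb.XGBRegressor(objective="reg:squarederror", n_estimators=10)
explicit_reg = xgb.XGBRegressor(objective="reg:squarederror", n_estimators=10,
                                booster="gbtree")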
xgboost has been imported as xgb, and the arrays for the features and the target are available in X and y, respectively.
This exercise is part of the course Extreme Gradient Boosting with XGBoost.
Exercise instructions
- Split df into training and testing sets, holding out 20% for testing. Use a random_state of 123.
- Instantiate the XGBRegressor as xg_reg, using a seed of 123. Specify an objective of "reg:squarederror" and use 10 trees. Note: you don't have to specify booster="gbtree" as this is the default.
- Fit xg_reg to the training data and predict the labels of the test set. Save the predictions in a variable called preds.
- Compute the rmse using np.sqrt() and the mean_squared_error() function from sklearn.metrics, which has been pre-imported.
Hands-on interactive exercise
Try this exercise by completing the sample code.
# Create the training and test sets
X_train, X_test, y_train, y_test = ____(____, ____, ____=____, random_state=123)
# Instantiate the XGBRegressor: xg_reg
xg_reg = ____
# Fit the regressor to the training set
____
# Predict the labels of the test set: preds
preds = ____
# Compute the rmse: rmse
rmse = ____(____(____, ____))
print("RMSE: %f" % (rmse))