Preprocessing within a pipeline
Now that you've seen the individual steps needed to properly process the Ames housing data, let's use the much cleaner and more succinct DictVectorizer approach and put it alongside an XGBRegressor inside a scikit-learn pipeline.
This exercise is part of the course
Extreme Gradient Boosting with XGBoost
Exercise instructions
- Import DictVectorizer from sklearn.feature_extraction and Pipeline from sklearn.pipeline.
- Fill in any missing values in the LotFrontage column of X with 0.
- Complete the steps of the pipeline with DictVectorizer(sparse=False) for "ohe_onestep" and xgb.XGBRegressor() for "xgb_model".
- Create the pipeline using Pipeline() and steps.
- Fit the Pipeline. Don't forget to convert X into a format that DictVectorizer understands by calling the to_dict("records") method on X.
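To see why the to_dict("records") conversion matters, here is a minimal sketch of the encoding step on its own. The X_toy frame and its values are made up for illustration and only stand in for the Ames features; the column names are borrowed from that dataset as an assumption.

```python
import pandas as pd
from sklearn.feature_extraction import DictVectorizer

# Hypothetical toy frame (not the real Ames data): one numeric column
# with a missing value and one categorical column.
X_toy = pd.DataFrame({
    "LotFrontage": [65.0, None, 80.0],
    "Neighborhood": ["CollgCr", "Veenker", "CollgCr"],
})

# Fill the missing numeric value with 0, as the exercise asks
X_toy["LotFrontage"] = X_toy["LotFrontage"].fillna(0)

# DictVectorizer expects a list of {column: value} dicts, one per row
records = X_toy.to_dict("records")

# sparse=False returns a dense NumPy array instead of a SciPy sparse matrix
dv = DictVectorizer(sparse=False)
encoded = dv.fit_transform(records)

# Numeric features pass through; string features become one-hot columns
print(dv.feature_names_)
print(encoded)
```

Each string-valued column is expanded into one column per category, named like "Neighborhood=CollgCr", while numeric columns are kept as they are.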
Hands-on interactive exercise
Try this exercise by completing the sample code.
# Import necessary modules
____
____
# Fill LotFrontage missing values with 0
X.LotFrontage = ____
# Set up the pipeline steps: steps
steps = [("ohe_onestep", ____),
         ("xgb_model", ____)]
# Create the pipeline: xgb_pipeline
xgb_pipeline = ____
# Fit the pipeline
____