Kidney disease case study III: Full pipeline
It's time to piece together all of the transforms along with an XGBClassifier
to build the full pipeline!
Besides the numeric_categorical_union that you created in the previous exercise, two other transforms are needed: the Dictifier() transform, which we created for you, and the DictVectorizer().
After creating the pipeline, your task is to cross-validate it to see how well it performs.
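To make the role of the two extra transforms concrete, here is a minimal sketch of how a Dictifier-style step and DictVectorizer fit together. The Dictifier class below is a hypothetical stand-in written for illustration (the course supplies its own implementation, which is not shown here), and the two-row toy frame is invented, not taken from the kidney dataset.

```python
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.feature_extraction import DictVectorizer


class Dictifier(BaseEstimator, TransformerMixin):
    """Hypothetical stand-in: turns tabular rows into a list of dicts."""

    def fit(self, X, y=None):
        # Nothing to learn; present only to satisfy the transformer interface.
        return self

    def transform(self, X):
        # One dict per row, keyed by column name (or column position).
        return pd.DataFrame(X).to_dict("records")


# Invented toy data with one numeric and one categorical column.
toy = pd.DataFrame({"age": [48.0, 62.0], "rbc": ["normal", "abnormal"]})

records = Dictifier().fit_transform(toy)

# DictVectorizer passes numeric values through and one-hot encodes strings.
vectorizer = DictVectorizer(sort=False)
matrix = vectorizer.fit_transform(records)

print(records)
print(vectorizer.feature_names_)
print(matrix.toarray())
```

The string column is expanded into one indicator column per category, while the numeric column passes through unchanged, which is why the vectorizer can sit directly in front of a classifier.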
This exercise is part of the course
Extreme Gradient Boosting with XGBoost
Exercise instructions
- Create the pipeline using the numeric_categorical_union, Dictifier(), and DictVectorizer(sort=False) transforms, and the xgb.XGBClassifier() estimator with max_depth=3. Name the transforms "featureunion", "dictifier", and "vectorizer", and the estimator "clf".
- Perform 3-fold cross-validation on the pipeline using cross_val_score(). Pass it the pipeline, pipeline; the features, kidney_data; and the outcomes, y. Also set scoring to "roc_auc" and cv to 3.
Interactive exercise
Try this exercise by completing the sample code.
# Create full pipeline
pipeline = ____([
    ("____", ____),
    ("____", ____),
    ("____", ____),
    ("____", ____)
])
# Perform cross-validation
cross_val_scores = ____(____, ____, ____, ____="____", ____=____)
# Print avg. AUC
print("3-fold AUC: ", np.mean(cross_val_scores))