From zero to hero
You mastered the skills of creating a model specification and splitting the data into training and test sets. You also know how to avoid class imbalances in the split. It's now time to combine what you learned in the preceding lesson and build your model using only the training set!
You are going to build a proper machine learning pipeline. It consists of creating a model specification, splitting your data into training and test sets, and, last but not least, fitting a model to the training data. Enjoy!
This exercise is part of the course
Machine Learning with Tree-Based Models in R
Exercise instructions
- Create diabetes_split, a split where the training set contains three-quarters of all diabetes rows and where training and test sets have a similar distribution in the outcome variable.
- Build a decision tree specification for your model using the rpart engine and save it as tree_spec.
- Fit a model model_trained using the training data of diabetes_split with outcome as the target variable and bmi and skin_thickness as the predictors.
Hands-on interactive exercise
Try this exercise by completing this sample code.
set.seed(9)
# Create the balanced data split
diabetes_split <- ___
# Build the specification of the model
tree_spec <- ___ %>%
  ___ %>%
  ___
# Train the model
model_trained <- ___ %>%
  fit(___,
      ___)
model_trained
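For reference, one possible way to fill in the template above is sketched here. The course's diabetes data is not reproduced on this page, so the sketch builds a small synthetic stand-in with made-up values (an assumption, for illustration only) that uses the same three column names. It also assumes the tidymodels and rpart packages are installed, and it sets the mode to classification because outcome is treated as a categorical target.

```r
library(tidymodels)

set.seed(9)

# Hypothetical stand-in for the course's `diabetes` data (values are synthetic)
n <- 200
diabetes <- tibble(
  outcome = factor(sample(c("yes", "no"), n, replace = TRUE, prob = c(0.35, 0.65))),
  bmi = rnorm(n, mean = 32, sd = 7),
  skin_thickness = rnorm(n, mean = 20, sd = 10)
)

# Create the balanced data split: 75% training, stratified by outcome
diabetes_split <- initial_split(diabetes, prop = 0.75, strata = outcome)

# Build the specification of the model
tree_spec <- decision_tree() %>%
  set_engine("rpart") %>%
  set_mode("classification")

# Train the model on the training portion only
model_trained <- tree_spec %>%
  fit(outcome ~ bmi + skin_thickness,
      data = training(diabetes_split))

model_trained
```

Passing strata = outcome asks initial_split() to sample within each outcome class, which is what keeps the class proportions similar in the training and test sets. The test portion, available through testing(diabetes_split), is left untouched for later evaluation.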