Pruning the tree with changed prior probabilities
In the video, you learned that pruning a tree is necessary to avoid overfitting. Some of the trees in the previous exercises were large. Now you will put what you have learned into practice and prune the previously constructed tree with the changed prior probabilities. The rpart package is already loaded in your workspace.
You will first set a seed to make sure the results are reproducible, as mentioned in the video, because you will be examining cross-validated error results. Cross-validation involves randomness, so the results could differ slightly if the function is run again with a different seed.
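As a minimal sketch of why the seed matters, the snippet below fits the same tree twice with the same seed and compares the cross-validated errors. It uses the kyphosis data that ships with rpart as a stand-in, since the course's credit data is not shown here; the seed value 567 and the helper name fit_xerror are arbitrary choices for illustration.

```r
library(rpart)

# Hypothetical helper: fit a tree after setting a seed and
# return the cross-validated error column of its CP table.
fit_xerror <- function(seed) {
  set.seed(seed)
  fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis,
               method = "class")
  fit$cptable[, "xerror"]
}

xerror_a <- fit_xerror(567)
xerror_b <- fit_xerror(567)

# Same seed, same cross-validation folds, same xerror values
same_seed_matches <- identical(xerror_a, xerror_b)
print(same_seed_matches)
```

With a different seed, the cross-validation folds are drawn differently, so the xerror column may change slightly.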
In this exercise, you will learn to identify which complexity parameter (CP) minimizes the cross-validated error, and then prune your tree based on this value.
This exercise is part of the course
Credit Risk Modeling in R
Exercise instructions
- tree_prior is loaded in your workspace.
- Use plotcp() to visualize the cross-validated error (X-val Relative Error) in relation to the complexity parameter for tree_prior.
- Use printcp() to print a table of information about CP, splits, and errors. See if you can identify which split has the minimum cross-validated error in tree_prior.
- Use which.min() to identify which row in tree_prior$cptable has the minimum cross-validated error "xerror". Assign this to index.
- Create tree_min by selecting the row index of tree_prior$cptable within the column "CP".
- Use the prune() function to obtain the pruned tree. Call the pruned tree ptree_prior.
- The rpart.plot package is loaded in your workspace. Plot the pruned tree using the function prp() (default settings).
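The steps above can be sketched on a different dataset so that the exercise itself stays open. This is only an illustration under stated assumptions: it uses the kyphosis data that ships with rpart, and the priors c(0.65, 0.35), the seed, and the names fit, cp_min and pruned are arbitrary stand-ins, not the course's tree_prior.

```r
library(rpart)

set.seed(345)  # arbitrary seed so the cross-validation is reproducible

# Stand-in tree with changed prior probabilities (illustrative values)
fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis,
             method = "class",
             parms = list(prior = c(0.65, 0.35)),
             control = rpart.control(cp = 0.001))

# Table of CP, number of splits, and errors
printcp(fit)

# Row of the CP table with the smallest cross-validated error
index <- which.min(fit$cptable[, "xerror"])

# Complexity parameter belonging to that row
cp_min <- fit$cptable[index, "CP"]

# Prune the tree back to that complexity parameter
pruned <- prune(fit, cp = cp_min)
```

Calling plotcp(fit) would show the same cross-validated errors graphically, which can make the minimum easier to spot than reading the table.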
Interactive exercise
Complete the sample code to finish this exercise successfully.
# tree_prior is loaded in your workspace
# Plot the cross-validated error rate as a function of the complexity parameter
# Use printcp() to identify for which complexity parameter the cross-validated error rate is minimized.
# Create an index for the row with the minimum xerror
index <- which.min(___$___[ , "xerror"])
# Create tree_min
tree_min <- tree_prior$cptable[index, "CP"]
# Prune the tree using tree_min
ptree_prior <- prune(___, cp = ___)
# Use prp() to plot the pruned tree