Predict and submit to Kaggle

To make a submission to Kaggle, you need to predict the survival outcome for every observation in the test set. In the previous chapter we made rather crude predictions with manual subsetting operations. Now that we have a decision tree, we can use the predict() function to generate our answer:

predict(my_tree_two, test, type = "class")

Here, my_tree_two is the tree model you've just built, test is the data set to build the predictions for, and type = "class" specifies that you want to classify observations.
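To see what such a call returns, here is a minimal sketch on made-up data. It assumes the tree was built with the rpart package, as the type = "class" argument suggests; the names toy_train, toy_test, toy_tree and toy_prediction, and all the values in them, are hypothetical stand-ins for the real Titanic data and my_tree_two.

```r
library(rpart)

# Hypothetical toy training data: 20 female survivors, 20 male non-survivors
toy_train <- data.frame(
  Survived = factor(rep(c(1, 0), each = 20)),
  Sex      = factor(rep(c("female", "male"), each = 20)),
  Age      = rep(seq(5, 62, by = 3), times = 2)
)

# Fit a classification tree (stand-in for my_tree_two)
toy_tree <- rpart(Survived ~ Sex + Age, data = toy_train, method = "class")

# Hypothetical toy test set: same predictor columns, no Survived column
toy_test <- data.frame(
  PassengerId = 892:895,
  Sex         = factor(c("male", "female", "female", "male"),
                       levels = c("female", "male")),
  Age         = c(34, 47, 62, 27)
)

# type = "class" asks for one predicted class label per row of toy_test
toy_prediction <- predict(toy_tree, newdata = toy_test, type = "class")

is.factor(toy_prediction)   # the labels come back as a factor
length(toy_prediction)      # one prediction per test observation
```

The point to take away is that the result has exactly one entry per row of the data set passed to predict(), which is what makes it usable as a submission column.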

Before you can submit to Kaggle, you'll have to write your predictions to a CSV file with exactly 418 entries and two columns: PassengerId and Survived. Head over to the instructions to get started!

Use predict() as specified above to make predictions on the test set. Assign the result to my_prediction.
Finish the data.frame() call to create the my_solution data frame that is in line with Kaggle's standards:
The PassengerId column should contain the PassengerId column of test.
The Survived column should contain the values in my_prediction.
Check that my_solution has 418 entries with nrow().
Finish the write.csv() call to write the data in my_solution to "my_solution.csv". Don't remove the row.names = FALSE argument.
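The steps above could be sketched roughly as follows. This is an illustration under stated assumptions, not the exercise's reference solution: the small test and my_prediction objects are hypothetical stand-ins (the real ones have 418 rows), and the file is written to a temporary directory via the hypothetical out_file variable rather than to the working directory.

```r
# Hypothetical stand-ins for the real test set and the predict() output
test <- data.frame(PassengerId = 892:895)
my_prediction <- factor(c(0, 1, 1, 0))

# Two columns, named as Kaggle expects
my_solution <- data.frame(PassengerId = test$PassengerId,
                          Survived    = my_prediction)

# Check the number of entries (418 with the real test set)
nrow(my_solution)

# Assumption: write to a temporary location to avoid overwriting anything
out_file <- file.path(tempdir(), "my_solution.csv")

# row.names = FALSE keeps R's row numbers out of the file
write.csv(my_solution, file = out_file, row.names = FALSE)

# Read the file back to confirm its shape
check <- read.csv(out_file)
names(check)
```

Without row.names = FALSE, write.csv() adds an extra unnamed column of row numbers, so the file would no longer have the two-column layout described above.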
