Predict and submit to Kaggle

To send a submission to Kaggle, you need to predict the survival outcome for each observation in the test set. In the previous chapter we created rather crude predictions with manual subsetting operations. Now that we have a decision tree, we can use the predict() function to generate our answer:

predict(my_tree_two, test, type = "class")

Here, my_tree_two is the tree model you've just built, test is the data set to build the predictions for, and type = "class" specifies that you want to classify observations, so each passenger gets a predicted class label.
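To see what such a predict() call returns, here is a minimal sketch on a stand-in data set. It assumes the rpart package and uses its bundled kyphosis data instead of the Titanic data; the names train_demo, test_demo, fit_demo and pred_demo are hypothetical and not part of the exercise workspace.

```r
library(rpart)

# Stand-in data: kyphosis ships with rpart (NOT the Titanic data)
set.seed(1)
idx <- sample(nrow(kyphosis), 60)
train_demo <- kyphosis[idx, ]   # hypothetical training split
test_demo  <- kyphosis[-idx, ]  # hypothetical test split

# Fit a classification tree, analogous to my_tree_two in the exercise
fit_demo <- rpart(Kyphosis ~ Age + Number + Start,
                  data = train_demo, method = "class")

# type = "class" returns one predicted label per row of newdata
pred_demo <- predict(fit_demo, newdata = test_demo, type = "class")

# One prediction per test observation, stored as a factor
length(pred_demo) == nrow(test_demo)
is.factor(pred_demo)
```

The result has one element per row of the data passed as newdata, which is why the Titanic test set should yield 418 predictions.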

Before you can submit to Kaggle, you'll have to convert your predictions to a CSV file with exactly 418 entries and 2 columns: PassengerId and Survived. Head over to the instructions to get to it!

This exercise is part of the course

Kaggle R Tutorial on Machine Learning

Exercise instructions

  • Use predict() as specified above to make predictions on the test set. Assign the result to my_prediction.
  • Finish the data.frame() call to create the my_solution data frame so that it matches Kaggle's required format:
      • The PassengerId column should contain the PassengerId column of test.
      • The Survived column should contain the values in my_prediction.
  • Check that my_solution has 418 entries with nrow().
  • Finish the write.csv() call to write the data in my_solution to "my_solution.csv". Don't remove the row.names = FALSE argument.
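The steps above can be tried out on toy values first. The sketch below is only an illustration: ids_demo, labels_demo, solution_demo and the temporary file paths are made-up names and values, not the exercise's objects. It also shows why row.names = FALSE matters for a two-column file.

```r
# Toy stand-ins for test$PassengerId and my_prediction (hypothetical values)
ids_demo    <- c(892L, 893L, 894L)
labels_demo <- factor(c(0, 1, 0), levels = c(0, 1))

# Two columns with the names Kaggle expects
solution_demo <- data.frame(PassengerId = ids_demo, Survived = labels_demo)

# Check the number of entries (418 for the real test set, 3 here)
nrow(solution_demo)

# Write without row names so the file has exactly two columns
out_file <- tempfile(fileext = ".csv")
write.csv(solution_demo, file = out_file, row.names = FALSE)
check_demo <- read.csv(out_file)
names(check_demo)

# With row names kept (the default), an extra leading column appears
out_file_rn <- tempfile(fileext = ".csv")
write.csv(solution_demo, file = out_file_rn)
check_rn <- read.csv(out_file_rn)
ncol(check_rn)
```

Reading the file back with read.csv() is a cheap way to confirm the column names and row count before uploading.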

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# my_tree_two and test are available in the workspace

# Make predictions on the test set
my_prediction <- predict(___, newdata = ___, type = ___)

# Finish the data.frame() call
my_solution <- data.frame(PassengerId = ___, Survived = ___)

# Use nrow() on my_solution


# Finish the write.csv() call
write.csv(___, file = ___, row.names = FALSE)