Kaggle R Tutorial on Machine Learning

Exercise

Making your first predictions

In a previous exercise you discovered that, in your training set, females had a better than 50% chance of surviving and males had less than a 50% chance. You can use this information for your first prediction: all females in the test set survive and all males in the test set die.
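As a reminder of where those proportions come from, here is a minimal sketch. It assumes a training data frame with Sex and Survived columns; the rows below are made-up toy values for illustration, not the real training data, and the name survival_by_sex is hypothetical.

```r
# Toy stand-in for the training set (hypothetical values, not the real data).
train <- data.frame(
  Sex = c("female", "female", "female", "male", "male", "male", "male"),
  Survived = c(1, 1, 0, 0, 0, 1, 0)
)

# Cross-tabulate Sex against Survived, then convert counts to row-wise
# proportions (margin = 1), so each row shows the share who died (0)
# and the share who survived (1) for that sex.
survival_by_sex <- prop.table(table(train$Sex, train$Survived), margin = 1)
print(survival_by_sex)
```

If the "1" column is above 0.5 for females and below 0.5 for males, the rule "females survive, males die" is the natural first guess.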

You use the test set to evaluate your predictions. You might have noticed that, unlike the training set, the test set has no Survived column. You add this column yourself and fill it with your predicted values. When you upload your results, Kaggle uses this column (your predictions) to score your performance.

We have already prepared a data frame, test_one, for you; it is a copy of the test data frame.

Instructions

  • Add a new column, Survived, and initialize it to zero.
  • Use vector subsetting, as in the previous exercise, to set Survived to 1 for observations whose Sex equals "female".
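
The two steps above can be sketched as follows. This is one possible approach, not necessarily the exercise's reference solution. In the exercise, test_one is already provided; the small data frame built here is a hypothetical stand-in so the snippet runs on its own.

```r
# Toy stand-in for test_one (hypothetical rows; in the exercise the
# real data frame is already prepared for you).
test_one <- data.frame(
  Sex = c("male", "female", "male", "female")
)

# Step 1: add the Survived column, initialized to 0 for every row
# (predict "did not survive" by default).
test_one$Survived <- 0

# Step 2: logical subsetting. The comparison yields TRUE for female rows,
# and only those rows are overwritten with 1.
test_one$Survived[test_one$Sex == "female"] <- 1

print(test_one)
```

Initializing the whole column first and then overwriting a subset keeps every row assigned, so no observation is left without a prediction.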