Exercise

First Prediction

In one of the previous exercises you discovered that, in your training set, females had a survival rate above 50% and males had a survival rate below 50%. You can use this information for your first prediction: all females in the test set survive and all males in the test set die.

You use your test set to evaluate your predictions. You might have seen that, unlike the training set, the test set has no Survived column. You add this column yourself and fill it with your predicted values. When you upload your results, Kaggle uses this column (your predictions) to score your performance.

Instructions
  • Create a variable test_one, identical to the test dataset.
  • Add an additional column, Survived, that you initialize to zero.
  • Use vector subsetting like in the previous exercise to set the value of Survived to 1 for observations whose Sex equals "female".
  • Print the Survived column of predictions from the test_one dataset.
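
The logic of these steps can be sketched in plain Python. This is only an illustration under stated assumptions, not the course's solution: the exercise works with a data frame, while this sketch stores a small table as a dict of column lists. The four rows and their PassengerId values are made up for the example.

```python
# Hypothetical stand-in for the test set: a dict of equal-length column lists.
# The rows below are invented for illustration; they are not real data.
test = {
    "PassengerId": [1, 2, 3, 4],
    "Sex": ["male", "female", "female", "male"],
}

# Copy each column so that changes to test_one leave test untouched.
test_one = {name: list(values) for name, values in test.items()}

# Add a Survived column initialized to zero for every row.
test_one["Survived"] = [0] * len(test_one["Sex"])

# Set Survived to 1 wherever Sex equals "female".
for i, sex in enumerate(test_one["Sex"]):
    if sex == "female":
        test_one["Survived"][i] = 1

# Print the Survived column of predictions.
print(test_one["Survived"])  # prints [0, 1, 1, 0]
```

Copying the columns first keeps the original test table free of a Survived column, so later prediction attempts can start again from the same unmodified data.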