Exercise

Rose vs Jack, or Female vs Male

How many people in your training set survived the Titanic disaster? To find out, you can use the table() command in combination with the $ operator, which selects a single column of a data frame:

# absolute numbers
table(train$Survived) 

# proportions
prop.table(table(train$Survived))

If you run these commands in the console, you'll see that 549 individuals died (62%) and 342 survived (38%). A simple prediction heuristic could thus be "majority wins": predict that every unseen passenger did not survive.
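
The "majority wins" idea can be sketched in a few lines of R. This is only an illustration: toy_survived is a hypothetical stand-in for train$Survived, not the real data, and the last lines simply redo the arithmetic on the counts quoted above.

```r
# Hypothetical toy vector standing in for train$Survived (0 = died, 1 = survived)
toy_survived <- c(0, 0, 1, 0, 1, 0, 0, 1)

# Count each outcome and pick the most frequent one
counts <- table(toy_survived)
majority_class <- as.integer(names(counts)[which.max(counts)])

# "Majority wins": predict the majority class for three unseen observations
predictions <- rep(majority_class, 3)
print(predictions)

# Share of the toy data that this baseline gets right
baseline_accuracy <- max(prop.table(counts))
print(baseline_accuracy)

# The same arithmetic on the counts quoted in the text
share_died <- 549 / (549 + 342)
print(round(share_died, 2))
```

Any model you build later should beat this baseline share to be worth using.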

In general, the table() command can help you explore which variables have predictive value. For example, could gender play a role as well? For a two-way comparison that also includes gender, you can use:

table(train$Sex, train$Survived)

To get proportions, you can again wrap prop.table() around table(), but you have to specify whether you want row-wise or column-wise proportions. You do this with the second argument of prop.table(), called margin: set it to 1 for row-wise proportions or to 2 for column-wise proportions.
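
To see what the margin argument changes, here is a minimal sketch on a small made-up data frame. The toy data frame and its values are assumptions for illustration only; the column names mirror train, but the numbers are not the real Titanic counts.

```r
# Hypothetical toy data standing in for `train` (not the real Titanic data)
toy <- data.frame(
  Sex = c("female", "female", "female", "male", "male", "male", "male", "male"),
  Survived = c(1, 1, 0, 0, 0, 0, 1, 0)
)

# Two-way table of absolute counts: rows are Sex, columns are Survived
counts <- table(toy$Sex, toy$Survived)

# margin = 1: proportions within each row, so every row (Sex) sums to 1
row_props <- prop.table(counts, margin = 1)

# margin = 2: proportions within each column, so every column (Survived) sums to 1
col_props <- prop.table(counts, margin = 2)

print(row_props)
print(col_props)
```

Row-wise proportions answer "what share of each gender survived?", while column-wise proportions answer "what share of survivors belonged to each gender?".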

Instructions
  • Call table() on train$Survived to calculate the survival rates in absolute numbers.
  • Calculate the survival rates as proportions by wrapping prop.table() around the previous table() call.
  • Do a two-way comparison of survival by gender, in absolute numbers, by tabulating train$Sex against train$Survived. Again, use the train data frame.
  • Convert the numbers to row-wise proportions.