Exercise

Train-test split

In this chapter, you will keep working with the ANSUR dataset. Before you can build a model on your dataset, you should first decide which feature you want to predict. In this case, you're trying to predict gender.

You need to extract the column holding this feature from the dataset and then split the data into a training and test set. The training set will be used to train the model and the test set will be used to check its performance on unseen data.

ansur_df has been pre-loaded for you.

Instructions

  • Import the train_test_split function from sklearn.model_selection.
  • Assign the 'Gender' column to y.
  • Remove the 'Gender' column from the DataFrame and assign the result to X.
  • Set the test size to 30% to get a 70% training and 30% test split.
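
The steps above can be sketched as follows. This is a minimal illustration, not the exercise's reference solution: the small ansur_df built here is a hypothetical stand-in for the pre-loaded DataFrame, its measurement columns and values are made up, and random_state is added only to make the split repeatable.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical stand-in for the pre-loaded ansur_df (values are made up)
ansur_df = pd.DataFrame({
    "Gender": ["Male", "Female"] * 5,
    "stature_m": [1.78, 1.62, 1.81, 1.65, 1.75, 1.60, 1.83, 1.68, 1.72, 1.58],
    "weight_kg": [81.0, 60.5, 85.2, 63.1, 78.4, 58.9, 90.3, 66.0, 76.8, 55.7],
})

# Select the target column as y
y = ansur_df["Gender"]

# Drop the target column to get the feature matrix X
X = ansur_df.drop("Gender", axis=1)

# Hold out 30% of the rows for testing; the remaining 70% are used for training
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0  # random_state is an assumption, for repeatability
)

print(f"{X_test.shape[0]} rows in test set vs. {X_train.shape[0]} in training set.")
```

Note that drop returns a new DataFrame, so ansur_df itself still contains the 'Gender' column after these steps.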