Create one holdout set
Your boss has asked you to create a simple random forest model on the tic_tac_toe dataset. She doesn't want you to spend much time selecting parameters; rather she wants to know how well the model will perform on future data. For future Tic-Tac-Toe games, it would be nice to know if your model can predict which player will win.
The dataset tic_tac_toe has been loaded for your use.
Note that in Python, a backslash at the end of a line (as in =\) indicates that the code was too long for one line and has been split across two lines.
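As a small illustration of the line-continuation note above, the sketch below splits one assignment across two lines. The variable names and values are made up for the example and are not part of the exercise.

```python
# A trailing backslash tells Python the statement continues on the next line.
first_half, second_half = \
    [1, 2, 3], [4]

# Inside parentheses or brackets, a line can be split without a backslash.
total = (1 +
         2)

print(first_half, second_half, total)
```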
This exercise is part of the course Model Validation in Python
Exercise instructions
- Create the X dataset by creating dummy variables for all of the categorical columns.
- Split X and y into train (X_train, y_train) and test (X_test, y_test) datasets.
- Split the datasets using 10% for testing.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Create dummy variables using pandas
X = ____.____(tic_tac_toe.iloc[:,0:9])
y = tic_tac_toe.iloc[:, 9]
# Create training and testing datasets. Use 10% for the test set
____, ____, ____, ____ = ____(X, y, ____=____, random_state=1111)
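
The same steps can be tried on a small stand-in table. This is only a sketch under stated assumptions: the real tic_tac_toe data is not reproduced here, so the toy DataFrame, its column names (square_0 to square_8, outcome), and its randomly generated values are hypothetical. It uses pandas get_dummies and scikit-learn's train_test_split, which is one common way to do what the instructions describe.

```python
import random

import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical stand-in for tic_tac_toe: nine categorical board columns
# plus one outcome column, filled with random values for illustration only.
random.seed(0)
board_cols = [f"square_{i}" for i in range(9)]
rows = [
    [random.choice("xob") for _ in board_cols]
    + [random.choice(["positive", "negative"])]
    for _ in range(50)
]
toy = pd.DataFrame(rows, columns=board_cols + ["outcome"])

# One-hot encode the nine categorical columns; keep the last column as the target.
X = pd.get_dummies(toy.iloc[:, 0:9])
y = toy.iloc[:, 9]

# Hold out 10% of the rows for testing; random_state makes the split repeatable.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.1, random_state=1111
)

print(X_train.shape, X_test.shape)
```

With 50 rows, the 10% holdout contains 5 rows and the remaining 45 are used for training. The rows of X_train and y_train stay aligned because both are split with the same shuffled indices.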