Split data into training and testing sets
This is the final step before we move on to building the regression model! Here, you will identify the names of the target variable and the feature columns, extract the data, and split it into training and testing sets.
The pandas and numpy libraries have been loaded as pd and np respectively. The input features are imported as the features dataset, and the target variable you built in the previous exercise has been imported for you as Y.
This exercise is part of the course Machine Learning for Marketing in Python
Exercise instructions
- Store the customer identifier column name as a list.
- Select the feature column names excluding the customer identifier.
- Extract the features as X.
- Split the data into training and testing sets by using the train_test_split() function.
Hands-on interactive exercise
Try this exercise by completing this sample code.
# Store customer identifier column name as a list
custid = ['___']
# Select feature column names excluding customer identifier
cols = [col for col in features.___ if col not in ___]
# Extract the features as `X`
X = features[___]
# Split data to training and testing
___, test_X, train_Y, ___ = ___(X, Y, test_size=0.25, random_state=99)
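
For reference, here is one way the same steps might look outside the exercise environment. This is only a sketch on made-up data: the identifier column name CustomerID, the feature columns recency and frequency, and all values are assumptions for illustration and are not taken from the course dataset. It also assumes train_test_split() is the scikit-learn function from sklearn.model_selection.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical stand-in for the course's `features` dataset: 8 customers,
# an identifier column (name assumed) and two made-up feature columns.
features = pd.DataFrame({
    "CustomerID": np.arange(1, 9),
    "recency": [5, 20, 3, 40, 12, 7, 30, 1],
    "frequency": [10, 2, 15, 1, 6, 9, 3, 20],
})

# Hypothetical target values, one per customer (same row order as `features`).
Y = np.array([3, 0, 5, 0, 2, 4, 1, 6])

# Store the (assumed) customer identifier column name as a list
custid = ["CustomerID"]

# Keep every column name that is not the customer identifier
cols = [col for col in features.columns if col not in custid]

# Extract the features as `X`
X = features[cols]

# Hold out 25% of the rows for testing; random_state makes the split repeatable
train_X, test_X, train_Y, test_Y = train_test_split(
    X, Y, test_size=0.25, random_state=99
)

print(train_X.shape, test_X.shape)  # (6, 2) (2, 2)
```

With 8 rows and test_size=0.25, 2 rows go to the test set and 6 to the training set. The identifier is left out of X because it labels a customer rather than describing their behavior.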