Split data into training and testing sets

Final step before we move to building the regression model! Here, you will identify the names of the target variable and the feature columns, extract the data, and split it into training and testing sets.

The pandas and numpy libraries have been loaded as pd and np respectively. The input features have been imported as the features dataset, and the target variable you built in the previous exercise has been imported for you as Y.

This exercise is part of the course

Machine Learning for Marketing in Python


Exercise instructions

Store the customer identifier column name as a list.
Select the feature column names excluding the customer identifier.
Extract the features as X.
Split the data into training and testing sets using the train_test_split() function.

Hands-on interactive exercise

Try this exercise by completing this sample code.

# Store customer identifier column name as a list
custid = ['___']

# Select feature column names excluding customer identifier
cols = [col for col in features.___ if col not in ___]

# Extract the features as `X`
X = features[___]

# Split data to training and testing
___, test_X, train_Y, ___ = ___(X, Y, test_size=0.25, random_state=99)
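
To see how these four steps fit together, here is a minimal sketch run on a small made-up dataset. The identifier column name CustomerID, the feature columns recency and frequency, and all of the values are assumptions for illustration only; they are not taken from the exercise data. The sketch also assumes train_test_split() is imported from sklearn.model_selection.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical stand-in for the exercise's `features` dataset (8 customers)
features = pd.DataFrame({
    "CustomerID": np.arange(1, 9),             # assumed identifier column name
    "recency": [5, 30, 12, 45, 3, 60, 21, 9],  # made-up feature
    "frequency": [10, 2, 6, 1, 12, 1, 4, 8],   # made-up feature
})
# Hypothetical stand-in for the target `Y`, one value per customer
Y = np.array([3, 0, 2, 0, 4, 0, 1, 2])

# Store the identifier column name as a list
custid = ["CustomerID"]

# Keep every column name that is not the identifier
cols = [col for col in features.columns if col not in custid]

# Extract the feature columns as X
X = features[cols]

# Hold out 25% of the rows for testing; random_state makes the split repeatable
train_X, test_X, train_Y, test_Y = train_test_split(
    X, Y, test_size=0.25, random_state=99
)

print(train_X.shape, test_X.shape)
```

With 8 rows and test_size=0.25, 2 rows go to the test set and 6 to the training set. The identifier is excluded because it labels a customer rather than describing their behavior, so it should not be used as a model input.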
