Split data into training and testing sets
You are now ready to build an end-to-end machine learning model by following a few simple steps! You will explore modeling nuances in much more detail in the next chapters, but for now you will practice and understand the key steps.
The independent features have been loaded for you as a pandas DataFrame named X, and the dependent values as a pandas Series named Y. Also, the train_test_split function has been loaded from the sklearn library. You will now create training and testing datasets, and then make sure the data was correctly split.
This exercise is part of the course
Machine Learning for Marketing in Python
Exercise instructions
- Split X and Y into train and test sets, with 25% of the data held out for testing.
- Ensure that the training dataset has only 75% of the original data.
- Ensure that the testing dataset has only 25% of the original data.
Hands-on interactive exercise
Try this exercise by completing this sample code.
# Split X and Y into training and testing datasets
train_X, test_X, train_Y, test_Y = ___(___, ___, test_size=0.___)
# Ensure training dataset has only 75% of original X data
print(___.shape[0] / X.shape[0])
# Ensure testing dataset has only 25% of original X data
print(___.shape[0] / ___.shape[0])
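
For reference, here is a minimal, self-contained sketch of the same split-and-check pattern. It does not use the course data: X and Y below are hypothetical stand-ins built from random numbers, the column names are made up for illustration, and random_state is an added assumption that only makes the shuffle reproducible. Treat it as one possible way to fill in the steps, not as the official solution.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical stand-in data (NOT the course dataset): 100 rows, 3 features
rng = np.random.default_rng(0)
X = pd.DataFrame(rng.normal(size=(100, 3)), columns=["f1", "f2", "f3"])
Y = pd.Series(rng.integers(0, 2, size=100), name="target")

# Hold out 25% of the rows for testing.
# random_state is an assumption added here for reproducibility.
train_X, test_X, train_Y, test_Y = train_test_split(
    X, Y, test_size=0.25, random_state=42
)

# Share of the original rows that ended up in each set
train_share = train_X.shape[0] / X.shape[0]
test_share = test_X.shape[0] / X.shape[0]

print(train_share)  # 0.75
print(test_share)   # 0.25
```

Dividing each subset's row count by X.shape[0] gives its share of the original data. With 100 rows the shares are exactly 0.75 and 0.25; with row counts that are not divisible by four, expect values close to, but not exactly, those fractions.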