
Split data into training and testing sets

You are now ready to build an end-to-end machine learning model by following a few simple steps! You will explore modeling nuances in much more detail in the next chapters, but for now you will practice and understand the key steps.

The independent features have been loaded for you as a pandas DataFrame named X, and the dependent values as a pandas Series named Y.

The train_test_split function has also been loaded from the sklearn library. You will now create training and testing datasets, then verify that the data was split correctly.

This exercise is part of the course

Machine Learning for Marketing in Python


Exercise instructions

  • Split X and Y into training and testing sets, with 25% of the data assigned to testing.
  • Verify that the training dataset contains 75% of the original data.
  • Verify that the testing dataset contains 25% of the original data.
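
The steps above can be sketched on stand-in data. This is a minimal illustration, not the course's solution: the 100-row X and Y below are synthetic placeholders for the preloaded objects, the column names are made up, and random_state is added only to make the split reproducible.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical stand-in data (NOT the course dataset): 100 rows, 3 features
rng = np.random.default_rng(0)
X = pd.DataFrame(rng.normal(size=(100, 3)), columns=["f1", "f2", "f3"])
Y = pd.Series(rng.integers(0, 2, size=100), name="target")

# Hold out 25% of the rows for testing; random_state is an assumption
# added here so that repeated runs give the same split
train_X, test_X, train_Y, test_Y = train_test_split(
    X, Y, test_size=0.25, random_state=0
)

# Share of the original rows that ended up in each set
train_share = train_X.shape[0] / X.shape[0]
test_share = test_X.shape[0] / X.shape[0]
print(train_share)
print(test_share)
```

Note the order of the returned values: both pieces of X come first, followed by both pieces of Y. When the number of rows is not divisible by four, the shares will be close to, but not exactly, 0.75 and 0.25.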

Hands-on interactive exercise

Try this exercise by completing the sample code.

# Split X and Y into training and testing datasets
train_X, test_X, train_Y, test_Y = ___(___, ___, test_size=0.___)

# Ensure training dataset has only 75% of original X data
print(___.shape[0] / X.shape[0])

# Ensure testing dataset has only 25% of original X data
print(___.shape[0] / ___.shape[0])