Parallel Programming with Dask in Python

Exercise

Lazy train-test split

You have transformed the X variables. Now finish your data prep by rescaling the target variable y and splitting your data into train and test sets.

The variables X and y, which you created in the last exercise, are available in your environment.

Instructions

  • Import the train_test_split() function from dask_ml.model_selection.
  • The popularity scores in y range from 0 to 100. Divide them by 100 so they fall in the range 0-1.
  • Split the data into train and test sets using the train_test_split() function. Make sure to shuffle the data, and set the test fraction to 20% of the data.