Split the data
Now that you've done all your manipulations, the last step before modeling is to split the data!
This exercise is part of the course Foundations of PySpark.
Exercise instructions
- Use the DataFrame method .randomSplit() to split piped_data into two pieces, training with 60% of the data, and test with 40% of the data, by passing the list [.6, .4] to the .randomSplit() method.
Hands-on interactive exercise
Complete the sample code below to finish this exercise.
# Split the data into training and test sets
training, test = piped_data.randomSplit(____)
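
To build intuition for what a weighted random split does, here is a minimal plain-Python sketch of the idea. It is not Spark's implementation: the helper random_split and the toy rows list are hypothetical names introduced only for illustration. Each row gets one uniform random draw and lands in the split whose cumulative weight range contains that draw, so the resulting sizes are only approximately 60% and 40%.

```python
import random


def random_split(rows, weights, seed=None):
    """Assign each row to exactly one split, with probability proportional to its weight.

    Hypothetical helper for illustration; not part of PySpark.
    """
    total = sum(weights)
    # Normalize the weights and turn them into cumulative upper bounds in (0, 1].
    bounds = []
    running = 0.0
    for w in weights:
        running += w / total
        bounds.append(running)
    bounds[-1] = 1.0  # guard against floating-point drift in the last bound

    rng = random.Random(seed)
    splits = [[] for _ in weights]
    for row in rows:
        draw = rng.random()  # uniform draw in [0, 1)
        for i, upper in enumerate(bounds):
            if draw < upper:
                splits[i].append(row)
                break
    return splits


rows = list(range(1000))  # toy stand-in for the rows of a DataFrame
training, test = random_split(rows, [.6, .4], seed=42)
print(len(training), len(test))
```

Every row ends up in exactly one of the two lists, and fixing the seed makes the split repeatable. The real .randomSplit() method likewise accepts an optional seed argument and normalizes the weights if they do not sum to 1.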