Foundations of PySpark


Exercise

Split the data

Now that you've done all your manipulations, the last step before modeling is to split the data!

Instructions

  • Use the DataFrame method .randomSplit() to split piped_data into two pieces: training with 60% of the data and test with 40% of the data. Do this by passing the list [.6, .4] to .randomSplit().