Session Ready
Exercise

Testing Kaggle forum ideas

Unfortunately, not all the Forum posts and Kernels are necessarily useful for your model. So instead of blindly incorporating ideas into your pipeline, you should test them first.

You're given a function get_cv_score(), which takes a train dataset as an argument and returns the overall validation root mean squared error over 3-fold cross-validation. The train DataFrame is already available in your workspace.

You should try different suggestions from the Kaggle Forum and check whether they improve your validation score.

Instructions 1/2
undefined XP
  • 1
    • Suggestion 1: the passenger_count feature is useless. Let's see! Drop this feature and compare the scores.
    • 2
      • This first suggestion worked. Suggestion 2: Sum of pickup_latitude and distance_km is a good feature. Let's try it!