Cross-validation without shuffling
Now, re-run your model fit using block cross-validation (without shuffling all data points). In this case, neighboring time points are kept close to one another. How do you think the model predictions will look in each cross-validation loop?
An instance of the linear regression model object is available in your workspace as model, and the training arrays X and y are available as well.
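Before running it, it helps to see what KFold actually does when shuffle=False: it simply slices the index range into contiguous blocks, so each test fold is an unbroken segment of the time series. Here is a minimal sketch on a toy array (the names toy_X and cv_demo are illustrative, not part of the exercise):

# Illustrative: KFold without shuffling produces contiguous test blocks
import numpy as np
from sklearn.model_selection import KFold

toy_X = np.arange(20).reshape(-1, 1)  # 20 samples standing in for time points
cv_demo = KFold(n_splits=4, shuffle=False)
for tr, tt in cv_demo.split(toy_X):
    # Each test fold is a contiguous run of neighboring indices
    print("test indices:", tt)

This prints [0 1 2 3 4], then [5 6 7 8 9], and so on: each model in the loop is tested on one unbroken segment of the series.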
Exercise instructions
- Instantiate another cross-validation object, this time using KFold cross-validation with 10 splits and no shuffling.
- Iterate through this object to fit a model using the training indices and generate predictions using the test indices.
- Visualize the predictions across CV splits using the helper function visualize_predictions() we've provided.
Hands-on interactive exercise
Have a go at this exercise by working through this sample code.
# Create KFold cross-validation object (no shuffling keeps each fold contiguous)
from sklearn.model_selection import KFold
cv = KFold(n_splits=10, shuffle=False)

# Iterate through CV splits
results = []
for tr, tt in cv.split(X, y):
    # Fit the model on the training indices
    model.fit(X[tr], y[tr])
    # Generate predictions on the test indices and collect them
    prediction = model.predict(X[tt])
    results.append((prediction, tt))

# Custom function to quickly visualize predictions
visualize_predictions(results)
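visualize_predictions() is a course-provided helper, so its implementation isn't shown in the exercise. If you want to reproduce it locally, a minimal sketch along these lines should behave similarly (the plotting details are assumptions; only the function name and the results format come from the exercise):

import matplotlib.pyplot as plt

def visualize_predictions(results):
    """Plot each CV split's predictions at the time indices it was tested on."""
    fig, ax = plt.subplots()
    for prediction, tt in results:
        # One trace per split; x-positions are the test indices for that fold
        ax.plot(tt, prediction)
    ax.set(xlabel="Time (test index)", ylabel="Predicted value")
    plt.show()

With shuffle=False, each trace covers one contiguous block of the series, which makes it easy to see how prediction quality changes over time.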