Visualizing model score variability over time
Now that you've assessed the variability of each coefficient, let's do the same for the performance (scores) of the model. Recall that the TimeSeriesSplit
object will use successively-later indices for each test set. This means that you can treat the scores of your validation as a time series. You can visualize this over time in order to see how the model's performance changes over time.
An instance of the Linear regression model object is stored in model
, a cross-validation object in cv
, and data in X
and y
.
This exercise is part of the course
Machine Learning for Time Series Data in Python
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
from sklearn.model_selection import cross_val_score
# Generate scores for each split to see how the model performs over time
scores = cross_val_score(model, X, y, cv=cv, scoring=my_pearsonr)
# Convert to a Pandas Series object
scores_series = pd.Series(scores, index=times_scores, name='score')
# Bootstrap a rolling confidence interval for the mean score
scores_lo = scores_series.____(20).aggregate(partial(____, percentiles=2.5))
scores_hi = scores_series.____(20).aggregate(partial(____, percentiles=97.5))