1. Working with a forecast object
Now let's go through the forecasting workflow!
2. Work with a forecast object
We will perform data preparation,
train multiple forecasting models
and evaluate their performance.
To do this, we will use the statsforecast library. Then, in the exercises, you will work with mlforecast, which follows the same workflow.
3. Training approach
We will use a simple train-and-test approach, leaving the last 72 hours as a testing partition and training the models with the rest of the data. In the next chapter, we will explore a more robust training approach.
4. Required libraries
Let's get started by loading the required libraries. We will use pandas and datetime to load and process the time series data.
5. Required libraries
We will import the StatsForecast class, the DynamicOptimizedTheta, SeasonalNaive, AutoARIMA, HoltWinters, and MSTL models, and the plot_series function.
6. The statsforecast data format
As a reminder, to use the statsforecast library, the data must have the following three columns: unique_id, representing the series id,
ds - the series timestamp,
and y, the series values.
Let's modify the data to follow this format.
7. Data preparation - data load
We load our data and reformat it to match the statsforecast requirements. In addition, we set an environment variable that specifies that the unique id should be returned as a column.
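A minimal sketch of this step. The raw column names (`timestamp`, `demand`) and the tiny synthetic frame stand in for the course's actual dataset, and the sketch assumes the environment variable in question is `NIXTLA_ID_AS_COL`, which tells statsforecast to return the unique id as a regular column rather than as the index:

```python
import os
import pandas as pd

# Ask statsforecast to return unique_id as a regular column
# (rather than as the DataFrame index).
os.environ["NIXTLA_ID_AS_COL"] = "1"

# Small synthetic hourly series standing in for the real data.
raw = pd.DataFrame({
    "timestamp": pd.date_range("2024-01-01", periods=5, freq="h"),
    "demand": [10.0, 12.0, 11.0, 13.0, 12.5],
})

# Rename the columns and add a series identifier to match the
# unique_id / ds / y format that statsforecast expects.
df = raw.rename(columns={"timestamp": "ds", "demand": "y"})
df["unique_id"] = "series_1"
df = df[["unique_id", "ds", "y"]]
```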
8. Data preparation - train/test split
Next, we will split the series into training and testing partitions, using datetime's timedelta to subset by period and leaving the last 72 hours as testing.
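The split can be sketched as follows, here on a synthetic 10-day hourly series (the real data and cutoff come from the lesson's dataset):

```python
from datetime import timedelta

import pandas as pd

# Synthetic hourly series: 10 days of data.
df = pd.DataFrame({
    "unique_id": "series_1",
    "ds": pd.date_range("2024-01-01", periods=24 * 10, freq="h"),
})
df["y"] = range(len(df))

# Keep the last 72 hours as the testing partition.
cutoff = df["ds"].max() - timedelta(hours=72)
train = df[df["ds"] <= cutoff]
test = df[df["ds"] > cutoff]
```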
9. Data preparation
Before moving on to modeling, let's use the plot_series function to plot the training and testing partitions.
As we saw in the previous lessons, the series has strong seasonality patterns.
10. Forecasting with StatsForecast
Let's start building our models. We will use the following five forecasting models:
the AutoARIMA, Seasonal Naive, and Theta models, each with an hourly seasonality component,
and two multi-seasonal trend (MSTL) models that handle both the hourly and the day-of-week seasonality, each using a different trend forecasting method.
11. Forecasting with StatsForecast
Next, we store those models in a list and use the StatsForecast class to define the model object.
We use the freq argument to set the series frequency to hourly. The fallback_model argument specifies which model to use in case a model fails. Lastly, we set the n_jobs argument to -1 to use all available cores during model execution.
12. Forecasting with StatsForecast
To create the forecast, we call the object's forecast method. We use the training partition as input, set the horizon to 72 hours, and set the prediction interval level to 95%.
13. Forecasting with StatsForecast
Let's use the plot_series function again to plot the forecast output alongside the testing partition.
14. Model evaluation
The last step is to evaluate the model performance. We will use these functions to calculate MAPE, RMSE and the prediction interval coverage.
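The three metrics are short enough to sketch directly. These are hand-rolled illustrations; the lesson's own helper functions (and the ready-made versions in utilsforecast) may differ in detail:

```python
import numpy as np

def mape(y_true, y_pred):
    """Mean absolute percentage error."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return np.mean(np.abs((y_true - y_pred) / y_true))

def rmse(y_true, y_pred):
    """Root mean squared error."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return np.sqrt(np.mean((y_true - y_pred) ** 2))

def coverage(y_true, lo, hi):
    """Share of actual values falling inside the prediction interval."""
    y_true = np.asarray(y_true, float)
    return np.mean((y_true >= np.asarray(lo)) & (y_true <= np.asarray(hi)))
```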
15. Model evaluation
We loop over the forecast object, calculate metrics for the testing partition, and append them to a pandas DataFrame.
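A rough sketch of such a loop, using a hand-made testing partition and forecast frame; the column names mimic statsforecast's `model`, `model-lo-95`, `model-hi-95` convention:

```python
import numpy as np
import pandas as pd

# Hypothetical testing partition and forecast output.
test = pd.DataFrame({"ds": range(3), "y": [10.0, 12.0, 11.0]})
forecast = pd.DataFrame({
    "ds": range(3),
    "SeasonalNaive": [10.5, 11.5, 11.0],
    "SeasonalNaive-lo-95": [9.0, 10.0, 9.5],
    "SeasonalNaive-hi-95": [12.0, 13.0, 12.5],
})

rows = []
models = ["SeasonalNaive"]  # in the lesson, one entry per fitted model
for model in models:
    err = test["y"].to_numpy() - forecast[model].to_numpy()
    rows.append({
        "model": model,
        "mape": np.mean(np.abs(err / test["y"].to_numpy())),
        "rmse": np.sqrt(np.mean(err ** 2)),
        "coverage": np.mean(
            (test["y"] >= forecast[f"{model}-lo-95"])
            & (test["y"] <= forecast[f"{model}-hi-95"])
        ),
    })

# Collect the metrics and sort by RMSE, best model first.
results = pd.DataFrame(rows).sort_values("rmse")
```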
16. Model evaluation
Sorting the performance table by RMSE and printing the results, we see that the Seasonal Naive model achieved the lowest RMSE and MAPE scores on the testing partition.
17. Model evaluation
When examining the models' performance, the first question we should ask is whether there is room for improvement. Could using different settings yield a more accurate forecast?
Does the performance on the testing set reflect the future performance of the models, or is it just a one-time fit? This is where experiments come in handy, and they are the focus of the next chapter.
18. Let's practice!
Now it's your turn to build and evaluate forecasting models!