Experimentation
In the previous chapter, we reviewed the general architecture of a forecasting pipeline. In this chapter, we will focus on the experimentation component.

Experimentation in a nutshell
Experimentation in data science is the process of testing and evaluating hypotheses. This process typically starts with a problem statement, which is followed by setting up the hypotheses we want to explore. We then conduct tests to evaluate each hypothesis. Based on the results, we may set up new hypotheses and repeat the process until we reach final conclusions.

Experimentation in forecasting
Experimentation in predictive analysis, particularly in forecasting, is the process of training statistical and machine learning models at scale. The goal is to identify the best modeling approach for the problem we are trying to solve. This type of process includes the following components:
- Data: a time series object
- An initial hypothesis about the type of models we want to use
- A training framework to train and test the models
- Predefined performance KPIs to score and evaluate the models' performance on the testing partitions
- A model registration step that enables us to compare and track the models' performance at scale
- Last but not least, a model selection method that selects the best forecasting model using predefined error criteria such as MAPE or RMSE
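The scoring and selection components lean on error metrics such as MAPE and RMSE. Here is a minimal, framework-free sketch of how these two scores are computed on a testing partition (MAPE assumes the actual series contains no zeros):

```python
import math

def mape(actual, predicted):
    """Mean Absolute Percentage Error, reported as a percentage."""
    return 100 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root Mean Squared Error, in the units of the series."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

# Score a three-point testing partition.
actual = [100, 110, 120]
predicted = [90, 115, 130]
print(mape(actual, predicted))
print(rmse(actual, predicted))
```

Because MAPE is scale-free and RMSE is in the units of the series, the two metrics can rank the same set of models differently, which is why both appear in the selection step.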
Previously, we conducted a simple experiment to identify the best forecasting model using the MAPE and RMSE error metrics. We should ask ourselves whether there is room for additional improvement in the models' performance.
For example, using AutoARIMA out of the box did not yield great results compared to the other models. Can changing the model's tuning parameters achieve better performance?
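One way to test that kind of hypothesis at scale is to score the model under a grid of tuning-parameter settings and keep the best one. A minimal sketch of the idea, where `fit_and_score` is a hypothetical stand-in for the real train-and-evaluate step (its score formula is fake and exists only to make the example runnable):

```python
from itertools import product

def fit_and_score(params, train, test):
    # Hypothetical stand-in for training a model with `params` on `train`
    # and returning its error (e.g. MAPE) on `test`. The formula below is
    # made up; it only exists so the sketch runs end to end.
    return abs(params["seasonal_period"] - 12) + params["max_order"] * 0.1

# Illustrative parameter names, not tied to any specific library.
grid = {"seasonal_period": [7, 12, 52], "max_order": [1, 2, 3]}

best_params, best_score = None, float("inf")
for combo in product(*grid.values()):
    params = dict(zip(grid.keys(), combo))
    score = fit_and_score(params, train=None, test=None)
    if score < best_score:
        best_params, best_score = params, score

print(best_params, best_score)
```

The same loop works for any model: only the parameter grid and the train-and-evaluate step change.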
Likewise, is there room for further improvement with the multi-seasonal models? For example, can we improve model performance by using a different modeling approach for the trend component? These are the types of hypotheses we want to explore in the experimentation process.

Workflow
In the life cycle of model development and deployment, experimentation is a pre-deployment step. Once we select a model and deploy it, we monitor its performance. In the case of performance drift, we return the model to the experimentation step to re-tune it. We will discuss deployment and monitoring in more detail in the next chapters.
Here is a simple experimentation architecture that we will use to train and identify the best forecasting model. This architecture includes the following components:
- A data ingestion and transformation process
- A backtesting framework to train and test the models
- Scoring functionality
- Logging of model parameters and results
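The backtesting component evaluates each model on several train/test splits of the series rather than a single one. A minimal sketch of expanding-window splits over the index of a series, assuming the training window grows by one forecast horizon per round:

```python
def expanding_window_splits(n_obs, min_train, horizon):
    """Yield (train_indices, test_indices) pairs where the training
    window grows by `horizon` observations each round and the test
    window is always the next `horizon` observations."""
    splits = []
    train_end = min_train
    while train_end + horizon <= n_obs:
        train_idx = list(range(0, train_end))
        test_idx = list(range(train_end, train_end + horizon))
        splits.append((train_idx, test_idx))
        train_end += horizon
    return splits

# A 10-point series, at least 6 training points, forecasting 2 steps ahead.
for train_idx, test_idx in expanding_window_splits(10, 6, 2):
    print(len(train_idx), "training points ->", test_idx)
```

Averaging the error metrics over all splits gives a more stable estimate of each model's performance than a single train/test cut.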
It is recommended to use a JSON file or a similar format to set the experiment parameters.
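For instance, a small JSON file can hold the experiment settings that the training script reads at startup. The parameter names below are illustrative, not part of any framework:

```python
import json

# Hypothetical experiment settings; the names are for illustration only.
config = {
    "models": ["AutoARIMA", "HoltWinters", "MSTL"],
    "backtesting": {"min_train": 104, "horizon": 24, "n_windows": 5},
    "selection_metric": "mape",
}

# Write the file once; every experiment run then reads the same settings.
with open("experiment.json", "w") as f:
    json.dump(config, f, indent=2)

with open("experiment.json") as f:
    loaded = json.load(f)

print(loaded["models"], loaded["selection_metric"])
```

Keeping the settings out of the code makes a re-run with different models or horizons a one-line file edit rather than a code change.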
Like before, we will use the Pandas and Nixtla frameworks to process the data and train the models, and the MLflow framework to log and track the results. Next, we will dive into the backtesting method.
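As a framework-free sketch of the logging and selection steps (in the actual pipeline, MLflow's `log_params` and `log_metric` functions play the logging role), each run records its parameters and scores, and selection picks the run with the lowest error. The scores below are made up for illustration:

```python
# Stand-in run registry: a list of records, one per trained model.
runs = []

def log_run(model_name, params, metrics):
    """Record one run's parameters and error metrics."""
    runs.append({"model": model_name, "params": params, "metrics": metrics})

# Fabricated scores, purely illustrative.
log_run("AutoARIMA", {"seasonal_period": 12}, {"mape": 9.8, "rmse": 41.2})
log_run("HoltWinters", {"seasonal_period": 12}, {"mape": 7.1, "rmse": 35.6})
log_run("MSTL", {"trend": "arima"}, {"mape": 6.4, "rmse": 33.0})

# Model selection: the run with the lowest MAPE wins.
best = min(runs, key=lambda run: run["metrics"]["mape"])
print(best["model"])
```

Because every run is stored with its parameters, a drifting model can later be traced back to the exact settings that produced it, which is what the registration step in the workflow is for.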
Let's practice!

But first, let's run some experiments!