Experimentation
In the previous chapter, we reviewed the general architecture of a forecasting pipeline. In this chapter, we will focus on the experimentation component.

Experimentation in a nutshell
Experimentation in data science is the process of testing and evaluating hypotheses. This process typically starts with a problem statement, which is followed by setting up the hypotheses we want to explore. We then conduct tests to evaluate each hypothesis. Based on the results, we may set up new hypotheses and repeat the process until we reach final conclusions.

Experimentation in forecasting
Experimentation in predictive analysis, particularly in forecasting, is the process of training statistical and machine learning models at scale. The goal is to identify the best modeling approach for the problem we are trying to solve. This type of process includes the following components:
- Data: a time series object
- An initial hypothesis about the type of models we want to use
- A training framework to train and test the models
- Predefined performance KPIs to score and evaluate the models' performance on the testing partitions
- A model registration step that enables us to compare and track the models' performance at scale
- Last but not least, a model selection method that selects the best forecasting model using predefined error criteria such as MAPE or RMSE
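The scoring and selection components lean on error metrics such as MAPE and RMSE. Here is a minimal, framework-free sketch of how these two scores are computed on a testing partition (MAPE assumes the actual series contains no zeros):

```python
import math

def mape(actual, predicted):
    """Mean Absolute Percentage Error, reported as a percentage."""
    return 100 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root Mean Squared Error, in the units of the series."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

# Score a three-point testing partition.
actual = [100, 110, 120]
predicted = [90, 115, 130]
print(mape(actual, predicted))
print(rmse(actual, predicted))
```

Because MAPE is scale-free and RMSE is in the units of the series, the two metrics can rank the same set of models differently, which is why both appear in the selection step.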
Previously, we conducted a simple experiment to identify the best forecasting model using the MAPE and RMSE error metrics. We should ask ourselves whether there is room for additional improvement in the models' performance.
For example, using AutoARIMA out of the box did not yield great results compared to the other models. Can changing the model's tuning parameters achieve better performance?
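One way to test that kind of hypothesis at scale is to score the model under a grid of tuning-parameter settings and keep the best one. A minimal sketch of the idea, where `fit_and_score` is a hypothetical stand-in for the real train-and-evaluate step (its score formula is fake and exists only to make the example runnable):

```python
from itertools import product

def fit_and_score(params, train, test):
    # Hypothetical stand-in for training a model with `params` on `train`
    # and returning its error (e.g. MAPE) on `test`. The formula below is
    # made up; it only exists so the sketch runs end to end.
    return abs(params["seasonal_period"] - 12) + params["max_order"] * 0.1

# Illustrative parameter names, not tied to any specific library.
grid = {"seasonal_period": [7, 12, 52], "max_order": [1, 2, 3]}

best_params, best_score = None, float("inf")
for combo in product(*grid.values()):
    params = dict(zip(grid.keys(), combo))
    score = fit_and_score(params, train=None, test=None)
    if score < best_score:
        best_params, best_score = params, score

print(best_params, best_score)
```

The same loop works for any model: only the parameter grid and the train-and-evaluate step change.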
Likewise, is there room for further improvement with the multi-seasonal models? For example, can we improve model performance by using a different modeling approach for the trend component? These are the types of hypotheses we want to explore in the experimentation process.

Workflow
In the life cycle of model development and deployment, experimentation is a pre-deployment step. Once we select a model and deploy it, we monitor its performance. In the case of performance drift, we return the model to the experimentation step to re-tune it. We will discuss deployment and monitoring in more detail in the next chapters.
Here is a simple experimentation architecture that we will use to train and identify the best forecasting model. This architecture includes the following components:
- A data ingestion and transformation process
- A backtesting framework to train and test the models
- Scoring functionality
- Logging of model parameters and results
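The backtesting component evaluates each model on several train/test splits of the series rather than a single one. A minimal sketch of expanding-window splits over the index of a series, assuming the training window grows by one forecast horizon per round:

```python
def expanding_window_splits(n_obs, min_train, horizon):
    """Yield (train_indices, test_indices) pairs where the training
    window grows by `horizon` observations each round and the test
    window is always the next `horizon` observations."""
    splits = []
    train_end = min_train
    while train_end + horizon <= n_obs:
        train_idx = list(range(0, train_end))
        test_idx = list(range(train_end, train_end + horizon))
        splits.append((train_idx, test_idx))
        train_end += horizon
    return splits

# A 10-point series, at least 6 training points, forecasting 2 steps ahead.
for train_idx, test_idx in expanding_window_splits(10, 6, 2):
    print(len(train_idx), "training points ->", test_idx)
```

Averaging the error metrics over all splits gives a more stable estimate of each model's performance than a single train/test cut.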
It is recommended to use a JSON file or a similar format to set the experiment parameters.
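For instance, a small JSON file can hold the experiment settings that the training script reads at startup. The parameter names below are illustrative, not part of any framework:

```python
import json

# Hypothetical experiment settings; the names are for illustration only.
config = {
    "models": ["AutoARIMA", "HoltWinters", "MSTL"],
    "backtesting": {"min_train": 104, "horizon": 24, "n_windows": 5},
    "selection_metric": "mape",
}

# Write the file once; every experiment run then reads the same settings.
with open("experiment.json", "w") as f:
    json.dump(config, f, indent=2)

with open("experiment.json") as f:
    loaded = json.load(f)

print(loaded["models"], loaded["selection_metric"])
```

Keeping the settings out of the code makes a re-run with different models or horizons a one-line file edit rather than a code change.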
Like before, we will use the Pandas and Nixtla frameworks to process the data and train the models, and the MLflow framework to log and track the results. Next, we will dive into the backtesting method.
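As a framework-free sketch of the logging and selection steps (in the actual pipeline, MLflow's `log_params` and `log_metric` functions play the logging role), each run records its parameters and scores, and selection picks the run with the lowest error. The scores below are made up for illustration:

```python
# Stand-in run registry: a list of records, one per trained model.
runs = []

def log_run(model_name, params, metrics):
    """Record one run's parameters and error metrics."""
    runs.append({"model": model_name, "params": params, "metrics": metrics})

# Fabricated scores, purely illustrative.
log_run("AutoARIMA", {"seasonal_period": 12}, {"mape": 9.8, "rmse": 41.2})
log_run("HoltWinters", {"seasonal_period": 12}, {"mape": 7.1, "rmse": 35.6})
log_run("MSTL", {"trend": "arima"}, {"mape": 6.4, "rmse": 33.0})

# Model selection: the run with the lowest MAPE wins.
best = min(runs, key=lambda run: run["metrics"]["mape"])
print(best["model"])
```

Because every run is stored with its parameters, a drifting model can later be traced back to the exact settings that produced it, which is what the registration step in the workflow is for.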
Let's practice!

But first, let's run some experiments!