1. Productionizing your forecast model
Hi, my name is Rami Krispin.
I will be the instructor for this course.
2. Introduction
Before we jump into the course material, let me tell you a bit about myself:
I am a senior data science and engineering manager with a decade of experience working with time series data, building forecasting models at scale, and applying MLOps practices. I am the author of the book Hands-On Time Series Analysis and Forecasting with R, and the creator and maintainer of several open source projects.
3. Forecasting in Production
This course focuses on forecasting in production. We will learn how to design a forecasting pipeline to automate and monitor a recurring forecasting task.
4. Motivation
We typically want to productionize a forecasting task when the task requires either
automation - it recurs regularly and frequently, such as forecasting the hourly temperature every hour; or
scale - there is a large number of series, and the forecasting process requires high computing resources to support it.
And, of course, a combination of both - automating a forecasting process at scale.
Let's review the general architecture of a forecasting pipeline.
5. General architecture
It typically includes the following components:
6. General architecture
A live data source, such as an API endpoint or a database. This requires a data pipeline to automate the ETL process and ensure that our local dataset is up-to-date with the data source.
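For a flavor of what that extraction step might look like, here is a minimal sketch in Python, assuming the EIA API v2 hourly demand endpoint; the exact path, parameters, and the YOUR_API_KEY placeholder are illustrative, and we will cover the real calls later in the course:

```python
import pandas as pd
import requests

# Illustrative EIA API v2 endpoint and parameters - check the EIA docs
# for the exact path and facets for the series you need.
URL = "https://api.eia.gov/v2/electricity/rto/region-data/data/"
params = {
    "api_key": "YOUR_API_KEY",        # placeholder - register for a free key
    "frequency": "hourly",
    "data[0]": "value",
    "facets[respondent][]": "US48",   # demand for the US lower 48
    "start": "2024-01-01T00",
}

response = requests.get(URL, params=params, timeout=30)
response.raise_for_status()

# The v2 API nests the observations under response -> data.
raw = response.json()["response"]["data"]
data = pd.DataFrame(raw)[["period", "value"]]
data["period"] = pd.to_datetime(data["period"])
print(data.tail())
```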
7. General architecture
An experimentation framework to train, test, and evaluate the forecasting models' performance. This component is used to identify the best forecasting model.
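As a taste of this component, here is a minimal sketch that backtests two models with Nixtla's statsforecast library and logs their scores to MLflow; the toy series, model choices, and metric are assumptions for illustration:

```python
import numpy as np
import pandas as pd
import mlflow
from statsforecast import StatsForecast
from statsforecast.models import AutoARIMA, HoltWinters

# Toy hourly series in the long format Nixtla's libraries expect:
# unique_id (series id), ds (timestamp), y (value).
rng = pd.date_range("2024-01-01", periods=24 * 30, freq="h")
noise = np.random.default_rng(seed=0).normal(0, 2, size=len(rng))
df = pd.DataFrame({
    "unique_id": "US48",
    "ds": rng,
    "y": 100 + 10 * np.sin(np.arange(len(rng)) * 2 * np.pi / 24) + noise,
})

sf = StatsForecast(
    models=[AutoARIMA(season_length=24), HoltWinters(season_length=24)],
    freq="h",
)

# Backtest each model on 3 rolling windows with a 24-hour horizon.
cv = sf.cross_validation(df=df, h=24, n_windows=3)

with mlflow.start_run(run_name="baseline-experiment"):
    for model in ["AutoARIMA", "HoltWinters"]:
        mape = ((cv[model] - cv["y"]).abs() / cv["y"].abs()).mean()
        mlflow.log_metric(f"mape_{model}", mape)
```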
8. General architecture
Once we identify the best model, we will deploy it in the production environment, which includes the automation and scaling layers.
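As a preview, the automation layer could be an Airflow DAG along these lines; the DAG name, task bodies, and hourly schedule are illustrative placeholders (the schedule argument assumes Airflow 2.4 or later):

```python
from datetime import datetime
from airflow.decorators import dag, task

@dag(schedule="@hourly", start_date=datetime(2024, 1, 1), catchup=False)
def forecast_pipeline():
    @task
    def refresh_data():
        # Pull the latest observations from the data source
        # (e.g., the EIA API) and append them to the local dataset.
        ...

    @task
    def generate_forecast():
        # Load the deployed model, produce the next forecast, and store it.
        ...

    refresh_data() >> generate_forecast()

forecast_pipeline()
```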
9. General architecture
Last but not least is the post-deployment step, which includes monitoring the model's performance in production to identify performance drift and other potential issues.
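A simple version of such a monitoring check compares recent forecasts against the actuals that arrived afterward and raises an alert when the error crosses a threshold. Here is a sketch; the performance log, column names, and threshold values are assumptions for illustration:

```python
import numpy as np
import pandas as pd

# Hypothetical performance log: each row pairs a forecast with the
# actual value observed later.
gen = np.random.default_rng(seed=1)
actual = 100 + gen.normal(0, 5, size=24)
log = pd.DataFrame({
    "ds": pd.date_range("2024-06-01", periods=24, freq="h"),
    "actual": actual,
    "forecast": actual + gen.normal(0, 8, size=24),  # a drifting model
})

# Score the last forecasting cycle.
mape = (log["forecast"] - log["actual"]).abs().div(log["actual"].abs()).mean()

# Compare against the error observed during experimentation; alert when
# the live error is well above it (baseline and tolerance are assumed).
BASELINE_MAPE, TOLERANCE = 0.05, 1.5
if mape > BASELINE_MAPE * TOLERANCE:
    print(f"ALERT: MAPE of {mape:.2%} exceeds threshold - check for drift")
```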
10. General architecture
Throughout this course, we will dive into the different components of this architecture. To demonstrate the process, we will use a real-life example: the US hourly demand for electricity from the EIA API.
11. Course outline
In this chapter, we will review the data source and demonstrate how to pull the data from the EIA API.
Chapter 2 focuses on the experimentation process, covering how to train, test, and log the performance of multiple forecasting models in order to identify the best forecasting approach.
Chapter 3 reviews the deployment process, including data automation, model refresh, and capturing logs. We will demonstrate how to set up the automation using Airflow.
Chapter 4 focuses on post-deployment steps, which include monitoring the pipeline and setting alerts. Last but not least, we will conclude the course with best practices.
12. Course prerequisites
This is an advanced course. To complete it successfully and apply what you learn, you will need prior knowledge of the following:
Time series analysis and forecasting.
Orchestration systems such as Airflow, GitHub Actions, etc.
Querying data from APIs, and
Python programming
13. Course tools
Here are some of the tools we will use in the course:
- Nixtla's statsforecast and mlforecast libraries to create forecasts
- MLflow to track and log the model's performance in the experiment, and
- Quarto dashboard to monitor the pipeline
This course mainly focuses on the principles, so nothing stops you from applying what you learn with other tools or programming languages, such as R or Julia.
14. Let's practice!
Let's get started!