Pipeline architecture
1. Pipeline architecture
Welcome back!
2. Model deployment
In the previous chapters, we reviewed the data source and the model selection process using the experimentation framework.
3. Model deployment
In this chapter, we will focus on model deployment by setting up the ETL and ML pipelines.
4. Pipeline requirements
Let's start by defining the pipeline's requirements. We will refresh the data and forecast daily, and set the forecast horizon to 72 hours. We want the process to be robust, which means adding unit testing and validation steps to avoid data integrity issues, forecast performance drift, and other potential errors. Lastly, we should be mindful of the pipeline's future maintenance and avoid unnecessary complexity.
5. Pipeline design
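These requirements (a daily refresh, a 72-hour horizon, validation steps) can be captured in a small configuration with a basic integrity check of the kind the validation steps would run. This is a minimal sketch: the setting names, the missing-data threshold, and the `validate_series` helper are illustrative, not part of the course code.

```python
from datetime import timedelta

# Illustrative settings derived from the stated requirements.
PIPELINE_CONFIG = {
    "refresh_interval": timedelta(days=1),  # refresh data and forecast daily
    "forecast_horizon_hours": 72,           # 72-hour forecast horizon
    "max_missing_ratio": 0.01,              # integrity threshold (assumed)
}

def validate_series(values, config=PIPELINE_CONFIG):
    """Fail fast on data integrity issues before forecasting."""
    if not values:
        raise ValueError("empty series")
    missing = sum(v is None for v in values)
    if missing / len(values) > config["max_missing_ratio"]:
        raise ValueError(f"{missing} missing values exceed the allowed ratio")
    return True

print(validate_series([1.0, 2.0, 3.0]))  # True
```

A check like this would run as one of the pipeline's early tasks, so bad data stops the run before any forecast is produced.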
Once we have defined the pipeline requirements, the next step is to create the pipeline design. This design should represent the pipeline's blueprint as a directed acyclic graph, or DAG. Recall that DAGs help us define the execution order of the pipeline's tasks. Regardless of the tool you use to orchestrate your pipeline, having a clear end-to-end design simplifies the implementation process. Throughout this course, we will use the following pipeline design, which includes:
- a data ingestion process,
- a forecasting automation component,
- data storage, and
- capturing logs to monitor the pipeline's health once deployed into production.
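The execution order implied by this design can be sketched as a plain-Python DAG before committing to an orchestrator. The task names below are illustrative placeholders, not names used later in the course:

```python
from graphlib import TopologicalSorter

# Downstream task -> set of upstream dependencies (illustrative names).
dag = {
    "ingest_data": set(),
    "run_forecast": {"ingest_data"},
    "store_results": {"run_forecast"},
    "capture_logs": {"store_results"},
}

order = list(TopologicalSorter(dag).static_order())
print(order)  # ['ingest_data', 'run_forecast', 'store_results', 'capture_logs']
```

An orchestrator such as Airflow expresses the same dependencies with operators and the `>>` syntax, but the ordering logic it resolves is exactly this topological sort.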
To set up this pipeline, we will use `Airflow` to orchestrate the different components of the pipeline and `mlflow` to store the winning model from the experimentation process. Let's start by registering the winning model.
11. Model registry
There are two common approaches to model registration with `mlflow`:
- register the model while logging the model's parameters and metrics, or
- register the model separately, after the experimentation process is complete.
Each method has pros and cons. For simplicity, we will only log the winning model. We can log the model using `mlflow` built-in functionality, also known as a flavor, which supports the core Python machine learning classes. For non-supported classes, we can build a custom register function. In both cases, we log the fitted object, which must have a predict method. To register the winning model, we will use the `mlforecast` library's customized register function.
12. Model registry
Let's start by importing the models, `mlflow`, and the `mlforecast` flavor module. We then set the experiment name and path, and pull the experiment metadata using the `get_experiment_by_name` function.
13. Model registry
We will define the `LightGBM` model and the `mlforecast` parameters.
14. Model registry
Next, we will set the `mlforecast` object and fit it to the time series. Once we have the fitted object, we can register it.
15. Model registry
Last but not least, we will set the run name using the model label and the time of the run, and log the model using the `mlforecast` flavor's `log_model` function.
16. Model registry
Once the model is registered, it will appear in the UI labeled with the run name that we defined.
17. Model registry
Notice that the model name is available in the Models column.
18. Model registry
The model object is stored as an artifact, including the object's pickle file, environment settings, and other metadata. In the following video, we will dive into the ETL process.
19. Let's practice!
Now it's your turn. Let's move to some exercises!