
Model Registration with MLflow

1. Model Registration with MLflow

In this video, we will explore model performance using the MLflow UI.

2. Launching the MLflow UI

To launch the MLflow UI, open the terminal and use the `mlflow ui` command from the experiment folder.

3. Launching the MLflow UI

By default, the server is exposed on port 5000, and you can open the UI in your browser at http://localhost:5000.
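For reference, the launch looks like this in the terminal; 127.0.0.1:5000 is MLflow's default host and port:

```bash
# Run from the experiment folder so MLflow finds the local tracking store
mlflow ui
# Then open http://127.0.0.1:5000 in your browser
```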

4. Analyze the backtesting results

The UI enables you to explore your experiments and compare the different runs seamlessly.

5. Analyze the backtesting results

Let's select the ml_forecast experiment under the experiments menu

6. Analyze the backtesting results

and the corresponding experiment runs from the main table.

7. Analyze the backtesting results

We use the group by drop-down and group the table by the model label.

8. Analyze the backtesting results

This enables us to compare the performance by model label.

9. Analyze the backtesting results

Using the graph icon, we will visualize the overall performance of the models by coverage, MAPE, and RMSE. In this case, you can notice that the lightGBM and XGBoost models are, on average, fairly close in their performance, with a slight advantage for lightGBM.

10. Analyze the backtesting results

Likewise, we can explore the models' performance further using the compare menu. The box-plot option enables us to review the error distribution of each model.
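The same comparison can also be reproduced programmatically with the MLflow client. Below is a minimal sketch; it assumes the runs were logged with a model_label parameter and coverage, mape, and rmse metrics, which are assumptions about how the experiment was set up rather than the course's exact names:

```python
import mlflow

# Pull every run from the ml_forecast experiment into a pandas DataFrame
# (assumes the script runs from the experiment folder, so the local tracking store is found)
runs = mlflow.search_runs(experiment_names=["ml_forecast"])

# Average coverage, MAPE, and RMSE per model label
# (the model_label param and metric names are assumptions about how the runs were logged)
summary = (
    runs.groupby("params.model_label")[["metrics.coverage", "metrics.mape", "metrics.rmse"]]
    .mean()
    .sort_values("metrics.rmse")
)
print(summary)
```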

11. Can we improve the performance?

After we finish our first round of modeling and review the results, we should evaluate whether the models are optimized or there is still room for improvement before selecting and promoting a model to production. This evaluation could be based on:
- an existing performance benchmark,
- residual analysis, to identify remaining patterns that the models did not capture, and
- analyzing the backtesting results.
Residual analysis is beyond the scope of this course, so we will focus on analyzing the backtesting results. Additional improvements can be achieved by replacing or adding new models, adding new features, and using different tuning parameters.

12. Can we improve the performance?

Let's look at the winning model, lightGBM, which scored the lowest average error on the testing partitions relative to the other four models.

13. Tuning parameters

The lightGBM model uses the following tuning parameters. Like any other tree-based model, it is highly sensitive to the learning rate and the number of trees used in the training process. By default, the learning rate is set to

14. Tuning parameters

0.1, and the number of trees to 100.
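These defaults can be verified directly on the scikit-learn style LightGBM estimator; a minimal sketch, assuming the `LGBMRegressor` wrapper is the one in use:

```python
from lightgbm import LGBMRegressor

# learning_rate and n_estimators (the number of boosting trees) default to 0.1 and 100
params = LGBMRegressor().get_params()
print(params["learning_rate"], params["n_estimators"])  # 0.1 100
```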

15. Hypothesis

In this case, we can set the following hypothesis: using a lower learning rate, along with a higher number of trees, will result in better performance, that is, a lower error rate. To test this hypothesis, we train the following six models with different learning rates and numbers of trees, the first model using the default settings from the initial experiment. We repeat the same steps as before to train the models with the same backtesting framework, and then score and log the results with MLflow, as sketched below.
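A minimal sketch of how such a grid could be trained and logged follows. The specific parameter values, the synthetic data, the expanding-window backtesting scheme, and the metric names are illustrative assumptions, not the course's actual setup:

```python
import mlflow
import numpy as np
from lightgbm import LGBMRegressor
from sklearn.metrics import mean_absolute_percentage_error, mean_squared_error
from sklearn.model_selection import TimeSeriesSplit

# Six candidate configurations; the first keeps the LightGBM defaults
configs = [
    {"learning_rate": 0.10, "n_estimators": 100},   # baseline (defaults)
    {"learning_rate": 0.05, "n_estimators": 200},
    {"learning_rate": 0.05, "n_estimators": 400},
    {"learning_rate": 0.01, "n_estimators": 400},
    {"learning_rate": 0.01, "n_estimators": 800},
    {"learning_rate": 0.005, "n_estimators": 1000},
]

# Synthetic stand-in for the course's feature matrix and target
rng = np.random.default_rng(42)
X = rng.normal(size=(500, 8))
y = 10 + X @ rng.normal(size=8) + rng.normal(scale=0.1, size=500)

mlflow.set_experiment("ml_forecast")

for i, cfg in enumerate(configs, start=1):
    with mlflow.start_run(run_name=f"lightGBM{i}"):
        mapes, rmses = [], []
        # Expanding-window splits as a simple backtesting scheme
        for train_idx, test_idx in TimeSeriesSplit(n_splits=5).split(X):
            model = LGBMRegressor(**cfg).fit(X[train_idx], y[train_idx])
            pred = model.predict(X[test_idx])
            mapes.append(mean_absolute_percentage_error(y[test_idx], pred))
            rmses.append(np.sqrt(mean_squared_error(y[test_idx], pred)))
        # Log the configuration and the averaged test-partition scores
        mlflow.log_params({**cfg, "model_label": f"lightGBM{i}"})
        mlflow.log_metrics({"mape": float(np.mean(mapes)), "rmse": float(np.mean(rmses))})
```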

16. Analyzing the results

Let's leverage the MLflow UI to compare the performance of the lightGBM models. As you can see in the plot, model lightGBM6, marked in light green, is the winning model, with the lowest MAPE and RMSE on average. This reflects an improvement of 10% in the RMSE score and 15% in the MAPE score with respect to the previous winning model, lightGBM1, marked in gray.
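The winning run can also be retrieved programmatically by sorting the runs on the logged RMSE; a minimal sketch, assuming the metrics were logged under the names rmse and mape:

```python
import mlflow

# Return the single run with the lowest average RMSE in the ml_forecast experiment
best = mlflow.search_runs(
    experiment_names=["ml_forecast"],
    order_by=["metrics.rmse ASC"],
    max_results=1,
)
print(best[["tags.mlflow.runName", "metrics.mape", "metrics.rmse"]])
```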

17. Experimentation constraints

The tuning process could continue until the marginal improvement is no longer significant under the constraints of time and computational cost. The performance of the lightGBM6 model is sufficient, and in the next chapter, we will focus on promoting the model to production.

18. Let's practice!

Now, let's register the models and analyze the backtesting results.
