1. The state of MLOps today
Hello again! So far, we have repeatedly referred to DevOps because MLOps originated from there. But there is a crucial difference between the two. While DevOps is an established, if not narrowly defined, set of practices focused on collaboration and automation, MLOps is far less mature and even less well-defined.
2. The tool challenge
This concerns not only the collaboration and culture aspects but also the so-called tool stack. To develop, deploy, operate, and monitor machine learning models professionally, we need a sophisticated architecture, as seen in this image. We will not go into detail, but there are feature stores, metadata stores, a code and a model registry, and many more components. Different tools are available for each of these components, some narrow in scope, others entire platforms providing end-to-end capabilities. This enumeration again highlights the skills and teamwork required to excel in MLOps.
3. Where can you host your MLOps application?
In Chapter 1, we briefly discussed that we could run MLOps on the big cloud providers, on other managed platforms, or on a self-built tool stack. Of course, combinations of these three options are also possible.
4. Big 3 cloud providers
For managed services like AWS SageMaker, Microsoft Azure Machine Learning, or Google Vertex AI, one usually pays higher fees than for cheaper, unmanaged cloud instances.
5. Managed MLOps platforms
The pricing scheme of managed platforms like Databricks, DataRobot or dataiku depends on the platform, while
6. Free open-source tools
open-source tools are typically free, and many of them can be run from within Python, which is itself open-source and free to use. Here is a non-exhaustive list of such tools. They can also be installed on cheaper cloud instances.
7. Advantages and disadvantages of self-built tool stacks
The main advantage is much higher flexibility: the team can customize the tool stack to its individual needs. It is also usually cheaper, as we discussed. On the downside, the team must set up and maintain the tool stack itself. And do not underestimate the maintenance part here: since MLOps is a very dynamic field, frequent updates are often necessary, for example for security reasons or to run the latest models.
8. How to start your MLOps journey
If a business has just started its MLOps journey, setting up the infrastructure manually usually does not make much sense. Sophisticated skills are needed to both build and maintain such an infrastructure, and while full platforms such as Databricks have their limitations, these will typically not become apparent early on.
9. State of MLOps
To conclude on the state of MLOps today: as we already noticed, the term is still vaguely defined. For example, many machine learning models never make it into production, and we often do not know precisely why. Nor is there anything like a reference tool stack. MLOps is still in its infancy, so a lot of transformation and consolidation is expected concerning tools and practices. This is one reason why MLOps is still perceived as very challenging, and it arguably is! But this makes it an excellent time to start or extend MLOps capabilities.
10. Let's practice!
Now, let's test your knowledge!