1. Business challenges and risks related to MLOps
Welcome back!
In the previous chapter, we discussed the complete MLOps lifecycle. Now, we will discuss how to succeed with MLOps in practice.
2. Agenda of chapter 3
We will touch on how to prepare for and handle risks, how to make MLOps teams shine, and how the MLOps landscape looks today.
3. MLOps is challenging
Many businesses find MLOps challenging, and it is widely accepted that operationalizing, automating, and scaling machine learning applications is hard.
There are a variety of reasons for this. We will discuss them next.
4. MLOps requires diverse skills
First, remember that MLOps lies at the intersection of different fields.
5. MLOps challenges: skills
Teams might lack certain skills. Often these are related to software engineering, such as deep experience in validation and testing.
This can lead, for example, to technical debt. Technical debt is the extra work teams have to do at a later stage because a quick and cheap solution was chosen in the first place. It can also show up as a lack of standardization or the inability to reproduce earlier results.
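To make the point about reproducing earlier results concrete, here is a minimal sketch, assuming a training run whose only source of randomness is a seeded generator. The function `train_and_score` and its score formula are hypothetical stand-ins, not a real training routine.

```python
import random


def train_and_score(seed: int) -> float:
    """Hypothetical stand-in for a training run that returns a validation score."""
    rng = random.Random(seed)  # local generator, so no hidden global state
    # Assumption: a noisy score around 0.80; a real run would train and evaluate a model.
    return round(0.80 + rng.uniform(-0.05, 0.05), 4)


SEED = 42  # stored together with the results so the run can be repeated later

first_run = train_and_score(SEED)
second_run = train_and_score(SEED)
print(first_run == second_run)  # the same seed gives the same score
```

Recording the seed next to the results is only one piece of reproducibility. Data versions, code versions, and library versions usually need to be tracked as well.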
6. MLOps challenges: collaboration and culture
Related challenges concern collaboration and culture. We already discussed the danger of silos. MLOps applications might not be accepted unless business and statistical metrics are closely aligned. All stakeholders need to understand and accept how modern MLOps teams operate. Otherwise, MLOps initiatives might fail early, and critical talent might leave. Also important are a culture of constant learning, the acknowledgment that failures happen, and a team-wide habit of documenting and sharing knowledge. More about that later!
7. MLOps challenges: technology
Another challenge relates to technology. The field is very dynamic, and the popular tool stack of one year might look quite different the next. This creates a strong demand to keep both team skills and tools up to date. It also means there is not yet a common tool stack for MLOps, so different teams implement widely different solutions with relatively little guidance and few shared best practices. We will discuss this later in this chapter.
With so many tools on offer, a common mistake to avoid is over-reliance on technology. Tools are an important part of MLOps, but as we have seen, there is more to it.
8. MLOps challenges: risks
As with all business initiatives, there are risks involved with deploying and operating machine learning models. MLOps explicitly tries to reduce the business risks of running machine learning models but cannot exclude them entirely. What are typical risks associated with machine learning operations?
There are business and legal risks, often associated with unavailable models or with lower-than-expected or deteriorating prediction quality. This can lead to direct costs; imagine, for example, a trading app where the actual return is lower than what was expected during development. It can also lead to indirect costs through a worse customer experience and customer churn. We also discussed the demanding skill requirements for MLOps. If key talent leaves, we risk not being able to maintain or adapt our applications.
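One way to catch deteriorating prediction quality early is to track accuracy over the most recent predictions. The following is a minimal sketch, assuming that true outcomes become available after each prediction. The class name `AccuracyMonitor`, the window size, and the threshold are illustrative choices, not part of any particular tool.

```python
from collections import deque


class AccuracyMonitor:
    """Tracks accuracy over the most recent `window` predictions (illustrative sketch)."""

    def __init__(self, window: int, threshold: float):
        self.outcomes = deque(maxlen=window)  # old outcomes drop out automatically
        self.threshold = threshold

    def record(self, prediction, actual) -> None:
        # Store True for a correct prediction, False otherwise.
        self.outcomes.append(prediction == actual)

    def accuracy(self) -> float:
        if not self.outcomes:
            raise ValueError("no outcomes recorded yet")
        return sum(self.outcomes) / len(self.outcomes)

    def degraded(self) -> bool:
        # Only judge once the window is full, to avoid alerts on very few samples.
        if len(self.outcomes) < self.outcomes.maxlen:
            return False
        return self.accuracy() < self.threshold


monitor = AccuracyMonitor(window=4, threshold=0.75)
for prediction, actual in [(1, 1), (0, 0), (1, 0), (0, 1)]:
    monitor.record(prediction, actual)

print(monitor.accuracy(), monitor.degraded())
```

In this toy run, two of the four predictions are correct, so the windowed accuracy falls below the threshold and the monitor flags the model as degraded.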
There might also be governance risks, especially with black-box models, since we cannot fully control them. Well-known companies like Amazon, Microsoft, and Google have all had to deal with reputational damage due to wrong machine learning predictions. The model might also discriminate against certain subgroups, leading to possible legal exposure.
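A simple first check for subgroup discrimination is to compare how often the model predicts the positive class for each group. The sketch below assumes binary predictions and a known group label per record. The data is made up for illustration, and the gap between group rates is only one of several possible fairness measures.

```python
from collections import defaultdict


def positive_rate_by_group(predictions, groups):
    """Share of positive (1) predictions per group."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for prediction, group in zip(predictions, groups):
        totals[group] += 1
        positives[group] += int(prediction == 1)
    return {group: positives[group] / totals[group] for group in totals}


# Made-up example data: eight predictions for two groups, "a" and "b".
predictions = [1, 1, 0, 1, 0, 0, 0, 1]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

rates = positive_rate_by_group(predictions, groups)
gap = max(rates.values()) - min(rates.values())
print(rates, gap)
```

A large gap does not prove discrimination on its own, but it is a signal that the model deserves a closer look before and after deployment.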
Finally, as with other deployed IT solutions, there is, of course, also a potential cyber security risk. There are even machine learning models developed to fool other models.
9. Traditional software projects vs. MLOps
MLOps is more challenging than traditional software development. It requires basically all the same steps and tests as classical software, and on top of that, the underlying data might change and machine learning is non-deterministic by nature. Even after doing all that, we are less sure about how well our MLOps application will work compared to traditional software solutions.
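To illustrate what "the underlying data might change" can mean in practice, here is a minimal sketch that compares a numeric feature in new data against the training data. The function `mean_shift` and the sample values are hypothetical, and a shift in the mean is only one of many ways data can drift.

```python
from statistics import mean, stdev


def mean_shift(reference, current):
    """Shift of the current mean, measured in standard deviations of the reference data."""
    # Assumption: the reference data is not constant, otherwise stdev is zero.
    return abs(mean(current) - mean(reference)) / stdev(reference)


# Made-up feature values: training-time data versus data seen in production.
reference = [10.0, 11.0, 9.0, 10.0, 10.0]
current = [13.0, 14.0, 12.0, 13.0, 13.0]

shift = mean_shift(reference, current)
print(shift > 3)  # a shift of several standard deviations suggests the data has changed
```

Traditional software has no direct equivalent of this check, because its behavior does not depend on the statistical properties of incoming data in the same way.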
10. Let's practice!
Now, with these challenges in mind, let's practice again!