
MLOps

1. MLOps

We have learned how to scale AI projects. In this video, we will focus on effective operationalization of AI software, commonly called MLOps.

2. Operationalization

But what exactly does operationalization include? Amazon defines it as “bringing advanced AI models to business-as-usual operations”. It is easier said than done and is a road full of challenges; let’s discuss a few.

3. Code and data version control

Experimental data science projects generate many code versions, typically maintained through version control systems like Git. In addition to code, the data involved in these projects is dynamic and also exists in multiple versions. This can occur when new data is incorporated into existing datasets or when new features are engineered for model training.
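To make the data-versioning idea concrete, here is a minimal sketch (not from the course) of identifying dataset snapshots by a content hash, so a model run can record exactly which version of the data it was trained on. The `dataset_version` helper and the sample rows are illustrative assumptions:

```python
import hashlib
import json

def dataset_version(rows):
    """Compute a short content hash that identifies this exact dataset snapshot."""
    payload = json.dumps(rows, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()[:12]

v1 = [{"id": 1, "age": 34}, {"id": 2, "age": 41}]
v2 = v1 + [{"id": 3, "age": 29}]  # new data incorporated into the existing dataset

# Different contents yield different version identifiers,
# so changed data is as traceable as changed code in Git.
print(dataset_version(v1) != dataset_version(v2))  # True
```

Tools like DVC apply the same principle at scale, tracking large data files alongside the Git history of the code.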

4. Portability

ML projects typically involve three environments: training, validation, and production. Hence, it is crucial that the libraries, frameworks, versions of the underlying packages, and any other dependencies are consistent throughout the lifecycle.
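As a rough illustration of why consistency matters, the sketch below compares pinned dependency versions across two environments and reports mismatches. The package names, version numbers, and the `dependency_mismatches` helper are all hypothetical:

```python
# Hypothetical pinned dependencies for two of the three environments.
TRAINING = {"scikit-learn": "1.4.2", "pandas": "2.2.1"}
PRODUCTION = {"scikit-learn": "1.4.2", "pandas": "2.1.0"}

def dependency_mismatches(env_a, env_b):
    """Return packages whose pinned versions differ between two environments."""
    return {
        pkg: (env_a[pkg], env_b[pkg])
        for pkg in env_a.keys() & env_b.keys()
        if env_a[pkg] != env_b[pkg]
    }

print(dependency_mismatches(TRAINING, PRODUCTION))
# {'pandas': ('2.2.1', '2.1.0')}
```

A model trained under one set of versions can behave differently, or fail outright, under another, which is why lockfiles and containers are standard practice.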

5. Right time to think!

The complexities of managing different versions and dependencies can quickly become overwhelming, especially when the focus is on proving immediate business value in a PoC. Teams end up treating scalability and sound architecture as afterthoughts, leading to ad-hoc approaches that create challenges down the line when it comes to maintenance and growth.

6. Automated data management

Consider the manual work required to re-prepare training data if errors, changes, or updates occur in the initial data sources or if different attributes must be selected. Automated data management provides a robust foundation for ensuring consistent, high-quality data. This approach includes quality checks and alerts for the responsible team when data quality declines.
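A minimal sketch of such a quality check is below. The `check_quality` function, the required fields, and the 10% alert threshold are all illustrative assumptions, not part of any particular framework:

```python
def check_quality(rows, required_fields=("id", "age")):
    """Flag rows missing required fields; alert when the failure rate is too high."""
    failures = [r for r in rows if any(r.get(f) is None for f in required_fields)]
    failure_rate = len(failures) / len(rows)
    if failure_rate > 0.10:  # threshold chosen purely for illustration
        print(f"ALERT: {failure_rate:.0%} of rows failed quality checks")
    return failures

batch = [{"id": 1, "age": 34}, {"id": 2, "age": None}, {"id": 3, "age": 29}]
bad_rows = check_quality(batch)  # one bad row out of three triggers the alert
```

In a real pipeline, checks like this run automatically on every new data batch, and the alert notifies the responsible team instead of printing to the console.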

7. Need for automation

Manual interventions to address data issues result in significant delays in response time. By the time the team notices degraded model performance, conducts an error analysis, and takes corrective action, considerable time may have passed. During this period, the system continues to make incorrect predictions, which can have various negative consequences depending on the application.
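Automation shortens that gap. Here is a small sketch, under assumed names and thresholds, of monitoring live accuracy against a baseline so a retraining alert fires without waiting for a human to notice:

```python
def rolling_accuracy(predictions, labels, window=100):
    """Accuracy over the most recent `window` predictions."""
    recent = list(zip(predictions, labels))[-window:]
    return sum(p == y for p, y in recent) / len(recent)

def needs_retraining(live_accuracy, baseline_accuracy, tolerance=0.05):
    """Flag the model when live accuracy drops too far below the baseline."""
    return live_accuracy < baseline_accuracy - tolerance

# A drop from a 0.90 baseline to 0.82 live accuracy exceeds the tolerance.
print(needs_retraining(0.82, 0.90))  # True
```

A production monitor would compute `rolling_accuracy` continuously as labels arrive and page the team, or kick off an automated retraining job, the moment `needs_retraining` returns true.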

8. Code refactoring

AI model developers may lack expertise in core engineering best practices. In such cases, a machine learning engineer (MLE) must often refactor the initial code to make it production-grade. This introduces a significant risk, as the original logic may be misunderstood or misrepresented during refactoring.

9. Characteristics of efficient architecture

Building reliable systems requires significant effort in engineering efficient architecture. What does such a system look like? Here is a glimpse: MLOps practices focus on standardizing processes and building automated pipelines to manage ML systems. Automation, by its very nature, promotes reusability. This can mean building reusable components, such as data products and shared code modules, to ensure quality. Further, automated testing frameworks can catch issues in code logic before deployment, reducing the risk introduced by refactoring and saving ML engineers from fixing issues manually.
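As a sketch of what "catching issues before deployment" looks like in practice, here is an illustrative preprocessing function with unit tests of the kind a CI pipeline would run automatically; the `normalize` function and its tests are hypothetical examples, not from the course:

```python
def normalize(values):
    """Scale values linearly to the [0, 1] range; a typical preprocessing step."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

# Tests like these run in CI before deployment, so a logic error in the
# preprocessing code fails the build instead of reaching production.
def test_normalize_range():
    result = normalize([2.0, 4.0, 6.0])
    assert result[0] == 0.0 and result[-1] == 1.0

def test_normalize_preserves_order():
    assert normalize([1.0, 3.0, 2.0]) == [0.0, 1.0, 0.5]

test_normalize_range()
test_normalize_preserves_order()
```

In a real project these would live in a test suite executed by a framework such as pytest on every commit.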

10. Characteristics of efficient architecture

Using containers like Docker to manage dependencies bundles the model with its environment, leading to fewer manual interventions and making the lifecycle more efficient. Maintaining model performance in production is challenging for varied reasons, such as changes in data patterns or breaking code changes. MLOps provides a framework to monitor models in production through automated detection and resolution of model issues. McKinsey states, “Companies that adopt comprehensive MLOps practices shelve 30% fewer models, thereby increasing value from their AI efforts by 60%.”

11. Let's practice!

Great, we have learned the challenges of maintaining models in production and the role of MLOps in their effective operationalization. Let’s practice!