They say that IT specialists and programmers are masters of automation: they will spend 8 hours writing a script that saves them 10 minutes of work. Once. Sometimes it goes further, and from such on-the-fly scripts they will assemble an entire pipeline that does everything from A to Z. Until one of the steps in that workflow receives slightly different input data and the whole thing falls off the bike.

The issue is no different when it comes to implementing solutions based on Machine Learning. At some point in the project, you have to move from the experimentation phase to the implementation phase. Is such a transition done only once? Can we partially or completely automate the implementation of the ML solution, from data to the final product (implemented solution)? What is MLOps and how can it help us automate wisely?

Background

Let's look at the journey from data to product through Michał's eyes.

Michał is a young, middle-aged gentleman working in the Machine Learning industry, employed as a machine learning engineer at Sample-POL Sp. z o.o.

Michał Taki*, the hero of the story

A new project recently landed at Michał's company: a commission from the New Zealand government. Michał was to create an ML model to detect possums in camera images. The possums supposedly prowl New Zealand destroying crops, chewing through cables in cars, and so on. Interestingly, he received prepared, labeled data from the client, because the locals were so fed up with the possums that a committee of volunteers got together and marked the nasty creatures on the images collected from the cameras.

Michał had the data, so you could say he had it easy now! He got to work right away.

He took a ready-made model he found on the internet, trained it on the received data, and played around a bit with hyperparameter tuning. After some time and some GPU resources, he had a model that met the assumptions of the first phase of the project.

He wrapped the model in an API using FastAPI (a Python framework often used to build web applications), pushed it all to a server via FTP, and it worked. He had his first working solution.
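Michał's actual FastAPI service is not shown here, but the idea can be sketched with nothing more than Python's standard library: a single POST /predict endpoint that feeds the request body to the model. The detect_possum function below is a hypothetical stub standing in for the real detector, not Michał's code.

```python
import json
from wsgiref.simple_server import make_server

# Hypothetical stand-in for the trained possum detector.
def detect_possum(image_bytes: bytes) -> dict:
    # A real model would run inference here; we fake a verdict.
    return {"possum": len(image_bytes) % 2 == 0, "confidence": 0.87}

def app(environ, start_response):
    # Single POST /predict endpoint that passes the request body to the model.
    if environ.get("PATH_INFO") == "/predict" and environ.get("REQUEST_METHOD") == "POST":
        size = int(environ.get("CONTENT_LENGTH") or 0)
        body = environ["wsgi.input"].read(size)
        start_response("200 OK", [("Content-Type", "application/json")])
        return [json.dumps(detect_possum(body)).encode()]
    start_response("404 Not Found", [("Content-Type", "text/plain")])
    return [b"not found"]

# To serve locally, uncomment:
# make_server("", 8000, app).serve_forever()
```

In a framework like FastAPI the routing and JSON handling above collapse into a decorated function, but the shape of the service is the same: request in, model prediction out.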

Time passed. In distant New Zealand, the possums began to wear cotton caps and trapper hats, and the model stopped recognizing them. Unfortunately, Michał had no monitoring of his model, so he learned about its errors from the New Zealand Minister of Animals, who sent him a photo of possums in caps chewing through the battery cables of his Dacia Sandero.

Poor Michał sat down at the computer and didn't know what to do next. He did get new data, even labeled, but he had to train the model again, upload it to the server, swap the paths. And so on, a dozen or so times, because whenever he deployed a new model, after some time the possums found new tricks to fool the cameras. Michał was tired. Then the colleague at the next desk mentioned that it might be worth building a mechanism to carry out part or all of this process, and that it might be worth talking to the DevOps team, which builds such automations for the various applications the company had produced so far.

Michał, like a deer to a feeder, rushed to his desk and began googling furiously. There it was! He came across something called MLOps.

It looks like DevOps for machine learning, is that what I am looking for? – wondered Michał.

What is MLOps

Michał dug deeper. First, let's see what he found out about MLOps after an intense googling session.

87%

Michał came across a report claiming that 87% of ML models never reach production. The report was a bit old (from 2019) and a bit poorly documented, but it planted a seed of uncertainty in Michał's head. Because, as he found out, reaching production is one thing, but operationalization (a difficult word) is just as important.

To operationalize a product is to deliver it to production, which means deploying, monitoring and maintaining the solution.

DevOps for Machine Learning

Michał knew that in the case of "traditional software development" we have a concept called DevOps, which allows us to handle this cyclical software development process very nicely. We code, build, test, implement, monitor, improve, etc.

And if we look at the life cycle of a project based on machine learning, we can also see certain cycles and loops that would fit in a similar way as in the case of software.

And so the term MLOps is defined as an “extension” of the DevOps methodology in which data science and machine learning processes and artifacts become, let's say, first-class citizens of the DevOps ecosystem.

MLOps, like DevOps, is the result of understanding that separating the development of an ML model from the process that delivers it (ML operations) decreases the quality, transparency, and agility of all intelligent software.

MLOps is a set of tools and best practices for bringing machine learning into production. What are the assumptions behind MLOps?

We strive to have a language- and framework-independent, uniform release cycle where artifacts are properly tested and play the leading role in a continuous integration and delivery system. This allows us to deploy such software more agilely, while reducing technical debt.

The premise is nice, but why bother and add even more work?

As a Data Scientist, what percentage of your time do you spend deploying ML models?

Well, in addition to the previously mentioned topics such as testability, automation, and technical debt, time is also an important aspect. In one of the studies (the Algorithmia.com study), the question was asked: what percentage of their time do Data Scientists spend on deploying ML models?

We can assume that any improvement in this matter can save a lot of precious time (because time is money).

As the beautiful proverb goes:

“Models are temporary, pipelines are eternal.”

What does the MLOps process look like?

We know more or less what MLOps is and how it can help us. What does such a process look like?

Agile ML Workflow

Image from ml-ops.org by “INNOQ”

Areas

Well, as we said earlier, in the case of MLOps we are dealing with three overlapping areas: Design, Model Development, and Operations.

Design

The first phase — design — is dedicated to understanding the business, understanding the data, and designing software that uses machine learning.

At this stage, we identify our potential user, design a solution to solve their problem, and evaluate further development of the project.

We define and prioritize ML use cases. Of course, we also check the availability of data that will be needed to train our model and determine the functional and non-functional requirements of our ML model.

We should use these requirements to design the ML application architecture, establish a serving strategy, and create a test suite for the future ML model.
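What might such a test suite look like? Below is a minimal, purely illustrative sketch in plain Python: a stub model, a tiny held-out set, a minimum-accuracy gate, and an invariance check (a mirrored possum should still be a possum). All names, data, and thresholds are made up for the example.

```python
# Hypothetical quality gates for the future possum model; the names and
# thresholds are illustrative, not taken from the original project.

def accuracy(model, samples):
    """Fraction of (image, label) pairs the model classifies correctly."""
    hits = sum(1 for img, label in samples if model(img) == label)
    return hits / len(samples)

def flipped(image):
    """Horizontally flip a toy 'image' (a list of pixel rows)."""
    return [list(reversed(row)) for row in image]

# Stub model: 'detects a possum' if any pixel is brighter than 0.5.
model = lambda img: any(p > 0.5 for row in img for p in row)

holdout = [
    ([[0.9, 0.1]], True),   # bright blob -> possum
    ([[0.2, 0.1]], False),  # dark frame  -> no possum
]

# Gate 1: minimum quality on a held-out set before the model may ship.
assert accuracy(model, holdout) >= 0.9

# Gate 2: invariance, since a mirrored possum is still a possum.
for img, label in holdout:
    assert model(flipped(img)) == label
```

The point is that these checks are written during the design phase, before a real model exists, so every future candidate model has a bar it must clear.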

Model development

The next phase “ML Experimentation and Development” is dedicated to verifying the applicability of ML to our problem by implementing a Proof-of-Concept for the ML model.

Here, we iteratively perform various steps such as identifying or refining the appropriate ML algorithm for our problem, data engineering, and model engineering.

The main goal in this phase is to deliver a stable quality ML model that we can run in production.

Operations

And then the “ML Operations” phase, where the goal is to deliver the previously developed ML model to production using established DevOps practices such as testing, versioning, continuous delivery, and monitoring.

All three phases are interconnected and influence each other. For example, a design decision during the design phase will propagate to the experimentation phase and ultimately influence implementation options during the final operations phase.

The Data to Product Process

Let's now look at an overview of a typical workflow for building software based on machine learning.

Machine Learning Engineering

Image from ml-ops.org by “INNOQ”

The goal of a machine learning project is to build a model by applying machine learning algorithms to collected data. Every piece of machine learning software therefore contains three main artifacts: Data, ML Model, and Code. Looking at these artifacts, a typical machine learning process can be divided into three main phases:

Data Engineering: data acquisition and data preparation,

ML Model Engineering: ML model training and serving,

Code Engineering: integrating the ML model into the final product.

The most cliched chart…

You've probably seen this diagram somewhere before.

An ML system is more than ML code.

ML System Elements from Google's MLOps Article

If we look at all the building blocks shown here in the form of short slogans, it confirms the thesis that, in the real world, ML code makes up only a small part of an ML system.

What are we striving for?

MLOps is an ML engineering culture that, in its final phase, is based on the following practices: Continuous Integration (CI), Continuous Delivery (CD), and Continuous Training (CT), the last of which is unique to machine learning systems.

What are the tools?

Michał learned what MLOps is and what the process looks like. And it all seemed simple and logical. The next step on his list was to look around for tools available on the market. He found a cool site that lists tools in the ML and MLOps areas.

Quite a few tools…

Screenshot from The LF AI & Data landscape

He saw that, well, quite a few of these tools can be found, in each of the areas.

– That's a lot… – thought Michał.

Let's go back to our process. In the image above, we can see tools grouped into certain categories. If we plot these categories on the process diagram, we get something that can be called an MLOps stack template.

MLOps Technology Stack Template

AI Landscape

Painting by Henrik Skogström (Valohai)

Now, from all these tools we have seen, we should choose the ones we will actually use in our project. We can use the list from the first image in this section, or browse through the various GitHub repositories that group such tools, such as Awesome Production Machine Learning.

How to apply tools to the problem?

Let's say that, browsing through this vast ocean of tools, Michał managed to choose the ones he would like to use. What should he do now? Try to implement all of these tools at once? In stages? Or maybe just give up altogether? Well, Michał doesn't give up that easily, so he divided his work into stages.

MLOps - Levels to get

Like in a computer game, Michał divided the quest for MLOps nirvana (and, ultimately, the effective neutralization of the possums, let's not forget that) into levels.

MLOps Level 0 – Manual process.

A typical data science process performed at the beginning of an ML implementation. This level is experimental and iterative: every step of the pipeline, such as data preparation and validation, model training and testing, is performed manually. A common way of working at this level is with Rapid Application Development (RAD) tools such as Jupyter Notebooks.
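To make Level 0 concrete, here is a toy version of the fully manual loop Michał started with: "train" a trivial brightness-threshold classifier, evaluate it, and dump the result to a file by hand, every single time. Everything here is illustrative; no real ML library is involved.

```python
import json
import pathlib

# Toy 'training': pick a brightness threshold separating possum / no-possum frames.
def train(samples):
    possum = [b for b, label in samples if label]
    other = [b for b, label in samples if not label]
    return (min(possum) + max(other)) / 2  # midpoint threshold

def evaluate(threshold, samples):
    correct = sum(1 for b, label in samples if (b > threshold) == label)
    return correct / len(samples)

# Each sample: (average frame brightness, is_possum).
data = [(0.9, True), (0.8, True), (0.2, False), (0.3, False)]

threshold = train(data)           # step 1: manual training run
acc = evaluate(threshold, data)   # step 2: manual evaluation

# Step 3, also manual: serialize the 'model' and (in Michał's case) FTP it somewhere.
pathlib.Path("model.json").write_text(json.dumps({"threshold": threshold}))
print(f"accuracy={acc:.2f}, threshold={threshold:.2f}")
```

Every time the possums change tactics, a human has to rerun all three steps by hand, which is exactly the pain that pushes us toward Level 1.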

MLOps Level 1 – ML Pipeline Automation

Automation of the machine learning pipeline (or, in Spanish, la tubería). The next level involves training the model automatically: here we introduce continuous training. As soon as new data is available, the retraining process is triggered. This level of automation also includes data and model validation steps.
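A minimal sketch of that trigger logic, with all names and numbers invented for illustration: when new labeled data arrives, retrain, validate the candidate, and promote it only if it does not regress against the current model.

```python
# Current production model record (illustrative).
current = {"version": 1, "val_accuracy": 0.80}

def retrain(new_data):
    # Stand-in for a real training job; returns a candidate model record.
    return {"version": current["version"] + 1,
            "val_accuracy": 0.80 + 0.01 * len(new_data)}

def on_new_data(new_data):
    """Continuous-training hook: fires whenever new labeled data lands."""
    global current
    candidate = retrain(new_data)
    # Model validation gate: never promote a regression.
    if candidate["val_accuracy"] >= current["val_accuracy"]:
        current = candidate
    return current

on_new_data([1, 2, 3])  # three new labeled possum images arrive
```

The important part is the gate inside on_new_data: automation without validation would happily push a worse model to production at 3 a.m.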

MLOps Level 2 – Automating Continuous Integration and Delivery.

In the final stage, we introduce a CI/CD system to quickly and reliably deploy ML models to production. The fundamental difference from the previous stage is that we now automatically build, test, and deploy the data, the ML model, and the ML training pipeline components.

Automated ML Pipeline

Image from ml-ops.org by “INNOQ”

Recall that at Level 2 we want an automated pipeline, as in the diagram, where the subsequent steps are triggered continuously and automatically.

Phases

The table below neatly summarizes the MLOps stages that make up the process of automating a machine learning pipeline, and what we would like each stage to deliver once it is implemented.

MLOps Stage | Output of the Stage Execution
Development & Experimentation (ML algorithms, new ML models) | Source code for pipelines: data extraction, validation, preparation, model training, model evaluation, model testing
Pipeline Continuous Integration (build source code and run tests) | Pipeline components to be deployed: packages and executables
Pipeline Continuous Delivery (deploy pipelines to the target environment) | Deployed pipeline with the new implementation of the model
Automated Triggering (pipeline is automatically executed in production, on a schedule or in response to a trigger) | Trained model stored in the model registry
Model Continuous Delivery (model serving for prediction) | Deployed model prediction service (e.g. a model exposed as a REST API)
Monitoring (collecting data about model performance on live data) | Trigger to execute the pipeline or to start a new experiment cycle
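The last stage, monitoring, can be sketched in a few lines: keep a sliding window of live prediction outcomes and fire a retraining trigger once accuracy in the window drops below a threshold. The window size and threshold below are arbitrary example values, and in a real system the trigger would kick off the pipeline instead of appending to a list.

```python
from collections import deque

# Illustrative monitoring sketch: sliding window of live prediction outcomes.
WINDOW, THRESHOLD = 5, 0.6
outcomes = deque(maxlen=WINDOW)
triggers = []

def record(prediction_was_correct: bool):
    """Log one live outcome; fire a retrain trigger on degradation."""
    outcomes.append(prediction_was_correct)
    if len(outcomes) == WINDOW:
        live_acc = sum(outcomes) / WINDOW
        if live_acc < THRESHOLD:
            triggers.append(f"retrain: live accuracy {live_acc:.2f} < {THRESHOLD}")

# The possums put their caps on: the model starts missing.
for ok in [True, True, False, False, False]:
    record(ok)
```

Had Michał had even this much in place, he would have learned about the cap-wearing possums from a dashboard rather than from the Minister of Animals.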

What do you need?

After analyzing the MLOps steps, we can see that setting up MLOps requires installing or preparing several components. We need:

source control,

test and build services,

deployment services,

a model registry,

a feature store,

an ML metadata store,

an ML pipeline orchestrator.

This aligns with the MLOps stack template we saw earlier. And actually, many of these elements are similar to what we have in “traditional” DevOps.

DevOps vs MLOps - The Ultimate Showdown

A complete ML development pipeline includes three levels at which changes can occur: Data, ML Model, and Code. This means that in machine learning-based systems, the trigger for building can be a combination of a code change, a data change, or a model change.

What distinguishes MLOps from DevOps is that at each level of automation we must remember that we have not only code, but also data and ML models that we must include in the designed automations.
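One simple way to detect which of the three levels changed is to fingerprint each artifact and compare fingerprints between runs. The sketch below is illustrative, using content hashes of toy byte strings in place of real code, data, and model files.

```python
import hashlib

# In an ML system, a rebuild can be triggered by a change in any of the
# three artifact types: code, data, or model.
def fingerprint(artifacts: dict) -> dict:
    """Map each artifact name to a SHA-256 digest of its content."""
    return {name: hashlib.sha256(blob).hexdigest()
            for name, blob in artifacts.items()}

def changed(old: dict, new: dict) -> set:
    """Return the names of artifacts whose fingerprints differ."""
    return {name for name in new if old.get(name) != new[name]}

v1 = fingerprint({"code": b"train.py v1", "data": b"possums.csv v1", "model": b"weights v1"})
v2 = fingerprint({"code": b"train.py v1", "data": b"possums.csv v2", "model": b"weights v1"})

# Only the data changed, so only the data-driven part of the pipeline must rerun.
assert changed(v1, v2) == {"data"}
```

This is the essence of the difference: a classic DevOps pipeline watches one axis (code), while an MLOps pipeline has to watch all three.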

Afterword

And that’s where the story of Michał ends. By implementing an MLOps automation pipeline, he bravely solved the possum problem in New Zealand and, on top of that, gained valuable experience in automating the implementation of ML solutions. Let’s briefly recap the things worth remembering from this short story: