Delivering Production-Grade Machine Learning Outcomes with MLOps – InApps Technology 2022

March 29, 2022 by Phu Nguyen

What Is MLOps?

Just as DevOps linked software development with operations, MLOps links two disparate areas of machine learning: development of the machine learning model and operating it in production. Linking these areas requires the right team, and automated processes around continuous integration and deployment, bringing in aspects of DevOps. MLOps also requires broad collaboration between data scientists, infrastructure experts and data engineers.

The Last Mile Problems of Machine Learning

Nick Heudecker

Nick is Senior Director of Market Strategy and Competitive Intelligence at Cribl.

Enterprises building out machine learning capabilities have focused on two things:

The recruiting and training of data scientists.
Identifying suitable business case candidates for machine learning.

While important, these are only first-mile problems for any enterprise. As long as data scientists have enough relevant data for training and testing algorithms, building a predictive model is a straightforward, iterative process. The model and its supporting code are only a small component of the overall system needed to deploy and operate machine learning at scale.

Several last-mile problems must be overcome. Data collection and verification, deployment infrastructure, observing performance of the model, model analysis, and debugging — among other components — are essential to successfully deploying ML in the enterprise.

Today, these tasks are left to data scientists and the costs are substantial. Getting ML models into production takes about ninety days on average, with only 11% of companies consistently deploying in less than seven days (source: Algorithmia’s 2021 Enterprise Trends in Machine Learning). From the same source, most data scientists also report that the process of deployment takes 25% of their time.

Why does it take so long? Putting the various pieces together goes well beyond the typical skill set of data scientists. They are not software engineers or infrastructure and operations specialists, but deploying production-quality services requires the range of expertise those roles provide.

Compounding the production problem is model decay. In traditional software applications, the application’s code determines its behavior and output. That behavior is validated with tests. In a machine learning system, data determines the behavior of the model. In the real world, the data your model consumes will drift away from the training data, producing unreliable results. As the gap between training data and production data widens, the model’s performance degrades over time, often with little explanation. This means that models must be constantly observed and retrained as the data drifts from expectations. The need for constant redeployment exacerbates the earlier challenges around time to deploy.
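One way to make "drift from training data" measurable is to compare the distribution of a feature in training against the same feature in production. The sketch below uses the population stability index (PSI), one common choice among several; the function name, the ten-bucket default and the 0.25 alert threshold are illustrative assumptions, not something prescribed by the article.

```python
import math
from typing import List, Sequence


def population_stability_index(
    expected: Sequence[float], actual: Sequence[float], bins: int = 10
) -> float:
    """PSI between a training sample and a production sample of one numeric feature."""
    lo, hi = min(expected), max(expected)
    # Equal-width buckets over the training range; fall back to 1.0 if the range is empty.
    width = (hi - lo) / bins or 1.0

    def bucket_shares(values: Sequence[float]) -> List[float]:
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), bins - 1)  # clamp values outside the training range
            counts[idx] += 1
        # A small floor avoids log(0) when a bucket is empty.
        return [max(c / len(values), 1e-6) for c in counts]

    e = bucket_shares(expected)
    a = bucket_shares(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


# Hypothetical data: production values have shifted toward the upper half of the range.
training = [i / 100 for i in range(100)]
production = [0.5 + i / 200 for i in range(100)]

DRIFT_THRESHOLD = 0.25  # assumed rule-of-thumb cutoff; tune per feature
needs_retraining = population_stability_index(training, production) > DRIFT_THRESHOLD
```

A check like this would typically run on a schedule for each important feature, with a breach triggering investigation or retraining rather than an automatic redeploy.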

Observability and MLOps

In the same way that conventional applications are evolving beyond monitoring towards observability, ML systems must adopt the same observability capabilities. Collecting and storing the full range of logs and metrics emitted by an ML system allows data scientists and I&O engineers to quickly determine the cause of performance issues by actively interrogating the system’s behavior. Traditional monitoring solutions fall short here because they support only predefined dashboards over limited data volumes.
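As a rough illustration of what "collecting the full range of logs" can mean at the level of a single prediction, the sketch below wraps a model call and emits one structured JSON record per request. The wrapper name, the logger name and the record fields are hypothetical; a real deployment would route these records to whatever log pipeline the team already runs.

```python
import json
import logging
import time
import uuid
from typing import Any, Callable, Mapping

logger = logging.getLogger("ml.predictions")  # hypothetical logger name


def predict_with_logging(
    model: Callable[[Mapping[str, Any]], Any],
    features: Mapping[str, Any],
    model_version: str,
) -> Any:
    """Call the model and log inputs, output, version and latency as one JSON line."""
    started = time.perf_counter()
    prediction = model(features)
    record = {
        "event": "prediction",
        "request_id": str(uuid.uuid4()),
        "model_version": model_version,
        "features": dict(features),
        "prediction": prediction,
        "latency_ms": round((time.perf_counter() - started) * 1000, 3),
    }
    # default=str keeps the log call from failing on non-JSON-serializable values.
    logger.info(json.dumps(record, default=str))
    return prediction
```

Because every record carries the model version and the raw inputs, questions such as "which inputs produced this outcome, and under which model?" can be answered after the fact instead of being limited to a predefined dashboard.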

Another factor driving observability in ML systems is understanding why a given outcome occurred, even if the data is correct. Bias in ML systems has a massive societal impact, from criminal sentencing to resume filtering. Businesses making decisions based on automated algorithms must be able to justify why and how a given outcome was reached, and that it wasn’t the result of bias. Being able to ask questions about the behavior of an ML system is essential, and that’s where observable ML systems come into play.

Structuring the MLOps Team

An effective MLOps team is cross-functional. It comprises skills and capabilities from five different roles:

Data Scientist/Machine Learning Researcher

The data scientist or ML researcher is responsible for the discovery and creation of the model using a combination of algorithms and data.

Data Engineer

Data engineers configure and maintain data infrastructure and pipelines supporting applications and information systems. This role has evolved significantly over the last three years, expanding into a standalone function in many enterprises.

Infrastructure and Operations Engineer

I&O engineers, also called DevOps engineers and sometimes site reliability engineers (SREs), are responsible for the reliability and resiliency of infrastructure and data. These are often the roles responsible for the deployment and monitoring of machine learning models, and the data they consume, in production environments.

Product Manager

The product manager identifies customer needs and the business case for machine learning, and manages the delivery of the product. The involvement of this role is often the determining factor in creating a successful and well-integrated machine learning application. Without product management, many ML projects remain irrelevant lab projects.

Business Stakeholder

Put simply, the stakeholder is the person (or group of people) with an interest in the outcome of the ML deployment. This is commonly where the budget for the project comes from.

The Machine Learning Pipeline

With a cross-functional team in place to fund, develop and deploy models, the focus turns to the ML pipeline. The idea of an ML pipeline comes from the data engineering concept of a data pipeline. A data pipeline connects a data source and a destination and defines the data transformations in a graph of dependencies. In many cases, this is an evolution of traditional extract, transform and load (ETL), since it can go beyond batch processing to streaming and event-based consumption patterns. Data pipelines are also automated, freeing data engineers from repetitive, redundant work.
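The "graph of dependencies" idea can be sketched in a few lines: each step declares which steps it depends on, and a topological sort decides the execution order. This is a minimal illustration using Python's standard `graphlib` module (Python 3.9+); the step names and the `run_pipeline` helper are made up for the example and do not represent any particular orchestration tool.

```python
from graphlib import TopologicalSorter
from typing import Callable, Dict, Set


def run_pipeline(
    steps: Dict[str, Callable[[dict], object]],
    depends_on: Dict[str, Set[str]],
) -> dict:
    """Run each step after its dependencies, passing earlier results along."""
    results: dict = {}
    # static_order() yields every node only after all of its predecessors.
    for name in TopologicalSorter(depends_on).static_order():
        results[name] = steps[name](results)
    return results


# Hypothetical three-step pipeline: extract raw strings, clean them, load a total.
steps = {
    "extract": lambda r: [" 3", "4 ", "x", "5"],
    "clean": lambda r: [s.strip() for s in r["extract"] if s.strip().isdigit()],
    "load": lambda r: sum(int(s) for s in r["clean"]),
}
depends_on = {"extract": set(), "clean": {"extract"}, "load": {"clean"}}

results = run_pipeline(steps, depends_on)
```

Declaring dependencies instead of hard-coding the order is what lets a pipeline be automated: the runner, not a person, works out what can execute next.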

Most ML environments rely on manual processes, and for good reason. Today’s machine learning workflows consist of dozens of tools, each with its own languages, user experiences, performance characteristics and skillset assumptions. The data used by these tools is equally diverse, residing in data warehouses, object stores and feature stores.

This is where the DevOps influence brings value to ML deployments. Given the number of tools used in machine learning pipelines, continuous integration tests must be automated across a range of tools — ensuring that every step validates its input and output, and that the next step can consume the previous output. This process must be reproducible, which is where version tracking of models and integration code comes in. Ideally, there would be solutions that allowed us to version control data as well, but the volumes used for training often make this impossible. There is no consensus yet on how to achieve data versioning for machine learning.
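To illustrate the point that every step should validate its input and output, here is a small sketch of a wrapper that checks records against a declared schema before and after a step runs. The schema format (field name mapped to Python type), the helper names and the `scale_age` step are assumptions made for the example; teams often use a dedicated validation library for this instead.

```python
from typing import Any, Callable, Dict, List

Record = Dict[str, Any]
Schema = Dict[str, type]


def check_schema(records: List[Record], schema: Schema, where: str) -> None:
    """Raise if any record is missing a field or holds a value of the wrong type."""
    for i, rec in enumerate(records):
        for field, expected_type in schema.items():
            if field not in rec:
                raise ValueError(f"{where}: record {i} is missing '{field}'")
            if not isinstance(rec[field], expected_type):
                raise TypeError(
                    f"{where}: record {i} field '{field}' is not {expected_type.__name__}"
                )


def validated_step(
    fn: Callable[[List[Record]], List[Record]],
    input_schema: Schema,
    output_schema: Schema,
) -> Callable[[List[Record]], List[Record]]:
    """Wrap a pipeline step so its input and output contracts are enforced."""
    def wrapper(records: List[Record]) -> List[Record]:
        check_schema(records, input_schema, f"{fn.__name__} input")
        out = fn(records)
        check_schema(out, output_schema, f"{fn.__name__} output")
        return out
    return wrapper


# Hypothetical feature-engineering step and its contract.
def scale_age(records: List[Record]) -> List[Record]:
    return [{"age_scaled": r["age"] / 100} for r in records]


step = validated_step(scale_age, {"age": int}, {"age_scaled": float})
```

If the output schema of one step matches the input schema of the next, a continuous integration job can run these checks on a small sample and fail fast when a tool upgrade changes the shape of the data.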

Like other *Ops trends getting attention from vendors and end users, MLOps requires more than just technology investment. It requires collaboration across disparate parts of the organization to ensure the right problems are being solved with machine learning. Technology provides the environment to manage and observe deployed machine learning projects in collaboration with a strong team.


Source: InApps.net

