Make SRE More Proactive by Shifting Left – InApps Technology
Key Summary
This article, authored by Andreas Grabner of Dynatrace, discusses how AIOps (Artificial Intelligence for IT Operations) can transform Site Reliability Engineering (SRE) by adopting a proactive, “shift-left” approach. Presented in the context of DevOps, it addresses the limitations of traditional manual troubleshooting and older AIOps solutions, proposing a more integrated, automated strategy to enhance software delivery and system reliability. Key points include:
- Current Challenges in DevOps:
- Manual Troubleshooting: According to the Puppet State of DevOps Report and Dynatrace Autonomous Cloud Survey, 90% of organizations rely on manual troubleshooting and remediation, which is unsustainable with the expected tenfold increase in production deployments over the next 12 months (as of 2022).
- Dynamic Environments: Modern multicloud, containerized, microservices-based architectures with frequent deployments (e.g., blue/green, canary, feature flags) make root-cause analysis complex due to millions of dependencies.
- Limitations of Gen 1 AIOps:
- How It Works: Early AIOps solutions ingested logs, metrics, and traces to find correlations for root-cause analysis, suitable for low-frequency, predictable deployments.
- Shortcomings: Struggle in dynamic, high-frequency deployment environments where correlating data across numerous services is inefficient, failing to provide fast, precise insights.
- Shifting AIOps Left:
- Concept: Integrate AIOps into development, testing, and pre-production stages (shifting left) to create test-driven operations, similar to test-driven development.
- Implementation:
- Use Keptn (a CNCF open-source project) to orchestrate pre-production environments, where AIOps monitors load tests, chaos engineering, and auto-remediation scripts.
- Validate AIOps’ ability to detect anomalies and trigger remediation before production, ensuring proactive fixes rather than reactive responses after user issues (a minimal sketch of such a validation gate follows this summary).
- Benefits: Reduces downtime, improves mean time to repair (MTTR), and ensures consistent digital experiences by battle-testing remediation in chaotic scenarios.
- Integration with Platforms and Processes:
- Holistic Approach: Embed AIOps into CI/CD pipelines, development, testing, and SRE practices to automatically learn and adapt to intentional/unintentional behavior changes.
- Outcome: Enhances anomaly detection and auto-remediation, enabling faster, more reliable software delivery and healthier production systems.
- Future Outlook:
- A follow-up article will explore additional AIOps best practices, but this shift-left approach lays the foundation for proactive SRE, aligning with modern DevOps needs.
- InApps Insight:
- Shifting AIOps left aligns with modern DevOps trends, enabling proactive reliability and scalability in complex, cloud-native environments.
- InApps Technology can integrate AIOps solutions like Dynatrace and tools like Keptn into client workflows, enhancing CI/CD pipelines and ensuring robust, automated operations for microservices-based applications.
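The validation step above can be pictured as a simple quality gate: measurements taken while exercising a pre-production stage are compared against SLO thresholds, and the build is only promoted when every objective holds. The sketch below is illustrative only; the metric names and thresholds are assumptions, not values from the article or from any specific tool.

```python
# Minimal sketch of an SLO-style quality gate, assuming illustrative metric
# names and thresholds; real gates (e.g. Keptn quality gates) pull these
# values from the monitoring/AIOps backend rather than hard-coded samples.

SLOS = {
    "error_rate_percent": 1.0,    # error rate must stay below 1%
    "p95_latency_ms": 500.0,      # 95th-percentile latency under 500 ms
}

def evaluate_gate(measured: dict) -> bool:
    """Return True only if every measured value meets its SLO threshold."""
    failures = {
        name: value
        for name, value in measured.items()
        if value > SLOS.get(name, float("inf"))
    }
    for name, value in failures.items():
        print(f"SLO breached: {name}={value} (limit {SLOS[name]})")
    return not failures

# Example: metrics captured while load-testing a pre-production stage.
sample = {"error_rate_percent": 0.4, "p95_latency_ms": 620.0}
print("promote" if evaluate_gate(sample) else "hold back and remediate")
```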

Andreas Grabner
Andreas is a DevOps activist at Dynatrace. He has over 20 years of experience as a software developer, tester and architect, and is an advocate for high-performing cloud operations. As a champion of DevOps initiatives, Andreas is dedicated to helping developers, testers and operations teams become more efficient in their jobs with Dynatrace’s software intelligence platform.
Many organizations are turning to AIOps in hopes of creating better, more secure software faster. But the ability to create robust and fast software delivery pipelines is constantly hampered by the need to troubleshoot and remediate issues in production environments manually. According to both the Puppet State of DevOps Report and the Dynatrace Autonomous Cloud Survey, that is still the approach 90% of organizations are taking.
At the same time, these surveys also show that organizations expect to grow the frequency of production deployments tenfold over the next 12 months. This is almost certainly doomed to fail, if 90% of these organizations continue to rely on manual troubleshooting, remediation and root-cause analysis.
Organizations have begun to tap into the potential for AIOps to reduce this level of manual work and provide faster, automated solutions that yield more precise insights into the performance and security of their applications, microservices and infrastructure. Not all AIOps solutions, however, are equal. Older “Gen 1” solutions — solutions that try to find patterns across independent, disconnected data sources — are not as efficient or effective at helping teams create better software faster as they could, or should, be.
In this article, and an accompanying article I’ll post later this month, I will describe what it looks like to deploy AIOps “the right way,” to ensure that you’re deriving maximum value from your AIOps solutions and identify where older iterations may have gone awry. To start, I’ll break down why Gen 1 AIOps solutions did not deliver this value and then outline a few examples of how AIOps is done best, beginning with shifting AIOps left to create more “test-driven operations.”
Why Gen 1 AIOps Solutions Fall Short
The first wave of AIOps solutions provided observability by ingesting data, including logs, metrics and traces, and analyzing this data for possible correlations to explain the root cause of technical problems or changed user behavior. At the time, IT teams could count the deployment and configuration changes affecting production workloads on an annual basis, so this use of AIOps worked well for that relatively small number of changes. Because the frequency of changes was so low and predictable, it was easier for ITOps teams to manage maintenance windows and keep downtime and mean time to repair (MTTR) to a minimum.
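As a rough illustration of that correlation-centric approach, the sketch below flags metric pairs that move together during an incident window; the metric names and values are invented for the example. With a handful of slow-changing services this kind of pairwise comparison is tractable, which is part of why it worked when deployments were infrequent.

```python
# Rough sketch of the Gen 1 idea: pairwise correlation between metric series
# to suggest which signals moved together during an incident. Metric names
# and values are made up for illustration.

from itertools import combinations
from math import sqrt

def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0

metrics = {
    "checkout_error_rate": [0.1, 0.1, 0.2, 2.5, 3.1, 2.8],
    "db_connection_count": [40, 42, 41, 95, 110, 102],
    "cpu_user_percent":    [35, 36, 34, 37, 36, 35],
}

# Flag strongly correlated pairs as candidate explanations.
for (a, sa), (b, sb) in combinations(metrics.items(), 2):
    r = pearson(sa, sb)
    if abs(r) > 0.9:
        print(f"{a} and {b} move together (r={r:.2f})")
```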
But that is not the environment digital teams are living in today. Now, production deployments are counted in days, not years. Multicloud environments have grown increasingly dynamic and containerized. Most new application architectures leverage microservices that are deployed as containers in multicluster, multicloud environments, making it even harder to keep track of changes and find root causes.
Teams are moving toward progressive delivery models for deployments (blue/green, canary, feature flags), where instead of replacing entire systems, individual services are upgraded and replaced with new iterations on a piecemeal basis. Environments change too quickly for correlation-based machine learning algorithms to establish a baseline of what’s normal. Also, with potentially millions or billions of dependencies between applications, infrastructure, containers and microservices, it’s harder to correlate logs, metrics, and traces for conclusions. There are too many services involved.
As dynamic multicloud environments drive new changes in delivery and operations, AIOps must adapt accordingly for DevOps teams and site reliability engineers (SREs) to maximize the value they, and their organization, can get out of it. In other words, teams need to ensure they’re doing AIOps the right way.
Tighter Integration Between Processes and Platforms
A more dynamic, comprehensive approach to AIOps goes beyond simply updating your AIOps tools. It means integrating AIOps solutions into everything — development processes, testing, DevOps and SRE practices — and embedding it within your internal platforms. Closing the gap between your AIOps solutions and your internal platforms and processes is what enables AIOps to precisely, and automatically, absorb and learn about both intentional and unintentional behavior changes occurring in your CI/CD pipelines.
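One concrete way to close that gap is to have the CI/CD pipeline announce every intentional change to the AIOps backend, so the AI can separate expected behavior shifts from genuine anomalies. The sketch below assumes a hypothetical events endpoint, token and payload schema; it is not the API of any particular product.

```python
# Hedged sketch: notify the monitoring/AIOps backend about an intentional
# change from the CI/CD pipeline. The endpoint, token and payload schema are
# hypothetical placeholders; real tools each define their own
# deployment-event APIs.

import os
import requests

AIOPS_EVENTS_URL = os.environ.get(
    "AIOPS_EVENTS_URL", "https://aiops.example.com/api/events"  # placeholder
)

def announce_deployment(service: str, version: str, stage: str) -> None:
    """Tell the AIOps backend that an intentional change is rolling out,
    so behavior shifts that follow are not treated as unexplained anomalies."""
    payload = {
        "eventType": "deployment",
        "service": service,
        "version": version,
        "stage": stage,
        "source": "ci-pipeline",
    }
    resp = requests.post(
        AIOPS_EVENTS_URL,
        json=payload,
        headers={"Authorization": f"Bearer {os.environ.get('AIOPS_TOKEN', '')}"},
        timeout=10,
    )
    resp.raise_for_status()

if __name__ == "__main__":
    announce_deployment("checkout", "1.42.0", "staging")
```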
The more that ITOps teams can leverage AIOps as part of chaos engineering, the more battle-tested and validated those solutions become at anomaly detection. That validation then gives teams confidence in their AIOps solution’s ability to auto-remediate issues in production environments. If it can handle itself in chaotic scenarios, its automated anomaly detection can deliver fast, precise answers — along with the remediation to back them up — in any situation.
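A chaos-style validation of that kind can be as small as the hedged sketch below: delete one pod in a replicated staging workload, then assert that the AIOps backend opens a problem within a time budget. The kubectl usage is standard; the problems endpoint, its response shape and the pod name are assumed placeholders.

```python
# Hedged sketch of "battle-testing" anomaly detection with a chaos experiment:
# kill one pod, then assert that the AIOps backend opens a problem within a
# time budget. The problems endpoint and its response shape are hypothetical.

import subprocess
import time
import requests

PROBLEMS_URL = "https://aiops.example.com/api/problems?status=open"  # placeholder

def inject_chaos(pod: str, namespace: str) -> None:
    # Deleting a pod is a simple, reversible fault for a replicated workload.
    subprocess.run(["kubectl", "delete", "pod", pod, "-n", namespace], check=True)

def detected_within(seconds: int) -> bool:
    deadline = time.time() + seconds
    while time.time() < deadline:
        open_problems = requests.get(PROBLEMS_URL, timeout=10).json()
        if open_problems:              # assumed: list of open problem records
            return True
        time.sleep(15)
    return False

inject_chaos("carts-7d9f-abc12", "staging")   # hypothetical pod name
assert detected_within(300), "AIOps did not detect the injected fault in time"
```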
Creating More Proactive ‘Test-Driven Operations’
SREs use service-level objectives (SLOs) to validate and track how systems behave in production, under different workloads or conditions, and write auto-remediation scripts to make whatever adjustments are needed to maintain availability and a consistent digital experience. But this is a reactive position, so engineers are often only deploying the auto-remediation code after a user has had a problem and their digital experience has been compromised.
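A typical auto-remediation script of this kind is short; the issue the article raises is when it first runs, not what it does. The hedged sketch below checks one error-rate SLO and scales a deployment out when it is breached; the metrics endpoint, service name and replica count are illustrative assumptions.

```python
# Minimal sketch of a reactive auto-remediation script: if the error-rate SLO
# is breached, scale the deployment out. The metrics endpoint is a
# hypothetical placeholder; kubectl scale is a standard command.

import subprocess
import requests

METRICS_URL = "https://metrics.example.com/api/error_rate?service=checkout"  # placeholder
SLO_ERROR_RATE_PERCENT = 1.0

def current_error_rate() -> float:
    # Assumed response shape: {"error_rate_percent": 2.3}
    return requests.get(METRICS_URL, timeout=10).json()["error_rate_percent"]

def remediate() -> None:
    # Simplest possible remediation: add capacity to the affected deployment.
    subprocess.run(
        ["kubectl", "scale", "deployment/checkout", "--replicas=6", "-n", "production"],
        check=True,
    )

if current_error_rate() > SLO_ERROR_RATE_PERCENT:
    remediate()
```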
Shifting AIOps left enables a more proactive approach, where resiliency and auto-remediation scripts are tested before they enter production. One way to do this: engineers can use Keptn, an open-source CNCF project, to orchestrate a pre-production environment monitored by the AIOps solution for load tests, chaos injection and auto-remediation validation. This is the “shift left” part: By integrating the AIOps solution into this “test-driven operations” environment, you validate the ability of AIOps to trigger auto-remediation scripts in the event of an issue. Rather than engineers having to script and deploy auto-remediation code after a user has experienced an issue, the AIOps tool can proactively deploy the fix immediately, because it’s already been battle-tested for those scenarios ahead of time.
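To make that flow concrete, the hedged sketch below outlines one test-driven operations run: generate load, inject a fault, and promote the build only if remediation was triggered and the SLO recovered. The helper scripts, endpoints and timings are assumptions, and in practice Keptn would orchestrate this sequence declaratively rather than from a hand-written script.

```python
# Hedged sketch of a "test-driven operations" run in a pre-production stage:
# generate load, inject a fault, and only promote the build if the AIOps
# tooling both triggered remediation and the SLO recovered. The helper
# scripts and endpoints below are hypothetical placeholders showing the flow.

import subprocess
import time
import requests

def run_load_test() -> None:
    # Placeholder: any load generator could sit behind this script.
    subprocess.run(["./scripts/run-load-test.sh", "staging"], check=True)

def inject_fault() -> None:
    subprocess.run(["./scripts/inject-latency.sh", "staging"], check=True)

def remediation_triggered() -> bool:
    # Assumed: the AIOps backend records which remediation sequences it ran.
    url = "https://aiops.example.com/api/remediations?stage=staging"  # placeholder
    return bool(requests.get(url, timeout=10).json())

def slo_recovered() -> bool:
    url = "https://metrics.example.com/api/slo-status?stage=staging"  # placeholder
    return requests.get(url, timeout=10).json().get("healthy", False)

run_load_test()
inject_fault()
time.sleep(120)  # give detection and remediation time to act

if remediation_triggered() and slo_recovered():
    print("remediation validated in staging -- safe to promote")
else:
    raise SystemExit("remediation not validated -- hold the release")
```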
In my next article, I’ll delve into a couple more examples of how engineers can leverage AIOps the right way, but this use case should hopefully begin to highlight how AIOps, when done right, helps ensure healthy systems in production. Just as test-driven development processes help developers create better quality code, test-driven operations will help engineers maintain more stable production systems and more consistent digital experiences for users, in turn driving more value for the organization overall.
Source: InApps.net