Site Reliability Engineering Is a Kind of Magic – InApps 2022

March 30, 2022 by Phu Nguyen

A Sobering Reality

It’s easy to throw out yet another three-letter acronym and claim it’s a magical elixir for all the problems involved with running complex IT systems. In reality, engineering reliability into distributed systems with thousands of containerized applications and microservices is a tough gig, not only because of all the moving parts but also because any preconceived notions about predictable system behavior no longer apply.

Take, for example, keeping watch over a modern software application. This might consist of business logic written in several languages and linked to the legacy ERP system (custom-built, packaged or both). There’ll also be a raft of databases: traditional relational for transactional support, yes, but more likely a smorgasbord of NoSQL data stores (in-memory, graph or document), perhaps fronted by recently adopted Node.js.

Some of this componentry will be on-premises, some will be containerized and moved to the public cloud. That might mean Docker and Kubernetes on AWS, but maybe Azure and Mesos. Heck, why not both for some hybrid-style resilience?

But like the old Monty Python sketch, “you’ll be lucky” if this is all you ever have to manage. Depending on the nature of the business, there’ll also be a glut of third-party services, including payment processing and reconciliation. That’s not to mention all the new web and mobile apps interacting with the core business systems through an API gateway, and possibly some analytics horsepower delivered by the likes of Hadoop and Elasticsearch. It’ll take a lot of operational wizardry to keep all that performant.

Fortune Favors the Bold

In a wonderful talk at SREcon earlier this year, Julia Evans from Stripe described the realities of managing today’s complex distributed systems. What was refreshing about her presentation was the open admission that she often finds the work difficult and how there’s always a ton of new stuff to learn. As she says in her abstract, she doesn’t always feel like a wizard (echoing the protestations of Harry Potter).

This honesty illustrates what’s exciting about being an SRE. With systems like the ones described above causing any number of thorny problems, it’ll be the inquisitive and the brave who keep business on track. Being an SRE isn’t for the faint-hearted or those happy with a fire-fighting status quo. It’s for those within our ranks who get bored easily: the super sleuths who keep asking reliability questions, crafting improvements and learning as they go.

So, consider a typical business-critical problem that could impact our modern application. Let’s say a latency issue is causing an increasing number of mobile app users to abandon a booking service. How would teams address the issue? Problems like this might go unnoticed for some time, or there could be a deluge of alarms. Even when a problem is identified, where do teams find the root cause? Is it a problem with a new code release or at the API gateway? Is it down to some weird microservices auto-scaling issue? And was that earlier CPU increase, the one we thought was OK, actually really bad?

With an SRE-style approach, business-critical problems are never addressed in knee-jerk fashion. Using modern tooling in areas such as application performance management and app analytics, SREs can observe the real-time behavior of applications, with systems collecting and correlating information from all related components. Rather than react after the fact, these solutions continuously identify anomalous patterns (like those mobile app abandonments) and compare them to historical trends, meaning SREs are alerted well before the business is impacted.
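To make the idea of comparing a live signal against historical trends a little more concrete, here is a minimal sketch in Python. It is not how any particular monitoring product works. The function name, the sample abandonment rates and the three-standard-deviation threshold are all assumptions made for illustration, and it uses a plain z-score where real tools would apply far richer models.

```python
from statistics import mean, stdev


def is_anomalous(history, current, threshold=3.0):
    """Return True when `current` sits more than `threshold` standard
    deviations above the mean of `history`.

    `history` is a list of past readings (for example, hourly booking
    abandonment rates). The threshold is an illustrative assumption.
    """
    if len(history) < 2:
        # Not enough data to estimate a baseline, so do not alert.
        return False
    baseline = mean(history)
    spread = stdev(history)
    if spread == 0:
        # A perfectly flat history: any increase counts as unusual.
        return current > baseline
    return (current - baseline) / spread > threshold


# Hypothetical abandonment rates from earlier hours.
past_rates = [0.10, 0.12, 0.11, 0.09, 0.10, 0.11, 0.12, 0.10]

print(is_anomalous(past_rates, 0.25))  # a sharp jump in abandonments
print(is_anomalous(past_rates, 0.11))  # within the usual range
```

The point of the sketch is the shape of the check rather than the statistics: a baseline is learned from history, and the alert fires on deviation from that baseline instead of on a fixed, hand-picked limit.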

But beyond exposing new-normal application weirdness and “unknown-unknowns,” modern tools also encourage and stimulate more of the SRE detective work, which is the really valuable stuff. These tools won’t just detect anomalies and then leave teams scrambling to find the needle in a haystack of needles. Instead, they’ll analytically gather all the evidence and lead teams in fact-based fashion towards a solution. For example, a team might use an SRE-inspired monitoring service to detect a performance anomaly introduced with a new software build and then trace it to the actual code causing the problem.
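As a rough illustration of tying a performance anomaly to a software build, the sketch below groups latency samples by build and flags any build whose median latency jumps well above the build before it. Everything here is hypothetical: the function name, the build identifiers, the sample latencies and the 1.5x tolerance are assumptions, not the behavior of any real tool. It also stops at the build level. Tracing down to the offending code would need the kind of tracing data a real APM product collects.

```python
from statistics import median


def find_regressed_builds(samples, tolerance=1.5):
    """Return build ids whose median latency exceeds `tolerance` times
    the median latency of the build deployed immediately before.

    `samples` is a list of (build_id, latency_ms) tuples, assumed to be
    ordered by deployment time. The tolerance is an illustrative value.
    """
    by_build = {}
    for build_id, latency_ms in samples:
        by_build.setdefault(build_id, []).append(latency_ms)

    # Dicts preserve insertion order, so this is the deployment order.
    builds = list(by_build)
    regressed = []
    for previous, current in zip(builds, builds[1:]):
        if median(by_build[current]) > tolerance * median(by_build[previous]):
            regressed.append(current)
    return regressed


# Hypothetical latency samples in milliseconds.
latency_samples = [
    ("build-41", 120), ("build-41", 130), ("build-41", 125),
    ("build-42", 128), ("build-42", 122),
    ("build-43", 310), ("build-43", 290), ("build-43", 305),
]

print(find_regressed_builds(latency_samples))
```

With the sample data above, only the last build shows a jump in median latency, which gives the team a concrete release to investigate rather than a haystack of needles.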

Like Harry Potter, operations professionals might have a hard time accepting they’re wizards. But ask yourself this: do you want to remain a silly muggle getting burnt out by constant fire-fighting? Of course not. It’s career-limiting and it sucks. Time, then, for some SRE magic: gaining the skills and tools needed to adopt new tech like containers and microservices, and becoming an essential part of future-proofing your business.

CA Technologies is a sponsor of InApps.

Feature image via Pixabay.

InApps is a wholly owned subsidiary of Insight Partners, an investor in the following companies mentioned in this article: Docker.

Source: InApps.net

