Service mesh is an emerging trend in the cloud native and microservices world, and its adoption across the tech industry is greater than ever. In this article, we will:
- Provide you with a thorough understanding of Service Mesh
- Guide you through a comprehensive comparison of the most popular Service Mesh platforms
Without further ado, let’s dive right in.
What are Service Mesh platforms?
Service mesh is an infrastructure layer that enables secure service-to-service communication. It relies on lightweight network proxies deployed alongside each microservice.
A centralized control plane orchestrates the proxies to manage traffic policies, security, and observability.
Even though service mesh is predominantly used with microservices packaged as containers, it can also be integrated with VMs and even physical servers.
By leveraging the traffic policies of service mesh efficiently, applications running across multiple environments can be seamlessly integrated. This factor makes service mesh one of the key enablers of the hybrid cloud and multi-cloud.
What problems does Service Mesh solve?
The shift to microservices comes with its own set of challenges. If architecting, designing, and developing microservices is complex, deploying and managing them is no less so.
Developers need to ensure that communication across services is secure. They also need to implement distributed tracing that shows how long each invocation takes.
Best practices of distributed services, such as retries and circuit breakers, bring resiliency to services. Microservices are typically polyglot and use disparate libraries and SDKs.
Writing generic, reusable software to manage inter-service communication across different protocols such as HTTP, gRPC, and GraphQL is complex, expensive, and time-consuming.
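To give a sense of what developers would otherwise write by hand in every language, here is a minimal circuit breaker sketch in Python. It is an illustration only, not how any particular mesh implements the pattern. The class name, the thresholds, and the injectable clock are assumptions made for this example.

```python
import time


class CircuitBreaker:
    """Minimal circuit breaker: opens after consecutive failures, retries after a cooldown."""

    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.clock = clock  # injectable so the cooldown can be tested without waiting
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                raise RuntimeError("circuit open: call rejected without reaching the service")
            # Cooldown elapsed: let one trial call through (half-open state).
            self.opened_at = None
            self.failures = self.max_failures - 1
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()
            raise
        # A successful call closes the circuit and resets the failure count.
        self.failures = 0
        return result
```

A caller would wrap each outbound request, for example breaker.call(fetch_orders). With a service mesh, the equivalent logic lives in the proxy, so the application code stays free of it.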
After a microservices-based application is deployed, day two operations are performed by the DevOps teams. They need to monitor the service health, latency, logs, events, and tracing.
DevOps teams are also expected to implement policy-based routing to configure blue/green deployments, canary releases, and rolling upgrades.
Finally, the metrics, events, logs, and alerts originating from multiple microservices need to be aggregated and integrated with existing observability and monitoring stacks.
Service Mesh attempts to solve these problems for developers and operators. Cloud-native advocates recommend using Service Mesh when running microservices in production environments.
Service mesh frees developers from building language-specific SDKs and tools to manage inter-service communication.
For operators, service mesh delivers out-of-the-box traffic policies, observability, and insights from the stack.
The advantages of using Service Mesh
The best thing about a service mesh is that it is “zero-touch” software that doesn’t force changes in application code or configuration.
By leveraging the sidecar pattern, a service mesh injects a proxy alongside every service, and that proxy acts as an agent for the host service. Since the proxy intercepts every inbound and outbound call, it gains unmatched visibility into the call stack.
Each proxy associated with a service sends the telemetry collected from the call stack to a centralized component which also acts as a control plane.
When operators configure a traffic policy, they submit it to the control plane which pushes that into the proxy to influence the traffic.
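The relationship between the control plane and the proxies can be sketched as a toy model in Python. This is a simplified illustration under stated assumptions, not a real mesh API: the class names, the timeout_s policy key, and the call counter are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class SidecarProxy:
    service: str
    policy: dict = field(default_factory=dict)
    calls: int = 0

    def handle(self, request: str) -> str:
        # Every inbound or outbound call passes through the proxy, so it can count it.
        self.calls += 1
        timeout = self.policy.get("timeout_s", "default")
        return f"{self.service} handled {request} (timeout={timeout})"


@dataclass
class ControlPlane:
    proxies: dict = field(default_factory=dict)

    def register(self, proxy: SidecarProxy) -> None:
        self.proxies[proxy.service] = proxy

    def apply_policy(self, service: str, policy: dict) -> None:
        # Operators submit the policy here; the control plane pushes it to the proxy.
        self.proxies[service].policy = dict(policy)

    def telemetry(self) -> dict:
        # Pull-based for brevity; in the description above the proxies send telemetry themselves.
        return {name: proxy.calls for name, proxy in self.proxies.items()}
```

The point of the sketch is the direction of flow: policies travel from the control plane to the proxies, and telemetry travels back, while the service itself is never modified.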
Site Reliability Engineers (SREs) leverage the observability of the service mesh to gain insights into the application.
Service mesh integrates with an existing API gateway or a Kubernetes ingress controller. While the API gateway and ingress tackle the north-south traffic, a service mesh is responsible for the east-west traffic.
The most popular Service Mesh platforms compared
Faced with a pool of popular Service Mesh platforms, you might wonder which one is worth selecting. For example, is Kuma or Istio the better choice? Will Consul outperform Istio?
By comparing and contrasting the most popular Service Mesh platforms, InApps hopes that by the end of the post you will find your own answer.
AWS App Mesh
Launched at AWS re:Invent 2018, AWS App Mesh is designed to bring the benefits of a service mesh to Amazon Web Services’ compute and container services.
Because AWS App Mesh can be easily configured with Amazon EC2, Amazon ECS, AWS Fargate, Amazon EKS, and even AWS Outposts, it is considered one of the most popular Service Mesh platforms.
Since App Mesh can act as a service mesh for both VMs and containers, Amazon created an abstraction layer based on virtual services, virtual nodes, virtual routers, and virtual routes.
A virtual service represents an actual service deployed in a VM or a container. Each version of a virtual service is mapped to a virtual node.
There is a one-to-many relationship between a virtual service and its virtual nodes. When a new version of a microservice is deployed, it is simply configured as a virtual node.
Similar to a network router, a virtual router acts as an endpoint for the virtual node. The virtual router has one or more virtual routes that adhere to the traffic policies and retry policies.
A mesh object acts as a logical boundary for all the related entities and services.
A proxy is associated with each service participating in the mesh which handles all the traffic flowing within the mesh.
Real-world example of AWS App Mesh
Let’s assume that we are running two services in AWS – servicea.apps.local and serviceb.apps.local.
We can easily mesh-enable these services without modifying the code.
We notice that service.apps.local has a virtual service, a virtual node, and a virtual router with two virtual routes that decide the percentage of traffic sent to v1 and v2 of the microservice.
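The way weighted virtual routes split traffic between two versions can be sketched in Python. This models the concepts only and is not the AWS App Mesh API: the class names mirror the terms used above, while the pick method, the 90/10 weights, and the node names v1 and v2 are assumptions for illustration.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class VirtualNode:
    name: str  # one deployed version of the service


@dataclass(frozen=True)
class VirtualRoute:
    target: VirtualNode
    weight: int  # relative share of traffic sent to the target


@dataclass
class VirtualRouter:
    routes: list

    def pick(self, rng=random) -> VirtualNode:
        # Weighted random choice across the routes.
        weights = [route.weight for route in self.routes]
        return rng.choices(self.routes, weights=weights, k=1)[0].target


v1, v2 = VirtualNode("v1"), VirtualNode("v2")
router = VirtualRouter([VirtualRoute(v1, 90), VirtualRoute(v2, 10)])
```

Each call to router.pick() selects the version that should receive the next request, so over many requests roughly nine in ten land on v1. Shifting the weights step by step is what turns this into a canary release.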
For a detailed explanation of AWS App Mesh, refer to my previous article and the tutorial.
Like most of the service mesh platforms, AWS App Mesh also relies on the open-source Envoy proxy data plane.
The App Mesh control plane is built with AWS compute services in mind. Amazon has also customized the Envoy proxy to support this control plane.
When using AWS App Mesh with Amazon EKS, you get the benefits of automated sidecar injection along with the ability to define the App Mesh entities in YAML. Amazon has built CRDs for EKS to simplify the configuration of App Mesh with standard Kubernetes objects.
The telemetry generated by AWS App Mesh can be integrated with Amazon CloudWatch. The metrics may be exported to third-party services such as Splunk, Prometheus, and Grafana, as well as open-tracing solutions like Zipkin and LightStep.
For customers using AWS compute services, there is no additional charge for AWS App Mesh.
Consul
Consul from HashiCorp was launched as a service discovery platform with a built-in key/value store. It acts as an efficient, lightweight load balancer for services running within the same host or in a distributed environment. Consul exposes a DNS query interface for discovering the registered services. It also performs health checks for all the registered services.
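As a rough sketch of service discovery combined with health checks, the Python snippet below asks a Consul agent for the healthy instances of a service. Note that it uses Consul's HTTP health endpoint (/v1/health/service) rather than the DNS interface mentioned above, because that needs only the standard library. The agent address 127.0.0.1:8500 and the function names are assumptions for this example.

```python
import json
from urllib.request import urlopen


def healthy_endpoints(entries):
    """Extract (address, port) pairs from a /v1/health/service response body."""
    endpoints = []
    for entry in entries:
        service = entry["Service"]
        # An empty service address means the service listens on the node's address.
        address = service.get("Address") or entry["Node"]["Address"]
        endpoints.append((address, service["Port"]))
    return endpoints


def discover(service_name, consul_url="http://127.0.0.1:8500"):
    # passing=true limits the result to instances whose health checks pass.
    url = f"{consul_url}/v1/health/service/{service_name}?passing=true"
    with urlopen(url) as response:
        return healthy_endpoints(json.load(response))
```

Calling discover("web") against a running agent would return a list of address and port pairs that a client could load-balance across.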
Consul was created long before containers and Kubernetes became mainstream. But the rise of microservices and service mesh prompted HashiCorp to augment Consul into a full-blown service mesh platform. Consul leverages its service mesh feature called Connect to provide service-to-service connection authorization and encryption using mutual Transport Layer Security (TLS).
A step-by-step guide to implementing Consul
Since the sidecar pattern is the most preferred approach to service mesh, Consul Connect comes with its own proxy to handle inbound and outbound service connections. Based on a plugin architecture, Envoy can be used as an alternative proxy for Consul.
Connect adds two essential capabilities to Consul: security and observability.
By default, Consul adds a TLS certificate to the service endpoints to implement mutual TLS (mTLS). This ensures that the service-to-service communication is always encrypted.
Security policies are implemented through intentions that define access control for services and are used to control which services may establish connections.
Intentions can either deny or allow traffic originating from a specific service. For example, a database service can deny the inbound traffic coming directly from the web service but allow the request made via the business logic service.
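The database example above can be sketched as a small rule check in Python. This is a deliberately simplified model, not Consul's actual evaluation logic: here the first matching rule wins, whereas Consul applies its own precedence rules. The Intention class, the is_allowed function, and the service names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Intention:
    source: str
    destination: str
    allow: bool


def is_allowed(intentions, source, destination, default=False):
    """Return the decision of the first matching intention, else the default."""
    for rule in intentions:
        if rule.source in (source, "*") and rule.destination in (destination, "*"):
            return rule.allow
    return default


rules = [
    Intention("web", "database", allow=False),
    Intention("business-logic", "database", allow=True),
]
```

With these rules, is_allowed(rules, "web", "database") is False while is_allowed(rules, "business-logic", "database") is True, which matches the scenario described in the text.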
When Envoy is used as a proxy with Consul Connect, it takes advantage of the L7 observability features. Envoy integrated with Consul Connect can be configured to send telemetry to a variety of sinks, including statsd, dogstatsd, and Prometheus.
Depending on the context, Consul can act as a client (agent) or a server. It supports sidecar injection when integrated with orchestrators such as Nomad and Kubernetes.
There is a Helm chart to deploy Consul Connect in Kubernetes. The Consul Connect configuration and metadata are added as annotations to the pod spec submitted to Kubernetes.
It can integrate with Ambassador, an ingress controller from Datawire that handles the north-south traffic.
Consul lacks advanced traffic routing and splitting capabilities for implementing blue/green deployments or canary releases. Compared to other service mesh choices, its security and traffic policies are not very flexible.
With the integration of Envoy, some of the advanced routing policies may be configured, but Consul Connect doesn’t offer an interface for that.
Istio
Istio is one of the most popular service mesh platforms backed by Google, IBM, and Red Hat.
It is also one of the first service mesh technologies to use Envoy as the proxy. This platform follows the standard approach of a centralized control plane and distributed data plane associated with microservices.
Though Istio can be used with virtual machines, it’s predominantly integrated with Kubernetes. Pods deployed in a specific namespace can be configured for automatic sidecar injection, where Istio attaches the data plane component to each pod.
Three capabilities of Istio:
Istio delivers three chief capabilities to microservices developers and operators:
1. Traffic management: Istio simplifies the configuration of service-level attributes such as circuit breakers, timeouts, and retries, and makes it easy to implement configurations like A/B testing, canary rollouts, and staged rollouts with percentage-based traffic splits.
It also provides out-of-the-box failure recovery features that help make your application more robust against failures of dependent services or the network.
Istio comes with its own Ingress that handles the north-south traffic. For an end-to-end guide on implementing blue/green deployments with Istio, refer to my past tutorial.
2. Security: Istio provides out-of-the-box security capabilities for inter-service communication. It provides the underlying secure communication channel and manages authentication, authorization, and encryption of service communication at scale.
With Istio, service communications are secured by default, letting developers and operators enforce policies consistently across diverse protocols and runtimes with no code or configuration changes.
3. Observability: Since Istio’s data plane intercepts the inbound and outbound traffic, it has visibility into the current state of deployment.
Istio delivers robust tracing, monitoring, and logging features that provide deep insights into the service mesh deployment.
Istio comes with integrated and pre-configured Prometheus and Grafana dashboards for observability. Refer to my tutorial on configuring and accessing Istio’s observability dashboards.
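To make the retry behavior listed under traffic management more concrete, here is a minimal Python sketch of retries with exponential backoff. It shows the general technique that a mesh proxy applies on behalf of the application, not Istio's implementation or configuration format. The function name, the attempt count, and the delay values are assumptions for this example.

```python
import time


def call_with_retries(func, attempts=3, base_delay=0.1, sleep=time.sleep):
    """Call func, retrying on any exception with exponential backoff between attempts."""
    for attempt in range(attempts):
        try:
            return func()
        except Exception:
            if attempt == attempts - 1:
                raise  # retry budget exhausted: surface the last error
            # Wait base_delay, then 2x, then 4x, and so on before the next attempt.
            sleep(base_delay * (2 ** attempt))
```

In a mesh, this policy is declared once per service and enforced by the sidecar, so no service needs to carry a helper like this in its own code.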
Google and IBM offer managed Istio as a part of their hosted Kubernetes platforms. Google built Knative as a serverless compute environment based on Istio. For Google services such as Anthos and Cloud Run, Istio has become the core foundation.
When compared to other offerings, Istio is considered to be a complex and heavy service mesh platform. But the extensibility and rich capabilities make it the preferred platform for enterprises.
Kuma
Launched in September 2019, Kuma is one of the more recent entrants into the service mesh ecosystem. Kuma is a well-designed, clean implementation of a service mesh. Its integration with Kong Gateway may drive its adoption among existing users and customers.
Like most of the service mesh platforms, Kuma comes with separate data plane and control plane components.
The control plane is the core enabler for the service mesh. It holds the source of truth for all the service configurations and is designed to scale to tens of thousands of services across an organization.
Kuma couples a fast data plane with an advanced control plane that allows users to easily set permissions, expose metrics and set routing policies through the Custom Resource Definitions (CRD) in Kubernetes or REST API.
Kuma’s data plane is tightly integrated with Envoy proxy which lets the data plane run in virtual machines or containers deployed in Kubernetes.
Overall, Kuma has two modes of deployment: 1) Universal and 2) Kubernetes. When running in Kubernetes, Kuma leverages the API server and etcd database to store the configuration. In universal mode, it needs an external PostgreSQL as the datastore.
kuma-cp, the control plane component, manages one or more data plane components (kuma-dp). Each microservice registered with the mesh runs its own copy of kuma-dp. In Kubernetes, kuma-cp runs as a deployment within the kuma-system namespace. A namespace that’s annotated for Kuma can have the data plane injected into each of its pods.
Kuma comes with a GUI that provides an overview of the deployment including the state of each data plane registered with the control plane. The same interface can be used to view the health checks, traffic policies, routes, and traces from the proxies attached to the microservices.
Kuma service mesh has a built-in CA that’s used to encrypt the traffic based on mTLS. Traffic permissions can be configured based on labels associated with the microservices. Tracing can be integrated with Zipkin while metrics can be redirected to Prometheus.
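Label-based traffic permissions can be illustrated with a short selector-matching sketch in Python. This is a conceptual model only and does not reproduce Kuma's policy schema: the dictionary layout, the "*" wildcard handling, and the function names are assumptions for this example.

```python
def matches(selector, labels):
    """A selector matches when each key it names is present with the same value ("*" matches any value)."""
    return all(
        key in labels and value in ("*", labels[key])
        for key, value in selector.items()
    )


def permitted(permissions, source_labels, destination_labels):
    # Traffic is allowed when any permission matches both ends of the connection.
    return any(
        matches(p["source"], source_labels) and matches(p["destination"], destination_labels)
        for p in permissions
    )


permissions = [
    {"source": {"service": "web"}, "destination": {"service": "backend", "version": "*"}},
]
```

Here, a workload labeled service=web may call any version of backend, while every other pairing is rejected because no permission matches it.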
At the time of writing, some of the advanced resilience features such as circuit breaking, retries, fault injection, and delay injection are missing in Kuma.
Linkerd 2.x
Linkerd 2.x is an open-source service mesh built exclusively for Kubernetes by Buoyant. It’s licensed under the Apache License 2.0 and is a Cloud Native Computing Foundation incubating project.
After Istio, Linkerd is one of the most popular service mesh platforms. It has the attention and mindshare of developers and operators considering a lightweight and easy-to-use service mesh.
Linkerd is an ultra-lightweight and easy-to-install service mesh platform. It has three components: 1) CLI and UI, 2) control plane, and 3) data plane.
Once the CLI is installed on a machine that can talk to a Kubernetes cluster, the control plane can be installed with a single command. All the components of the control plane are installed as Kubernetes deployments within the linkerd namespace. The web and CLI tools use the API server of the controller.
Linkerd comes with pre-configured Prometheus and Grafana components providing out-of-the-box dashboards.
The data plane has a lightweight proxy that attaches itself to the service as a sidecar. There is a Kubernetes Init Container to configure the iptables to define the flow of traffic and connect the proxy to the control plane.
Linkerd covers all the core attributes of a service mesh: traffic routing and splitting, security, and observability.
It’s interesting to note that Linkerd doesn’t use Envoy as the proxy. Instead, it relies on a purpose-built, lightweight proxy written in Rust.
Linkerd doesn’t have an ingress built into the stack but it can work in conjunction with an ingress controller.
Maesh
Maesh comes from Containous, the company that built the popular ingress, Traefik. Similar to Kong, Inc., Containous built Maesh to complement Traefik. Like Kuma, Maesh can also work with other ingress controllers.
Maesh takes a different approach compared to other service mesh platforms. It doesn’t use a sidecar pattern to manipulate the traffic.
Instead, it deploys a pod per Kubernetes node to provide a well-defined service endpoint. Microservices can continue to work as is even when Maesh is deployed.
But when they use the alternative endpoint exposed by Maesh, they can take advantage of the service mesh capabilities.
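The opt-in model boils down to which hostname a client calls. The Python sketch below illustrates the idea, assuming that the mesh endpoint follows a service.namespace.maesh naming scheme. Treat that suffix, the function name, and the example service as assumptions to verify against the Maesh documentation.

```python
def mesh_endpoint(service, namespace, opt_in=True, mesh_suffix="maesh"):
    """Return the DNS name a client would call for a Kubernetes service."""
    if opt_in:
        # Assumed naming scheme for the endpoint exposed by the mesh.
        return f"{service}.{namespace}.{mesh_suffix}"
    # Regular cluster DNS name: traffic bypasses the mesh entirely.
    return f"{service}.{namespace}.svc.cluster.local"
```

A client that calls mesh_endpoint("orders", "shop") would go through the mesh, while one that keeps using the regular cluster name behaves exactly as before, which is what makes the adoption incremental.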
The objective of Maesh is to provide a non-intrusive and non-invasive infrastructure that provides an opt-in capability to developers. But it also means that the platform lacks some of the key capabilities such as transparent TLS encryption.
Maesh supports baseline service mesh features such as routing and observability, but not security. It supports the latest specs defined by the Service Mesh Interface (SMI) project.
You’ve just read the article “Most Popular Service Mesh Platforms: A Complete Comparison”.
We hope that the article has given you fundamental knowledge about Service Mesh and helped you find “the one” that best suits your needs. If you find this blog post helpful, share it with others who have the same concerns.
InApps is an outsourcing and development company located in Vietnam. We’ve become a trusted partner of many businesses across different industries, from the US, the UK, and Europe to Hong Kong.
More than being a partner, we aim to become the business’s companion. Therefore we continue to update helpful articles for those who are in need.