A Deep Dive into Architecting a Kubernetes Infrastructure – InApps 2022

March 23, 2022 by Phu Nguyen


The Architecture

Your architecture revolves around your use case, so take care to get it right and consult experts where needed. Getting it right before you start matters, but mistakes happen, and with so much research under way, a new development can make your old way of thinking obsolete at any time.

That is why I highly recommend that you architect for change and make your architecture as modular as possible, so that you have the flexibility to make incremental changes in the future if needed.

Let’s see how we would realize our goal of architecting our system with a client-server model in mind.

The Entry Point: DNS

In any typical infrastructure (cloud native or not), the hostname in a request first has to be resolved by a DNS server, which returns the IP address of the server. How you set up your DNS should be based on the availability you require. If you require higher availability, you may want to distribute your servers across multiple regions or cloud providers, depending on the level of availability you would like to achieve.
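To make the resolution step concrete, here is a minimal sketch using Python's standard `socket` module. The helper name `resolve_addresses` is hypothetical and not from the original article; it simply asks the system resolver for the addresses behind a hostname.

```python
import socket


def resolve_addresses(hostname: str) -> list[str]:
    """Return the sorted, de-duplicated addresses the system resolver gives for hostname."""
    infos = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the address.
    return sorted({info[4][0] for info in infos})


if __name__ == "__main__":
    # An IP literal resolves to itself without contacting a DNS server.
    print(resolve_addresses("127.0.0.1"))
```

A hostname served from several regions would typically return more than one address here, which is one way the availability decision shows up to clients.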

Content Delivery Network (CDN)

In some cases, you might need to serve users with as little latency as possible while also reducing the load on your servers. This is where a Content Delivery Network (CDN) plays a major role.

Does the client frequently request a set of static assets from the server? Are you aiming to improve the speed of delivery of content to your users while also reducing the load on your servers? In such cases, a CDN at the edge serving a set of static assets can help reduce both the latency for users and the load on your servers.

Is all your content dynamic? Are you fine with serving content to users with some level of latency in favor of reduced complexity? Or is your app receiving low traffic? In such cases, a CDN might not make much sense, and you can send all the traffic directly to the global load balancer. But do note that a CDN also distributes traffic, which can be helpful in the event of DDoS attacks on your server.

CDN providers include Cloudflare CDN, Fastly, Akamai CDN and StackPath, and there is a high chance that your cloud provider also offers a CDN service, such as Cloud CDN from Google Cloud Platform, CloudFront from Amazon Web Services, Azure CDN from Microsoft Azure, and the list goes on.
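The load-reduction argument can be illustrated with a toy edge cache. This is only a sketch under simple assumptions (a single cache node, a fixed TTL, time passed in explicitly); the class `EdgeCache` and its fields are hypothetical and do not model any real CDN product.

```python
from typing import Callable


class EdgeCache:
    """Toy edge cache: serves a stored copy until its TTL expires."""

    def __init__(self, origin: Callable[[str], str], ttl_seconds: float) -> None:
        self.origin = origin
        self.ttl_seconds = ttl_seconds
        self.origin_requests = 0  # how often the origin server was contacted
        self._store: dict[str, tuple[float, str]] = {}

    def get(self, path: str, now: float) -> str:
        cached = self._store.get(path)
        if cached is not None and now - cached[0] < self.ttl_seconds:
            return cached[1]  # cache hit: the origin is not contacted
        # Cache miss or expired entry: fetch from the origin and remember it.
        self.origin_requests += 1
        body = self.origin(path)
        self._store[path] = (now, body)
        return body


if __name__ == "__main__":
    cache = EdgeCache(lambda path: f"content of {path}", ttl_seconds=60)
    for t in (0, 10, 59, 61):
        cache.get("/logo.png", now=t)
    print(cache.origin_requests)
```

With fully dynamic content the TTL would effectively be zero and every request would reach the origin, which is why a CDN adds little in that case.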

Edge Network

Load Balancers

If a request cannot be served by your CDN, it will next hit your load balancer. Load balancers can be either regional with regional IPs or global with anycast IPs, and in some cases you can also use them to manage internal traffic.

Apart from routing and proxying the traffic to the appropriate backend service, the load balancer can also take care of responsibilities like SSL Termination, integrating with CDN and even managing some aspects of network traffic.
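As an illustration of the routing responsibility, here is a minimal round-robin selector. Round robin is just one common balancing strategy chosen for the sketch, not something the article prescribes, and `RoundRobinBalancer` is a hypothetical name.

```python
import itertools


class RoundRobinBalancer:
    """Hands out backends in a fixed rotation."""

    def __init__(self, backends: list[str]) -> None:
        if not backends:
            raise ValueError("at least one backend is required")
        self._cycle = itertools.cycle(backends)

    def pick(self) -> str:
        # Each call returns the next backend, wrapping around at the end.
        return next(self._cycle)


if __name__ == "__main__":
    balancer = RoundRobinBalancer(["10.0.1.10", "10.0.1.11"])
    print([balancer.pick() for _ in range(4)])
```

Real load balancers usually add health checks and weighting on top of a selection rule like this one.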

While hardware load balancers do exist, software load balancers provide greater flexibility, cost reduction and scalability.

As with CDNs, your cloud provider should be able to provide a load balancer for you (such as GLB for GCP, ELB for AWS, Azure Load Balancer for Azure, etc.), but what is more interesting is that you can provision these load balancers directly from Kubernetes constructs. For instance, creating an ingress in GKE (aka GKE ingress) also creates a GLB for you behind the scenes to receive the traffic, and other features like CDN, SSL redirects, etc. can also be set up just by configuring your ingress as seen here.
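For reference, a minimal Ingress object can be sketched as a Python dictionary and serialized to JSON, which `kubectl apply -f` accepts alongside YAML. The function name, host and service values below are placeholders, and provider-specific annotations (for CDN or redirects) are deliberately left out since they vary by platform.

```python
import json


def ingress_manifest(name: str, host: str, service: str, port: int) -> dict:
    """Build a networking.k8s.io/v1 Ingress that routes all paths on host to one service."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "Ingress",
        "metadata": {"name": name},
        "spec": {
            "rules": [{
                "host": host,
                "http": {"paths": [{
                    "path": "/",
                    "pathType": "Prefix",
                    "backend": {"service": {"name": service, "port": {"number": port}}},
                }]},
            }]
        },
    }


if __name__ == "__main__":
    # Placeholder values for illustration only.
    print(json.dumps(ingress_manifest("web", "example.com", "web-svc", 80), indent=2))
```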

Networking and Security Architecture

The next important thing to take care of in your architecture is the networking itself. You may want to go for a private cluster if you want to increase security. There you can moderate the inbound and outbound traffic, mask IP addresses behind NATs, isolate networks with multiple subnets across multiple VPCs and so on.
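To see what isolating networks with multiple subnets means in terms of address space, the standard `ipaddress` module can carve a VPC range into smaller blocks. The CIDR values and the helper name `carve_subnets` are illustrative assumptions, not values from the article.

```python
import ipaddress
import itertools


def carve_subnets(vpc_cidr: str, new_prefix: int, count: int) -> list[str]:
    """Return the first `count` subnets of size /new_prefix inside vpc_cidr."""
    network = ipaddress.ip_network(vpc_cidr)
    subnets = network.subnets(new_prefix=new_prefix)  # lazy generator of subnets
    return [str(subnet) for subnet in itertools.islice(subnets, count)]


if __name__ == "__main__":
    # For example: one /24 each for nodes, internal load balancers and a bastion.
    print(carve_subnets("10.0.0.0/16", 24, 3))
```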

How you set up your network would typically depend on the degree of flexibility you are looking for and how you are going to achieve it. Setting up the right networking is all about reducing the attack surface as much as possible while still allowing for regular operations.

Protecting your infrastructure with the right network setup also involves configuring firewalls with the right rules and restrictions, so that only permitted traffic flows to and from the respective backend services, both inbound and outbound.

In many cases, these private clusters can be protected by setting up bastion hosts and tunneling through them for all operations in the cluster, since all you have to expose to the public network is the bastion (aka jump host), typically set up in the same network as the cluster.

Some cloud providers also provide custom solutions in their approach towards Zero Trust Security. For instance, GCP provides its users with Identity-Aware Proxy (IAP), which can be used instead of typical VPN implementations.
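The allow-only-what-is-permitted idea can be sketched as a default-deny check. This is a simplified model (source CIDR and destination port only); `AllowRule` and `is_allowed` are hypothetical names, and real firewalls also consider protocol, direction and rule priority.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class AllowRule:
    source_cidr: str  # range of source addresses the rule admits
    port: int         # destination port the rule admits


def is_allowed(rules: list[AllowRule], source_ip: str, port: int) -> bool:
    """Default deny: traffic passes only if at least one rule matches it."""
    address = ipaddress.ip_address(source_ip)
    return any(
        port == rule.port and address in ipaddress.ip_network(rule.source_cidr)
        for rule in rules
    )


if __name__ == "__main__":
    rules = [AllowRule("10.0.2.0/24", 22), AllowRule("0.0.0.0/0", 443)]
    print(is_allowed(rules, "10.0.2.7", 22))     # SSH from the bastion subnet
    print(is_allowed(rules, "203.0.113.5", 22))  # SSH from anywhere else
```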

Once all of these are taken care of, the next step to networking would be setting up the networking within the cluster itself depending on your use case.

It can involve tasks like:

If you would like to look at some sample implementations, I would recommend looking at this repository which helps users set up all these different networking models in GCP including hub and spoke via peering, hub and spoke via VPN, DNS and Google Private Access for on-premises, Shared VPC with GKE support, ILB as next hop and so on, all using Terraform.

And the interesting thing about networking in cloud is that it need not be just limited to the cloud provider within your region but can span across multiple providers across multiple regions as needed. This is where projects like Kubefed or Crossplane could help.

If you would like to explore more on some of the best practices when setting up VPCs, subnets and the networking as a whole, I would recommend going through this page, and the same concepts are applicable for any cloud provider you are onboard with.

Kubernetes

If you are using a managed service like GKE, EKS or AKS, Kubernetes is managed for you, which lifts a lot of complexity away from users.

If you are managing Kubernetes yourself, you need to take care of many things, such as backing up and encrypting the etcd store, setting up networking among the nodes in the cluster, periodically patching your nodes with the latest OS versions, and managing cluster upgrades to align with upstream Kubernetes releases. This is only recommended if you can afford a dedicated team that does just this.

Site Reliability Engineering (SRE)

When you maintain a complex infrastructure, it is very important to have the right observability stack in place so that you can detect errors before your users notice them, predict possible changes, identify anomalies and drill down to exactly where the issue is.

This requires agents that expose metrics specific to each tool or application so they can be collected for analysis (using either a push or a pull mechanism). And if you are using a service mesh with sidecars, they often provide metrics without any custom instrumentation on your part.

In any such scenarios, a tool like Prometheus can act as the time series database to collect all the metrics for you along with something like OpenTelemetry to expose metrics from the application and the various tools using built-in exporters. A tool like Alertmanager can send notifications and alerts to multiple channels, while Grafana will provide the dashboard to visualize everything in one place, giving users complete visibility on the infrastructure as a whole.

In summary, this is what the observability stack involving Prometheus would look like:

Prometheus Architecture

(Source: https://prometheus.io/docs/introduction/overview/)
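In the pull model, an application exposes its metrics as plain text that the collector scrapes. The sketch below renders a counter in the Prometheus text exposition layout using only the standard library; `render_counter` is a hypothetical helper, it does not escape special characters in label values, and in practice an official client library would normally do this for you.

```python
def render_counter(name: str, help_text: str, samples: list[tuple[dict[str, str], float]]) -> str:
    """Render one counter metric, with labelled samples, as scrapeable text."""
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} counter"]
    for labels, value in samples:
        # Sort labels so the output is stable between scrapes.
        label_text = ",".join(f'{key}="{val}"' for key, val in sorted(labels.items()))
        lines.append(f"{name}{{{label_text}}} {value}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_counter(
        "http_requests_total",
        "Total HTTP requests.",
        [({"method": "GET", "status": "200"}, 3)],
    ), end="")
```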

Having complex systems like these also requires a log aggregation system, so that all the logs can be streamed into a single place for easier debugging. This is where people tend to use the ELK or EFK stack, with Logstash or Fluentd doing the log aggregation and filtering for you based on your constraints. But there are new players in this space, like Loki and Promtail.

This is how log aggregation systems like Fluentd simplify our architecture:

Log Aggregation

(Source: https://www.fluentd.org/architecture)
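Aggregators are easier to work with when applications emit structured logs. As a sketch, the standard `logging` module can be given a formatter that writes one JSON object per record; the field names chosen here are an assumption for illustration, not a format required by any of the tools above.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Format each log record as a single JSON object on one line."""

    def format(self, record: logging.LogRecord) -> str:
        return json.dumps({
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),  # applies %-style arguments
        })


if __name__ == "__main__":
    handler = logging.StreamHandler()
    handler.setFormatter(JsonFormatter())
    logger = logging.getLogger("checkout")
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.info("order %s placed", "42")
```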

But what about tracing your request spanning across multiple microservices and tools? This is where distributed tracing also becomes very important especially considering the complexity that microservices come with. Tools like Zipkin and Jaeger have been pioneers in the area, with the recent entrant to this space being Tempo.

While log aggregation gives you information from various sources, it does not necessarily give you the context of a request, and this is where tracing really helps. But do remember that adding tracing to your stack adds overhead to your requests, since the context has to be propagated between services along with each request.

This is what a typical distributed tracing architecture looks like:

Jaeger Architecture

(Source: https://www.jaegertracing.io/docs/1.21/architecture/)
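Context propagation usually means passing a small header from service to service. The sketch below assumes the W3C Trace Context `traceparent` layout (version, trace ID, parent span ID, flags); the article does not name a header format, and tracing systems may use their own. Both function names are hypothetical.

```python
import secrets


def new_traceparent() -> str:
    """Start a new trace: random 16-byte trace ID and 8-byte span ID, sampled flag set."""
    trace_id = secrets.token_hex(16)  # 32 hex characters
    span_id = secrets.token_hex(8)    # 16 hex characters
    return f"00-{trace_id}-{span_id}-01"


def child_traceparent(parent: str) -> str:
    """Build the header for a downstream call: same trace ID, fresh span ID."""
    version, trace_id, _parent_span_id, flags = parent.split("-")
    return f"{version}-{trace_id}-{secrets.token_hex(8)}-{flags}"


if __name__ == "__main__":
    incoming = new_traceparent()
    outgoing = child_traceparent(incoming)
    print(incoming)
    print(outgoing)
```

The shared trace ID is what lets a tracing backend stitch spans from different services into one request timeline.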

But site reliability does not end with just monitoring, visualization and alerting. You have to be ready to handle any failures in any part of the system with regular backups and failovers in place so that either there is no data loss or the extent of data loss is minimized. This is where tools like Velero play a major role.

Velero helps you maintain periodic backups of various components in your cluster, including your workloads, storage and more, by leveraging the same Kubernetes constructs you already use. This is what Velero’s architecture looks like:

Velero Architecture

(Source: https://velero.io/docs/v1.5/how-velero-works/)

As you notice, there is a backup controller that periodically makes backups of the objects, pushing them to a specific destination with the frequency based on the schedule you have set. This can be used for failovers and migrations since almost all objects are backed up.
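A scheduled backup is itself declared as a Kubernetes object. The following sketch builds one as a dictionary; the field names reflect my understanding of Velero's `Schedule` resource and should be checked against the version you run, and the schedule name, cron expression and namespace list are placeholders.

```python
import json


def velero_schedule(name: str, cron: str, namespaces: list[str], ttl: str = "720h0m0s") -> dict:
    """Describe a recurring backup of the given namespaces (assumed Velero Schedule layout)."""
    return {
        "apiVersion": "velero.io/v1",
        "kind": "Schedule",
        "metadata": {"name": name, "namespace": "velero"},
        "spec": {
            "schedule": cron,  # standard cron expression
            "template": {
                "includedNamespaces": namespaces,
                "ttl": ttl,  # how long each backup is retained
            },
        },
    }


if __name__ == "__main__":
    # Placeholder: back up the "shop" namespace every day at 02:00.
    print(json.dumps(velero_schedule("daily-shop", "0 2 * * *", ["shop"]), indent=2))
```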

Storage

There are a lot of different storage provisioners and filesystems available, and they can vary a lot between cloud providers. This calls for a standard like the Container Storage Interface (CSI), which helps push most of the volume plugins out of tree, making them easier to maintain and evolve without the core being the bottleneck.

This is what the CSI architecture typically looks like supporting various volume plugins:

Kubernetes Storage Management

(Source: https://kubernetes.io/blog/2018/08/02/dynamically-expand-volume-with-csi-and-kubernetes/)
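From the application's side, the storage backend is hidden behind a claim that names a storage class. A minimal sketch of such a claim as a Python dictionary follows; the claim name, class name and size are placeholders, and which storage classes exist depends on the CSI drivers installed in your cluster.

```python
import json


def pvc_manifest(name: str, storage_class: str, size: str) -> dict:
    """Build a core/v1 PersistentVolumeClaim for a single-node read-write volume."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "accessModes": ["ReadWriteOnce"],
            "storageClassName": storage_class,  # selects the provisioner behind the scenes
            "resources": {"requests": {"storage": size}},
        },
    }


if __name__ == "__main__":
    # Placeholder values for illustration only.
    print(json.dumps(pvc_manifest("db-data", "example-block-storage", "10Gi"), indent=2))
```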

What about clustering, scaling and the various other problems that come with distributed storage?

This is where storage systems like Ceph have already proved themselves. But Ceph was not built with Kubernetes in mind and is very hard to deploy and manage, and this is where a project like Rook can help.

While Rook is not coupled to Ceph and supports other filesystems like EdgeFS, NFS, etc. as well, Rook with Ceph CSI is like a match made in heaven. This is what the architecture of Rook with Ceph looks like:

Rook Ceph Architecture

(Source: https://rook.io/docs/rook/v1.5/ceph-storage.html)

As you can see, Rook takes up the responsibility of installing, configuring and managing Ceph in the Kubernetes cluster. The storage is distributed underneath automatically as per the user preferences. All this happens without the app being exposed to any complexity.

Image Registry

A registry provides you a user interface where you can manage various user accounts, push/pull images, manage quotas, get notified on events with webhooks, do vulnerability scanning, sign the pushed images, and also handle operations like mirroring or replication of images across multiple image registries.

If you are using a cloud provider, there is a high chance that it already provides an image registry as a service (e.g. GCR, ECR, ACR, etc.), which removes a lot of the complexity. If your cloud provider does not provide one, you can also go for third-party registries like Docker Hub, Quay, etc.

But what if you want to host your own registry?

This may be needed if you either want to deploy your registry on-premises, want to have more control over the registry itself, or want to reduce costs associated with operations like vulnerability scanning.

If this is the case, then going for a private image registry like Harbor might actually help. This is what the architecture of Harbor looks like:

Harbor Architecture

(Source: https://goharbor.io/docs/1.10/install-config/harbor-ha-helm/)

Harbor is an OCI compliant registry made of various open source components, including Docker registry V2, Harbor UI, Clair, and Notary.

CI/CD Architecture

Kubernetes acts as a great platform for hosting all your workloads at any scale, but this also calls for a standard way of deploying the applications with a streamlined continuous integration/continuous delivery (CI/CD) workflow. This is where setting up a pipeline like this can really help.

CI/CD Architecture

Some third-party services like Travis CI, CircleCI, GitLab CI or GitHub Actions include their own CI runners. You just define the steps in the pipeline you are looking to build. This would typically involve building the image, scanning the image for possible vulnerabilities, running the tests, pushing the image to the registry and, in some cases, provisioning a preview environment for approvals.
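The ordering and fail-fast behaviour of such a pipeline can be sketched in a few lines. This is a toy model, not how any of the CI services above are configured; `run_pipeline` and the step names are hypothetical, and each step is reduced to a function that reports success or failure.

```python
from typing import Callable

Step = tuple[str, Callable[[], bool]]


def run_pipeline(steps: list[Step]) -> list[str]:
    """Run steps in order and stop at the first failure; return the names of steps that ran."""
    executed: list[str] = []
    for name, action in steps:
        executed.append(name)
        if not action():
            break  # later steps (such as pushing the image) are skipped
    return executed


if __name__ == "__main__":
    steps: list[Step] = [
        ("build image", lambda: True),
        ("scan image", lambda: False),  # pretend the scan found a vulnerability
        ("run tests", lambda: True),
        ("push to registry", lambda: True),
    ]
    print(run_pipeline(steps))
```

Stopping at the first failed step is what keeps an image with known vulnerabilities or failing tests from reaching the registry.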

Now, while the steps would typically remain the same if you are managing your own CI runners, you would need to configure them to be set up either within or outside your clusters with appropriate permissions to push the assets to the registry.

Conclusion

We have gone over the architecture of the Kubernetes-based cloud native infrastructure. As we have seen above, various tools address different problems with infrastructure. They are like Lego blocks, each focusing on a specific problem at hand, abstracting away a lot of complexity for you.

This allows users to leverage Kubernetes in an incremental fashion rather than getting on board all at once, using just the tools you need from the entire stack depending on your use case.

If you have any questions or are looking for help or consultancy, feel free to reach out to me @techahoy or via LinkedIn.

InApps is a wholly owned subsidiary of Insight Partners, an investor in the following companies mentioned in this article: Docker.

Source: InApps.net

