Sayan Saha
Sayan Saha is a seasoned product executive with 12+ years of open source software product management experience spanning Linux-based platform software, containers, Kubernetes, high availability/clustering software, and software-defined storage. At NetApp, he is a Senior Director of Product Management for Astra, a fully managed (SaaS) hybrid and multicloud data management service for Kubernetes applications, and Trident, an open source solution for consuming persistent storage for containerized workloads.

As Kubernetes (K8s) and containers become the de facto choice for developing, deploying, running and scaling cloud native and next-generation IT applications, enterprises are running more and more business-critical applications on K8s clusters. Business-critical applications are often stateful. A stateful application has associated state, data, and configuration information, and depends on previous data transactions to execute its business logic.

Business-critical apps on Kubernetes that provide a service often have availability and business continuity requirements similar to traditional applications: an outage of the service (and the resulting SLA breach) can seriously impact the provider's revenue and reputation. Enterprises often realize that they need to equip their Kubernetes deployments with data management tooling to be resilient to service failures only after a service-impacting disaster, or when they face a hard application migration to a new cluster or environment.

Other enterprises recognize the need but rely on custom tools developed in-house with intimate knowledge of a particular application. Such tooling is application-specific, must be custom-built for every application, and is hard to scale and standardize across the enterprise and its application teams. Consequently, enterprises struggle to establish a coherent and cohesive persistence and data management strategy for their Kubernetes estate.

The Current State of K8s Application Data Persistence and Management

The larger Kubernetes community and ecosystem have done an excellent job defining the Container Storage Interface (CSI), which solves users' first-order problems of provisioning and consuming persistent storage for stateful Kubernetes applications. CSI also defines data management primitives such as persistent volume (PV) snapshots and clones. These interfaces give storage and data management vendors a foundation for building comprehensive application data protection and mobility solutions.
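
For illustration, here is a minimal sketch that requests a CSI snapshot of a PVC by creating a VolumeSnapshot custom resource with the Python kubernetes client. The namespace, PVC name, and VolumeSnapshotClass name are placeholder assumptions, and the cluster is assumed to have the snapshot CRDs, the external snapshot controller, and a snapshot-capable CSI driver installed.

```python
# Sketch: request a CSI snapshot of a PVC by creating a VolumeSnapshot object.
# Assumes the snapshot CRDs, the external snapshot controller, and a CSI driver
# with snapshot support are installed; all names below are placeholders.
from kubernetes import client, config

config.load_kube_config()  # or config.load_incluster_config() inside a pod
api = client.CustomObjectsApi()

snapshot = {
    "apiVersion": "snapshot.storage.k8s.io/v1",
    "kind": "VolumeSnapshot",
    "metadata": {"name": "orders-db-snap-1", "namespace": "orders"},
    "spec": {
        "volumeSnapshotClassName": "csi-snapclass",  # assumed class name
        "source": {"persistentVolumeClaimName": "orders-db-data"},
    },
}

api.create_namespaced_custom_object(
    group="snapshot.storage.k8s.io",
    version="v1",
    namespace="orders",
    plural="volumesnapshots",
    body=snapshot,
)
```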

Kubernetes Data Management — Application-Awareness Is Key — and It’s Not Always About What’s in the Cluster

There is no concrete definition of what constitutes a Kubernetes application today. However, for most Kubernetes practitioners and users, a K8s application comprises the application's data and metadata: standard K8s objects and resources (such as ConfigMaps, Secrets, Deployments and ReplicaSets), persistent volumes (PVs), and custom resources (CRDs and CRs), potentially spanning namespaces and, in some cases, clusters. Consequently, any application-agnostic Kubernetes data management tooling needs to account for all of these components.
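
As a rough sketch of what "accounting for all components" can look like, the example below inventories a few built-in resource types across namespaces by a shared label. The label value and the use of the well-known app.kubernetes.io/part-of label are assumptions; real tooling would also enumerate custom resources and cluster-scoped objects.

```python
# Sketch: inventory the in-cluster pieces of an application selected by a label.
# Assumes the app's resources carry app.kubernetes.io/part-of=orders; real
# tooling would also walk custom resources and cluster-scoped objects.
from kubernetes import client, config

config.load_kube_config()
core = client.CoreV1Api()
apps = client.AppsV1Api()

selector = "app.kubernetes.io/part-of=orders"

deployments = apps.list_deployment_for_all_namespaces(label_selector=selector)
configmaps = core.list_config_map_for_all_namespaces(label_selector=selector)
secrets = core.list_secret_for_all_namespaces(label_selector=selector)
pvcs = core.list_persistent_volume_claim_for_all_namespaces(label_selector=selector)

for item in (
    list(deployments.items) + list(configmaps.items)
    + list(secrets.items) + list(pvcs.items)
):
    # Listed items often omit .kind, so fall back to the client type name.
    print(item.kind or type(item).__name__, item.metadata.namespace, item.metadata.name)
```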

Backing up or replicating only the persistent volumes associated with a K8s application can lead to spectacular failures when the time comes to recover the application after a disaster. Treating the application as a holistic unit for protection and migration is pivotal to Kubernetes application data management.

Complicating this situation even further are cloud native K8s application design patterns used primarily in the public clouds, where application teams take advantage of the convenience, stability, and performance of using fully managed cloud services, like databases, message queues and object storage. In such cases, by definition, the K8s application is no longer confined to a cluster but spans fully managed services outside the cluster. It is very common to consume external fully managed or self-managed databases from applications running within Kubernetes clusters.

Building upon this design pattern of cloud native development, public clouds like AWS and Azure are making it even easier to consume fully managed services from Kubernetes clusters using Kubernetes-native APIs. AWS Controllers for Kubernetes (ACK) and Azure Service Operator (for Kubernetes) are examples of such initiatives.
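
To make the pattern concrete, the sketch below requests a managed AWS RDS database from inside a cluster by creating an ACK DBInstance custom resource. The API group/version, kind, and spec fields shown are assumptions based on the ACK RDS controller, and every name and value is a placeholder rather than a definitive manifest.

```python
# Sketch: provision a managed AWS RDS database via ACK by creating a DBInstance
# custom resource. The group/version, kind, and spec fields are assumptions
# based on the ACK RDS controller; treat all names and values as placeholders.
from kubernetes import client, config

config.load_kube_config()
api = client.CustomObjectsApi()

db_instance = {
    "apiVersion": "rds.services.k8s.aws/v1alpha1",  # assumed ACK group/version
    "kind": "DBInstance",
    "metadata": {"name": "orders-db", "namespace": "orders"},
    "spec": {
        "dbInstanceIdentifier": "orders-db",
        "dbInstanceClass": "db.t3.medium",
        "engine": "postgres",
        "allocatedStorage": 20,
        "masterUsername": "admin",
        # credentials, networking, etc. omitted for brevity
    },
}

api.create_namespaced_custom_object(
    group="rds.services.k8s.aws",
    version="v1alpha1",
    namespace="orders",
    plural="dbinstances",
    body=db_instance,
)
```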

Alternatives to Kubernetes-Native Persistence — Common Design Patterns and Why

As explained above, application teams building modern Kubernetes-based services often use a multitude of persistence technologies, including native cloud services, and are not limited to persistent volumes in K8s clusters. This pattern has emerged for many reasons, including but not limited to:

  • An entrenched belief that Kubernetes is an excellent platform for running stateless applications only.
  • Negative early experiences with running databases on K8s clusters, or awareness of failed projects that attempted to do so.
  • An embrace of a cloud native, microservices approach to building Kubernetes applications, in which consuming a native public cloud DBaaS (e.g., AWS RDS, Google Cloud SQL, Azure Cosmos DB), a third-party vendor-managed datastore (delivered as SaaS), or a self-managed external database cluster feels natural. This design paradigm takes advantage of the scalability, resiliency, elasticity and flexibility of these data services through an API-based contract amongst microservices (see the sketch after this list).
  • Using object storage for K8s persistence because it is ubiquitous in the public cloud and used pervasively to persist data for modern applications.

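A minimal sketch of the API-contract pattern referenced in the list above: the application consumes an external managed database through connection details exposed as environment variables (typically injected from a Kubernetes Secret) instead of mounting a persistent volume. The variable names and the psycopg2 driver are illustrative assumptions.

```python
# Sketch: a K8s pod consuming an external managed database instead of a PV.
# Connection details come from env vars injected via a Secret; the variable
# names and the psycopg2 driver are illustrative assumptions.
import os
import psycopg2

conn = psycopg2.connect(
    host=os.environ["DB_HOST"],          # e.g., the RDS/Cloud SQL endpoint
    port=int(os.environ.get("DB_PORT", "5432")),
    dbname=os.environ["DB_NAME"],
    user=os.environ["DB_USER"],
    password=os.environ["DB_PASSWORD"],  # sourced from a Kubernetes Secret
)

with conn, conn.cursor() as cur:
    cur.execute("SELECT now()")
    print(cur.fetchone())
```
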
Like everything else, these persistence choices have drawbacks. Using fully managed native public cloud databases and NoSQL datastores can be expensive and can lead to an implicit lock-in to one public cloud. That may be a perfectly fine design choice for enterprises that have selected a single or primary cloud provider for their IT needs. To avoid cloud provider lock-in, enterprises with a multicloud strategy often use a cloud-agnostic DBaaS offering from a third-party ISV.

In other cases, enterprises run external database clusters on cloud providers' reserved instances (taking advantage of discounted reserved-instance pricing) to save on costs. In doing so, they end up self-managing the database clusters, which can be tedious.

Using object storage for Kubernetes persistence is extremely popular. However, it can also make an application inherently less portable, because the native object storage services of the major public clouds do not expose the same API. The K8s community is working on a new standard, the Container Object Storage Interface (COSI), to provide a common layer of abstraction for consuming object storage from K8s applications, which will make K8s applications that use object storage easier to port. Also, object storage is not suitable for many existing applications, even when they are being refactored.
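
The portability gap is easy to see in code. The hedged sketch below performs the same "store an object" operation against S3 and Azure Blob Storage using each cloud's native SDK; bucket and container names are placeholders, and credentials are assumed to come from each SDK's default mechanisms. COSI aims to hide exactly this kind of divergence behind a common abstraction.

```python
# Sketch: the same "store an object" operation against two clouds' native object
# stores. Bucket/container names are placeholders; credentials are assumed to
# be available via each SDK's default mechanisms.
import boto3
from azure.storage.blob import BlobServiceClient

data = b"order-export-2024.csv contents"

# AWS S3: put_object against a bucket.
s3 = boto3.client("s3")
s3.put_object(Bucket="orders-exports", Key="exports/2024.csv", Body=data)

# Azure Blob Storage: upload_blob against a container, via a different client model.
blob_service = BlobServiceClient.from_connection_string(
    conn_str="<azure-storage-connection-string>"  # placeholder
)
blob_client = blob_service.get_blob_client(container="orders-exports", blob="exports/2024.csv")
blob_client.upload_blob(data, overwrite=True)
```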

What Does This Mean for Enterprises?

As is evident, what constitutes a Kubernetes application and its persistence needs does not always map neatly to Kubernetes resources and persistent volumes attached to pods within a single cluster. The choices for K8s data persistence are rich, each with its benefits and drawbacks. This means popular K8s application data management functionality, like backup, recovery, migration and compliance, must cover not only what is inside K8s clusters but also the objects and resources that may reside outside the cluster(s) and are an integral part of the application.

For example, taking a consistent backup of a K8s application also means triggering a backup of the fully managed public cloud database that provides data services to the application, in addition to capturing the K8s resources, metadata and objects present inside the Kubernetes cluster. A subsequent recovery must likewise restore the external database along with the in-cluster K8s resources.
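
As a hedged illustration of that coordination, the sketch below pairs an RDS snapshot taken with boto3 with a CSI VolumeSnapshot of an in-cluster PVC. All identifiers are placeholders, and a real workflow would also quiesce the application and export the K8s objects and metadata that define it.

```python
# Sketch: one backup pass that covers both the external managed database and an
# in-cluster persistent volume. Identifiers are placeholders; a real workflow
# would also quiesce the app and export K8s objects/metadata.
import boto3
from kubernetes import client, config

# 1. Snapshot the fully managed database outside the cluster (AWS RDS here).
rds = boto3.client("rds")
rds.create_db_snapshot(
    DBSnapshotIdentifier="orders-db-backup-1",
    DBInstanceIdentifier="orders-db",
)

# 2. Snapshot the in-cluster PVC through the CSI snapshot API.
config.load_kube_config()
client.CustomObjectsApi().create_namespaced_custom_object(
    group="snapshot.storage.k8s.io",
    version="v1",
    namespace="orders",
    plural="volumesnapshots",
    body={
        "apiVersion": "snapshot.storage.k8s.io/v1",
        "kind": "VolumeSnapshot",
        "metadata": {"name": "orders-cache-backup-1", "namespace": "orders"},
        "spec": {
            "volumeSnapshotClassName": "csi-snapclass",  # assumed class name
            "source": {"persistentVolumeClaimName": "orders-cache"},
        },
    },
)
```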

Consequently, enterprises must carefully review their K8s application protection, mobility, and compliance strategies and select their K8s storage and data management solution using criteria that accommodate the most common cloud native persistence design patterns adopted by their own application teams and developers.
