In part 1 of this blog series on Kubernetes data protection and cloud-native data protection, we addressed The need for Data Protection for Containerized Applications.
In part 2 of this blog series, we go through:
What’s different about CloudCasa for Kubernetes Data Protection
Kubernetes has become the de facto standard for container orchestration, but as with any new technology platform, it has some weaknesses in management, including in the areas of Kubernetes data protection and disaster recovery. The management of Kubernetes deployments is currently dominated by developers and DevOps engineers who usually don’t deal with Kubernetes data protection solutions. Further, IT organizations are just starting to catch up with the DevOps teams on what is needed to support the deployment and management of cloud-native business applications.
We built CloudCasa to address these Kubernetes data protection weaknesses in cloud-native infrastructure, and to bridge the data management and protection gap between DevOps and IT Operations. CloudCasa is a simple, scalable, cloud-native backup-as-a-service solution built on Kubernetes for Kubernetes data protection. As a SaaS solution, CloudCasa removes the complexity of managing traditional backup infrastructure, and it provides the same level of application-consistent data protection and disaster recovery that IT Operations provides for its server-based applications today. With CloudCasa, IT staff don't need to be Kubernetes experts and DevOps engineers don't need to be storage experts in order to protect your Kubernetes clusters and data.
Cloud-Native Data Protection – What to Back Up and Restore
Let’s look at the different types of data and resources that need to be protected in cloud native applications built with Kubernetes.
Cluster Data
Kubernetes is a container orchestration system that manages a cluster of hosts and all the resources in the system such as pods, services and namespaces. Cluster data including resource specifications and configuration data are stored in etcd, a distributed key-value store. These are key components in a Kubernetes deployment, and it is important to back them up in order to rebuild a cluster.
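For clusters where you can reach etcd directly, a point-in-time copy of the cluster data can be taken with etcdctl's `snapshot save` subcommand. The following is a minimal sketch, not a description of how CloudCasa works: the helper name `etcd_snapshot_command` is hypothetical, and the endpoint and certificate paths are assumed kubeadm-style defaults that will differ in your environment. The function only builds the command; it does not run anything.

```python
import os
from typing import Dict, List, Tuple


def etcd_snapshot_command(
    snapshot_path: str,
    endpoint: str = "https://127.0.0.1:2379",          # assumed local etcd endpoint
    cacert: str = "/etc/kubernetes/pki/etcd/ca.crt",    # assumed kubeadm-style path
    cert: str = "/etc/kubernetes/pki/etcd/server.crt",  # assumed kubeadm-style path
    key: str = "/etc/kubernetes/pki/etcd/server.key",   # assumed kubeadm-style path
) -> Tuple[List[str], Dict[str, str]]:
    """Return (argv, env) for an etcdctl snapshot; nothing is executed here."""
    argv = [
        "etcdctl",
        f"--endpoints={endpoint}",
        f"--cacert={cacert}",
        f"--cert={cert}",
        f"--key={key}",
        "snapshot", "save", snapshot_path,
    ]
    # Copy the current environment and select the v3 etcdctl API.
    env = dict(os.environ, ETCDCTL_API="3")
    return argv, env
```

To actually take the snapshot you would pass the result to something like `subprocess.run(argv, env=env, check=True)` on a host that has etcdctl installed and can read the certificates.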
Persistent Volumes
Persistent Volumes (PVs) are resources in Kubernetes that are associated through persistent volume claims (PVCs) with pods or groups of containers. PVs allow for storage resources to be associated with stateful applications such as databases. PVs can be contrasted with ephemeral storage volumes that live and die with containers and are associated with stateless applications.
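To make the relationship concrete, here is a small sketch that builds the manifests for a PVC and for a pod that mounts it, alongside an ephemeral `emptyDir` volume for contrast. The function names, volume names, and mount paths are illustrative assumptions; the manifest fields themselves are standard Kubernetes `v1` API fields.

```python
import json


def pvc_manifest(name: str, size: str, storage_class: str) -> dict:
    """Build a PersistentVolumeClaim requesting `size` from `storage_class`."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "accessModes": ["ReadWriteOnce"],
            "storageClassName": storage_class,
            "resources": {"requests": {"storage": size}},
        },
    }


def pod_manifest(name: str, image: str, claim_name: str, mount_path: str) -> dict:
    """Build a Pod with one persistent volume (via PVC) and one ephemeral volume."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "containers": [{
                "name": name,
                "image": image,
                "volumeMounts": [
                    {"name": "data", "mountPath": mount_path},
                    {"name": "scratch", "mountPath": "/tmp/scratch"},
                ],
            }],
            "volumes": [
                # Persistent: outlives the pod and needs to be backed up.
                {"name": "data", "persistentVolumeClaim": {"claimName": claim_name}},
                # Ephemeral: deleted together with the pod.
                {"name": "scratch", "emptyDir": {}},
            ],
        },
    }


if __name__ == "__main__":
    # Hypothetical names, sizes, and storage class, for illustration only.
    print(json.dumps(pvc_manifest("db-data", "10Gi", "standard"), indent=2))
    print(json.dumps(pod_manifest("db", "postgres:16", "db-data", "/var/lib/postgresql/data"), indent=2))
```

Since kubectl accepts JSON as well as YAML, the printed output could be saved to a file and applied with `kubectl apply -f`.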
Container Storage Interface
The Container Storage Interface (CSI) was developed as a standard for supporting both block and file storage systems in Kubernetes. Before CSI, storage systems were supported via plug-ins that were part of the core Kubernetes code, which meant that vendors had to wait for a new Kubernetes release to add support or fix a bug. With the adoption of CSI, storage providers can add or update support for their systems in Kubernetes without ever having to touch the core Kubernetes code. This greatly expanded device support, gave Kubernetes users more options for storage, and made the platform more secure and extensible.
Since data protection products rely on snapshots to efficiently create point-in-time copies of data, a snapshot capability was added to the CSI. Many container storage systems provide the ability to create these snapshots or copies of a volume, which can then be used for backup, restore and disaster recovery. CSI snapshots can also be used to provision new copies or replicas of a volume for additional use cases such as application and database testing and reporting.
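As a rough illustration of the two uses described above, the sketch below builds a `VolumeSnapshot` manifest for an existing PVC and a new PVC that is provisioned from that snapshot. It assumes a cluster with the snapshot CRDs (`snapshot.storage.k8s.io/v1`) and a CSI driver that supports snapshots; the function names and all object names in the tests are hypothetical.

```python
def volume_snapshot_manifest(name: str, pvc_name: str, snapshot_class: str) -> dict:
    """Build a VolumeSnapshot that captures the volume bound to `pvc_name`."""
    return {
        "apiVersion": "snapshot.storage.k8s.io/v1",
        "kind": "VolumeSnapshot",
        "metadata": {"name": name},
        "spec": {
            "volumeSnapshotClassName": snapshot_class,
            "source": {"persistentVolumeClaimName": pvc_name},
        },
    }


def pvc_from_snapshot(name: str, snapshot_name: str, size: str, storage_class: str) -> dict:
    """Build a PVC whose contents are provisioned from an existing snapshot."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "storageClassName": storage_class,
            # dataSource tells the CSI driver to pre-populate the new volume.
            "dataSource": {
                "name": snapshot_name,
                "kind": "VolumeSnapshot",
                "apiGroup": "snapshot.storage.k8s.io",
            },
            "accessModes": ["ReadWriteOnce"],
            "resources": {"requests": {"storage": size}},
        },
    }
```

The same restored-PVC pattern serves both recovery and cloning a volume for test or reporting workloads. Note that a snapshot typically lives on the same storage system as the source volume, so on its own it is a point-in-time copy rather than an off-site backup.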
Serverless Databases
In cloud native applications, containers running under Kubernetes may also use serverless databases or managed database services, which can speed application development and deployment. However, this can further complicate the picture when it comes to data protection and disaster recovery. It is important to make sure that you are capturing integrated snapshots of all of your application’s disparate components, including databases, regardless of whether they reside in Kubernetes or the cloud provider’s infrastructure, and that at a minimum they are protected in separate geographies and access domains.
It is usually easy to make serverless components redundant both within and across regions, but as with other cloud-native infrastructure, it is important not to confuse redundancy with true protection and DR capability, which guard against human error and intentional harm from bad actors. It can also be more difficult to identify failure domains when you are using serverless components than with more traditional infrastructure.
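One way to express the difference between redundancy and protection is as a simple check on where copies live. The sketch below is an illustrative model only: the `Copy` type and both function names are hypothetical, and a cloud account is used here as a stand-in for an "access domain". A copy counts as protected only if it sits in a different region and a different account from the primary.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Copy:
    region: str   # geography where the data lives
    account: str  # access domain (e.g. cloud account) that controls it


def is_isolated(primary: Copy, copy: Copy) -> bool:
    """True if the copy shares neither region nor account with the primary."""
    return copy.region != primary.region and copy.account != primary.account


def has_protected_copy(primary: Copy, copies: Iterable[Copy]) -> bool:
    """True if at least one copy is isolated from the primary."""
    return any(is_isolated(primary, c) for c in copies)
```

Under this model, a cross-region replica in the same account is redundancy: it survives a regional outage, but a mistaken or malicious delete by someone with access to that account can still reach it.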
Kubernetes Data Protection Summary
The DevOps team must take initial responsibility for data protection in cloud native applications to ensure consistent backup and recovery of container-based applications, given the new and different types of container resources and cloud native data storage. The DevOps team is in the best position to understand the applications, where their various pieces reside, and what configuration and application state data need to be protected, but they don't normally deal with data protection solutions. Therefore, we expect responsibility, budget and accountability for data protection to remain shared between DevOps and IT Operations for the foreseeable future.
CloudCasa was built as a cloud native service to support best practices for data protection and recovery for cloud native applications, and to bridge the data management and protection gap between DevOps and IT Operations.
We invite you to sign up for CloudCasa and give us your feedback!