Speaker 1 (00:05):
Welcome. Today we're going to be doing A Walk Through Persistent Storage for Kubernetes. Kubernetes is now the de facto standard for container orchestration. It helps organizations realize the benefits of agile application development, in which their developers are able to iterate and introduce new features into the application much more quickly, as well as react to the demands placed on that application.
Kubernetes allows for a microservices architecture, which is a major change from the monolithic architecture applications used to be developed in. This microservices architecture allows for an independent scaling of services. As the demands on the application change, it can react to those changes to provide better services and a better experience for end users.
Kubernetes also provides the foundation for orchestrating communication between the different containers, and provides overlay networking and security features to make sure all the services are able to find each other and communicate. This leads to efficiency in a DevOps model, allows for standardized automation, and allows customers to introduce new features into the application in a consistent and automated way.
Finally, we have the ability to provide monitoring of the different service levels across all the different containers and services that are running and make up that application.
What we're seeing with Kubernetes is a challenge around persistent storage. When Kubernetes was initially developed, it was targeted at ephemeral services. It didn't really have a concept of persistent data, and any type of data services were originally conceived to live outside the application itself. However, as Kubernetes started to evolve and more use cases were found for it, there was a huge demand to be able to run applications that had state. This is where persistent volumes came in. However, when we first saw the instantiation of persistent volumes, it was a very manual process. The operations teams had to go out and manually provision the volumes before the developers could start utilizing them.
Dynamic provisioning changed this, and it allowed the operations team to simply create a storage class. Anytime a developer wanted to stand up an application that needed a persistent volume, the only thing they had to do was reference that storage class and tell it the parameters they wanted for the volume. This unburdened the operations teams from answering trouble tickets to provision new volumes. It also unleashed the development teams and allowed them to simply write code to request the resources they wanted.
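As a rough illustration of that split in responsibilities, the sketch below builds the two objects as plain Python dictionaries and prints them as JSON, which Kubernetes accepts alongside YAML. The field names follow the standard StorageClass and PersistentVolumeClaim schemas, but the class name `fast`, the provisioner `csi.example.com`, and the `type: ssd` parameter are made-up placeholders; real parameters depend entirely on the storage driver in use.

```python
import json


def storage_class(name, provisioner, parameters):
    """Build a StorageClass manifest (created once by the operations team)."""
    return {
        "apiVersion": "storage.k8s.io/v1",
        "kind": "StorageClass",
        "metadata": {"name": name},
        "provisioner": provisioner,   # hypothetical CSI driver name
        "parameters": parameters,     # driver-specific, assumed for illustration
    }


def volume_claim(name, storage_class_name, size, access_mode="ReadWriteOnce"):
    """Build a PersistentVolumeClaim (written by the developer, per application)."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "storageClassName": storage_class_name,
            "accessModes": [access_mode],
            "resources": {"requests": {"storage": size}},
        },
    }


# Operations defines the class once; developers only reference it by name.
sc = storage_class("fast", "csi.example.com", {"type": "ssd"})
pvc = volume_claim("app-data", sc["metadata"]["name"], "10Gi")

print(json.dumps(sc, indent=2))
print(json.dumps(pvc, indent=2))
```

The point of the sketch is that the claim never names a specific volume or array; it only names the class and the size, and the provisioner behind the class does the rest.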
One of the major changes that we saw inside of the Kubernetes stack was a move from in-tree drivers to the Container Storage Interface (CSI). In the past, it took a long time for storage platform vendors to get their driver integrated into the Kubernetes code. The move to CSI allows them to develop their drivers independently of the Kubernetes code base, which allows them to iterate on new features and get their newer platforms integrated into Kubernetes much more easily.
When we're looking at selecting a storage platform for Kubernetes, there are several different considerations we want to make. One of the first considerations around application requirements is their performance characteristics. These are standard whether we're utilizing containers or virtual machines. Basically, what we're looking at is some different characteristics, including the IOPS, the amount of bandwidth that's required for the application, how many file operations the application might make, how sensitive it is to latency (the delay in requesting certain pieces of data), and the differences in the way it reads and writes information into the storage platform.
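To make the relationship between two of those characteristics concrete: bandwidth is roughly IOPS multiplied by the size of each I/O. The helper below is a small sketch of that arithmetic; the function name and the sample workloads are illustrative assumptions, not figures from any particular platform.

```python
def throughput_mib_per_s(iops, block_size_kib):
    """Approximate bandwidth in MiB/s for a given IOPS rate and I/O size in KiB."""
    return iops * block_size_kib / 1024


# The same IOPS figure implies very different bandwidth at different I/O sizes.
small_io = throughput_mib_per_s(10_000, 4)    # small random I/O, e.g. a database
large_io = throughput_mib_per_s(10_000, 64)   # larger sequential I/O, e.g. streaming

print(small_io)  # 39.0625
print(large_io)  # 625.0
```

This is why a single IOPS number is not enough to size a platform; the read/write pattern and I/O size of the application matter just as much.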
Something that's new and unique is a consideration around volume provisioning times in Kubernetes. As we know, Kubernetes can spin up and down resources very, very rapidly. If your application has a requirement to be able to do this, the amount of time it takes to actually provision and attach that volume to the pod becomes a really important consideration.
If we look at Kubernetes, there are several application access modes for volumes. The first one is ReadWriteOnce, which is probably one of the more common ones. This is intended to let a single pod read and write into a specific volume. Originally, however, if you had two pods on a single node, they would both still be able to access that same volume in ReadWriteOnce mode, because the mode is enforced per node. This is why a newer development inside the CSI architecture is a new primitive called ReadWriteOncePod. This guarantees that only a single pod has access to that volume, for both reads and writes.
Another interesting, common use case is a ReadOnlyMany. This allows many different applications to access the same volume and read from it. Imagine a scenario where you have a bunch of front end web services, and you need to scale them up quickly. If they all have access to read the static content for that website, and the configuration information out of that volume, it allows them to spin up much faster and provide you a single source of truth for their information.
The final type of access mode is a ReadWriteMany. This is a bit more uncommon, but it’s utilized for applications that have the ability to provide data consistency at the application layer, as opposed to the actual volume level. Many applications would benefit from the ability to access a single volume, update it, and have that single source of truth, but with the consistency built into the application and ensuring they're not interfering with each other.
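The access modes described above can be summarized in a few lines of code. This is only a sketch: the four mode strings are the real Kubernetes values, but `pick_access_mode` and its arguments are a hypothetical helper invented here to show the decision, not part of any Kubernetes API.

```python
from enum import Enum


class AccessMode(Enum):
    READ_WRITE_ONCE = "ReadWriteOnce"         # read/write, enforced per node
    READ_WRITE_ONCE_POD = "ReadWriteOncePod"  # read/write, exactly one pod
    READ_ONLY_MANY = "ReadOnlyMany"           # many readers, no writers
    READ_WRITE_MANY = "ReadWriteMany"         # many readers and writers


def pick_access_mode(needs_write, shared, strict_single_pod=False):
    """Choose an access mode from three yes/no questions about the workload."""
    if shared:
        # Shared writers rely on the application to keep the data consistent.
        return AccessMode.READ_WRITE_MANY if needs_write else AccessMode.READ_ONLY_MANY
    if strict_single_pod:
        return AccessMode.READ_WRITE_ONCE_POD
    return AccessMode.READ_WRITE_ONCE


# Scaled-out web front ends reading static content from one volume.
print(pick_access_mode(needs_write=False, shared=True).value)
# A single database pod that must be the only writer.
print(pick_access_mode(needs_write=True, shared=False, strict_single_pod=True).value)
```

Whichever mode is chosen, the storage driver also has to support it; not every platform offers the shared modes.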
The next consideration is data availability and protection. This is common across most of the storage industry and the different types of platforms you might integrate into. The first primitive is high availability: making sure the storage platform is resilient, can handle failures, and is fault tolerant. Another consideration is the security it provides. Does it provide data encryption, including for data in flight? How does it handle user access into that storage platform?
Some of the data protection features, like snapshots and replication, allow you to recover from either corruption, or even site failures. This is where DR comes in. DR for containers is a little different than you'll see across the standard virtual stacks. We need to understand if we just need to be able to recover the data, or if we need to be able to recover both the data and the application running inside of Kubernetes.
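For the data-only side of that recovery question, a snapshot and a restore can be sketched as two manifests. The example below assumes a CSI driver with snapshot support and the snapshot CRDs installed in the cluster; the object names (`app-data`, `app-data-snap`, `csi-snapclass`, `fast`) are placeholders chosen for illustration.

```python
import json


def volume_snapshot(name, pvc_name, snapshot_class):
    """Build a VolumeSnapshot that captures an existing claim at a point in time."""
    return {
        "apiVersion": "snapshot.storage.k8s.io/v1",
        "kind": "VolumeSnapshot",
        "metadata": {"name": name},
        "spec": {
            "volumeSnapshotClassName": snapshot_class,  # placeholder class name
            "source": {"persistentVolumeClaimName": pvc_name},
        },
    }


def restore_claim(name, snapshot_name, storage_class_name, size):
    """Build a new claim whose contents are populated from a snapshot."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "storageClassName": storage_class_name,
            "dataSource": {
                "apiGroup": "snapshot.storage.k8s.io",
                "kind": "VolumeSnapshot",
                "name": snapshot_name,
            },
            "accessModes": ["ReadWriteOnce"],
            "resources": {"requests": {"storage": size}},
        },
    }


snap = volume_snapshot("app-data-snap", "app-data", "csi-snapclass")
restored = restore_claim("app-data-restored", "app-data-snap", "fast", "10Gi")

print(json.dumps(snap, indent=2))
print(json.dumps(restored, indent=2))
```

Note that this only brings back the volume. The deployments, services, and configuration that make up the application have to be recovered separately, which is the distinction between recovering the data and recovering the data plus the application.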
Finally, we want to take a look at some considerations for the actual storage platform itself. Kubernetes now (especially with CSI) has the ability to integrate legacy arrays into the platform, so you can utilize your current investments in order to provide persistent volume services in Kubernetes. Storage platforms can now also be deployed natively through Kubernetes. This means your storage platform is defined as code inside of a YAML file, and you can actually use kubectl commands to deploy it, just as you would any other application. This provides more of a DevOps feel for the storage platform than the legacy array you would normally see.
We also need to understand if there's a multi-tenancy requirement for the storage platform. This is a lot more common in Kubernetes now, as we have different departments, or even different end user customers, who are going to be accessing the same platform, and we need to ensure the platform has the ability not only to provide that access, but to do it securely.
We're also seeing a lot of hybrid and multi-cloud use cases. It's powerful to have a platform you can run not only on premises, but also inside the public cloud. Having a single platform simplifies operations and makes it a lot easier to transition your data, as well as your services, between your on-premises infrastructure and all of the different public cloud resources.
The final consideration is always going to be cost. Cost is more than an upfront purchase cost—it is the ongoing maintenance. It is how you deal with the end of life of a certain platform and what type of data migration costs you're facing at that point.
What we're seeing in the industry is a lot more adoption of different consumption models. These could be weighted more heavily toward CAPEX or OPEX. With the public cloud, there's now also the ability to transition licenses between your on-premises infrastructure and your public cloud resources.
Finally, we'd like to talk to you about how Redapt can help you accelerate your Kubernetes journey, whether for the storage platform or the overall Kubernetes distribution. We provide needs assessments and analysis, where we come in and discover what your applications require and what type of platforms would be best suited. We help you evaluate those platforms and select them. We can help with primary, secondary, and DR solution designs. Many of these are enabled through a multi-cloud strategy, so you're not dependent on on-premises infrastructure or a single public cloud provider to make sure that your application stays up and running.
We have custom engineering services around the Kubernetes platform, regardless of the type of distribution that you're considering, as well as application modernization services, which help you bring those monolithic applications into more of a web scale architecture.
For any of the solutions you select, we have the ability to do hardware interoperability testing and optimization. This comes in handy, especially when you're looking at open source code or software-defined storage platforms.
One of the major strengths we have at Redapt is our ability to deliver infrastructure globally at scale. We make sure you have the right infrastructure and the right platforms in place quickly, so you have a quicker time to value. I'd like to thank you for joining us. If you would like to learn more, please visit us at redapt.com/content. Thank you.