It can be quite challenging for an SRE team to maintain the well-being of a large-scale Kubernetes-based system with hundreds or thousands of services. In this blog post, Gigi Sayfan, author of “Mastering Kubernetes”, outlines the SRE challenge and how we can achieve the ultimate goal of automated SRE with Kubernetes operators.
You are part of an SRE team responsible for the well-being of a large-scale Kubernetes-based system with hundreds or thousands of services, possibly integrated with multiple 3rd party providers, maybe managing some hardware too. That’s a lot of responsibility. Everything is moving fast and you have to keep it all together. In this blog post we will talk about the SRE challenge, the ultimate goal of automated SRE, introduce Kubernetes operators and see how they can help us towards our goal. Finally, we will survey some of the current operator frameworks to get you ready to build your own Kubernetes operators.
The developers are cranking out code and new features, and upgrading their systems. The data keeps growing. The bits keep flowing. Everybody wants more capacity, better performance, cost savings, security in depth, total visibility and absolutely no downtime. After all, SRE stands for site reliability engineering. The site better be reliable!
If it sounds daunting, the main reason is that it is daunting!
The SRE discipline and methodology emerged as a means to address this very problem. Let’s explore what happens when you take SRE to the limit.
Federico Garcia Lorca once said “Besides black art, there is only automation and mechanization.” If you ever SSH’ed into a broken production server and started fixing it manually you know all about black art.
Automation has multiple positive effects:
Automation is a virtuous cycle. The moment you start automating tasks, you don’t only save time on the task you automated; you also strengthen the automation culture in your organization and open the door to more automation.
The endgame is to have autonomous systems that can take care of themselves: self-heal, upgrade, patch security vulnerabilities and, in general, just work.
There are some situations where you need human oversight, but those situations become rarer and rarer as you improve your automation and gain more confidence that it can handle more real-world situations.
Runbooks and checklists are a staple of professional operators. You can think of each runbook as an opportunity for automation. If the rules are encoded in a runbook, do you actually need a human to perform them?
So, automation is good. But how do we go about it? Let’s take a page from Kubernetes’ own book.
Kubernetes is in its essence a bunch of control loops. It manages various resources like pods, deployments, config maps and secrets. It stores the state of those resources in etcd and then it runs multiple controllers. Each controller is responsible for a specific resource type. Its job is to reconcile the actual state of the resource with its desired state.
The following diagram shows the Kubernetes architecture:
The controller manager is a process that contains all these controllers. The controllers watch for different events, as well as for changes to the manifests that represent their resources. When they detect that the actual state is different from the desired state, they take action.
For example the ReplicaSetController manages replica sets. If a replica set has a replica count of 3 and the ReplicaSetController detects that there are currently only 2 pods running, it will create another pod to get it back to 3.
But, if a user changed the replica count in the YAML from 3 to 2 then the ReplicaSetController will kill one of the 3 pods.
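The replica set example can be sketched as a tiny reconcile function. This is only an illustration of the control-loop idea in plain Python, not how the real controller is implemented; the `ReplicaSet` class, the `reconcile` function and the pod names are all hypothetical.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class ReplicaSet:
    """Toy stand-in for a replica set: a desired count plus the pods that exist."""
    desired_replicas: int
    pods: list = field(default_factory=list)


_pod_ids = count(1)  # hypothetical generator for new pod names


def reconcile(rs: ReplicaSet) -> None:
    """Drive the actual number of pods toward the desired replica count."""
    # Too few pods: create more until the counts match.
    while len(rs.pods) < rs.desired_replicas:
        rs.pods.append(f"pod-{next(_pod_ids)}")
    # Too many pods: remove the surplus.
    while len(rs.pods) > rs.desired_replicas:
        rs.pods.pop()


rs = ReplicaSet(desired_replicas=3, pods=["pod-a", "pod-b"])
reconcile(rs)
print(len(rs.pods))  # 3: one pod was created

rs.desired_replicas = 2  # the user edits the replica count
reconcile(rs)
print(len(rs.pods))  # 2: one pod was removed
```

Note that the same function handles both cases: a drift in the actual state and a change to the desired state.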
If you look at the big picture of operations it's all about control loops:
Note that the desired state is not fixed and may change too.
Human operators implement a control loop. They monitor their systems and take action when the desired state (applications that need to be deployed, performance targets, supported versions of 3rd party software) changes and no longer matches the actual state. They also respond when the desired state doesn't change, but the actual system state drifts (nodes going down, manual configuration changes).
The operator pattern in Kubernetes aims to package the knowledge and skills of a human operator in software. It boils down to Kubernetes custom resources and a custom controller that watches the custom resources and usually some additional system. The custom controller works just like Kubernetes controllers and reconciles the desired state in the spec of the custom resource with the actual state that is reflected in the status.
The operator pattern was conceived by CoreOS (which was acquired by Red Hat, which was in turn acquired by IBM) in 2016. Here is the blog post that introduced operators to the world:
The primary motivation was to support stateful applications that often require multiple custom steps for scaling, upgrades, backups, failovers, etc.
Kubernetes can handle stateless workloads pretty well, but it only offers the StatefulSet for stateful workloads. This is by design. The operational knowledge required to manage stateful workloads is often bespoke and outside the scope of Kubernetes itself.
The operator pattern is exactly the right abstraction.
Kubernetes operators take the Kubernetes controller pattern that manages native Kubernetes resources (Pods, Deployments, Namespaces, Secrets, etc) and let you apply it to your own custom resources. Kubernetes extensibility is legendary and operators fit right in.
If we need to define operators in one formula it would be:
Operator = Custom Resource + Controller
Custom resources are Kubernetes objects that you define via CRDs (Custom Resource Definitions). Once a CRD is defined, you can create custom resources based on the definition. Kubernetes stores them, and you can interact with them through the Kubernetes API or kubectl, just like existing resources. Here is a CRD for a candy custom resource.
Don’t be overwhelmed. At the end of the day it defines a simple object that has a name field and a flavor field. Everything else is needed to integrate with Kubernetes and kubectl. For example, the various names in the names section provide a good user experience when presenting information to the user. The schema section allows Kubernetes to validate on your behalf that Candy custom resources adhere to the requirements.
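To make the role of the schema concrete, here is a rough sketch of the kind of check the API server performs on your behalf. It is plain Python rather than a real CRD schema, and it assumes that the name and flavor fields are strings that live under `spec`; the `validate_candy` function and the sample values are hypothetical.

```python
def validate_candy(obj: dict) -> list:
    """Return a list of validation errors; an empty list means the object is valid."""
    spec = obj.get("spec")
    if not isinstance(spec, dict):
        return ["spec: required object is missing"]
    errors = []
    # Assumption: both fields are required strings under spec.
    for field_name in ("name", "flavor"):
        if not isinstance(spec.get(field_name), str):
            errors.append(f"spec.{field_name}: must be a string")
    return errors


chocolate = {
    "kind": "Candy",
    "metadata": {"name": "chocolate"},
    "spec": {"name": "chocolate", "flavor": "sweet"},  # sample values
}
print(validate_candy(chocolate))  # []
print(validate_candy({"kind": "Candy", "spec": {"name": "mystery"}}))
# ['spec.flavor: must be a string']
```

With a real CRD you never write this function yourself. You declare the schema once and Kubernetes rejects non-conforming objects for you.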
While CRDs look a little complicated, the custom resources themselves are pretty straightforward. Here is a chocolate candy custom resource:
With just CRDs and custom resources you can take advantage of Kubernetes and (ab)use it as a persistent database, a REST API and a command-line client.
That’s right. Kubernetes will store all your custom resources in etcd for you and provide CRUD access through its API as well as through kubectl.
For example, we can create the chocolate custom resource via kubectl:
Then, we can list all the candies just like any other resource:
We can get the contents as JSON too. Here, we use the short name cn:
In case you want to access it programmatically then there is a new Kubernetes API endpoint:
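Custom resources are served under `/apis/<group>/<version>/...`, following the same layout as built-in grouped resources. The helper below is a small sketch that builds such a path; the function name is made up, and the group `example.com`, version `v1` and plural `candies` are placeholders rather than the values from the candy CRD.

```python
from typing import Optional


def custom_resource_path(group: str, version: str, plural: str,
                         namespace: Optional[str] = None,
                         name: Optional[str] = None) -> str:
    """Build the REST path under which the API server serves a custom resource."""
    parts = ["/apis", group, version]
    if namespace is not None:
        # Namespaced resources live under /namespaces/<namespace>/
        parts += ["namespaces", namespace]
    parts.append(plural)
    if name is not None:
        parts.append(name)
    return "/".join(parts)


# Placeholder group/version/plural, chosen for illustration only.
print(custom_resource_path("example.com", "v1", "candies", namespace="default"))
# /apis/example.com/v1/namespaces/default/candies
print(custom_resource_path("example.com", "v1", "candies", "default", "chocolate"))
# /apis/example.com/v1/namespaces/default/candies/chocolate
```

Any HTTP client that can authenticate against the API server can then issue GET, POST, PUT or DELETE requests against these paths.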
CRDs and custom resources are useful on their own, but when you write your own controllers to manage them you get to reap the real benefit.
Specifically, operators consist of a controller that has one job - reconcile the actual state with the desired state as specified in the spec of the custom resource.
Let’s explain how operators work with our chocolate custom resource example.
Imagine a chocolate factory. The sweetness spec for each chocolate bar is of course Sweeeeeeet. Our chocolate operator runs in our Kubernetes cluster. It is connected to the chocolate making machine, where it can control, for example, how much sugar to add. It can also sense the sweetness of each manufactured chocolate bar by sampling small bits. If the actual sweetness doesn't match the spec, the specific chocolate bar will be disposed of because it didn’t pass quality control. The custom resource can stick around, but in its status it will record the actual sweetness and whether the chocolate bar was disposed of or not.
Other data analytics pipelines can query the custom resources and provide insights (e.g. a specific machine produces too many non-standard chocolate bars and must be calibrated or fixed).
This way we can bring an external system of a chocolate factory into the fold of Kubernetes and interact with it using Kubernetes concepts and tooling.
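Here is a minimal sketch of what one reconcile pass of such a chocolate operator might look like. Everything in it is hypothetical: sweetness is modeled as a number, and `measure_sweetness` and `adjust_sugar` stand in for whatever interface the machine really exposes.

```python
from dataclasses import dataclass, field


@dataclass
class ChocolateBar:
    """Toy custom resource: spec holds the desired state, status the observed one."""
    name: str
    spec: dict
    status: dict = field(default_factory=dict)


def reconcile(bar: ChocolateBar, measure_sweetness, adjust_sugar) -> ChocolateBar:
    """Compare measured sweetness with the spec and record the outcome in status."""
    desired = bar.spec["sweetness"]
    actual = measure_sweetness(bar.name)
    disposed = actual != desired
    if disposed:
        # The bar fails quality control; nudge the machine toward the spec.
        adjust_sugar(desired - actual)
    bar.status = {"actualSweetness": actual, "disposed": disposed}
    return bar


adjustments = []
bar = ChocolateBar("bar-1", spec={"sweetness": 7})
reconcile(bar, measure_sweetness=lambda name: 5, adjust_sugar=adjustments.append)
print(bar.status)   # {'actualSweetness': 5, 'disposed': True}
print(adjustments)  # [2]
```

The spec is never modified by the operator. Only the status changes, which is what lets other tools query the outcome later.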
Let’s look at a real operator - the etcd operator. As you know, Kubernetes manages its state in an internal etcd cluster. But etcd is a general-purpose key-value store, and you may want to install etcd in your Kubernetes cluster for use by your workloads. It is possible to use the same etcd cluster used by Kubernetes, but it is not a good idea, because it’s considered an implementation detail of Kubernetes and it’s also configured to listen only on localhost.
With the etcd operator you can easily install and manage your own etcd cluster and reap all the benefits. You can find the etcd operator on OperatorHub.io, which is a community site that curates Kubernetes operators.
Here are some of the features you get out of the box:
The Etcd operator manages 3 different CRDs: Cluster, Backup and Restore.
Here is what an etcd Cluster custom resource looks like:
The spec has a size and version field. For example, by modifying the version field you can signal the operator that you want to upgrade your etcd cluster. Safely upgrading a distributed data store is a non-trivial procedure, but the operator encapsulates all the knowledge and lets users just update one field in a YAML file, sit back and watch the magic happen.
Let’s look at some code, just to get a sense of what operator code is like. The Etcd operator is implemented in Go and has multiple packages. Here is the heart of the operator - the reconcile() method of the Cluster type:
We’re not going to analyze each line, but the gist of it is that the operator checks the size field of the spec and compares it to the actual number of members in the cluster. If the numbers don’t match then the operator calls the reconcileMembers() method that resizes the cluster properly.
Then it checks if an upgrade is required and if this is the case, the operator performs a rolling upgrade by upgrading one old member at a time until all members are at the new version.
The operator also makes sure to always update the status to the actual state.
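The flow described above can be condensed into a short Python sketch. It is not the operator's actual Go code: the class, the method names and the version strings are illustrative, and the assumption that each pass changes at most one member is a simplification.

```python
from dataclasses import dataclass, field


@dataclass
class Member:
    name: str
    version: str


@dataclass
class EtcdClusterSketch:
    spec: dict  # desired state: {"size": int, "version": str}
    members: list = field(default_factory=list)
    status: dict = field(default_factory=dict)
    _next_id: int = 0

    def reconcile(self) -> None:
        """One pass: fix the size first, otherwise upgrade at most one old member."""
        if len(self.members) != self.spec["size"]:
            self.reconcile_members()
        else:
            old = [m for m in self.members if m.version != self.spec["version"]]
            if old:
                old[0].version = self.spec["version"]  # rolling upgrade step
        self.update_status()

    def reconcile_members(self) -> None:
        """Add or remove a single member to move toward the desired size."""
        if len(self.members) < self.spec["size"]:
            self.members.append(Member(f"etcd-{self._next_id}", self.spec["version"]))
            self._next_id += 1
        else:
            self.members.pop()

    def update_status(self) -> None:
        """Always record the actual state, whatever happened during this pass."""
        self.status = {
            "size": len(self.members),
            "versions": sorted({m.version for m in self.members}),
        }


cluster = EtcdClusterSketch(spec={"size": 3, "version": "3.2.13"})
for _ in range(3):
    cluster.reconcile()
print(cluster.status)  # {'size': 3, 'versions': ['3.2.13']}

cluster.spec["version"] = "3.3.0"  # the user edits one field
cluster.reconcile()
print(cluster.status)  # {'size': 3, 'versions': ['3.2.13', '3.3.0']}
```

Because each pass makes one small change and then re-reads the state, a crash in the middle of an upgrade simply resumes on the next pass.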
Using operators is typically very simple because all the complexity is encapsulated by the operator. But someone has to write the operator and deal with the complexities of stateful, async, distributed systems as well as integrate with the Kubernetes API machinery. This is not trivial. Luckily the Kubernetes community developed several frameworks to assist in writing Kubernetes controllers in general and operators in particular. Most of these frameworks are Go frameworks, as Go is the implementation language of Kubernetes itself and the most high-fidelity client libraries are also implemented in Go. But there is also one Python framework for you Pythonistas, and one language-agnostic framework.
Kubebuilder is a Go framework for building Kubernetes API extensions based on CRDs, controllers and admission webhooks (to validate custom resources). It is developed under the Kubernetes API Machinery special interest group (SIG). It can be considered the “official” way to build API extensions. In addition, it has a lot of momentum and it provides a lot of capabilities out of the box. It promotes the following workflow:
Under the covers Kubebuilder is using the controller-runtime library for a lot of the heavy lifting.
There is an entire book about Kubebuilder that you can peruse: https://book.kubebuilder.io/
The Operator Framework is another mature framework. It was originally developed by CoreOS, the originators of the operator concept. It is still going strong and has excellent documentation as well as a lot of components. One of the core components is the Operator SDK. However, there is an integration project going on to merge Kubebuilder and the Operator SDK.
The Operator SDK is also built on top of controller-runtime. If and when Kubebuilder assimilates the Operator SDK, it is not clear what the future of the Operator Framework as a whole would be.
The Metacontroller framework is different. It is built on the concept of webhooks. Those webhooks are served by a lambda controller that runs in Kubernetes and invokes your lambda functions, which can be implemented in any language. You get a lot of flexibility and can implement your controllers in any language at the cost of an additional layer of indirection.
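To give a feel for the webhook style, here is a sketch of a sync hook written with only the Python standard library. It assumes the request and response shapes of a Metacontroller CompositeController sync hook as I understand them (a JSON body with a `parent` object, answered with `status` and a list of desired `children`); treat the exact fields as an assumption and check the Metacontroller documentation. The candy parent and the ConfigMap child are made up for illustration.

```python
import json
from http.server import BaseHTTPRequestHandler


def sync(request: dict) -> dict:
    """Pure function: given the observed parent, return its status and desired children."""
    parent = request["parent"]
    name = parent["metadata"]["name"]
    flavor = parent.get("spec", {}).get("flavor", "")
    # Hypothetical child: one ConfigMap per candy, carrying its flavor.
    config_map = {
        "apiVersion": "v1",
        "kind": "ConfigMap",
        "metadata": {"name": f"{name}-recipe"},
        "data": {"flavor": flavor},
    }
    return {"status": {"observedFlavor": flavor}, "children": [config_map]}


class SyncHandler(BaseHTTPRequestHandler):
    """HTTP wrapper: decode the JSON request, call sync(), encode the JSON response."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        request = json.loads(self.rfile.read(length))
        body = json.dumps(sync(request)).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


# Calling the hook logic directly, without starting a server:
response = sync({
    "parent": {"metadata": {"name": "chocolate"}, "spec": {"flavor": "sweet"}},
    "children": {},
})
print(response["children"][0]["metadata"]["name"])  # chocolate-recipe
```

To serve the hook you would pass `SyncHandler` to `http.server.HTTPServer` and call `serve_forever()`. Keeping `sync` a pure function makes the controller logic easy to unit test without a cluster.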
Kopf is a Python operator framework that makes development very Pythonic. Kopf provides both the “outer” toolkit to interact with Kubernetes, deploy your operators and run them in the cluster as well as “inner” libraries to manipulate Kubernetes resources and in particular custom resources.
Operators are an extremely powerful pattern for managing stateful applications in Kubernetes. The conceptual model follows control theory. The utility of the operator pattern became clear as soon as CoreOS introduced it to the world and a plethora of operators are now available. You may consider building operators for your system and if you do, there are a variety of frameworks and tools to assist you along the way.