Understanding Basic Kubernetes Concepts V - Daemon Sets and Jobs
This post is the fifth in a series of blog posts about basic Kubernetes concepts. In the first one I explained the concepts of Pods, Labels, and Replica Sets. In the second post we talked about Deployments. The third post explained the Services concept and in the fourth we looked at Secrets and ConfigMaps. In this final post we will be talking about Daemon Sets and Jobs.
In previous posts we looked at the basics of how to run our applications as pods in Kubernetes. There are two more ways to run pods that are a bit more specialized: Daemon Sets and Jobs.
A daemon set ensures that an instance of a specific pod is running on all (or a selection of) nodes in a cluster. It creates pods on each added node and garbage collects pods when nodes are removed from the cluster.
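To make this more concrete, here is a minimal sketch of what a daemon set definition might look like, built as a plain Python dict and printed as JSON. The name `node-exporter`, the image `prom/node-exporter`, and the `app` label are only illustrative choices, and the `apps/v1` API version is an assumption about a current cluster (older releases served daemon sets from a different API group).

```python
import json


def daemon_set(name, image):
    """Build a minimal DaemonSet manifest as a plain dict."""
    labels = {"app": name}  # illustrative label; any key/value pair works
    return {
        # Assumption: a current cluster. Older releases used another API group.
        "apiVersion": "apps/v1",
        "kind": "DaemonSet",
        "metadata": {"name": name},
        "spec": {
            # The selector must match the labels in the pod template below.
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [{"name": name, "image": image}]},
            },
        },
    }


manifest = daemon_set("node-exporter", "prom/node-exporter")
print(json.dumps(manifest, indent=2))
```

Since kubectl accepts JSON as well as YAML, the printed output could be saved to a file and passed to `kubectl apply -f`.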
As the name suggests, you can use daemon sets for running daemons (and other tools) that need to run on all nodes of a cluster. These can be things like cluster storage daemons (e.g. Quobyte, glusterd, ceph, etc.), log collectors (e.g. fluentd or logstash), or monitoring daemons (e.g. Prometheus Node Exporter, collectd, New Relic agent, etc.).
The simplest use case is deploying a daemon to all nodes. However, you might want to split that up into multiple daemon sets, for example if your cluster has nodes with varying hardware and the daemon needs different memory and/or CPU requests on each type of node.
In other cases you might want different logging, monitoring, or storage solutions on different nodes of your cluster. For these cases where you want to deploy the daemons only to a specific set of nodes instead of all, you can use a node selector to specify a subset of nodes for the daemon set. Note that for this to work you need to have labeled your nodes accordingly.
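As a rough sketch of the node selector idea, the helper below adds a `nodeSelector` to a pod spec, which is the part that sits under `spec.template.spec` in a daemon set. The helper name `with_node_selector`, the `disktype: ssd` label, and the `fluentd` container are hypothetical and only serve as an example.

```python
import json


def with_node_selector(pod_spec, node_labels):
    """Return a copy of a pod spec restricted to nodes carrying node_labels."""
    restricted = dict(pod_spec)  # shallow copy, the original stays untouched
    restricted["nodeSelector"] = dict(node_labels)
    return restricted


# Hypothetical pod spec for a log collector.
pod_spec = {"containers": [{"name": "fluentd", "image": "fluentd"}]}

# Only nodes labeled disktype=ssd would get a pod from this daemon set.
ssd_only = with_node_selector(pod_spec, {"disktype": "ssd"})
print(json.dumps(ssd_only, indent=2))
```

For this example the nodes would need to carry the matching label, which can be set with something like `kubectl label nodes <node-name> disktype=ssd`.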
There are four ways to communicate with your daemons:
- Push: The pods are configured to push data to a service, so they do not have clients that need to find them.
- NodeIP and known port: The pods use a hostPort, and clients can access them via this port on each NodeIP (in the range of nodes they are deployed to).
- DNS: The pods can be reached through a headless service by either using the endpoints resource or getting multiple A records from DNS.
- Service: The pods are reachable through a standard service. Clients can access a daemon on a random node using that service. Note that this option doesn't offer a way to reach a specific node.
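Two of the options above map directly to fields in the manifests. The sketch below shows a container that exposes a `hostPort` and a headless service (one whose `clusterIP` is set to the string `None`). The function names, the `node-exporter` name, and port 9100 are illustrative assumptions, not something prescribed by Kubernetes.

```python
import json


def container_with_host_port(name, image, port):
    """Container definition that is reachable on <NodeIP>:<port>."""
    return {
        "name": name,
        "image": image,
        "ports": [{"containerPort": port, "hostPort": port}],
    }


def headless_service(name, selector, port):
    """Service without a cluster IP; DNS returns one A record per pod."""
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name},
        "spec": {
            "clusterIP": "None",  # the literal string "None" makes it headless
            "selector": selector,
            "ports": [{"port": port, "targetPort": port}],
        },
    }


container = container_with_host_port("node-exporter", "prom/node-exporter", 9100)
service = headless_service("node-exporter", {"app": "node-exporter"}, 9100)
print(json.dumps(container, indent=2))
print(json.dumps(service, indent=2))
```

Leaving out the `clusterIP` line would turn the second manifest into a standard service, which is the last option in the list.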
Currently, you cannot update a daemon set. The only way to semi-automatically update the pods is to delete the daemon set with the --cascade=false option, so that the pods will be left on the nodes. Then you create a new daemon set with the same pod selector, but the updated template. The new daemon set will recognize the old pods, but not update them automatically. You will need to force the creation of pods with the new template by manually deleting the old pods from the nodes.
Unlike the typical pod that you use for long-running processes, jobs let you manage pods that are supposed to terminate and not be restarted. A job creates one or more pods and ensures a specified number of them terminate with success.
You can use jobs for typical batch jobs (e.g. a backup of a database), but also for workers that need to work off a certain queue (e.g. image or video converters).
There are three kinds of jobs:
- Non-parallel jobs
- Parallel jobs with a fixed completion count
- Parallel jobs with a work queue
For non-parallel jobs, usually only one pod gets started and the job is considered complete once the pod terminates successfully. If the pod fails, another one gets started in its place.
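A non-parallel job could look roughly like the following sketch. The job name `db-backup`, the image `example/db-backup`, and the command are hypothetical placeholders. Note that the pod template of a job has to set `restartPolicy` to `Never` or `OnFailure`.

```python
import json

# Minimal non-parallel Job: no completions or parallelism set,
# so a single successful pod completes the job.
backup_job = {
    "apiVersion": "batch/v1",
    "kind": "Job",
    "metadata": {"name": "db-backup"},
    "spec": {
        "template": {
            "spec": {
                "containers": [
                    {
                        "name": "backup",
                        "image": "example/db-backup",  # hypothetical image
                        "command": ["/bin/sh", "-c", "echo backing up"],
                    }
                ],
                # Jobs only allow "Never" or "OnFailure" here.
                "restartPolicy": "Never",
            }
        }
    },
}

print(json.dumps(backup_job, indent=2))
```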
For parallel jobs with a fixed completion count the job is complete when there is one successful pod for each value between 1 and the number of completions specified.
For parallel jobs with a work queue, you need to take care that no pod terminates with success unless the work queue is empty. That is, even if a worker has finished its own item, it should only terminate successfully if it knows that all its peers are also done. Once a pod exits with success, all other pods should also have exited or be in the process of exiting.
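The exit rule for work queue workers can be illustrated with a small in-process simulation. This is only a stand-in: threads play the role of pods and a `queue.Queue` plays the role of an external work queue, and the file names are made up. In a real job the queue would live outside the pods (for example in a message broker).

```python
import queue
import threading


def worker(tasks, results):
    """Process items until the queue is empty, then return (exit with success)."""
    while True:
        try:
            item = tasks.get_nowait()
        except queue.Empty:
            # No work left: the only point at which a worker may finish.
            return
        results.append(item.upper())  # stand-in for the real work


tasks = queue.Queue()
for name in ["a.png", "b.png", "c.png", "d.png"]:
    tasks.put(name)

results = []
# Two threads stand in for a job running with a parallelism of 2.
threads = [threading.Thread(target=worker, args=(tasks, results)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sorted(results))  # ['A.PNG', 'B.PNG', 'C.PNG', 'D.PNG']
```

The important part is that a worker never returns while there are still items it could pick up; it only exits once the queue is drained.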
For parallel jobs you can define the requested parallelism. By default it is set to 1 (only a single pod at any time). If parallelism is set to 0, the job is basically paused until it is increased.
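In the job spec, the three kinds of jobs differ only in the `completions` and `parallelism` fields. The helper `job_spec`, the `example/worker` image, and the concrete numbers below are assumptions made for the sake of the example.

```python
import json


def job_spec(pod_template, completions=None, parallelism=None):
    """Build the spec section of a Job; unset fields fall back to defaults."""
    spec = {"template": pod_template}
    if completions is not None:
        spec["completions"] = completions
    if parallelism is not None:
        spec["parallelism"] = parallelism
    return spec


# Hypothetical pod template shared by all variants.
template = {
    "spec": {
        "containers": [{"name": "worker", "image": "example/worker"}],
        "restartPolicy": "OnFailure",
    }
}

non_parallel = job_spec(template)                              # one pod, one success
fixed_count = job_spec(template, completions=8, parallelism=2)  # 8 successes, 2 pods at a time
work_queue = job_spec(template, parallelism=3)                  # 3 workers, no completion count
paused = job_spec(template, parallelism=0)                      # no pods run until this is raised

print(json.dumps(fixed_count, indent=2))
```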
Keep in mind that parallel jobs are not designed for use cases that need closely communicating parallel processes, as in scientific computations, for example. They are meant for working off a specific amount of work that can be parallelized.
This is the last post in this series of Kubernetes basics. However, this doesn't mean that having read all five blog posts makes you a Kubernetes master.
First, the primitives introduced, while maybe not limited to the most basic, do not cover the whole range of primitives available in Kubernetes.
Second, there are new primitives coming with new releases, such as Pet Sets, which were just recently introduced as an alpha resource in Kubernetes 1.3.
So most probably I will be writing about more primitives once I feel there is a need for more or simpler explanations than what is available out there already.
Furthermore, just reading these blog posts and maybe looking through the Kubernetes documentation gives you a good foundation. However, to get proficient with the primitives you need to actually go and try them out and find ways to use them to run and manage actual applications. And there's no excuse that you don't have a cluster at hand; you can just try it out in a local environment.