Container Image Building with Kaniko
Kaniko is a project that Google started in early 2018. Its goal is to remove the need for elevated privileges when building container images. If you’ve been following this series, you’ll already know that unprivileged container image builds are one of the most sought-after features for security-conscious organizations. This is particularly pertinent when building container images in Kubernetes clusters.
So, what is Kaniko and how does it work?
Kaniko uses the familiar objects that you would ordinarily use to build container images: a Dockerfile containing build instructions, and a build context, which contains the Dockerfile and any other artifacts required for the image.
However, unlike the native Docker client build command, it doesn’t require a Docker daemon to execute the build steps.
Instead, Kaniko uses its own ‘executor’ for executing the build steps, which runs inside a container (which may be part of a Kubernetes pod). The build context, of course, needs to be made available to the container or pod in order for the build to consume it. Because the build steps are executed inside a container, there is still an ultimate dependency on the Docker daemon or some other container runtime, but the important distinction is that the build steps themselves are executed by the executor code rather than by a daemon.
The executor iterates over each build stage defined in the Dockerfile. It pulls the base image specified for the build stage and unpacks its rootfs. It then executes each Dockerfile instruction in sequence, adding to or changing the content of the rootfs as it goes. If changes are made to the rootfs, the executor snapshots the filesystem changes as a ‘diff’ layer and updates the image metadata where necessary. The snapshotting is achieved by comparing the prior state of the filesystem to the state after the execution of the Dockerfile instruction. This is not unlike the behavior of copy-on-write filesystems like overlayfs, except the action is performed entirely in userspace. Once the build steps are complete, the diff layers are added to the base image in sequence to form the new image.
“What happens to the built image? Won’t the image disappear with the spent container or pod?” I hear you ask.
Well, the executor also expects a container registry repository name or names to be specified on the command line where it will push the new image on completion of a successful build.
The following snippet of YAML describes the spec of a pod running the Kaniko executor:
spec:
  containers:
    - name: executor
      image: gcr.io/kaniko-project/executor:latest
      args:
        - "--context"
        - "git://github.com/mycorp/my-app.git"
        - "--destination"
        - "quay.io/mycorp/my-app:1d03df1"
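Pushing to the destination repository normally requires registry credentials, which the snippet above leaves out. As a rough sketch, one common approach is to mount a Docker config file at /kaniko/.docker inside the executor container. The pod name and the Secret name (regcred) below are hypothetical, and the Secret is assumed to be of type kubernetes.io/dockerconfigjson and to exist already in the namespace:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: kaniko-build            # hypothetical pod name
spec:
  restartPolicy: Never          # a build is a one-shot task
  containers:
    - name: executor
      image: gcr.io/kaniko-project/executor:latest
      args:
        - "--context=git://github.com/mycorp/my-app.git"
        - "--destination=quay.io/mycorp/my-app:1d03df1"
      volumeMounts:
        # The executor looks for registry credentials in /kaniko/.docker/config.json
        - name: registry-credentials
          mountPath: /kaniko/.docker
  volumes:
    - name: registry-credentials
      secret:
        secretName: regcred     # assumed to exist; not part of the original example
        items:
          # Expose the Secret's Docker config under the file name the executor expects
          - key: .dockerconfigjson
            path: config.json
```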
Kaniko is quite flexible when it comes to defining the source of the build context. A volume mount can be used to supply a directory, local to the container, that contains the context, while the example above highlights the use of a git repository as a source.
Given Kaniko’s origin, it would be a surprise if cloud storage didn’t figure as a source for build contexts. Sure enough, Kaniko supports object storage from the three main public cloud providers: Google Cloud Storage, Amazon S3, and Azure Blob Storage.
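To make the different context sources concrete, here is a minimal sketch of a pod that reads its context from a mounted volume, with the object storage alternatives shown as comments. The pod name, claim name, bucket names, storage account, and archive paths are all placeholders, and the cloud credentials that the object storage variants would need are omitted. Contexts held in object storage are expected to be uploaded as compressed tar archives:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: kaniko-local-context          # hypothetical pod name
spec:
  restartPolicy: Never
  containers:
    - name: executor
      image: gcr.io/kaniko-project/executor:latest
      args:
        # A directory inside the container, supplied by the volume mount below
        - "--context=dir:///workspace"
        # Object storage alternatives (use one instead of the line above):
        # - "--context=gs://my-bucket/path/to/context.tar.gz"
        # - "--context=s3://my-bucket/path/to/context.tar.gz"
        # - "--context=https://myaccount.blob.core.windows.net/mycontainer/path/to/context.tar.gz"
        - "--destination=quay.io/mycorp/my-app:1d03df1"
      volumeMounts:
        - name: build-context
          mountPath: /workspace
  volumes:
    - name: build-context
      persistentVolumeClaim:
        claimName: build-context-claim  # hypothetical claim holding the Dockerfile and artifacts
```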
A hugely important component of container image building is the ability to make use of previous invocations of image build steps, which are often cached for this purpose. If a build step will result in the same output, the builder will use the cached content rather than executing the build step all over again.
Kaniko provides two caching capabilities for enhanced image building:
Base images can be downloaded to a local, shared volume, which can be made available to the executor container when it’s invoked. This saves executor containers from pulling the same base image each time that they’re invoked, which saves on build time. Kaniko provides the ‘warmer’ for this purpose. You just need to specify the local directory for storing the images and the relevant images you want to cache.
The intermediate image layers created during image builds are also a key ingredient of the build cache. Kaniko can cache the layers created by RUN Dockerfile instructions, but it stores them in a remote container registry repository rather than locally. Before the executor processes a RUN instruction, the nominated repository is checked for an equivalent layer. If one is found, it’s pulled from the repository instead of the instruction being executed.
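The two caching capabilities might be wired together roughly as follows. This is a sketch rather than a recommended setup: the pod names, the claim name (kaniko-cache-claim), the base image passed to the warmer, and the cache repository (quay.io/mycorp/my-app-cache) are assumptions made for illustration. The warmer pod is run first to populate the shared volume, and the executor pod then mounts the same volume and points at a remote repository for cached layers:

```yaml
# Step 1: the warmer downloads base images into a shared volume
apiVersion: v1
kind: Pod
metadata:
  name: kaniko-warmer                 # hypothetical pod name
spec:
  restartPolicy: Never
  containers:
    - name: warmer
      image: gcr.io/kaniko-project/warmer:latest
      args:
        - "--cache-dir=/cache"        # where the base images are stored
        - "--image=alpine:3.19"       # placeholder; repeat --image for each base image
      volumeMounts:
        - name: kaniko-cache
          mountPath: /cache
  volumes:
    - name: kaniko-cache
      persistentVolumeClaim:
        claimName: kaniko-cache-claim # hypothetical shared claim
---
# Step 2: the executor reuses the warmed base images and the remote layer cache
apiVersion: v1
kind: Pod
metadata:
  name: kaniko-cached-build           # hypothetical pod name
spec:
  restartPolicy: Never
  containers:
    - name: executor
      image: gcr.io/kaniko-project/executor:latest
      args:
        - "--context=git://github.com/mycorp/my-app.git"
        - "--destination=quay.io/mycorp/my-app:1d03df1"
        - "--cache=true"                                # enable layer caching
        - "--cache-dir=/cache"                          # base images written by the warmer
        - "--cache-repo=quay.io/mycorp/my-app-cache"    # assumed repository for cached layers
      volumeMounts:
        - name: kaniko-cache
          mountPath: /cache
          readOnly: true              # the executor only reads the warmed images
  volumes:
    - name: kaniko-cache
      persistentVolumeClaim:
        claimName: kaniko-cache-claim
```

Registry credentials for pushing to the destination and cache repositories are left out here for brevity.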
Because Kaniko doesn’t use the Docker daemon’s build API endpoint to execute build steps, it is in a much better position when it comes to security. The executor can be run as an unprivileged container and doesn’t require mounting a Docker daemon socket into the container. This is an important step in the quest to securely run container image builds in pods on Kubernetes.
The downside is that the build steps are likely to require execution as the root (uid=0) user. Generally speaking, containers should be run as a non-root user because, unless user namespaces are in use, root in the container is root on the host. If components of the base image’s rootfs need to be owned by the root user, or commands need to be executed with root privileges, running as root is unavoidable with Kaniko.
Some of the concerns associated with running the executor container as the root user can be partially mitigated. It’s possible to protect the host on which the executor container runs by providing additional isolation. Google’s own gVisor userspace kernel abstraction is a good example of such isolation, as is the Kata Containers runtime.
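On Kubernetes, this kind of additional isolation is typically selected per pod through a RuntimeClass. The sketch below assumes that the cluster’s nodes already have gVisor installed and that the container runtime exposes it under a handler named runsc; the RuntimeClass name (gvisor) and the pod name are illustrative, and the handler name must match the node configuration:

```yaml
# A RuntimeClass that maps a friendly name onto the node's gVisor handler
apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: gvisor                  # illustrative name, referenced by the pod below
handler: runsc                  # assumed handler name configured on the nodes
---
apiVersion: v1
kind: Pod
metadata:
  name: kaniko-sandboxed-build  # hypothetical pod name
spec:
  runtimeClassName: gvisor      # run the executor inside the gVisor sandbox
  restartPolicy: Never
  containers:
    - name: executor
      image: gcr.io/kaniko-project/executor:latest
      args:
        - "--context=git://github.com/mycorp/my-app.git"
        - "--destination=quay.io/mycorp/my-app:1d03df1"
        # Under gVisor the executor may not detect that it is running in a
        # container; if the build refuses to start, the --force flag may be needed.
        # - "--force"
```

A Kata Containers setup would look much the same, with a different RuntimeClass and handler.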
Kaniko is one of a new breed of image builder tools that seek to remove the long-standing dependency on the Docker daemon. It does this well and provides a credible container image building experience within Kubernetes clusters, without the security horrors normally associated with building against a Docker daemon. For this reason, Kaniko is a popular image build tool and regularly features as a backend build task for Tekton Pipelines.
As popular as it is, Kaniko doesn’t quite achieve the nirvana of rootless container image builds and also perpetuates the sequential nature of builds based on the Dockerfile.