Securing the Base Infrastructure of a Kubernetes Cluster

Nov 7, 2018

The first article in this series, Securing Kubernetes for Cloud Native Applications, discussed why it's difficult to secure Kubernetes, and provided an overview of the various layers that require our attention when we set about securing the platform.

The very first layer in the stack is the base infrastructure layer. We could define this in many different ways, but for the purposes of our discussion, it's the sum of the infrastructure components on top of which Kubernetes is deployed: the physical or abstracted hardware layer for compute, storage, and networking, and the environment in which these resources exist. It also includes the operating system, most probably Linux, and a container runtime environment, such as Docker.

Much of what we'll discuss applies equally well to infrastructure components that underpin systems other than Kubernetes, but we'll pay special attention to the factors that enhance the security of Kubernetes.


Machines, Data Centers, and the Public Cloud


The adoption of the cloud as the vehicle for workload deployment, whether it's public, private, or a hybrid mix, continues apace. And whilst the need for specialist bare-metal server provisioning hasn't entirely gone away, the infrastructure that underpins the majority of today's compute resource is the virtual machine. It doesn't really matter, however, whether the machines we deploy are virtual (cloud-based or otherwise) or physical: either way, they will reside in a data center, hosted by our own organisation or by a chosen third party, such as a public cloud provider.

Data centers are complex, and there is a huge amount to think about when it comes to security. A data center is a general resource for hosting the data processing requirements of an entire organisation, or even co-tenanted workloads from a multitude of independent organisations across different industries and geographies. For this reason, applying security to the many different facets of infrastructure at this level tends to be a full-blown corporate or supplier responsibility. It will be governed by factors such as national or international regulation (HIPAA, GDPR) and industry compliance requirements (PCI DSS), and often results in the pursuit of certified standards accreditation (ISO 27001, FIPS).

In the case of a public cloud environment, a supplier can and will provide the necessary adherence to regulatory and compliance standards at the infrastructure layer, but at some point it comes down to the service consumer (you and me) to build further on this secure foundation. It's a shared responsibility. As a public cloud service consumer, this raises the question, "what should I secure, and how should I go about it?" There are many views on the topic, but one credible entity is the Center for Internet Security (CIS), a non-profit organisation dedicated to safeguarding public and private entities from the threat of malign cyber activity.


CIS Benchmarks


The CIS provide a range of tools, techniques, and information for combating potential threats to the systems and data we rely on. CIS Benchmarks, for example, are per-platform, best-practice configuration guidelines for security, compiled by consensus amongst security professionals and subject matter experts. In recognition of the ever-increasing number of organisations embarking on transformation programmes that involve migration to public and/or hybrid cloud infrastructure, the CIS have made it their business to provide benchmarks for the major public cloud providers. The CIS Amazon Web Services Foundations Benchmark is one example, and there are similar benchmarks for the other major public cloud providers.

These benchmarks provide foundational security configuration advice, covering identity and access management (IAM), ingress and egress, and logging and monitoring best practice, amongst other things. Implementing these benchmark recommendations is a great start, but it shouldn't be the end of the journey. Each public cloud provider has its own set of detailed recommended best practices [1][2][3], and much can be gained from other expert voices in the domain, such as the Cloud Security Alliance.

Let’s take a moment to look at a typical cloud-based scenario that requires some careful planning from a security perspective.


Cloud Scenario: Private vs. Public Networks


How can we balance the need to keep a Kubernetes cluster secure by limiting access, whilst enabling the required access for external clients via the Internet, and also from within our own organisation?

  • Use a private network for the machines that host Kubernetes - ensure that the host machines that represent the cluster’s nodes don’t have public IP addresses. Removing the ability to make a direct connection with any of the host machines significantly reduces the available options for attack. This simple precaution would, for example, prevent the kind of compromise that sees compute resource exploited for cryptocurrency mining.
  • Use a bastion host to access the private network - external access to the hosts’ private network, which will be required to administer the cluster, should be provided via a suitably configured bastion host. The Kubernetes API will often also be exposed on the private network behind the bastion host. It may also be exposed publicly, but it’s recommended to at least restrict access by whitelisting the IP addresses of an organisation’s internal network and/or its VPN server.
  • Use VPC peering with internal load balancers/DNS - where workloads running in a Kubernetes cluster with a private network need to be accessed by other private, off-cluster clients, the workloads can be exposed with a service that invokes an internal load balancer. For example, to have an internal load balancer created in an AWS environment, the service would need the following annotation: service.beta.kubernetes.io/aws-load-balancer-internal: 0.0.0.0/0 (see the first sketch following this list). If clients reside in another VPC, the VPCs will need to be peered.
  • Use an external load balancer with ingress - workloads are often designed to be consumed by anonymous, external clients originating from the Internet; how is it possible to allow traffic to find the workloads in the cluster, when they’re deployed to a private network? We can achieve this in a couple of different ways, depending on the requirement at hand. The first option is to expose workloads using a Kubernetes service object, which results in the creation of an external cloud load balancer (e.g. an AWS ELB) on a public subnet. This approach can be quite costly, as each exposed service invokes a dedicated load balancer, but it may be the preferred solution for non-HTTP services. For HTTP-based services, a more cost-effective approach is to deploy an ingress controller to the cluster, fronted by a Kubernetes service object, which in turn creates the load balancer (see the second sketch following this list). Traffic addressed to the load balancer’s DNS name is routed to the ingress controller endpoint(s), which evaluate the rules associated with any defined ingress objects, before routing on to the endpoints of the services in the matched rules.
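
To make the last two approaches concrete, below are two minimal sketches. All names and hostnames are hypothetical, and the annotation shown is the AWS-specific one mentioned above; other cloud providers use their own annotations for internal load balancers.

    # Sketch 1: a Service of type LoadBalancer, annotated so that the cloud
    # provider (AWS, in this example) provisions an *internal* load balancer,
    # reachable only from within the VPC or a peered VPC.
    apiVersion: v1
    kind: Service
    metadata:
      name: internal-api                # hypothetical name
      annotations:
        service.beta.kubernetes.io/aws-load-balancer-internal: "0.0.0.0/0"
    spec:
      type: LoadBalancer
      selector:
        app: internal-api               # selects the workload's pods
      ports:
        - port: 80                      # port exposed by the load balancer
          targetPort: 8080              # port the pods listen on

    # Sketch 2: an Ingress object (extensions/v1beta1 at the time of writing),
    # whose rules are evaluated by an ingress controller; the controller itself
    # sits behind a single, shared, public cloud load balancer.
    apiVersion: extensions/v1beta1
    kind: Ingress
    metadata:
      name: web                         # hypothetical name
    spec:
      rules:
        - host: web.example.com         # hypothetical public DNS name
          http:
            paths:
              - path: /
                backend:
                  serviceName: web      # hypothetical backend Service
                  servicePort: 80

Note that with the second approach, a single load balancer can front many HTTP services, which is where the cost saving comes from.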

This scenario demonstrates the need to carefully consider how to configure the infrastructure to be secure, whilst providing the capabilities required for delivering services to their intended audience. It’s not a unique scenario, and there will be other situations that will require similar treatment.


Locking Down the Operating System and Container Runtime


Assuming we’ve investigated and applied the necessary security configuration to make the machine-level infrastructure and its environment secure, the next task is to lock down the host operating system (OS) of each machine, and the container runtime that’s responsible for managing the lifecycle of containers.


Linux OS


Whilst it’s possible to run Microsoft Windows Server as the OS for Kubernetes worker nodes, more often than not, the control plane and worker nodes will run a variant of the Linux operating system. Many factors might govern the choice of Linux distribution (commercials, in-house skills, OS maturity), but if it’s possible, use a minimal distribution designed solely for the purpose of running containers. Examples include CoreOS Container Linux, Ubuntu Core, and the Atomic Host variants. These operating systems have been stripped down to the bare minimum required to run containers at scale, and as a consequence have a significantly reduced attack surface.

Again, the CIS have a number of benchmarks for different flavours of Linux, providing best-practice recommendations for securing the OS. These benchmarks cover the mainstream distributions, such as RHEL, Ubuntu, SLES, Oracle Linux, and Debian. If your preferred distribution isn’t covered, there is a distribution-independent CIS benchmark, and there are often distribution-specific guidelines, such as the CoreOS Container Linux Hardening Guide.


Docker Engine


The final component in the infrastructure layer is the container runtime. In the early days of Kubernetes there was no choice available: the container runtime was necessarily the Docker engine. With the advent of the Kubernetes Container Runtime Interface (CRI), however, it’s possible to remove the Docker engine dependency in favour of a runtime such as CRI-O, containerd, or Frakti [4]. In fact, as of Kubernetes version 1.12, an alpha feature (RuntimeClass) allows multiple container runtimes to run side by side in a cluster. Whichever container runtimes are deployed, they need securing.
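
To give a flavour of the alpha feature, here's a minimal, hypothetical sketch of a RuntimeClass definition and a pod that opts into it. The gvisor/runsc names are purely illustrative, and the handler must correspond to a runtime actually configured on the node; the API is alpha as of Kubernetes 1.12 and subject to change.

    # A RuntimeClass (alpha API as of Kubernetes 1.12) mapping a name to a
    # runtime handler that the node's CRI implementation knows how to invoke.
    apiVersion: node.k8s.io/v1alpha1
    kind: RuntimeClass
    metadata:
      name: gvisor                # hypothetical class name
    spec:
      runtimeHandler: runsc       # hypothetical handler configured on the node

    # A pod opting in to that runtime via the alpha runtimeClassName field.
    apiVersion: v1
    kind: Pod
    metadata:
      name: sandboxed-pod
    spec:
      runtimeClassName: gvisor
      containers:
        - name: app
          image: nginx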

Despite the varied choice, the Docker engine remains the default container runtime for Kubernetes (although this may change to containerd in the near future), and we’ll consider its security implications here. It’s built with a large number of configurable security settings, some of which are turned on by default, but which can be bypassed on a per-container basis. One such example is the whitelist of Linux kernel capabilities applied to each container on creation, which helps to diminish the privileges available inside a running container.
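
As an illustration of how that capability whitelist can be tightened, or bypassed entirely, on a per-container basis, consider the following invocations (the image and capability are just examples):

    # Drop every default capability, then add back only the one this workload
    # genuinely needs (binding to a privileged port, in this example).
    docker run --rm --cap-drop ALL --cap-add NET_BIND_SERVICE nginx

    # The opposite (and dangerous) direction: a privileged container bypasses
    # the capability whitelist and most other isolation mechanisms entirely.
    docker run --rm --privileged nginx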

Once again, the CIS maintain a benchmark for the Docker platform: the CIS Docker Benchmark. It provides best-practice recommendations for configuring the Docker daemon for optimal security. There’s even a handy open source tool (a script) called Docker Bench for Security, which can be run against a Docker engine to evaluate the system for conformance with the CIS Docker Benchmark. The tool can be run periodically to expose any drift from the desired configuration.
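
Running the tool is straightforward; a typical invocation looks something like the following (details may vary between versions, and it needs sufficient privileges to inspect the daemon's configuration):

    # Fetch and run Docker Bench for Security on the Docker host.
    git clone https://github.com/docker/docker-bench-security.git
    cd docker-bench-security
    sudo sh docker-bench-security.sh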

Some care needs to be taken when considering and measuring the security configuration of the Docker engine when it’s used as the container runtime for Kubernetes, because Kubernetes ignores much of the Docker daemon’s configurable functionality in favour of its own security controls. For example, the Docker daemon applies a default whitelist of available Linux kernel system calls to every created container, using a seccomp profile. Unless told otherwise, however, Kubernetes will instruct Docker to create pod containers ‘unconfined’ from a seccomp perspective, giving containers access to each and every available syscall. In other words, what gets configured at the lower ‘Docker layer’ may get undone at a higher level in the platform stack. We’ll cover how to mitigate these discrepancies with security contexts, in a future article.
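
As a brief preview, at the time of writing the default profile can be re-applied at the pod level with an (alpha) annotation, along the lines of the following hypothetical sketch:

    # Pod re-enabling the runtime's default seccomp profile, which Kubernetes
    # would otherwise disable by running containers 'unconfined'.
    apiVersion: v1
    kind: Pod
    metadata:
      name: syscall-constrained     # hypothetical name
      annotations:
        seccomp.security.alpha.kubernetes.io/pod: runtime/default
    spec:
      containers:
        - name: app
          image: nginx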


Summary


It might be tempting to focus all our attention on the secure configuration of the Kubernetes components of a platform. But as we’ve seen in this article, the lower-layer infrastructure components are equally important, and are ignored at our peril. In fact, a secure infrastructure layer can even mitigate problems we might introduce in the cluster layer itself: keeping our nodes private, for example, will prevent an inadequately secured kubelet from being exploited for nefarious purposes. Infrastructure components deserve the same level of attention as the Kubernetes components themselves.

In the next article, we’ll move on to discuss the implications of securing the next layer in the stack, the Kubernetes cluster components.


Footnotes


  1. AWS Security Best Practices 

  2. Azure Security Best Practices and Patterns 

  3. Best Practices for Enterprise Organizations (Google Cloud Platform) 

  4. Smaller, lighter container runtimes like containerd, singularly built for bootstrapping containers, are inherently more secure because of their dedicated purpose.
