For some people, it is a replacement for automation and configuration management tools – leaving complex imperative deployment tools behind and moving on to declarative deployments, which simplify things but grant full flexibility to developers nonetheless.
Kubernetes is more than a screen onto which people project their expectations. It is currently one of the most active open source projects, and many large and small companies are working on it. A large community is organizing itself under the umbrella of the Cloud Native Computing Foundation (CNCF), which belongs to the Linux Foundation. The focus is on Kubernetes itself, of course, but other projects such as Prometheus, OpenTracing, CoreDNS and Fluentd are also part of the CNCF by now. The Kubernetes project is essentially organized through Special Interest Groups (SIGs). The SIGs communicate via Slack, GitHub and weekly meetings that anyone can attend.
In this article, the focus is less on the operation and internals of Kubernetes than on the user interface. We explain the building blocks of Kubernetes that are needed to set up your own applications or build pipelines on a Kubernetes cluster.
On a single computer, distributing resources is largely the job of the operating system. Kubernetes performs a similar role in a Kubernetes cluster. It manages resources such as memory, CPU and storage, and distributes applications and services, packaged as containers, across the cluster nodes. Containers themselves have greatly simplified the workflow of developers and helped them to become more productive. Now Kubernetes takes the containers into production. This global resource management has several advantages, such as more efficient utilization of resources, seamless scaling of applications and services and, most importantly, high availability and lower operational costs. For orchestration, Kubernetes exposes its own API, which is usually addressed via the CLI kubectl.
The most important functions of Kubernetes are:
- Containers are launched in so-called pods.
- The Kubernetes scheduler ensures that the resource requirements of all pods are met when they are placed on nodes in the cluster.
- Containers can be found via services. Service discovery allows containers distributed across the cluster to be addressed by name.
- Liveness and readiness probes continuously monitor the state of applications on the cluster.
- The Horizontal Pod Autoscaler can automatically adjust the number of replicas based on different metrics (e.g. CPU).
- New versions can be rolled out via rolling updates.
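Several of these functions show up directly in a manifest. The following is a minimal sketch rather than a recommended configuration: the name web, the image example.org/web:1.0.0, the /healthz path and all numbers are assumptions made for illustration. It uses a Deployment, a concept that is introduced further below.

```yaml
# Illustrative sketch: names, image, probe path and numbers are assumptions.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3
  strategy:
    type: RollingUpdate          # new versions replace old pods step by step
    rollingUpdate:
      maxSurge: 1                # at most one extra pod during an update
      maxUnavailable: 0          # never drop below the desired replica count
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: web
        image: example.org/web:1.0.0
        ports:
        - containerPort: 8080
        resources:
          requests:              # the scheduler only picks nodes with this much free capacity
            cpu: 100m
            memory: 128Mi
          limits:
            memory: 256Mi
        readinessProbe:          # pod receives traffic only while this check succeeds
          httpGet:
            path: /healthz
            port: 8080
          periodSeconds: 5
        livenessProbe:           # container is restarted if this check keeps failing
          httpGet:
            path: /healthz
            port: 8080
          initialDelaySeconds: 10
          periodSeconds: 10
```

Setting maxUnavailable to 0 is one possible choice here; it trades a slower rollout for constant capacity during the update.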
The concepts below, described only briefly, are what you typically need to start a simple application on Kubernetes.
- Namespace: Namespaces can be used to divide a cluster into several logical units. By default, namespaces are not really isolated from each other. However, there are certain ways to restrict users and applications to certain namespaces.
- Pod: Pods represent the basic concept for managing containers. They can consist of several containers, which are then launched together in a common context on a node. These containers always run together. If you scale a pod, the same containers are started together again. A pod is practical in that it lets the user run processes together that originate from different container images. An example would be a separate process which sends a service's logs to a central logging service. In the common context of a pod, containers can share network and storage. This allows porting applications to Kubernetes which had previously run together on a machine or VM. The advantage is that you can keep the release and development cycles of the individual containers separate. However, developers should not make the mistake of pushing all processes of a machine into a single pod. Doing so would cost them the flexibility to distribute resources evenly across the cluster and to scale the processes separately.
- Label: One or more key/value pairs can be assigned to each resource in Kubernetes. Using a selector, corresponding resources can be identified from these pairs. This means that resources can be grouped by labels. Some concepts such as services and ReplicaSets use labels to find pods.
- Service: Kubernetes services are based on a virtual construct – an abstraction, or rather a grouping of existing pods, which are matched using labels. With the help of a service, these pods can then, in turn, be found by other pods. Since pods themselves are very volatile and their addresses within a cluster can change at any time, services are assigned specific virtual IP addresses. These IP addresses can also be resolved via DNS. Traffic sent to these addresses is passed on to the matching pods.
- ReplicaSet: A ReplicaSet is also a grouping, but instead of making pods locatable, its purpose is to ensure that a certain number of pods is running in the cluster. A ReplicaSet declares how many instances of a pod are to run in the cluster. If there are too many, some are terminated until the designated number is reached. If too few are running, new pods are launched.
- Deployment: Deployments are based on ReplicaSets. More specifically, Deployments are used to manage ReplicaSets. They take care of starting, updating, and deleting ReplicaSets. During an update, a Deployment creates a new ReplicaSet and scales its pods up. Once the new pods are running, the old ReplicaSet is scaled down and ultimately deleted. A Deployment can also be paused or rolled back.
- Ingress: Pods and services can only be accessed from within a cluster, so if you want to make a service accessible externally, you have to use another concept. Ingress objects define which ports and services can be reached externally. Unfortunately, Kubernetes itself does not ship a controller that acts on these objects. However, there are several implementations from the community, the so-called Ingress controllers. A typical one is the nginx Ingress controller.
- ConfigMaps and Secrets: Furthermore, there are two concepts for configuring applications in Kubernetes. Both concepts are quite similar, and typically the configuration is passed to the pod using either the file system or environment variables. As the name suggests, sensitive data is stored in Secrets.
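The pod, ConfigMap and Secret concepts described above can be sketched together in one manifest. This is an illustration only: all object names, both images (example.org/app and example.org/log-shipper), the log path and the configuration keys are hypothetical. It shows an application container and a log-shipping sidecar that share a volume inside one pod, with configuration passed in as environment variables.

```yaml
# Illustrative sketch: all names, images, keys and paths are assumptions.
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
  namespace: default
data:
  LOG_LEVEL: info
---
apiVersion: v1
kind: Secret
metadata:
  name: app-secret
  namespace: default
type: Opaque
stringData:
  password: change-me            # placeholder value, never commit real secrets
---
apiVersion: v1
kind: Pod
metadata:
  name: app-with-log-shipper
  namespace: default
  labels:
    app: demo
spec:
  volumes:
  - name: logs
    emptyDir: {}                 # scratch volume that lives as long as the pod
  containers:
  - name: app
    image: example.org/app:1.0.0
    envFrom:
    - configMapRef:
        name: app-config         # every key becomes an environment variable
    env:
    - name: DB_PASSWORD
      valueFrom:
        secretKeyRef:
          name: app-secret
          key: password
    volumeMounts:
    - name: logs
      mountPath: /var/log/app
  - name: log-shipper            # sidecar that reads what the app writes
    image: example.org/log-shipper:1.0.0
    volumeMounts:
    - name: logs
      mountPath: /var/log/app
      readOnly: true
```

Because both containers live in the same pod, they are always scheduled onto the same node and scaled together, which is exactly why unrelated processes should not be packed into one pod.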
An example application
To deploy a simple application to a Kubernetes cluster, a Deployment, a Service, and an Ingress object are required. In this example, we deploy a simple web server which responds with a Hello World website. The Deployment defines two replicas of a pod, each with one container of giantswarm/helloworld. Both the Deployment and the pods are labeled helloworld, and the Deployment is located in the default namespace (Listing 1).
    # apps/v1 replaces extensions/v1beta1, which is no longer served since Kubernetes 1.16
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: helloworld
      labels:
        app: helloworld
      namespace: default
    spec:
      replicas: 2
      selector:
        matchLabels:
          app: helloworld
      template:
        metadata:
          labels:
            app: helloworld
        spec:
          containers:
          - name: helloworld
            image: giantswarm/helloworld:latest
            ports:
            - containerPort: 8080
To make the pods accessible in the cluster, an appropriate service needs to be specified (Listing 2). This service is assigned to the default namespace as well and has a selector on the label helloworld.
    apiVersion: v1
    kind: Service
    metadata:
      name: helloworld
      labels:
        app: helloworld
      namespace: default
    spec:
      selector:
        app: helloworld
      ports:
      # a Service needs at least one port; 8080 matches the containerPort
      # of the pods and the port the Ingress routes to
      - port: 8080
        targetPort: 8080
All that is missing now is external access to the service. For this purpose, the service receives an external DNS entry, and the cluster's Ingress controller forwards all traffic that carries this DNS name in its Host header to the helloworld pods (Listing 3).
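    # networking.k8s.io/v1 replaces extensions/v1beta1, which is no longer served since Kubernetes 1.22
    apiVersion: networking.k8s.io/v1
    kind: Ingress
    metadata:
      labels:
        app: helloworld
      name: helloworld
      namespace: default
    spec:
      rules:
      - host: helloworld.clusterid.gigantic.io
        http:
          paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: helloworld
                port:
                  number: 8080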
Note: Kubernetes itself does not carry its own Ingress controller. However, there are some implementations: nginx, HAProxy, Træfik.
Pro tip: If there is a load balancer in front of the Kubernetes cluster, it is usually set up so that traffic is forwarded to the Ingress controller. The Ingress controller's service should then be made available on all nodes via NodePorts. On cloud providers, the service type LoadBalancer is typically used instead. This type ensures that the cloud provider extension of Kubernetes automatically creates and configures a new load balancer.
These YAML definitions can now be stored in individual files or together in a single file, and loaded onto a cluster with kubectl.
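As a rough sketch of the tip above, a service that exposes an Ingress controller on every node could look like this. The name, namespace, selector label and node port numbers are assumptions and depend on how the particular Ingress controller was installed.

```yaml
# Illustrative sketch: name, namespace, label and port numbers are assumptions.
apiVersion: v1
kind: Service
metadata:
  name: ingress-controller
  namespace: kube-system
spec:
  type: NodePort                 # on a cloud provider, LoadBalancer would be used here instead
  selector:
    app: ingress-controller      # must match the labels of the controller pods
  ports:
  - name: http
    port: 80
    targetPort: 80
    nodePort: 30080              # reachable on this port on every node
  - name: https
    port: 443
    targetPort: 443
    nodePort: 30443
```

An external load balancer would then be pointed at ports 30080 and 30443 of the cluster nodes.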
kubectl create -f helloworld-manifest.yaml
YAML files can be bundled into Helm charts, which avoids a constant struggle with individual YAML files. Helm is a tool for installing and managing complete applications. The YAML files are incorporated into the charts as templates, which makes it possible to produce different configurations from the same chart. This allows developers to run their application from the same chart in a test environment and, with a different configuration, in the production environment. In other words, if Kubernetes is the cluster's operating system, then Helm is its package manager. Helm (up to version 2) does, however, need a service called Tiller, which can be installed on the cluster via helm init. The following commands can then be used to install Jenkins on the cluster:
helm repo update
helm install stable/jenkins
The Jenkins chart will then be loaded from GitHub. There are also so-called application registries, which can manage charts, similar to container images (for example quay.io). Developers can now use the installed Jenkins to deploy their own Helm Charts, although this does require the installation of a Kubernetes-CI-Plug-in for Jenkins. This will result in a new Build Step, which can deploy the Helm Charts. The plug-in automatically creates a Cloud configuration in Jenkins and also configures the login details for the Kubernetes API.
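To illustrate how chart templates allow different configurations, here is a small sketch of a hypothetical chart for the helloworld example. The file names, the value keys replicaCount and imageTag, and the tag 1.0.0 are assumptions for illustration; the three files are shown in one block and separated by comments.

```yaml
# --- templates/deployment.yaml (hypothetical chart excerpt) ---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ .Release.Name }}-helloworld
spec:
  replicas: {{ .Values.replicaCount }}     # filled in from the values file
  selector:
    matchLabels:
      app: helloworld
  template:
    metadata:
      labels:
        app: helloworld
    spec:
      containers:
      - name: helloworld
        image: "giantswarm/helloworld:{{ .Values.imageTag }}"
        ports:
        - containerPort: 8080

# --- values.yaml (chart defaults, e.g. for a test environment) ---
replicaCount: 1
imageTag: latest

# --- values-production.yaml (overrides for production) ---
replicaCount: 4
imageTag: "1.0.0"
```

The production overrides would be passed to helm install with the -f flag, so the same chart yields one replica in test and four in production.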
Distributed software can be challenging. This is the main reason why Kubernetes provides even more concepts that simplify the construction of such architectures. In most cases, these building blocks are special variations of the resources described above. They can also be used to configure, isolate or extend resources.
- Job: Starts one or more pods and ensures that they run to successful completion.
- CronJob: Starts a Job at a specific time or on a recurring schedule.
- DaemonSet: Ensures that pods are distributed to all nodes (or only to selected ones).
- PersistentVolume, PersistentVolumeClaim: Define the storage media in the cluster and their assignment to pods.
- StorageClass: Defines the storage options available in the cluster.
- StatefulSet: Similar to a ReplicaSet, it starts a specific number of pods. These, however, have a stable, identifiable ID that remains assigned to the pod even after a restart or relocation, which is useful for databases.
- NetworkPolicy: Allows the definition of a set of rules that control network traffic in a cluster.
- RBAC: Role-based access control in a Cluster.
- PodSecurityPolicy: Defines what certain pods are allowed to do, for example which of a host's resources a container can access.
- ResourceQuota: Restricts usage of resources inside a Namespace.
- HorizontalPodAutoscaler: Scales Pods, based on the Cluster’s metrics.
- CustomResourceDefinition: Adds a custom object type to the Kubernetes API. With a custom controller, these objects can then also be managed within the cluster (see: Operators).
In this context, one should not forget that the community is developing many tools and extensions for Kubernetes. The Kubernetes incubator currently contains 27 additional repositories, and many other software projects offer interfaces to the Kubernetes API or already ship with Kubernetes manifests.
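A few of the resources from this list, sketched as manifests. These are illustrations under assumptions: the object names, the busybox command, the role: frontend label, the schedule and all limits are made up, and the helloworld labels refer to the example application from earlier in the article.

```yaml
# Illustrative sketch: names, schedule, labels and limits are assumptions.
apiVersion: batch/v1
kind: CronJob
metadata:
  name: nightly-report
spec:
  schedule: "0 2 * * *"          # every day at 02:00
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
          - name: report
            image: busybox:1.36
            command: ["sh", "-c", "echo generating report"]
---
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-quota
  namespace: default
spec:
  hard:                          # upper bounds for the whole namespace
    requests.cpu: "4"
    requests.memory: 8Gi
    pods: "20"
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-to-helloworld
  namespace: default
spec:
  podSelector:
    matchLabels:
      app: helloworld            # the pods this policy protects
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          role: frontend         # only these pods may connect
    ports:
    - protocol: TCP
      port: 8080
---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: helloworld
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: helloworld
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 80   # scale out above 80% average CPU utilization
```

Note that a NetworkPolicy only takes effect if the cluster's network plugin supports it, and that CPU-based autoscaling requires the pods to declare CPU requests.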
Kubernetes is a powerful tool, and the sheer depth of every single concept is impressive. It will probably take some time to get a clear overview of everything the tool can do. What matters most is how its concepts build upon each other, forming building blocks that can be combined into whatever is needed at the time. This is one of Kubernetes' main strengths in contrast to regular frameworks, which abstract runtimes and processes and press applications into a specific form. Kubernetes allows a very flexible design in this regard. It is a well-rounded package of IaaS and PaaS that draws upon Google's many years of experience in the field of distributed computing. This experience also shows in the project's contributors, who were able to apply the lessons learned from mistakes made in previous projects such as OpenStack, CloudFoundry and Mesos. Today Kubernetes is in widespread use at all kinds of companies, from GitHub and OpenAI to Disney.
Timo Derstappen is co-founder of Giant Swarm in Cologne. He has many years of experience in building scalable and automated cloud architectures, and he is mainly interested in lightweight product, process and software development concepts. Free software is a basic principle for him.