Manage your Kubernetes clusters in a Kubernetes native way

How the Cluster Management API, now being adopted by many cloud providers, can help you manage your Kubernetes clusters

Background

In last year’s KubeCon Shanghai and KubeCon Seattle, some keynotes delivered the message that Kubernetes is becoming “boring,” but I would actually say that Kubernetes is becoming “mature.”

This maturity is driven by the growing Kubernetes ecosystem: developers are moving up the stack and running more services on top of Kubernetes. Kubernetes is becoming an integration engine and innovation platform, with new services like Kubeflow for machine learning, Knative for serverless, and Istio for service mesh emerging in its ecosystem. Not to mention, developers are also extending their applications to a variety of environments: hybrid cloud, multi-cloud with Federation V2, and edge cloud with projects like Rancher K3s, just to name a few.

The growing ecosystem of Kubernetes also means that developers are increasingly using a Kubernetes-native methodology, based on CRDs and controllers (like cluster-registry, Federation V2, and Kubernetes Operators), to address new requirements. This Kubernetes-native way also makes it easier for developers to keep up with the new services and functions built on Kubernetes.

The Cluster Management API, or Cluster API for short, was born under these conditions. For hybrid cloud and multi-cloud environments, we needed a Kubernetes-native way to help users manage (create/delete/add-node/delete-node) their Kubernetes clusters on different cloud providers, like AWS, Azure, GCE, OpenStack, and IBM Cloud™.

Introduction to the Cluster API

The Cluster API is a Kubernetes project that brings the Kubernetes-native way to cluster creation, configuration, and management. It provides optional, additive functionality on top of the Kubernetes core.

The Cluster API can be treated as a common framework for Kubernetes cluster management, and you can implement your own cloud provider based on it. There are now many cloud provider implementations based on the Cluster API, including AWS, Azure, GCE, OpenStack, and IBM Cloud. You can find the full list of providers in the Cluster API repository.

You can also refer to the what-and-why-of-cluster-api post for a more detailed explanation of why a developer would need the Cluster API.

Playing With cluster-api-provider-openstack

The cluster-api-provider-openstack is a cloud provider implementation of cluster-api for OpenStack. I’ll show you some steps on how to work with it.

Currently, there are three ways to provision your Kubernetes cluster via the Cluster API: you can use Minikube, Kind, or an existing Kubernetes cluster. We encourage you to use either Kind or an existing Kubernetes cluster, as they are faster. Here, I'll show you how to use Kind or an existing Kubernetes cluster to provision your Kubernetes clusters.

Prerequisites
  • Install kubectl
  • Install Kind if you do not have an existing Kubernetes cluster; refer to the Kind installation doc for steps on how to install it.
  • Build clusterctl command as follows:
  $ git clone https://github.com/kubernetes-sigs/cluster-api-provider-openstack $GOPATH/src/sigs.k8s.io/cluster-api-provider-openstack
  $ cd $GOPATH/src/sigs.k8s.io/cluster-api-provider-openstack/cmd/clusterctl
  $ go build
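
  A quick way to sanity-check the build is to print the command's usage:

  $ ./clusterctl --help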

Prepare clouds.yaml

The clouds.yaml file specifies your OpenStack configuration parameters; these parameters determine where your Kubernetes clusters will be provisioned and managed.

The following is an example for my environment. The project_id identifies which project you want to get resources from to provision your Kubernetes cluster. You can refer to https://github.com/kubernetes-sigs/cluster-api-provider-openstack#quick-notes-on-cloudsyaml for more details on clouds.yaml.

clouds:
  openstack:
    auth:
      auth_url: "https://xxx.ibm.com:5000"
      username: "your user name"
      password: "your password"
      project_id: 07962130d7044e3c84e1825859d5bef9
      domain_name: "ibm"
      user_domain_name: "ibm"
    region_name: "RegionOne"
    interface: "public"
    identity_api_version: 3
    verify: false
    cacert: |
      -----BEGIN CERTIFICATE-----
      xxxxxxxx
      -----END CERTIFICATE-----

Generate Cluster Creation files

After your clouds.yaml is ready, you need to use it to generate your Kubernetes cluster creation files as follows:

cd examples/openstack
./generate-yaml.sh [options] <path/to/clouds.yaml> <openstack cloud> <provider os>

openstack cloud is the cloud you are going to use; you can get it from the clouds.yaml file. In my case, the openstack cloud is openstack, as defined in clouds.yaml.

provider os specifies the operating system of the virtual machines that Kubernetes will run on; currently, only Ubuntu and CentOS are supported.

In my case, I use the following command to provision my Kubernetes cluster in the OpenStack cloud with Ubuntu as the VM operating system.

./generate-yaml.sh ./clouds.yaml openstack ubuntu

After this command finishes, it generates two things: a folder named out in the current directory, and a new SSH key pair stored as $HOME/.ssh/openstack_tmp and $HOME/.ssh/openstack_tmp.pub.

The out folder includes three files: cluster.yaml, machines.yaml, and provider-components.yaml.

The cluster.yaml mainly defines your Kubernetes cluster name, Kubernetes CIDR for Pods and Services, Kubernetes Service Domain, Cloud Provider, and more. The following is an example of the cluster.yaml in my environment.

# cat out/cluster.yaml
apiVersion: "cluster.k8s.io/v1alpha1"
kind: Cluster
metadata:
  name: test1
spec:
    clusterNetwork:
        services:
            cidrBlocks: ["10.96.0.0/12"]
        pods:
            cidrBlocks: ["192.168.0.0/16"]
        serviceDomain: "cluster.local"
    providerSpec:
      value:
        apiVersion: "openstackproviderconfig/v1alpha1"
        kind: "OpenstackProviderSpec"

The machines.yaml file defines the machine spec that you want to provision for your Kubernetes cluster, such as your VM image, VM floating IP, Kubernetes version, OpenStack network UUID, and OpenStack security group. The following is an example machines.yaml file from my environment for the Kubernetes master node.

items:
- apiVersion: "cluster.k8s.io/v1alpha1"
  kind: Machine
  metadata:
    generateName: liugya-master-
    labels:
      set: master
  spec:
    providerSpec:
      value:
        apiVersion: "openstackproviderconfig/v1alpha1"
        kind: "OpenstackProviderSpec"
        flavor: m1.xlarge
        image: KVM-Ubt18.04-Srv-x64
        sshUserName: cloudusr
        keyName: cluster-api-provider-openstack
        availabilityZone: nova
        networks:
        - uuid: e2d9ead6-759b-4592-873d-981d3db07c86
        floatingIP: 9.20.206.22
        securityGroups:
        - uuid: 97acf9d4-e5bf-4fff-a2c0-be0b04fbc44b
        userDataSecret:
          name: master-user-data
          namespace: openstack-provider-system
        trunk: false
    versions:
      kubelet: 1.14.0
      controlPlane: 1.14.0

The provider-components.yaml file defines some CRD resources and controllers for the OpenStack cloud provider. It mainly includes two controllers as follows:

# kubectl get deploy -n openstack-provider-system
NAME                     READY   UP-TO-DATE   AVAILABLE   AGE
clusterapi-controllers   1/1     1            1           3h8m
# kubectl get sts -n system
NAME                 READY   AGE
controller-manager   1/1     3h9m

The controller-manager manages some common resources for Cluster API, like MachineSet, MachineDeployment, and Nodes.

The clusterapi-controllers deployment is implemented by the OpenStack cloud provider; it mainly manages clusters and machines, such as creating cluster resources and provisioning machines on OpenStack.

The SSH key pair enables clusterctl to fetch the provisioned Kubernetes admin.conf from the master node and then migrate all of the controllers from the bootstrap cluster to the provisioned Kubernetes master node.
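
As an illustration, if you ever want to fetch the admin.conf manually, something like the following should work, using the generated key and the sshUserName and floatingIP values from machines.yaml (admin.conf is typically root-readable, so you may need to adjust permissions on the remote host first):

scp -i $HOME/.ssh/openstack_tmp cloudusr@9.20.206.22:/etc/kubernetes/admin.conf ./admin.conf
kubectl --kubeconfig ./admin.conf get nodes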

Create Cluster

After all of the files are generated, we can use the following command to create the Kubernetes cluster:

./clusterctl create cluster --v 4 --bootstrap-type kind --provider openstack  -c examples/openstack/out/cluster.yaml -m examples/openstack/out/machines.yaml  -p examples/openstack/out/provider-components.yaml

The above command uses kind to create a bootstrap cluster, which helps provision the master node for our Kubernetes cluster. After the master node is ready, clusterctl migrates all of the controllers and Kubernetes resources to the newly provisioned master node and deletes the bootstrap cluster. The controllers running on the new master node then continue to provision the other worker nodes.

If you have an existing Kubernetes cluster, you can use the following command to create your Kubernetes cluster:

./clusterctl create cluster --bootstrap-cluster-kubeconfig /root/.kube/config --provider openstack  -c examples/openstack/out/cluster.yaml -m examples/openstack/out/machines.yaml  -p examples/openstack/out/provider-components.yaml

Please note the clusterctl option named bootstrap-cluster-kubeconfig: it specifies the kubeconfig of your existing Kubernetes cluster, which is then used as the bootstrap cluster.

After the cluster is created, you can log on to your newly provisioned Kubernetes cluster and check your cluster info by using the following commands:

# kubectl get clusters
NAME    AGE
test1   33h
# kubectl get machines
NAME                  AGE
liugya-master-fx9nn   33h
liugya-node-qknd7     33h
# kubectl get nodes
NAME                  STATUS   ROLES    AGE     VERSION
liugya-master-cpr5j   Ready    master   3h50m   v1.14.0
liugya-node-ngw58     Ready    <none>   3h47m   v1.14.0

Deleting a machine

Since a machine is also a Kubernetes resource, you can use kubectl delete to delete the machine resource as follows (here I am deleting my worker node):

# kubectl delete machine liugya-node-ngw58
machine.cluster.k8s.io "liugya-node-ngw58" deleted
# kubectl get machines
NAME                  AGE
liugya-master-cpr5j   4h38m
# kubectl get nodes
NAME                  STATUS   ROLES    AGE     VERSION
liugya-master-cpr5j   Ready    master   4h40m   v1.14.0

You can see the worker node is now deleted, and if you go to your OpenStack dashboard, you will see that the VM for your worker node was deleted as well.

If you check the log of the clusterapi controller for OpenStack, you will also see the following entries telling you that the machine was deleted:

I0404 06:28:11.145721       1 controller.go:114] Running reconcile Machine for liugya-node-ngw58
I0404 06:28:12.129559       1 controller.go:173] Reconciling machine object liugya-node-ngw58 triggers idempotent update.
I0404 07:24:42.608702       1 controller.go:114] Running reconcile Machine for liugya-node-ngw58
I0404 07:24:42.608762       1 controller.go:147] reconciling machine object liugya-node-ngw58 triggers delete.
I0404 07:24:45.549715       1 controller.go:158] machine object liugya-node-ngw58 deletion successful, removing finalizer.

Adding a machine

Here we can define a machine YAML template to add a new node. You can start from the machine spec in out/machines.yaml; in my case, I created the following machine definition:

# cat machine.yaml
apiVersion: "cluster.k8s.io/v1alpha1"
kind: Machine
metadata:
  name: liugya-node-1
  labels:
    set: node
spec:
  providerSpec:
    value:
      apiVersion: "openstackproviderconfig/v1alpha1"
      kind: "OpenstackProviderSpec"
      flavor: m1.medium
      image: KVM-Ubt18.04-Srv-x64
      sshUserName: cloudusr
      keyName: cluster-api-provider-openstack
      availabilityZone: nova
      networks:
      - uuid: e2d9ead6-759b-4592-873d-981d3db07c86
      floatingIP: 9.20.206.8
      securityGroups:
      - uuid: 97acf9d4-e5bf-4fff-a2c0-be0b04fbc44b
      userDataSecret:
        name: worker-user-data
        namespace: openstack-provider-system
      trunk: false
  versions:
    kubelet: 1.14.0

Then create the machine via kubectl apply, like the following:

# kubectl apply -f machine.yaml
machine.cluster.k8s.io/liugya-node-1 created
# kubectl get machines
NAME                  AGE
liugya-master-cpr5j   5h24m
liugya-node-1         5s

You will see that the new machine was created. If you go back to your OpenStack dashboard, you will see that a new VM is being provisioned. The clusterapi controller for OpenStack also has the following logs telling you that a new machine is being created:

I0404 08:11:32.266191       1 controller.go:114] Running reconcile Machine for liugya-node-1
I0404 08:11:32.274115       1 controller.go:114] Running reconcile Machine for liugya-node-1
I0404 08:11:32.883271       1 controller.go:184] Reconciling machine object liugya-node-1 triggers idempotent create.
I0404 08:11:33.898700       1 actuator.go:132] Creating bootstrap token
W0404 08:12:21.943293       1 controller.go:186] unable to create machine liugya-node-1: Operation cannot be fulfilled on machines.cluster.k8s.io "liugya-node-1": the object has been modified; please apply your changes to the latest version and try again
I0404 08:12:22.944113       1 controller.go:114] Running reconcile Machine for liugya-node-1
I0404 08:12:23.969128       1 controller.go:173] Reconciling machine object liugya-node-1 triggers idempotent update.
I0404 08:12:24.639494       1 actuator.go:217] Populating current state for boostrap machine liugya-node-1
I0404 08:12:25.245633       1 controller.go:114] Running reconcile Machine for liugya-node-1
I0404 08:12:26.043068       1 controller.go:173] Reconciling machine object liugya-node-1 triggers idempotent update.

After the machine is provisioned by OpenStack, you'll see that the new machine has joined your Kubernetes cluster as well.

# kubectl get machines
NAME                  AGE
liugya-master-cpr5j   5h40m
liugya-node-1         16m
# kubectl get nodes
NAME                  STATUS   ROLES    AGE     VERSION
liugya-master-cpr5j   Ready    master   5h40m   v1.14.0
liugya-node-1         Ready    <none>   14m     v1.14.0

Troubleshooting

When you run clusterctl create, you can always add the option --v 10 to get more logs from clusterctl and see what might be going wrong.
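
For example, here is the same create command from above with maximum verbosity:

./clusterctl create cluster --v 10 --bootstrap-type kind --provider openstack  -c examples/openstack/out/cluster.yaml -m examples/openstack/out/machines.yaml  -p examples/openstack/out/provider-components.yaml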

You can also check the logs of the clusterapi-controllers-xxx pod in the openstack-provider-system namespace for error details if your Kubernetes cluster management runs into issues. Let's use my cluster as an example of how to debug.

# kubectl get pods -n openstack-provider-system
NAME                                     READY   STATUS    RESTARTS   AGE
clusterapi-controllers-cdf99445c-lfxhg   1/1     Running   0          32h
# kubectl logs -f clusterapi-controllers-cdf99445c-lfxhg -n openstack-provider-system
I0402 05:30:44.926979       1 main.go:73] Initializing Dependencies.
W0402 05:30:44.928905       1 controller.go:58] environment variable NODE_NAME is not set, this controller will not protect against deleting its own machine
2019/04/02 05:30:44 Starting the Cmd.
I0402 05:30:45.130286       1 controller.go:114] Running reconcile Machine for liugya-master-fx9nn
I0402 05:30:45.130359       1 controller.go:89] Running reconcile Cluster for test1
I0402 05:30:45.130376       1 controller.go:127] reconciling cluster object test1 triggers idempotent reconcile.
I0402 05:30:45.130384       1 actuator.go:34] Reconciling cluster test1.
I0402 05:30:46.034994       1 controller.go:173] Reconciling machine object liugya-master-fx9nn triggers idempotent update.
I0402 05:30:46.124564       1 networkservice.go:52] Reconciling network components for cluster default/test1
I0402 05:30:46.124608       1 secgroupservice.go:71] Reconciling security groups for cluster default/test1
I0402 05:30:46.126950       1 controller.go:114] Running reconcile Machine for liugya-node-qknd7
I0402 05:30:46.229409       1 controller.go:114] Running reconcile Machine for liugya-node-qknd7
I0402 05:30:46.325294       1 controller.go:89] Running reconcile Cluster for test1
I0402 05:30:46.325329       1 controller.go:127] reconciling cluster object test1 triggers idempotent reconcile.
I0402 05:30:46.325338       1 actuator.go:34] Reconciling cluster test1.
I0402 05:30:47.008618       1 controller.go:184] Reconciling machine object liugya-node-qknd7 triggers idempotent create.
I0402 05:30:47.080219       1 networkservice.go:52] Reconciling network components for cluster default/test1
I0402 05:30:47.080247       1 secgroupservice.go:71] Reconciling security groups for cluster default/test1
I0402 05:30:48.360307       1 actuator.go:132] Creating bootstrap token
E0402 05:31:21.794720       1 actuator.go:319] Machine error liugya-node-qknd7: Associate floatingIP err: Resource not found
W0402 05:31:21.795091       1 controller.go:186] unable to create machine liugya-node-qknd7: Associate floatingIP err: Resource not found
I0402 05:31:22.795322       1 controller.go:114] Running reconcile Machine for liugya-node-qknd7
I0402 05:31:23.290910       1 controller.go:173] Reconciling machine object liugya-node-qknd7 triggers idempotent update.
I0402 05:31:24.142941       1 actuator.go:217] Populating current state for boostrap machine liugya-node-qknd7
I0402 05:31:26.224401       1 controller.go:114] Running reconcile Machine for liugya-node-qknd7
I0402 05:31:27.652795       1 controller.go:173] Reconciling machine object liugya-node-qknd7 triggers idempotent update.
I0402 05:31:28.324527       1 actuator.go:217] Populating current state for boostrap machine liugya-node-qknd7
I0402 05:31:29.224497       1 controller.go:114] Running reconcile Machine for liugya-node-qknd7

Finally, you can check the /var/log/cloud-init-output.log file for more details on the post-install process. OpenStack uses cloud-init to run user-data that performs post-install work on your provisioned VM, such as running apt-get update, installing Docker, and installing Kubernetes via kubeadm. Checking the cloud-init log is a good way to find error messages from the VM post-installation.
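
For example, using the SSH key pair generated earlier together with the sshUserName and floatingIP values from the machine spec above, you can follow the log while the node bootstraps:

ssh -i $HOME/.ssh/openstack_tmp cloudusr@9.20.206.8 tail -f /var/log/cloud-init-output.log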

Cluster API plans for the future

The Cluster API has just released 0.1.0 and is still at a very early stage, but we are glad to see it being adopted by many cloud providers, like AWS, Azure, GCE, and OpenStack. The Cluster API community is still working on defining the goals, requirements, and use cases for Cluster API post-v1alpha. When that is complete, we'll dive into the design changes required to meet those adjusted goals, requirements, and use cases.

If you have any comments or suggestions for the Cluster API, please do not hesitate to post them to the Google docs.

Kubernetes Vs Docker

This video on "Kubernetes vs Docker" will help you understand the major differences between these tools and how companies use these tools.

We will compare Kubernetes and Docker on the following factors:

  1. Definition
  2. Working
  3. Deployment
  4. Autoscaling
  5. Health check
  6. Setup
  7. Tolerance ratio
  8. Public cloud service providers
  9. Companies using them

Introduction to Microservices, Docker, and Kubernetes

Learn the basics of Microservices, Docker, and Kubernetes. The code demo starts at 18:45.

Deployment YAML: https://pastebin.com/rZa9Dm1w

Dockerfile: https://morioh.com/p/59a594cc28dc

Kubernetes vs. Docker

In this post, you'll see the differences and similarities between two of the most influential open source projects of cloud computing.

Kubernetes vs. Docker is a topic that has been raised numerous times in the cloud computing industry. Whether you come from a non-technical background and need a quick introduction, or you need to make a business decision, I hope that the following points will clarify this matter once and for all.

We need to look beyond the hype that surrounds both Kubernetes and Docker; it's important to grasp what these words mean before running your business on top of them.

The Symbiosis Between Kubernetes and Docker

The question “Kubernetes vs. Docker?” in itself is rather absurd, like comparing apples to oranges. One isn’t an alternative to the other. Quite the contrary, Kubernetes can run without Docker and Docker can function without Kubernetes. But Kubernetes can (and does) benefit greatly from Docker and vice versa.

Docker is a standalone application which can be installed on any computer to run containerized applications. Containerization is an approach of running applications on an OS such that the application is isolated from the rest of the system. You create an illusion for your application that it is getting its very own OS instance, although there may be other containers running on the same system. Docker is what enables us to run, create and manage containers on a single operating system.
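
As a quick illustration (assuming Docker is installed), the following runs a busybox container in the background and shows that it sees only its own processes, even though the host may be running many other containers:

$ docker run -d --name demo busybox sleep 3600
$ docker exec demo ps    # lists only the processes inside this container
$ docker ps              # lists all containers running on this host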

Kubernetes turns it up to eleven. If you have Docker installed on a bunch of hosts (different operating systems), you can leverage Kubernetes. These nodes or Docker hosts can be bare metal servers or virtual machines. Kubernetes can then allow you to automate container provisioning, networking, load-balancing, security and scaling across all these nodes from a single command-line or dashboard. A collection of nodes that are managed by a single Kubernetes instance is referred to as a Kubernetes cluster.

Now, why would you need to have multiple nodes in the first place? The two main motivations behind it are:

  1. To make the infrastructure more robust — your application stays online even if some of the nodes go offline, i.e., high availability.
  2. To make your application more scalable — if the workload increases, simply spawn more containers and/or add more nodes to your Kubernetes cluster (see the example after this list).
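
For example, once your application runs as a Kubernetes Deployment, scaling it out is a one-liner (my-app is a hypothetical deployment name):

kubectl scale deployment my-app --replicas=5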

“Kubernetes automates the process of scaling, managing, updating and removing containers. In other words, it is a container orchestration platform. Docker, meanwhile, is at the heart of containerization: it enables us to have containers in the first place.”

Differences Between Kubernetes and Docker

In principle, Kubernetes can work with any containerization technology. Two of the most popular options that Kubernetes can integrate with are rkt and Docker. However, Docker has won the greatest market segment, and that has led to more effort in perfecting the integration between Docker and Kubernetes than for any other containerization technology.

Similarly, Docker Inc., the company behind Docker, offers its own container orchestration engine, named Docker Swarm. But even they acknowledged that Kubernetes has risen to such prominence that Docker for Desktop (macOS and Windows) now comes with its own Kubernetes distribution.

If anyone was nervous about adopting Kubernetes for their Docker-based product, that last point should dispel all doubts. Both projects have wholeheartedly embraced each other and have benefited tremendously from this symbiosis.

Similarities Between Kubernetes and Docker

These projects are more than technologies; they are communities of people who, despite their differences, comprise some of the brightest minds in the industry. When like-minded individuals collaborate, they exchange bright ideas and learn best practices from one another.

Here are some of the ideas that both Kubernetes and Docker share:

  1. Their love for microservice-based architecture (more on this later).
  2. Their love for the open source community; both are largely open source projects.
  3. They are largely written in Go, which allows them to be shipped as small, lightweight binaries.
  4. They use human-readable YAML files to specify application stacks and their deployments (see the sketch after this list).
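
As a sketch of point 4, here is a minimal Kubernetes Deployment in YAML; the my-app name and image are purely illustrative:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
      - name: my-app
        image: my-app:1.0   # hypothetical image
        ports:
        - containerPort: 8080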

In theory, you can learn about one without having a clue about the other. But keep in mind that in practice you will benefit a lot more if you start with the simple case of Docker running on a single machine, and then gradually understand how Kubernetes comes into play.

Let’s go deeper into this topic...

What Is Docker?

There are two ways of looking at Docker. The first approach involves seeing Docker containers as really lightweight Virtual Machines. The second approach is to see Docker as a software packaging and delivery platform. This latter approach proved a lot more helpful to human developers and resulted in widespread adoption of the technology.

Let’s look at the two different viewpoints more closely...

An Overview of Docker Containers

Traditionally, cloud service providers used Virtual Machines to isolate running applications from one another. A hypervisor, or host operating system, provides virtual CPU, memory and other resources to many guest operating systems. Each guest OS works as if it is running on actual physical hardware, and it is, ideally, unaware of other guests running on the same physical server.

VMware was one of the first to popularize this concept. However, there are several problems with this virtualization. First of all, the provisioning of resources takes time. Each virtual disk image is large and bulky and getting a VM ready for use can take up to a minute!

Second, and more important, was the inefficient utilization of system resources. OS kernels are control freaks that want to manage everything that's supposedly available to them. So when a guest OS thinks 2GB of memory is available to it, it takes control of that memory even if the applications running on that OS use only half of it.

On the other hand, when we run containerized applications, we virtualize the operating system (your standard libraries, packages, etc.) itself, not the hardware. Now, instead of providing virtual hardware to a VM, you provide a virtual OS to your application. You can run multiple applications and impose limitations on their resource utilization if you want, and each application runs oblivious to the hundreds of other containers it runs alongside.
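
With Docker, for instance, such resource limits are just flags on docker run (the image name is illustrative):

docker run -d --memory 512m --cpus 1 my-app:1.0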

Docker — As a Developer’s Tool

One of the problems developers have is the difference between the production server, where the application runs, and their own dev machines (usually laptops and workstations), where applications are developed. Let's imagine that you have Windows 10 running on your desktop but you want to write applications for Ubuntu 18.04. Maybe you are using Python 3.6 to write your application, while the Ubuntu server is still running Python 3.4.

There are just too many variables to take into account, so we use Docker to abstract that complexity away. Docker can be installed on any OS; even Windows and Mac OS X are well supported. So you can package your code into a Docker image and run and test it locally using Docker, guaranteeing that the containers created from that Docker image will behave the same way in production.

Note: All the dependencies like the version of programming language, standard library, etc., are all contained within that image.

This way of looking at Docker images as a software package has led to the following popular quote:

“Docker will do to apt what apt did to tar.”

Apt, the package manager, still uses tar under the hood, but users never have to worry about it. Similarly, while using Docker we never have to worry about the package manager, although it is still present. Even when developing on top of, say, Node.js, developers prefer building their Docker images on top of Node's official Docker image.
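
For example, a minimal Dockerfile for a Node.js app might look like this (the file and app names are illustrative):

FROM node:10             # Node.js and its standard library come from the official base image
WORKDIR /app
COPY package*.json ./
RUN npm install          # dependencies are baked into the image, not pulled from the host
COPY . .
CMD ["node", "index.js"]

You then build and run it the same way on any machine:

docker build -t my-node-app .
docker run --rm -p 3000:3000 my-node-app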

So, that’s a brief overview of what Docker is and why one might want to know about it even if they are not involved in DevOps.

Let’s continue with Kubernetes now.

What Is Kubernetes?

Kubernetes takes containerization technology, as described above, and turns it up to eleven. It allows us to run containers across multiple compute nodes (these can be VMs or bare metal servers). Once Kubernetes takes control of a cluster of nodes, containers can be spun up or torn down depending on our needs at any given time.

If you visit their official site, Kubernetes states its purpose quite plainly as:

“Kubernetes is an open-source system for automating deployment, scaling, and management of containerized applications.”

So far we have presented only a brief overview of Kubernetes as automating a bunch of container creation. But there is more: an app needs storage, and there are DNS records to manage. You need to make sure that the participating compute nodes are securely connected with one another, and so on. Having a set of different nodes instead of a single host brings a whole different set of problems.

A brief overview of the Kubernetes architecture will help us shed some light on how it manages to achieve all of this and much more.

Kubernetes Architecture — A Brief Overview

There are two basic concepts you need to know about a Kubernetes cluster. The first is a node: a common term for the VMs and/or bare metal servers that Kubernetes manages. The second is a pod: the basic unit of deployment in Kubernetes. A pod is a collection of related Docker containers that need to coexist. For example, your web server may need to be deployed with a Redis caching server, so you can encapsulate the two of them into a single pod, and Kubernetes deploys them side by side. If it makes matters simpler for you, you can picture a pod as consisting of a single container, and that would be fine.
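
A sketch of such a pod, pairing an nginx web server with a Redis cache in a single deployable unit:

apiVersion: v1
kind: Pod
metadata:
  name: web-with-cache
spec:
  containers:
  - name: web
    image: nginx:1.15
  - name: cache
    image: redis:5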

Coming back to the nodes, there are two types. One is the master node, where the heart of Kubernetes is installed; it controls the scheduling of pods across the various worker nodes, where your application actually runs. The master node's job is to make sure that the desired state of the cluster is maintained.

Here's a brief summary of the main components of the Kubernetes architecture.

On Kubernetes Master we have:

  1. kube-controller-manager: This is responsible for taking into account the current state of the cluster (e.g., X number of running pods) and making decisions to achieve the desired state (e.g., having Y number of active pods instead). It listens on kube-apiserver for information about the state of the cluster.
  2. kube-apiserver: This API server exposes the gears and levers of Kubernetes. It is used by WebUI dashboards and command-line utilities like kubectl. These utilities are in turn used by human operators to interact with the Kubernetes cluster.
  3. kube-scheduler: This is what decides how events and jobs will be scheduled across the cluster, depending on the availability of resources, policies set by operators, etc. It, too, listens on kube-apiserver for information about the state of the cluster.
  4. etcd: This is the “storage stack” for the Kubernetes master nodes. It uses key-value pairs and is used to save policies, definitions, secrets, the state of the system, etc.

We can have multiple master nodes so that Kubernetes can survive even the failure of a master node.

On a worker node we have:

  1. kubelet: This relays information about the health of the node back to the master and executes instructions given to it by the master node.
  2. kube-proxy: This network proxy allows various microservices of your application to communicate with each other, within the cluster, as well as expose your application to the rest of the world if you so desire. Each pod can talk to every other pod via this proxy, in principle.
  3. Docker: This is the last piece of the puzzle. Each node has a Docker engine to manage the containers.

There is, of course, a lot more to Kubernetes, and I encourage you to explore all of it.

Industry-Wide Adoption of Docker and Kubernetes

A lot of the concepts we have discussed so far sound good on paper, but are they economical? Will they actually help your business grow, reduce downtime and save resources both in terms of human hours and computing horsepower?

Docker in Production

The answer is simple when it comes to adopting Docker: especially if you are adopting a microservice-based architecture for your software, you should definitely use Docker containers for each microservice.

The technology is quite mature, and very little can be said against it. Keep in mind that merely containerizing your code won't make it better for you: try to avoid monolithic designs and go for microservices if you actually want to make use of a containerization platform.

Kubernetes in Production

One can't be blamed for ranting about Kubernetes in production, and the reason behind it, in my personal opinion, is two-fold.

First, most organizations blindly jump in without any understanding of the basic concepts of a distributed system. They try to set up their own Kubernetes cluster and use it to host simple websites or a small, scalable application.

“This is quite risky if you don't have in-depth knowledge of the system. Things can easily break down.”

Second, Kubernetes is rapidly evolving, and other organizations are adding their own special sauce to it, like service meshes, networking plugins, etc. Most of these are open source and therefore appealing to operators. However, running them in production is not what I would recommend: keeping up with them requires constant maintenance of your cluster and costs more human hours.

However, there are cloud-hosted Kubernetes platforms that organizations can use to run their applications. The worldwide availability of the data centers offered by companies like AWS, Azure, Joyent, or GCE can actually help you get the most out of the distributed nature of Kubernetes. And, of course, you don't have to worry about maintaining the cluster.

This is something small and medium-scale organizations often miss. If you want to survive node failures and get high scalability, you shouldn't run Kubernetes on a single 1U rack or even in a single data center.

So, Kubernetes in production? Yes, but for most folks, I would recommend cloud-hosted solutions.

Containers and A New Age of Cloud Computing

Docker wasn't pitched as OS-level virtualization software; it was marketed as a software packaging and delivery mechanism. The sole reason Docker containers got the attention that their competition didn't is this software delivery approach.

Automated builds are a lot easier thanks to Dockerfiles. Complex multi-container deployments are now standardized thanks to docker-compose. Software engineers have taken containers to their logical extreme by providing complete CI/CD solutions involving building and testing Docker images and managing public or private Docker registries.

Kubernetes has freed containers from being stuck on a single computer, making the cloud an ever more enticing place for this technology. Slowly but surely, containerization will become the norm for every cloud-dependent service, and it's therefore really important to adopt this technology earlier rather than later. Doing so will minimize migration costs and associated risks.

A Case for The Distributed Operating System

Now that I have ranted about companies adopting Kubernetes without fully understanding it, allow me to make a case for why you should adopt Kubernetes. Cloud computing has evolved into a highly competitive market, with Google, Microsoft, Amazon, and many other players competing with one another.

This has drastically reduced the cost of deploying your software in the cloud. The best thing about Kubernetes is that it's largely open source, so you can understand what's happening without getting too bogged down in the details.

Here is Azure pitching its Kubernetes service:

“Use Azure Kubernetes Service to create and manage Kubernetes clusters. Azure will handle cluster operations, including creating, scaling, and upgrading, freeing up developers to focus on their application. To get started, create a cluster with Azure Kubernetes Service.”

Just knowing how it works at a surface level lets you reason about your software as it runs in a distributed system. But you don't have to worry about actually managing the underlying cluster!
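
For instance, with the Azure CLI, standing up a managed cluster and pointing kubectl at it takes just a few commands (the resource names are illustrative):

az aks create --resource-group my-rg --name my-cluster --node-count 3 --generate-ssh-keys
az aks get-credentials --resource-group my-rg --name my-cluster
kubectl get nodes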

Similar solutions are being offered by Amazon, Google, and soon DigitalOcean. Even small businesses and individual developers can now scale their applications across the entire planet. A little understanding of how this is achieved doesn't hurt, so you should at least have a passing familiarity with Kubernetes and Docker.

Every time you ask, “Kubernetes vs. Docker?” naysayers will respond that Docker is cool, but Kubernetes is a little extreme. But computer science is all about extreme automation, and Kubernetes takes the containerization model to its logical extreme!

More Subtle Differences — Networking

A lot of Kubernetes vs. Docker debates have roots in basics like the implementation of the storage stack and networking. Docker and Kubernetes each like to do things differently.

A container needs a lot more than just a CPU and some memory to be useful. There are a lot of subtle differences between running an application on a platform like Kubernetes vs. Docker hosts. These differences are too many to be mentioned concisely here, but one that always catches my attention is the networking side of things.

Kubernetes specifies that each pod should be able to freely communicate with every other pod in the cluster, in a given namespace. Docker, on the other hand, has the concept of virtual network topologies, and you have to specify which networks you want your containers to connect to. Distinctions like these can really put off people trying to test the waters, but they follow from the fundamental difference between Kubernetes and Docker:

“The former is meant to run across a cluster while the latter runs on a single node.”

There’s really no alternative to this dilemma and you just need to be patient as you move along the learning curve. Gradually, the bigger picture will become clearer to your eyes.
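
To make the contrast concrete, here is a sketch of the two models (the container, image, and pod names are illustrative). On a single Docker host, containers talk only over networks you explicitly create and attach them to; in a Kubernetes cluster, every pod can reach every other pod by default:

# Docker: containers must share an explicitly created network to talk
docker network create backend
docker run -d --name cache --network backend redis:5
docker run -d --name api --network backend my-api:1.0

# Kubernetes: any pod can reach any other pod's IP directly, no wiring required
kubectl get pods -o wide                       # note a pod's IP
kubectl exec my-pod -- wget -qO- http://<pod-ip>:8080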

Adoption Mindset for Docker vs. Kubernetes

With Docker, the benefits are rather obvious: if you ship your application in a Docker container, then it can run on any Linux distro. Even Illumos-based operating systems, which are not Linux at all, support Docker and can run Docker containers.

Your application can be broken down into several microservices, and each microservice can then be packaged as a Docker container. With a well-defined API, new features can easily be added to existing ones: for example, if you want analytics, just spin up a Hadoop container that can talk to the database.

Similarly, when it comes to Kubernetes, both users and cloud service providers can benefit greatly by adopting it. Since it is based on containerization, cloud service providers can achieve a high density of containers, using their resources efficiently, unlike with traditional VMs. This allows them to significantly lower prices.

Users, on the other hand, can deploy their app across the globe reducing latency and improving the user experience.

The only exception to this shift is desktop application developers: most desktop apps may use the cloud for updates and/or backups, but they are designed to run on a single machine.

Conclusion

Containers are amazing! They allow us to think about services and systems in a completely new way. Both Docker and Kubernetes are here to stay, and they are continuously changing to transform themselves into something better. Keep your company involved in the container era and adopt the containers that your infrastructure needs most.

Designing newer software for a container-centric platform not only makes your apps more scalable but also more future-proof. Sticking to old VMs might work for now, but a few years down the line you will eventually have to either bear the heavy cost of migrating everything into containers or abandon your projects altogether. Hopefully, now if someone brings up the topic of “Kubernetes vs. Docker” you won't get swept away by the jargon.