Kubernetes (k8s) is an open-source container orchestration system that helps deploy and manage containerized applications.
It’s easy to get lost in today’s continuously changing landscape of cloud-native technologies. The learning curve from a beginner’s perspective is quite steep, and without proper context, it becomes increasingly difficult to sift through all the buzzwords.
If you have been developing software, chances are you've heard of Kubernetes by now. Before we jump into what Kubernetes is, it's essential to familiarize ourselves with containerization and how it came about.
In this guide, we are going to paint a contextual picture of how deployments have evolved, what’s the promise of containerization, where Kubernetes fits into the picture, and common misconceptions around it. We’ll also learn the basic architecture of Kubernetes, core concepts, and some examples. Our goal from this guide is to lower the barrier to entry and equip you with a mind map to navigate this landscape more confidently.
Use the links to skip ahead in the guide (unfortunately, section links don’t work on the Medium app):
👉 Evolution of the deployment model 🥚
👉 Demystifying container orchestration 🔓
👉 What exactly is Kubernetes? ☸️
👉 Show me by example ⚽️
👉 Basic features 🐎
👉 Ecosystem 🔄
👉 Common questions ❓
This evolution can be grouped into three rough categories, namely traditional, virtualized, and containerized deployments. Let's briefly touch upon each to better understand this evolution.
Some of us are old enough to remember the archaic days when the most common way of deploying applications was on in-house physical servers. Cloud wasn't a thing yet, so organizations had to plan server capacity ahead of time in order to budget for it. Ordering new servers was a slow and tedious task that took weeks of vendor management and negotiations. Once the shiny new servers did arrive, they came with the overhead of setup, deployment, uptime, maintenance, security, and disaster recovery.
From a deployment perspective, there was no good way to define resource boundaries in this model. Multiple applications deployed on the same server would interfere with each other because of the lack of isolation. That forced organizations to deploy applications on dedicated servers, an expensive endeavor that made poor resource utilization the biggest inefficiency.
In those days, companies had to maintain their in-house server rooms (many still do) and deal with prerequisites like air conditioning, uninterrupted power, and internet connectivity. Even with such high capital and operating expenses, they were limited in their ability to scale as demand increased. Adding capacity to handle increased load meant installing new physical servers. "Write once and run everywhere" was a utopian dream; these were the days of "works on my machine".
Enter virtualization. This solution adds a layer of abstraction on top of physical servers that allows multiple virtual machines to run on any given server. It enforces a level of isolation and security, allowing better resource utilization, scalability, and reduced costs.
It allows us to run multiple apps, each in a dedicated virtual machine offering complete isolation. If one goes down, it doesn’t interfere with the other. Additionally, we can specify resource budgets for each. For example, allocate 40% of physical server resources to VM1 and 60% to VM2.
Okay, so this addresses the isolation and resource utilization issues, but what about scaling with increased load? Spinning up a VM is far faster than adding a physical server. However, scaling VMs is still bound by the available hardware capacity.
This is where public cloud providers come into the picture. They streamline the logistics of buying, maintaining, running, and scaling servers in exchange for a rental fee. This means organizations don't have to plan for capacity beforehand, which significantly brings down both the capital expense of buying servers and the operating expense of maintaining them.
If virtual machines have already addressed the issues of isolation, resource utilization, and scaling, then why are we even talking about containers? Containers take it up a notch. You can think of them as mini virtual machines that, instead of packaging a full-fledged operating system, leverage the underlying host OS for most things. Container-based virtualization allows higher application density and better utilization of server resources.
An important distinction between virtual machines and containers is that a VM virtualizes the underlying hardware, whereas a container virtualizes the underlying operating system. Both have their use cases; in fact, many container deployments use a VM as their host operating system rather than running directly on bare metal.
The emergence of the Docker engine accelerated the adoption of this technology. It has now become the de facto standard for building and sharing containerized apps, from desktop to the cloud. The shift towards microservices as a superior approach to application development is another important factor that has fueled the rise of containerization.
While containers by themselves are extremely useful, they can become quite challenging to deploy, manage, and scale across multiple hosts in different environments. Container orchestration is another fancy word for streamlining this process.
As of today, there are several open-source and proprietary solutions to manage containers out there.
If we look at the open-source landscape, some notable options include
On the other hand, the proprietary landscape is mostly dominated by the major public cloud providers, all of whom came up with home-grown solutions to manage containers. Some of the notable mentions include:
Similar to how Docker became the de facto standard for containerization, Kubernetes has come to rule the container orchestration landscape. That's why most major cloud providers have started to offer managed Kubernetes services as well. We'll learn more about them later in the ecosystem section.
Kubernetes is open-source software that has become the de facto standard for orchestrating containerized workloads in private, public, and hybrid cloud environments.
It was initially developed by engineers at Google, who distilled years of experience in running production workloads at scale into Kubernetes. It was open-sourced in 2014 and has since been maintained by the CNCF (Cloud Native Computing Foundation). It's often abbreviated as k8s, which is a numeronym (starting with the letter "k" and ending with "s", with 8 other characters in between).
Managing containers at scale is commonly described as quite challenging. Why is that? Running a single Docker container on your laptop may seem trivial (we'll see this in the example below), but doing that for a large number of containers, across multiple hosts, in an automated fashion that ensures zero downtime isn't trivial at all.
Let's take the example of a Netflix-like video-on-demand platform consisting of 100+ microservices, resulting in 5000+ containers running atop 100+ VMs of varying sizes. Different teams are responsible for different microservices. They follow a continuous integration and continuous delivery (CI/CD) driven workflow and push to production multiple times a day. The expectation from production workloads is that they are always available, scale up and down automatically as demand changes, and recover from failures when encountered.
In situations like these, the utility of container orchestration tools really shines. Tools like Kubernetes allow you to abstract away the underlying cluster of virtual or physical machines into one unified pool of resources. Typically they expose an API with which you can specify how many containers you'd like to deploy for a given app and how they should behave under increased load. The API-first nature of these tools allows you to automate deployment processes inside your CI pipeline, giving teams the ability to iterate quickly. Being able to manage this kind of complexity in a streamlined manner is one of the major reasons why tools like Kubernetes have gained such popularity.
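To give a feel for what "specifying how many containers you'd like to deploy" looks like, here is a minimal sketch of a Kubernetes Deployment manifest. The app name, labels, and image are made up for illustration; a real manifest would use your own image and resource numbers.

```yaml
# A minimal Deployment manifest (illustrative names and image).
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-app            # hypothetical app name
spec:
  replicas: 3              # desired number of identical Pods
  selector:
    matchLabels:
      app: web-app
  template:
    metadata:
      labels:
        app: web-app
    spec:
      containers:
        - name: web
          image: example/web-app:1.0   # placeholder image
          ports:
            - containerPort: 8080
          resources:
            requests:
              cpu: 100m      # scheduling hint for the control plane
              memory: 128Mi
```

Applying a manifest like this (for example with `kubectl apply -f deployment.yaml`) hands the desired state to the cluster, which then continuously works to make reality match it.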
To understand Kubernetes' view of the world, we first need to familiarize ourselves with its cluster architecture. A Kubernetes cluster is a group of physical or virtual machines divided into two high-level components: the control plane and worker nodes. It's okay if some of the terminology mentioned below doesn't make much sense yet.
Control plane — Acts as the brain for the entire cluster. It is responsible for accepting instructions from users, health-checking all servers, deciding how to best schedule workloads, and orchestrating communication between components. Constituents of the control plane include the kube-apiserver (the front end that exposes the Kubernetes API), etcd (the key-value store holding all cluster state), the kube-scheduler (which assigns workloads to nodes), and the kube-controller-manager (which runs the reconciliation loops).
Worker nodes — These are the machines responsible for accepting instructions from the control plane and running containerized workloads. Each node has the following sub-components: the kubelet (the agent that starts and monitors containers on the node), kube-proxy (which handles network routing for services), and a container runtime (such as containerd) that actually runs the containers.
The key takeaway here is that the control plane is the brain responsible for accepting user instructions and figuring out the best way to execute them, whereas worker nodes are the machines responsible for obeying those instructions and running the containerized workloads.
Now that we have some know-how of the Kubernetes architecture, the next milestone in our journey is understanding the Kubernetes object model. Kubernetes provides a few abstractions that make up the building blocks of any containerized workload.
We’ll go over a few different types of objects available in Kubernetes that you are more likely to interact with:
Note: There are other objects like ReplicationController, ReplicaSet, Job, CronJob, etc. that we have deliberately skipped for simplicity's sake.
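To make the object model a bit more concrete, here is a sketch of the smallest deployable object, a Pod, defined in YAML. The name and labels are placeholders chosen for illustration.

```yaml
# A minimal Pod manifest (name and labels are placeholders).
apiVersion: v1
kind: Pod
metadata:
  name: hello-pod
  labels:
    app: hello
spec:
  containers:
    - name: hello
      image: nginx:1.25    # any container image would do here
      ports:
        - containerPort: 80
```

In practice you rarely create bare Pods like this; higher-level objects such as Deployments create and manage Pods for you, restarting or rescheduling them when they fail.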