Beginner’s Guide to Kubernetes ☸️

Kubernetes (k8s) is an open-source container orchestration system that helps deploy and manage containerized applications.

It’s easy to get lost in today’s continuously changing landscape of cloud-native technologies. The learning curve from a beginner’s perspective is quite steep, and without proper context, it becomes increasingly difficult to sift through all the buzzwords.

If you have been developing software, chances are you may have heard of Kubernetes by now. Before we jump into what Kubernetes is, it’s essential to familiarize ourselves with containerization and how it came about.

In this guide, we are going to paint a contextual picture of how deployments have evolved, what’s the promise of containerization, where Kubernetes fits into the picture, and common misconceptions around it. We’ll also learn the basic architecture of Kubernetes, core concepts, and some examples. Our goal from this guide is to lower the barrier to entry and equip you with a mind map to navigate this landscape more confidently.

Use the links to skip ahead in the guide (unfortunately, section links don’t work on the Medium app):

👉 Evolution of the deployment model 🥚

👉 Demystifying container orchestration 🔓

👉 What exactly is Kubernetes? ☸️

👉 Show me by example ⚽️

👉 Basic features 🐎

👉 Ecosystem 🔄

👉 Common questions ❓

🥚 Evolution of the deployment model

This evolution can be divided into three rough phases: traditional, virtualized, and containerized deployments. Let’s briefly touch on each to better appreciate this evolution.

Bare metal

Some of us are old enough to remember the archaic days when the most common way of deploying applications was on in-house physical servers. Cloud wasn’t a thing yet, so organizations had to plan server capacity in advance in order to budget for it. Ordering new servers was a slow and tedious task that took weeks of vendor management and negotiations. Once the shiny new servers did arrive, they came with the overhead of setup, deployment, uptime, maintenance, security, and disaster recovery.

From a deployment perspective, there was no good way to define resource boundaries in this model. Multiple applications, when deployed on the same server, would interfere with each other because of a lack of isolation. That forced the deployment of applications onto dedicated servers, which ended up being an expensive endeavor, with poor resource utilization as the biggest inefficiency.

In those days, companies had to maintain their in-house server rooms (many still do) and deal with prerequisites like air conditioning, uninterrupted power, and internet connectivity. Even with such high capital and operating expenses, they were limited in their ability to scale as demand increased. Adding capacity to handle increased load meant installing new physical servers. “Write once and run everywhere” was a utopian dream; these were the days of “works on my machine”.

Virtual machines

Enter virtualization. It adds a layer of abstraction on top of physical servers that allows for running multiple virtual machines on any given server. It enforces a level of isolation and security, allowing better resource utilization, scalability, and reduced costs.

It allows us to run multiple apps, each in a dedicated virtual machine that offers complete isolation. If one goes down, it doesn’t interfere with the others. Additionally, we can specify a resource budget for each; for example, allocate 40% of the physical server’s resources to VM1 and 60% to VM2.

Okay, so this addresses the isolation and resource utilization issues, but what about scaling with increased load? Spinning up a VM is much faster than adding a physical server. However, scaling VMs is still bound by the available hardware capacity.

This is where public cloud providers come into the picture. They streamline the logistics of buying, maintaining, running, and scaling servers for a rental fee, which means organizations no longer have to plan for capacity beforehand. This significantly brings down both the capital expense of buying servers and the operating expense of maintaining them.

Containers

If virtual machines already address isolation, resource utilization, and scaling, why are we even talking about containers? Containers take it up a notch. You can think of them as mini virtual machines that, instead of packaging a full-fledged operating system, leverage the underlying host OS for most things. Container-based virtualization allows for higher application density and better utilization of server resources.

An important distinction between virtual machines and containers is that a VM virtualizes the underlying hardware, whereas a container virtualizes the underlying operating system. Both have their use cases; in fact, many container deployments run on top of VMs rather than directly on bare metal.

The emergence of the Docker Engine accelerated the adoption of this technology. It has now become the de facto standard for building and sharing containerized apps, from the desktop to the cloud. The shift towards microservices as an approach to application development is another important factor that has fueled the rise of containerization.

🔓 Demystifying container orchestration

While containers by themselves are extremely useful, they become quite challenging to deploy, manage, and scale across multiple hosts in different environments. Container orchestration is just a fancy term for streamlining this process.

As of today, there are several open-source and proprietary solutions out there for managing containers.

Open-source landscape

If we look at the open-source landscape, some notable options include:

  • Kubernetes
  • Docker Swarm
  • Marathon on Apache Mesos
  • HashiCorp Nomad
  • Titus by Netflix

Proprietary landscape

On the other hand, the proprietary landscape is mostly dominated by the major public cloud providers, all of whom came up with home-grown solutions to manage containers. Some notable mentions include:

  • Amazon Web Services (AWS): Elastic Beanstalk, Elastic Container Service (ECS), Fargate
  • Google Cloud Platform (GCP): Cloud Run, Compute Engine
  • Microsoft Azure: Container Instances, Web Apps for Containers

Gold standard

Similar to how Docker became the de facto standard for containerization, the industry has converged on Kubernetes to rule the container orchestration landscape. That’s why most major cloud providers have started to offer managed Kubernetes services as well. We’ll learn more about them later in the ecosystem section.

☸️ What exactly is Kubernetes?

Kubernetes is open-source software that has become the de facto standard for orchestrating containerized workloads in private, public, and hybrid cloud environments.

It was initially developed by engineers at Google, who distilled years of experience running production workloads at scale into its design. It was open-sourced in 2014 and has since been maintained by the Cloud Native Computing Foundation (CNCF). It’s often abbreviated as k8s, which is a numeronym (starting with “k” and ending with “s”, with 8 other letters in between).

Managing containers at scale is commonly described as quite challenging. Why is that? Running a single Docker container on your laptop may seem trivial (we’ll see this in the example below), but doing that for a large number of containers across multiple hosts, in an automated fashion that ensures zero downtime, isn’t as trivial.

Let’s take the example of a Netflix-like video-on-demand platform consisting of 100+ microservices, resulting in 5000+ containers running atop 100+ VMs of varying sizes. Different teams are responsible for different microservices. They follow a continuous integration and continuous delivery (CI/CD) driven workflow and push to production multiple times a day. The expectation is that production workloads are always available, scale up and down automatically as demand changes, and recover from failures when they occur.

In situations like these, the utility of container orchestration tools really shines. Tools like Kubernetes allow you to abstract away the underlying cluster of virtual or physical machines into one unified pool of resources. Typically they expose an API with which you can specify how many containers you’d like to deploy for a given app and how they should behave under increased load. The API-first nature of these tools allows you to automate deployment from inside your CI pipeline, giving teams the ability to iterate quickly. Being able to manage this kind of complexity in a streamlined manner is one of the major reasons tools like Kubernetes have gained such popularity.
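As a taste of what that API looks like in practice, here is a minimal sketch of a manifest declaring the desired state for one of those microservices. The service name and image are hypothetical, and the Deployment object used here is explained in the Kubernetes Objects section below.

```yaml
# Hypothetical manifest: ask Kubernetes to keep 5 replicas of a
# containerized service running; the cluster converges to this state.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: video-api              # made-up service name
spec:
  replicas: 5                  # "how many containers you'd like to deploy"
  selector:
    matchLabels:
      app: video-api
  template:
    metadata:
      labels:
        app: video-api
    spec:
      containers:
        - name: video-api
          image: registry.example.com/video-api:1.4.2   # placeholder image
# A CI pipeline would typically push this to the cluster with:
#   kubectl apply -f video-api.yaml
```

Behavior under increased load would be layered on with a separate autoscaling object, which we won’t cover here.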

Kubernetes Architecture

To understand Kubernetes’ view of the world, we first need to familiarize ourselves with its cluster architecture. A Kubernetes cluster is a group of physical or virtual machines divided into two high-level components: the control plane and the worker nodes. It’s okay if some of the terminology mentioned below doesn’t make much sense yet.

Control plane — Acts as the brain of the entire cluster: it accepts instructions from users, health-checks all servers, decides how to best schedule workloads, and orchestrates communication between components. Constituents of the control plane include:

  • kube-apiserver — Exposes the Kubernetes API. In other words, this is the gateway into Kubernetes.
  • etcd — A distributed, reliable key-value store used as the backing store for all cluster data.
  • kube-scheduler — Selects a worker node for newly created pods (a process known as scheduling).
  • kube-controller-manager — Runs controller processes such as the Node, Replication, and Endpoints controllers. These will start to make more sense after we discuss k8s objects.
  • cloud-controller-manager — Holds cloud-provider-specific control logic.

Worker nodes — These are the machines responsible for accepting instructions from the control plane and running containerized workloads. Each node has the following sub-components:

  • kubelet — An agent that makes sure all containers in a given pod are running. We’ll get to what that means in a bit.
  • kube-proxy — A network proxy used to implement the concept of a service. We’ll get to that in a bit too.
  • Container runtime — The software responsible for actually running containers. Kubernetes supports Docker, containerd, and rkt, to name a few.

The key takeaway here is that the control plane is the brain, responsible for accepting user instructions and figuring out the best way to execute them, whereas the worker nodes are the machines responsible for carrying out those instructions and running the containerized workloads.
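To make that division of labor concrete, here is a minimal pod manifest (a sketch with made-up names), with comments noting which of the components above acts on each part once it is submitted:

```yaml
# Submitted to the kube-apiserver (e.g. kubectl apply -f hello-pod.yaml);
# the desired state is persisted in etcd.
apiVersion: v1
kind: Pod
metadata:
  name: hello                   # hypothetical pod name
spec:
  # kube-scheduler selects a worker node for this pod;
  # the kubelet on that node then takes over.
  containers:
    - name: hello
      image: nginx:1.21         # the container runtime pulls and runs this image
      ports:
        - containerPort: 80     # kube-proxy routes service traffic to ports like this
```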

Kubernetes Objects

Now that we have some grounding in the Kubernetes architecture, the next milestone in our journey is understanding the Kubernetes object model. Kubernetes has a handful of abstractions that make up the building blocks of any containerized workload.

We’ll go over the types of objects available in Kubernetes that you are most likely to interact with:

  • Pod — The smallest deployable unit of compute in the Kubernetes hierarchy. It can contain one or more tightly coupled containers sharing environment, volumes, and IP space. Users are generally discouraged from managing pods directly; instead, Kubernetes offers higher-level objects (Deployment, StatefulSet, and DaemonSet) to encapsulate that management.
  • Deployment — A high-level object designed to ease the life-cycle management of replicated pods. Users describe a desired state in the deployment object, and the deployment controller changes the actual state to match it. This is the object users interact with the most, and it is best suited for stateless applications (a sketch combining a Deployment with a Service follows the note below).
  • StatefulSet — You can think of it as a specialized deployment best suited for stateful applications like a relational database. It offers ordering and uniqueness guarantees.
  • DaemonSet — You can think of it as a specialized deployment for when you want a pod on every node (or a subset of nodes). Best suited for cluster support services like log aggregation and security agents.
  • Secret & ConfigMap — These objects let users store sensitive information and configuration, respectively. They can then be exposed to selected apps, allowing for more streamlined configuration and secrets management.
  • Service — This object groups a set of pods together and makes them accessible through DNS within the cluster. Service types include NodePort, ClusterIP, and LoadBalancer.
  • Ingress — An Ingress object allows external access to a service in the cluster via an IP address or a URL. Additionally, it can provide SSL termination and load balancing.
  • Namespace — This object is used to logically group resources inside a cluster.

Note: There are other objects like ReplicationController, ReplicaSet, Job, CronJob, etc., that we have deliberately skipped for simplicity’s sake.
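To tie a few of these objects together, here is a minimal sketch of a stateless app: a Deployment keeping two replicas of a container alive, exposed inside the cluster through a ClusterIP Service. The app name and image are placeholders.

```yaml
# Deployment: maintains two replicas of the pod template below.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 2
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: nginx:1.21     # any stateless containerized app works here
          ports:
            - containerPort: 80
---
# Service: groups the pods labeled app=web behind one stable
# in-cluster DNS name ("web") and virtual IP.
apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  type: ClusterIP
  selector:
    app: web
  ports:
    - port: 80
      targetPort: 80
```

Once applied (kubectl apply -f web.yaml), other pods in the same namespace can reach the app at http://web, and scaling it is a one-line change to replicas.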
