An Introduction to Kubernetes

Introduction

Kubernetes is a powerful open-source system, initially developed by Google, for managing containerized applications in a clustered environment. It aims to provide better ways of managing related, distributed components and services across varied infrastructure.

In this guide, we’ll discuss some of Kubernetes’ basic concepts. We will talk about the architecture of the system, the problems it solves, and the model that it uses to handle containerized deployments and scaling.

What is Kubernetes?

Kubernetes, at its basic level, is a system for running and coordinating containerized applications across a cluster of machines. It is a platform designed to completely manage the life cycle of containerized applications and services using methods that provide predictability, scalability, and high availability.

As a Kubernetes user, you can define how your applications should run and the ways they should be able to interact with other applications or the outside world. You can scale your services up or down, perform graceful rolling updates, and switch traffic between different versions of your applications to test features or rollback problematic deployments. Kubernetes provides interfaces and composable platform primitives that allow you to define and manage your applications with high degrees of flexibility, power, and reliability.

Kubernetes Architecture

To understand how Kubernetes is able to provide these capabilities, it is helpful to get a sense of how it is designed and organized at a high level. Kubernetes can be visualized as a system built in layers, with each higher layer abstracting the complexity found in the lower levels.

At its base, Kubernetes brings together individual physical or virtual machines into a cluster using a shared network to communicate between each server. This cluster is the physical platform where all Kubernetes components, capabilities, and workloads are configured.

The machines in the cluster are each given a role within the Kubernetes ecosystem. One server (or a small group in highly available deployments) functions as the master server. This server acts as a gateway and brain for the cluster by exposing an API for users and clients, health checking other servers, deciding how best to split up and assign work (known as “scheduling”), and orchestrating communication between other components. The master server acts as the primary point of contact with the cluster and is responsible for most of the centralized logic Kubernetes provides.

The other machines in the cluster are designated as nodes: servers responsible for accepting and running workloads using local and external resources. To help with isolation, management, and flexibility, Kubernetes runs applications and services in containers, so each node needs to be equipped with a container runtime (like Docker or rkt). The node receives work instructions from the master server and creates or destroys containers accordingly, adjusting networking rules to route and forward traffic appropriately.

As mentioned above, the applications and services themselves are run on the cluster within containers. The underlying components make sure that the desired state of the applications matches the actual state of the cluster. Users interact with the cluster by communicating with the main API server either directly or with clients and libraries. To start up an application or service, a declarative plan is submitted in JSON or YAML defining what to create and how it should be managed. The master server then takes the plan and figures out how to run it on the infrastructure by examining the requirements and the current state of the system. This group of user-defined applications running according to a specified plan represents Kubernetes’ final layer.
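To make that concrete, here is a rough sketch of what such a plan might look like, submitted with kubectl apply (the name hello-web and the nginx image are just placeholders for illustration, not anything defined elsewhere in this guide):

cat <<EOF | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hello-web              # hypothetical application name
spec:
  replicas: 3                  # desired state: three identical pods
  selector:
    matchLabels:
      app: hello-web
  template:
    metadata:
      labels:
        app: hello-web
    spec:
      containers:
      - name: web
        image: nginx:1.25      # any container image would do
        ports:
        - containerPort: 80
EOF

The master compares this desired state against what is actually running and schedules work onto the nodes until the two match.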

Master Server Components

As we described above, the master server acts as the primary control plane for Kubernetes clusters. It serves as the main contact point for administrators and users, and also provides many cluster-wide systems for the relatively unsophisticated worker nodes. Overall, the components on the master server work together to accept user requests, determine the best ways to schedule workload containers, authenticate clients and nodes, adjust cluster-wide networking, and manage scaling and health checking responsibilities.

These components can be installed on a single machine or distributed across multiple servers. We will take a look at each of the individual components associated with master servers in this section.

etcd

One of the fundamental components that Kubernetes needs to function is a globally available configuration store. The etcd project, developed by the team at CoreOS, is a lightweight, distributed key-value store that can be configured to span across multiple nodes.

Kubernetes uses etcd to store configuration data that can be accessed by each of the nodes in the cluster. This can be used for service discovery and can help components configure or reconfigure themselves according to up-to-date information. It also helps maintain cluster state with features like leader election and distributed locking. By providing a simple HTTP/JSON API, the interface for setting or retrieving values is very straightforward.
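For example, assuming an etcd instance listening on its default client port and speaking the older v2 keys API, setting and reading a value is just:

curl http://127.0.0.1:2379/v2/keys/message -XPUT -d value="Hello world"
curl http://127.0.0.1:2379/v2/keys/message

This is only a sketch of the interface; Kubernetes itself manages its keys through the API server rather than having you write to etcd directly.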

Like most other components in the control plane, etcd can be configured on a single master server or, in production scenarios, distributed among a number of machines. The only requirement is that it be network accessible to each of the Kubernetes machines.

kube-apiserver

One of the most important master services is an API server. This is the main management point of the entire cluster as it allows a user to configure Kubernetes’ workloads and organizational units. It is also responsible for making sure that the etcd store and the service details of deployed containers are in agreement. It acts as the bridge between various components to maintain cluster health and disseminate information and commands.

The API server implements a RESTful interface, which means that many different tools and libraries can readily communicate with it. A client called kubectl is available as a default method of interacting with the Kubernetes cluster from a local computer.
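As a quick illustration of that RESTful interface, the following two commands fetch roughly the same information, one through kubectl's friendly output and one straight against the REST path (a sketch; exact paths can vary by cluster version):

kubectl get pods --namespace default
kubectl get --raw /api/v1/namespaces/default/pods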

kube-controller-manager

The controller manager is a general service that has many responsibilities. Primarily, it manages different controllers that regulate the state of the cluster, manage workload life cycles, and perform routine tasks. For instance, a replication controller ensures that the number of replicas (identical copies) defined for a pod matches the number currently deployed on the cluster. The details of these operations are written to etcd, where the controller manager watches for changes through the API server.

When a change is seen, the controller reads the new information and implements the procedure that fulfills the desired state. This can involve scaling an application up or down, adjusting endpoints, etc.

kube-scheduler

The process that actually assigns workloads to specific nodes in the cluster is the scheduler. This service reads in a workload’s operating requirements, analyzes the current infrastructure environment, and places the work on an acceptable node or nodes.

The scheduler is responsible for tracking available capacity on each host to make sure that workloads are not scheduled in excess of the available resources. The scheduler must know the total capacity as well as the resources already allocated to existing workloads on each server.

cloud-controller-manager

Kubernetes can be deployed in many different environments and can interact with various infrastructure providers to understand and manage the state of resources in the cluster. While Kubernetes works with generic representations of resources like attachable storage and load balancers, it needs a way to map these to the actual resources provided by non-homogeneous cloud providers.

Cloud controller managers act as the glue that allows Kubernetes to interact with providers with different capabilities, features, and APIs while maintaining relatively generic constructs internally. This allows Kubernetes to update its state information according to information gathered from the cloud provider, adjust cloud resources as changes are needed in the system, and create and use additional cloud services to satisfy the work requirements submitted to the cluster.

Node Server Components

In Kubernetes, servers that perform work by running containers are known as nodes. Node servers have a few requirements that are necessary for communicating with master components, configuring the container networking, and running the actual workloads assigned to them.

A Container Runtime

The first component that each node must have is a container runtime. Typically, this requirement is satisfied by installing and running Docker, but alternatives like rkt and runc are also available.

The container runtime is responsible for starting and managing containers, applications encapsulated in a relatively isolated but lightweight operating environment. Each unit of work on the cluster is, at its basic level, implemented as one or more containers that must be deployed. The container runtime on each node is the component that finally runs the containers defined in the workloads submitted to the cluster.

kubelet

The main contact point for each node with the cluster group is a small service called kubelet. This service is responsible for relaying information to and from the control plane services, as well as interacting with the etcd store to read configuration details or write new values.

The kubelet service communicates with the master components to authenticate to the cluster and receive commands and work. Work is received in the form of a manifest which defines the workload and the operating parameters. The kubelet process then assumes responsibility for maintaining the state of the work on the node server. It controls the container runtime to launch or destroy containers as needed.

kube-proxy

To manage individual host subnetting and make services available to other components, a small proxy service called kube-proxy is run on each node server. This process forwards requests to the correct containers, can do primitive load balancing, and is generally responsible for making sure the networking environment is predictable and accessible, but isolated where appropriate.

Kubernetes Objects and Workloads

While containers are the underlying mechanism used to deploy applications, Kubernetes uses additional layers of abstraction over the container interface to provide scaling, resiliency, and life cycle management features. Instead of managing containers directly, users define and interact with instances composed of various primitives provided by the Kubernetes object model. We will go over the different types of objects that can be used to define these workloads below.

Pods

A pod is the most basic unit that Kubernetes deals with. Containers themselves are not assigned to hosts. Instead, one or more tightly coupled containers are encapsulated in an object called a pod.

A pod generally represents one or more containers that should be controlled as a single application. Pods consist of containers that operate closely together, share a life cycle, and should always be scheduled on the same node. They are managed entirely as a unit and share their environment, volumes, and IP space. In spite of their containerized implementation, you should generally think of pods as a single, monolithic application to best conceptualize how the cluster will manage the pod’s resources and scheduling.

Usually, pods consist of a main container that satisfies the general purpose of the workload and optionally some helper containers that facilitate closely related tasks. These are programs that benefit from being run and managed in their own containers, but are tightly tied to the main application. For example, a pod may have one container running the primary application server and a helper container pulling down files to the shared filesystem when changes are detected in an external repository. Horizontal scaling is generally discouraged on the pod level because there are other higher level objects more suited for the task.

Generally, users should not manage pods themselves, because they do not provide some of the features typically needed in applications (like sophisticated life cycle management and scaling). Instead, users are encouraged to work with higher level objects that use pods or pod templates as base components but implement additional functionality.

Replication Controllers and Replication Sets

Often, when working with Kubernetes, rather than working with single pods, you will instead be managing groups of identical, replicated pods. These are created from pod templates and can be horizontally scaled by controllers known as replication controllers and replication sets.

A replication controller is an object that defines a pod template and control parameters to scale identical replicas of a pod horizontally by increasing or decreasing the number of running copies. This is an easy way to distribute load and increase availability natively within Kubernetes. The replication controller knows how to create new pods as needed because a template that closely resembles a pod definition is embedded within the replication controller configuration.

The replication controller is responsible for ensuring that the number of pods deployed in the cluster matches the number of pods in its configuration. If a pod or underlying host fails, the controller will start new pods to compensate. If the number of replicas in a controller’s configuration changes, the controller either starts up or kills containers to match the desired number. Replication controllers can also perform rolling updates to roll over a set of pods to a new version one by one, minimizing the impact on application availability.

Replication sets are an iteration on the replication controller design with greater flexibility in how the controller identifies the pods it is meant to manage. Replication sets are beginning to replace replication controllers because of their greater replica selection capabilities, but they are not able to do rolling updates to cycle backends to a new version like replication controllers can. Instead, replication sets are meant to be used inside of additional, higher level units that provide that functionality.

Like pods, both replication controllers and replication sets are rarely the units you will work with directly. While they build on the pod design to add horizontal scaling and reliability guarantees, they lack some of the fine grained life cycle management capabilities found in more complex objects.

Deployments

Deployments are one of the most common workloads to directly create and manage. Deployments use replication sets as a building block, adding flexible life cycle management functionality to the mix.

While deployments built with replication sets may appear to duplicate the functionality offered by replication controllers, deployments solve many of the pain points that existed in the implementation of rolling updates. When updating applications with replication controllers, users are required to submit a plan for a new replication controller that would replace the current controller. When using replication controllers, tasks like tracking history, recovering from network failures during the update, and rolling back bad changes are either difficult or left as the user’s responsibility.

Deployments are a high level object designed to ease the life cycle management of replicated pods. Deployments can be modified easily by changing the configuration and Kubernetes will adjust the replica sets, manage transitions between different application versions, and optionally maintain event history and undo capabilities automatically. Because of these features, deployments will likely be the type of Kubernetes object you work with most frequently.
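In day-to-day use, that life cycle management surfaces as a handful of kubectl commands. A hedged sketch, assuming a deployment named hello-web like the hypothetical one earlier:

kubectl set image deployment/hello-web web=nginx:1.26   # roll out a new version
kubectl rollout status deployment/hello-web             # watch the transition between versions
kubectl rollout history deployment/hello-web            # inspect the recorded event history
kubectl rollout undo deployment/hello-web               # roll back a bad change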

Stateful Sets

Stateful sets are specialized pod controllers that offer ordering and uniqueness guarantees. Primarily, these are used to have more fine-grained control when you have special requirements related to deployment ordering, persistent data, or stable networking. For instance, stateful sets are often associated with data-oriented applications, like databases, which need access to the same volumes even if rescheduled to a new node.

Stateful sets provide a stable networking identifier by creating a unique, number-based name for each pod that will persist even if the pod needs to be moved to another node. Likewise, persistent storage volumes can be transferred with a pod when rescheduling is necessary. The volumes persist even after the pod has been deleted to prevent accidental data loss.

When deploying or adjusting scale, stateful sets perform operations according to the numbered identifier in their name. This gives greater predictability and control over the order of execution, which can be useful in some cases.
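A minimal sketch of what that looks like (the name db, the postgres image, and the headless Service it assumes are all just examples, not part of this guide):

cat <<EOF | kubectl apply -f -
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: db                   # hypothetical name
spec:
  serviceName: db            # assumes a headless Service named db exists for stable DNS names
  replicas: 3                # pods will be created, in order, as db-0, db-1, db-2
  selector:
    matchLabels:
      app: db
  template:
    metadata:
      labels:
        app: db
    spec:
      containers:
      - name: db
        image: postgres:16   # stand-in for any stateful workload
EOF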

Daemon Sets

Daemon sets are another specialized form of pod controller that run a copy of a pod on each node in the cluster (or a subset, if specified). This is most often useful when deploying pods that help perform maintenance and provide services for the nodes themselves.

For instance, collecting and forwarding logs, aggregating metrics, and running services that increase the capabilities of the node itself are popular candidates for daemon sets. Because daemon sets often provide fundamental services and are needed throughout the fleet, they can bypass pod scheduling restrictions that prevent other controllers from assigning pods to certain hosts. As an example, because of its unique responsibilities, the master server is frequently configured to be unavailable for normal pod scheduling, but daemon sets have the ability to override the restriction on a pod-by-pod basis to make sure essential services are running.

Jobs and Cron Jobs

The workloads we’ve described so far have all assumed a long-running, service-like life cycle. Kubernetes uses a workload called jobs to provide a more task-based workflow where the running containers are expected to exit successfully after some time once they have completed their work. Jobs are useful if you need to perform one-off or batch processing instead of running a continuous service.

Building on jobs are cron jobs. Like the conventional cron daemons on Linux and Unix-like systems that execute scripts on a schedule, cron jobs in Kubernetes provide an interface to run jobs with a scheduling component. Cron jobs can be used to schedule a job to execute in the future or on a regular, recurring basis. Kubernetes cron jobs are basically a reimplementation of the classic cron behavior, using the cluster as a platform instead of a single operating system.
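Both can be created directly from kubectl. A sketch (the names, image, and schedule are arbitrary examples):

kubectl create job one-off-report --image=busybox -- /bin/sh -c "echo crunching numbers"
kubectl create cronjob nightly-report --image=busybox --schedule="0 2 * * *" -- /bin/sh -c "echo crunching numbers"

The first runs once to completion; the second runs on the given cron schedule.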

Other Kubernetes Components

Beyond the workloads you can run on a cluster, Kubernetes provides a number of other abstractions that help you manage your applications, control networking, and enable persistence. We will discuss a few of the more common examples here.

Services

So far, we have been using the term “service” in the conventional, Unix-like sense: to denote long-running processes, often network connected, capable of responding to requests. However, in Kubernetes, a service is a component that acts as a basic internal load balancer and ambassador for pods. A service groups together logical collections of pods that perform the same function to present them as a single entity.

This allows you to deploy a service that can keep track of and route to all of the backend containers of a particular type. Internal consumers only need to know about the stable endpoint provided by the service. Meanwhile, the service abstraction allows you to scale out or replace the backend work units as necessary. A service’s IP address remains stable regardless of changes to the pods it routes to. By deploying a service, you easily gain discoverability and can simplify your container designs.

Any time you need to provide access to one or more pods to another application or to external consumers, you should configure a service. For instance, if you have a set of pods running web servers that should be accessible from the internet, a service will provide the necessary abstraction. Likewise, if your web servers need to store and retrieve data, you would want to configure an internal service to give them access to your database pods.

Although services, by default, are only available using an internally routable IP address, they can be made available outside of the cluster by choosing one of several strategies. The NodePort configuration works by opening a static port on each node’s external networking interface. Traffic to the external port will be routed automatically to the appropriate pods using an internal cluster IP service.

Alternatively, the LoadBalancer service type creates an external load balancer to route to the service using a cloud provider’s Kubernetes load balancer integration. The cloud controller manager will create the appropriate resource and configure it using the internal service addresses.
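Both strategies can be chosen when the service is created. For instance, assuming a deployment named hello-web and a cloud provider with load balancer support:

kubectl expose deployment hello-web --port=80 --type=NodePort --name=hello-web-nodeport
kubectl expose deployment hello-web --port=80 --type=LoadBalancer --name=hello-web-lb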

Volumes and Persistent Volumes

Reliably sharing data and guaranteeing its availability between container restarts is a challenge in many containerized environments. Container runtimes often provide some mechanism to attach storage to a container that persists beyond the lifetime of the container, but implementations typically lack flexibility.

To address this, Kubernetes uses its own volumes abstraction that allows data to be shared by all containers within a pod and remain available until the pod is terminated. This means that tightly coupled pods can easily share files without complex external mechanisms. Container failures within the pod will not affect access to the shared files. Once the pod is terminated, the shared volume is destroyed, so it is not a good solution for truly persistent data.

Persistent volumes are a mechanism for abstracting more robust storage that is not tied to the pod life cycle. Instead, they allow administrators to configure storage resources for the cluster that users can request and claim for the pods they are running. Once a pod is done with a persistent volume, the volume’s reclamation policy determines whether the volume is kept around until manually deleted or removed along with the data immediately. Persistent data can be used to guard against node-based failures and to allocate greater amounts of storage than is available locally.
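As a sketch of the workflow, a user claims storage with a PersistentVolumeClaim and Kubernetes binds it to a suitable persistent volume behind the scenes (the claim name and size here are arbitrary):

cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-claim           # arbitrary example name
spec:
  accessModes:
  - ReadWriteOnce            # mountable read-write by a single node
  resources:
    requests:
      storage: 5Gi           # ask the cluster for 5 GiB of storage
EOF

A pod then references the claim by name in its volumes section to actually mount the storage.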

Labels and Annotations

A Kubernetes organizational abstraction related to, but outside of the other concepts, is labeling. A label in Kubernetes is a semantic tag that can be attached to Kubernetes objects to mark them as a part of a group. These can then be selected for when targeting different instances for management or routing. For instance, each of the controller-based objects use labels to identify the pods that they should operate on. Services use labels to understand the backend pods they should route requests to.

Labels are given as simple key-value pairs. Each unit can have more than one label, but each unit can only have one entry for each key. Usually, a “name” key is used as a general purpose identifier, but you can additionally classify objects by other criteria like development stage, public accessibility, application version, etc.

Annotations are a similar mechanism that allows you to attach arbitrary key-value information to an object. While labels should be used for semantic information useful to match a pod with selection criteria, annotations are more free-form and can contain less structured data. In general, annotations are a way of adding rich metadata to an object that is not helpful for selection purposes.
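On the command line, the distinction looks roughly like this (web-pod is a hypothetical pod name):

kubectl label pod web-pod environment=staging version=v2     # selectable, semantic metadata
kubectl get pods -l environment=staging                      # select objects by label
kubectl annotate pod web-pod build-info="commit abc123"      # free-form metadata, not used for selection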

Conclusion

Kubernetes is an exciting project that allows users to run scalable, highly available containerized workloads on a highly abstracted platform. While Kubernetes’ architecture and set of internal components can at first seem daunting, their power, flexibility, and robust feature set are unparalleled in the open-source world. By understanding how the basic building blocks fit together, you can begin to design systems that fully leverage the capabilities of the platform to run and manage your workloads at scale.

What is Kubernetes | Kubernetes Tutorial For Beginners

This video on "What is Kubernetes | Kubernetes Tutorial For Beginners" will give you an introduction to one of the most popular DevOps tools on the market - Kubernetes - and its importance in today's IT processes. This tutorial is ideal for beginners who want to get started with Kubernetes and DevOps.

The following topics are covered in this training session:

  1. Need for Kubernetes
  2. What is Kubernetes and What it's not
  3. How does Kubernetes work?
  4. Use-Case: Kubernetes @ Pokemon Go
  5. Hands-on: Deployment with Kubernetes

Building and Managing Kubernetes with Kubernetes

Building and Managing Kubernetes with Kubernetes: Kubernetes as a declarative and portable system can be used to do many things in different ways.

At eBay we built a fleet management system based on k8s. Everything (server, subnet, OS, package and state) is declarative and can be modeled as CRDs in k8s, or referred to by a commit id in git from the objects. By running various controllers on top of these CRD objects, we use k8s to manage k8s, and the entire eBay data center.

  • Our system provisions hosts the same way k8s creates and manages pods.
  • We build k8s clusters with Salt. Each host has a set of states defined in its salt CRD object. Controllers pull states from git based on commit ids to apply.
  • We build both schedulers and deployment transactions to manage the k8s clusters for both config deployments and upgrades.

This declarative, highly scalable, auto-healing, and cloud native system is what we think can unify eBay’s fleet.



Kubernetes Fundamentals - Learn Kubernetes from the Beginning

Kubernetes is about orchestrating containerized apps. Docker is great for your first few containers. As soon as you need to run on multiple machines and need to scale up/down and distribute the load and so on, you need an orchestrator - you need Kubernetes.

This is the first part of a series of articles on Kubernetes, because this topic is BIG!

  • Part I - From the beginning, Part I, Basics, Deployment and Minikube: In this part, we cover why Kubernetes, some history and some basic concepts like deploying, Nodes, Pods.
  • Part II - Introducing Services and Labeling: In this part, we deepen our knowledge of Pods and Nodes. We also introduce Services and labeling using labels to query our artifacts.
  • Part III - Scaling: Here we cover how to scale our app
  • Part IV - Auto scaling: In this part we look at how to set up auto-scaling so we can handle sudden large increases of incoming requests

Kubernetes

So what do we know about Kubernetes?

It's an open-source system for automating deployment, scaling, and management of containerized applications

Let's start with the name. It's Greek for helmsman, the person who steers the ship. Which is why the logo looks like this, a steering wheel on a boat:

It's also called K8s: take the K and the s in Kubernetes and drop the 8 characters in between. Now you can impress your friends that you know why it's referred to as K8s.

Here is some more Jeopardy knowledge on its origin. Kubernetes was born out of Google's internal systems called Borg and Omega. It was open sourced in 2014 and later donated to the CNCF, the Cloud Native Computing Foundation. It's written in Go/Golang.

If we see past all this trivia knowledge, it was built by Google as a response to their own experience handling a ton of containers. It's also Open Source and battle-tested to handle really large systems, like planet-scale large systems.

So the sales pitch is:

Run billions of containers a week, Kubernetes can scale without increasing your ops team

Sounds amazing, right? Billions of containers, because we are all Google-sized. No? :) Well, even if you have something like 10-100 containers, it's for you.

In this part I hope to cover the following:

  • Why Kubernetes and Orchestration in General
  • Hello world: Minikube basics, talking through Minikube, simple deploy example
  • Clusters, Nodes and basic commands
  • Deployments, what they are and how to deploy an app
  • Pods and Nodes, explaining the concepts and troubleshooting

Part I - From the beginning, Part I, Basics, Deployment and Minikube

Why Orchestration

Well, it all started with containers. Containers gave us the ability to create repeatable environments so dev, staging, and prod all looked and functioned the same way. We got predictability and they were also lightweight as they drew resources from the host operating system. Such a great breakthrough for Developers and Ops, but the Container API is really only good for managing a few containers at a time. Larger systems might consist of 100s or 1000+ containers and need to be managed as well so we can do things like scheduling, load balancing, distribution and more.

At this point, we need orchestration: the ability for a system to handle all these container instances. This is where Kubernetes comes in.

Getting started

Ok ok, let's say I buy into all of this, how do I get started?

Impatient, eh? Sure, let's start to do something practical with Minikube.

Ok, sounds good I'm a coder, I like practical stuff. What is Minikube?

Minikube is a tool that lets us run Kubernetes locally

Oh, sweet, millions of containers on my little machine?

Well, no, let's start with a few and learn Kubernetes basics while at it.

Installation

To install Minikube let's go to this installation page

It's just a few short steps, which means we install:

  • a Hypervisor
  • Kubectl (Kube control tool)
  • Minikube

Run

Get that thing up and running by typing:

minikube start

It should look something like this:

You can also ensure that kubectl has been correctly installed and is running:

kubectl version

Should give you something like this in response:

Ok, now we are ready to learn Kubernetes.

Learning kubectl and basic concepts

Let's learn Kubernetes by getting to know kubectl, a command line program that lets us interact with our Cluster and deploy and manage applications on said Cluster.

The word Cluster just means a group of similar things, but in the context of Kubernetes it means a Master and multiple worker machines called Nodes. Nodes were historically called Minions, but not so anymore.

The master decides what will run on the Nodes, which includes things like scheduled workloads or containerized apps. Which brings us to our next command:

kubectl get nodes

This should give us a result like this:

This tells us what Nodes we have available to do work.

Next up let's try to run our first app on Kubernetes with the run command like so:

kubectl run kubernetes-first-app --image=gcr.io/google-samples/kubernetes-bootcamp:v1 --port=8080

This should give us a response like so:

Next up let's check that everything is up and running with the command:

kubectl get deployments

This shows the following in the terminal:

In putting our app on the cluster by invoking the run command, Kubernetes performed a few things behind the scenes. It:

  • searched for a suitable node where an instance of the application could be run, there was only one node so it got chosen
  • scheduled the application to run on that Node
  • configured the cluster to reschedule the instance on a new Node when needed

Next up we are going to introduce the concept Pod, so what is a Pod?

A Pod is the smallest deployable unit and consists of one or many containers, for example, Docker containers. That's all we are going to say about Pods at the moment but if you really really want to know more have a read here

The reason for mentioning Pods at this point is that our container and app are placed inside of a Pod. Furthermore, Pods run in a private, isolated network that, although visible to other Pods and services, cannot be accessed from outside the cluster. Which means we can't reach our app with, say, a curl command.

We can change that though. There is more than one way to expose our application to the outside world; for now, however, we will use a proxy.

Now open up a 2nd terminal window and type:

kubectl proxy

This will expose the Kubernetes API through a local proxy that we can query with HTTP requests. The result should look like:

Instead of typing kubectl version we can now type curl http://localhost:8001/version and get the same results:

The API Server inside of Kubernetes has created an endpoint for each pod by its pod name. So the next step is to find out the pod name:

kubectl get pods

This will list all the pods you have, it should just be one pod at this point and look something like this:

Then you can just save that down to a variable like so:
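One way to do that, assuming the pod we just created is the only one in the default namespace, is:

export POD_NAME=$(kubectl get pods -o jsonpath='{.items[0].metadata.name}')
echo Name of the Pod: $POD_NAME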

Lastly, we can now do an HTTP call to learn more about our pod:

curl http://localhost:8001/api/v1/namespaces/default/pods/$POD_NAME

This will give us a long JSON response back (I trimmed it a bit but it goes on and on...)

Maybe that's not super interesting for us as app developers. We want to know how our app is doing. Best way to know that is looking at the logs. Let's do that with this command:

kubectl logs $POD_NAME

As you can see below, we now get logs from our app:

Now that we know the Pod's name we can do all sorts of things like checking its environment variables or even stepping inside the container and looking at the content.

kubectl exec $POD_NAME env

This yields the following result:

Now let's step inside the container:

kubectl exec -ti $POD_NAME bash

We are inside! This means we can even see what the source code looks like:

cat server.js

Inside of our container, we can now reach the running app by typing:

curl http://localhost:8080

Summary Part I

This is where we will stop for now.
What did we actually learn?

  • Kubernetes, its origin and what it is
  • Orchestration and why you will soon need it
  • Concepts like Master, Nodes and Pods
  • Minikube, kubectl and how to deploy an image onto our Cluster

Part II - Introducing Services and Labeling

In this part we will cover the following:

  • Deepen our knowledge on Pods and Nodes
  • Introduce Services and labeling
  • Perform an exercise involving setting labels on Pods and using labels to query our artifacts

Concepts revisited

When we create a Deployment on Kubernetes, that Deployment creates Pods with containers inside them. So Pods are tied to Nodes and will continue to exist until terminated or deleted. Let's try to educate ourselves a bit more on Pods, Nodes and let's also introduce a new topic Services.

Pods

Pods are the atomic unit on the Kubernetes platform, i.e. the smallest possible deployable unit

We've stated the above before but it's worth mentioning again.

What else is there to know?

A Pod is an abstraction that represents a group of one or more containers, for example, Docker or rkt, and some shared resources for those containers. Those resources include:

  • Shared storage, as Volumes
  • Networking, as a unique cluster IP address
  • Information about how to run each container, such as the container image version or specific ports to use

A Pod can have more than one container. If it does contain more than one container it is so the other containers can support the primary application.
Typical examples of helper applications are data pullers, data pushers, and proxies. You can read more on that use case here

The containers in a Pod share an IP address and port space and are:

  • Always co-located
  • Co-scheduled

Let me show you an image to make it easier to visualize:

As we can see above, a Pod can have a lot of different artifacts in it that are able to communicate and support the app in some way.

Nodes

A Pod always runs on a Node

So a Node is the Pod's parent?

Yes.

A Node is a worker machine and may be either a virtual or a physical machine, depending on the cluster

Each Node is managed by the Master. A Node can have multiple pods.

So it's a one to many relationship

The Kubernetes master automatically handles scheduling the pods across the Nodes in the cluster

Every Kubernetes Node runs at least:

  • a kubelet, responsible for the Pod spec and for talking to the CRI (container runtime interface)

  • a kube-proxy, the main interface for network communication between nodes

  • a container runtime (like Docker or rkt), responsible for pulling the container image from a registry, unpacking the container, and running the application

Ok so a Node contains a Kubelet and container runtime and one to many Pods. I think I got it.

Let's show an image to make this info stick, cause it's quite important that we know what goes on, at least at a high level:

Services

Pods are mortal, they can die. Pods, in fact, have a lifecycle.

When a worker node dies, the Pods running on the Node are also lost.

What happens to our apps? :(

You might think they and their data are lost, but not so. The whole point of Kubernetes is to not let that happen. We normally deploy something like a ReplicaSet.

A ReplicaSet, what do you mean?

A ReplicaSet is a high-level artifact that can drive the cluster back to desired state via the creation of new Pods to keep your application running.

Ok so if a Pod goes down the ReplicaSet just creates a new Pod/s in its place?

Yes, exactly that. If you focus on defining a desired state the rest is up to Kubernetes.

Phew sounds really great then.

This concept of desired state is a very important one. You need to specify how many containers you want of each kind, at all times.

Oh so 4 database containers, 3 services etc?

Yes exactly.

So you don't have to care about the details; just tell Kubernetes what state you want and it does the rest. If something goes down, Kubernetes ensures it comes back up again to the desired state.

Each Pod in a Kubernetes cluster has a unique IP address, even Pods on the same Node, so there needs to be a way of automatically reconciling changes among Pods so that your applications continue to function.

Ok?

Yea, think like this. If a Pod containing your app goes down and another Pod is created in its place, running your app. Users should still be able to use your app after that.

Ok I got it. Makes me think...

The motivation for a Service

You should never refer to a Pod by its IP address; just think what happens when a Pod goes down and comes back up again, but this time with a different IP. It is for that reason a Service exists.

A Service in Kubernetes is an abstraction which defines a logical set of Pods and a policy by which to access them.

Makes me think of routers and subnets

Yea I guess you can say there is a resemblance in there somewhere.

Services enable a loose coupling between dependent Pods and are defined using a YAML or JSON file, just like all Kubernetes objects.

That's handy, just JSON and YAML :)

Services and Labels

The set of Pods targeted by a Service is usually determined by a LabelSelector.

Although each Pod has a unique IP address, those IPs are not exposed outside the cluster without a Service. We can expose them through a proxy though as we showed in part I.

Wait, go back a second here, you said LabelSelector. I wasn't quite following?

Remember how we couldn't refer to Pods by IP, cause Pods might go down and a new Pod could come back in its place?

Yes

Well, labels are the answer to how Services and Pods are able to communicate. This is what we mean by loose coupling. By applying labels like, for example, frontend, backend, release and so on to Pods, we are able to refer to Pods by their logical name rather than their specifics, i.e. their IP address.

Oh I get it, so it's a high-level domain language

Mm, kind of.

Services and Traffic

Services allow your applications to receive traffic.

Services can be exposed in different ways by specifying a type in the ServiceSpec, the service specification.

  • ClusterIP (default) - Exposes the Service on an internal IP in the cluster. This type makes the Service only reachable from within the cluster.
  • NodePort - Exposes the Service on the same port of each selected Node in the cluster using NAT. Makes a Service accessible from outside the cluster using <NodeIP>:<NodePort>. Superset of ClusterIP.
  • LoadBalancer - Creates an external load balancer in the current cloud (if supported) and assigns a fixed, external IP to the Service. Superset of NodePort.
  • ExternalName - Exposes the Service using an arbitrary name (specified by externalName in the spec) by returning a CNAME record with the name. No proxy is used. This type requires v1.7 or higher of kube-dns.

Ok, I think I get it. I should make sure I'm talking to a Service externally instead of to specific Pods, and depending on what I expose the Service as, that leads to different behavior?

Yea that's correct.

You said something about labels though, how do we create and apply them to Pods?

Yea lets talk about that next.

Labels

As we just mentioned, Services are the abstraction that allows pods to die and replicate in Kubernetes without impacting your application.

Now, Services match a set of Pods using labels and selectors, which allows us to operate on Pods as a group.

Labels are key/value pairs attached to objects and can be used in any number of ways:

  • Designate objects for development, test, and production
  • Embed version tags
  • Classify an object using tags

Labels can be attached to objects at creation time or later on. They can be modified at any time.

Lab - Fun with Labels and kubectl

It's a good idea to have read the first part of this series where we create a deployment. If you haven't you need to first create a deployment like so:

kubectl run kubernetes-first-app --image=gcr.io/google-samples/kubernetes-bootcamp:v1 --port=8080

Now we should be good to go.

Ok. I know you are probably all tired from all theory by now.

I bet you are just itching to learn more hands on Kubernetes with kubectl.

Well, the time for that has come :). We will do two things:

  1. Create a Service and learn how we can expose our app using said Service
  2. Learn about Labeling and how we can improve our querying game by having appropriate labels on our artifacts.

Let's create a new service.

We will get acquainted with the expose command.

Let's check for existing pods:

kubectl get pods

Next let's see what services we have:

kubectl get services

Next let's create a Service like so:

kubectl expose deployment/kubernetes-first-app --type="NodePort" --port 8080

As you can see above, we are targeting one of our deployments, kubernetes-first-app, referring to it as [type]/[name] with the type here being deployment.

We expose it as a service of type NodePort and, finally, we choose to expose it at port 8080.

Now run kubectl get services again and see the results:

As you can see we now have two services in use, our basic kubernetes service and our newly created kubernetes-first-app.

Next up we need to grab the port of our service and assign that to a variable:

export NODE_PORT=$(kubectl get services/kubernetes-first-app -o go-template='{{(index .spec.ports 0).nodePort}}')
echo NODE_PORT=$NODE_PORT

We now have our port stored in the environment variable NODE_PORT and we are ready to start communicating with our service like so:

curl $(minikube ip):$NODE_PORT

Which leads to the following output:

Creating and applying Labels

When we created our deployment and our Pod, it was automatically assigned a label.

By typing

kubectl describe deployment

we can see the name of said label.

Next up we can query the pods by that same label

kubectl get pods -l run=kubernetes-first-app

Above we are using -l to query for a specific label, run=kubernetes-first-app. This gives us the following result:

You can do a similar query to your services:

kubectl get services -l run=kubernetes-first-app

That just shows that you can query on different levels, for specific Pods or Services that have Pods with that label.

Next up we will look at how to change the label

First let's get the name of the pod, like so:

POD_NAME=kubernetes-first-app-669789f4f8-6glpx

Above I'm just assigning what my Pod is called to the variable POD_NAME. Check what your Pod is called with kubectl get pods.

Then we can add/apply the new label like so:

kubectl label pod $POD_NAME app=v1

Verify that the new label has been set, like so:

kubectl describe pod

or

kubectl describe pods $POD_NAME

As you can see from the result our new label app=v1 has been appended to existing labels.

Now we can query like so:

kubectl get pods -l app=v1

That's pretty much how labeling works, how to get available labels, apply them and use them in a query. Ensure to give them a descriptive name like an app version, a certain environment or a name like frontend or backend, something that makes sense to your situation.

Clean up

Ok, so we created a service. We should learn how to clean up after ourselves. Run the following command to remove our service:

kubectl delete service -l run=kubernetes-first-app

Verify the service is no longer there with:

kubectl get services

also, ensure our exposed IP and port can no longer be reached:

curl $(minikube ip):$NODE_PORT

Just because the service is gone doesn't mean the app is gone. The app should still be reachable on:

kubectl exec -ti $POD_NAME curl localhost:8080

Summary Part II

So what did we learn? We learned a bit more on Pods and Nodes. Furthermore, we learned that we shouldn't speak directly to Pods but rather use a high-level abstraction such as Services. Services use labels as a way to define a domain language and apply those to different Pods.

Ok, so we understand a bit more on Kubernetes and how different concepts relate. We mentioned something called desired state a number of times but we didn't go into detail on how to set such a state. That's our next part in this series where we will cover how to set the desired state and how Kubernetes maintains it, so stay tuned.

Part III - Scaling

This third part aims to show how you scale your application. We can easily set the number of Replicas we want of a certain application and let Kubernetes figure out how to do that. This is us defining a so-called desired state.

When traffic increases, we will need to scale the application to keep up with user demand. We've talked about deployments and services, now lets talk scaling.

What does scaling mean in the context of Kubernetes?

We get more Pods. More Pods that are scheduled to nodes.

Now it's time to talk about desired state again, that we mentioned in previous parts.

This is where we relinquish control to Kubernetes. All we need to do is tell Kubernetes how many Pods we want and Kubernetes does the rest.

So we tell Kubernetes about the number of Pods we want, what does that mean? What does Kubernetes do for us?

It means we get multiple instances of our application. It also means traffic is being distributed to all of our Pods, ie. load balancing.

Furthermore, Kubernetes, or more specifically, services within Kubernetes will monitor which Pods are available and send traffic to those Pods.

Scaling demo Lab

If you haven't followed the first two parts I do recommend you go back and have a read. What you need for the following to work is at least a deployment. So if you haven't created one, here is how:

kubectl run kubernetes-first-app --image=gcr.io/google-samples/kubernetes-bootcamp:v1 --port=8080

Let's have a look at our deployments:

kubectl get deployments

Let's look closer at the response we get:

We have three pieces of information that are important to us. First, we have the READY column, in which we should read the value as CURRENT STATE/DESIRED STATE. Next up is the UP-TO-DATE column, which shows the number of replicas that were updated to match the desired state.
Lastly, we have the AVAILABLE column that shows how many replicas we have available to do work.

Let's scale

Now, let's do some scaling. For that we will use the scale command like so:

kubectl scale deployments/kubernetes-first-app --replicas=4 

As we can see above, the number of replicas was increased to 4 and Kubernetes is thereby ready to load balance any incoming requests.

Let's have a look at our Pods next:
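kubectl get pods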

When we asked for 4 replicas we got 4 Pods.

We can see that this scaling operation took place by using the describe command, like so:

kubectl describe deployments/kubernetes-first-app

In the above image, we are given quite a lot of information on our Replicas for example, but there is some other information in there that we will explain later on.

Does it load balance?

The whole point of the scaling was so that we could balance the load of incoming requests. That means that the same Pod shouldn't handle all the requests; different Pods should be hit.
We can easily try this out, now that we have scaled our app to contain 4 replicas of itself.

So far we have used the describe command to describe the deployment, but we can also use it to describe our service and find its IP and port. Once we have the IP and port we can then hit it with different HTTP requests.

kubectl describe services/kubernetes-first-app

Especially look at the NodePort and the Endpoints. NodePort is the port value that we want to hit with an HTTP request.

Now we will actually invoke the cURL command and ensure that it hits a different Pod each time, thereby proving our load balancing is working. Let's do the following:

NODE_PORT=30450

Next up the cURL call:

curl $(minikube ip):$NODE_PORT
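Or, to fire off the call four times in one go (a small shell loop, not strictly required):

for i in 1 2 3 4; do curl $(minikube ip):$NODE_PORT; done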

As you can see above we are doing the call 4 times. Judging by the output and the name of the instance we see that we are hitting a different Pod for each request. Thereby we see that the load balancing is working.

Scaling down

So far we have scaled up. We managed to go from one Pod to 4 Pods thanks to the scale command. We can use the same command to scale down, like so:

kubectl scale deployments/kubernetes-first-app --replicas=2 

Now, if we are really quick with the next command, we can see how the Pods are being removed as Kubernetes adjusts to the desired state.

2 out of 4 Pods are saying Terminating as only 2 Pods are needed to maintain the new desired state.

Running our command again we see that only 2 Pods remain and thereby our new desired state has been reached:

We can also look at our deployment to see that our scale instruction has been parsed correctly:

Self-healing

Self-healing is Kubernetes' way of ensuring that the desired state is maintained. Pods don't heal themselves, because Pods can and do die. What happens is that a new Pod appears in its place, thanks to Kubernetes.

So how do we test this?

Glad you asked, we can delete a Pod and see what happens. So how do we do that? We use the delete command. We need to know the name of our Pod though so we need to call get pods for that. So let's start with that:

kubectl get pods

Then let's pick one of our two Pods, kubernetes-first-app-669789f4f8-6glpx, and assign it to a variable:

POD_NAME=kubernetes-first-app-669789f4f8-6glpx

Now remove it:

kubectl delete pods $POD_NAME

Let's be quick about it and check our Pod status with get pods. It should say Terminating like so:

Wait some time and then echo out our variable $POD_NAME followed by get pods. That should give you a result similar to the below.

So what does the above image tell us? It tells us that the Pod we deleted is truly deleted but it also tells us that the desired state of two replicas has been achieved by spinning up a new Pod. What we are seeing is *self-healing* at work.

Different ways to scale

Ok, we looked at a way to scale by explicitly saying how many replicas we want of a certain deployment. Sometimes, however, we might want a different way to scale, namely auto-scaling. Auto-scaling is about you not having to set the exact number of replicas you want but rather relying on Kubernetes to create the number of replicas it thinks it needs. So how would Kubernetes know that? Well, it can look at more than one thing but a common metric is CPU utilization. So let's say you have a booking site and suddenly someone releases Bruce Springsteen tickets; you are likely to want to rely on auto-scaling, because the next day, when the tickets are all sold out, you want the number of Pods to go back to normal, and you wouldn't want to do this manually.

Auto-scaling is a topic I plan to cover more in detail in a future article so if you are really curious how that is done I recommend you have a look here

Summary Part III

Ok. So we did it. We managed to scale an app by creating replicas of it. It wasn't so hard to accomplish. We showed how we only needed to provide Kubernetes with a desired state and it would do its utmost to preserve said state, also called *self-healing*. Furthermore, we mentioned that there was another way to scale, namely auto-scaling, but decided to leave that topic for another article. Hopefully, you are now more in awe of how amazing Kubernetes is and how easy it is to scale your app.

Part IV - Auto scaling

In this article, we will cover the following:

  • Why auto scaling: we will discuss different scenarios in which it makes sense to rely on auto scaling over defining the desired state statically as we have done so far
  • How: let's talk about Horizontal Pod Autoscaling, the concept/feature that allows us to scale in an elastic way
  • Lab - let's scale: we will look at how to actually set this up with kubectl and simulate a ton of incoming requests. We will then inspect the results and see that Kubernetes acts the way we think

Why

So in our last part, we talked about desired state. That's an OK strategy until something unforeseen happens and suddenly you get a great influx of traffic. This is likely to happen to businesses such as e-commerce around a big sale or a ticket vendor when you release tickets to a popular event.

Events like these are an anomaly which forces you to quickly scale up. The other side of the coin though is that at some point you need to scale down or you suddenly have overcapacity you might need to pay for. What you really want is for the scaling to act in an elastic way so it scales up when you need it to and scales down when there is less traffic.

How

Horizontal auto-scaling, what does it mean?

It's a concept in Kubernetes that can scale the number of Pods we need. It can do so on a replication controller, deployment or replica set. It usually looks at CPU utilization but can be made to look at other things by using something called custom metrics support, so it's customizable.

It consists of two parts: a resource and a controller. The controller checks utilization, or whatever metric you decided, to ensure that the number of replicas matches your specification. If need be it spins up more Pods or removes them. The default is checking every 15 seconds but you can change that by looking at a flag called --horizontal-pod-autoscaler-sync-period.

The underlying algorithm that decides the number of replicas looks like this:

desiredReplicas = ceil[currentReplicas * ( currentMetricValue / desiredMetricValue )]
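To make that concrete with some made-up numbers: if one replica is running, the measured CPU value is 200m and the target is 100m, then desiredReplicas = ceil[1 * (200 / 100)] = 2. If those 2 replicas later average only 40m against the same 100m target, desiredReplicas = ceil[2 * (40 / 100)] = 1, and the extra Pod is removed.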

Lab - let's scale

Ok, the first thing we need to do is to scale our deployment to use something other than desired state.

We have two things we need to specify when we do autoscaling:

  • min/max, we define a minimum and maximum in terms of how many Pods we want
  • CPU, in this version we set a certain CPU utilization percentage. When it goes above that it scales out as needed. Think of this one as an IF clause: if the CPU value is greater than the threshold, try to scale

Set up

Before we can attempt our scaling experiment we need to make sure we have the correct add-ons enabled. You can easily see what add-ons you have enabled by typing:

minikube addons list

If it looks like the above we are all good. Why am I saying that? Well, what we need to be able to auto-scale is that the heapster and metrics-server add-ons are enabled.

Heapster enables Container Cluster Monitoring and Performance Analysis.

The metrics server provides metrics via the Resource Metrics API. The Horizontal Pod Autoscaler uses this API to collect metrics.

We can easily enable them both with the following commands (we will need to for auto-scaling to show correct data):

minikube addons enable heapster

and

minikube addons enable metrics-server

We need to do one more thing, namely to enable Custom metrics, which we do by starting minikube with such a flag like so:

minikube start --extra-config kubelet.EnableCustomMetrics=true

Ok, now we are good to go.

Running the experiment

We need to do the following to run our experiment

  • Create a deployment
  • Apply autoscaling
  • Bombard the deployment with incoming requests
  • Watch the auto scaling how it changes

Create a deployment

kubectl run php-apache --image=k8s.gcr.io/hpa-example --requests=cpu=200m --expose --port=80

Above we are creating a deployment called php-apache and exposing it as a service on port 80. We can see that we are using the image k8s.gcr.io/hpa-example

It should tell us the following:

service/php-apache created
deployment.apps/php-apache created

Autoscaling

Next up we will use the command autoscale. We will use it like so:

kubectl autoscale deployment php-apache --cpu-percent=50 --min=1 --max=10

It should say something like:

horizontalpodautoscaler.autoscaling/php-apache autoscaled

Above we are applying auto-scaling to the deployment php-apache, and as you can see we are applying both min/max and CPU-based auto scaling, which means we give a rule for how the auto scaling should happen:

If the CPU load is >= 50%, create new Pods, up to a maximum of 10. If the load is low, go back gradually to one Pod

Bombard with requests

Next step is to send a ton of requests against our deployment and see our auto-scaling doing its work. So how do we do that?

First off let's check the current status of our horizontal pod auto-scaler or hpa for short by typing:

kubectl get hpa

This should give us something like this:

The above shows us two pieces of information. The first is the TARGETS column which shows our CPU utilization, actual usage/trigger value. The next bit of interest is the column REPLICAS that shows us the number of copies, which is 1 at the moment.

For our next trick, open up a separate terminal tab. What we need to do is set things up so we can send a ton of requests.

Next up we create a container using this command:

kubectl run -i --tty load-generator --image=busybox /bin/sh

This should take us to a prompt within the container. This is followed by:

while true; do wget -q -O- http://php-apache.default.svc.cluster.local; done

The command above should result in something looking like this.

This will just go on and on until you hit CTRL+C, but leave it be for now.

This throws a ton of requests at the service in a while true loop.

I thought while true loops were bad?

They are, but we are only going to run it for a minute so that the auto scaling can happen. Yes, the CPU fan will be loud, but don't worry :)

Let this go on for a minute or so, then enter the following command into the first terminal tab (not the one running the requests), like so:

kubectl get hpa

It should now show something like this:

As you can see from the above, the TARGETS column looks different and now says 339%/50%, which is the current CPU load versus the trigger value, and REPLICAS is 7, which means we have gone from 1 to 7 replicas. So as you can see, we have been bombarding it pretty hard.

Now go to the second terminal and hit CTRL+C or you will have a situation like this:

It will actually take a few minutes for Kubernetes to cool off and get the values back to normal. A first look at the Pods situation shows us the following:

kubectl get pods

As you can see we have 7 Pods up and running still but let's wait a minute or two and it should look like this:

Ok, now we are back to normal.

Summary Part IV

We did some great stuff in this article. We managed to set up auto-scaling and bombard it with requests, without frying our CPU, hopefully ;)

We also managed to learn some new Kubernetes commands while at it and got to see auto-scaling at work giving us new Pods based on our specification.