Introduction

As a practicing data scientist, I see a trend of folding the traditional routine of data processing, model training, and inference into an integrated pipeline with Continuous Integration/Continuous Deployment (CI/CD), a concept borrowed from DevOps. From my perspective, there are at least two primary reasons. One is that modeling is moving out of the prototyping phase toward large-scale adoption in production, either as an add-on or as an application. The other is the rising need to bring the whole modeling experience to the cloud, with better orchestration of model development.

Driven by this need, I started my upskilling journey with tools that fit this transformation of modeling, such as Kubernetes. It is popular right now and has earned a reputation for easy scaling, portability, and extensibility. This blog lays out the components that it offers. Yet when I started my research and got my hands on it, I felt overwhelmed by all the concepts and "official guides", since I am not a trained software developer. I worked through the material bit by bit and started to get a fuller picture, even though I don't have a background in DevOps. My notes and understanding are all in this tutorial, and I hope it can be an easy yet rewarding start for those who also need to upskill in Kubernetes.

Kubernetes in a nutshell

Kubernetes defines itself as a production-grade, open-source platform that orchestrates the execution of application containers within and across computer clusters. Simply put, Kubernetes is a manager for a group of computers assembled for you to run applications on. It is the command center that executes the tasks you assign to it: scheduling applications, regular maintenance, scaling up capacity, and rolling out updates.
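To make the "command center" idea a bit more concrete, here is a toy Python sketch of how scaling can be thought of: you state how many copies of an application you want, and a loop brings the running set in line with that number. The names `reconcile`, `running`, and `desired` are made up for illustration. This is not the Kubernetes API, only a simplified picture of the idea.

```python
def reconcile(running, desired, app="web"):
    """Return a list of instance names that matches the desired count.

    Toy model only (hypothetical names); real Kubernetes controllers
    do far more than this.
    """
    running = list(running)          # work on a copy, leave the input untouched
    next_id = len(running)
    # Scale up: start instances until we reach the desired count.
    while len(running) < desired:
        running.append(f"{app}-{next_id}")
        next_id += 1
    # Scale down: stop the most recently started instances first.
    while len(running) > desired:
        running.pop()
    return running


running = reconcile([], desired=3)       # ask for three copies
print(running)
running = reconcile(running, desired=1)  # later, ask for only one
print(running)
```

The point of the sketch is that you only declare the target number; the manager works out which instances to start or stop.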

Cluster Diagram from Kubernetes

The most basic components of a Kubernetes cluster are the master and the nodes. The master is the manager, the center of the cluster. A node is a virtual machine or a physical computer. Each node has a Kubelet that manages it and communicates with the master. Since a node is just a machine, it also needs a container tool such as Docker to actually run our applications.
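The relationship between these pieces can be sketched as a small Python toy. Everything here is an assumption for illustration: the class names `Master` and `Node`, the `capacity` number, and the "pick the node with the most free slots" rule are mine, not how Kubernetes actually schedules. It only shows who talks to whom.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A machine (VM or physical computer) in the cluster."""
    name: str
    capacity: int                               # hypothetical: max containers it can hold
    containers: list = field(default_factory=list)

    def kubelet_report(self):
        """Stand-in for the Kubelet: report this node's status to the master."""
        return {"node": self.name, "free": self.capacity - len(self.containers)}

    def run_container(self, image):
        """Stand-in for the container tool (e.g. Docker) starting an application."""
        self.containers.append(image)


class Master:
    """The manager: decides which node runs each application."""

    def __init__(self, nodes):
        self.nodes = nodes

    def schedule(self, image):
        # Collect a status report from every node, then pick the one
        # with the most free slots (a made-up rule for this sketch).
        reports = [n.kubelet_report() for n in self.nodes]
        best = max(reports, key=lambda r: r["free"])
        if best["free"] <= 0:
            raise RuntimeError("no capacity left in the cluster")
        node = next(n for n in self.nodes if n.name == best["node"])
        node.run_container(image)
        return node.name


master = Master([Node("node-a", capacity=2), Node("node-b", capacity=1)])
placements = [master.schedule("my-model:v1") for _ in range(3)]
print(placements)
```

In this toy, we never talk to a node directly. We hand the application to the master, and the master uses the reports from each node to decide where it runs.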

How to Deploy Kubernetes to Your GCP Cloud <Step-by-step Tutorial>