What is Kubernetes?


Kubernetes has become a central topic in the current cloud native revolution. Have you ever used Docker? Docker is an open platform for developing, shipping and running applications by packaging software into containers. Docker helps to deliver applications fast by separating the application from its underlying infrastructure: you package once and deploy anywhere. As long as the Docker engine is running, whatever the underlying operating system, your application will run seamlessly. This has lots of benefits when it comes to replicating your development, testing and production environments. Some of the key benefits that Docker provides are listed below.

  • Faster deployment
  • Isolation : Docker runs a process so that it thinks and behaves as if it were the only process running on the computer. 
  • Portability : Docker lets people create and share software through Docker images. With Docker, you don’t have to worry about whether your computer can run the software in a Docker image; a Docker container can always run it. 
  • Snapshotting : Docker can snapshot your environment by saving the state of a container as an image, tagging it and recreating it later.
  • Security sandbox : Docker isolates applications using Linux container primitives.
  • Limit resource usage
  • Simplified dependency management : Docker allows you to package an application with all of its dependencies into a standardized unit for software development.
  • Sharing : Images can be shared through registries such as Docker Hub.
It will be better to have some prior knowledge about Docker before proceeding with this article. [1], [2] and [3] are some blog posts that I have written about the fundamental concepts of Docker, Docker’s architecture and a simple ‘Hello-World’ example with Docker.

In this blog post, I’ll discuss some basic concepts of Kubernetes, its architecture and the process of running Kubernetes locally via Docker.


What is Kubernetes?


Containers are everywhere, so we need some kind of distributed process manager.




If you are building an application, you have lots of components as parts of that application. Today we have very complex software applications, and they need to be deployed and updated at a rapid pace. With lots of containers, it becomes hard work to manage them and keep them running in production. Just think of a web application. You will have some kind of application server, a database server, a web server, a reverse proxy, load balancers and many other things. From a microservices perspective, this web application might be further decomposed into many loosely coupled services as well. Does that mean that everything needs to be a container? Even a simple web application requires a number of containers, and each of these containers will require replicas for scale-out and high availability. Managing an infrastructure with this many containers by hand quickly becomes a mess. It’s not just the number of containers that is challenging: services also need to be deployed across various regions. Hence we need some kind of orchestration system to manage the containers. Well, that’s where Kubernetes comes in.

Kubernetes is an open source orchestration system developed for managing containerized applications across multiple hosts in a clustered environment. Orchestration means the automated arrangement, coordination and management of complex computer systems, middleware and services. In brief, Kubernetes handles the execution of a defined workflow.

The name Kubernetes originates from Greek, meaning “helmsman” or “pilot”. “K8s” is an abbreviation derived by replacing the eight letters “ubernete” with “8”. The Kubernetes project was started by Google in 2014 and is now supported by many companies such as Microsoft, Red Hat, IBM and Docker. At the time of writing there are almost 400 contributors from across the industry, over 8000 stars and 12000+ commits on GitHub.


Google has been using containerization technology for over ten years. They have containers for everything. Google has an internal system called “Borg” that runs Google’s entire infrastructure, managing vast server clusters across the globe. Google kept it secret; until not long ago it was never mentioned, even by code name. Anyway, Google went a step further and started an open source container management system called Kubernetes, inspired by its experience with Borg. As they describe it, Kubernetes is even better than Borg. Since Kubernetes is free and available to all of us, it is awesome! With Kubernetes, Google shares its container expertise.


The significance of Kubernetes lies in the fact that it provides declarative primitives for the ‘desired state’. It achieves this state by self-healing, auto-restarting, scheduling across hosts and replicating. Say you tell Kubernetes that you need exactly three servers to be up, no more and no less. Kubernetes then always makes sure that three servers are up: if a server goes down, it brings one back up, and if an additional server spins up, it kills it. That is exactly what Kubernetes does. Thus Kubernetes actively manages the containers to ensure that the state of the cluster continually matches the user’s intentions.

With Kubernetes, we get the following benefits.
  • Scale our applications on the fly
  • Roll out new features
  • Optimize use of hardware by using only the required resources

How is Kubernetes related to Docker?


Kubernetes supports Docker as a container runtime. The purpose of using Kubernetes is to manage a cluster of Linux containers as a single system. It can be used to manage and run Docker containers across multiple hosts, and it provides co-location of containers, service discovery and replication control as well. 

Kubernetes treats groups of Docker containers as single units with their own addressable IP across hosts, and scales them as you wish while taking care of the details for you. It provides a means of scaling and balancing Docker containers across multiple Docker hosts, and adds a higher level API to define how containers are logically grouped and load balanced.

Kubernetes architecture


This is a representation of the Kubernetes Master-minion architecture. 

[Image: Kubernetes key concepts]

Let’s try to understand the main components of Kubernetes.

  1. Pod

Pods are the smallest deployable units that can be created, scheduled and managed. In Kubernetes, containers run inside pods. A pod is a collection of containers that belong to an application: closely related containers are grouped together in a pod, and a pod can contain one or more of them. They are deployed and scaled as a single application, managed and scheduled as a unit, and they share an environment of resources. A pod can contain a main container accompanied by helper containers that facilitate related tasks.
The containers inside a pod live and die together. So if you have some processes that require the same host or need to interact with each other very tightly, a pod is a way to group those processes together. A pod definition is a JSON or YAML file that specifies which containers Kubernetes should launch.

From Docker’s perspective, a pod is a co-located group of Docker containers that share an IP address and storage volumes.


  • Group of containers : Reuse across environments 
  • Settings in a template : Repeatable, manageable
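As a sketch, a minimal pod definition in YAML might look like the following; the names, image and port are illustrative:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: web-pod          # illustrative pod name
  labels:
    app: web             # labels let services and controllers find this pod
spec:
  containers:
    - name: web
      image: nginx       # main container of the pod
      ports:
        - containerPort: 80
```

Saved as a file, this can be handed to Kubernetes with kubectl create -f <file-name>.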
2. Service
The key abstraction in Kubernetes is the service. A service is a set of pods that work together, exposed under a single, stable name and network address. With or without an external load balancer, a service provides load balancing to the underlying pods; Kubernetes provides load balancing for all components of the system. Services provide an interface to a group of containers so that users do not have to worry about anything beyond a single access location.

  • Stable address : Clients shielded from implementation details 
  • Decoupled from Controllers : Independently control each, build for resiliency
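To make this concrete, here is a sketch of a service definition; the name, selector and ports are illustrative:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: web-service
spec:
  selector:
    app: web          # route traffic to any pod carrying this label
  ports:
    - port: 80        # port exposed by the service
      targetPort: 80  # port the containers listen on
```

Clients talk to the stable service address; the set of pods behind it can change freely.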
3. Replication controllers

Replication controllers manage the lifecycle of pods by ensuring that a specific number of pod replicas are running. Say you want three replicas of a pod to run: the replication controller’s definition states how many pods need to be running at any given time, and Kubernetes always ensures that. If one of them fails, a new one is started; Kubernetes’ job is to keep three replicas running at all times. When you need more or fewer, you update the definition and the cluster is adjusted accordingly.

  • Keeps Pods running : Restarts Pods, desired state 
  • Gives direct control of number of Pods : Fine grained control for scaling
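A replication controller definition might be sketched like this, assuming illustrative names and a desired state of three replicas:

```yaml
apiVersion: v1
kind: ReplicationController
metadata:
  name: web-controller
spec:
  replicas: 3          # desired number of pod replicas
  selector:
    app: web           # pods managed by this controller
  template:            # pod template used to create new replicas
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: nginx
```

Changing replicas and updating the definition is all it takes to scale up or down.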
4. Label

A label is a simple key-value pair attached to Kubernetes objects. Components like pods, services and replication controllers find and refer to one another using labels.
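For example, a pod can carry labels in its metadata, and a service or replication controller selects matching pods with a selector; the key-value pairs below are illustrative:

```yaml
# In a pod definition:
metadata:
  labels:
    app: web
    tier: frontend

# In a service or replication controller definition:
selector:
  app: web
```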

5. Master

The controlling unit in a Kubernetes cluster is called the master server. It is the main management contact point for administrators. The master server runs the various services that manage the cluster’s workload and direct communications across the system. Some components specific to the master server are as follows.

  • API server : This is the main management point for the entire cluster. The API server lets a user configure Kubernetes workloads and organizational units, and makes sure that the etcd store and the service details of deployed containers are in agreement. It has a RESTful interface so that many different tools and libraries can communicate with it. 
  • etcd : Kubernetes needs a globally available configuration store. etcd is a lightweight, distributed key-value store that can span multiple nodes; it stores configuration data to be used by each of the nodes in the cluster. 
  • Controller manager server : This handles the replication processes. The controller manager server watches for changes, and when a change is seen it reads the new information and carries out the replication process that fulfills the desired state by scaling the application group up or down. 
  • Scheduler server : The scheduler assigns workloads to specific nodes in the cluster. It also tracks resource utilization on each host to make sure that workloads are not scheduled in excess of the available resources. For that, the scheduler must know the total resources available on each server as well as the resources already allocated to the workloads assigned to it. 
6. kubelet

Each minion runs the services needed to host containers, and minions are managed by the master. In addition to Docker, kubelet is another key service installed there. The kubelet reads container manifests (YAML files that describe a pod) and ensures that the containers defined in them are started and kept running.

7. Minion
A minion is a node: a Docker host running the kubelet service, which receives orders from the master and manages the containers running on that host. In a node or a minion, you can have a pod running, and within the pod a container running. 

8. kubectl
This is the command line tool that controls the Kubernetes cluster manager. We can use different commands with kubectl, for example:
kubectl get pods
kubectl create -f <file-name>
kubectl update
kubectl delete
kubectl resize --replicas=3 replicationcontrollers <name>



Kubernetes Pros


  • Manage related Docker containers as a unit : You can specify what Docker containers need to run and all their dependencies in one file. 
  • Container communication across hosts : Unlike Docker, which runs on a single host, Kubernetes spans multiple hosts. 
  • Availability and scalability through automated deployment and monitoring of pods and their replicas across hosts. 

Kubernetes Cons


  • Lifecycle of applications : Kubernetes does not build or package applications; you still have to build, deploy and manage the application lifecycle yourself. 
  • No multi-tenancy : a cluster is not designed to be shared securely by multiple untrusted tenants. 
  • On-premise : setting up and operating your own cluster on-premise takes considerable effort. 

References


