Kubernetes
The complexity of modern systems
Critical software systems are large and complex, and their complexity is increasing. Maintaining such monoliths is a challenge. The software engineering community is addressing this challenge by adhering to The Unix Philosophy and moving towards The Cloud.
Traditionally, applications ran on one or more physical servers. The main drawback was that this approach did not scale well, because resources were not utilized to the maximum. Virtual Machines (VMs) solved the problem and also improved security by providing isolated environments. As applications grew more complex, containers emerged. Today we have containerized applications, which abstract and extend the functionality of VMs and make distributed systems much easier to maintain.
Write programs that do one thing and do it well - The Unix Philosophy
Kubernetes Architecture
Kubernetes Objects:
- Cluster is the pool of compute, storage, and network resources.
- Node is a host machine running within the Cluster.
- Namespace is a logical partition of a Cluster.
- Pod is the basic unit of deployment.
- Labels are key-value pairs for identification and service discovery.
- Service identifies a set of Pods using Label selectors.
- Replica Set ensures Pod availability and scalability.
- Deployment manages the Pod lifecycle.
- Ingress exposes HTTP and HTTPS routes from outside the Cluster to Services.
Processes running in Kubernetes
kube-controller-manager
Runs and manages the controller processes.
kube-apiserver
Exposes the Kubernetes API and acts as the front end of the control plane.
kube-scheduler
Watches for newly created Pods with no assigned node, and selects a node for them to run on.
kubelet
An agent that runs on each node, communicates with the Master, and makes sure the containers described in Pod specs are running.
kube-proxy
A network proxy that runs on each node and maintains the network rules that implement Kubernetes Services.
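The kube-scheduler behaviour described above (find Pods with no assigned node, then pick a node for each) can be illustrated with a toy sketch. This is not how the real scheduler scores nodes: the "most free CPU" rule, the `cpu` field, and the function name `schedule` are assumptions made only for illustration.

```python
def schedule(pods, free_cpu):
    """Assign each Pod without a node to the node with the most free CPU.

    Toy placement rule (an assumption): the real kube-scheduler filters and
    scores nodes on many more criteria.
    """
    free = dict(free_cpu)  # copy so the caller's view is not mutated
    assignments = {}
    for pod in pods:
        if pod.get("nodeName"):
            continue  # already bound to a node, nothing to do
        candidates = [node for node, cpu in free.items() if cpu >= pod["cpu"]]
        if not candidates:
            continue  # no node fits; the Pod stays pending
        best = max(candidates, key=lambda node: free[node])
        assignments[pod["name"]] = best
        free[best] -= pod["cpu"]
    return assignments


# Hypothetical Pods and node capacities.
pending = [
    {"name": "a", "cpu": 2},
    {"name": "b", "cpu": 2, "nodeName": "n1"},
    {"name": "c", "cpu": 3},
]
print(schedule(pending, {"n1": 4, "n2": 3}))
```

Pod `b` is skipped because it already has a node; the other two are placed wherever enough CPU remains.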
Kubernetes Internals
A Container is a standard unit of software that packages up code and all its dependencies so the application runs quickly and reliably from one computing environment to another. The container runtime is responsible for starting and managing containers.
Kubernetes is a powerful container orchestration system that can manage the deployment and operation of containerized applications across clusters of servers. In addition to coordinating container workloads, Kubernetes provides the infrastructure and tools necessary to maintain reliable network connectivity between your applications and services.
A Node is a physical or virtual machine. Every cluster must have at least one Master Node, which controls the cluster, and one or more Worker Nodes, which host Pods.
Kubernetes Objects are persistent entities that define the state of the cluster. Each object has a name that is unique for its kind within its namespace, which allows idempotent creation and retrieval. These objects are stored in the etcd database as key-value pairs. Objects can be categorized as Basic Objects, which determine the deployed containerized application's workloads and their associated network and disk resources, and Higher Level Objects, which are built upon the basic objects to provide additional functionality and convenience features for managing the workloads. Higher level objects have a long-running, service-like lifecycle, except Jobs.
Basic Objects: Pod, Service, Volume and Namespace
Higher Level Objects: Replication Controllers, Replica Sets, Deployments, Stateful Sets, Daemon Sets, Jobs and Cron Jobs
A Cluster is a group of interconnected Nodes. The Cluster's state is defined by Kubernetes Objects. The Cluster's desired state includes what applications or other workloads to run, what container images they use, the number of replicas, and what network and disk resources to make available.
Namespaces are a way to divide cluster resources between users by creating multiple virtual clusters in the same physical cluster. They are used in environments with many users spread across multiple teams or projects. Namespaces cannot be nested inside one another, and each Kubernetes resource can only be in one Namespace. Objects in the same Namespace have the same access control policies by default. Labels are used to distinguish resources within the same Namespace. Namespace resources are not themselves in a Namespace, and low-level resources, such as Nodes and PersistentVolumes, are not in any Namespace.
A Pod represents a group of one or more Containers running together and operating closely as a single, monolithic application on a Node in the Cluster. Pods are managed entirely as a unit and share resources such as environment, volumes and IP space. A Pod consists of a main container that serves the workload and, optionally, helper containers that facilitate closely related tasks. For example, a Pod may have one container running the primary application server and a helper container pulling down files to the shared filesystem when changes are detected in an external repository. Higher level objects manage Pods through template definitions.
Generally, if you have multiple containers with a hard dependency on each other, you package them inside a single Pod.
Each individual worker node in the cluster runs two processes: kubelet and kube-proxy.
A Service groups Pods that perform the same function into a single entity. It keeps track of the Pods behind it and routes traffic to their containers for internal and external access. A Service’s IP address remains stable regardless of changes to the Pods it routes to, which makes the Service easy to discover and can simplify container designs. By default, Services are only available at an internally routable IP address; they can be made available outside of the cluster by choosing one of several strategies.
Services use labels to determine what Pods they operate on. If Pods have the correct labels, they are automatically picked up and exposed by our services.
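A minimal sketch of this label matching, assuming the simple equality-based form of a selector (every key-value pair in the selector must appear in the Pod's labels). The Pod names, labels, and helper functions are hypothetical and are not part of any Kubernetes client library.

```python
def matches(selector, labels):
    """Return True when every key/value in the selector appears in the labels."""
    return all(labels.get(key) == value for key, value in selector.items())


def select_pods(selector, pods):
    """Return the names of the Pods whose labels satisfy the selector."""
    return [pod["name"] for pod in pods if matches(selector, pod["labels"])]


# Hypothetical Pods, described only by name and labels.
pods = [
    {"name": "web-1", "labels": {"app": "web", "tier": "frontend"}},
    {"name": "web-2", "labels": {"app": "web", "tier": "frontend"}},
    {"name": "db-1", "labels": {"app": "db", "tier": "backend"}},
]

# A Service selecting on app=web picks up both web Pods.
print(select_pods({"app": "web"}, pods))
```

A new Pod carrying the label `app: web` would be picked up on the next evaluation without any change to the Service itself.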
The Kubernetes API is a resource-based (RESTful) programmatic interface provided via HTTP. It supports retrieving, creating, updating, and deleting primary resources via the standard HTTP verbs (POST, PUT, PATCH, DELETE, GET), includes additional subresources for many objects that allow fine-grained authorization (such as binding a pod to a node), and can accept and serve those resources in different representations for convenience or efficiency. It also supports efficient change notifications on resources via "watches" and consistent lists to allow other components to effectively cache and synchronize the state of resources. It is the medium through which end users, different parts of your cluster, and external components communicate with one another. Most Kubernetes API resource types are Kubernetes Objects, but a smaller number of API resource types are virtual and represent operations rather than objects.
A Controller is a non-terminating loop that regulates the state of a system. It watches the state of the cluster, then makes or requests changes where needed. Each controller tries to move the current cluster state closer to the desired state. There are different types of controllers for specific purposes.
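The control loop idea can be sketched as follows, assuming the only state being regulated is a replica count. The function names, the action tuples, and the `max_iterations` bound are illustrative assumptions; a real controller watches the API server and never terminates.

```python
def reconcile(desired_replicas, running):
    """Return the actions that move the observed state toward the desired state."""
    diff = desired_replicas - len(running)
    if diff > 0:
        return [("create", None)] * diff
    if diff < 0:
        # Too many Pods: delete the surplus, oldest first.
        return [("delete", name) for name in running[:-diff]]
    return []


def control_loop(desired_replicas, running, max_iterations=10):
    """Apply reconcile() repeatedly until no action is left.

    max_iterations only keeps this sketch finite.
    """
    running = list(running)
    created = 0
    for _ in range(max_iterations):
        actions = reconcile(desired_replicas, running)
        if not actions:
            break  # current state equals desired state
        for verb, name in actions:
            if verb == "create":
                created += 1
                running.append(f"pod-new-{created}")
            else:
                running.remove(name)
    return running


print(control_loop(3, ["pod-a"]))
print(control_loop(1, ["pod-a", "pod-b", "pod-c"]))
```

The loop never stores "what to do next"; it recomputes the difference between desired and observed state on every pass, which is what lets it recover from failures it did not anticipate.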
The Kubernetes Control Plane is the collection of processes that manage the cluster, including the Controllers. kube-apiserver, kube-controller-manager and kube-scheduler are the three critical processes that make up the control plane. Nodes that run these processes are called Master Nodes, and they are replicated for availability and redundancy.
A Volume is an abstraction of data, in the form of files and directories, within a Pod. It exists as long as its Pod exists.
Secrets are used to share sensitive information, such as SSH keys and passwords, with other Kubernetes objects within the same namespace.
Kubernetes Object Definition
Every Kubernetes Object definition is a YAML file that contains at least the following items:
- apiVersion: The version of the Kubernetes API that the definition belongs to.
- kind: The Kubernetes object this file represents. For example, a pod or service.
- metadata: This contains the name of the object along with any labels that you may wish to apply to it.
- spec: This contains a specific configuration depending on the kind of object you are creating, such as the container image or the ports on which the container will be accessible.
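As an illustration of these four keys, here is a sketch that builds a Pod definition as a Python dictionary and prints it as JSON, which kubectl also accepts for manifests. The object name `example-pod`, the container name `web`, and the image tag `nginx:1.25` are placeholders chosen for this example, and `missing_keys` is a hypothetical helper.

```python
import json

REQUIRED_KEYS = ("apiVersion", "kind", "metadata", "spec")

# Placeholder Pod definition; names and image are assumptions for illustration.
pod_manifest = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "example-pod", "labels": {"app": "example"}},
    "spec": {
        "containers": [
            {
                "name": "web",
                "image": "nginx:1.25",
                "ports": [{"containerPort": 80}],
            }
        ]
    },
}


def missing_keys(manifest):
    """Return the top-level keys from REQUIRED_KEYS that the manifest lacks."""
    return [key for key in REQUIRED_KEYS if key not in manifest]


print(missing_keys(pod_manifest))
print(json.dumps(pod_manifest, indent=2))
```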
Instead of a spec key, a Secret uses a data or stringData key to hold the required information. The data parameter holds base64-encoded values, which are decoded automatically when the Secret is consumed by a Pod. The stringData parameter holds non-encoded values that are automatically encoded during creation or updates; it is write-only, so it is not output when retrieving Secrets.
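The relationship between the two keys can be sketched with the standard base64 module. The helper names and the Secret name `example-secret` are assumptions for illustration; the encoding step mirrors what happens to stringData values on creation.

```python
import base64


def encode_string_data(string_data):
    """Base64-encode plain-text values, as done for stringData on creation."""
    return {
        key: base64.b64encode(value.encode("utf-8")).decode("ascii")
        for key, value in string_data.items()
    }


def decode_data(data):
    """Decode base64 values from a Secret's data field back to plain text."""
    return {
        key: base64.b64decode(value).decode("utf-8")
        for key, value in data.items()
    }


# Hypothetical Secret definition using the data key.
secret = {
    "apiVersion": "v1",
    "kind": "Secret",
    "metadata": {"name": "example-secret"},
    "data": encode_string_data({"username": "admin"}),
}

print(secret["data"])
print(decode_data(secret["data"]))
```

Base64 is an encoding, not encryption: anyone who can read the Secret can recover the original value.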
Pod Management Controllers
Logically, each controller is a separate process, but to reduce complexity, they are all compiled into a single binary and run in a single process.
- Node Controller: Responsible for noticing and responding when nodes go down.
- Replication Controller: Responsible for maintaining the correct number of pods for every replication controller object in the system.
- Endpoints Controller: Populates the Endpoints object (that is, joins Services & Pods).
- Service Account & Token Controllers: Create default accounts and API access tokens for new namespaces.
Deployments are the most frequently used objects for stateless applications, and they make lifecycle management of replicated Pods easier. A Deployment updates its Pods through rolling updates and can be used to implement canary and blue/green deployments. Deployments can be modified easily by changing the configuration; Kubernetes will then adjust the replica sets, manage transitions between different application versions, and optionally maintain event history and undo capabilities automatically.
Stateful Sets are specialized pod controllers for stateful applications that offer ordering and uniqueness guarantees. They are primarily used for systems that require stable network identifiers, stable persistent storage, and ordering guarantees, such as databases and other data-oriented applications that need access to the same volumes even if rescheduled to a new node.
A Replication Controller is responsible for ensuring that the number of Pods deployed in the cluster matches the number of Pods in its configuration. If a Pod or underlying host fails, the Controller starts new Pods to compensate. If the number of replicas in a Controller’s configuration changes, the Controller either starts up or kills Pods to match the desired number. Replication Controllers can also perform rolling updates to roll over a set of Pods to a new version one by one, minimizing the impact on application availability. Deployments use Replica Sets, the successor to Replication Controllers, as their building block.
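The one-by-one rollover can be sketched as a generator that replaces a single Pod per step, so that the total number of Pods never drops. Representing each Pod by its version string and the function name `rolling_update` are simplifying assumptions.

```python
def rolling_update(pods, new_version):
    """Yield the Pod list after each single-Pod replacement."""
    current = list(pods)
    for index in range(len(current)):
        current[index] = new_version  # replace one old Pod with a new one
        yield list(current)


# Hypothetical set of three Pods, each represented by its version.
for step in rolling_update(["v1", "v1", "v1"], "v2"):
    print(step)
```

At every intermediate step some Pods still serve the old version, which is why the application stays available during the update.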
Replica Sets are an iteration on the Replication Controller design with greater flexibility in how the controller identifies the Pods it is meant to manage. The only thing they do not do is rolling updates.
Daemon Sets are another specialized form of Pod Controller that run a copy of a Pod on each node in the cluster (or a subset, if specified). This is most often useful when deploying pods that help perform maintenance and provide services for the nodes themselves. For instance, collecting and forwarding logs, aggregating metrics, and running services that increase the capabilities of the node itself are popular candidates for daemon sets.
Jobs are useful when containers are expected to exit successfully after some time once they have completed their work.
Cron Jobs build on Jobs by running them on a schedule.
Service Types
Kubernetes Services have 4 types, specified by the type field in the Service configuration file:
ClusterIP is the default type. It grants the Service a stable internal IP that is accessible from anywhere inside the cluster, which means the Service is only visible inside the cluster.
NodePort opens a static port, between 30000 and 32767 by default, on each node’s external networking interface. When a request hits a Node at its Node IP address and the NodePort for your Service, the request is load balanced and routed to the application containers through an internal ClusterIP Service. In short, it makes the Service reachable from outside the cluster on every node's IP address at that static port.
LoadBalancer creates an external load balancer to route to the Service using a cloud provider’s Kubernetes load-balancer integration. The Cloud Controller Manager creates the appropriate resource using your cloud provider’s load balancing product and configures a NodePort and ClusterIP for your Service, to which external requests are routed. In short, it adds a cloud provider load balancer that forwards external traffic to the Service on the cluster's nodes.
Creating a LoadBalancer Service for each Deployment running in the cluster creates a new cloud load balancer for each Service, which can become costly. An Ingress Controller is used to route external requests to multiple Services through a single load balancer.
ExternalName maps a Kubernetes Service to a DNS name. It can be used for accessing external services from Pods using Kubernetes DNS.
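To make the four types concrete, here is a sketch of a helper that builds a Service definition and rejects values that contradict the descriptions above. The field names (`type`, `selector`, `ports`, `nodePort`, `externalName`) follow the Service spec; the helper `build_service`, its parameters, and the example names are assumptions for illustration.

```python
SERVICE_TYPES = ("ClusterIP", "NodePort", "LoadBalancer", "ExternalName")
NODE_PORT_RANGE = range(30000, 32768)  # default NodePort range, inclusive


def build_service(name, service_type="ClusterIP", selector=None,
                  port=80, node_port=None, external_name=None):
    """Build a Service definition as a dictionary, validating type-specific fields."""
    if service_type not in SERVICE_TYPES:
        raise ValueError(f"unknown Service type: {service_type}")
    spec = {"type": service_type}
    if service_type == "ExternalName":
        if not external_name:
            raise ValueError("ExternalName Services need a DNS name")
        spec["externalName"] = external_name
    else:
        port_entry = {"port": port}
        if node_port is not None:
            if service_type == "ClusterIP":
                raise ValueError("ClusterIP Services do not expose a nodePort")
            if node_port not in NODE_PORT_RANGE:
                raise ValueError("nodePort must be between 30000 and 32767")
            port_entry["nodePort"] = node_port
        spec["selector"] = selector or {}
        spec["ports"] = [port_entry]
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name},
        "spec": spec,
    }


# Hypothetical Services.
web = build_service("web", "NodePort", {"app": "web"}, port=80, node_port=30080)
db = build_service("legacy-db", "ExternalName", external_name="db.example.com")
print(web["spec"])
print(db["spec"])
```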
Labels and Annotations
A Label is a semantic tag: a simple key-value pair that can be attached to Kubernetes Objects to mark them as part of a group. Labels can then be selected on when targeting different instances for management or routing. Labels should carry semantic information that is useful for matching a pod against selection criteria; annotations are more free-form and can contain less structured data.
Each of the controller-based objects uses labels to identify the Pods that it should operate on. Services use labels to understand the backend Pods they should route requests to. Each unit can have more than one label, but only one entry for each key. Usually, a “name” key is used as a general purpose identifier, but you can additionally classify objects by other criteria like development stage, public accessibility, application version, etc.
Annotations also allow arbitrary key-value information to be attached to an object, but they are more free-form and can contain less structured data. They are a way of adding rich metadata to an object that is not helpful for selection purposes.
Storage Management
The lifecycle of a Volume is tied to the lifecycle of the Pod, but not to that of a Container. If a container within a Pod dies, the Volume persists and the newly launched container will be able to mount the same Volume and access its data. When a Pod gets restarted or dies, so do its Volumes, although if the Volumes consist of cloud block storage, they will simply be unmounted with data still accessible by future Pods.
To preserve data across Pod restarts and updates, the PersistentVolume (PV) and PersistentVolumeClaim (PVC) objects are used.
A StorageClass defines a type of storage on offer. The types are categorized as "classes" that are set up by the Cluster Administrator. Different "classes" might map to quality-of-service levels, to backup policies, or to arbitrary policies. Kubernetes itself is unopinionated about what classes represent. This concept is sometimes called "profiles" in other storage systems.
A PersistentVolume abstracts the details of how the storage is implemented, be that NFS, iSCSI, or a cloud-provider-specific storage system. It is provisioned manually by the cluster admin or dynamically using Storage Classes. It is a resource in the cluster just like a node is a cluster resource. PersistentVolumes are volume plugins like Volumes, but have a lifecycle independent of any individual Pod.
A PersistentVolumeClaim is a request for storage by a user. It is similar to a Pod: Pods consume Node resources and PersistentVolumeClaims consume PersistentVolume resources. Pods can request specific levels of resources (CPU and memory); PersistentVolumeClaims can request a specific size and access modes (e.g., they can be mounted ReadWriteOnce, ReadOnlyMany or ReadWriteMany). A Pod that references the PVC mounts the bound PV at the required path. The spec for a PVC contains the following items:
- accessModes, which vary by use case. These are:
  - ReadWriteOnce: mounts the volume as read-write by a single node
  - ReadOnlyMany: mounts the volume as read-only by many nodes
  - ReadWriteMany: mounts the volume as read-write by many nodes
- resources: the storage space that you require
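The PVC spec items listed above can be sketched as a small builder that only accepts the three access modes named in the text. The field names (`accessModes`, `resources.requests.storage`, `storageClassName`) follow the PersistentVolumeClaim spec; the helper `build_pvc`, the claim name, and the class name `fast` are assumptions for illustration.

```python
ACCESS_MODES = ("ReadWriteOnce", "ReadOnlyMany", "ReadWriteMany")


def build_pvc(name, storage, access_mode="ReadWriteOnce", storage_class=None):
    """Build a PersistentVolumeClaim definition as a dictionary."""
    if access_mode not in ACCESS_MODES:
        raise ValueError(f"unknown access mode: {access_mode}")
    spec = {
        "accessModes": [access_mode],
        "resources": {"requests": {"storage": storage}},
    }
    if storage_class is not None:
        spec["storageClassName"] = storage_class
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": spec,
    }


# Hypothetical claim for 5Gi of storage from a class named "fast".
claim = build_pvc("data-claim", "5Gi", storage_class="fast")
print(claim["spec"])
```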
Security And Policies
Security in Kubernetes is a big challenge, as the system is composed of many smaller standalone components. Kubernetes provides many security mechanisms. Namespaces can be used to scope authentication, authorization and access control. Resource Quotas can be set to avoid resource cannibalization. And Network Policies can be set up for proper segmentation and traffic control.
Networking
All the components of Kubernetes are interconnected. For the entire system to function efficiently, reliably and securely, networking plays a critical role. The basic requirements of a Kubernetes network are:
- all containers can communicate with all other containers without NAT
- all nodes can communicate with all containers (and vice-versa) without NAT
- the IP that a container sees itself as is the same IP that others see it as
Network Address Translation (NAT) is a method of remapping an IP address space into another by modifying network address information in the IP header of packets while they are in transit across a traffic routing device.
Monitoring
Kubernetes includes some internal monitoring tools by default. These resources belong to its resource metrics pipeline, which ensures that the cluster runs as expected. The cAdvisor component collects network usage, memory, and CPU statistics from individual containers and nodes and passes that information to kubelet; kubelet in turn exposes that information via a REST API. The Metrics Server gets this information from the API and then passes it to the kube-aggregator for formatting.