Kubernetes Primer

Cheat Sheet

Key Concepts

Hardware Concepts

| Concept | Description |
| --- | --- |
| Node | A physical or virtual machine running Kubernetes, capable of hosting Pods. |
| Cluster | A set of worker machines, called nodes, that run containerized applications and are managed by the Kubernetes control plane. |
| Persistent Volume | A piece of storage in the cluster that has been provisioned by an administrator or dynamically provisioned using StorageClasses. It is a cluster resource, just as a node is. |

Software Concepts

| Concept | Description |
| --- | --- |
| Pod | The smallest deployable unit, encapsulating one or more containers. |
| Service | An abstraction that exposes a set of Pods as a single network service with a stable address. |
| Deployment | Manages the desired state for Pods and ReplicaSets. Supports updates, rollbacks, and scaling. |
| ReplicaSet | Ensures that a specified number of replicas of a Pod are running at all times. |
| Namespace | Logical partitioning of a Kubernetes cluster, used to isolate resources. |
| ConfigMap | Allows you to decouple environment-specific configuration from your container images. |
| Secret | Stores sensitive information, such as passwords or API keys. Values are base64-encoded, which is an encoding and not encryption. |
| Volume | Represents a storage location, either on the host or a remote storage solution. |
| Ingress | Manages external access to services within a cluster, typically HTTP. |
| StatefulSet | Manages the deployment and scaling of a set of Pods, with persistent storage and unique network identifiers. |
| DaemonSet | Ensures all or some Nodes run a copy of a Pod, typically used for node-level system services. |
| Horizontal Pod Autoscaler | Automatically scales the number of Pods in a deployment, replica set, or replication controller based on observed CPU or memory utilization. |

Key Components

Control Plane Components

| Component | Description |
| --- | --- |
| kube-apiserver | Exposes the Kubernetes API. |
| etcd | Consistent and highly available key-value store for all cluster data. |
| kube-scheduler | Schedules pods to run on nodes. |
| kube-controller-manager | Runs controllers for nodes, replicas, endpoints, etc. |
| cloud-controller-manager | Runs controllers specific to the underlying cloud provider. |

Node Components

| Component | Description |
| --- | --- |
| kubelet | Ensures that containers are running in a pod. |
| kube-proxy | Manages network rules and enables communication to and from your pods. |
| Container Runtime | Software for running containers (e.g., containerd, CRI-O). Since Kubernetes 1.24, Docker Engine requires the cri-dockerd adapter. |

Concepts Elaborated


Data Persistence

Volume

What it is:

Features:

Why you'd use it:


Persistent Volume (PV)

What it is:

Features:

Why you'd use it:


Persistent Volume Claim (PVC)

What it is:

Features:

Why you'd use it:


Volume vs PVC vs PV

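  1. Volume: Storage attached to a Pod. For ephemeral types such as emptyDir, the data is deleted when the Pod is removed.
  2. PV: A long-lived storage resource in the cluster that exists independently of any Pod.
  3. PVC: A request for storage that Kubernetes binds to a matching PV. Pods mount the claim rather than the PV directly.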
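
As a rough sketch of how the three fit together, the manifests below define a claim and a Pod that mounts it. The names (my-claim, my-pod), the 1Gi size, and the /data mount path are illustrative assumptions, and the claim only binds if the cluster has a default StorageClass or a matching PV.

```yaml
# Hypothetical claim: asks the cluster for 1Gi of storage.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-claim
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
---
# Pod that mounts the claim; the bound PV supplies the actual storage.
apiVersion: v1
kind: Pod
metadata:
  name: my-pod
spec:
  containers:
  - name: nginx-container
    image: nginx
    volumeMounts:
    - name: data
      mountPath: /data        # assumed mount path
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: my-claim
```

Deleting and recreating the Pod leaves the claim, and the data behind it, in place.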

Real-life Analogy:


Namespace

Why Use Namespaces?

  1. Isolation: Namespaces provide a scope for names, ensuring that resources are isolated from each other.
  2. Organization: By grouping related resources together, namespaces simplify management and access control.
  3. Resource Allocation: You can set resource limits on a per-namespace basis, ensuring fair usage across different teams or applications.
  4. Access Control: You can set different permissions for different namespaces, allowing precise control over who can do what within each environment.

Creating a Namespace

apiVersion: v1
kind: Namespace
metadata:
  name: my-namespace

Alternatively, create it directly with kubectl:

kubectl create namespace my-namespace

Adding Resources to a Namespace

Once a namespace is created, you can create, view, and manage resources within that namespace. When creating a resource, you can specify the namespace in the YAML file:

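apiVersion: v1
kind: Pod
metadata:
  name: my-pod
  namespace: my-namespace   # places the Pod in this namespace
spec:
  containers:
  - name: nginx-container   # placeholder container so the manifest is complete
    image: nginx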

Filtering by Namespace

kubectl get pods --namespace=my-namespace

Default Namespaces

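Kubernetes comes with a few default namespaces:

  • default: Used for objects created without an explicit namespace.
  • kube-system: Holds objects created by the Kubernetes system itself.
  • kube-public: Readable by all clients, including unauthenticated ones; reserved mainly for cluster-wide public data.
  • kube-node-lease: Holds the Lease objects that nodes use to send heartbeats.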

Resource Quotas and Limits

You can set quotas and limits on resources within a namespace to control CPU, memory, and other resource utilization. Here’s an example:

apiVersion: v1
kind: ResourceQuota
metadata:
  name: my-quota
  namespace: my-namespace
spec:
  hard:
    pods: '10'
    requests.cpu: '4'
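
The quota above caps totals for the namespace. Per-container defaults are set with a separate LimitRange object. The following is a minimal sketch; the name my-limits and all CPU and memory values are assumptions chosen for illustration.

```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: my-limits           # hypothetical name
  namespace: my-namespace
spec:
  limits:
  - type: Container
    defaultRequest:         # applied when a container sets no requests
      cpu: 250m
      memory: 128Mi
    default:                # applied when a container sets no limits
      cpu: 500m
      memory: 256Mi
```

Pairing the two is often useful: once a quota tracks requests.cpu, Pods that omit a CPU request may be rejected, and the LimitRange defaults fill that gap.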

Services

ClusterIP

NodePort

LoadBalancer


Ingress

Ingress Controller

Components of an Ingress Controller:

  1. Controller Software:
    • This is the actual controller component, which is continuously running and watching for changes to Ingress resources.
    • When changes are detected, it updates its configuration accordingly.
    • Popular examples include ingress-nginx, traefik, and HAProxy Ingress.
  2. Pods:
    • The controller software typically runs within one or more Pods.
    • These Pods are responsible for routing the incoming traffic based on Ingress rules.
  3. ConfigMap/Secrets:
    • Ingress Controllers often use ConfigMaps and Secrets to store configurations and TLS certificates, respectively.
    • Changes to these resources can lead the controller to reload or update its configuration.
  4. Service:
    • To expose the Ingress Controller Pods to external traffic, there’s typically a Service of type LoadBalancer or NodePort associated with it.
  5. RBAC (Roles, RoleBindings, ServiceAccounts):
    • To securely access and watch the Ingress resources and associated configurations, the Ingress Controller often runs with specific ServiceAccount, Role, and RoleBinding settings.
  6. Backend Pods and Services:
    • While not part of the Ingress Controller per se, these are essential components in the whole setup.
    • The Ingress Controller uses the Ingress rules to route traffic to these backend services and eventually to the backend pods.
  7. Ingress Resources:
    • Although separate from the Ingress Controller, Ingress resources define the rules for how traffic should be routed.
    • The Ingress Controller watches and implements these rules.
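
To make item 4 concrete, here is a sketch of a Service that exposes the controller Pods. The Service name, the ingress-nginx namespace, the selector labels, and the ports are assumptions modelled on a typical ingress-nginx install; check the manifests of the controller you actually deploy.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: ingress-controller                    # hypothetical name
  namespace: ingress-nginx                    # assumed namespace
spec:
  type: LoadBalancer                          # asks the cloud provider for an external load balancer
  selector:
    app.kubernetes.io/name: ingress-nginx     # assumed labels on the controller Pods
    app.kubernetes.io/component: controller
  ports:
  - name: http
    port: 80
    targetPort: 80
  - name: https
    port: 443
    targetPort: 443
```

On clusters without a cloud load balancer, changing type to NodePort exposes the same Pods on a port of every node instead.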

Traffic Flow with Ingress Controller

  1. External Traffic:
    • External traffic, such as user requests from the internet, first hits the public-facing external load balancer.
  2. External Load Balancer (ELB):
    • Managed by the cloud provider (like AWS ELB, Google Cloud Load Balancer, Azure Load Balancer, DigitalOcean Load Balancer, etc.).
    • It’s designed to distribute incoming traffic across multiple nodes in the Kubernetes cluster.
    • The LoadBalancer service usually configures which node ports the ELB forwards traffic to.
  3. LoadBalancer Service:
    • It’s a Kubernetes service of type LoadBalancer.
    • Acts as an interface between the external load balancer and the internal ClusterIP services and pods.
    • Asks the cloud provider to provision an external load balancer (or reuses an existing one) and automatically configures it to forward traffic to the service’s pods, often via NodePorts.
  4. ClusterIP Service:
    • Once the traffic reaches the cluster (thanks to the LoadBalancer service), the ClusterIP service takes over.
    • ClusterIP is the default type of service in Kubernetes. It exposes the service on an internal IP in the cluster, making the service reachable only from within the cluster.
    • In the context of ingress-nginx, this service forwards the traffic to the ingress controller pods.
  5. Ingress-nginx Controller Pods (with NGINX):
    • These pods make the actual routing decisions, based on the host, path, or other request parameters.
    • They continuously watch for changes in Ingress resources across the cluster.
    • When a request comes in, the embedded NGINX determines the destination using Ingress rules, guiding the traffic to the appropriate service within a specific namespace (dev, prod, etc.).
  6. Ingress Resources:
    • These are defined per application or environment. They dictate how incoming requests should be routed.
    • For instance, requests with a host header of dev.botiga.com might be routed to services in the dev namespace, while prod.botiga.com would target services in the prod namespace.
  7. Destination Services & Pods:
    • Based on the decisions made by NGINX in the ingress controller pods, traffic is then forwarded to the target services.
    • These services, in turn, route the traffic to the respective application pods in their designated namespaces (dev, prod, etc.).

In this model, the LoadBalancer service effectively bridges the gap between the external world and your Kubernetes cluster, ensuring traffic can smoothly flow into your applications. The combination of both LoadBalancer and ClusterIP services with the ingress-nginx controller pods allows for efficient, rule-based routing of external traffic deep into the appropriate areas of the cluster.
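
The host-based routing described in step 6 could be expressed with an Ingress resource like the sketch below. The Ingress name, the backend Service name (botiga-web), its port, and the nginx ingress class are assumptions; only the host and the dev namespace come from the text above.

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: botiga-dev              # hypothetical name
  namespace: dev
spec:
  ingressClassName: nginx       # assumes the controller registers the "nginx" class
  rules:
  - host: dev.botiga.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: botiga-web    # hypothetical Service in the dev namespace
            port:
              number: 80
```

An Ingress can only reference Services in its own namespace, so prod.botiga.com would get a second Ingress of the same shape in the prod namespace.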


Labels

metadata:
  labels:
    env: prod
    tier: backend
    app: auth
    release: stable

| Label | Description |
| --- | --- |
| app.kubernetes.io/name | The name of the application (e.g., “mysql”). |
| app.kubernetes.io/instance | A unique instance name of the application (e.g., “wordpress-abcxzy”). |
| app.kubernetes.io/version | The version of the application (e.g., “v1.0.0”). |
| app.kubernetes.io/component | The component within the architecture (e.g., “database”, “webserver”). |
| app.kubernetes.io/part-of | The name of a higher-level application that this is a part of (e.g., “wordpress”). |
| app.kubernetes.io/managed-by | The tool being used to manage the operation of an application (e.g., “helm”). |

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
  labels:
    app.kubernetes.io/name: my-app
    app.kubernetes.io/instance: my-app-12345
    app.kubernetes.io/version: "1.0.0"
    app.kubernetes.io/component: backend
    app.kubernetes.io/part-of: bigger-app
    app.kubernetes.io/managed-by: helm

Adhering to the set of recommended labels makes it easier to manage applications, enhances interoperability, and promotes best practices within the Kubernetes community.


Label Selectors

Equality-Based Selectors

selector:
  matchLabels:
    env: prod
    app: web

Set-Based Selectors

selector:
  matchExpressions:
    - {key: env, operator: In, values: [prod, staging]}
    - {key: app, operator: NotIn, values: [auth]}
    - {key: tier, operator: Exists}

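In this example:

  • env must be either prod or staging.
  • app must not be auth; resources with no app label also match.
  • tier must be present, with any value.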

You can combine equality-based and set-based selectors as needed to match resources based on complex criteria.
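
A combined selector might look like the fragment below, reusing the label keys from the examples above. When both fields are present, every requirement must hold for a resource to match.

```yaml
selector:
  matchLabels:
    app: web                                              # equality-based: app must equal web
  matchExpressions:
    - {key: env, operator: In, values: [prod, staging]}   # set-based: env is prod or staging
```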


Annotations vs Labels

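| Feature | Labels | Annotations |
| --- | --- | --- |
| Purpose | Identify and select objects | Attach non-identifying metadata |
| Use Cases | Filtering, grouping resources | Additional information, tool metadata |
| Searchable | Yes, via label selectors | No, cannot be used in selectors |
| Syntax | Key/value pairs | Key/value pairs |
| Modification | Dynamic | Dynamic |
| Visibility | Used by Kubernetes itself (selectors, Services, controllers) | Primarily for end users and tools |
| Character Limit | Key name and value: 63 chars each (optional key prefix: up to 253 chars) | Keys follow the same rules as label keys; values have no per-value limit, but all annotations on an object may total at most 256 KiB |
| Validation | Key names and values restricted to alphanumerics, '-', '_', and '.' | Keys validated like label keys; values are free-form strings |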
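
As an illustration of the split, the Pod below carries labels for selection and annotations for descriptive metadata. The annotation keys under example.com and their values are invented for this sketch.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: my-pod
  labels:                       # short, selectable identifiers
    app: auth
    env: prod
  annotations:                  # free-form metadata, not selectable
    example.com/build-commit: "3f2a9c1"             # hypothetical value
    example.com/owner: "team-auth@example.com"      # hypothetical value
spec:
  containers:
  - name: nginx-container
    image: nginx
```

A selector such as env=prod would find this Pod; no selector can match on the annotations.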

Controllers

| Controller | Description |
| --- | --- |
| ReplicaSet Controller | Ensures the desired number of replicas for a Pod are running. Creates or deletes Pods as necessary. |
| Deployment Controller | Manages the lifecycle of applications. Facilitates updates, rollbacks, and scaling. |
| DaemonSet Controller | Ensures that specific Pods run on all (or selected) nodes in the cluster. |
| StatefulSet Controller | Manages stateful applications with stable network IDs, persistent storage, and ordered deployment. |
| Job Controller | Manages one-off tasks that need to run to completion. |
| CronJob Controller | Schedules Jobs to run at specified times or intervals. |

apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: my-replicaset
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
      - name: nginx-container
        image: nginx

Pods vs ReplicaSets vs Deployments

| Feature | Pod | ReplicaSet | Deployment |
| --- | --- | --- | --- |
| Definition | Single unit hosting one or more containers | Ensures desired number of Pods are running | Manages ReplicaSets & enables rolling updates |
| Granularity | Individual container(s) | Group of identical Pods | Manages multiple ReplicaSets for different versions |
| Scaling | Not applicable (single instance) | Manual, by changing replicas | Declarative via replicas; automatic only when paired with a Horizontal Pod Autoscaler |
| Rolling Updates | Not supported | Not supported | Supported |
| Rollbacks | Not supported | Not supported | Supported |
| Recovery | No automatic replacement | Replaces failed Pods | Replaces failed Pods through ReplicaSets |
| Versioning | N/A | N/A | Tracks revisions, one ReplicaSet per version |
| Use Case | Basic container orchestration | Ensuring specific count of identical Pods | Full application lifecycle management |
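
For comparison with the ReplicaSet manifest earlier, a Deployment wraps the same Pod template and adds an update strategy. This is a sketch only: the name my-app, the nginx:1.25 tag, and the surge settings are assumptions.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1               # at most one extra Pod during an update
      maxUnavailable: 0         # never drop below the desired replica count
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
      - name: nginx-container
        image: nginx:1.25       # changing this tag triggers a rolling update
```

Each change to the Pod template creates a new ReplicaSet, and kubectl rollout undo deployment/my-app returns to the previous one.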

Summary:


Health Probes

| Probe Type | Description | When Used | Configuration Parameters |
| --- | --- | --- | --- |
| Liveness | Determines whether the container is still healthy. If the check fails, the kubelet kills the container, which is then subject to the Pod's restart policy. | Ensure the container is healthy and responsive. | initialDelaySeconds, timeoutSeconds, periodSeconds, successThreshold, failureThreshold |
| Readiness | Determines whether the container is ready to receive traffic. If the check fails, the Pod is removed from the endpoints of matching Services. | Ensure the app inside the container is fully initialized and ready to accept traffic. | initialDelaySeconds, timeoutSeconds, periodSeconds, successThreshold, failureThreshold |
| Startup | Determines whether the application in the container has started. Liveness and readiness checks are held off until it succeeds; if it fails, the container is killed and subject to the restart policy. | Ensure the app inside the container has started up correctly. | failureThreshold, periodSeconds |
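
The three probes can be combined on one container, as in the sketch below. The Pod name, the HTTP path and port, and all timing values are assumptions picked for a stock nginx image; tune them for a real application.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: probe-demo              # hypothetical name
spec:
  containers:
  - name: nginx-container
    image: nginx
    ports:
    - containerPort: 80
    startupProbe:               # gives the app up to 12 x 5s = 60s to start
      httpGet:
        path: /
        port: 80
      periodSeconds: 5
      failureThreshold: 12
    livenessProbe:              # restarts the container after 3 consecutive failures
      httpGet:
        path: /
        port: 80
      periodSeconds: 10
      timeoutSeconds: 2
      failureThreshold: 3
    readinessProbe:             # takes the Pod out of Service endpoints while failing
      httpGet:
        path: /
        port: 80
      initialDelaySeconds: 2
      periodSeconds: 5
      successThreshold: 1
      failureThreshold: 3
```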

Why do we need StatefulSets?

The Problem

Using Deployments

If we try to deploy a Cassandra cluster using a Kubernetes Deployment:

Using StatefulSets

Now, let’s see how a StatefulSet addresses these challenges:
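
As a minimal sketch, assuming a three-node Cassandra ring, the manifests below show the two StatefulSet features listed in the concept tables: stable network identities (through a headless Service) and per-Pod persistent storage (through volumeClaimTemplates). The image tag, port, storage size, and names are assumptions, and Cassandra-specific settings such as seed configuration are left out.

```yaml
# Headless Service: gives each Pod a stable DNS name such as cassandra-0.cassandra
apiVersion: v1
kind: Service
metadata:
  name: cassandra
spec:
  clusterIP: None
  selector:
    app: cassandra
  ports:
  - name: cql
    port: 9042
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: cassandra
spec:
  serviceName: cassandra        # ties Pod DNS names to the headless Service
  replicas: 3                   # Pods are created in order: cassandra-0, -1, -2
  selector:
    matchLabels:
      app: cassandra
  template:
    metadata:
      labels:
        app: cassandra
    spec:
      containers:
      - name: cassandra
        image: cassandra:4.1    # assumed image tag
        ports:
        - name: cql
          containerPort: 9042
        volumeMounts:
        - name: data
          mountPath: /var/lib/cassandra
  volumeClaimTemplates:         # one PVC per Pod, kept across rescheduling
  - metadata:
      name: data
    spec:
      accessModes:
      - ReadWriteOnce
      resources:
        requests:
          storage: 1Gi
```

If cassandra-1 is rescheduled, it comes back with the same name and reattaches to the same claim (data-cassandra-1), which a Deployment's interchangeable Pods do not guarantee.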

Conclusion
