Managing Persistent Volumes in Deployment

#90 Days of DevOps Challenge - Day 36

In the real world, almost every application performs read or write operations on data, so we need a mechanism to store data for applications running inside the Kubernetes cluster. There are two ways to store data in Kubernetes: ephemeral volumes and persistent volumes.

Note: For any kind of volume in a given pod, data is preserved across container restarts.

So, let's first understand what a Volume is.

As we've seen in Docker, if you store data locally inside a container, it is lost when the container crashes. The new container that replaces it has none of the previous data, so the loss is complete. Thus, we cannot rely on containers themselves for storing data.

Also, a Pod can have multiple containers running inside it, so there will be cases where a sidecar container needs to access or process the data produced by the main application container.

To solve these two problems, Kubernetes Volumes come to the rescue. A Kubernetes Volume is exposed to applications as an abstraction, which eventually stores the data on the physical storage that you have provided. At its core, a volume could be a directory, a LUN, a block volume, an NFS file share, and so on.

Types of Volumes:

  • Ephemeral Volumes:-

This type of volume is tied to the lifecycle of the Pod, not to the lifecycle of the containers running inside it. You can also share data between the containers inside the same Pod. The only issue with this type of volume is that you lose your data once the Pod goes away. So, it is not a good fit for production workloads that need durable data.

  • Persistent Volumes (PV):-

Independent of the Pod's lifecycle (what happens to the data after the claim is released depends on the reclaim policy), so your data remains available even after the Pod is deleted. There are different types of PVs, depending on the storage backend.

Ephemeral Volumes

  1. emptyDir

    An emptyDir is an empty volume that is created when a Pod is assigned to a node. It exists as long as that Pod is running on that node. All containers in the Pod can read and write the same files in the emptyDir volume, even if the volume is mounted at a different path in each container. When a Pod is removed from a node for any reason, the data in the emptyDir is deleted permanently.

  2. hostPath

    A hostPath volume mounts a file or directory from the host node's filesystem into your Pod. For example, a container that needs access to Docker internals can use a hostPath of /var/lib/docker. hostPath volumes present many security risks, and it is a best practice to avoid them when possible.
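
To make the two volume types above more concrete, here is a minimal sketch of a Pod that uses both. Everything in it is an assumption made for illustration: the Pod name, the busybox image, the container commands, and the /var/log host path are not part of the todo app used later in this post. The main container writes to an emptyDir, the sidecar reads the same files through a different mount path, and the hostPath is mounted read-only to limit the security risk.

```yaml
# Hypothetical Pod for illustration only; names, images and paths are assumptions.
apiVersion: v1
kind: Pod
metadata:
  name: volume-demo
spec:
  containers:
    - name: main-app
      image: busybox:1.36
      # Appends a timestamp to a file on the shared volume every 5 seconds.
      command: ["sh", "-c", "while true; do date >> /data/out.log; sleep 5; done"]
      volumeMounts:
        - name: shared-data
          mountPath: /data
    - name: sidecar
      image: busybox:1.36
      # Follows the same file, seen through a different mount path.
      command: ["sh", "-c", "touch /logs/out.log; tail -f /logs/out.log"]
      volumeMounts:
        - name: shared-data
          mountPath: /logs
        - name: node-logs
          mountPath: /host-logs
          readOnly: true        # read-only to reduce the hostPath risk
  volumes:
    - name: shared-data
      emptyDir: {}              # deleted permanently when the Pod leaves the node
    - name: node-logs
      hostPath:
        path: /var/log          # directory on the node's filesystem
        type: Directory         # the directory must already exist on the node
```

If a container in this Pod restarts, the files in the emptyDir survive. If the Pod itself is deleted, they are gone.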

Persistent Volumes

So far, what we have learned about ephemeral volumes is that the volume and its data are available only for the life of the Pod. And in the Kubernetes world, Pods are ephemeral: they come and go. So, ephemeral volumes are ONLY recommended for test applications.

Now the question arises: how do we make data persist in Kubernetes? The answer is PersistentVolume, PersistentVolumeClaim and StorageClass. We will go through all of these one by one. Before that, I would like to mention that in Kubernetes you can provision PersistentVolumes either manually or dynamically. We will also discuss the Container Storage Interface (CSI), which is a kind of plugin that we have to install based on our storage selection.

  • Manual Provisioning


  • Persistent Volumes(PV) :-

    A persistent volume represents a piece of storage that has been provisioned for use with Kubernetes pods. A persistent volume can be used by one or many pods, and can be dynamically or statically provisioned.

  • Persistent Volume Claims (PVC) :-

A PersistentVolumeClaim (PVC) is a request for storage by a user. It is similar to a Pod: Pods consume node resources, and PVCs consume PV resources. Pods can request specific levels of resources (CPU and memory). Claims can request a specific size and specific access modes (e.g., they can be mounted ReadWriteOnce, ReadOnlyMany or ReadWriteMany).

  • Container Storage Interface (CSI):-

Kubernetes just provides an interface for Pods to consume storage. Storage properties such as speed, replication and resiliency are all the storage provider's responsibility. For storage providers to integrate their storage with Kubernetes, they have to write CSI plugins.


CSI code maintenance, updates and bug fixes are the responsibility of the storage provider; Kubernetes has nothing to do with that.
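
As a rough sketch of how a manually provisioned volume backed by a CSI plugin might look, the manifest below defines a PersistentVolume with a csi section. The driver name and the volume handle are hypothetical placeholders: the real values come from whichever CSI plugin your storage provider ships and from a volume that already exists on that storage system.

```yaml
# Sketch only: the driver name and volume handle are hypothetical placeholders.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-csi-example
spec:
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain   # keep the data after the claim is released
  csi:
    driver: csi.example.com               # name registered by the provider's CSI plugin
    volumeHandle: existing-volume-id      # ID of a volume that already exists on the backend
    fsType: ext4
```

A PVC with a matching size and access mode can then bind to this volume in the same way as in Task 1 below.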

  • Dynamic Provisioning

  1. Storage Class(SC):-

StorageClass is an API object. It provides a way for administrators to describe the "classes" of storage they offer. With a microservices architecture in mind, storage needs to be dynamic, and this is what StorageClass offers out of the box. For a large enterprise-level application, it is not feasible for an administrator to provision volumes manually; it is not a scalable solution at all. StorageClasses enable dynamic provisioning of volumes.

Every cloud provider supports different provisioners, check your cloud documentation for more information.
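
For illustration, here is a minimal sketch of dynamic provisioning. It assumes a cluster where the AWS EBS CSI driver is installed, so the provisioner and the gp3 parameter are specific to that assumption. The class name and claim name are hypothetical. On another provider, replace the provisioner and parameters with the ones from that provider's documentation.

```yaml
# Assumption: the cluster runs the AWS EBS CSI driver; swap the provisioner for your provider's.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-ssd                          # hypothetical class name
provisioner: ebs.csi.aws.com
parameters:
  type: gp3                               # provider-specific parameter
reclaimPolicy: Delete                     # the volume is removed when the claim is deleted
volumeBindingMode: WaitForFirstConsumer   # provision only once a Pod uses the claim
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-dynamic-example               # hypothetical claim name
spec:
  storageClassName: fast-ssd              # requests a volume from the class above
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
```

With this in place, no PersistentVolume has to be written by hand: the provisioner creates one when the claim is first used by a Pod.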


Task 1: Add a Persistent Volume to your Deployment todo app.

  • Create a Persistent Volume using a file on your node.
vi pv.yml

  apiVersion: v1
  kind: PersistentVolume
  metadata:
    name: pv-todo-app
  spec:
    capacity:
      storage: 1Gi
    accessModes:
      - ReadWriteOnce
    persistentVolumeReclaimPolicy: Retain
    hostPath:
      path: "/tmp/data"

  • Now create the Persistent Volume by running the command below
kubectl apply -f pv.yml

  • Create a Persistent Volume Claim that references the Persistent Volume
vi pvc.yml

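apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-todo-app
spec:
  # An empty class disables dynamic provisioning, so the claim is not
  # served by a default StorageClass if the cluster has one.
  storageClassName: ""
  # Bind explicitly to the Persistent Volume created above.
  volumeName: pv-todo-app
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 500Mi

  • Now create the Persistent Volume Claim by running the command below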
kubectl apply -f pvc.yml

  • After applying pv.yml and pvc.yml, update your deployment.yml file to include the Persistent Volume Claim.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: todo-app-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      app: todo-app
  template:
    metadata:
      labels:
        app: todo-app
    spec:
      containers:
        - name: todo-app
          image: saikat55mukherjee/django-todo
          ports:
            - containerPort: 8000
          volumeMounts:
            - name: todo-app-data
              mountPath: /app
      volumes:
        - name: todo-app-data
          persistentVolumeClaim:
            claimName: pvc-todo-app

  • Now apply the updated deployment file using the command below
 kubectl apply -f deployment.yml

  • Verify that the Persistent Volume has been added to your Deployment by checking the status of the Pods, Persistent Volumes and Persistent Volume Claims in your cluster.
  # Check that the pod is running
  kubectl get pods

  # List the Persistent Volumes and their status
  kubectl get pv

  # List the Persistent Volume Claims and their status
  kubectl get pvc

  # Show the details of the claim used by the application
  kubectl describe pvc pvc-todo-app

Task 2: Accessing data in the Persistent Volume

  • Connect to a Pod in your Deployment using the command:

    kubectl exec -it <pod-name> -- /bin/bash

  • Verify that you can access the data stored in the Persistent Volume from within the Pod by creating a file inside the /app directory of the container using the following commands:

  # List the pods to find the pod name
  kubectl get pods

  # Open a shell inside the pod
  kubectl exec -it <pod-name> -- /bin/bash
  # Ex:- kubectl exec -it todo-app-deployment-6f9f59546b-8zjwk -- /bin/bash

  # Inside the pod: change to /app, create test.txt and verify it with ls
  cd /app/
  echo "Hello... This file was created from 6f9f59546b-8zjwk" > /app/test.txt
  ls

  • Now delete the pod in which we stored the text file. Because the Deployment is self-healing, a new pod is created the moment the old one is deleted. Delete the pod by running the commands below:

      kubectl get pods
      kubectl delete pod <name_of_the_pod>

    Now if we run kubectl get pods again, we can see that a new pod has been created.

  • Now go inside the new pod and verify whether the text file we created earlier is still there

  # List the pods to find the name of the new pod
  kubectl get pods

  # Open a shell inside the new pod
  kubectl exec -it <pod-name> -- /bin/bash
  # Ex:- kubectl exec -it todo-app-deployment-6f9f59546b-2j4w9 -- /bin/bash

  # Inside the pod: the file written from the old pod should still be there
  cd /app/
  ls
  cat test.txt

Thank you for reading!! I hope you find this article helpful!!

If you have any queries or corrections for this blog, please let me know.

Happy Learning!!

Saikat Mukherjee
