Amazon Web Services – February 2, 2023

The Kubernetes project is made up of a number of special interest groups (SIGs) that each focus on a particular part of the Kubernetes ecosystem. The Storage SIG is focused on different types of storage (block and file) and on ensuring that storage is available to containers when they are scheduled. One of the subprojects of the Storage SIG is the Local Volume Static Provisioner, a Container Storage Interface (CSI) driver that creates Kubernetes PersistentVolumes for persistent disks attached during instance startup.

This post discusses deploying the Local Volume Static Provisioner CSI driver using Amazon EKS managed node groups and pre-bootstrap commands to expose the NVMe EC2 instance store drives as Kubernetes PV objects. Customers may wish to use the local NVMe storage volumes to achieve higher performance than is possible with the general-purpose Amazon EBS boot volume. Note that instance store volumes are for temporary storage: the data is lost when the Amazon Elastic Compute Cloud (Amazon EC2) instance is stopped or terminated. To persist data stored in instance store volumes across the lifecycle of an instance, you need to handle replication at the application layer.

Storage in Kubernetes

A Kubernetes PersistentVolume (PV) is a cluster resource that defines the capabilities and location of a single storage volume. A PersistentVolumeClaim (PVC) defines a request for a certain amount of storage resources. When a Kubernetes Pod needs storage, it references a PVC. Kubernetes then works to match the PVC to an available PV, or automatically provisions a new PV if the CSI driver supports dynamic provisioning.

Below is an example of a Kubernetes Pod, PV, and PVC. The manifest first defines a PV named my-pv, which offers 5 GiB of storage and belongs to the storage class nfs. Next, the manifest creates a PVC named nfs-claim, which requests 4 GiB of nfs storage.
Finally, the Pod named app mounts the storage claimed by the nfs-claim PVC.

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: my-pv
spec:
  capacity:
    storage: 5Gi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Recycle
  storageClassName: nfs
  mountOptions:
    - hard
    - nfsvers=4.1
  nfs:
    path: /tmp
    server: 172.17.0.2
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nfs-claim
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: nfs
  resources:
    requests:
      storage: 4Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: app
spec:
  containers:
    - name: app
      image: centos
      command: ["/bin/sh"]
      args: ["-c", "while true; do echo $(date -u) >> /data/out.txt; sleep 5; done"]
      volumeMounts:
        - name: persistent-storage
          mountPath: /data
  volumes:
    - name: persistent-storage
      persistentVolumeClaim:
        claimName: nfs-claim
```

Walkthrough

This post walks you through the following steps to install and test the Local Volume Static Provisioner:

- Create a service account with cluster-level permissions
- Create a ConfigMap for the CSI driver
- Create a DaemonSet to deploy the CSI driver
- Create Amazon EKS managed node groups (two options) with boot scripts that expose the NVMe instance store to Kubernetes Pods
- Clean up

Prerequisites

We need a few prerequisites and tools to run through these steps. Ensure you have the following in your working environment:

- An existing EKS cluster, version 1.21 or higher
- kubectl
- eksctl

Installing the Local Volume Static Provisioner

The Local Volume Static Provisioner handles both the detection of local disks mounted under a predefined file system path and the creation of PVs for them. Installing it requires deploying a DaemonSet and granting Kubernetes API and host-level permissions.

Kubernetes Service Accounts and Permissions

The CSI driver needs permission to issue API calls to the Kubernetes control plane to manage the lifecycle of the PVs.
The manifest below defines a Kubernetes service account and binds it to a Kubernetes cluster role that grants the necessary Kubernetes API permissions. Copy and save the manifest below as service-account.yaml:

```yaml
# K8s service account for the CSI driver
apiVersion: v1
kind: ServiceAccount
metadata:
  name: local-volume-provisioner
  namespace: kube-system
---
# List of permissions
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: local-storage-provisioner-node-clusterrole
rules:
  - apiGroups: [""]
    resources: ["persistentvolumes"]
    verbs: ["get", "list", "watch", "create", "delete"]
  - apiGroups: ["storage.k8s.io"]
    resources: ["storageclasses"]
    verbs: ["get", "list", "watch"]
  - apiGroups: [""]
    resources: ["events"]
    verbs: ["watch"]
  - apiGroups: ["", "events.k8s.io"]
    resources: ["events"]
    verbs: ["create", "update", "patch"]
  - apiGroups: [""]
    resources: ["nodes"]
    verbs: ["get"]
---
# Attach the K8s ClusterRole to our K8s ServiceAccount
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: local-storage-provisioner-node-binding
subjects:
  - kind: ServiceAccount
    name: local-volume-provisioner
    namespace: kube-system
roleRef:
  kind: ClusterRole
  name: local-storage-provisioner-node-clusterrole
  apiGroup: rbac.authorization.k8s.io
```

Run the following command to create the ServiceAccount, ClusterRole, and ClusterRoleBinding:

```shell
kubectl apply -f service-account.yaml
```

CSI Driver ConfigMap

The Local Volume Static Provisioner stores its configuration — where to look for mounted EC2 NVMe instance store volumes and how to expose them as PVs — in a Kubernetes ConfigMap. The ConfigMap below tells the provisioner to look for mounted NVMe instance store volumes in the /mnt/fast-disks directory. A Kubernetes StorageClass identifies a type of storage available in the cluster; the manifest includes a new StorageClass named fast-disks to indicate that these PVs map to NVMe instance store volumes.
Copy and save the manifest below as config-map.yaml:

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-disks
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer
# Supported policies: Delete, Retain
reclaimPolicy: Retain
---
# Configuration for our Local Persistent Volume CSI driver
apiVersion: v1
kind: ConfigMap
metadata:
  name: local-volume-provisioner-config
  namespace: kube-system
data:
  # Adds the node's hostname as a label to each PV created
  nodeLabelsForPV: |
    - kubernetes.io/hostname
  storageClassMap: |
    fast-disks:
      # Path on the host under which local volumes of this
      # storage class are mounted.
      hostDir: /mnt/fast-disks
      # Optionally specify the mount path of local volumes.
      # By default, we use the same path as hostDir in the container.
      mountDir: /mnt/fast-disks
      # The /scripts/shred.sh script is contained in the CSI driver's container:
      # https://github.com/kubernetes-sigs/sig-storage-local-static-provisioner/blob/master/deployment/docker/scripts/shred.sh
      blockCleanerCommand:
        - "/scripts/shred.sh"
        - "2"
      # The volume mode of the PV defines whether a device volume is
      # intended to be used as a formatted filesystem volume or to remain
      # in block state. A value of Filesystem is implied when omitted.
      volumeMode: Filesystem
      fsType: ext4
      # Name pattern check:
      # only discover local disks mounted at paths matching the pattern ("*" by default).
      namePattern: "*"
```

Run the following command to create the StorageClass and ConfigMap:

```shell
kubectl apply -f config-map.yaml
```

CSI Driver DaemonSet

The Local Volume Static Provisioner runs on each Amazon EKS node that needs its NVMe instance store volumes exposed as Kubernetes PVs. Kubernetes clusters often contain multiple instance types, and some nodes might not have NVMe instance store volumes. The DaemonSet in the following manifest therefore specifies a nodeAffinity selector so that it is only scheduled on Amazon EKS nodes carrying a fast-disk-node label with a value of either pv-raid or pv-nvme.
Copy and save the following manifest as daemonset.yaml:

```yaml
# The Local Persistent Volume CSI DaemonSet
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: local-volume-provisioner
  namespace: kube-system
  labels:
    app.kubernetes.io/name: local-volume-provisioner
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: local-volume-provisioner
  template:
    metadata:
      labels:
        app.kubernetes.io/name: local-volume-provisioner
    spec:
      serviceAccountName: local-volume-provisioner
      containers:
        # The latest version can be found in the changelog.
        # In production, one might want to use the container digest hash
        # instead of the version tag for improved security.
        # https://github.com/kubernetes-sigs/sig-storage-local-static-provisioner/blob/master/CHANGELOG.md
        - image: "registry.k8s.io/sig-storage/local-volume-provisioner:v2.5.0"
          # In production you might want to use a locally cached
          # image by setting this to: IfNotPresent
          imagePullPolicy: "Always"
          name: provisioner
          securityContext:
            privileged: true
          env:
            - name: MY_NODE_NAME
              valueFrom:
                fieldRef:
                  fieldPath: spec.nodeName
            - name: MY_NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
          ports:
            # List of metrics at
            # https://github.com/kubernetes-sigs/sig-storage-local-static-provisioner/blob/cee9e228dc28a4355f664b4fe2236b1857fe4eca/pkg/metrics/metrics.go
            - name: metrics
              containerPort: 8080
          volumeMounts:
            - name: provisioner-config
              mountPath: /etc/provisioner/config
              readOnly: true
            - name: fast-disks
              mountPath: /mnt/fast-disks
              mountPropagation: "HostToContainer"
      volumes:
        - name: provisioner-config
          configMap:
            name: local-volume-provisioner-config
        - name: fast-disks
          hostPath:
            path: /mnt/fast-disks
      # Only run the CSI driver on the `fast-disk` tagged node groups
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: fast-disk-node
                    operator: In
                    values:
                      - "pv-raid"
                      - "pv-nvme"
```

Run the following command to create the DaemonSet:
```shell
kubectl apply -f daemonset.yaml
```

Amazon EKS Managed Node Group – Pre-bootstrap Commands

The ConfigMap we deployed has the Local Volume Static Provisioner looking for disks mounted in the /mnt/fast-disks directory on nodes labeled fast-disk-node with a value of pv-raid or pv-nvme. Now we need to configure our Amazon EKS managed node group to launch EC2 instances with the fast-disk-node label and, on startup, mount the NVMe instance store volumes under the /mnt/fast-disks directory. This post covers two approaches:

- Multiple persistent volumes, one for each NVMe instance store volume
- A single persistent volume backed by a RAID-0 array across all the NVMe instance store volumes

Both options deliver high random I/O performance and very low latency to your Kubernetes Pods, but they suit different use cases. In Option 1, a persistent volume is created for each NVMe instance store volume. The i3.8xlarge instance used in this post has four NVMe volumes, so Option 1 creates four persistent volumes — useful when multiple Pods need fast storage. Option 2 creates a single persistent volume using RAID-0, which is useful when only a single Pod needs fast storage.

Option 1: Multiple Persistent Volumes, One for Each NVMe Instance Store

Using the eksctl utility, we create a new Amazon EKS managed node group. In this example, we've requested two i3.8xlarge EC2 instances. In the metadata, replace eksworkshop-eksctl and us-west-2 with your EKS cluster's name and AWS Region.
Copy and save the manifest below as pv-nvme-nodegroup.yaml:

```yaml
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  # Replace with your EKS cluster's name
  name: eksworkshop-eksctl
  # Replace with the AWS Region your cluster is deployed in
  region: us-west-2
managedNodeGroups:
  # Name to give the managed node group
  - name: eks-pv-nvme-ng
    # Label the nodes to indicate that they contain fast disks
    labels: { fast-disk-node: "pv-nvme" }
    instanceType: i3.8xlarge
    desiredCapacity: 2
    volumeSize: 100 # EBS boot volume size
    privateNetworking: true
    preBootstrapCommands:
      - |
        # Install the NVMe CLI
        yum install nvme-cli -y

        # Get a list of instance-store NVMe drives
        nvme_drives=$(nvme list | grep "Amazon EC2 NVMe Instance Storage" | cut -d " " -f 1 || true)
        readarray -t nvme_drives <<< "$nvme_drives"

        for disk in "${nvme_drives[@]}"
        do
          # Format the disk with ext4
          mkfs.ext4 -F $disk

          # Get the disk's UUID
          uuid=$(blkid -o value -s UUID $disk)

          # Create a filesystem path at which to mount the disk
          mount_location="/mnt/fast-disks/${uuid}"
          mkdir -p $mount_location

          # Mount the disk
          mount $disk $mount_location

          # Remount the disk after a reboot
          echo $disk $mount_location ext4 defaults,noatime 0 2 >> /etc/fstab
        done
```

Run the following command to create the Amazon EKS managed node group:

```shell
eksctl create nodegroup -f pv-nvme-nodegroup.yaml
```

Option 2: Single Persistent Volume RAID-0 Array Across All the NVMe Instance Stores

As in the last example, we use the eksctl utility to create a new Amazon EKS managed node group. This time, however, a software RAID-0 array is created across the NVMe instance store volumes. In the metadata, replace eksworkshop-eksctl and us-west-2 with your EKS cluster's name and AWS Region.
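Both node group options discover the instance store devices the same way: they filter the `nvme list` output for the instance storage model string and take the first space-delimited field, which is the device path. The following standalone sketch exercises that pipeline against mocked `nvme list` output (the device names and column layout here are illustrative assumptions, not real command output):

```shell
#!/usr/bin/env bash
# Mocked `nvme list` output: one EBS boot volume plus two instance store volumes.
mock_nvme_list() {
  cat <<'EOF'
/dev/nvme0n1     vol0123456789abcdef0   Amazon Elastic Block Store        1
/dev/nvme1n1     AWS10000000000000001   Amazon EC2 NVMe Instance Storage  1
/dev/nvme2n1     AWS20000000000000002   Amazon EC2 NVMe Instance Storage  1
EOF
}

# Same pipeline as the preBootstrapCommands: keep only the instance store
# rows, then cut the first space-delimited field (the device path).
drives=$(mock_nvme_list | grep "Amazon EC2 NVMe Instance Storage" | cut -d " " -f 1 || true)
readarray -t nvme_drives <<< "$drives"

echo "Found ${#nvme_drives[@]} instance store drives: ${nvme_drives[*]}"
```

Running this prints the two instance store device paths and skips the EBS volume, which is why the boot volume is never reformatted by the bootstrap script.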
Copy and save the manifest below as pv-raid-nodegroup.yaml:

```yaml
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  # Replace with your EKS cluster's name
  name: eksworkshop-eksctl
  # Replace with the AWS Region your cluster is deployed in
  region: us-west-2
managedNodeGroups:
  # Name to give the managed node group
  - name: eks-pv-raid-ng
    # Label the nodes to indicate that they contain fast disks
    labels: { fast-disk-node: "pv-raid" }
    instanceType: i3.8xlarge
    desiredCapacity: 2
    volumeSize: 100 # EBS boot volume size
    privateNetworking: true
    preBootstrapCommands:
      - |
        # Install the NVMe CLI
        yum install nvme-cli -y

        # Get a list of instance-store NVMe drives
        nvme_drives=$(nvme list | grep "Amazon EC2 NVMe Instance Storage" | cut -d " " -f 1 || true)
        readarray -t nvme_drives <<< "$nvme_drives"
        num_drives=${#nvme_drives[@]}

        # Install the software RAID utility
        yum install mdadm -y

        # Create a RAID-0 array across the instance store NVMe SSDs
        mdadm --create /dev/md0 --level=0 --name=md0 --raid-devices=$num_drives "${nvme_drives[@]}"

        # Format the array with ext4
        mkfs.ext4 /dev/md0

        # Get the RAID array's UUID
        uuid=$(blkid -o value -s UUID /dev/md0)

        # Create a filesystem path at which to mount the array
        mount_location="/mnt/fast-disks/${uuid}"
        mkdir -p $mount_location

        # Mount the RAID device
        mount /dev/md0 $mount_location

        # Remount the array after a reboot
        mdadm --detail --scan >> /etc/mdadm.conf
        echo /dev/md0 $mount_location ext4 defaults,noatime 0 2 >> /etc/fstab
```

Run the following command to create the Amazon EKS managed node group:

```shell
eksctl create nodegroup --config-file=pv-raid-nodegroup.yaml
```

Viewing Persistent Volumes and DaemonSets

After the Amazon EKS managed node group is created, the Local Volume Static Provisioner is scheduled as a DaemonSet on each of the node group's EC2 instances. The DaemonSet discovers the NVMe instance store volumes mounted in /mnt/fast-disks and exposes them as persistent volumes.
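Once these persistent volumes exist, a Pod consumes one through a PVC that references the fast-disks StorageClass. Because the StorageClass uses volumeBindingMode: WaitForFirstConsumer, the PVC stays Pending until a Pod that uses it is scheduled onto a fast-disk node. The sketch below is illustrative: the claim and Pod names and the 100Gi request are assumptions, and the request must fit within the capacity of a discovered local volume.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: fast-disk-claim
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: fast-disks
  resources:
    requests:
      # Must fit within a discovered local volume's capacity
      storage: 100Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: fast-disk-app
spec:
  containers:
    - name: app
      image: centos
      command: ["/bin/sh", "-c", "sleep infinity"]
      volumeMounts:
        - name: fast-storage
          mountPath: /data
  volumes:
    - name: fast-storage
      persistentVolumeClaim:
        claimName: fast-disk-claim
```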
To view the running DaemonSets, run the following command:

```shell
kubectl get daemonset --namespace=kube-system
```

To view the persistent volumes, run the following command:

```shell
kubectl get pv
```

Clean up

First, remove the instance store node groups. Note that eksctl starts a CloudFormation stack, which can take a few minutes to terminate the nodes and associated resources.

```shell
# Replace `eksworkshop-eksctl` with your EKS cluster's name
eksctl delete nodegroup --cluster=eksworkshop-eksctl --name=eks-pv-nvme-ng
eksctl delete nodegroup --cluster=eksworkshop-eksctl --name=eks-pv-raid-ng
```

Then delete the persistent volumes:

```shell
kubectl get pv --no-headers | awk '$6=="fast-disks" { print $1 }' | xargs kubectl delete pv
```

Finally, remove the associated Kubernetes objects:

```shell
kubectl delete -n=kube-system daemonset local-volume-provisioner
kubectl delete -n=kube-system configmap local-volume-provisioner-config
kubectl delete storageclass fast-disks
kubectl delete clusterrolebinding local-storage-provisioner-node-binding
kubectl delete clusterrole local-storage-provisioner-node-clusterrole
kubectl delete -n=kube-system serviceaccount local-volume-provisioner
```

Conclusion

In this post, we described how to use the Local Volume Static Provisioner CSI driver developed by the Kubernetes Storage special interest group. With Amazon EKS managed node group pre-bootstrap commands, you can customize the provisioning of NVMe instance store volumes to meet your PV needs. Application developers can use this deployment pattern to give each Pod access to an isolated instance store, or to a shared storage layer for cross-Pod access. The Local Volume Static Provisioner is third-party open-source software released under the Apache 2.0 license.

We encourage you to visit the AWS Containers Roadmap page on GitHub to stay in touch with the latest additions and upcoming features of AWS container services.