Kubernetes/Storage
Revision as of 08:19, 31 July 2019
PV - Persistent Volumes
- StorageClass (SC)
 - Describes how a PersistentVolume should be created, by specifying a provisioner in a manifest; in Google it will be the GCE provisioner, in AWS the EBS provisioner. You can also specify other settings to be passed to the provisioner, like diskType: magnetic, ssd, IOPS etc. It provides a way for administrators to describe the classes of storage they offer (IOPS, performance, ssd); this is called profiles in other storage systems.
 
- PersistentVolumeClaim (PVC)
 - A request for storage by a user, similar to a pod. Pods consume node resources and PVCs consume PV resources. Pods can request specific levels of resources (CPU and memory); claims can request a specific size and access modes (e.g., can be mounted once read/write or many times read-only). This is a way for a pod to "claim" already provisioned storage. A PVC can request space from a StorageClass or directly from an already created PV.
 
- PersistentVolume (PV)
 - A piece of storage in the cluster that has been provisioned by an administrator (using e.g. the gcloud compute disks create command) or dynamically provisioned using Storage Classes. It is a resource in the cluster just like a node is a cluster resource. PVs are volume plugins like Volumes, but have a lifecycle independent of any individual pod that uses the PV. This API object captures the details of the implementation of the storage, be that NFS or a cloud-provider-specific storage system, e.g. AWS EBS or GCE persistent disks.
 
PV access modes (a capability of the node, not the pod; a volume can be mounted using only one access mode at a time, even if it supports many):
 - ReadWriteOnce (RWO) - only a single node can mount the volume for reading and writing
 - ReadOnlyMany (ROX) - multiple nodes can mount the volume for reading only
 - ReadWriteMany (RWX) - multiple nodes can mount the volume for reading and writing
 
Operations
In this example we use Google Cloud and run from Cloud Shell.
#Create a volume; the zone must match the zone of the K8s cluster node the pod will run on
gcloud compute disks create --size=1GiB --zone=europe-west1-b mongodb 
gcloud compute disks list
NAME                                               LOCATION        LOCATION_SCOPE  SIZE_GB  TYPE         STATUS
gke-standard-cluster-1-default-pool-c43dab38-4qdn  europe-west1-b  zone            100      pd-standard  READY
gke-standard-cluster-1-default-pool-c43dab38-553f  europe-west1-b  zone            100      pd-standard  READY
gke-standard-cluster-1-default-pool-c43dab38-qc0z  europe-west1-b  zone            100      pd-standard  READY
mongodb                                            europe-west1-b  zone            1        pd-standard  READY
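Before creating the disk you can confirm the zone the cluster's node VMs run in; gcloud compute instances list shows it in the ZONE column (output abridged here):

```bash
#Verify the node VMs run in the same zone as the new disk
gcloud compute instances list
NAME                                               ZONE            ...
gke-standard-cluster-1-default-pool-c43dab38-4qdn  europe-west1-b  ...
```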
#Create a pod, using the Pod manifest from the table below
kubectl apply -f mongodb.yaml
kubectl describe pod mongodb
...
Volumes:
  mongodb-data:
    Type:       GCEPersistentDisk (a Persistent Disk resource in Google Compute Engine)
    PDName:     mongodb
    FSType:     ext4
    Partition:  0
    ReadOnly:   false
  default-token-fvhrt:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-fvhrt
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type    Reason                  Age   From                     Message
  ----    ------                  ----  ----                     -------
  Normal  Scheduled               29s   default-scheduler        Successfully assigned default/mongodb to gke-standard-cluster-1-default-pool-c43dab38-553f
  Normal  SuccessfulAttachVolume  22s   attachdetach-controller  AttachVolume.Attach succeeded for volume "mongodb-data"
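The attachment can also be confirmed from the GCE side; gcloud compute disks describe lists the instance currently using the disk in its users field (a sketch, output abridged):

```bash
gcloud compute disks describe mongodb --zone=europe-west1-b
...
status: READY
users:
- https://www.googleapis.com/compute/v1/projects/.../zones/europe-west1-b/instances/gke-standard-cluster-1-default-pool-c43dab38-553f
```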
Pod manifest (mongodb.yaml):
apiVersion: v1
kind: Pod
metadata:
  name: mongodb 
spec:
  volumes:
  - name: mongodb-data
    gcePersistentDisk:
      pdName: mongodb #gcloud disk name
      fsType: ext4
  containers:
  - image: mongo
    name: mongodb
    volumeMounts:
    - name: mongodb-data #volume name from spec.volumes above
      mountPath: /data/db #where MongoDB stores its data
    ports:
    - containerPort: 27017 #standard MongoDB port
      protocol: TCP
PersistentVolume manifest:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: mongodb-pv
spec:
  capacity: 
    storage: 2Gi
  accessModes:
    - ReadWriteOnce
    - ReadOnlyMany
  persistentVolumeReclaimPolicy: Retain
  gcePersistentDisk:
    pdName: mongodb
    fsType: ext4
Save data to the DB, then delete the pod and re-create it to verify data persistence
kubectl exec -it mongodb mongo #connect to MongoDB
> use mydb
switched to db mydb
> db.foo.insert({name:'foo'})
WriteResult({ "nInserted" : 1 })
> db.foo.find()
{ "_id" : ObjectId("5d3aa87f4f89408f62df4e8b"), "name" : "foo" }
> exit
#Drain a node that the pod is running on
kubectl get nodes
NAME                                                STATUS   ROLES    AGE   VERSION
gke-standard-cluster-1-default-pool-c43dab38-4qdn   Ready    <none>   44m   v1.12.8-gke.10
gke-standard-cluster-1-default-pool-c43dab38-553f   Ready    <none>   44m   v1.12.8-gke.10
gke-standard-cluster-1-default-pool-c43dab38-qc0z   Ready    <none>   44m   v1.12.8-gke.10
kubectl get pods -owide
NAME      READY   STATUS    RESTARTS   AGE   IP         NODE                                                NOMINATED NODE
mongodb   1/1     Running   0          24m   10.4.2.4   gke-standard-cluster-1-default-pool-c43dab38-553f   <none>
kubectl drain gke-standard-cluster-1-default-pool-c43dab38-553f --ignore-daemonsets
kubectl get nodes
NAME                                                STATUS                     ROLES    AGE   VERSION
gke-standard-cluster-1-default-pool-c43dab38-4qdn   Ready                      <none>   51m   v1.12.8-gke.10
gke-standard-cluster-1-default-pool-c43dab38-553f   Ready,SchedulingDisabled   <none>   51m   v1.12.8-gke.10
gke-standard-cluster-1-default-pool-c43dab38-qc0z   Ready                      <none>   51m   v1.12.8-gke.10
#Create pod again
kubectl apply -f mongodb.yaml
pod/mongodb created
kubectl get pods -owide #notice it's running on different node now
NAME      READY   STATUS    RESTARTS   AGE    IP         NODE                                                NOMINATED NODE
mongodb   1/1     Running   0          104s   10.4.0.9   gke-standard-cluster-1-default-pool-c43dab38-4qdn   <none>
kubectl exec -it mongodb mongo #data we created earlier should still be there
> use mydb
switched to db mydb
> db.foo.find()
{ "_id" : ObjectId("5d3aa87f4f89408f62df4e8b"), "name" : "foo" }
>
Persistent volumes can be managed like any other K8s resource. Use the PersistentVolume manifest above to create one
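Assuming the PersistentVolume manifest was saved as mongodb-pv.yaml (the filename is an assumption), it is created like any other resource:

```bash
#Create the PV from the manifest (mongodb-pv.yaml is an assumed filename)
kubectl apply -f mongodb-pv.yaml
persistentvolume/mongodb-pv created
```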
kubectl get persistentvolume -owide
NAME         CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS      CLAIM   STORAGECLASS   REASON   AGE
mongodb-pv   512Mi      RWO,ROX        Retain           Available                                   61s
PersistentVolumeClaim
Persistent Volume Claims (PVCs) are a way for an application developer to request storage for the application without having to know where the underlying storage is. The claim is then bound to the Persistent Volume (PV), and it will not be released until the PVC is deleted.
Pod using a PVC:
apiVersion: v1
kind: Pod
metadata:
  name: mongodb 
spec:
  containers:
  - image: mongo
    name: mongodb
    volumeMounts:
    - name: mongodb-data
      mountPath: /data/db
    ports:
    - containerPort: 27017
      protocol: TCP
  volumes:
  - name: mongodb-data
    persistentVolumeClaim:
      claimName: mongodb-pvc
PersistentVolumeClaim manifest:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mongodb-pvc 
spec:
  resources:
    requests:
      storage: 1Gi #cannot request more than the PV provides
  accessModes:
  - ReadWriteOnce
  storageClassName: "" #empty string disables dynamic provisioning; bind to a pre-created PV
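Assuming the claim manifest is saved as mongodb-pvc.yaml (filename assumed), creating it binds the claim to the matching PV; illustrative output:

```bash
kubectl apply -f mongodb-pvc.yaml
persistentvolumeclaim/mongodb-pvc created
kubectl get pvc mongodb-pvc
NAME          STATUS   VOLUME       CAPACITY   ACCESS MODES   STORAGECLASS   AGE
mongodb-pvc   Bound    mongodb-pv   2Gi        RWO,ROX                       5s
```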
- Storage Object in Use Protection - prevents deletion of a PV or PVC while it is in use by a Pod
 
kubectl describe pv mongodb-pv
...
Finalizers: [kubernetes.io/pv-protection]
...
Source:
    Type:    ...
    PDName:  mongodb
    FSType:  ext4
...
kubectl describe pvc mongodb-pvc
...
Finalizers: [kubernetes.io/pvc-protection]
...
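A hedged sketch of the protection in action: deleting a PVC that a running pod still uses leaves the claim stuck in Terminating until the pod goes away (if you try this, re-create the claim afterwards to continue the walkthrough):

```bash
kubectl delete pvc mongodb-pvc #blocks while the pod still mounts the claim
kubectl get pvc mongodb-pvc    #run in another shell
NAME          STATUS        VOLUME       ...
mongodb-pvc   Terminating   mongodb-pv   ...
```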
Storage Class
It's a K8s object that allows administrators to manage (create) physical storage using provisioners.
- e.g. the Google GCE provisioner: kubernetes.io/gce-pd. It's an equivalent of the command: gcloud compute disks create --size=1GiB --zone=us-west1-a --type=pd-ssd mongodb-vol-ssd. The provisioner creates the disk in GCE with a prefixed-ID name, as opposed to the native gcloud command; see below.
 
gcloud compute disks list
NAME                                                             LOCATION        LOCATION_SCOPE  SIZE_GB  TYPE    STATUS
gke-cluster-1-b4800067-pvc-8c2655c5-b360-11e9-93cd-42010a84024e  europe-west1-b  zone            1        pd-ssd  READY
PersistentVolumeClaim using a StorageClass:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mongodb-pvc 
spec:
  storageClassName: fast
  resources:
    requests:
      storage: 100Mi
  accessModes:
    - ReadWriteOnce
StorageClass manifest:
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast
provisioner: kubernetes.io/gce-pd
parameters:
  type: pd-ssd
List drives
gcloud compute disks list
NAME                                                             LOCATION    LOCATION_SCOPE  SIZE_GB  TYPE         STATUS
gke-standard-cluster-1-default-pool-29207899-1s7d                us-west1-a  zone            100      pd-standard  READY
gke-standard-cluster-1-default-pool-29207899-7sxs                us-west1-a  zone            100      pd-standard  READY
gke-standard-cluster-1-default-pool-29207899-w4hh                us-west1-a  zone            100      pd-standard  READY
gke-standard-cluster-1-pvc-148a2097-b29b-11e9-b66b-42010a8a00c7  us-west1-a  zone            1        pd-ssd       READY  #created via StorageClass
mongodb-vol                                                      us-west1-a  zone            7        pd-standard  READY
mongodb-vol-ssd                                                  us-west1-a  zone            1        pd-ssd       READY  #gcloud created
Notice the default StorageClass named 'standard' has already been created in GKE
kubectl describe storageclasses.storage.k8s.io
Name:                  standard
IsDefaultClass:        Yes
Annotations:           storageclass.kubernetes.io/is-default-class=true
Provisioner:           kubernetes.io/gce-pd
Parameters:            type=pd-standard
AllowVolumeExpansion:  <unset>
MountOptions:          <none>
ReclaimPolicy:         Delete
VolumeBindingMode:     Immediate
Events:                <none>
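The is-default-class annotation is what marks a class as the default; it can be changed with kubectl annotate (a sketch; promoting 'fast' is just an example, not part of the walkthrough):

```bash
#Mark 'fast' as the default StorageClass
kubectl annotate sc fast storageclass.kubernetes.io/is-default-class=true
#Remove the annotation from the previous default (trailing '-' deletes an annotation)
kubectl annotate sc standard storageclass.kubernetes.io/is-default-class-
```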
List storage classes
kubectl get sc
NAME                 PROVISIONER            AGE
fast                 kubernetes.io/gce-pd   9m11s  #storage class we created
standard (default)   kubernetes.io/gce-pd   49m    #default storage class, used when a PVC leaves StorageClass blank
kubectl get pvc
NAME                        STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
mongodb-pvc                 Bound    mongodb-pv                                 2Gi        RWO,ROX                       40m
mongodb-pvc-storage-class   Bound    pvc-148a2097-b29b-11e9-b66b-42010a8a00c7   1Gi        RWO            fast           12m
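A claim that omits storageClassName entirely falls back to the default class ('standard' here); a minimal sketch, with an assumed claim name:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mongodb-pvc-default #hypothetical name
spec:
  resources:
    requests:
      storage: 1Gi
  accessModes:
  - ReadWriteOnce
  #no storageClassName -> the default StorageClass ('standard') provisions the volume
```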
Local-storage
There are many volume types, incl. gcePersistentDisk, awsElasticBlockStore (EBS), hostPath and emptyDir.
Pod with an emptyDir volume (useful for sharing a temporary directory between containers in a pod):
apiVersion: v1
kind: Pod
metadata:
  name: emptydir-pod
spec:
  containers:
  - image: busybox
    name: busybox
    command: ["/bin/sh", "-c", "while true; do sleep 3600; done"]
    volumeMounts:
    - mountPath: /tmp/storage
      name: vol
  volumes:
  - name: vol
    emptyDir: {}
hostPath PersistentVolume:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: hostpath-pv
spec:
  storageClassName: local-storage
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/mnt/data"
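To use the hostPath PV above, a claim would reference the same storageClassName; a minimal sketch with an assumed claim name:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: hostpath-pvc #assumed name
spec:
  storageClassName: local-storage #must match the PV above
  resources:
    requests:
      storage: 1Gi
  accessModes:
  - ReadWriteOnce
```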