Kubernetes/Scheduling


Default scheduler rules

  1. Check whether the node has adequate hardware resources.
  2. Check whether the node is running out of resources, i.e. whether it reports memory or disk pressure conditions.
  3. Check whether the pod is assigned to a specific node by name.
  4. Check whether the node has a label matching the node selector in the pod spec.
  5. Check whether the pod requests binding to a specific host port and, if so, whether the node has that port available.
  6. Check whether the pod requests a certain type of volume to be mounted and whether other pods are using the same volume.
  7. Check whether the pod tolerates the taints of the node, e.g. master nodes are tainted with "NoSchedule".
  8. Check the pod's pod and node affinity rules and whether scheduling the pod would break them.
  9. If more than one node can run the pod, the scheduler prioritises the nodes and chooses the best one. If several nodes have the same priority, it chooses among them in round-robin fashion.
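
The rules above amount to a filtering step followed by a scoring step. The sketch below is a simplified illustration only, not the kube-scheduler's actual code: the `Node` and `Pod` classes, the `feasible` and `schedule` functions and the example nodes are all hypothetical, taints are reduced to plain strings, and ties are broken by list order rather than round-robin.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    free_cpu_millicores: int
    free_memory_mib: int
    labels: dict = field(default_factory=dict)
    taints: set = field(default_factory=set)
    under_pressure: bool = False


@dataclass
class Pod:
    name: str
    cpu_millicores: int = 0
    memory_mib: int = 0
    node_selector: dict = field(default_factory=dict)
    tolerations: set = field(default_factory=set)


def feasible(pod: Pod, node: Node) -> bool:
    """Filtering: every check must pass for the node to stay a candidate."""
    has_resources = (pod.cpu_millicores <= node.free_cpu_millicores
                     and pod.memory_mib <= node.free_memory_mib)
    selector_matches = all(node.labels.get(k) == v
                           for k, v in pod.node_selector.items())
    taints_tolerated = node.taints <= pod.tolerations
    return (has_resources and not node.under_pressure
            and selector_matches and taints_tolerated)


def schedule(pod, nodes, score):
    """Scoring: pick the highest-scoring feasible node, or None if none fit."""
    candidates = [n for n in nodes if feasible(pod, n)]
    if not candidates:
        return None
    return max(candidates, key=score)


# Hypothetical three-node cluster, for illustration only.
nodes = [
    Node("master", 2000, 4000, taints={"NoSchedule"}),
    Node("worker1", 500, 4000, labels={"share-type": "dedicated"}),
    Node("worker2", 1100, 4000),
]
pod = Pod("demo", cpu_millicores=800, memory_mib=20)
chosen = schedule(pod, nodes, score=lambda n: n.free_cpu_millicores)
print(chosen.name)  # worker2
```

Here the master is filtered out by its taint and worker1 by insufficient CPU, so only worker2 remains to be scored.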

Label nodes

kubectl label node worker1.acme.com share-type=dedicated


Deployment YAML that includes node affinity rules:

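apiVersion: apps/v1         # extensions/v1beta1 no longer serves Deployments on current clusters
kind: Deployment
metadata:
  name: pref
spec:
  replicas: 5
  selector:                 # required by apps/v1; must match the template labels
    matchLabels:
      app: pref
  template: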
    metadata:
      labels:
        app: pref
    spec:
      affinity:
        nodeAffinity:
          preferredDuringSchedulingIgnoredDuringExecution: # soft preference applied when scheduling; pods already running on a node are not affected
          - weight: 80
            preference:
              matchExpressions:
              - key: availability-zone
                operator: In
                values:
                - zone1
          - weight: 20              # a quarter of the weight of the availability-zone preference
            preference:
              matchExpressions:
              - key: share-type     #label key
                operator: In
                values:
                - dedicated         #label value
      containers:
      - args:
        - sleep
        - "999"
        image: busybox:1.28.4
        name: main
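
To see how the two weights interact, the sketch below sums, for each node, the weights of the preferences whose expressions match that node's labels. It is a simplified model of preferred node affinity scoring: it handles only the `In` operator, the node names and their labels are hypothetical, and the real scheduler combines this figure with its other scoring rules.

```python
def matches(expressions, labels):
    """True when every `In` expression matches the node's labels."""
    return all(labels.get(e["key"]) in e["values"] for e in expressions)


def affinity_score(preferences, labels):
    """Sum the weights of all preferences whose expressions match."""
    return sum(p["weight"] for p in preferences
               if matches(p["matchExpressions"], labels))


# The two preferences from the deployment above.
preferences = [
    {"weight": 80, "matchExpressions": [
        {"key": "availability-zone", "values": ["zone1"]}]},
    {"weight": 20, "matchExpressions": [
        {"key": "share-type", "values": ["dedicated"]}]},
]

# Hypothetical node labels, for illustration only.
node_labels = {
    "node-a": {"availability-zone": "zone1", "share-type": "dedicated"},
    "node-b": {"availability-zone": "zone1", "share-type": "shared"},
    "node-c": {"availability-zone": "zone2", "share-type": "dedicated"},
    "node-d": {"availability-zone": "zone2", "share-type": "shared"},
}

scores = {name: affinity_score(preferences, labels)
          for name, labels in node_labels.items()}
print(scores)
# {'node-a': 100, 'node-b': 80, 'node-c': 20, 'node-d': 0}
```

Because these are preferences rather than requirements, a node scoring 0 can still receive pods when the better-scoring nodes are full.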

Capacity and resources

Check a node's capacity

kubectl describe nodes worker-2.acme.com | grep -A 20 Capacity:
Capacity:
 cpu:                2
 ephemeral-storage:  20263528Ki
 hugepages-2Mi:      0
 memory:             4044936Ki
 pods:               110
Allocatable:
 cpu:                2
 ephemeral-storage:  18674867374
 hugepages-2Mi:      0
 memory:             3942536Ki
 pods:               110
System Info:
 Machine ID:                 ******c49b4bed31684a******
 System UUID:                ******-D110-CB50-EAA3-*******
 Boot ID:                    ****8-be21-45ca-b86c-311a479******
 Kernel Version:             4.4.0-1087-aws
 OS Image:                   Ubuntu 16.04.6 LTS
 Operating System:           linux
 Architecture:               amd64
 Container Runtime Version:  docker://18.6.1
...
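
The gap between Capacity and Allocatable is what the node holds back from pods. As a rough check, the sketch below converts the memory figures from the output above to bytes and computes the difference. The `to_bytes` helper is hypothetical and handles only the binary suffixes shown here, not the full Kubernetes quantity syntax.

```python
BINARY_SUFFIXES = {"Ki": 1024, "Mi": 1024**2, "Gi": 1024**3}


def to_bytes(quantity: str) -> int:
    """Convert a quantity such as '4044936Ki' or '18674867374' to bytes."""
    for suffix, factor in BINARY_SUFFIXES.items():
        if quantity.endswith(suffix):
            return int(quantity[:-len(suffix)]) * factor
    return int(quantity)  # no suffix: plain bytes


capacity = to_bytes("4044936Ki")      # memory under Capacity
allocatable = to_bytes("3942536Ki")   # memory under Allocatable
reserved_mib = (capacity - allocatable) / 1024**2
print(reserved_mib)  # 100.0
```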


YAML to schedule a pod onto a specific node and request specific resources

apiVersion: v1
kind: Pod
metadata:
  name: resource-pod1
spec:
  nodeSelector:
    kubernetes.io/hostname: "worker-2.acme.com"
  containers:
  - image: busybox
    command: ["dd", "if=/dev/zero", "of=/dev/null"]
    name: busybox-dd
    resources:
      requests:
        cpu: 800m     # millicores; use 2000m for the second pod to exceed the node's free CPU
        memory: 20Mi  # mebibytes (MiB)


Deploy the pod above, then describe the node to see the allocated resources

kubectl describe nodes worker-2.acme.com 
Non-terminated Pods:         (6 in total)
  Namespace                  Name                                   CPU Requests  CPU Limits  Memory Requests  Memory Limits  AGE
  ---------                  ----                                   ------------  ----------  ---------------  -------------  ---
  default                    busybox                                0 (0%)        0 (0%)      0 (0%)           0 (0%)         24h
  default                    nginx-loadbalancer-86bb844fb7-bl5fs    0 (0%)        0 (0%)      0 (0%)           0 (0%)         2d
  default                    resource-pod1                          800m (40%)    0 (0%)      20Mi (0%)        0 (0%)         6m7s
  kube-system                kube-flannel-ds-amd64-97hvr            100m (5%)     100m (5%)   50Mi (1%)        50Mi (1%)      14d
  kube-system                kube-proxy-fxl6f                       0 (0%)        0 (0%)      0 (0%)           0 (0%)         14d
  rbac1                      test-f57db4bfd-ghshj                   0 (0%)        0 (0%)      0 (0%)           0 (0%)         12d
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource           Requests    Limits
  --------           --------    ------
  cpu                900m (45%)  100m (5%)
  memory             70Mi (1%)   50Mi (1%)
  ephemeral-storage  0 (0%)      0 (0%)


Deploying another pod that requests 2000m of CPU then fails to schedule, because the node no longer has that much unrequested CPU. Describing the pod shows the scheduling error

kubectl describe pod pod2-2000mi-cpu
Events:
  Type     Reason            Age                   From               Message
  ----     ------            ----                  ----               -------
  Warning  FailedScheduling  44s (x10 over 4m57s)  default-scheduler  0/3 nodes are available: 2 node(s) didn't match node selector, 3 Insufficient cpu.
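
The "Insufficient cpu" message follows from the request totals shown for worker-2.acme.com. The sketch below redoes that arithmetic with the figures from the output above; the `millicores` helper is hypothetical and only understands plain core counts and the `m` suffix.

```python
def millicores(cpu: str) -> int:
    """Convert '800m' or '2' to millicores."""
    return int(cpu[:-1]) if cpu.endswith("m") else int(float(cpu) * 1000)


allocatable = millicores("2")  # cpu under the node's Allocatable section
requested = [millicores(c) for c in ("800m", "100m")]  # resource-pod1, kube-flannel
used = sum(requested)
free = allocatable - used
print(used, f"{used / allocatable:.0%}", free)  # 900 45% 1100

new_request = millicores("2000m")
print(new_request <= free)  # False
```

The scheduler compares requests, not actual usage, so the 2000m pod is rejected even if the node is mostly idle.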


Limits YAML

apiVersion: v1
kind: Pod
metadata:
  name: pod-limit-resources
spec:
  containers:
  - image: busybox
    command: ["dd", "if=/dev/zero", "of=/dev/null"]
    name: main
    resources:
      limits:
        cpu: 2        # requests default to the limits when not specified
        memory: 40Mi
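
The defaulting rule noted in the comment can be modelled in a few lines. This is an illustrative sketch only: `effective_requests` is a hypothetical helper that mimics the behaviour, not part of any Kubernetes client library.

```python
def effective_requests(resources: dict) -> dict:
    """Return requests, falling back to the limit for any resource without one."""
    limits = resources.get("limits", {})
    requests = dict(resources.get("requests", {}))
    for name, value in limits.items():
        requests.setdefault(name, value)
    return requests


print(effective_requests({"limits": {"cpu": "2", "memory": "40Mi"}}))
# {'cpu': '2', 'memory': '40Mi'}
```

A pod with only a limit of 2 CPUs therefore also requests 2 CPUs, so it needs a node with that much unrequested CPU before it can be scheduled.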

Resources