Kubernetes/Monitoring
Monitor cluster resources
Metrics-server
To view cluster resource usage you need a metrics collector add-on. Heapster used to be the popular choice; it is now deprecated and has been replaced by metrics-server.
Install metrics-server
# General installation
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.3.6/components.yaml
kubectl get deployment metrics-server -n kube-system
NAME             READY   UP-TO-DATE   AVAILABLE   AGE
metrics-server   1/1     1            1           6m

# Alternative
git clone https://github.com/kubernetes-incubator/metrics-server.git
kubectl apply -f ~/metrics-server/deploy/1.8+/
EKS Installation [1]
wget https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.3.6/components.yaml -O metrics-server-0.3.6.yaml

# Edit the manifest and add the following command with arguments to the metrics-server
# Deployment (deploy/kubernetes/metrics-server-deployment.yaml in the repository):
command:
- /metrics-server
- --logtostderr
- --kubelet-insecure-tls=true
- --kubelet-preferred-address-types=InternalIP
- --v=2
# kubelet-insecure-tls: do not verify the CA of the serving certificates presented by the kubelets
# kubelet-preferred-address-types: which node address type to use when connecting to a kubelet:
#   Hostname, InternalDNS, InternalIP, ExternalDNS or ExternalIP; for EKS set it to InternalIP
# v=2: log verbosity level

kubectl apply -f metrics-server-0.3.6.yaml
stern metrics-server -n kube-system
Get metrics. You may need to wait 1-2 minutes for the first metrics scrape to complete [2]
# verify metrics server API
kubectl get --raw /apis/metrics.k8s.io/
{"kind":"APIGroup","apiVersion":"v1","name":"metrics.k8s.io","versions":[{"groupVersion":"metrics.k8s.io/v1beta1","version":"v1beta1"}],"preferredVersion":{"groupVersion":"metrics.k8s.io/v1beta1","version":"v1beta1"}}
kubectl get apiservices | grep metrics
v1beta1.metrics.k8s.io   kube-system/metrics-server   True   30m

kubectl top node                      # CPU, memory utilization of the nodes in your cluster
kubectl top pods                      # CPU, memory utilization of the pods in your cluster
kubectl top pods -A                   # CPU, memory of pods in all namespaces
kubectl top pods -A --sort-by=memory
kubectl top pod -l run=<label>        # CPU, memory of pods matching a label selector
kubectl top pod <pod-name>            # CPU, memory of a specific pod
kubectl top pods --containers         # CPU, memory of the containers inside the pods
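The output of kubectl top can be post-processed with standard tools. As a rough sketch, the helper below sums pod memory per namespace. The name top_mem_by_ns is hypothetical, and the sketch assumes the four-column layout of kubectl top pods -A --no-headers (NAMESPACE NAME CPU MEMORY) with memory reported as <n>Mi.

```shell
# Sum the MEMORY column per namespace.
# Reads `kubectl top pods -A --no-headers` output on stdin.
# Assumption: column 4 is memory in the form <n>Mi.
top_mem_by_ns() {
  awk '{ sub(/Mi$/, "", $4); mem[$1] += $4 }
       END { for (ns in mem) printf "%s %dMi\n", ns, mem[ns] }' | sort
}
```

Usage: kubectl top pods -A --no-headers | top_mem_by_ns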
[1] On EKS, an out-of-the-box installation fails with "unable to fully scrape metrics" errors:
metrics-server-aaaaaaaaaa-h64c5 metrics-server E0714 15:59:42.204640 1 manager.go:111] unable to fully collect metrics: [unable to fully scrape metrics from source kubelet_summary:ip-10-35-70-169.eu-west-1.compute.internal: unable to fetch metrics from Kubelet ip-10-35-70-169.eu-west-1.compute.internal (ip-10-35-70-169.dev.acme.com): Get https://ip-10-35-70-169.dev.acme.com:10250/stats/summary?only_cpu_and_memory=true: dial tcp: lookup ip-10-35-70-169.dev.acme.com on 172.20.0.10:53: no such host, unable to fully scrape metrics from source kubelet_summary:
[2] A working metrics-server scrapes metrics every minute by default:
metrics-server-bbbbbbbbbb-s5pn5 metrics-server I0714 16:20:07.784364 1 manager.go:95] Scraping metrics from 4 sources
metrics-server-bbbbbbbbbb-s5pn5 metrics-server I0714 16:20:07.787577 1 manager.go:120] Querying source: kubelet_summary:ip-10-00-64-185.eu-west-1.compute.internal
metrics-server-bbbbbbbbbb-s5pn5 metrics-server I0714 16:20:07.812605 1 manager.go:120] Querying source: kubelet_summary:ip-10-00-70-169.eu-west-1.compute.internal
metrics-server-bbbbbbbbbb-s5pn5 metrics-server I0714 16:20:07.814077 1 manager.go:120] Querying source: kubelet_summary:ip-10-00-68-179.eu-west-1.compute.internal
metrics-server-bbbbbbbbbb-s5pn5 metrics-server I0714 16:20:07.814843 1 manager.go:120] Querying source: kubelet_summary:ip-10-00-69-23.eu-west-1.compute.internal
metrics-server-bbbbbbbbbb-s5pn5 metrics-server I0714 16:20:07.820754 1 manager.go:148] ScrapeMetrics: time: 36.362483ms, nodes: 4, pods: 57
cAdvisor deprecated in v1.11
Every node in a Kubernetes cluster runs a kubelet process, and cAdvisor is embedded in each kubelet. cAdvisor continuously gathers metrics about the containers running on its node, so it is always available.
minikube start --extra-config=kubelet.CAdvisorPort=4194
kubectl proxy &                     # open a proxy to the Kubernetes API port
open http://$(minikube ip):4194     # cAdvisor also serves the metrics in a helpful HTML format

# Each node provides statistics gathered by cAdvisor. Access the node stats
curl localhost:8001/api/v1/nodes/$(kubectl get nodes -o=jsonpath="{.items[0].metadata.name}")/proxy/stats/

# The Kubernetes API also exposes the cAdvisor metrics at /metrics
curl localhost:8001/metrics
Liveness and Readiness probes
Check this Visual explanation
- readinessProbe - checks whether a pod is ready to receive client requests. When the probe passes, the pod is added to the endpoint. When the probe fails, the pod is not restarted; it is removed from the endpoint instead.
- livenessProbe - when the probe fails, the container is restarted.
Get the service endpoints. Only healthy and ready pods are added to the endpoint:
kubectl get endpoints
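Liveness and readiness probes are defined per container, at the same level as the image field: under .spec.containers[] in a Pod manifest and under .spec.template.spec.containers[] in a Deployment manifest.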
<syntaxhighlightjs lang=yaml>
apiVersion: v1
kind: Pod
metadata:
  name: liveness-readiness-pod
spec:
  containers:
  - image: nginx
    name: main
    livenessProbe:
      httpGet:               # exec: or tcpSocket:
        path: /healthz       # not all containers have this endpoint
        port: 8081
    readinessProbe:
      httpGet:
        path: /
        port: 80
      initialDelaySeconds: 5 # tell kubelet to wait 5 seconds after the container starts before performing the first probe (default 0)
      periodSeconds: 5       # tell kubelet to run the probe every 5s (default 10)
</syntaxhighlightjs>
Logs
Container logs
Containerized applications usually write their logs to STDOUT and STDERR instead of to files. Docker then redirects those streams to files, which you can read with the kubectl logs command. The files are stored on the nodes under /var/log/ and contain everything the containers send to STDOUT and STDERR.
- /var/log/containers/ contains the container logs; these are symlinks to ../pods/
- /var/log/pods/ contains a directory per pod in the form <namespace>_<pod-name>_<pod-uid>/<container-name>/0.log (log file)
- 0.log is a symlink to /var/lib/docker/containers/<container-id>/<container-id>-json.log
$ ls -l /var/log/containers
total 56
lrwxrwxrwx 1 root root 101 Oct  7 06:51 coredns-5644d7b6d9-hztth_kube-system_coredns-9de9395495186177f5112d795ca950dd0227e6f025f40c83ddf2a99c56802939.log -> /var/log/pods/kube-system_coredns-5644d7b6d9-hztth_5da159b3-64e7-48e4-b9f8-003f9623481d/coredns/0.log
...
If your container writes to multiple log files, it is difficult to tell them apart with the kubectl logs command. In that case you can introduce sidecar containers that each tail one log file, and access them like this:
kubectl logs <pod> container-log-1
kubectl logs <pod> container-log-2
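A minimal sketch of such a pod, assuming the application writes two log files into a shared emptyDir volume. The pod name, the /var/log/app paths, the busybox image and the helper name sidecar_pod_manifest are illustrative assumptions; only the sidecar names container-log-1 and container-log-2 come from the commands above.

```shell
# Print a Pod manifest with one main container and two log-tailing sidecars.
# All names and paths except container-log-1/2 are hypothetical.
sidecar_pod_manifest() {
  cat <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: app-with-log-sidecars
spec:
  containers:
  - name: main
    image: busybox
    command:
    - sh
    - -c
    - 'while true; do echo "one $(date)" >> /var/log/app/1.log; echo "two $(date)" >> /var/log/app/2.log; sleep 5; done'
    volumeMounts:
    - name: logs
      mountPath: /var/log/app
  - name: container-log-1
    image: busybox
    command: ["sh", "-c", "tail -n+1 -F /var/log/app/1.log"]
    volumeMounts:
    - name: logs
      mountPath: /var/log/app
  - name: container-log-2
    image: busybox
    command: ["sh", "-c", "tail -n+1 -F /var/log/app/2.log"]
    volumeMounts:
    - name: logs
      mountPath: /var/log/app
  volumes:
  - name: logs
    emptyDir: {}
EOF
}
```

Usage: sidecar_pod_manifest | kubectl apply -f - and then kubectl logs app-with-log-sidecars container-log-1. Each sidecar streams one file to its own STDOUT, so every log file gets a separate kubectl logs target.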
The kubelet runs as a host process rather than in a container, so its logs go to the system location /var/log or to the systemd journal:
journalctl -u kubelet.service
Retrieve logs
kubectl logs <pod> <container>                        # container name is optional for single-container pods
kubectl logs <pod> <container> --previous             # or -p; previous instance, in case the container has crashed
kubectl logs <pod> --all-containers=true
kubectl logs --since=10m <pod>
kubectl logs deployment/<deployment> -c <container>   # view the logs from a container within a pod within a deployment
kubectl logs --tail=20 haproxy                        # show the last 20 lines
kubectl logs -l app=haproxy                           # logs from containers matching a label
Kubernetes worker nodes docker log configuration - size
The Docker runtime log configuration is set up on each node in /etc/docker/daemon.json
{
  "bridge": "none",
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "10m",
    "max-file": "10"
  },
  "live-restore": true,
  "max-concurrent-downloads": 10
}
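With the json-file driver, max-size caps each log file and max-file caps how many rotated files are kept, so the two values together bound the log disk usage per container. A small sketch of that arithmetic follows; log_cap_mb is a hypothetical helper that only understands sizes with an m suffix.

```shell
# Rough upper bound on json-file log disk usage for one container, in MB.
# Arguments: max-size (e.g. "10m") and max-file (e.g. 10), as in daemon.json.
log_cap_mb() {
  size_mb=${1%m}              # strip the "m" suffix: "10m" -> 10
  echo $(( size_mb * $2 ))
}

log_cap_mb 10m 10             # prints 100
```

With the settings above, each container can therefore keep roughly 100 MB of logs on the node.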
Kubernetes logs: where do they come from
Logs from the STDOUT and STDERR of containers in the pod are captured and stored in files under /var/log/containers. This is what is presented when kubectl logs is run. To understand why output from commands run by kubectl exec is not shown by kubectl logs, let's look at how it all works with an example:
# Launch a pod running ubuntu that sleeps forever
kubectl run test --image=ubuntu --restart=Never -- sleep infinity
# Exec into it
kubectl exec -it test -- bash
Seen from inside the container, it is the STDOUT and STDERR of PID 1 that are being captured. When you kubectl exec into the container, a new process is created that lives alongside PID 1:
root@test:/# ps -auxf
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root         7  0.0  0.0  18504  3400 pts/0    Ss   10:02   0:00 bash
root        19  0.0  0.0  34396  2908 pts/0    R+   10:05   0:00  \_ ps -auxf
root         1  0.0  0.0   4528   836 ?        Ss   10:01   0:00 sleep infinity
Redirecting to STDOUT does not work because /dev/stdout is a symlink to the file descriptor of the process accessing it (/proc/self/fd/1 rather than /proc/1/fd/1).
root@test:/# ls -lrt /dev/stdout
lrwxrwxrwx 1 root root 15 Nov  5 10:01 /dev/stdout -> /proc/self/fd/1
To see the output of commands run with kubectl exec in the logs, it needs to be redirected to the streams that are captured by the kubelet (STDOUT and STDERR of PID 1). This can be done by redirecting the output to /proc/1/fd/1.
root@test:/# echo "send-to-kubernetes-container-log" > /proc/1/fd/1
Exiting the interactive shell and checking the logs using kubectl logs should now show the output:
$> kubectl logs test
send-to-kubernetes-container-log
Termination message
Kubernetes lets a container write a custom message to a custom file on termination. The message can then be viewed directly with kubectl describe, under Last State: Terminated, Message: <custom message>
apiVersion: v1
kind: Pod
metadata:
  name: pod2
spec:
  containers:
  - image: busybox
    name: main
    command:
    - sh
    - -c
    - 'echo "I say that this container has been terminated at $(date)" > /var/termination-reason ; exit 1'
    terminationMessagePath: /var/termination-reason
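Besides kubectl describe, the message can be read straight from the pod status. A minimal sketch, assuming the pod has a single container that has already restarted at least once (so that lastState is populated); termination_message is a hypothetical helper name.

```shell
# Print the termination message of the first container's previous run.
# Needs kubectl and access to the cluster; $1 is the pod name.
termination_message() {
  kubectl get pod "$1" \
    -o jsonpath='{.status.containerStatuses[0].lastState.terminated.message}'
}
```

Usage: termination_message pod2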
Troubleshooting
# get a YAML without status information (an almost clean manifest)
# note: --export was deprecated in kubectl v1.14 and removed in v1.18
kubectl -n web get pod <failing-pod> -o yaml --export
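On kubectl versions that no longer accept --export, a rough substitute is to drop the top-level status: block from the YAML output. The sketch below does this with awk. strip_status is a hypothetical helper name, it assumes a single object (not a List), and server-populated metadata fields such as uid and resourceVersion are left in place.

```shell
# Remove the top-level "status:" block from a YAML manifest read on stdin.
# Indented lines after "status:" are skipped until the next top-level key.
strip_status() {
  awk '/^status:/ { skip = 1; next }
       /^[^ ]/    { skip = 0 }
       !skip'
}
```

Usage: kubectl -n web get pod <failing-pod> -o yaml | strip_status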