Nahuel Hernandez

Another personal blog about IT, Automation, Cloud, DevOps and Stuff.

CKA Kubernetes Administrator Certification

The Certified Kubernetes Administrator (CKA) is a challenging, performance-based exam that requires solving multiple issues from a command line. I studied for and passed the three Kubernetes certifications (CKA/CKAD/CKS), and I want to share valuable information to help you prepare for and pass this exam.

Certified Kubernetes Administrator

The Certified Kubernetes Administrator (CKA) exam is a hands-on test consisting of a set of performance-based items (17 problems) to be solved from the command line, and it is expected to take approximately two (2) hours to complete.

The exam is challenging. However, when you purchase the CKA, two Killer.sh simulator sessions are already included. The Killer.sh simulator is harder than the actual exam, so after completing it fully, the real exam did not feel so difficult. In addition, the Linux Foundation gives you a free retake if you fail the first attempt.

My last piece of advice: the CKA and CKAD overlap on about 60% of the topics, so preparing for both simultaneously is an excellent plan.


Exam Objectives:

Domain Weight
Cluster Architecture, Installation & Configuration 25%
Workloads & Scheduling 15%
Service & Networking 20%
Storage 10%
Troubleshooting 30%


My notes come from the official Kubernetes documentation and partly from the KodeKloud course.

Core Concepts

Etcd

Consistent and highly-available key value store used as Kubernetes’ backing store for all cluster data.

ETCD Basics: the etcd service listens on port 2379 by default. If we install etcd standalone and want to save a key-value pair, it is very simple:

./etcdctl set key1 value1

etcdctl is the command-line client for etcd.

If we want to read the data back, it is just as easy:

./etcdctl get key1
value1
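Note: set is etcdctl API version 2 syntax. With API version 3 (the default in recent etcd releases), the equivalent commands are put and get:

ETCDCTL_API=3 ./etcdctl put key1 value1
ETCDCTL_API=3 ./etcdctl get key1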

ETCD on Kubernetes

All the information we see when we run kubectl get comes from the etcd server (Nodes, Pods, Configs, Secrets, Accounts, Roles, Bindings, and others).

  • Two methods to deploy on K8S

    • Manual setup: download the binaries and install etcd as a service

    • Kubeadm setup: deployed as a pod in kube-system (etcd-master). Get all keys:

      kubectl exec etcd-master -n kube-system -- etcdctl get / --prefix --keys-only
      

Kube API Server

The API server is a component of the Kubernetes control plane that exposes the Kubernetes API. The API server is the front end for the Kubernetes control plane. kube-apiserver is designed to scale horizontally—that is, it scales by deploying more instances. You can run several instances of kube-apiserver and balance traffic between those instances.

When we run a kubectl command, that command is in fact reaching the kube-apiserver; once the kube-apiserver validates the request, it reaches etcd for the information. We can do the same directly against the API with a POST request.

curl -X POST /api/v1/namespaces/default/pods... 

The kube-apiserver is responsible for:

  1. Authenticating the user
  2. Validating the request
  3. Retrieving data
  4. Updating ETCD
  5. Communicating with the scheduler
  6. Communicating with the kubelet

Note: kube-api server is the only component that interacts directly with the ETCD Datastore

Kube Controller Manager

Control Plane component that runs controller processes.

Logically, each controller is a separate process, but to reduce complexity, they are all compiled into a single binary and run in a single process.

Some types of these controllers are:

  • Node controller: Responsible for noticing and responding when nodes go down.
  • Replication controller: Monitors the status of replica sets and ensures that the desired number of pods is available at all times within the set. If a pod dies, it creates another one.
  • Job controller: Watches for Job objects that represent one-off tasks, then creates Pods to run those tasks to completion.
  • Endpoints controller: Populates the Endpoints object (that is, joins Services & Pods).
  • Service Account & Token controllers: Create default accounts and API access tokens for new namespaces.

Note: A controller is a process that continuously monitors the state of various components within the system and works towards bringing the whole system to the desired functioning state
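In a kubeadm cluster the controller manager runs as a static pod in kube-system; a quick way to inspect it and its options (assuming a default kubeadm setup):

kubectl get pods -n kube-system | grep kube-controller-manager
cat /etc/kubernetes/manifests/kube-controller-manager.yaml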

Kube Scheduler

Control plane component that watches for newly created Pods with no assigned node, and selects a node for them to run on. (Scheduler is only responsible for deciding, Kubelet creates the pod on the nodes) Factors taken into account for scheduling decisions include: individual and collective resource requirements, hardware/software/policy constraints, affinity and anti-affinity specifications, data locality, inter-workload interference, and deadlines.

Kubelet

An agent that runs on each node in the cluster. It makes sure that containers are running in a Pod.

The kubelet takes a set of PodSpecs that are provided through various mechanisms and ensures that the containers described in those PodSpecs are running and healthy. The kubelet doesn’t manage containers which were not created by Kubernetes.

Note: Kubeadm does not deploy the kubelet; we need to do that manually on our worker nodes: download the binary, extract it, and install it.

Kube-proxy

kube-proxy is a network proxy that runs on each node in your cluster, implementing part of the Kubernetes Service concept.

kube-proxy maintains network rules on nodes. These network rules allow network communication to your Pods from network sessions inside or outside of your cluster.

kube-proxy uses the operating system packet filtering layer if there is one and it’s available. Otherwise, kube-proxy forwards the traffic itself.

Note: Its job is to look for new services and every time a new service is created it creates the appropriate rules on each node to forward traffic to those services to the backend pods, one way it does this is using IPTABLES rules. In this case it creates an IP tables rule on each node in the cluster to forward traffic heading to the IP of the service to the IP of the actual pod

Note: kube-proxy is deployed as a DaemonSet, so one kube-proxy pod is always deployed on each node in the cluster.

kubectl get daemonset -n kube-system

Scheduling

Here we take a closer look at the various options available for customizing and configuring the way the scheduler behaves through different topics.

Manual Scheduling

If we don't have a kube-scheduler, we can schedule the pods ourselves.

In a pod definition we can set a nodeName field (by default it isn't set). Kubernetes, through the kube-scheduler, checks which pods don't have this property set, chooses a node to schedule them on, and creates a binding object.

If we don't have a scheduler, the pods remain in the Pending state; in this case we can manually assign pods to a node ourselves. For example:

apiVersion: v1
kind: Pod
...
spec:
  nodeName: node02
  containers:
  - name: nginx
    image: nginx

Labels and Selector

Labels and selectors are a standard method to group things together.

Labels

Labels are key/value pairs that are attached to objects, such as pods. Labels are intended to be used to specify identifying attributes of objects that are meaningful and relevant to users.

Labels allow for efficient queries and watches and are ideal for use in UIs and CLIs. Non-identifying information should be recorded using annotations.

pod-definitions.yaml

apiVersion: v1
kind: Pod
metadata:
  name: simple-webapp
  labels:
    app: App1
    function: Front-end
spec:
  containers:
  - name: simple-webapp
    image: simple-webapp
    ports:
    - containerPort: 8080

Label selectors

Unlike names and UIDs, labels do not provide uniqueness. In general, we expect many objects to carry the same label(s).

Via a label selector, the client/user can identify a set of objects. The label selector is the core grouping primitive in Kubernetes.

We can select pods with specified labels

kubectl get pods --selector app=App1

For example, when we use ReplicaSet, we use Labels and Selectors

replicaset-definition.yaml

apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: simple-webapp
  labels:
    app: App1
    function: Front-end
spec:
  replicas: 3
  selector:
    matchLabels:
      app: App1
  template:
    metadata:
      labels:
        app: App1
        function: Front-end
    spec:
      containers:
      - name: simple-webapp
        image: simple-webapp

Note: The labels in the template section are for the pods; the labels at the top are for the ReplicaSet itself.

Exercises:

  • Select pod with multiple Labels
  • Select ALL resource with a Label

Example of filtering with more than one label:

k get all -l env=prod,bu=finance,tier=frontend

Taints and tolerations

Taints allow a node to repel a set of pods.

Tolerations are applied to pods, and allow (but do not require) the pods to schedule onto nodes with matching taints.

Taints and tolerations work together to ensure that pods are not scheduled onto inappropriate nodes.

Note: Taints are set on nodes, and tolerations are set on pods

Taint nodes

The taint-effect defines what would happen to the pods if they do not tolerate the taint, there are three main effects:

  • NoSchedule: the pods will not be scheduled on the node

  • PreferNoSchedule: the system will try to avoid placing the pod on the node, but that is not guaranteed.

  • NoExecute: Pods will not be scheduled on the node and existing pods on the node, if any, will be evicted if they do not tolerate the taint

    kubectl taint nodes node-name key=value:taint-effect
    

    Example:

    kubectl taint nodes node1 app=blue:NoSchedule
    

Tolerations Pods

pod-definition.yml

...
spec:
  containers:
  - name: nginx-container
    image: nginx
  tolerations:
  - key: "app"
    operator: "Equal"
    value: "blue"
    effect: "NoSchedule"

With this toleration, the pod can be scheduled on node1 despite the taint.

Note: A taint is automatically set on the master node, and that prevents any pods from being scheduled on master nodes. We can see this taint with:

kubectl describe node kubemaster | grep Taint
Taints:			node-role.kubernetes.io/master:NoSchedule
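To remove a taint, append a trailing dash to the taint specification; for example, to remove the master taint shown above:

kubectl taint nodes kubemaster node-role.kubernetes.io/master:NoSchedule-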

Node Selectors

nodeSelector is the simplest recommended form of node selection constraint. nodeSelector is a field of PodSpec. It specifies a map of key-value pairs. pods/pod-nginx.yaml

apiVersion: v1
kind: Pod
metadata:
  name: nginx
  labels:
    env: test
spec:
  containers:
  - name: nginx
    image: nginx
    imagePullPolicy: IfNotPresent
  nodeSelector:
    disktype: ssd

The Pod will get scheduled on the node that you attached the label to. You can verify that it worked by running kubectl get pods -o wide and looking at the “NODE” that the Pod was assigned to

Label nodes

List the nodes in your cluster, along with their labels:

$ kubectl get nodes --show-labels
NAME      STATUS    ROLES    AGE     VERSION        LABELS
worker0   Ready     <none>   1d      v1.13.0        ...,kubernetes.io/hostname=worker0
worker1   Ready     <none>   1d      v1.13.0        ...,kubernetes.io/hostname=worker1

Choose one of your nodes, and add a label to it:

kubectl label nodes <your-node-name> disktype=ssd

Example:

kubectl label nodes node-1 size=large

Node Selector Limitations

We can't express complex constraints such as "Large OR Medium" or "NOT Small" with nodeSelector; for something like this we need Node Affinity.

Node Affinity

Node affinity is conceptually similar to nodeSelector – it allows you to constrain which nodes your pod is eligible to be scheduled on, based on labels on the node.

There are currently two types of node affinity, called requiredDuringSchedulingIgnoredDuringExecution and preferredDuringSchedulingIgnoredDuringExecution. You can think of them as “hard” and “soft” respectively, in the sense that the former specifies rules that must be met for a pod to be scheduled onto a node (just like nodeSelector but using a more expressive syntax), while the latter specifies preferences that the scheduler will try to enforce but will not guarantee. The “IgnoredDuringExecution” part of the names means that, similar to how nodeSelector works, if labels on a node change at runtime such that the affinity rules on a pod are no longer met, the pod will still continue to run on the node. In the future we plan to offer requiredDuringSchedulingRequiredDuringExecution which will be just like requiredDuringSchedulingIgnoredDuringExecution except that it will evict pods from nodes that cease to satisfy the pods’ node affinity requirements.

pods/pod-with-node-affinity.yaml

apiVersion: v1
kind: Pod
metadata:
  name: with-node-affinity
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: kubernetes.io/e2e-az-name
            operator: In
            values:
            - e2e-az1
            - e2e-az2
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 1
        preference:
          matchExpressions:
          - key: another-node-label-key
            operator: In
            values:
            - another-node-label-value
  containers:
  - name: with-node-affinity
    image: k8s.gcr.io/pause:2.0

Note: Remember that the Exists operator does not take a values list.
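For example, a matchExpressions entry using Exists (the label key is illustrative):

- matchExpressions:
  - key: color
    operator: Exists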

Set Node Affinity to the deployment to place the pods on node01 only

apiVersion: apps/v1
kind: Deployment
metadata:
  name: blue
spec:
  replicas: 6
  selector:
    matchLabels:
      run: nginx
  template:
    metadata:
      labels:
        run: nginx
    spec:
      containers:
      - image: nginx
        imagePullPolicy: Always
        name: nginx
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: color
                operator: In
                values:
                - blue

Resource Requirements

When you specify a Pod, you can optionally specify how much of each resource a Container needs. The most common resources to specify are CPU and memory (RAM); there are others.

When you specify the resource request for Containers in a Pod, the scheduler uses this information to decide which node to place the Pod on. When you specify a resource limit for a Container, the kubelet enforces those limits so that the running container is not allowed to use more of that resource than the limit you set. The kubelet also reserves at least the request amount of that system resource specifically for that container to use.

Requests and limits

If the node where a Pod is running has enough of a resource available, it’s possible (and allowed) for a container to use more resource than its request for that resource specifies. However, a container is not allowed to use more than its resource limit.

For example, if you set a memory request of 256 MiB for a container, and that container is in a Pod scheduled to a Node with 8GiB of memory and no other Pods, then the container can try to use more RAM.

If you set a memory limit of 4GiB for that Container, the kubelet (and container runtime) enforce the limit. The runtime prevents the container from using more than the configured resource limit. For example: when a process in the container tries to consume more than the allowed amount of memory, the system kernel terminates the process that attempted the allocation, with an out of memory (OOM) error.

Resource Requests

apiVersion: v1
kind: Pod
...
spec:
  containers:
  - name: simple-webapp-color
    image: simple-webapp-color
    resources:
      requests:
        memory: "1Gi"
        cpu: 1
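A sketch of the same resources section with limits added as well (the values are illustrative):

    resources:
      requests:
        memory: "1Gi"
        cpu: 1
      limits:
        memory: "2Gi"
        cpu: 2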

Note on default resources requirements and limits: For the POD to pick up defaults request values you must have first set those as default values for request and limit by creating a LimitRange in that namespace.

Mem:

apiVersion: v1
kind: LimitRange
metadata:
  name: mem-limit-range
spec:
  limits:
  - default:
      memory: 512Mi
    defaultRequest:
      memory: 256Mi
    type: Container

Cpu:

apiVersion: v1
kind: LimitRange
metadata:
  name: cpu-limit-range
spec:
  limits:
  - default:
      cpu: 1
    defaultRequest:
      cpu: 0.5
    type: Container

DaemonSets

A DaemonSet ensures that all (or some) Nodes run a copy of a Pod. As nodes are added to the cluster, Pods are added to them. As nodes are removed from the cluster, those Pods are garbage collected. Deleting a DaemonSet will clean up the Pods it created.

Some typical uses of a DaemonSet are:

  • running a cluster storage daemon on every node
  • running a logs collection daemon on every node
  • running a node monitoring daemon on every node

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentd-elasticsearch
  namespace: kube-system
  labels:
    k8s-app: fluentd-logging
spec:
  selector:
    matchLabels:
      name: fluentd-elasticsearch
  template:
    metadata:
      labels:
        name: fluentd-elasticsearch
    spec:
      containers:
      - name: fluentd-elasticsearch
        image: quay.io/fluentd_elasticsearch/fluentd:v2.5.2

Note: Since v1.12, DaemonSets use NodeAffinity and the default scheduler to guarantee that pods are scheduled on each node.

Note: An easy way to create a DaemonSet is to first generate a YAML file for a Deployment with the command kubectl create deployment elasticsearch --image=k8s.gcr.io/fluentd-elasticsearch:1.20 -n kube-system --dry-run=client -o yaml > fluentd.yaml. Next, remove the replicas and strategy fields from the YAML file using a text editor. Also, change the kind from Deployment to DaemonSet

Static PODs

We can deploy a pod on a node using only the kubelet, without any control-plane component. Only pods can be static (not deployments, services, etc.). To do that, we create the pod-definition.yaml in a specific directory; the kubelet watches this directory and creates the pod. Static pods need to be placed in /etc/kubernetes/manifests (the path is defined by the staticPodPath setting in the /var/lib/kubelet/config.yaml file).

Create static pod example:

kubectl run --restart=Never --image=busybox static-busybox --dry-run=client -o yaml --command -- sleep 1000 > /etc/kubernetes/manifests/static-busybox.yaml

Multiple Schedulers

There are different ways of manually influencing how a pod is scheduled on a node.

The first scheduler is the default-scheduler; however, we can deploy additional schedulers, for example a my-custom-scheduler with a different configuration.

With kubeadm, the kube-scheduler is deployed as a pod; with kubeconfig=/etc/kubernetes/scheduler.conf we define its configuration. The --leader-elect=true option is used when we run multiple scheduler instances across multiple master nodes; only one instance can be the active leader at a time.

We can also specify the scheduler for a pod:

apiVersion: v1 
kind: Pod 
metadata:
  name: nginx 
spec:
  schedulerName: my-scheduler
  containers:
  - image: nginx
    name: nginx

Logging and Monitoring

Monitor Cluster Components

This topic is about how we monitor resource consumption on Kubernetes and what we would like to monitor. For example, we want to know the number of nodes in the cluster and performance metrics such as CPU, memory, and so on. We can also get pod metrics such as pod memory, CPU, etc. K8s does not come with a built-in monitoring solution; some options are:

  • Metrics Server: The most basic monitoring solution. The Metrics Server retrieves metrics from each Kubernetes node and pod, aggregates them, and stores them in memory (it does not store them on disk, which is why we cannot see historical performance data)
  • Prometheus
  • Elastic Stack
  • Datadog
  • Dynatrace

The kubelet agent runs on each node and is responsible for receiving instructions from the Kubernetes API server and running pods on the node. The kubelet also contains a subcomponent known as cAdvisor (Container Advisor). cAdvisor is responsible for retrieving performance metrics from pods and exposing them through the kubelet API to make the metrics available to the Metrics Server.

kubectl top node shows the CPU and memory consumption of each node. It needs the Metrics Server installed.

Another useful command is kubectl top pods, which lets us view the CPU/memory of the pods.
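On recent kubectl versions, the output can also be sorted; for example:

kubectl top node --sort-by=cpu
kubectl top pod --sort-by=memory --all-namespaces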

Managing Application Logs

The simplest way to view application logs is:

kubectl logs -f POD_NAME CONTAINER_NAME

The CONTAINER_NAME argument is only necessary if the pod has more than one container.

Application Lifecycle Management

This section has a lot of topics overlapping with the CKAD.

Rolling Updates and Rollbacks

When a new rollout is triggered, a new deployment revision is created (for example, revision 2). This helps us keep track of the changes made to our deployment and enables us to roll back to a previous version of the deployment if necessary.

We can see the status of our rollout by running

kubectl rollout status deployment/myapp-deployment

We can also see the revisions and history of our deployment

kubectl rollout history deployment/myapp-deployment
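To roll back to the previous revision (the rollback mentioned above):

kubectl rollout undo deployment/myapp-deployment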

Note: A Deployment’s rollout is triggered if and only if the Deployment’s Pod template (that is, .spec.template) is changed, for example if the labels or container images of the template are updated. Other updates, such as scaling the Deployment, do not trigger a rollout.

Commands and Arguments in Kubernetes

ENTRYPOINT in Docker corresponds to command in Kubernetes; CMD in Docker corresponds to args in Kubernetes.

Docker

FROM ubuntu
ENTRYPOINT ["sleep"]
CMD ["5"]

Kubernetes

image: ubuntu-sleeper
command: ["sleep"]
args: ["10"]

Configure Environment Variables in Applications

In the K8s space, there are 4 ways environment variables can be set for Pods. These are namely:

  1. Using string literals
  2. From ConfigMaps
  3. From Secrets
  4. From Pod configuration

env is an array; each item has a name and a value property (key/value format).

Env value types. Plain value:

spec:
  containers:
  - name: nginx
    env:
    - name: SERVER
      value: db-prod

From configmap

env:
- name: SERVER
  valueFrom:
    configMapKeyRef:
      name: app-config
      key: APP_COLOR

From secret

env:
- name: SERVER
  valueFrom:
    secretKeyRef:
      name: app-secret
      key: DB_Host

Configure ConfigMaps in Applications

ConfigMaps can be mounted as data volumes. ConfigMaps can also be used by other parts of the system, without being directly exposed to the Pod. For example, ConfigMaps can hold data that other parts of the system should use for configuration.

  • Create a configmap using imperative way

    kubectl create configmap app-config --from-literal=APP_COLOR=blue
    

Where:

  • app-config: config-name
  • APP_COLOR=key
  • blue=value

Another way from file:

  kubectl create configmap app-config --from-file=appconfig.properties

  • Create a configmap using declarative way

    kubectl create -f configmap.yml
    

    configmap.yml

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: app-config
    data:
      APP_COLOR: blue
      APP_MODE: prod
    

Note: This resource doesn't use a spec field; it uses a data field instead.

View Configmaps

kubectl get configmaps
kubectl describe configmap

Configmap in pods

apiVersion: v1
kind: Pod
metadata:
  name: simple-webapp-color
  labels:
    name: simple-webapp-color
spec:
  containers:
  - name: simple-webapp-color
    image: simple-webapp-color
    ports:
    - containerPort: 8080
    envFrom:
    - configMapRef:
        name: app-config
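As mentioned above, a ConfigMap can also be mounted as a volume; a minimal sketch reusing app-config (the volume name and mount path are illustrative):

apiVersion: v1
kind: Pod
metadata:
  name: simple-webapp-color
spec:
  containers:
  - name: simple-webapp-color
    image: simple-webapp-color
    volumeMounts:
    - name: app-config-volume
      mountPath: /etc/app-config
  volumes:
  - name: app-config-volume
    configMap:
      name: app-config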

Practice:

  • Create configmap from literal
  • Create a pod using a configmap

Secrets

Kubernetes Secrets let you store and manage sensitive information, such as passwords, OAuth tokens, and ssh keys. Storing confidential information in a Secret is safer and more flexible than putting it verbatim in a Pod definition or in a container image

Create Secrets

  • Imperative:
kubectl create secret generic \ 
	app-secret --from-literal=DB_Host=mysql \
			   --from-literal=DB_User=root

or we can use a file:

kubectl create secret generic \
	app-secret --from-file=app_secret.properties

  • Declarative:

kubectl create -f secret-data.yaml

secret-data.yaml:

apiVersion: v1
kind: Secret
metadata:
  name: app-secret
data:
  DB_Host: bXlzcWw=
  DB_User: cm9vdA==
  DB_Password: cGFzd3Jk

To encode the data, for example:

echo -n 'mysql' | base64

To decode the data:

echo -n 'bXlzcWw=' | base64 --decode

View Secrets

kubectl get secrets
kubectl describe secrets

To view the values:

kubectl get secret app-secret -o yaml

Secrets in pods

apiVersion: v1
kind: Pod
metadata:
  name: simple-webapp-color
  labels:
    name: simple-webapp-color
spec:
  containers:
  - name: simple-webapp-color
    image: simple-webapp-color
    ports:
    - containerPort: 8080
    envFrom:
    - secretRef:
        name: app-secret

Secrets can also be mounted in pods as volumes; in that case we can see each key as a file inside the container.
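A minimal sketch of that volume mount (the volume name is illustrative; the mount path matches the listing below):

apiVersion: v1
kind: Pod
metadata:
  name: simple-webapp-color
spec:
  containers:
  - name: simple-webapp-color
    image: simple-webapp-color
    volumeMounts:
    - name: app-secret-volume
      mountPath: /opt/app-secret-volumes
  volumes:
  - name: app-secret-volume
    secret:
      secretName: app-secret

Inside the container we then see: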

ls /opt/app-secret-volumes
DB_Host		DB_Password		DB_User
cat /opt/app-secret-volumes/DB_Password
paswrd

A note about secrets

Remember that secrets encode data in base64 format. Anyone with the base64 encoded secret can easily decode it. As such the secrets can be considered as not very safe.

The concept of safety of the Secrets is a bit confusing in Kubernetes. The kubernetes documentation page and a lot of blogs out there refer to secrets as a “safer option” to store sensitive data. They are safer than storing in plain text as they reduce the risk of accidentally exposing passwords and other sensitive data. In my opinion it’s not the secret itself that is safe, it is the practices around it.

Secrets are not encrypted, so it is not safer in that sense. However, some best practices around using secrets make it safer. As in best practices like:

  • Not checking-in secret object definition files to source code repositories.
  • Enabling Encryption at Rest for Secrets so they are stored encrypted in ETCD.

Also the way kubernetes handles secrets. Such as:

  • A secret is only sent to a node if a pod on that node requires it.
  • Kubelet stores the secret into a tmpfs so that the secret is not written to disk storage.
  • Once the Pod that depends on the secret is deleted, kubelet will delete its local copy of the secret data as well.

Read about the protections and risks of using secrets here

Having said that, there are other better ways of handling sensitive data like passwords in Kubernetes, such as using tools like Helm Secrets, HashiCorp Vault.

Multi-Container Pods

The primary purpose of a multi-container Pod is to support co-located, co-managed helper processes for a main program. There are some general patterns of using helper processes in Pods

Example: create a multi-container pod with 2 containers

apiVersion: v1
kind: Pod
metadata:
  name: multi-container-pod
spec:
  containers:
  - name: container-1
    image: nginx
    ports:
    - containerPort: 80  
  - name: container-2
    image: alpine
    command: ["watch", "wget", "-qO-", "localhost"]

Note: Multi-Container Pods share Lifecycle, Network and Storage

Init Containers

Init containers can contain utilities or setup scripts not present in an app image.

A Pod can have multiple containers running apps within it, but it can also have one or more init containers, which are run before the app containers are started.

Init containers are exactly like regular containers, except:

  • Init containers always run to completion.
  • Each init container must complete successfully before the next one starts.

If a Pod’s init container fails, the kubelet repeatedly restarts that init container until it succeeds. However, if the Pod has a restartPolicy of Never, and an init container fails during startup of that Pod, Kubernetes treats the overall Pod as failed.

To specify an init container for a Pod, add the initContainers field into the Pod specification, as an array of container items (similar to the app containers field and its contents).

Also, init containers do not support lifecycle, livenessProbe, readinessProbe, or startupProbe because they must run to completion before the Pod can be ready.

Example: This example defines a simple Pod that has two init containers. The first waits for myservice, and the second waits for mydb. Once both init containers complete, the Pod runs the app container from its spec section.

apiVersion: v1
kind: Pod
metadata:
  name: myapp-pod
  labels:
    app: myapp
spec:
  containers:
  - name: myapp-container
    image: busybox:1.28
    command: ['sh', '-c', 'echo The app is running! && sleep 3600']
  initContainers:
  - name: init-myservice
    image: busybox:1.28
    command: ['sh', '-c', "until nslookup myservice.$(cat /var/run/secrets/kubernetes.io/serviceaccount/namespace).svc.cluster.local; do echo waiting for myservice; sleep 2; done"]
  - name: init-mydb
    image: busybox:1.28
    command: ['sh', '-c', "until nslookup mydb.$(cat /var/run/secrets/kubernetes.io/serviceaccount/namespace).svc.cluster.local; do echo waiting for mydb; sleep 2; done"]

Cluster Maintenance

OS Upgrades

If a node is lost for more than 5 minutes (the pod eviction timeout set on the controller manager), Kubernetes marks its pods as terminated; if the pods were part of a ReplicaSet, they will be recreated on other nodes.

For an OS upgrade, we can do a quick upgrade and reboot; but if we are not sure the node will be back in less than 5 minutes, there is a safer way to do it.

Drain a node of all the workloads:

The pods are gracefully terminated and recreated on other nodes. The node is also marked as unschedulable (cordoned).

kubectl drain node1

Note: The drain fails if there are pods that are not managed by a ReplicaSet (or similar controller); we need to delete those pods manually before draining, or drain with force:

kubectl drain node1 --force

When the node comes back, we need to make it schedulable again with uncordon:

kubectl uncordon node1

Another command is cordon; cordon marks the node as unschedulable but does not drain it:

kubectl cordon node1

Kubernetes Software Versions

Kubernetes versions, for example v1.11.3:

1 (major): stable version

11 (minor): features and functionalities

3 (patch): bug fixes

Note: The first major version, v1.0, was released in July 2015

Releases:

  • v1.10.0-alpha: New features first go to an alpha release. New features are not enabled by default
  • v1.10.0-beta: Code well tested. New features are enabled by default
  • v1.10.0: The stable version

Cluster Upgrade Process

The different K8s components such as kube-apiserver, controller-manager, kube-scheduler, kubelet, and kube-proxy can run different versions; however, none of these components can have a version higher than the kube-apiserver (because every component talks to the kube-apiserver). Example:

  • kube-apiserver: X (v1.10)
  • controller-manager / kube-scheduler: X to X-1 (v1.9, v1.10)
  • kubelet / kube-proxy: X to X-2 (v1.8, v1.9, v1.10)
  • kubectl: X+1 to X-1 (v1.9, v1.10, v1.11)

This is important because we can upgrade the cluster component by component.

Note: Kubernetes supports only the three most recent minor versions

For example, if the latest version is v1.22, then v1.22, v1.21, and v1.20 are supported

Note: The recommended approach is to upgrade one minor version at a time

Upgrades:

  • Cloud providers: Few clicks

  • Kubeadm

    kubeadm upgrade plan
    kubeadm upgrade apply
    

    Note: Kubeadm doesn’t install/upgrade kubelet

  • From scratch: Manually upgrade the different components by ourself.

Upgrade steps:

  1. Upgrade master nodes: while we upgrade the master node, the control plane components (API server, scheduler, controller manager) go down briefly; however, all workloads hosted on the worker nodes continue to serve users as normal.

  2. upgrade worker nodes (3 strategies)

    1. Upgrade all at once: the pods go down all at the same time and users can no longer access the applications. Requires downtime.

    2. Upgrade one node at a time

      Before upgrading the first node, we drain it, then we upgrade it, and we continue with the next nodes.

    3. Add nodes to the cluster: add nodes with the upgraded version to the cluster and move the workload to the new ones; this is very easy in cloud environments.

More information about Kubeadm upgrade:

Before upgrading the cluster, we first need to upgrade the kubeadm tool itself, one minor version at a time. For example, if we are on v1.20 and want to reach v1.22, we go through v1.21 first.

Upgrade master nodes:

apt install -y kubeadm=1.21.0-00
kubeadm upgrade apply v1.21.0
apt install -y kubelet=1.21.0-00
systemctl restart kubelet
kubectl get nodes | grep 1.21

Note: kubelet on the master node is optional.

Upgrade worker nodes (one node at a time):

kubectl drain node01
apt install -y kubeadm=1.21.0-00
apt install -y kubelet=1.21.0-00
kubeadm upgrade node config --kubelet-version v1.21.0
systemctl restart kubelet
kubectl uncordon node01

Note: Practice upgrade with kubeadm.

Master:

apt install kubeadm=1.20.0-00
kubeadm upgrade apply v1.20.0
apt install kubelet=1.20.0-00
systemctl restart kubelet

Node:

kubectl drain node01 --ignore-daemonsets --force
ssh node01
apt update
apt install kubeadm=1.20.0-00
kubeadm upgrade node
apt install kubelet=1.20.0-00
systemctl restart kubelet
exit #(go back master)
kubectl uncordon node01

Note: The kubeadm upgrade commands are different between the master and the worker nodes. On the worker nodes, kubeadm takes the configuration from the master node, so we don't need to specify the version.

Backup and Restore Methods

Backup Candidates:

  • Resource configuration

    • We can do it using the declarative way and upload to a git repository.

    • Another way is querying all the configuration from the kube-apiserver:

      kubectl get all --all-namespaces -o yaml > all-config.yml
      
    • Using an external tool: Velero

  • ETCD Cluster

    • We could copy the data directory, normally on /var/lib/etcd/

    • We can snapshot the db

      etcdctl snapshot save ourbackup.db
      etcdctl snapshot status ourbackup.db
      
    • Restore

      service kube-apiserver stop
      etcdctl snapshot restore ourbackup.db --data-dir /var/lib/etcd-from-backup
      

      Now we need to configure etcd.service to point to the new data directory.

      ExecStart=/usr/local/bin/etcd \\
        --data-dir /var/lib/etcd-from-backup
      

      Now we need to reload the daemon and the services

      systemctl daemon-reload
      service etcd restart
      service kube-apiserver start
      

      Note: In all etcdctl commands we need to specify the endpoints, cacert, cert, and key of the cluster's etcd.

      Note: If we back up the resource configuration files, we don't need to back up the ETCD cluster (for example, on a managed K8s).

Working with ETCDCTL

To see all the options for a specific sub-command, make use of the -h or --help flag.

For example, if you want to take a snapshot of etcd, use:

etcdctl snapshot save -h and keep a note of the mandatory global options.

If the ETCD database is TLS-Enabled, the following options are mandatory:

  • --cacert: verify certificates of TLS-enabled secure servers using this CA bundle

  • --cert: identify secure client using this TLS certificate file

  • --endpoints=[127.0.0.1:2379]: this is the default, as etcd runs on the master node and is exposed on localhost:2379

  • --key: identify secure client using this TLS key file

Example backup:

ETCDCTL_API=3 etcdctl --endpoints=https://[127.0.0.1]:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key \
  snapshot save /opt/snapshot-pre-boot.db

Restore:

ETCDCTL_API=3 etcdctl --data-dir /var/lib/etcd-from-backup \
  snapshot restore /opt/snapshot-pre-boot.db

Note: In this case, we are restoring the snapshot to a different directory but on the same server where we took the backup (the controlplane node). As a result, the only required option for the restore command is --data-dir.

Security

TLS in Kubernetes

Important files:

  • Client Certificates for Clients
    • Admin: admin.crt admin.key
    • Kube-api server: apiserver-kubelet-client.crt, apiserver-kubelet-client.key
    • Kube-api server: apiserver-etcd-client.crt, apiserver-etcd-client.key
    • Kube-scheduler:
    • kube-controller-manager:
    • kube-proxy:
    • kubelet:
  • Server Certificates for Servers:
    • ETCD: etcdserver.crt, etcdserver.key
    • KUBE-API: apiserver.crt, apiserver.key
    • KUBELET: kubelet.crt, kubelet.key

Note: The .crt file is the certificate (the public part), and the .key file is the private key

View Certificates Details

  • The hard way

    cat /etc/systemd/system/kube-apiserver.service
    
  • Kubeadm

    cat /etc/kubernetes/manifests/kube-apiserver.yaml
    

    Note: We need to look at the different .crt and .key files.

Exercises:

  • Identify the certificate file used for the kube-api server

    $ grep tls-cert /etc/kubernetes/manifests/kube-apiserver.yaml
    - --tls-cert-file=/etc/kubernetes/pki/apiserver.crt
    
  • Identify the Certificate file used to authenticate kube-apiserver as a client to ETCD Server

    $ grep etcd /etc/kubernetes/manifests/kube-apiserver.yaml | grep crt | grep api
    - --etcd-certfile=/etc/kubernetes/pki/apiserver-etcd-client.crt
    
  • Identify the key used to authenticate kubeapi-server to the kubelet server

    $ grep kubelet /etc/kubernetes/manifests/kube-apiserver.yaml | grep key
    - --kubelet-client-key=/etc/kubernetes/pki/apiserver-kubelet-client.key
    
  • Identify the ETCD Server Certificate used to host ETCD server

    $ grep cert-file /etc/kubernetes/manifests/etcd.yaml
    - --cert-file=/etc/kubernetes/pki/etcd/server.crt
    
  • What is the name of the CA who issued the Kube API Server Certificate?

    $ openssl x509 -in /etc/kubernetes/pki/apiserver.crt -text | grep Issuer
            Issuer: CN = kubernetes
    
  • What is the Common Name (CN) configured on the ETCD Server certificate?

    $ openssl x509 -in /etc/kubernetes/pki/etcd/server.crt -text | grep Subject
            Subject: CN = controlplane
    
  • How long, from the issued date, is the Kube-API Server Certificate valid for?

    $ openssl x509 -in /etc/kubernetes/pki/apiserver.crt -text | grep Validity -A 2
            Validity
                Not Before: Oct 22 15:10:30 2021 GMT
                Not After : Oct 22 15:10:30 2022 GMT
    

Certificates API

The Certificates API enables automation of X.509 credential provisioning by providing a programmatic interface for clients of the Kubernetes API to request and obtain X.509 certificates from a Certificate Authority (CA).

Exercises:

  • Create a CertificateSigningRequest object with the name akshay with the contents of the akshay.csr file

    apiVersion: certificates.k8s.io/v1
    kind: CertificateSigningRequest
    metadata:
      name: akshay
    spec:
      groups:
      - system:authenticated
      request: $FROMTHEFILE
      signerName: kubernetes.io/kube-apiserver-client
      usages:
      - client auth
    

    Note: Doc https://kubernetes.io/docs/reference/access-authn-authz/certificate-signing-requests/

  • Check the CSR

    kubectl get csr
    
  • Approve the CSR Request

    kubectl certificate approve akshay
    
  • Reject and delete CSR Request

    kubectl certificate deny agent-smith
    kubectl delete csr agent-smith
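    Note: for the request field in the CertificateSigningRequest object above, the content of the akshay.csr file is base64-encoded on a single line, for example:

    cat akshay.csr | base64 | tr -d "\n"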
    

Kubeconfig

The file is located at ~/.kube/config by default. Important sections:

  • Clusters: Example: dev, prod, aws
  • Context: Relation between the cluster and the user. Example admin@production
  • Users: admin, developer, prod

Example file:

apiVersion: v1
kind: Config
preferences: {}
current-context: dev-frontend

clusters:
- cluster:
  name: development
- cluster:
  name: scratch

users:
- name: developer
- name: experimenter

contexts:
- context:
  name: dev-frontend
- context:
  name: dev-storage
- context:
  name: exp-scratch

Note: We can specify the current-context

Commands:

  • Add context details to your configuration file:

    kubectl config --kubeconfig=config-demo set-context dev-frontend --cluster=development --namespace=frontend --user=developer
    kubectl config --kubeconfig=config-demo set-context dev-storage --cluster=development --namespace=storage --user=developer
    kubectl config --kubeconfig=config-demo set-context exp-scratch --cluster=scratch --namespace=default --user=experimenter
    
  • View the config file

    kubectl config --kubeconfig=config-demo view
    
  • Set the current context

    kubectl config --kubeconfig=config-demo use-context dev-frontend
    

    Note: Now whenever you enter a kubectl command, the action will apply to the cluster, and namespace listed in the dev-frontend context. And the command will use the credentials of the user listed in the dev-frontend context.

  • Finally, suppose you want to work for a while in the storage namespace of the development cluster.

    kubectl config --kubeconfig=config-demo use-context dev-storage
    
  • To see only the configuration information associated with the current context, use the --minify flag.

    kubectl config --kubeconfig=config-demo view --minify
    

Note: if we use the default config file, we don't need to specify the --kubeconfig parameter.

Role Based Access Controls

Role-based access control (RBAC) is a method of regulating access to computer or network resources based on the roles of individual users within your organization.

An RBAC Role or ClusterRole contains rules that represent a set of permissions. Permissions are purely additive (there are no “deny” rules).

A Role always sets permissions within a particular namespace; when you create a Role, you have to specify the namespace it belongs in.

Here’s an example Role in the “default” namespace that can be used to grant read access to pods:

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: default
  name: pod-reader
rules:
- apiGroups: [""] # "" indicates the core API group
  resources: ["pods"]
  verbs: ["get", "watch", "list"]

RoleBinding:

A RoleBinding links a user to a role. A role binding grants the permissions defined in a role to a user or set of users. It holds a list of subjects (users, groups, or service accounts) and a reference to the role being granted. A RoleBinding grants permissions within a specific namespace, whereas a ClusterRoleBinding grants that access cluster-wide.

Here is an example of a RoleBinding that grants the “pod-reader” Role to the user “jane” within the “default” namespace. This allows “jane” to read pods in the “default” namespace.

apiVersion: rbac.authorization.k8s.io/v1
# This role binding allows "jane" to read pods in the "default" namespace.
# You need to already have a Role named "pod-reader" in that namespace.
kind: RoleBinding
metadata:
  name: read-pods
  namespace: default
subjects:
# You can specify more than one "subject"
- kind: User
  name: jane # "name" is case sensitive
  apiGroup: rbac.authorization.k8s.io
roleRef:
  # "roleRef" specifies the binding to a Role / ClusterRole
  kind: Role #this must be Role or ClusterRole
  name: pod-reader # this must match the name of the Role or ClusterRole you wish to bind to
  apiGroup: rbac.authorization.k8s.io

Commands RBAC

  • View roles:

    kubectl get roles
    
  • View rolesbinding:

    kubectl get rolebindings
    
  • Check Access: (for example if you are a user and you want to verify and access)

    kubectl auth can-i create deployments
    kubectl auth can-i delete nodes
    kubectl auth can-i create pods --as dev-user
    kubectl auth can-i create pods --as dev-user --namespace test
    

Exercises:

  • Inspect the environment and identify the authorization modes configured on the cluster.

    $ kubectl describe pod kube-apiserver-controlplane -n kube-system | grep autho
          --authorization-mode=Node,RBAC
    
  • A user dev-user is created. User’s details have been added to the kubeconfig file. Inspect the permissions granted to the user. Check if the user can list pods in the default namespace.

    $ kubectl auth can-i list pods --as dev-user --namespace default
    no
    
  • Create the necessary roles and role bindings required for the dev-user to create, list and delete pods in the default namespace.

    • Role: developer

    • Role Resources: pods

    • Role Actions: list

    • Role Actions: create

    • RoleBinding: dev-user-binding

    • RoleBinding: Bound to dev-user

      kubectl create role developer --namespace=default --verb=list,create --resource=pods
      kubectl create rolebinding dev-user-binding --namespace=default --role=developer --user=dev-user
      

Cluster Roles

ClusterRole, by contrast to roles, is a non-namespaced resource. The resources have different names (Role and ClusterRole) because a Kubernetes object always has to be either namespaced or not namespaced; it can’t be both.

A ClusterRole can be used to grant the same permissions as a Role. Because ClusterRoles are cluster-scoped, you can also use them to grant access to:

  • cluster-scoped resources (like nodes)

  • non-resource endpoints (like /healthz)

  • namespaced resources (like Pods), across all namespaces

    For example: you can use a ClusterRole to allow a particular user to run kubectl get pods --all-namespaces

Here is an example of a ClusterRole that can be used to grant read access to secrets in any particular namespace, or across all namespaces (depending on how it is bound):

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  # "namespace" omitted since ClusterRoles are not namespaced
  name: secret-reader
rules:
- apiGroups: [""]
  #
  # at the HTTP level, the name of the resource for accessing Secret
  # objects is "secrets"
  resources: ["secrets"]
  verbs: ["get", "watch", "list"]

ClusterRoleBinding:

To grant permissions across a whole cluster, you can use a ClusterRoleBinding. The following ClusterRoleBinding allows any user in the group “manager” to read secrets in any namespace.

apiVersion: rbac.authorization.k8s.io/v1
# This cluster role binding allows anyone in the "manager" group to read
# secrets in any namespace.
kind: ClusterRoleBinding
metadata:
  name: read-secrets-global
subjects:
- kind: Group
  name: manager # Name is case sensitive
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: secret-reader
  apiGroup: rbac.authorization.k8s.io

Note: After you create a binding, you cannot change the Role or ClusterRole that it refers to. If you try to change a binding’s roleRef, you get a validation error. If you do want to change the roleRef for a binding, you need to remove the binding object and create a replacement.

Cluster Scoped

  • Nodes
  • PV
  • ClusterRoles / ClusterRolebindings
  • CertificateSigningRequests
  • Namespaces

Exercises:

  • A new user michelle joined the team. She will be focusing on the nodes in the cluster. Create the required ClusterRoles and ClusterRoleBindings so she gets access to the nodes. Grant permission to list nodes

    kubectl create clusterrole node-list --verb=list --resource=nodes
    kubectl create clusterrolebinding michelle-binding --clusterrole=node-list --user=michelle
    kubectl auth can-i list nodes --as michelle
    
  • michelle’s responsibilities are growing and now she will be responsible for storage as well. Create the required ClusterRoles and ClusterRoleBindings to allow her access to Storage.

    Get the API groups and resource names from command kubectl api-resources. Use the given spec:

    • ClusterRole: storage-admin

    • Resource: persistentvolumes

    • Resource: storageclasses

    • ClusterRoleBinding: michelle-storage-admin

    • ClusterRoleBinding Subject: michelle

    • ClusterRoleBinding Role: storage-admin

      kubectl create clusterrole storage-admin --verb=list,get,delete,create,watch --resource=persistentvolumes,storageclasses
      kubectl create clusterrolebinding michelle-storage-admin --clusterrole=storage-admin --user=michelle
      kubectl auth can-i list storageclasses --as michelle
      

Service Accounts

A service account provides an identity for processes that run in a Pod.

When you (a human) access the cluster (for example, using kubectl), you are authenticated by the apiserver as a particular User Account (currently this is usually admin, unless your cluster administrator has customized your cluster). Processes in containers inside pods can also contact the apiserver. When they do, they are authenticated as a particular Service Account (for example, default).

Create Service Account

kubectl create serviceaccount dashboard-sa

View SA:

kubectl get serviceaccount

View SA Token:

$ kubectl describe serviceaccount dashboard-sa
Name:		dashboard-sa
Namespace:  default
...
Tokens:	    dashboard-sa-token-kddbm

Note: When a SA is created, Kubernetes first creates the SA object (the name), then generates a token for the SA, and finally creates a Secret holding that token and links it to the SA object (SA object -> token -> secret)

Secret:

token:
aosvebpeh.gsxcuqptmeszxbp...

To view the token we need to run:

$ kubectl describe secret dashboard-sa-token-kddbm
...
namespace:  default
token:
aosvebpeh.gsxcuqptmeszxbp...

This token can then be used as an authentication bearer token while making REST calls to the Kubernetes API.

For example, using curl you could provide the bearer token as an Authorization header while making a REST call to the Kubernetes API, as in the case of my custom dashboard application.
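A minimal sketch (the API server address and token are placeholders):

curl https://<kube-apiserver>:6443/api/v1/pods \
  --header "Authorization: Bearer <token>" \
  --insecure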

View roles and rolebindings from SA

kubectl get roles,rolebindings

Default Service Account

For every namespace in Kubernetes, a service account named "default" is automatically created. Each namespace has its own default service account.

Whenever a pod is created, the default service account and its token are automatically mounted into that pod as a volume mount.

For example we have a simple pod definition file that creates a pod using my custom Kubernetes dashboard image.

We haven’t specified any secrets or Volume mounts in the definition file.

However when the pod is created if you look at the details of the pod by running the kubectl describe pod command you’ll see that a volume is automatically created from the secret named “default-token” which is in fact the secret containing the token for this default service account.

The secret token is mounted at /var/run/secrets/kubernetes.io/serviceaccount inside the pod.

Remember that the default service account is very much restricted. It only has permission to run basic Kubernetes API queries.

Note: Remember, you cannot edit the service account of an existing pod; you must delete and recreate the pod. However, in the case of a deployment you will be able to edit the service account, as any change to the pod definition will automatically trigger a new rollout of the deployment.
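A minimal sketch of assigning a custom service account in a pod definition (pod name and image are illustrative; dashboard-sa is the account created above):

apiVersion: v1
kind: Pod
metadata:
  name: my-kubernetes-dashboard
spec:
  serviceAccountName: dashboard-sa
  containers:
  - name: my-kubernetes-dashboard
    image: my-kubernetes-dashboard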

Image Security

When we use the image nginx, for example, in our deployments, we are really using the image docker.io/library/nginx:latest (registry/account/image-repository:tag)

Private repository: if we need to use an image from a private repository, we need to use a set of credentials.

First we need to create a secret with our credentials.

kubectl create secret docker-registry regcred --docker-server=<your-registry-server> --docker-username=<your-name> --docker-password=<your-pword> --docker-email=<your-email>

Note: <your-registry-server> is your private Docker registry FQDN. Use https://index.docker.io/v1/ for DockerHub.

Create a pod that uses your secret:

apiVersion: v1
kind: Pod
metadata:
  name: private-reg
spec:
  containers:
  - name: private-reg-container
    image: your.private.registry.example.com/janedoe/jdoe-private:v1
  imagePullSecrets:
  - name: regcred

Note: The imagePullSecrets value is the configuration for the credentials secret.

Security Context

On Docker we can do:

  • If we want to run the container with a specific user ID:

docker run --user=1001 ubuntu sleep 3600

  • If we want to add capabilities:

docker run --cap-add MAC_ADMIN ubuntu

On Kubernetes it is similar.

At the pod level:

apiVersion: v1
kind: Pod
metadata:
  name: security-context-demo
spec:
  securityContext:
    runAsUser: 1000
    runAsGroup: 3000
    fsGroup: 2000
  volumes:
  - name: sec-ctx-vol
    emptyDir: {}
  containers:
  - name: sec-ctx-demo
    image: busybox
    command: [ "sh", "-c", "sleep 1h" ]
    volumeMounts:
    - name: sec-ctx-vol
      mountPath: /data/demo
    securityContext:
      allowPrivilegeEscalation: false

At the container level, with capabilities:

...
containers:
- name: ubuntu
  image: ubuntu
  command: ["sleep", "3600"]
  securityContext:
    runAsUser: 1000
    capabilities:
      add: ["MAC_ADMIN", "SYS_TIME"]

Note: Capabilities are only supported at the container level and not at the POD level

Practice:

  • Edit a Pod and change the process to use another user with ID 1010
  • Add the SYS_DATE capabilities, and change the date.

Networking Policies

If you want to control traffic flow at the IP address or port level (OSI layer 3 or 4), then you might consider using Kubernetes NetworkPolicies for particular applications in your cluster.

The entities that a Pod can communicate with are identified through a combination of the following 3 identifiers:

  1. Other pods that are allowed (exception: a pod cannot block access to itself)
  2. Namespaces that are allowed
  3. IP blocks (exception: traffic to and from the node where a Pod is running is always allowed, regardless of the IP address of the Pod or the node)

Example:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: test-network-policy
  namespace: default
spec:
  podSelector:
    matchLabels:
      role: db
  policyTypes:
  - Ingress
  - Egress
  ingress:
  - from:
    - ipBlock:
        cidr: 172.17.0.0/16
        except:
        - 172.17.1.0/24
    - namespaceSelector:
        matchLabels:
          project: myproject
    - podSelector:
        matchLabels:
          role: frontend
    ports:
    - protocol: TCP
      port: 6379
  egress:
  - to:
    - ipBlock:
        cidr: 10.0.0.0/24
    ports:
    - protocol: TCP
      port: 5978

Note: POSTing this to the API server for your cluster will have no effect unless your chosen networking solution supports network policy.

Mandatory Fields: As with all other Kubernetes config, a NetworkPolicy needs apiVersion, kind, and metadata fields. For general information about working with config files, see Configure Containers Using a ConfigMap, and Object Management.

spec: NetworkPolicy spec has all the information needed to define a particular network policy in the given namespace.

podSelector: Each NetworkPolicy includes a podSelector which selects the grouping of pods to which the policy applies. The example policy selects pods with the label “role=db”. An empty podSelector selects all pods in the namespace.

policyTypes: Each NetworkPolicy includes a policyTypes list which may include either Ingress, Egress, or both. The policyTypes field indicates whether or not the given policy applies to ingress traffic to selected pod, egress traffic from selected pods, or both. If no policyTypes are specified on a NetworkPolicy then by default Ingress will always be set and Egress will be set if the NetworkPolicy has any egress rules.

ingress: Each NetworkPolicy may include a list of allowed ingress rules. Each rule allows traffic which matches both the from and ports sections. The example policy contains a single rule, which matches traffic on a single port, from one of three sources, the first specified via an ipBlock, the second via a namespaceSelector and the third via a podSelector.

egress: Each NetworkPolicy may include a list of allowed egress rules. Each rule allows traffic which matches both the to and ports sections. The example policy contains a single rule, which matches traffic on a single port to any destination in 10.0.0.0/24.

So, the example NetworkPolicy:

  1. isolates “role=db” pods in the “default” namespace for both ingress and egress traffic (if they weren’t already isolated)
  2. (Ingress rules) allows connections to all pods in the “default” namespace with the label “role=db” on TCP port 6379 from:
    • any pod in the “default” namespace with the label “role=frontend”
    • any pod in a namespace with the label “project=myproject”
    • IP addresses in the ranges 172.17.0.0–172.17.0.255 and 172.17.2.0–172.17.255.255 (ie, all of 172.17.0.0/16 except 172.17.1.0/24)
  3. (Egress rules) allows connections from any pod in the “default” namespace with the label “role=db” to CIDR 10.0.0.0/24 on TCP port 5978

Note: By default, if no policies exist in a namespace, then all ingress and egress traffic is allowed to and from pods in that namespace.

To view the Policies

kubectl get netpol

Important: The 3 selectors for netpol are:

  • podSelector
  • namespaceSelector
  • ipBlock

We can combine these selectors within one rule, or define them as separate rules. For example, (podSelector: frontend AND namespaceSelector: prod):

    - namespaceSelector:
        matchLabels:
          project: prod
      podSelector:
        matchLabels:
          role: frontend

is not the same as two separate rules matching podSelector: frontend OR namespaceSelector: prod

    - namespaceSelector:
        matchLabels:
          project: prod
    - podSelector:
        matchLabels:
          role: frontend

Exercise:

  • Create a network policy to allow traffic from the Internal application only to the payroll-service and db-service.

    • Policy Name: internal-policy
    • Policy Type: Egress
    • Egress Allow: payroll
    • Payroll Port: 8080
    • Egress Allow: mysql
    • MySQL Port: 3306

    Answer:

    apiVersion: networking.k8s.io/v1
    kind: NetworkPolicy
    metadata:
      name: internal-policy
      namespace: default
    spec:
      podSelector:
        matchLabels:
          role: db
      policyTypes:
      - Egress
      egress:
      - to:
        - podSelector:
            matchLabels:
              name: payroll
        ports:
        - protocol: TCP
          port: 8080
      - to:
        - podSelector:
            matchLabels:
              name: mysql
        ports:
        - protocol: TCP
          port: 3306
    

Storage

This section is the same as in the CKAD, so I only include a few examples for each topic.

PV

apiVersion: v1
kind: PersistentVolume
metadata:
  name: task-pv-volume
  labels:
    type: local
spec:
  storageClassName: manual
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/mnt/data"

PVC

Pods consume node resources and PVCs consume PV resources. Pods can request specific levels of resources (CPU and Memory). Claims can request specific size and access modes (e.g., they can be mounted ReadWriteOnce, ReadOnlyMany or ReadWriteMany)

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: task-pv-claim
spec:
  storageClassName: manual
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 3Gi

Note: PVCs will automatically bind themselves to a PV that has compatible StorageClass and accessModes
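
A quick way to verify the binding, assuming the manifests above were applied:

kubectl get pv,pvc
# task-pv-volume should report STATUS "Bound" with CLAIM "default/task-pv-claim"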

PVCs in PODs

apiVersion: v1
kind: Pod
metadata:
  name: task-pv-pod
spec:
  volumes:
    - name: task-pv-storage
      persistentVolumeClaim:
        claimName: task-pv-claim
  containers:
    - name: task-pv-container
      image: nginx
      ports:
        - containerPort: 80
          name: "http-server"
      volumeMounts:
        - mountPath: "/usr/share/nginx/html"
          name: task-pv-storage

Access Modes

The access modes are:

  • ReadWriteOnce – the volume can be mounted as read-write by a single node (RWO)
  • ReadOnlyMany – the volume can be mounted read-only by many nodes (ROX)
  • ReadWriteMany – the volume can be mounted as read-write by many nodes (RWX)

Reclaim Policy

  • Retain – manual reclamation
  • Recycle – basic scrub (rm -rf /thevolume/*) This is obsolete with Automatic Provisioning
  • Delete – associated storage asset such as AWS EBS, GCE PD, Azure Disk, or OpenStack Cinder volume is deleted
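
The reclaim policy of an existing PV can be changed with a patch; a quick sketch using the PV from the earlier example:

kubectl patch pv task-pv-volume -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'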

Storage Class

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mypvc
  namespace: testns
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 100Gi
  storageClassName: gold

Note: The default StorageClass is marked with (default) in the output of kubectl get storageclass
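
For reference, a StorageClass manifest itself looks roughly like this (a sketch; the provisioner and parameters below are assumptions for an AWS EBS-backed class named gold):

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gold
provisioner: kubernetes.io/aws-ebs   # assumed provisioner; depends on your cloud/CSI driver
parameters:
  type: io1
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer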

Networking

CoreDNS

CoreDNS is a flexible, extensible DNS server that can serve as the Kubernetes cluster DNS. CoreDNS running the kubernetes plugin can be used as a replacement for kube-dns in a kubernetes cluster.

Networking Namespaces

With network namespaces, you can have different and separate instances of network interfaces and routing tables that operate independent of each other.
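
A quick way to see this on any Linux box (plain ip commands, nothing Kubernetes-specific):

ip netns add red            # create an isolated network namespace
ip netns exec red ip link   # only its own loopback interface exists
ip netns exec red ip route  # its routing table is independent of the host's
ip netns delete red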

Docker Networking

The type of network a container uses, whether it is a bridge, an overlay, a macvlan network, or a custom network plugin, is transparent from within the container. From the container’s point of view, it has a network interface with an IP address, a gateway, a routing table, DNS services, and other networking details (assuming the container is not using the none network driver).

CNI

CNI (Container Network Interface) consists of a specification and libraries for writing plugins to configure network interfaces in Linux containers, along with a number of plugins. CNI concerns itself only with network connectivity of containers and removing allocated resources when the container is deleted.

Kubernetes uses CNI as an interface between network providers and Kubernetes pod networking.

CNI already ships with a set of supported plugins, such as bridge, VLAN, IPVLAN, and MACVLAN.

Cluster Networking

Networking is a central part of Kubernetes, but it can be challenging to understand exactly how it is expected to work. There are 4 distinct networking problems to address:

  1. Highly-coupled container-to-container communications: this is solved by Pods and localhost communications.
  2. Pod-to-Pod communications: this is the primary focus of this document.
  3. Pod-to-Service communications: this is covered by services.
  4. External-to-Service communications: this is covered by services.

Kubernetes is all about sharing machines between applications. Typically, sharing machines requires ensuring that two applications do not try to use the same ports. Coordinating ports across multiple developers is very difficult to do at scale and exposes users to cluster-level issues outside of their control.

Network Addons

Installing a network plugin in the cluster.

https://kubernetes.io/docs/concepts/cluster-administration/addons/

https://kubernetes.io/docs/concepts/cluster-administration/networking/#how-to-implement-the-kubernetes-networking-model

In the CKA exam, for a question that requires you to deploy a network addon, unless specifically directed, you may use any of the solutions described in the link above.

However the documentation currently does not contain a direct reference to the exact command to be used to deploy a third party network addon.

The links above redirect to third party/ vendor sites or GitHub repositories which cannot be used in the exam. This has been intentionally done to keep the content in the Kubernetes documentation vendor neutral.

At this moment in time, there is still one place within the documentation where you can find the exact command to deploy weave network addon:

https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/high-availability/#steps-for-the-first-control-plane-node (step 2)

Pod Networking

Networking model:

  • Every POD should have an IP
  • Every POD should be able to communicate with every other POD on the same node
  • Every POD should be able to communicate with every other POD on other nodes without NAT

Tools that implement this model: Weave Net, Flannel, Cilium, Calico, VMware NSX.

CNI in Kubernetes

  • Container runtime must create the network namespace

  • Identify the network the container must attach to

  • Container runtime must invoke the network plugin when a container is ADDed or DELeted

  • Network configuration is in JSON format

The CNI is configured on the kubelet.service

kubelet ... --network-plugin=cni --cni-bin-dir=/opt/cni/bin

We can check the parameters with another command

ps aux | grep kubelet

We can also check which network plugin is configured

/usr/bin/kubelet --network-plugin

Important directories:

  • /opt/cni/bin: the CNI plugin binaries, such as bridge, macvlan, weave-net, flannel, etc.

  • /etc/cni/net.d: the network configuration files that tell kubelet which plugin to use
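
For example, a config file under /etc/cni/net.d may look roughly like this (an illustrative flannel-style conflist, not copied from a real cluster):

{
  "name": "cbr0",
  "cniVersion": "0.3.1",
  "plugins": [
    { "type": "flannel", "delegate": { "isDefaultGateway": true } },
    { "type": "portmap", "capabilities": { "portMappings": true } }
  ]
}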

CNI Weave

Weave Net (by Weaveworks) is one CNI-based solution. An agent must be deployed on every node, and every agent holds the IP information of all nodes.

Deploy Weave

Deploy weave-net networking solution to the cluster with the subnet 10.50.0.0/16

$ kubectl apply -f "https://cloud.weave.works/k8s/net?k8s-version=$(kubectl version | base64 | tr -d '\n')&env.IPALLOC_RANGE=10.50.0.0/16"

Weave is deployed as a DaemonSet. Verify:

kubectl get pods -n kube-system

In order for Weave Net to work, you need to make sure IP forwarding is enabled on the worker nodes. Enable it by running the following on both workers:

sudo sysctl net.ipv4.conf.all.forwarding=1

Exercises:

  • Inspect the kubelet service and identify the network plugin configured for Kubernetes

    ps aux | grep network-plugin
    
  • What is the path configured with all binaries of CNI supported plugins?

    The CNI binaries are located under /opt/cni/bin by default.
    
  • What is the CNI plugin configured to be used on this kubernetes cluster?

    ls /etc/cni/net.d/
    
  • What binary executable file will be run by kubelet after a container and its associated namespace are created.

    grep type /etc/cni/net.d/10-flannel.conflist
    

Service Networking

  • ClusterIP
  • NodePort
  • LoadBalancer

Global Steps:

  1. Kubelet creates a POD (Watches the changes in the cluster through the Kube-Apiserver)

  2. The CNI plugin configure networking for the POD

  3. Each node runs kube-proxy. Kube-proxy also watches for changes in the cluster through the kube-apiserver, and every time a new service is created, kube-proxy gets into action. Services are a cluster-wide concept: they exist across all the nodes in the cluster, but only as virtual objects (there is no process or network namespace behind a service)

  4. Kube-proxy gets the Service IP/port and creates forwarding rules on every node. Rule example: every request to 10.99.13.178:80 is forwarded to 100.244.1.2. By default, kube-proxy uses iptables for this forwarding (we can see the rules in the iptables NAT table output)

    iptables -L -t nat | grep db-service
    

    Note: If we create a service of type NodePort, kube-proxy creates iptables rules to forward all traffic coming in on that port on every node. We can also check the logs

    cat /var/log/kube-proxy.log
    

The IP range for the ClusterIPs is defined on the API server (default 10.0.0.0/24)

kube-apiserver --service-cluster-ip-range

Verify

ps aux | grep kube-apiserver

Exercises:

  • What is the range of IP addresses configured for PODs on this cluster?

    kubectl logs weave-net-bh8mv weave -n kube-system 
    
  • What is the IP Range configured for the services within the cluster?

    cat /etc/kubernetes/manifests/kube-apiserver.yaml | grep cluster-ip-range
    # ps aux | grep cluster-ip
    
  • What type of proxy is the kube-proxy configured to use?

    kubectl logs kube-proxy-b4hvp -n kube-system | grep Using
    

DNS in Kubernetes

Whenever a service is created, the Kubernetes DNS service creates a record for it. It maps the service name to the service's cluster IP address.

Pods on the same namespace:

curl http://web-service

Pods on different namespace (apps ns)

curl http://web-service.apps

Note: For each namespace, the DNS creates a subdomain. All services are grouped together under another subdomain called svc

curl http://web-service.apps.svc

Finally, all services and PODs are grouped together under the cluster's root domain, cluster.local

curl http://web-service.apps.svc.cluster.local

Note: This is the FQDN for the service

Important:

We can also configure DNS records for PODs (not enabled by default). The subdomain changes from svc to pod, and the hostname is derived from the POD IP, e.g. 10.244.2.5 -> 10-244-2-5.

curl http://10-244-2-5.apps.pod.cluster.local
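
To test name resolution from inside the cluster, a throwaway pod is handy (the image tag and service name below are just examples):

kubectl run dns-test --rm -it --image=busybox:1.28 -- nslookup web-service.apps.svc.cluster.local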

CoreDNS in Kubernetes

Prior to K8S v1.12, the DNS implementation was kube-dns. From v1.12 onwards it was replaced by CoreDNS. The CoreDNS server is deployed as PODs in the kube-system namespace (two pods in a ReplicaSet). Its config file is /etc/coredns/Corefile, where plugins such as prometheus, proxy, health, kubernetes, etc. can be configured.

kubectl get cm coredns -n kube-system -o yaml

apiVersion: v1
data:
  Corefile: |
    .:53 {
        errors
        health
        kubernetes cluster.local in-addr.arpa ip6.arpa {
          pods insecure
          fallthrough in-addr.arpa ip6.arpa
        }
        prometheus :9153
        forward . /etc/resolv.conf
        cache 30
        loop
        reload
        loadbalance
    }

Note: If we want to enable the POD DNS records, we need to add the pods insecure parameter to the kubernetes plugin block
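
The Corefile lives in the coredns ConfigMap, so changes are made by editing it; thanks to the reload plugin shown above, CoreDNS picks up the change without a restart:

kubectl -n kube-system edit configmap coredns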

What address do the PODs use to reach the DNS server?

When we deploy the CoreDNS solution, it also creates a service to make it available to other components within the cluster. The service is named kube-dns by default, and its IP address is configured as the nameserver on the PODs.

k get svc kube-dns -n kube-system
NAME       TYPE        CLUSTER-IP    EXTERNAL-IP   PORT(S)         AGE
kube-dns   ClusterIP   172.20.0.10   <none>        53/UDP,53/TCP   14d

Check the DNS server on a pod

k exec -it nginx-manual -- grep nameserver /etc/resolv.conf
nameserver 172.20.0.10

Note: The DNS configuration on PODs is done automatically by Kubernetes (the kubelet)

We can access the service without specifying default.svc.cluster.local because that search domain is configured in resolv.conf

k exec -it nginx-manual -- grep search /etc/resolv.conf
search default.svc.cluster.local svc.cluster.local cluster.local

Ingress

Ingress helps your users access your application through a single externally accessible URL that you can configure to route to different services based on the URL path, while also handling SSL. Ingress is like a layer 7 load balancer built into the K8S cluster.

We still need to expose the Ingress itself outside the cluster, using a NodePort or LoadBalancer service, but we only need to do that once.

Ingress exposes HTTP and HTTPS routes from outside the cluster to services within the cluster. Traffic routing is controlled by rules defined on the Ingress resource.

An Ingress may be configured to give Services externally-reachable URLs, load balance traffic, terminate SSL / TLS, and offer name-based virtual hosting. An Ingress controller is responsible for fulfilling the Ingress, usually with a load balancer, though it may also configure your edge router or additional frontends to help handle the traffic.

An Ingress does not expose arbitrary ports or protocols. Exposing services other than HTTP and HTTPS to the internet typically uses a service of type Service.Type=NodePort or Service.Type=LoadBalancer.

Note: The advantage of an Ingress over a LoadBalancer or NodePort is that an Ingress can consolidate routing rules in a single resource to expose multiple services

Prerequisites

You must have an Ingress controller to satisfy an Ingress. Only creating an Ingress resource has no effect.

You may need to deploy an Ingress controller such as ingress-nginx. You can choose from a number of Ingress controllers.

Ingress resource

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: minimal-ingress
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /
spec:
  rules:
  - http:
      paths:
      - path: /testpath
        pathType: Prefix
        backend:
          service:
            name: test
            port:
              number: 80

Each HTTP rule contains the following information:

  • An optional host. In this example, no host is specified, so the rule applies to all inbound HTTP traffic through the IP address specified. If a host is provided (for example, foo.bar.com), the rules apply to that host.
  • A list of paths (for example, /testpath), each of which has an associated backend defined with a service.name and a service.port.name or service.port.number. Both the host and path must match the content of an incoming request before the load balancer directs traffic to the referenced Service.
  • A backend is a combination of Service and port names as described in the Service doc or a custom resource backend by way of a CRD. HTTP (and HTTPS) requests to the Ingress that matches the host and path of the rule are sent to the listed backend.

Path types: Each path in an Ingress is required to have a corresponding path type. Paths that do not include an explicit pathType will fail validation.

Examples: https://kubernetes.io/docs/concepts/services-networking/ingress/#examples

Imperative way: (1.20+)

kubectl create ingress <ingress-name> --rule="host/path=service:port"
kubectl create ingress ingress-test --rule="wear.my-online-store.com/wear*=wear-service:80"

More information: https://kubernetes.io/docs/reference/generated/kubectl/kubectl-commands#-em-ingress-em-
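
The --rule flag can be repeated to build an Ingress with several paths imperatively (the host, service names and ports below are illustrative):

kubectl create ingress ingress-wear-watch \
  --rule="my-online-store.com/wear*=wear-service:8080" \
  --rule="my-online-store.com/watch*=video-service:8080"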

Example with Nginx rewrite:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: test-ingress
  namespace: critical-space
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /
    nginx.ingress.kubernetes.io/ssl-redirect: "false"
spec:
  rules:
  - http:
      paths:
      - path: /pay
        pathType: Prefix
        backend:
          service:
           name: pay-service
           port:
            number: 8282

Another example:

---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: ingress-wear-watch
  namespace: app-space
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /
    nginx.ingress.kubernetes.io/ssl-redirect: "false"
spec:
  rules:
  - http:
      paths:
      - path: /wear
        pathType: Prefix
        backend:
          service:
           name: wear-service
           port: 
            number: 8080
      - path: /watch
        pathType: Prefix
        backend:
          service:
           name: video-service
           port:
            number: 8080

Design and install a Kubernetes Cluster

Design a Kubernetes Cluster

  • Purpose of the cluster?
    • Education (Minikube/SingleNodeWithKubeadm/GCP/AWS)
    • Development/Testing (MultiNode cluster/Kubeadm/GCP/AWS/AKS)
    • Hosting prod apps (HA Multi node cluster with multi master nodes/Kubeadm/Kops/AWS)
  • Cloud or OnPrem? (Kubeadm for on prem | Kops for AWS | AKS for Azure)
  • Workloads
    • How many?
    • What kind?
      • Web
      • Big Data/Analytics
    • Application Resource requirements
      • CPU Intensive
      • Memory Intensive
    • Traffic
      • Heavy traffic
      • Burst traffic
    • Storage

On large clusters we can separate the ETCD components from the master nodes.

Choosing Kubernetes Infrastructure

Solutions available to get started easily on a local machine:

  • Minikube: Single-node cluster; uses VirtualBox or other virtualization tools to create the virtual machines that run the K8S cluster. DEPLOYS VMS
  • Kubeadm: Deploys a single-node or multi-node cluster, but we need to provision the required hosts with supported configurations ourselves. REQUIRES VMS TO BE READY

For production:

  • Turnkey solutions: We provision the required VMs and use some kind of tool or script to configure the K8S cluster
    • We provision the VMs, configure them, use scripts to deploy the cluster, and maintain the VMs. E.g. K8S on AWS using kOps
    • Other turnkey solutions: OpenShift, Cloud Foundry Container Runtime, VMware Cloud PKS, Vagrant
  • Hosted/managed solutions:
    • KaaS: the provider provisions the VMs, installs K8S, and maintains the VMs. E.g. EKS
    • Other hosted solutions: GKE, OpenShift Online, AKS

Configure High Availability

What happens if we lose the master node? As long as the workers are up and the containers are alive, our apps keep running. But if a POD crashes, there is no replication controller any more to recreate it, and we can't access the cluster using kubectl because the kube-apiserver is down. For these reasons, it is better to have a multi-master environment for production.

Remember, the master node runs ETCD, the API server, the Controller Manager and the Scheduler. If we add another master node, we need to duplicate all these components.

  • Masternode1:
    • Api Server + ETCD + Controller Manager + Scheduler
  • Masternode2:
    • Api Server + ETCD + Controller Manager + Scheduler

Note: The API servers on all cluster nodes can be alive and running at the same time in an active active mode.

When we have 2 masters (https://master1:6443 and https://master2:6443) we can send a request using kubectl to either one of them, but we shouldn't be sending the same request to both.

It's better to have a load balancer in front of the master nodes (https://load-balancer:6443) that splits traffic between the API servers, and point kubectl at that load balancer. We can use Nginx, HAProxy, or any other load balancer.

The controller manager and the scheduler need to run in an active/standby mode (otherwise they would duplicate each other's work).

The kube-controller-manager instances decide which one is active and which one is passive through a leader election process: the active instance acquires and holds the lock (lease).

kube-controller-manager --leader-elect true

Note: By default is set to true

Both processes try to acquire the lock every 2 seconds (the leader-elect-retry-period option). That way, if the current leader crashes, the other process can acquire the lock and become the new leader.
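
On recent clusters you can see who currently holds the lock by looking at the Lease objects in kube-system (older versions stored this in Endpoints annotations):

kubectl -n kube-system get lease kube-controller-manager -o yaml | grep holderIdentity
kubectl -n kube-system get lease kube-scheduler -o yaml | grep holderIdentity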

ETCD:

With ETCD we can configure two topologies in Kubernetes

  • Stacked topology: ETCD runs on the K8S master nodes. Easier to set up and manage, and requires fewer nodes, but riskier during failures. Minimum 2 servers: 2 control-plane nodes, each running its own ETCD member.

  • External ETCD topology: ETCD runs on nodes separate from the control plane. Less risky, but harder to set up and requires more servers.

    Minimum 4 servers: 2 control-plane nodes + 2 ETCD nodes.

Note: 2379 is the ETCD port to which all control plane components connect. 2380 is only used for etcd peer-to-peer connectivity, which matters when there are multiple ETCD members.

ETCD in HA

ETCD is distributed, so it's possible to spread our datastore across multiple servers, all maintaining an identical copy of the database. ETCD ensures that the same data is available at the same time on all servers (consistency). One of the instances is responsible for processing writes: one node becomes the leader and the other nodes become followers. The leader ensures that the other nodes receive a copy of the data, so a write is only considered complete once the leader gets consent from a majority of the members (quorum = N/2 + 1). That is why, with ETCD, the minimum number of nodes for HA is 3: it tolerates the loss of one node.
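
To check the members and their health from a control-plane node, something like the following works (the certificate paths assume a kubeadm stacked-etcd setup):

ETCDCTL_API=3 etcdctl \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key \
  member list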

Install “Kubernetes the Kubeadm way”

Deployment with Kubeadm

With Kubeadm we can bootstrap the K8S cluster easily

Steps:

  • Have multiple servers, for example 3, one master node, and two worker node
  • Install Container runtime on all hosts (Docker)
  • Install Kubeadm on all nodes
  • Initialize the master server (Install and configure all components)
  • Configure the POD network (the network used for communication between the master and the workers)
  • Join the worker nodes to the master

Exercises:

  • Install the kubeadm package on the controlplane and node01. Use the exact version of 1.21.0-0

Answer:

Letting iptables see bridged traffic

cat <<EOF | sudo tee /etc/modules-load.d/k8s.conf
br_netfilter
EOF

cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF
sudo sysctl --system

Install kubeadm, kubectl and kubelet on all nodes:

sudo apt-get update
sudo apt-get install -y apt-transport-https ca-certificates curl
sudo curl -fsSLo /usr/share/keyrings/kubernetes-archive-keyring.gpg https://packages.cloud.google.com/apt/doc/apt-key.gpg
echo "deb [signed-by=/usr/share/keyrings/kubernetes-archive-keyring.gpg] https://apt.kubernetes.io/ kubernetes-xenial main" | sudo tee /etc/apt/sources.list.d/kubernetes.list
sudo apt-get update
sudo apt-get install -y kubelet kubeadm kubectl
sudo apt-mark hold kubelet kubeadm kubectl

Note: From the doc https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/install-kubeadm/

  • Create the Cluster with Kubeadm Initialize Control Plane Node (Master Node). Use the following options:

    1. apiserver-advertise-address - Use the IP address allocated to eth0 on the controlplane node
    2. apiserver-cert-extra-sans - Set it to controlplane
    3. pod-network-cidr - Set to 10.244.0.0/16

    Once done, set up the default kubeconfig file and wait for node to be part of the cluster.

Answer:

kubeadm init --apiserver-cert-extra-sans=controlplane --apiserver-advertise-address 10.29.224.9 --pod-network-cidr=10.244.0.0/16

Note: Use the IP address allocated to eth0 on the controlplane node (check it with ifconfig eth0); it differs in every lab environment.
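
Then set up the default kubeconfig as suggested by the kubeadm init output (standard steps from the kubeadm docs):

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config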

  • Join a node to the cluster

    Generate Token

    kubeadm token create --print-join-command
    

    The last command generates the next command

    root@node01:~# kubeadm join 10.2.223.3:6443 --token 50pj4l.0cy7m2e1jlfmvnif --discovery-token-ca-cert-hash sha256:fb08c01c782ef1d1ad0b643b56c9edd6a864b87cff56e7ff35713cd666659ff4
    
  • Install a Network Plugin (Flannel)

    First enable to pass bridged IPV4 to iptables

    sysctl net.bridge.bridge-nf-call-iptables=1
    

    Install flannel

    kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
    

    Check the status ready on the cluster (NotReady -> Ready)

    watch kubectl get nodes
    

    https://kubernetes.io/docs/concepts/cluster-administration/addons/#networking-and-network-policy

Troubleshooting

4 types of troubleshooting on k8s for CKA:

  • Applications failures

    • Check Accessibility (services, selectors, tags, curl)

    • Check POD (number restarts, describe, logs)

      • View the logs of the previous pod (if fails)
      kubectl logs web -f --previous
      
    • Check the service targetPort, the ServiceSelector, the serviceName
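
      For example, to compare a Service with the pods it should select (the service name web-service is illustrative):

      kubectl describe svc web-service    # check Selector and TargetPort
      kubectl get pods --show-labels      # pod labels must match the selector
      kubectl get endpoints web-service   # no endpoints usually means a selector mismatch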

  • Control-plane failures

    • Check status of the nodes

      kubectl get nodes
      
    • Check Controlplane pods (check if running)

      kubectl get pods -n kube-system
      
    • Check Controlplane services

      service kube-apiserver status
      service kube-controller-manager status
      service kube-scheduler status
      

      On the worker nodes

      service kubelet status
      service kube-proxy status
      
    • Check Service logs

      kubectl logs kube-apiserver-master -n kube-system
      sudo journalctl -u kube-apiserver
      
    • Extra:

      • Check Kube components on static pods:

        # grep KUBELET_CONFIG_ARGS= /etc/systemd/system/kubelet.service.d/10-kubeadm.conf
        Environment="KUBELET_CONFIG_ARGS=--config=/var/lib/kubelet/config.yaml"
        # grep static /var/lib/kubelet/config.yaml
        staticPodPath: /etc/kubernetes/manifests
        
  • Worker-nodes failures

    • Check Node Status (ready or notready)

      kubectl get nodes
      kubectl describe node worker1
      

      Check: OutOfDisk, MemoryPressure, DiskPressure, PIDPressure

      Note: When a node can't communicate with the master, its status is "Unknown". Check memory and disk on the node, the status of the kubelet and its services, and the certificates (not expired, issued by the right CA, and part of the right group)

      service kubelet status
      sudo journalctl -u kubelet
      openssl x509 -in /var/lib/kubelet/worker-1.crt -text
      
    • Step 1: check the status of the services on the node. Step 2: check the service logs using journalctl -u kubelet. Step 3: if a service is stopped, start it, e.g. run the command: ssh node01 "service kubelet start"

    • Check the /etc/kubernetes/kubelet.conf file on the nodes.

      A common issue here is the kubelet trying to connect to the API server on the controlplane node on the wrong port (e.g. 6553 instead of the default 6443).

  • Network issues

    • Network Plugin in kubernetes, Install Weave Net:

      kubectl apply -f "https://cloud.weave.works/k8s/net?k8s-version=$(kubectl version | base64 | tr -d '\n')"
      

      https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/high-availability/ Note: In CKA and CKAD exam, you won’t be asked to install the cni plugin.

    • Troubleshooting issues related to coreDNS

      • If you find the CoreDNS pods in Pending state, first check that a network plugin is installed.
      • If the CoreDNS pods are in CrashLoopBackOff or Error state, check their logs and events.
  • Kube-proxy

    1. Check kube-proxy pod in the kube-system namespace is running.

    2. Check kube-proxy logs.

    3. Check configmap is correctly defined and the config file for running kube-proxy binary is correct.

    4. Check that the kubeconfig is correctly defined in the ConfigMap.

    5. Check kube-proxy is running inside the container

      # netstat -plan | grep kube-proxy
      tcp        0      0 0.0.0.0:30081           0.0.0.0:*               LISTEN      1/kube-proxy
      tcp        0      0 127.0.0.1:10249         0.0.0.0:*               LISTEN      1/kube-proxy
      tcp        0      0 172.17.0.12:33706       172.17.0.12:6443        ESTABLISHED 1/kube-proxy
      tcp6       0      0 :::10256                :::*                    LISTEN      1/kube-proxy
      

Advance Kubectl Commands

  • Use JSON PATH query to fetch node names and store them in /opt/outputs/node_names.txt.

    $ kubectl get nodes -o jsonpath='{.items[*].metadata.name}' > /opt/outputs/node_names.txt
    $ cat /opt/outputs/node_names.txt
    controlplane node01
    
  • Use JSON PATH query to retrieve the osImages of all the nodes and store it in a file /opt/outputs/nodes_os.txt.

    The osImages are under the nodeInfo section under status of each node.

    $ kubectl get nodes -o jsonpath='{.items[*].status.nodeInfo.osImage}' > /opt/outputs/nodes_os.txt
    $ cat /opt/outputs/nodes_os.txt
    Ubuntu 18.04.5 LTS Ubuntu 18.04.5 LTS
    
  • A kube-config file is present at /root/my-kube-config. Get the user names from it and store it in a file /opt/outputs/users.txt.

    Use the command kubectl config view --kubeconfig=/root/my-kube-config to view the custom kube-config.

    $ kubectl config view --kubeconfig=/root/my-kube-config -o jsonpath="{.users[*].name}" > /opt/outputs/users.txt
    $ cat /opt/outputs/users.txt
    aws-user dev-user test-user
    
  • A set of Persistent Volumes are available. Sort them based on their capacity and store the result in the file /opt/outputs/storage-capacity-sorted.txt.

    $ kubectl get pv --sort-by=.spec.capacity.storage > /opt/outputs/storage-capacity-sorted.txt
    $ cat /opt/outputs/storage-capacity-sorted.txt
    NAME       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS      CLAIM   STORAGECLASS   REASON   AGE
    pv-log-4   40Mi       RWX            Retain           Available                                   10m
    pv-log-1   100Mi      RWX            Retain           Available                                   10m
    pv-log-2   200Mi      RWX            Retain           Available                                   10m
    pv-log-3   300Mi      RWX            Retain           Available                                   10m
    
  • That was good, but we don’t need all the extra details. Retrieve just the first 2 columns of output and store it in /opt/outputs/pv-and-capacity-sorted.txt.

    The columns should be named NAME and CAPACITY. Use the custom-columns option and remember, it should still be sorted as in the previous question.

    $ kubectl get pv --sort-by=.spec.capacity.storage -o=custom-columns=NAME:.metadata.name,CAPACITY:.spec.capacity.storage > /opt/outputs/pv-and-capacity-sorted.txt
    $ cat /opt/outputs/pv-and-capacity-sorted.txt
    NAME       CAPACITY
    pv-log-4   40Mi
    pv-log-1   100Mi
    pv-log-2   200Mi
    pv-log-3   300Mi
    
