Kubernetes Application Developer Certification CKAD
The Certified Kubernetes Application Developer (CKAD) is a challenging, performance-based exam that requires solving multiple problems from a command line. I studied for and passed the three Kubernetes certifications (CKA/CKAD/CKS), and I want to share useful information to help you prepare for and pass this exam.
Kubernetes Application Developer Certification Notes (CKAD)
The Certified Kubernetes Application Developer or CKAD is a hands-on test and consists of a set of performance-based items (19 problems) to be solved in a command line and is expected to take approximately two (2) hours to complete.
It is a very hard exam, not because of the exercises themselves, but because 19 problems in 2 hours leaves very little time. You need to be fast, and you can't spend too long on any single exercise; if you get stuck, I recommend moving on to the next one.
I took the KodeKloud CKAD course with its practice tests and it is excellent. I studied for about 3 weeks and passed the exam. Either way, I recommend you study a lot; as I said before, it is not an easy exam.
My notes are from the Kubernetes official documentation and a little part from the KodeKloud Course.
Table of contents
Global Tips
Get all resources in all namespaces
kubectl get all --all-namespaces
Explain Command
Use the explain command; it is very useful to check before going to the web documentation.
For example, if you want to view the full pod specification (including the secret configuration in a pod):
kubectl explain pods --recursive | less
Shortcuts / Aliases
- po = PODs
- rs = ReplicaSets
- deploy = Deployments
- svc = Services
- ns = Namespaces
- netpol = Network Policies
- pv = Persistent Volumes
- pvc = Persistent Volume Claims
- sa = Service Accounts
Kubectl Autocomplete and Alias
Configure the Kubectl autocomplete and the alias k=kubectl
source <(kubectl completion bash) # setup autocomplete in bash into the current shell, bash-completion package should be installed first.
echo "source <(kubectl completion bash)" >> ~/.bashrc # add autocomplete permanently to your bash shell.
You can also use a shorthand alias for kubectl
that also works with completion:
alias k=kubectl
complete -F __start_kubectl k
https://kubernetes.io/docs/reference/kubectl/cheatsheet/
- You can use your bookmarks from kubernetes.io/docs. I recommend you prepare them before the exam; they are very useful.
Core Concepts
PODS
Pods are basic building blocks of any application running in Kubernetes. A Pod consists of one or more containers and a set of resources shared by those containers. All containers managed by a Kubernetes cluster are part of a pod.
Create a Nginx Pod
Deploy a pod named nginx-pod using the nginx:alpine image.
kubectl run nginx-pod --image=nginx:alpine
Deploy a redis pod using the redis:alpine image with the label tier=db, and expose it on container port 8080.
kubectl run redis --image=redis:alpine -l tier=db --port 8080
Note: The READY column in the output of the kubectl get pods command indicates "Running Containers/Total Containers".
Execute a command in a container example:
kubectl exec webapp -- cat /log/app.log
Editing Existing Pods
If you are asked to edit an existing POD, please note the following:
- If you are given a pod definition file, edit that file and use it to create a new pod.
- If you are not given a pod definition file, you may extract the definition to a file using the command below:
kubectl get pod <pod-name> -o yaml > pod-definition.yaml
Then edit the file to make the necessary changes, and delete and re-create the pod.
- Use the kubectl edit pod <pod-name> command to edit pod properties.
ReplicaSets
A ReplicaSet’s purpose is to maintain a stable set of replica Pods running at any given time. As such, it is often used to guarantee the availability of a specified number of identical Pods
How a ReplicaSet works
A ReplicaSet is defined with fields, including a selector that specifies how to identify Pods it can acquire, a number of replicas indicating how many Pods it should be maintaining, and a pod template specifying the data of new Pods it should create to meet the number of replicas criteria. A ReplicaSet then fulfills its purpose by creating and deleting Pods as needed to reach the desired number. When a ReplicaSet needs to create new Pods, it uses its Pod template.
A ReplicaSet is linked to its Pods via the Pods’ metadata.ownerReferences field, which specifies what resource the current object is owned by. All Pods acquired by a ReplicaSet have their owning ReplicaSet’s identifying information within their ownerReferences field. It’s through this link that the ReplicaSet knows of the state of the Pods it is maintaining and plans accordingly.
A ReplicaSet identifies new Pods to acquire by using its selector. If there is a Pod that has no OwnerReference or the OwnerReference is not a Controller and it matches a ReplicaSet’s selector, it will be immediately acquired by said ReplicaSet.
When to use a ReplicaSet
A ReplicaSet ensures that a specified number of pod replicas are running at any given time. However, a Deployment is a higher-level concept that manages ReplicaSets and provides declarative updates to Pods along with a lot of other useful features. Therefore, we recommend using Deployments instead of directly using ReplicaSets, unless you require custom update orchestration or don’t require updates at all.
This actually means that you may never need to manipulate ReplicaSet objects: use a Deployment instead, and define your application in the spec section
ReplicaSets VS Replication Controller
Replica Set and Replication Controller do almost the same thing. Both of them ensure that a specified number of pod replicas are running at any given time. The difference comes with the usage of selectors to replicate pods. Replica Set use Set-Based selectors while replication controllers use Equity-Based selectors. ReplicaSets is the Replication Controller replacement
Example:
apiVersion: apps/v1
kind: ReplicaSet
metadata:
name: frontend
labels:
app: guestbook
tier: frontend
spec:
replicas: 3
selector:
matchLabels:
tier: frontend
template:
metadata:
labels:
tier: frontend
spec:
containers:
- name: php-redis
image: gcr.io/google_samples/gb-frontend:v3
Commands
kubectl create -f replicaset-definition.yml
kubectl get replicaset
kubectl delete replicaset myapp-replicaset
kubectl replace -f replicaset-definition.yml
kubectl scale --replicas=6 -f replicaset-definition.yml
kubectl scale --replicas=5 rs new-replica-set
Deployments
A Deployment provides declarative updates for Pods and ReplicaSets.
You describe a desired state in a Deployment, and the Deployment Controller changes the actual state to the desired state at a controlled rate. You can define Deployments to create new ReplicaSets, or to remove existing Deployments and adopt all their resources with new Deployments.
Example:
apiVersion: apps/v1
kind: Deployment
metadata:
name: nginx-deployment
labels:
app: nginx
spec:
replicas: 3
selector:
matchLabels:
app: nginx
template:
metadata:
labels:
app: nginx
spec:
containers:
- name: nginx
image: nginx:1.14.2
ports:
- containerPort: 80
Use case
- Create a Deployment to rollout a ReplicaSet. The ReplicaSet creates Pods in the background. Check the status of the rollout to see if it succeeds or not.
- Declare the new state of the Pods by updating the PodTemplateSpec of the Deployment. A new ReplicaSet is created and the Deployment manages moving the Pods from the old ReplicaSet to the new one at a controlled rate. Each new ReplicaSet updates the revision of the Deployment.
- Rollback to an earlier Deployment revision if the current state of the Deployment is not stable. Each rollback updates the revision of the Deployment.
- Scale up the Deployment to facilitate more load.
- Pause the Deployment to apply multiple fixes to its PodTemplateSpec and then resume it to start a new rollout.
- Use the status of the Deployment as an indicator that a rollout has stuck.
- Clean up older ReplicaSets that you don’t need anymore.
Namespaces
Kubernetes supports multiple virtual clusters backed by the same physical cluster. These virtual clusters are called namespaces.
When to Use Multiple Namespaces
Namespaces are intended for use in environments with many users spread across multiple teams, or projects. For clusters with a few to tens of users, you should not need to create or think about namespaces at all. Start using namespaces when you need the features they provide.
Namespaces provide a scope for names. Names of resources need to be unique within a namespace, but not across namespaces. Namespaces cannot be nested inside one another and each Kubernetes resource can only be in one namespace.
Namespaces are a way to divide cluster resources between multiple users (via resource quota).
In future versions of Kubernetes, objects in the same namespace will have the same access control policies by default.
It is not necessary to use multiple namespaces just to separate slightly different resources, such as different versions of the same software: use labels to distinguish resources within the same namespace.
Note: Avoid creating namespaces with the kube- prefix, since it is reserved for Kubernetes system namespaces.
Kubernetes starts with four initial namespaces:
- default: The default namespace for objects with no other namespace.
- kube-system: The namespace for objects created by the Kubernetes system.
- kube-public: This namespace is created automatically and is readable by all users (including those not authenticated). It is mostly reserved for cluster usage, in case some resources should be visible and readable publicly throughout the whole cluster. The public aspect of this namespace is only a convention, not a requirement.
- kube-node-lease: This namespace holds the Lease objects associated with each node, which improves the performance of the node heartbeats as the cluster scales.
Commands
kubectl get namespace
kubectl create namespace production
kubectl run redis --image=redis -n production
kubectl get pods -n production
kubectl get pods --all-namespaces
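A namespace can also be created declaratively; a minimal sketch (the file name is an assumption):
namespace-definition.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: production
kubectl create -f namespace-definition.yaml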
Certification Tip: Imperative Commands
While you would be working mostly the declarative way - using definition files, imperative commands can help in getting one time tasks done quickly, as well as generate a definition template easily. This would help save a considerable amount of time during your exams.
Before we begin, familiarize yourself with two options that can come in handy while working with the commands below:
- --dry-run: By default, as soon as the command is run the resource is created. If you simply want to test your command, use the --dry-run=client option. This will not create the resource; instead, it tells you whether the resource can be created and whether your command is right.
- -o yaml: This outputs the resource definition in YAML format on the screen.
Use the above two in combination to generate a resource definition file quickly, that you can then modify and create resources as required, instead of creating the files from scratch.
POD
Create an NGINX Pod
kubectl run nginx --image=nginx
Generate a POD manifest YAML file (-o yaml) without creating it (--dry-run=client)
kubectl run nginx --image=nginx --dry-run=client -o yaml
Deployment
Create a deployment
kubectl create deployment --image=nginx nginx
Generate a Deployment YAML file (-o yaml) without creating it (--dry-run=client)
kubectl create deployment --image=nginx nginx --dry-run=client -o yaml
Note: Older versions of kubectl create deployment do not have a --replicas option (newer versions do). You could first create the deployment and then scale it using the kubectl scale command.
Example: Deployment named webapp
using the image kodekloud/webapp-color
with 3
replicas
kubectl create deployment webapp --image=kodekloud/webapp-color
kubectl scale deployment/webapp --replicas=3
Save it to a file - (If you need to modify or add some other details)
kubectl create deployment --image=nginx nginx --dry-run=client -o yaml > nginx-deployment.yaml
You can then update the YAML file with the replicas or any other field before creating the deployment.
Service
Create a Service named redis-service of type ClusterIP to expose pod redis on port 6379
kubectl expose pod redis --port=6379 --name redis-service --dry-run=client -o yaml
(This will automatically use the pod’s labels as selectors)
Or
kubectl create service clusterip redis --tcp=6379:6379 --dry-run=client -o yaml
(This will not use the pod's labels as selectors; instead, it will assume the selector is app=redis. You cannot pass in selectors as an option, so it does not work very well if your pod has a different label set. Generate the file and modify the selectors before creating the service.)
Create a Service named nginx of type NodePort to expose pod nginx’s port 80 on port 30080 on the nodes:
kubectl expose pod nginx --port=80 --name nginx-service --type=NodePort --dry-run=client -o yaml
(This will automatically use the pod’s labels as selectors, but you cannot specify the node port. You have to generate a definition file and then add the node port in manually before creating the service with the pod.)
Or
kubectl create service nodeport nginx --tcp=80:80 --node-port=30080 --dry-run=client -o yaml
(This will not use the pod's labels as selectors)
Both of the above commands have their own challenges. One of them cannot accept a selector, and the other cannot accept a node port. I would recommend going with the kubectl expose command. If you need to specify a node port, generate a definition file using the same command and manually add the nodePort before creating the service.
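For reference, a sketch of what the generated definition could look like after manually adding the nodePort (the selector run: nginx is an assumption; use whatever labels your pod actually has):
apiVersion: v1
kind: Service
metadata:
  name: nginx-service
spec:
  type: NodePort
  selector:
    run: nginx
  ports:
  - port: 80
    targetPort: 80
    nodePort: 30080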
Reference:
https://kubernetes.io/docs/reference/kubectl/conventions/
Certification Tip: Formatting Output with kubectl
The default output format for all kubectl commands is the human-readable plain-text format.
The -o flag allows us to output the details in several different formats.
kubectl [command] [TYPE] [NAME] -o <output_format>
Here are some of the commonly used formats:
- -o json: Output a JSON formatted API object.
- -o name: Print only the resource name and nothing else.
- -o wide: Output in the plain-text format with any additional information.
- -o yaml: Output a YAML formatted API object.
Here are some useful examples:
- Output with JSON format:
master $ kubectl create namespace test-123 --dry-run=client -o json
{ "kind": "Namespace",
"apiVersion": "v1",
"metadata":{
"name": "test-123",
"creationTimestamp": null
},
"spec": {},
"status": {}
}
master $
- Output with YAML format:
master $ kubectl create namespace test-123 --dry-run=client -o yaml
apiVersion: v1
kind: Namespace
metadata:
creationTimestamp: null
name: test-123
spec: {}
status: {}
- Output with wide (additional details):
Probably the most common format used to print additional details about the object:
master $ kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
busybox 1/1 Running 0 3m39s 10.36.0.2 node01 <none> <none>
ningx 1/1 Running 0 7m32s 10.44.0.1 node03 <none> <none>
redis 1/1 Running 0 3m59s 10.36.0.1 node01 <none> <none>
master $
For more details, refer:
https://kubernetes.io/docs/reference/kubectl/overview
https://kubernetes.io/docs/reference/kubectl/cheatsheet
Configuration
Editing PODs and Deployments
Edit a POD
Remember, you CANNOT edit specifications of an existing POD other than the below.
- spec.containers[*].image
- spec.initContainers[*].image
- spec.activeDeadlineSeconds
- spec.tolerations
For example you cannot edit the environment variables, service accounts, resource limits (all of which we will discuss later) of a running pod. But if you really want to, you have 2 options:
- Run the
kubectl edit pod <pod name>
command. This will open the pod specification in an editor (vi editor). Then edit the required properties. When you try to save it, you will be denied. This is because you are attempting to edit a field on the pod that is not editable.
A copy of the file with your changes is saved to a temporary location (the path is shown in the error message).
You can then delete the existing pod by running the command:
kubectl delete pod webapp
Then create a new pod with your changes using the temporary file
kubectl create -f /tmp/kubectl-edit-ccvrq.yaml
The second option is to extract the pod definition in YAML format to a file using the command
kubectl get pod webapp -o yaml > my-new-pod.yaml
Then make the changes to the exported file using an editor (vi editor). Save the changes
vi my-new-pod.yaml
Then delete the existing pod
kubectl delete pod webapp
Then create a new pod with the edited file
kubectl create -f my-new-pod.yaml
Edit Deployments
With Deployments you can easily edit any field/property of the POD template. Since the pod template is a child of the deployment specification, with every change the deployment will automatically delete and create a new pod with the new changes. So if you are asked to edit a property of a POD that is part of a deployment, you may do that simply by running the command:
kubectl edit deployment my-deployment
Environments
In the K8s space, there are 4 ways environment variables can be set for Pods. These are namely:
- Using string literals
- From ConfigMaps
- From Secrets
- From Pod configuration
ConfigMaps
A ConfigMap is an API object used to store non-confidential data in key-value pairs. Pods can consume ConfigMaps as environment variables, command-line arguments, or as configuration files in a volume.
A ConfigMap allows you to decouple environment-specific configuration from your container images, so that your applications are easily portable.
Using ConfigMaps
ConfigMaps can be mounted as data volumes. ConfigMaps can also be used by other parts of the system, without being directly exposed to the Pod. For example, ConfigMaps can hold data that other parts of the system should use for configuration.
- Create a ConfigMap the imperative way:
kubectl create configmap app-config --from-literal=APP_COLOR=blue
Where:
  - app-config is the ConfigMap name
  - APP_COLOR is the key
  - blue is the value
Another way, from a file:
kubectl create configmap app-config --from-file=appconfig.properties
- Create a ConfigMap the declarative way:
kubectl create -f configmap.yml
configmap.yml
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  APP_COLOR: blue
  APP_MODE: prod
View configmaps
kubectl get configmaps
kubectl describe configmap
ConfigMap in Pods
pod-definition.yaml
apiVersion: v1
kind: Pod
metadata:
  name: simple-webapp-color
  labels:
    name: simple-webapp-color
spec:
  containers:
  - name: simple-webapp-color
    image: simple-webapp-color
    ports:
    - containerPort: 8080
    envFrom:
    - configMapRef:
        name: app-config
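To inject only a single key from the ConfigMap instead of the whole map, env with valueFrom can be used; a minimal sketch reusing the app-config ConfigMap and the APP_COLOR key created above:
env:
- name: APP_COLOR
  valueFrom:
    configMapKeyRef:
      name: app-config
      key: APP_COLOR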
Practice:
- Create configmap from literal
- Create a pod using a configmap
Secrets
Kubernetes Secrets let you store and manage sensitive information, such as passwords, OAuth tokens, and ssh keys. Storing confidential information in a Secret is safer and more flexible than putting it verbatim in a Pod definition or in a container image
Create Secrets
- Imperative:
kubectl create secret generic \
app-secret --from-literal=DB_Host=mysql \
--from-literal=DB_User=root
or we can use a file:
kubectl create secret generic \
app-secret --from-file=app_secret.properties
- Declarative:
kubectl create -f secret-data.yaml
secret-data.yaml:
apiVersion: v1
kind: Secret
metadata:
  name: app-secret
data:
  DB_Host: bXlzcWw=
  DB_User: cm9vdA==
  DB_Password: cGFzd3Jk
To encode the data, for example:
echo -n 'mysql' | base64
To decode the data:
echo -n 'bXlzcWw=' | base64 --decode
View Secrets
kubectl get secrets
kubectl describe secrets
To view the values:
kubectl get secret app-secret -o yaml
Secrets in Pods
apiVersion: v1
kind: Pod
metadata:
  name: simple-webapp-color
  labels:
    name: simple-webapp-color
spec:
  containers:
  - name: simple-webapp-color
    image: simple-webapp-color
    ports:
    - containerPort: 8080
    envFrom:
    - secretRef:
        name: app-secret
When a Secret is mounted in a pod as a volume, each key becomes a file inside the container:
ls /opt/app-secret-volumes
DB_Host DB_Password DB_User
cat /opt/app-secret-volumes/DB_Password
paswrd
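For reference, a minimal sketch of mounting the secret as a volume (the volume name is an assumption; the mount path matches the listing above):
spec:
  containers:
  - name: simple-webapp-color
    image: simple-webapp-color
    volumeMounts:
    - name: app-secret-volume
      mountPath: /opt/app-secret-volumes
      readOnly: true
  volumes:
  - name: app-secret-volume
    secret:
      secretName: app-secret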
A note about Secrets
Remember that secrets encode data in base64 format. Anyone with the base64 encoded secret can easily decode it. As such the secrets can be considered as not very safe.
The concept of safety of the Secrets is a bit confusing in Kubernetes. The kubernetes documentation page and a lot of blogs out there refer to secrets as a “safer option” to store sensitive data. They are safer than storing in plain text as they reduce the risk of accidentally exposing passwords and other sensitive data. In my opinion it’s not the secret itself that is safe, it is the practices around it.
Secrets are not encrypted, so it is not safer in that sense. However, some best practices around using secrets make it safer. As in best practices like:
- Not checking-in secret object definition files to source code repositories.
- Enabling Encryption at Rest for Secrets so they are stored encrypted in ETCD.
Also the way kubernetes handles secrets. Such as:
- A secret is only sent to a node if a pod on that node requires it.
- Kubelet stores the secret into a tmpfs so that the secret is not written to disk storage.
- Once the Pod that depends on the secret is deleted, kubelet will delete its local copy of the secret data as well.
Read about the protections and risks of using Secrets in the Kubernetes documentation.
Having said that, there are other better ways of handling sensitive data like passwords in Kubernetes, such as using tools like Helm Secrets, HashiCorp Vault.
Practice:
- Create a new secret with 3 variables
- Create a pod with a secret
Security Context
In Docker we can do the following:
- Run a container with a specific user ID:
docker run --user=1001 ubuntu sleep 3600
- Add capabilities:
docker run --cap-add MAC_ADMIN ubuntu
In Kubernetes it is similar.
At the Pod level:
apiVersion: v1
kind: Pod
metadata:
name: web-pod
spec:
securityContext:
runAsUser: 1000
containers:
- name: ubuntu
image: ubuntu
command: ["sleep", "3600"]
At the container level, with capabilities:
...
containers:
- name: ubuntu
image: ubuntu
command: ["sleep", "3600"]
securityContext:
runAsUser: 1000
capabilities:
add: ["MAC_ADMIN"]
Note: Capabilities are only supported at the container level and not at the POD level
Practice:
- Edit a Pod and change the process to use another user with ID 1010
- Add the SYS_TIME capability and change the date (a sketch of these changes follows below).
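A sketch of what those changes could look like in the pod spec (the container name and image are assumptions; the user ID and capability come from the exercise):
spec:
  securityContext:
    runAsUser: 1010
  containers:
  - name: ubuntu
    image: ubuntu
    command: ["sleep", "3600"]
    securityContext:
      capabilities:
        add: ["SYS_TIME"]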
Service Account
A service account provides an identity for processes that run in a Pod.
When you (a human) access the cluster (for example, using kubectl
), you are authenticated by the apiserver as a particular User Account (currently this is usually admin
, unless your cluster administrator has customized your cluster). Processes in containers inside pods can also contact the apiserver. When they do, they are authenticated as a particular Service Account (for example, default
).
Create a Service Account
kubectl create serviceaccount dashboard-sa
View SA:
kubectl get serviceaccount
View SA Token:
$ kubectl describe serviceaccount dashboard-sa
Name: dashboard-sa
Namespace: default
...
Tokens: dashboard-sa-token-kddbm
When a service account is created, Kubernetes first creates the ServiceAccount object, then generates a token for it, and finally creates a Secret object that holds the token and links it to the service account.
Secret:
token:
aosvebpeh.gsxcuqptmeszxbp...
To view the token we need to run:
$ kubectl describe secret dashboard-sa-token-kddbm
...
namespace: default
token:
aosvebpeh.gsxcuqptmeszxbp...
This token can then be used as an authentication bearer token while making your REST call to the Kubernetes API.
For example, using curl you could provide the bearer token as an Authorization header while making a REST call to the Kubernetes API, as a custom dashboard application would.
View roles and rolebindings from SA
kubectl get roles,rolebindings
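To make a pod use a specific service account, set serviceAccountName in the pod spec; a minimal sketch using the dashboard-sa account created above (the pod name and image are assumptions):
apiVersion: v1
kind: Pod
metadata:
  name: my-kubernetes-dashboard
spec:
  serviceAccountName: dashboard-sa
  containers:
  - name: my-kubernetes-dashboard
    image: my-kubernetes-dashboard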
Default Service Account
For every namespace in Kubernetes, a service account named "default" is automatically created. Each namespace has its own default service account. Whenever a pod is created, the default service account and its token are automatically mounted into that pod as a volume.
For example we have a simple pod definition file that creates a pod using my custom Kubernetes dashboard image.
We haven’t specified any secrets or Volume mounts in the definition file.
However when the pod is created if you look at the details of the pod by running the kubectl describe pod command you’ll see that a volume is automatically created from the secret named “default-token” which is in fact the secret containing the token for this default service account.
The secret token is mounted at /var/run/secrets/kubernetes.io/serviceaccount inside the pod.
Remember that the default service account is very much restricted. It only has permission to run basic Kubernetes API queries.
Note: Remember, you cannot edit the service account of an existing pod. You must delete and recreate the pod. However in case of a deployment you will be able to get the service account as any changes to the pod definition file will automatically trigger a new rollout for the deployment.
Resource Requirements
When you specify a Pod, you can optionally specify how much of each resource a Container needs. The most common resources to specify are CPU and memory (RAM); there are others.
When you specify the resource request for Containers in a Pod, the scheduler uses this information to decide which node to place the Pod on. When you specify a resource limit for a Container, the kubelet enforces those limits so that the running container is not allowed to use more of that resource than the limit you set. The kubelet also reserves at least the request amount of that system resource specifically for that container to use.
Requests and limits
If the node where a Pod is running has enough of a resource available, it’s possible (and allowed) for a container to use more resource than its request
for that resource specifies. However, a container is not allowed to use more than its resource limit
.
For example, if you set a memory
request of 256 MiB for a container, and that container is in a Pod scheduled to a Node with 8GiB of memory and no other Pods, then the container can try to use more RAM.
If you set a memory
limit of 4GiB for that Container, the kubelet (and container runtime) enforce the limit. The runtime prevents the container from using more than the configured resource limit. For example: when a process in the container tries to consume more than the allowed amount of memory, the system kernel terminates the process that attempted the allocation, with an out of memory (OOM) error.
Resource Requests
pod-definition.yaml
apiVersion: v1
kind: Pod
metadata:
name: simple-webapp-color
labels:
name: simple-webapp-color
spec:
containers:
- name: simple-webapp-color
image: simple-webapp-color
ports:
- containerPort: 8080
resources:
requests:
memory: "1Gi"
cpu: 1
CPU units
The CPU resource is measured in CPU units. One CPU, in Kubernetes, is equivalent to:
- 1 AWS vCPU
- 1 GCP Core
- 1 Azure vCore
- 1 Hyperthread on a bare-metal Intel processor with Hyperthreading
Fractional values are allowed. A Container that requests 0.5 CPU is guaranteed half as much CPU as a Container that requests 1 CPU. You can use the suffix m to mean milli. For example 100m CPU, 100 milliCPU, and 0.1 CPU are all the same. Precision finer than 1m is not allowed.
CPU is always requested as an absolute quantity, never as a relative quantity; 0.1 is the same amount of CPU on a single-core, dual-core, or 48-core machine.
Resource Limits
pod-definition.yaml
...
spec:
containers:
- name: simple-webapp-color
image: simple-webapp-color
ports:
- containerPort: 8080
resources:
requests:
memory: "1Gi"
cpu: 1
limits:
memory: "2Gi"
cpu: 2
Exceed Limits:
When a pod tries to exceed its specified limits:
- CPU: Kubernetes throttles the CPU, so the container cannot use more CPU than its limit.
- Memory: a container can temporarily use more memory than its limit, but if it constantly tries to consume more memory than its limit, the pod will be terminated (OOMKilled).
Default resource requirements and limits
In the previous lecture, I said - “When a pod is created the containers are assigned a default CPU request of .5 and memory of 256Mi”. For the POD to pick up those defaults you must have first set those as default values for request and limit by creating a LimitRange in that namespace.
apiVersion: v1
kind: LimitRange
metadata:
  name: mem-limit-range
spec:
  limits:
  - default:
      memory: 512Mi
    defaultRequest:
      memory: 256Mi
    type: Container
https://kubernetes.io/docs/tasks/administer-cluster/manage-resources/memory-default-namespace/
apiVersion: v1
kind: LimitRange
metadata:
  name: cpu-limit-range
spec:
  limits:
  - default:
      cpu: 1
    defaultRequest:
      cpu: 0.5
    type: Container
https://kubernetes.io/docs/tasks/administer-cluster/manage-resources/cpu-default-namespace/
References:
https://kubernetes.io/docs/tasks/configure-pod-container/assign-memory-resource
Taints and tolerations
Node affinity is a property of Pods that attracts them to a set of nodes (either as a preference or a hard requirement). Taints are the opposite: they allow a node to repel a set of pods.
Tolerations are applied to pods, and allow (but do not require) the pods to schedule onto nodes with matching taints.
Taints and tolerations work together to ensure that pods are not scheduled onto inappropriate nodes. One or more taints are applied to a node; this marks that the node should not accept any pods that do not tolerate the taints.
Note: Taints are set on nodes, and tolerations are set on pods
Taint nodes
The taint-effect defines what would happen to the pods if they do not tolerate the taint, there are three main effects:
- NoSchedule: the pods will not be scheduled on the node.
- PreferNoSchedule: the system will try to avoid placing the pod on the node, but that is not guaranteed.
- NoExecute: pods will not be scheduled on the node, and existing pods on the node, if any, will be evicted if they do not tolerate the taint.
kubectl taint nodes node-name key=value:taint-effect
Example:
kubectl taint nodes node1 app=blue:NoSchedule
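To remove this taint later, run the same command with a trailing dash:
kubectl taint nodes node1 app=blue:NoSchedule-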
Tolerations in Pods
pod-definition.yml
...
spec:
  containers:
  - name: nginx-container
    image: nginx
  tolerations:
  - key: "app"
    operator: "Equal"
    value: "blue"
    effect: "NoSchedule"
With this toleration the pod can be scheduled on node1 even though it has the taint.
Note: A taint is set on the master node by default, which prevents any pods from being scheduled there. We can see this taint with:
kubectl describe node kubemaster | grep Taint
Taints: node-role.kubernetes.io/master:NoSchedule
Node Selectors
nodeSelector is the simplest recommended form of node selection constraint. nodeSelector is a field of PodSpec. It specifies a map of key-value pairs. For the pod to be eligible to run on a node, the node must have each of the indicated key-value pairs as labels (it can have additional labels as well). The most common usage is one key-value pair.
pods/pod-nginx.yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx
  labels:
    env: test
spec:
  containers:
  - name: nginx
    image: nginx
    imagePullPolicy: IfNotPresent
  nodeSelector:
    disktype: ssd
When you then run kubectl apply -f https://k8s.io/examples/pods/pod-nginx.yaml, the Pod will get scheduled on the node that you attached the label to. You can verify that it worked by running kubectl get pods -o wide and looking at the "NODE" that the Pod was assigned to.
Label nodes
List the nodes (https://kubernetes.io/docs/concepts/architecture/nodes/) in your cluster, along with their labels:
$ kubectl get nodes --show-labels
NAME STATUS ROLES AGE VERSION LABELS
worker0 Ready <none> 1d v1.13.0 ...,kubernetes.io/hostname=worker0
worker1 Ready <none> 1d v1.13.0 ...,kubernetes.io/hostname=worker1
worker2 Ready <none> 1d v1.13.0 ...,kubernetes.io/hostname=worker2
Choose one of your nodes, and add a label to it:
kubectl label nodes <your-node-name> disktype=ssd
where <your-node-name> is the name of your chosen node.
Example:
kubectl label nodes node-1 size=large
Node Selector Limitations
We can't express complex requirements, such as "Large OR Medium" or "NOT Small"; for something like this we need Node Affinity.
Node Affinity
Node affinity is conceptually similar to nodeSelector
– it allows you to constrain which nodes your pod is eligible to be scheduled on, based on labels on the node.
There are currently two types of node affinity, called requiredDuringSchedulingIgnoredDuringExecution
and preferredDuringSchedulingIgnoredDuringExecution
. You can think of them as “hard” and “soft” respectively, in the sense that the former specifies rules that must be met for a pod to be scheduled onto a node (just like nodeSelector
but using a more expressive syntax), while the latter specifies preferences that the scheduler will try to enforce but will not guarantee. The “IgnoredDuringExecution” part of the names means that, similar to how nodeSelector
works, if labels on a node change at runtime such that the affinity rules on a pod are no longer met, the pod will still continue to run on the node. In the future we plan to offer requiredDuringSchedulingRequiredDuringExecution
which will be just like requiredDuringSchedulingIgnoredDuringExecution
except that it will evict pods from nodes that cease to satisfy the pods’ node affinity requirements.
pods/pod-with-node-affinity.yaml
apiVersion: v1
kind: Pod
metadata:
name: with-node-affinity
spec:
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: kubernetes.io/e2e-az-name
operator: In
values:
- e2e-az1
- e2e-az2
preferredDuringSchedulingIgnoredDuringExecution:
- weight: 1
preference:
matchExpressions:
- key: another-node-label-key
operator: In
values:
- another-node-label-value
containers:
- name: with-node-affinity
image: k8s.gcr.io/pause:2.0
Set Node Affinity on the deployment to place the pods on node01 only:
apiVersion: apps/v1
kind: Deployment
metadata:
name: blue
spec:
replicas: 6
selector:
matchLabels:
run: nginx
template:
metadata:
labels:
run: nginx
spec:
containers:
- image: nginx
imagePullPolicy: Always
name: nginx
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: color
operator: In
values:
- blue
Taints/Tolerations and Node Affinity
First use taints and tolerations to prevent other pods from landing on the colored nodes, and then configure node affinity on our colored pods so they are placed on the matching nodes.
MultiContainer Pods
The primary purpose of a multi-container Pod is to support co-located, co-managed helper processes for a main program. There are some general patterns of using helper processes in Pods
Common Design Patterns
- Sidecar containers "help" the main container. For example, log or data change watchers, monitoring adapters, and so on. A log watcher, for example, can be built once by a different team and reused across different applications. Another example of a sidecar container is a file or data loader that generates data for the main container.
- Proxies, bridges, adapters: connect the main container with the external world. For example, Apache HTTP server or nginx can serve static files and act as a reverse proxy to a web application in the main container to log and limit HTTP requests. Another example is a helper container that re-routes requests from the main container to the external world, so the main container connects to localhost to access, for example, an external database without any service discovery.
- Ambassador: helps connect to different database environments such as Dev, Test and Prod. Your app always connects to a database on localhost, and the ambassador redirects the traffic to the correct environment/database.
While you can host a multi-tier application (such as WordPress) in a single Pod, the recommended way is using separate Pods for each tier. The reason for that is simple: you can scale tiers up independently and distribute them across cluster nodes.
Example create a multi-container pod with 2 containers
apiVersion: v1
kind: Pod
metadata:
name: multi-container-pod
spec:
containers:
- name: container-1
image: nginx
ports:
- containerPort: 80
- name: container-2
image: alpine
command: ["watch", "wget", "-qO-", "localhost"]
Note: Multi-Container Pods share Lifecycle, Network and Storage
Observability
Readiness and Liveness probes
Pod Conditions
- PodScheduled
- Initialized
- ContainersReady
- Ready: the application inside the Pod is ready and running
We can check these values in the Conditions section when we describe a pod:
kubectl describe pod <pod-name>
Readiness Probes
Kubernetes uses readiness probes to decide when the container is available for accepting traffic. The readiness probe is used to control which pods are used as the backends for a service. A pod is considered ready when all of its containers are ready. If a pod is not ready, it is removed from service load balancers. For example, if a container loads a large cache at startup and takes minutes to start, you do not want to send requests to this container until it is ready, or the requests will fail—you want to route requests to other pods, which are capable of servicing requests.
Example: pod-definition.yaml
apiVersion: v1
kind: Pod
metadata:
  name: simple-webapp
  labels:
    name: simple-webapp
spec:
  containers:
  - name: simple-webapp
    image: simple-webapp
    ports:
    - containerPort: 8080
    readinessProbe:
      httpGet:
        path: /api/ready
        port: 8080
Kubernetes first performs a test against the container's /api/ready URL on port 8080 before sending traffic.
Readiness Probes:
- HTTP test - /api/ready
readinessProbe:
  httpGet:
    path: /api/ready
    port: 8080
  initialDelaySeconds: 10
  periodSeconds: 5
  failureThreshold: 8
Note: The last three options are optional; by default the probe gives up after 3 failed attempts.
- TCP test - 3306 (database)
readinessProbe:
  tcpSocket:
    port: 3306
- Exec command
readinessProbe:
  exec:
    command:
    - cat
    - /app/is_ready
Liveness Probes
Kubernetes uses liveness probes to know when to restart a container. If a container is unresponsive—perhaps the application is deadlocked due to a multi-threading defect—restarting the container can make the application more available, despite the defect. It certainly beats paging someone in the middle of the night to restart a container.
Example:
apiVersion: v1
kind: Pod
metadata:
  name: simple-webapp
  labels:
    name: simple-webapp
spec:
  containers:
  - name: simple-webapp
    image: simple-webapp
    ports:
    - containerPort: 8080
    livenessProbe:
      httpGet:
        path: /api/ready
        port: 8080
Similar to readiness probes they offer httpGet, tcpSocket and exec/command.
Container Logging
We can see the Pods logs very similar to docker
kubectl logs -f event-simulator-pod
Note: If we have multiple containers, we need to specify the container name:
kubectl logs -f event-simulator-pod event-simulator
Monitor and Debug Applications
Kubernetes doesn't come with a full-featured built-in monitoring solution; however, we can use different open source solutions such as Metrics Server, Prometheus and the Elastic Stack, or proprietary solutions like Datadog and Dynatrace.
For the CKAD certification we only need to learn about the Metrics Server.
Heapster was the first Kubernetes monitoring project and is now deprecated; a slimmed-down version of it became the Metrics Server.
We can have one Metrics Server per Kubernetes cluster. The Metrics Server retrieves metrics from each of the nodes and pods, aggregates them and stores them in memory.
Note: The Metrics Server is an in-memory monitoring solution only and does not store metrics on disk; as a result, we can't see historical performance data. For that we need a more advanced monitoring solution.
Kubernetes runs an agent on each node known as the kubelet, which is responsible for receiving instructions from the Kubernetes API server and running pods on the node. The kubelet also contains a subcomponent known as cAdvisor (Container Advisor). cAdvisor is responsible for retrieving performance metrics from pods and exposing them through the kubelet API to make the metrics available to the Metrics Server.
To enable the Metrics Server on Minikube:
minikube addons enable metrics-server
For other environments:
git clone https://github.com/kubernetes-sigs/metrics-server
kubectl create -f deploy/1.8/
These commands deploy a set of pods, services and roles that enable the Metrics Server to poll for performance metrics from the nodes in the cluster. Once deployed, give the Metrics Server some time to collect and process data.
It provides the CPU and memory consumption of each of the nodes; we can check this with:
kubectl top node
We can also use a command to view performance metrics of pods in Kubernetes
kubectl top pod
POD Design
Labels, Selectors and Annotations
Labels
Labels are key/value pairs that are attached to objects, such as pods. Labels are intended to be used to specify identifying attributes of objects that are meaningful and relevant to users, but do not directly imply semantics to the core system. Labels can be used to organize and to select subsets of objects. Labels can be attached to objects at creation time and subsequently added and modified at any time. Each object can have a set of key/value labels defined. Each Key must be unique for a given object.
"metadata": {
"labels": {
"key1" : "value1",
"key2" : "value2"
}
}
Labels allow for efficient queries and watches and are ideal for use in UIs and CLIs. Non-identifying information should be recorded using annotations.
pod-definitions.yaml
apiVersion: v1
kind: Pod
metadata:
name: simple-webapp
labels:
app: App1
function: Front-end
spec:
containers:
- name: simple-webapp
image: simple-webapp
ports:
- containerPort: 8080
Label selectors
Unlike names and UIDs, labels do not provide uniqueness. In general, we expect many objects to carry the same label(s).
Via a label selector, the client/user can identify a set of objects. The label selector is the core grouping primitive in Kubernetes.
The API currently supports two types of selectors: equality-based and set-based. A label selector can be made of multiple requirements which are comma-separated. In the case of multiple requirements, all must be satisfied so the comma separator acts as a logical AND (&&
) operator.
The semantics of empty or non-specified selectors are dependent on the context, and API types that use selectors should document the validity and meaning of them.
We can select pods with specified labels
kubectl get pods --selector app=App1
For example, when we use ReplicaSet, we use Labels and Selectors
replicaset-definition.yaml
apiVersion: apps/v1
kind: ReplicaSet
metadata:
name: simple-webapp
labels:
app: App1
function: Front-end
spec:
replicas: 3
selector:
matchLabels:
app: App1
template:
metadata:
labels:
app: App1
function: Front-end
spec:
containers:
- name: simple-webapp
image: simple-webapp
Note: The labels in the template section are for the pods; the labels at the top are for the ReplicaSet itself.
Exercises:
- Select pod with multiple Labels
- Select ALL resource with a Label
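Example commands for these exercises (the label keys and values reuse the example above and are otherwise assumptions):
kubectl get pods --selector app=App1,function=Front-end
kubectl get all --selector app=App1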
Annotations
Annotations are used to record other details for informational purposes, for example tool details like version and build information.
You can use Kubernetes annotations to attach arbitrary non-identifying metadata to objects. Clients such as tools and libraries can retrieve this metadata.
Attaching metadata to objects
You can use either labels or annotations to attach metadata to Kubernetes objects. Labels can be used to select objects and to find collections of objects that satisfy certain conditions. In contrast, annotations are not used to identify and select objects. The metadata in an annotation can be small or large, structured or unstructured, and can include characters not permitted by labels.
Annotations, like labels, are key/value maps:
"metadata": {
"annotations": {
"key1" : "value1",
"key2" : "value2"
}
}
Here are some examples of information that could be recorded in annotations:
- Fields managed by a declarative configuration layer. Attaching these fields as annotations distinguishes them from default values set by clients or servers, and from auto-generated fields and fields set by auto-sizing or auto-scaling systems.
- Build, release, or image information like timestamps, release IDs, git branch, PR numbers, image hashes, and registry address.
- Pointers to logging, monitoring, analytics, or audit repositories.
- Client library or tool information that can be used for debugging purposes: for example, name, version, and build information.
- User or tool/system provenance information, such as URLs of related objects from other ecosystem components.
- Lightweight rollout tool metadata: for example, config or checkpoints.
- Phone or pager numbers of persons responsible, or directory entries that specify where that information can be found, such as a team web site.
- Directives from the end-user to the implementations to modify behavior or engage non-standard features.
Instead of using annotations, you could store this type of information in an external database or directory, but that would make it much harder to produce shared client libraries and tools for deployment, management, introspection, and the like.
Rolling Updates & Rollbacks in Deployments
When a new rollout is triggered, a new deployment revision is created (for example, revision 2). This helps us keep track of the changes made to our deployment and enables us to roll back to a previous version of the deployment if necessary.
We can see the status of our rollout by running
kubectl rollout status deployment/myapp-deployment
We can also see the revisions and history of our deployment
kubectl rollout history deployment/myapp-deployment
Note: A Deployment’s rollout is triggered if and only if the Deployment’s Pod template (that is, .spec.template
) is changed, for example if the labels or container images of the template are updated. Other updates, such as scaling the Deployment, do not trigger a rollout.
Deployment Strategies
In Kubernetes there are a few different ways to release an application, it is necessary to choose the right strategy to make your infrastructure reliable during an application update.
Choosing the right deployment procedure depends on the needs, we listed below some of the possible strategies to adopt:
- recreate: terminate the old version and release the new one
- ramped: release a new version on a rolling update fashion, one after the other
- blue/green: release a new version alongside the old version then switch traffic
- canary: release a new version to a subset of users, then proceed to a full rollout
- a/b testing: release a new version to a subset of users in a precise way (HTTP headers, cookie, weight, etc.). A/B testing is really a technique for making business decisions based on statistics but we will briefly describe the process. This doesn’t come out of the box with Kubernetes, it implies extra work to setup a more advanced infrastructure (Istio, Linkerd, Traefik, custom nginx/haproxy, etc).
Note: We can view the difference between the strategies by running a describe on the deployment; in the events we can see the scaling process.
kubectl describe deployment myapp-deployment
For example, if we executed a recreate deployment we can see the old replica set was scaled down to zero first and then the new replica set scaled up to five. However, when the rolling update strategy was used the old replica set was scaled down one at a time simultaneously scaling up the new replica set one at a time.
We will focus on the two principal deployments: Recreate and Ramped/rolling update
- Recreate - best for development environment
A deployment defined with a strategy of type Recreate will terminate all the running instances then recreate them with the newer version.
spec:
  replicas: 3
  strategy:
    type: Recreate
Pro:
- application state entirely renewed
Cons:
- downtime that depends on both shutdown and boot duration of the application
- Rolling update - slow rollout (the default option)
A ramped deployment updates pods in a rolling update fashion: a secondary ReplicaSet is created with the new version of the application, then the number of replicas of the old version is decreased and the new version is increased until the correct number of replicas is reached.
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 2        # how many pods we can add at a time
      maxUnavailable: 0  # how many pods can be unavailable during the rolling update
When setup together with horizontal pod autoscaling it can be handy to use a percentage based value instead of a number for maxSurge and maxUnavailable.
If you trigger a deployment while an existing rollout is in progress, the deployment will pause the rollout and proceed to a new release by overriding the rollout.
Pro:
- version is slowly released across instances
- convenient for stateful applications that can handle rebalancing of the data
Cons:
- rollout/rollback can take time
- supporting multiple APIs is hard
- no control over traffic
Trigger a Deployment
We have two ways to trigger a new deployment rollout:
- Change the .yml definition, for example the image version (nginx:1.7 -> nginx:1.8), and run:
kubectl apply -f deployment-definition.yml
- Use a command to update the image of our application directly:
kubectl set image deployment/myapp-deployment nginx=nginx:1.8
Note: Remember that with this method we don't change the file definition.
Upgrades
When we upgrade our application, Kubernetes creates a new replica set under the hood and starts deploying the containers there while, at the same time, taking down the pods in the old replica set, following a rolling update strategy. We can see this using:
kubectl get replicasets
In the end we can see the old replicaset with 0 pods and the new replicaset with X pods
Rollback
If in the previous case, the last version has a problem, we can rollback to a previous version
kubectl rollout undo deployment/myapp-deployment
The deployment will then destroy the pods in the new replica set and bring the older ones up in the old replica set.
Updating a Deployment
Here are some handy examples related to updating a Kubernetes Deployment:
- Creating a deployment, checking the rollout status and history:
In the example below, we will first create a simple deployment and inspect the rollout status and the rollout history:
master $ kubectl create deployment nginx --image=nginx:1.16
deployment.apps/nginx created
master $ kubectl rollout status deployment nginx
Waiting for deployment "nginx" rollout to finish: 0 of 1 updated replicas are available...
deployment "nginx" successfully rolled out
master $ kubectl rollout history deployment nginx
deployment.extensions/nginx
REVISION CHANGE-CAUSE
1 <none>
master $
- Using the --revision flag:
Here revision 1 is the first version, where the deployment was created.
You can check the status of each revision individually by using the --revision flag:
master $ kubectl rollout history deployment nginx --revision=1
deployment.extensions/nginx with revision #1
Pod Template:
  Labels: app=nginx
          pod-template-hash=6454457cdb
  Containers:
   nginx:
    Image:        nginx:1.16
    Port:         <none>
    Host Port:    <none>
    Environment:  <none>
    Mounts:       <none>
  Volumes:        <none>
master $
- Using the --record flag:
You would have noticed that the "change-cause" field is empty in the rollout history output. We can use the --record flag to save the command used to create/update a deployment against the revision number.
master $ kubectl set image deployment nginx nginx=nginx:1.17 --record
deployment.extensions/nginx image updated
master $ kubectl rollout history deployment nginx
deployment.extensions/nginx
REVISION CHANGE-CAUSE
1 <none>
2 kubectl set image deployment nginx nginx=nginx:1.17 --record=true
master $
You can now see that the change-cause is recorded for the revision 2 of this deployment.
Let's make some more changes. In the example below, we are editing the deployment and changing the image from nginx:1.17 to nginx:latest while making use of the --record flag.
master $ kubectl edit deployments. nginx --record
deployment.extensions/nginx edited
master $ kubectl rollout history deployment nginx
REVISION CHANGE-CAUSE
1 <none>
2 kubectl set image deployment nginx nginx=nginx:1.17 --record=true
3 kubectl edit deployments. nginx --record=true
master $ kubectl rollout history deployment nginx --revision=3
deployment.extensions/nginx with revision #3
Pod Template:
  Labels: app=nginx
          pod-template-hash=df6487dc
  Annotations: kubernetes.io/change-cause: kubectl edit deployments. nginx --record=true
Containers:
nginx:
Image: nginx:latest
Port: <none>
Host Port: <none>
Environment: <none>
Mounts: <none>
Volumes: <none>
master $
- Undo a change:
Let's now roll back to the previous revision:
master $ kubectl rollout undo deployment nginx
deployment.extensions/nginx rolled back
master $ kubectl rollout history deployment nginx
deployment.extensions/nginx
REVISION CHANGE-CAUSE
1 <none>
3 kubectl edit deployments. nginx --record=true
4 kubectl set image deployment nginx nginx=nginx:1.17 --record=true
master $ kubectl rollout history deployment nginx --revision=4
deployment.extensions/nginx with revision #4
Pod Template:
Labels: app=nginx pod-template-hash=b99b98f9
Annotations: kubernetes.io/change-cause: kubectl set image deployment nginx nginx=nginx:1.17 --record=true
Containers:
nginx:
Image: nginx:1.17
Port: <none>
Host Port: <none>
Environment: <none>
Mounts: <none>
Volumes: <none>
master $ kubectl describe deployments. nginx | grep -i image:
Image: nginx:1.17
master $
With this, we have rolled back to the previous version of the deployment with the image = nginx:1.17.
Exercises:
- Change the Deployment strategy from RollingUpdate to Recreate and view the pods
Jobs and CronJobs
A Job creates one or more Pods and ensures that a specified number of them successfully terminate. As pods successfully complete, the Job tracks the successful completions. When a specified number of successful completions is reached, the task (ie, Job) is complete. Deleting a Job will clean up the Pods it created.
A simple case is to create one Job object in order to reliably run one Pod to completion. The Job object will start a new Pod if the first Pod fails or is deleted (for example due to a node hardware failure or a node reboot).
Example Job: Here is an example Job config. It computes π to 2000 places and prints it out. It takes around 10s to complete.
controllers/job.yaml
apiVersion: batch/v1
kind: Job
metadata:
name: pi
spec:
template:
spec:
containers:
- name: pi
image: perl
command: ["perl", "-Mbignum=bpi", "-wle", "print bpi(2000)"]
restartPolicy: Never
backoffLimit: 4
Note: The key parts of a Job definition are restartPolicy: Never, so the pod runs to completion only once, and the command section with the necessary commands.
Check on the status of the Job with kubectl
:
kubectl describe jobs/pi
or
kubectl get jobs
Note: If we run kubectl get pods
we can see the STATUS = Completed
Multiple Pods
To run multiple pods for a Job we need to specify completions; this runs the pods sequentially.
spec:
completions: 3
template:
spec:
containers:
- name: math-add
image: ubuntu
command: ['expr', '3', '+', '2']
restartPolicy: Never
We can check this and verify the DESIRED and SUCCESSFUL values:
kubectl get jobs
Parallelism: we can also configure multiple pods to run in parallel:
spec:
completions: 3
parallelism: 3
template:
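Putting it together, a minimal sketch of a Job running three pods in parallel (the Job name is an assumption; the container command comes from the example above):
apiVersion: batch/v1
kind: Job
metadata:
  name: math-add-job
spec:
  completions: 3
  parallelism: 3
  template:
    spec:
      containers:
      - name: math-add
        image: ubuntu
        command: ['expr', '3', '+', '2']
      restartPolicy: Never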
CronJobs
A CronJob creates Jobs on a repeating schedule.
CronJobs are useful for creating periodic and recurring tasks, like running backups or sending emails. CronJobs can also schedule individual tasks for a specific time, such as scheduling a Job for when your cluster is likely to be idle.
apiVersion: batch/v1beta1
kind: CronJob
metadata:
name: hello
spec:
schedule: "*/1 * * * *"
jobTemplate:
spec:
template:
spec:
containers:
- name: hello
image: busybox
args:
- /bin/sh
- -c
- date; echo Hello from the Kubernetes cluster
restartPolicy: OnFailure
We can also create a Job using the Imperative way
kubectl create job name-job --image busybox
CronJob using the Imperative way
kubectl create cronjob name-cronjob --image busybox --schedule "00 00 * * *"
Or, if we want to generate a file with the Job configuration:
kubectl create job name-job --image busybox --dry-run=client -o yaml > job.yaml
To view the cronjob
kubectl get cronjob
Exercises:
- Create a Job using a POD definition
- Generate a Job file using the imperative way, and set completions to 5 and parallelism to 3
- Add a CronJob that runs at 10:00 every day (a possible command follows below)
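A possible solution for the last exercise (the CronJob name and image are assumptions):
kubectl create cronjob daily-job --image=busybox --schedule "0 10 * * *"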
Services & Networking
Services
A Service in Kubernetes is an abstraction which defines a logical set of Pods and a policy by which to access them. Services enable a loose coupling between dependent Pods. A Service is defined using YAML (preferred) or JSON, like all Kubernetes objects. The set of Pods targeted by a Service is usually determined by a LabelSelector (see below for why you might want a Service without including `selector` in the spec).
Although each Pod has a unique IP address, those IPs are not exposed outside the cluster without a Service. Services allow your applications to receive traffic. Services can be exposed in different ways by specifying a `type` in the ServiceSpec:
- ClusterIP (default) - Exposes the Service on an internal IP in the cluster. This type makes the Service only reachable from within the cluster.
- NodePort - Exposes the Service on the same port of each selected Node in the cluster using NAT. Makes a Service accessible from outside the cluster using `<NodeIP>:<NodePort>`. Superset of ClusterIP.
- LoadBalancer - Creates an external load balancer in the current cloud (if supported) and assigns a fixed, external IP to the Service. Superset of NodePort.
- ExternalName - Exposes the Service using an arbitrary name (specified by `externalName` in the spec) by returning a CNAME record with the name. No proxy is used. This type requires v1.7 or higher of `kube-dns`.
We can also generate a Service from a Deployment
kubectl expose deployment -n ingress-space ingress-controller --type=NodePort --port=80 --name=ingress --dry-run=client -o yaml > ingress.yaml
Services Cluster IP
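As a reference, a minimal ClusterIP Service sketch (the name, selector, and ports are illustrative; since ClusterIP is the default type, the type field could be omitted):
apiVersion: v1
kind: Service
metadata:
  name: back-end
spec:
  type: ClusterIP
  selector:
    app: myapp
  ports:
  - port: 80
    targetPort: 8080
The same Service can be generated imperatively, for example with kubectl expose deployment myapp --port=80 --target-port=8080 --name=back-end (assuming a myapp Deployment exists).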
Ingress Networking
Ingress exposes HTTP and HTTPS routes from outside the cluster to services within the cluster. Traffic routing is controlled by rules defined on the Ingress resource.
An Ingress may be configured to give Services externally-reachable URLs, load balance traffic, terminate SSL / TLS, and offer name-based virtual hosting. An Ingress controller is responsible for fulfilling the Ingress, usually with a load balancer, though it may also configure your edge router or additional frontends to help handle the traffic.
An Ingress does not expose arbitrary ports or protocols. Exposing services other than HTTP and HTTPS to the internet typically uses a service of type Service.Type=NodePort or Service.Type=LoadBalancer.
Example
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: minimal-ingress
annotations:
nginx.ingress.kubernetes.io/rewrite-target: /
spec:
rules:
- http:
paths:
- path: /testpath
pathType: Prefix
backend:
service:
name: test
port:
number: 80
Each HTTP rule contains the following information:
- An optional host. In this example, no host is specified, so the rule applies to all inbound HTTP traffic through the IP address specified. If a host is provided (for example, foo.bar.com), the rules apply to that host.
- A list of paths (for example, `/testpath`), each of which has an associated backend defined with a `service.name` and a `service.port.name` or `service.port.number`. Both the host and path must match the content of an incoming request before the load balancer directs traffic to the referenced Service.
- A backend is a combination of Service and port names as described in the Service doc or a custom resource backend by way of a CRD. HTTP (and HTTPS) requests to the Ingress that matches the host and path of the rule are sent to the listed backend.
View all ingress
kubectl get ingress --all-namespaces
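Newer kubectl versions can also create an Ingress imperatively with kubectl create ingress; a sketch, where the host, path, Service name, and port are placeholders:
kubectl create ingress minimal-ingress --rule="foo.com/testpath=test:80" --dry-run=client -o yaml > ingress.yaml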
Network Policies
If you want to control traffic flow at the IP address or port level (OSI layer 3 or 4), then you might consider using Kubernetes NetworkPolicies for particular applications in your cluster.
The entities that a Pod can communicate with are identified through a combination of the following 3 identifiers:
- Other pods that are allowed (exception: a pod cannot block access to itself)
- Namespaces that are allowed
- IP blocks (exception: traffic to and from the node where a Pod is running is always allowed, regardless of the IP address of the Pod or the node)
Example:
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: test-network-policy
namespace: default
spec:
podSelector:
matchLabels:
role: db
policyTypes:
- Ingress
- Egress
ingress:
- from:
- ipBlock:
cidr: 172.17.0.0/16
except:
- 172.17.1.0/24
- namespaceSelector:
matchLabels:
project: myproject
- podSelector:
matchLabels:
role: frontend
ports:
- protocol: TCP
port: 6379
egress:
- to:
- ipBlock:
cidr: 10.0.0.0/24
ports:
- protocol: TCP
port: 5978
Note: POSTing this to the API server for your cluster will have no effect unless your chosen networking solution supports network policy.
Mandatory Fields: As with all other Kubernetes config, a NetworkPolicy needs `apiVersion`, `kind`, and `metadata` fields. For general information about working with config files, see Configure Containers Using a ConfigMap, and Object Management.
spec: NetworkPolicy spec has all the information needed to define a particular network policy in the given namespace.
podSelector: Each NetworkPolicy includes a `podSelector` which selects the grouping of pods to which the policy applies. The example policy selects pods with the label “role=db”. An empty `podSelector` selects all pods in the namespace.
policyTypes: Each NetworkPolicy includes a `policyTypes` list which may include either `Ingress`, `Egress`, or both. The `policyTypes` field indicates whether or not the given policy applies to ingress traffic to selected pods, egress traffic from selected pods, or both. If no `policyTypes` are specified on a NetworkPolicy then by default `Ingress` will always be set and `Egress` will be set if the NetworkPolicy has any egress rules.
ingress: Each NetworkPolicy may include a list of allowed `ingress` rules. Each rule allows traffic which matches both the `from` and `ports` sections. The example policy contains a single rule, which matches traffic on a single port, from one of three sources, the first specified via an `ipBlock`, the second via a `namespaceSelector` and the third via a `podSelector`.
egress: Each NetworkPolicy may include a list of allowed `egress` rules. Each rule allows traffic which matches both the `to` and `ports` sections. The example policy contains a single rule, which matches traffic on a single port to any destination in `10.0.0.0/24`.
So, the example NetworkPolicy:
- isolates “role=db” pods in the “default” namespace for both ingress and egress traffic (if they weren’t already isolated)
- (Ingress rules) allows connections to all pods in the “default” namespace with the label “role=db” on TCP port 6379 from:
- any pod in the “default” namespace with the label “role=frontend”
- any pod in a namespace with the label “project=myproject”
- IP addresses in the ranges 172.17.0.0–172.17.0.255 and 172.17.2.0–172.17.255.255 (ie, all of 172.17.0.0/16 except 172.17.1.0/24)
- (Egress rules) allows connections from any pod in the “default” namespace with the label “role=db” to CIDR 10.0.0.0/24 on TCP port 5978
Note: By default, if no policies exist in a namespace, then all ingress and egress traffic is allowed to and from pods in that namespace.
To view the Policies
kubectl get netpol
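A pattern worth memorizing is the default deny-all ingress policy: an empty podSelector selects every pod in the namespace, and since no ingress rules are listed, no incoming traffic is allowed. A sketch (the name is arbitrary):
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
spec:
  podSelector: {}
  policyTypes:
  - Ingress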
State Persistence
Kubernetes is designed to manage stateless containers; Pods and containers can easily be deleted and/or replaced. When a container is removed, data stored inside the container’s internal disk is lost.
State persistence refers to maintaining data outside and potentially beyond the life of a container; this usually means storing data in some kind of persistent data store that can be accessed by containers.
Volumes
The internal storage of a container is ephemeral (designed to be temporary). Volumes allow us to provide more permanent storage to a pod that exists beyond the life of a container.
In Kubernetes, a volume can be thought of as a directory which is accessible to the containers in a pod. We have different types of volumes in Kubernetes and the type defines how the volume is created and its content.
The concept of a volume was also present in Docker; however, there a volume was limited to a particular container, and when the container’s life ended the volume was lost as well.
Volumes created through Kubernetes, on the other hand, are not limited to any single container: a volume is available to any or all of the containers deployed inside the pod. A key advantage of Kubernetes volumes is that they support different kinds of storage, and a pod can use several of them at the same time.
apiVersion: v1
kind: Pod
metadata:
name: test-pd
spec:
containers:
- image: k8s.gcr.io/test-webserver
name: test-container
volumeMounts:
- mountPath: /cache
name: cache-volume
volumes:
- name: cache-volume
emptyDir: {}
Note: `emptyDir` volumes create storage on a node when the pod is assigned to the node. The storage disappears when the pod leaves the node (it is not completely permanent).
A container crashing does not remove a Pod from a node. The data in an `emptyDir` volume is safe across container crashes.
Persistent Volume (PV)
Is a piece of storage in the cluster that has been provisioned by an administrator or dynamically provisioned using Storage Classes. It is a resource in the cluster just like a node is a cluster resource. PVs are volume plugins like Volumes, but have a lifecycle independent of any individual Pod that uses the PV. This API object captures the details of the implementation of the storage, be that NFS, iSCSI, or a cloud-provider-specific storage system.
apiVersion: v1
kind: PersistentVolume
metadata:
name: pv-vol1
spec:
accessModes:
- ReadWriteOnce
capacity:
storage: 1Gi
awsElasticBlockStore:
volumeID: 253jd2du
fsType: ext4
Create the PV and view the pv
kubectl create -f pv-definition.yaml
kubectl get pv
Persistent Volume Claim (PVC)
A PersistentVolumeClaim (PVC) is a request for storage by a user or a pod. The user does not need to know the underlying provisioning. The claim must be created in the same namespace as the pod that uses it.
Pods consume node resources and PVCs consume PV resources. Pods can request specific levels of resources (CPU and Memory). Claims can request specific size and access modes (e.g., they can be mounted ReadWriteOnce, ReadOnlyMany or ReadWriteMany)
If no matching PV is available, the PVC will remain in a Pending state.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: myclaim
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 500Mi
Note: We can also specify a `storageClassName`, as in the PV definition.
Create the PVC and view it:
kubectl create -f pvc-definition.yaml
kubectl get pvc
PV vs PVC
PV represents a storage resource, and PVC is an abstraction layer between user (pod) and the PV.
PVCs will automatically bind themselves to a PV that has compatible StorageClass and accessModes
Using PVCs in PODs
Once you create a PVC, use it in a Pod definition file by specifying the claim name under the `persistentVolumeClaim` entry in the volumes section, like this:
apiVersion: v1
kind: Pod
metadata:
name: mypod
spec:
containers:
- name: myfrontend
image: nginx
volumeMounts:
- mountPath: "/var/www/html"
name: mypd
volumes:
- name: mypd
persistentVolumeClaim:
claimName: myclaim
The same is true for ReplicaSets or Deployments. Add this to the pod template section of a Deployment or ReplicaSet.
Access Modes
A PersistentVolume can be mounted on a host in any way supported by the resource provider. As shown in the table below, providers will have different capabilities and each PV’s access modes are set to the specific modes supported by that particular volume. For example, NFS can support multiple read/write clients, but a specific NFS PV might be exported on the server as read-only. Each PV gets its own set of access modes describing that specific PV’s capabilities.
The access modes are:
- ReadWriteOnce – the volume can be mounted as read-write by a single node
- ReadOnlyMany – the volume can be mounted read-only by many nodes
- ReadWriteMany – the volume can be mounted as read-write by many nodes
In the CLI, the access modes are abbreviated to:
- RWO - ReadWriteOnce
- ROX - ReadOnlyMany
- RWX - ReadWriteMany
Reclaim Policy
Current reclaim policies are:
- Retain – manual reclamation
- Recycle – basic scrub (`rm -rf /thevolume/*`)
- Delete – associated storage asset such as AWS EBS, GCE PD, Azure Disk, or OpenStack Cinder volume is deleted
Currently, only NFS and HostPath support recycling. AWS EBS, GCE PD, Azure Disk, and Cinder volumes support deletion.
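The reclaim policy is set with persistentVolumeReclaimPolicy in the PV spec; on an existing PV it can also be patched, for example (the PV name here is a placeholder):
kubectl patch pv pv-vol1 -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'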
Practice:
- Create a PV and a PVC, and create a Pod configured to use the PVC
Storage Classes
Note: It is not necessary for the CKAD exam
StorageClasses use provisioners that are specific to the storage platform or cloud provider to give Kubernetes access to the physical media being used. StorageClasses are the foundation of dynamic provisioning, allowing cluster administrators to define abstractions for the underlying storage platform. Users simply refer to a StorageClass by name in the PersistentVolumeClaim (PVC) using the “storageClassName” parameter (this still creates a PV, but we don’t need to create it manually).
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: mypvc
namespace: testns
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 100Gi
storageClassName: gold
List the StorageClasses
# kubectl get storageclass
NAME PROVISIONER AGE
standard (default) kubernetes.io/gce-pd 1d
gold kubernetes.io/gce-pd 1d
Note: The default StorageClass is marked by (default)
In order to promote the usage of dynamic provisioning, this feature permits the cluster administrator to specify a default StorageClass. When present, the user can create a PVC without having to specify a storageClassName, further reducing the user’s responsibility to be aware of the underlying storage provider.
Practice:
- List storage classes
- Create a new storage class with `provisioner: kubernetes.io/no-provisioner` and `volumeBindingMode: WaitForFirstConsumer` (a sketch follows below)
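A minimal sketch for the second practice item (the name local-storage is just an example):
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-storage
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer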
Stateful Sets
Note: It is not necessary for the CKAD exam
- A StatefulSet is another Kubernetes controller that manages pods just like Deployments. But it differs from a Deployment in that it is more suited for stateful apps.
- StatefulSets deploy Pods in a specified order and each Pod gets a unique, stable name: for example, vault-0 is deployed first, then vault-1, and so on.
- A stateful application requires pods with a unique identity (for example, hostname). One pod should be able to reach other pods with well-defined names.
- For a StatefulSet to work, it needs a Headless Service. A Headless Service does not have an IP address. Internally, it creates the necessary endpoints to expose pods with DNS names. The StatefulSet definition includes a reference to the Headless Service, but you have to create it separately.
- By nature, a StatefulSet needs persistent storage so that the hosted application saves its state and data across restarts. Kubernetes provides Storage Classes, Persistent Volumes, and Persistent Volume Claims to provide an abstraction layer above the cloud provider’s storage-offering mechanism.
- Once the StatefulSet and the Headless Service are created, a pod can access another one by name prefixed with the service name.
The StatefulSet definition is very similar to a Deployment definition:
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mysql
  labels:
    app: mysql
spec:
  selector:            # required in apps/v1; must match the template labels
    matchLabels:
      app: mysql
  template:
    metadata:
      labels:
        app: mysql
    spec:
      containers:
      - name: mysql
        image: mysql
  replicas: 3
  serviceName: mysql-h
We need to specify the `serviceName` of the headless Service (the last line in the definition above).
kubectl create -f statefulset.yml
This creates pods one after the others, first mysql-0, then mysql-1, and so on.
We can set `podManagementPolicy: Parallel` to deploy all Pods in parallel (the default value of this field is OrderedReady); a sketch follows below.
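The field sits at the same level as serviceName and replicas in the StatefulSet spec, for example:
spec:
  serviceName: mysql-h
  replicas: 3
  podManagementPolicy: Parallel   # default is OrderedReady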
Headless Services
Note: It is not necessary for the CKAD exam
A headless service is a service with a service IP but instead of load-balancing it will return the IPs of our associated Pods. This allows us to interact directly with the Pods instead of a proxy. It’s as simple as specifying `None` for `.spec.clusterIP`, and it can be utilized with or without selectors - you’ll see an example with selectors in a moment.
apiVersion: v1
kind: Service
metadata:
name: my-headless-service
spec:
clusterIP: None # <--
selector:
app: test-app
ports:
- protocol: TCP
port: 80
targetPort: 3000
In our StatefulSet definition we also need to specify the headless Service:
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: mysql
labels:
app: mysql
spec:
serviceName: my-headless-service
replicas: 3
...
More information: https://dev.to/kaoskater08/building-a-headless-service-in-kubernetes-3bk8#:~:text=What%20is%20a%20headless%20service,Pods%20instead%20of%20a%20proxy
Storage in StatefulSets
Note: It is not necessary for the CKAD exam
Maybe our Pods don’t need to share data; for example, in a database with replication, each instance has its own data and the replication between databases is done at the database level.
In that case each Pod needs its own PVC (and each PVC needs a PV). To do this in a StatefulSet we add a PVC definition (a volumeClaimTemplates entry) inside the StatefulSet definition.
Something like this:
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: mysql
...
volumeClaimTemplates: # <---
- metadata:
name: data-volume
spec:
accessModes:
- ReadWriteOnce
...
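For reference, a fuller sketch (names such as mysql-h and data-volume follow the earlier examples, and the mountPath is illustrative) showing how the container mounts the volume created from the claim template:
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mysql
spec:
  serviceName: mysql-h        # headless Service, created separately
  replicas: 3
  selector:
    matchLabels:
      app: mysql
  template:
    metadata:
      labels:
        app: mysql
    spec:
      containers:
      - name: mysql
        image: mysql
        volumeMounts:
        - name: data-volume   # must match the claim template name below
          mountPath: /var/lib/mysql
  volumeClaimTemplates:
  - metadata:
      name: data-volume
    spec:
      accessModes:
      - ReadWriteOnce
      resources:
        requests:
          storage: 500Mi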