CKS: Certified Kubernetes Security Specialist Certification
The Certified Kubernetes Security Specialist (CKS) is a challenging, performance-based exam that requires solving multiple issues from the command line. I studied for and passed all three Kubernetes certifications (CKA/CKAD/CKS), and I want to share valuable information to help you prepare for and pass this exam.
Certified Kubernetes Security Specialist
The Certified Kubernetes Security Specialist (CKS) is a hands-on test consisting of a set of performance-based items (15 problems) to be solved from the command line, and it is expected to take approximately two (2) hours to complete.
For me this was the most challenging Kubernetes exam. I recommend studying with the Kim course and KodeKloud, and practicing a lot so you can work very fast; I finished the exam in the last minute.
Prerequisite:
Candidates must have taken and passed the Certified Kubernetes Administrator (CKA) exam prior to attempting the CKS exam.
Useful links:
- https://docs.linuxfoundation.org/tc-docs/certification/important-instructions-cks
- https://github.com/cncf/curriculum/blob/master/CKS_Curriculum_%20v1.22.pdf
- https://training.linuxfoundation.org/certification/certified-kubernetes-security-specialist/
- https://www.udemy.com/course/certified-kubernetes-security-specialist/
Exam Objectives:
Domain | Weight |
---|---|
Cluster Setup | 10% |
Cluster Hardening | 15% |
System Hardening | 15% |
Minimize Microservice Vulnerabilities | 20% |
Supply Chain Security | 20% |
Monitoring, Logging and Runtime Security | 20% |
My notes are based on the official Kubernetes documentation, the Kim (Killer Shell) course, and KodeKloud.
Cluster Setup
Use Network security policies to restrict cluster level access
If you want to control traffic flow at the IP address or port level (OSI layer 3 or 4), then you might consider using Kubernetes NetworkPolicies for particular applications in your cluster.
The entities that a Pod can communicate with are identified through a combination of the following 3 identifiers:
- Other pods that are allowed (exception: a pod cannot block access to itself)
- Namespaces that are allowed
- IP blocks (exception: traffic to and from the node where a Pod is running is always allowed, regardless of the IP address of the Pod or the node)
Example:
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: test-network-policy
namespace: default
spec:
podSelector:
matchLabels:
role: db
policyTypes:
- Ingress
- Egress
ingress:
- from:
- ipBlock:
cidr: 172.17.0.0/16
except:
- 172.17.1.0/24
- namespaceSelector:
matchLabels:
project: myproject
- podSelector:
matchLabels:
role: frontend
ports:
- protocol: TCP
port: 6379
egress:
- to:
- ipBlock:
cidr: 10.0.0.0/24
ports:
- protocol: TCP
port: 5978
Note: POSTing this to the API server for your cluster will have no effect unless your chosen networking solution supports network policy.
Mandatory Fields:
- podSelector: Each NetworkPolicy includes a podSelector which selects the grouping of pods to which the policy applies. The example policy selects pods with the label "role=db". An empty podSelector selects all pods in the namespace.
- policyTypes: Each NetworkPolicy includes a policyTypes list which may include either Ingress, Egress, or both. If no policyTypes are specified on a NetworkPolicy, then by default Ingress will always be set, and Egress will be set if the NetworkPolicy has any egress rules.
- ingress: Each NetworkPolicy may include a list of allowed ingress rules. Each rule allows traffic which matches both the from and ports sections.
- egress: Each NetworkPolicy may include a list of allowed egress rules. Each rule allows traffic which matches both the to and ports sections.
Note: The ingress and egress sections have the same structure; the only difference is that egress rules use to, while ingress rules use from.
So, the example NetworkPolicy:
- isolates “role=db” pods in the “default” namespace for both ingress and egress traffic (if they weren’t already isolated)
- (Ingress rules) allows connections to all pods in the “default” namespace with the label “role=db” on TCP port 6379 from:
- any pod in the “default” namespace with the label “role=frontend”
- any pod in a namespace with the label “project=myproject”
- IP addresses in the ranges 172.17.0.0–172.17.0.255 and 172.17.2.0–172.17.255.255 (ie, all of 172.17.0.0/16 except 172.17.1.0/24)
- (Egress rules) allows connections from any pod in the “default” namespace with the label “role=db” to CIDR 10.0.0.0/24 on TCP port 5978
Note: By default, if no policies exist in a namespace, then all ingress and egress traffic is allowed to and from pods in that namespace.
To view the Policies
> kubectl get netpol
Important: The 3 selectors for netpol are:
- podSelector
- namespaceSelector
- ipBlock
We can combine these selectors in a single rule, or define them as separate rules, and the result is different. For example, a single rule with podSelector: frontend AND namespaceSelector: prod:
- namespaceSelector:
matchLabels:
project: prod
podSelector:
matchLabels:
role: frontend
is not the same as two separate rules, podSelector: frontend OR namespaceSelector: prod:
- namespaceSelector:
matchLabels:
project: prod
- podSelector:
matchLabels:
role: frontend
Note: If we want to create a NetworkPolicy using a namespaceSelector, we first need to add a label to the target namespace.
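For example, assuming the namespace is called prod (the name is only illustrative), we can label it like this:
> kubectl label namespace prod project=prod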
Use CIS benchmark to review the security configuration of Kubernetes components
The Center for Internet Security (CIS) releases benchmarks for best practice security recommendations. The CIS Kubernetes Benchmark is a set of recommendations for configuring Kubernetes to support a strong security posture. The Benchmark is tied to a specific Kubernetes release
Run CIS-CAT Benchmark on Linux and generate a report:
> sh ./Assessor-CLI.sh -i -rd /var/www/html/ -nts -rp index
More information at https://ccpa-docs.readthedocs.io/en/latest/
CIS Benchmark for Kubernetes:
- etcd
- kubelet
- kubedns
- kubeapi
More information at
- https://www.cisecurity.org/benchmark/kubernetes/
- https://www.cisecurity.org/cis-benchmarks/#kubernetes
- https://www.cisecurity.org/cybersecurity-tools/cis-cat-pro/cis-benchmarks-supported-by-cis-cat-pro/
Note: CIS-CAT Lite only supports Windows 10, Ubuntu, and macOS. If we want to run the CIS Benchmark against Kubernetes we need the CIS-CAT Pro version. However, we can use alternative free and open-source tools, such as kube-bench, to run it on K8s.
Kube-bench
kube-bench is a tool from Aqua Security that checks whether Kubernetes is deployed securely by running the checks documented in the CIS Kubernetes Benchmark. It’s open-source and free.
We can deploy Kube-bench as:
- Docker container
- POD in K8S
- Binaries / Compile from source
Install on master node:
> curl -L https://github.com/aquasecurity/kube-bench/releases/download/v0.6.5/kube-bench_0.6.5_linux_amd64.tar.gz -o kube-bench_0.6.5_linux_amd64.tar.gz
> tar -xvf kube-bench_0.6.5_linux_amd64.tar.gz
Run the assessment and review the results
> ./kube-bench --config-dir `pwd`/cfg --config `pwd`/cfg/config.yaml
== Summary ==
43 checks PASS
12 checks FAIL
10 checks WARN
0 checks INFO
[INFO] 2 Etcd Node Configuration
[INFO] 2 Etcd Node Configuration Files
[PASS] 2.1 Ensure that the --cert-file and --key-file arguments are set as appropriate (Automated)
[PASS] 2.2 Ensure that the --client-cert-auth argument is set to true (Automated)
[PASS] 2.3 Ensure that the --auto-tls argument is not set to true (Automated)
[PASS] 2.4 Ensure that the --peer-cert-file and --peer-key-file arguments are set as appropriate (Automated)
[PASS] 2.5 Ensure that the --peer-client-cert-auth argument is set to true (Automated)
[PASS] 2.6 Ensure that the --peer-auto-tls argument is not set to true (Automated)
[PASS] 2.7 Ensure that a unique Certificate Authority is used for etcd (Manual)
... # Continue
More information at: https://github.com/aquasecurity/kube-bench#download-and-install-binaries
Properly set up Ingress objects with security control
This topic is about securing an Ingress with TLS using a Secret.
Creating the secret with certs example:
> kubectl create secret tls tls-secret --key tls.key --cert tls.crt
Creating the ingress with TLS
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: secure-ingress
annotations:
nginx.ingress.kubernetes.io/rewrite-target: /
spec:
ingressClassName: nginx
tls:
- hosts:
- secure-ingress.com
secretName: tls-secret
rules:
- host: secure-ingress.com
http:
paths:
- path: /service1
pathType: Prefix
backend:
service:
name: service1
port:
number: 80
- path: /service2
pathType: Prefix
backend:
service:
name: service2
port:
number: 80
Protect node metadata and endpoints
Metadata can contain cloud credentials for VMs / Nodes, or kubelet credentials.
One way to protect node metadata on cloud providers is to use Network Policies to limit access to the metadata endpoint. For example, on AWS the metadata is available at http://169.254.169.254/latest/meta-data/; if we create a NetworkPolicy blocking egress to the IP 169.254.169.254, we can limit access to the metadata.
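A minimal sketch of such a policy (the policy name is illustrative; it selects all pods and allows all other egress while blocking only the metadata IP):
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: deny-cloud-metadata
spec:
  podSelector: {}
  policyTypes:
  - Egress
  egress:
  - to:
    - ipBlock:
        cidr: 0.0.0.0/0
        except:
        - 169.254.169.254/32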
Minimize use of, and access to, GUI elements
The key points are:
- Change NodePort/LoadBalancer services to ClusterIP
- Use kubectl port-forward to access the GUI instead (see the example after this list)
- Use credentials and users (do not leave access open to anonymous users)
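For example, instead of exposing the Kubernetes dashboard via a NodePort service, we could reach it with a port-forward (service and namespace names below are the usual dashboard defaults and may differ in your setup):
> kubectl port-forward -n kubernetes-dashboard service/kubernetes-dashboard 8443:443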
Verify platform binaries before deploying
To do this, we go to the Kubernetes release page and compare the published checksum with the hash of the downloaded file.
For example, go to the release page and check the hash of kubernetes.tar.gz.
The page shows:
ebfe49552bbda02807034488967b3b62bf9e3e507d56245e298c4c19090387136572c1fca789e772a5e8a19535531d01dcedb61980e42ca7b0461d3864df2c14
Now we can check the hash locally:
> shasum -a 512 kubernetes.tar.gz
ebfe49552bbda02807034488967b3b62bf9e3e507d56245e298c4c19090387136572c1fca789e772a5e8a19535531d01dcedb61980e42ca7b0461d3864df2c14
Note: The checksum of a file changes when its contents are modified; always compare it against the hash on the official page to make sure you downloaded the same file.
Cluster Hardening
Restrict access to Kubernetes API
API requests come from:
- A normal user
- A service account
- Anonymous requests
Every request must be authenticated, unless it is an anonymous request.
We can connect to the API from:
- POD
- Outside
- Node
Restrictions:
- Don’t allow anonymous access
- Close insecure port
- Don’t expose ApiServer to the outside
- Restrict access from Nodes to API (NodeRestriction)
- Prevent unauthorized access
- Prevent pod from accessing API
Anonymous Access
- kube-apiserver --anonymous-auth=true|false
- Anonymous access is enabled by default if an authorization mode other than AlwaysAllow is configured
Disabling Anonymous Access: edit /etc/kubernetes/manifests/kube-apiserver.yaml and add the line
--anonymous-auth=false
Note: If we disable anonymous access, the kube-apiserver liveness probe may fail and the Pod may keep restarting.
Manual API Request
We can copy the CA, certificate, and key values from the kubeconfig, or view them with
> k config view --raw
and after that we can do a manual API request:
> curl https://10.100.0.20:6445 --cacert ca --cert crt --key key
NodeRestriction Admission Controller
The NodeRestriction admission controller limits the Node labels a kubelet can modify: a kubelet may only modify its own Node's labels. This is useful to ensure secure workload isolation via labels, since no one can pretend to be a "secure" node and have secure pods scheduled onto it.
Enable it on the kube-apiserver (enabled by default in kubeadm clusters):
--enable-admission-plugins=NodeRestriction
Verify the Node Restriction using worker node kubelet kubeconfig to set labels
First, on the worker node, we point KUBECONFIG to the kubelet's kubeconfig:
> export KUBECONFIG=/etc/kubernetes/kubelet.conf
Testing
Worker > k get ns
Error from server
Worker > k get node
NAME
cks-master
cks-worker
We are not allowed to get namespaces, but we are allowed to get nodes, for example.
Now, from the worker node, we can try to set a label on the master node:
> k label node cks-master cks=yes
Error from server
Testing the same in the node
> k label node cks-worker cks=yes
node/cks-worker labeled
Note: Another restriction is that a kubelet cannot set labels under the node-restriction.kubernetes.io prefix, not even on its own node.
Kubelet Security
To see how the kubelet is configured, check the kubelet process with ps aux | grep kubelet and look at the configuration file with cat /var/lib/kubelet/config.yaml (the config file path is passed as an argument to the process).
The kubelet uses two ports:
- 10250: serves the API that allows full access
- 10255: serves the API that allows unauthenticated read-only access
For example, if anonymous access is enabled, we can list the pods using the API:
> curl -sk https://localhost:10250/pods
{"kind":"PodList","apiVersion":"v1","metadata":{},"items":[{"metadata":{"name":"kube-scheduler-controlplane","namespace":"kube-system","selfLink":"/api/v1/namespaces/kube-system/pods/kube-scheduler-controlplane",... ->
To prevent this access, every request to the kubelet needs to be authenticated and then authorized. To do this we configure kubelet.service with the parameter --anonymous-auth=false.
We can also set this parameter in the kubelet config file (kubelet-config.yaml):
kind: KubeletConfiguration
authentication:
anonymous:
enabled: false
If we try to list the pods again we will see an Unauthorized message:
> curl -sk https://localhost:10250/pods
Unauthorized
The best practice is to disable anonymous auth and enable an authentication mechanism. By default the kubelet allows all requests without authorization. The supported mechanisms are:
- Certificates (x509): we need to provide the CA file in the kubelet.service unit or in the kubelet-config.yaml:
kind: KubeletConfiguration
authentication:
  x509:
    clientCAFile: /path/ca.crt
Now we can call the API using client certificates:
> curl -sk https://localhost:10250/pods/ --key kubelet-key.pem --cert kubelet-cert.pem
- API Bearer Tokens: the kubelet can also authenticate requests using bearer tokens (sketch below).
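A rough sketch of a token-based request (the token here is just a placeholder for one the kubelet/apiserver can validate, e.g. a ServiceAccount token with the right RBAC when the authorization mode is Webhook):
> TOKEN=...   # a valid bearer token
> curl -sk https://localhost:10250/pods/ -H "Authorization: Bearer $TOKEN"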
Kubelet Authorization:
The default authorization mode is AlwaysAllow, which allows all access to the API:
kind: KubeletConfiguration
authorization:
  mode: AlwaysAllow
To prevent this we can change the authorization mode to Webhook. In this mode the kubelet calls the kube-apiserver to determine whether each request is authorized or not.
mode: AlwaysAllow -> mode: Webhook
Testing list pods changing the authorization mode:
> curl -sk https://localhost:10250/pods
Forbidden (user=system:anonymous, verb=get, resource=nodes, subresource=proxy)
Note: After any change on the configuration we need to restart kubelet
Disable readonly port:
Testing metrics on readOnlyPort
> curl -sk http://localhost:10255/metrics
# HELP apiserver_audit_event_total [ALPHA] Counter of audit events generated and sent to the audit backend.
# TYPE apiserver_audit_event_total counter
apiserver_audit_event_total 0
# HELP apiserver_audit_requests_rejected_total [ALPHA] Counter of apiserver requests rejected due to an error in audit logging backend.
We can disable the read-only port 10255 (where the kubelet exposes metrics and pod information without authentication) by setting it to 0. When the kubelet config file is in use, this setting already defaults to 0.
kind: KubeletConfiguration
readOnlyPort: 0
Kubectl Proxy & Port Forward
One option to communicate with the kube-apiserver is the kubectl proxy client.
> kubectl proxy
It launches a proxy service locally on port 8001 by default and uses the credentials and certificates from our kubeconfig file. Now we can access the kube-apiserver locally using curl without specifying certificates:
> curl -k http://localhost:8001
Another option to expose a Kubernetes service locally is using port forward
> kubectl port-forward service/nginx 20000:80
Summary:
- kubectl proxy - opens a proxy port to the API server
- kubectl port-forward - opens a port to target pods (svc, deploy, rc, pods)
Use Role Based Access Controls to minimize exposure
Role-based access control (RBAC) is a method of regulating access to computer or network resources based on the roles of individual users within your organization.
An RBAC Role or ClusterRole contains rules that represent a set of permissions. Permissions are purely additive (there are no “deny” rules).
Roles
A Role always sets permissions within a particular namespace; when you create a Role, you have to specify the namespace it belongs in.
Here’s an example Role in the “default” namespace that can be used to grant read access to pods:
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
namespace: default
name: pod-reader
rules:
- apiGroups: [""] # "" indicates the core API group
resources: ["pods"]
verbs: ["get", "watch", "list"]
RoleBinding
A RoleBinding links a subject to a Role. A role binding grants the permissions defined in a role to a user or set of users. It holds a list of subjects (users, groups, or service accounts) and a reference to the role being granted. A RoleBinding grants permissions within a specific namespace, whereas a ClusterRoleBinding grants that access cluster-wide.
Here is an example of a RoleBinding that grants the “pod-reader” Role to the user “jane” within the “default” namespace. This allows “jane” to read pods in the “default” namespace.
apiVersion: rbac.authorization.k8s.io/v1
# This role binding allows "jane" to read pods in the "default" namespace.
# You need to already have a Role named "pod-reader" in that namespace.
kind: RoleBinding
metadata:
name: read-pods
namespace: default
subjects:
# You can specify more than one "subject"
- kind: User
name: jane # "name" is case sensitive
apiGroup: rbac.authorization.k8s.io
roleRef:
# "roleRef" specifies the binding to a Role / ClusterRole
kind: Role #this must be Role or ClusterRole
name: pod-reader # this must match the name of the Role or ClusterRole you wish to bind to
apiGroup: rbac.authorization.k8s.io
RBAC Commands
- View roles:
> kubectl get roles
- View rolebindings:
> kubectl get rolebindings
- Check access (for example, to verify what a given user can do):
> kubectl auth can-i create deployments
> kubectl auth can-i delete nodes
> kubectl auth can-i create pods --as dev-user
> kubectl auth can-i create pods --as dev-user --namespace test
CSR
Some examples using CSRs
- Create a CertificateSigningRequest object with the name akshay with the contents of the akshay.csr file:
apiVersion: certificates.k8s.io/v1
kind: CertificateSigningRequest
metadata:
  name: akshay
spec:
  groups:
  - system:authenticated
  request: $FROMTHEFILE   # the base64-encoded contents of akshay.csr
  signerName: kubernetes.io/kube-apiserver-client
  usages:
  - client auth
Note: Doc https://kubernetes.io/docs/reference/access-authn-authz/certificate-signing-requests/
- Check the CSR:
> kubectl get csr
- Approve the CSR request:
> kubectl certificate approve akshay
- Reject and delete a CSR request:
> kubectl certificate deny agent-smith
> kubectl delete csr agent-smith
Exercise caution in using service accounts e.g. disable defaults, minimize permissions on newly created ones
Pod using custom SA
First we need to create the SA.
> k create sa sa-test
Create a pod with the custom SA.
spec:
serviceAccountName: sa-test
containers:
- image: nginx
from inside a Pod we can do:
> cat /run/secrets/kubernetes.io/serviceaccount/token
> curl https://kubernetes.default -k -H "Authorization: Bearer SA_TOKEN"
Use SA Token to connect to the API from inside a Pod
First we need to be inside the pod
> k exec -it POD -- bash
Now we can Access the API using the SA
> mount | grep sec
tmpfs on /run/secrets/kubernetes.io/serviceaccount/token
> cat /run/secrets/kubernetes.io/serviceaccount/token # SA_TOKEN
> curl https://kubernetes.default -k -H "Authorization: Bearer SA_TOKEN"
Note: By default the SA is mounted on the POD, and we can access the SA token.
Disabling SA Mounting
If the pod does not need to talk to the Kubernetes API, we can disable it (usually we don’t need it). In version 1.6+, you can opt out of automounting API credentials for a service account by setting automountServiceAccountToken: false
on the service account:
apiVersion: v1
kind: ServiceAccount
metadata:
name: build-robot
automountServiceAccountToken: false
...
In version 1.6+, you can also opt out of automounting API credentials for a particular pod:
apiVersion: v1
kind: Pod
metadata:
name: my-pod
spec:
serviceAccountName: build-robot
automountServiceAccountToken: false
...
Now the SA is not mounted on the pod.
Limit SA using RBAC
For example, suppose we want to allow the default SA in the prod namespace to delete secrets. For this job we will use the ClusterRole "edit":
> k create clusterrolebinding crb-test --clusterrole edit --serviceaccount prod:default
Now, from inside a pod in the prod namespace using that SA token, we can delete secrets in the prod namespace (a sketch follows).
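A minimal sketch of that API call from inside the pod (the secret name my-secret is hypothetical):
> TOKEN=$(cat /run/secrets/kubernetes.io/serviceaccount/token)
> curl -k -X DELETE https://kubernetes.default/api/v1/namespaces/prod/secrets/my-secret -H "Authorization: Bearer $TOKEN"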
Update Kubernetes frequently
Upgrading frequently is good because of:
- Support
- Security fixes
- Bug fixes
- Stay up to date for dependencies
Releases Cycles
The Kubernetes project maintains release branches for the most recent three minor releases (1.23, 1.22, 1.21). Kubernetes 1.19 and newer receive approximately 1 year of patch support. Kubernetes 1.18 and older received approximately 9 months of patch support.
Kubernetes versions are expressed as x.y.z, where x is the major version, y is the minor version, and z is the patch version, following Semantic Versioning terminology.
Upgrade Cluster
- First upgrade the control plane components: apiserver, controller-manager, scheduler
- Then the worker components: kubelet, kube-proxy
Note: Components should be at the same minor version as the apiserver, or at most one minor version below.
Upgrade a node:
- kubectl drain
- Do the upgrade (a kubeadm sketch follows below)
- kubectl uncordon
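A rough sketch of the upgrade on a kubeadm-managed node (versions are placeholders; on the first control-plane node use kubeadm upgrade apply instead of kubeadm upgrade node):
> kubectl drain <node> --ignore-daemonsets
> apt-get update && apt-get install -y kubeadm=1.22.x-00
> kubeadm upgrade node       # kubeadm upgrade apply v1.22.x on the first control-plane node
> apt-get install -y kubelet=1.22.x-00 kubectl=1.22.x-00
> systemctl daemon-reload && systemctl restart kubelet
> kubectl uncordon <node>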
For more information: https://kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-upgrade/
System Hardening
Minimize host OS footprint (reduce attack surface)
To reduce the attack surface we can use the least privilege principle on the following topics:
- Limit Node Access
  - Limit control plane access to the Internet, use a VPN
  - Limit access using CIDR ranges
  - Limit access to specified users (only admins)
  - Manage access using /etc/passwd, /etc/shadow, /etc/group
  - Delete unnecessary users/groups
  - SSH Hardening
    - Disable SSH for the root account: PermitRootLogin no
    - Disable password login: PasswordAuthentication no
- RBAC Access
- Remove obsolete packages & services
For example, remove the Apache server. List services:
> systemctl list-units --type service
List all packages
> apt list --installed
Disable service
> systemctl disable --now apache2
Remove package
> apt remove apache2
- Restrict Network Access
Use firewalls, such as UFW on Linux.
- Restrict Obsolete Kernel Modules
List current kernel modules
> lsmod
sctp and dccp are two modules that are unnecessary on K8s; blacklist them and reboot:
> cat /etc/modprobe.d/blacklist.conf
blacklist sctp
blacklist dccp
> shutdown -r now
> lsmod | egrep "sctp|dccp"
- Identify and Fix Open Ports
Check open ports
> netstat -an | grep -w LISTEN
Check Service
> cat /etc/services | grep -w 53
Minimize IAM roles
Don't use the root user; create users following the least-privilege principle. Assign permissions to groups rather than directly to users, and then assign users to groups.
Minimize external access to the network
Using UFW on Linux (a simple frontend for iptables). Install:
> apt-get update
> apt-get install ufw
> systemctl enable ufw --now
Get status
> ufw status
Add a default rule to allow all outbound connections
> ufw default allow outgoing
Add default deny incoming
> ufw default deny incoming
Add an allow rule for SSH from a specific IP
> ufw allow from 192.168.0.20 to any port 22 proto tcp
Note: any means any IP address/interface on the server itself.
Add an allow rule for web traffic from a specified CIDR
> ufw allow from 192.168.0.0/24 to any port 80 proto tcp
Activate the firewall and check
> ufw enable
> ufw status
Delete a specified rule
> ufw delete deny 8080
We could delete using the line number
> ufw status numbered # (check the line number)
> ufw delete 5
Stop the firewall and disable it on system startup
> ufw disable
Appropriately use kernel hardening tools such as AppArmor, seccomp
Restrict syscalls using seccomp
Seccomp stands for secure computing mode and has been a feature of the Linux kernel since version 2.6.12. It can be used to sandbox the privileges of a process, restricting the calls it is able to make from userspace into the kernel. Kubernetes lets you automatically apply seccomp profiles loaded onto a Node to your Pods and containers.
Seccomp has 3 modes:
- mode 0 : disabled
- mode 1 : strict
- mode 2: filtered
It is critical to limit the syscalls available to applications. The default Docker seccomp profile blocks around 60 of the ~300 syscalls, including:
- reboot, mount, umount
- clock_adjtime, swapoff, stime
For example, we cannot change the date inside a container because of the seccomp profile configured on it.
https://kubernetes.io/docs/tutorials/clusters/seccomp/
Seccomp in Kubernetes
We can inspect a container's seccomp status in K8s using the amicontained image:
> kubectl run amicontained --image r.j3ss.co/amicontained -- amicontained
> kubectl logs amicontained
Kubernetes doesn't apply seccomp by default. However, we can activate it using the securityContext parameter.
Examples:
Pod that uses the container runtime default
apiVersion: v1
kind: Pod
metadata:
name: audit-pod
labels:
app: audit-pod
spec:
securityContext:
seccompProfile:
type: RuntimeDefault
containers:
- name: test-container
image: hashicorp/http-echo:0.2.3
args:
- "-text=just made some syscalls!"
securityContext:
allowPrivilegeEscalation: false
With custom profile
spec:
securityContext:
seccompProfile:
type: Localhost
localhostProfile: profiles/violation.json
violation.json (we need to create violation.json beforehand, under the kubelet's seccomp profile directory):
{
"defaultAction": "SCMP_ACT_ERRNO"
}
It is good practice to also disable allowPrivilegeEscalation.
The logs for seccomp denials can be found in /var/log/syslog.
The default seccomp profile location is /var/lib/kubelet/seccomp.
AppArmor
AppArmor is a Linux kernel security module that supplements the standard Linux user and group based permissions to confine programs to a limited set of resources. AppArmor can be configured for any application to reduce its potential attack surface and provide greater in-depth defense. It is configured through profiles tuned to allow the access needed by a specific program or container, such as Linux capabilities, network access, file permissions, etc. Each profile can be run in either enforcing mode, which blocks access to disallowed resources, or complain mode, which only reports violations.
AppArmor profiles:
AppArmor profiles are specified per-container. To specify the AppArmor profile to run a Pod container with, add an annotation to the Pod’s metadata:
container.apparmor.security.beta.kubernetes.io/<container_name>: <profile_ref>
Where <container_name>
is the name of the container to apply the profile to, and <profile_ref>
specifies the profile to apply. The profile_ref
can be one of:
runtime/default
to apply the runtime’s default profilelocalhost/<profile_name>
to apply the profile loaded on the host with the name<profile_name>
unconfined
to indicate that no profiles will be loaded
Check if AppArmor module is loaded
> aa-status
56 profiles are loaded.
19 profiles are in enforce mode.
Example: load the profile usr.sbin.nginx, located in the default AppArmor profiles directory:
apparmor_parser -q /etc/apparmor.d/usr.sbin.nginx
Since we don’t know where the Pod will be scheduled, we’ll need to load the profile on all our nodes. For this example we’ll use SSH to install the profiles, but other approaches are discussed in Setting up nodes with profiles.
NODES=(
# The SSH-accessible domain names of your nodes
gke-test-default-pool-239f5d02-gyn2.us-central1-a.my-k8s
gke-test-default-pool-239f5d02-x1kf.us-central1-a.my-k8s
gke-test-default-pool-239f5d02-xwux.us-central1-a.my-k8s)
for NODE in ${NODES[*]}; do ssh $NODE 'sudo apparmor_parser -q <<EOF
#include <tunables/global>
profile k8s-apparmor-example-deny-write flags=(attach_disconnected) {
#include <abstractions/base>
file,
# Deny all file writes.
deny /** w,
}
EOF'
done
Example:
apiVersion: v1
kind: Pod
metadata:
name: hello-apparmor
annotations:
# Tell Kubernetes to apply the AppArmor profile "k8s-apparmor-example-deny-write".
# Note that this is ignored if the Kubernetes node is not running version 1.4 or greater.
container.apparmor.security.beta.kubernetes.io/hello: localhost/k8s-apparmor-example-deny-write
spec:
containers:
- name: hello
image: busybox
command: [ "sh", "-c", "echo 'Hello AppArmor!' && sleep 1h" ]
Check
> kubectl get events | grep hello-apparmor
> kubectl exec hello-apparmor cat /proc/1/attr/current
k8s-apparmor-example-deny-write (enforce)
> kubectl exec hello-apparmor touch /tmp/test
touch: /tmp/test: Permission denied
https://kubernetes.io/docs/tutorials/clusters/apparmor/
Minimize Microservice Vulnerabilities
Setup appropriate OS level security domains e.g. using PSP, OPA, security contexts
Admission Controllers
An admission controller is a piece of code that intercepts requests to the Kubernetes API server prior to persistence of the object, but after the request is authenticated and authorized. The controllers consist of the list below, are compiled into the kube-apiserver
binary, and may only be configured by the cluster administrator. In that list, there are two special controllers: MutatingAdmissionWebhook and ValidatingAdmissionWebhook. These execute the mutating and validating (respectively) admission control webhooks which are configured in the API.
Admission controllers may be “validating”, “mutating”, or both. Mutating controllers may modify the objects they admit; validating controllers may not.
Admission controllers limit requests to create, delete, modify or connect to (proxy). They do not support read requests.
The admission control process proceeds in two phases. In the first phase, mutating admission controllers are run. In the second phase, validating admission controllers are run. Note again that some of the controllers are both.
If any of the controllers in either phase reject the request, the entire request is rejected immediately and an error is returned to the end-user.
Finally, in addition to sometimes mutating the object in question, admission controllers may sometimes have side effects, that is, mutate related resources as part of request processing. Incrementing quota usage is the canonical example of why this is necessary. Any such side-effect needs a corresponding reclamation or reconciliation process, as a given admission controller does not know for sure that a given request will pass all of the other admission controllers.
Why do I need them?
Many advanced features in Kubernetes require an admission controller to be enabled in order to properly support the feature. As a result, a Kubernetes API server that is not properly configured with the right set of admission controllers is an incomplete server and will not support all the features you expect.
Request to create a pod
kubectl -> Authentication -> Authorization -> Admission Controllers -> Create pod
View Enabled Admissions controllers
Some admission controllers are enabled by default; we can check them by querying the apiserver binary:
> kube-apiserver -h | grep enable-admission-plugins
Or we can use
> kubectl exec kube-apiserver-controlplane -n kube-system -- kube-apiserver -h | grep enable-admission-plugins
--enable-admission-plugins strings admission plugins that should be enabled in addition to default enabled ones (NamespaceLifecycle, LimitRanger, ServiceAccount, TaintNodesByCondition, Priority, DefaultTolerationSeconds, DefaultStorageClass, StorageObjectInUseProtection, PersistentVolumeClaimResize, RuntimeClass, CertificateApproval, CertificateSigning, CertificateSubjectRestriction, DefaultIngressClass, MutatingAdmissionWebhook, ValidatingAdmissionWebhook, ResourceQuota). Comma-delimited list of admission plugins: AlwaysAdmit, AlwaysDeny, AlwaysPullImages, CertificateApproval, CertificateSigning, CertificateSubjectRestriction, DefaultIngressClass, DefaultStorageClass, DefaultTolerationSeconds, DenyEscalatingExec, DenyExecOnPrivileged, EventRateLimit, ExtendedResourceToleration, ImagePolicyWebhook, LimitPodHardAntiAffinityTopology, LimitRanger, MutatingAdmissionWebhook, NamespaceAutoProvision, NamespaceExists, NamespaceLifecycle, NodeRestriction, OwnerReferencesPermissionEnforcement, PersistentVolumeClaimResize, PersistentVolumeLabel, PodNodeSelector, PodSecurityPolicy, PodTolerationRestriction, Priority, ResourceQuota, RuntimeClass, SecurityContextDeny, ServiceAccount, StorageObjectInUseProtection, TaintNodesByCondition, ValidatingAdmissionWebhook.
Add new admission controller
We need to add it to kube-apiserver.yaml:
> grep NamespaceAuto /etc/kubernetes/manifests/kube-apiserver.yaml
- --enable-admission-plugins=NodeRestriction,NamespaceAutoProvision
Disable an admission controller
> grep admission /etc/kubernetes/manifests/kube-apiserver.yaml
- --enable-admission-plugins=NodeRestriction,NamespaceAutoProvision
- --disable-admission-plugins=DefaultStorageClass
Note: the NamespaceExists and NamespaceAutoProvision admission controllers are deprecated and are now replaced by the NamespaceLifecycle admission controller. The NamespaceLifecycle admission controller makes sure that requests to a non-existent namespace are rejected and that the default namespaces such as default, kube-system and kube-public cannot be deleted.
Validating and Mutating Admission Controllers
Mutating: can change the request (mutating admission controllers run first). Validating: can only validate (accept or reject) requests to create, delete, or modify objects.
Custom admission controllers can be built with:
- MutatingAdmissionWebhook
- ValidatingAdmissionWebhook
NOTE: This is a complicated topic, especially writing a custom admission controller; however, it is very useful.
OPA
More often than not organizations need to apply various kinds of policies on the environments where they run their applications. These policies might be required to meet compliance requirements, achieve a higher degree of security, achieve standardization across multiple environments, etc. This calls for an automated/declarative way to define and enforce these policies. Policy engines like OPA help us achieve the same.
OPA in Kubernetes
We can integrate OPA with admission controllers, for example by creating a custom admission webhook (mutating or validating) that validates requests against an OPA server.
Steps:
- Install Opa server on K8S (or externally)
- Creating the validating (or mutating) webhook config
The kube-mgmt agent is deployed alongside OPA (as a sidecar container) and is responsible for:
- Replicating Kubernetes resources into OPA
- Loading policies into OPA via Kubernetes
To enable kube-mgmt to automatically identify policies defined in Kubernetes and load them into OPA, we need to create ConfigMaps with the label openpolicyagent.org/policy set to rego.
ConfigMap content (untrusted-registry.rego):
package kubernetes.admission
deny[msg] {
input.request.kind.kind == "Pod"
image := input.request.object.spec.containers[_].image
not startswith(image, "hooli.com/")
msg := sprintf("image '%v' comes from untrusted registry", [image])
}
Creating the ConfigMap:
> kubectl create configmap untrusted-registry --from-file=untrusted-registry.rego
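If the ConfigMap was created without the label, it can be added afterwards so kube-mgmt picks the policy up:
> kubectl label configmap untrusted-registry openpolicyagent.org/policy=rego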
Deny All Policy (using OPA Gatekeeper)
First list the CRDs:
> k get crd
> k get constrainttemplates
Create the ConstraintTemplate
apiVersion: templates.gatekeeper.sh/v1beta1
kind: ConstraintTemplate
metadata:
name: k8salwaysdeny
spec:
crd:
spec:
names:
kind: K8sAlwaysDeny
validation:
# Schema for the `parameters` field
openAPIV3Schema:
properties:
message:
type: string
targets:
- target: admission.k8s.gatekeeper.sh
rego: |
package k8salwaysdeny
violation[{"msg": msg}] {
1 > 0
msg := input.parameters.message
}
Now we can create resources of type K8sAlwaysDeny.
Create the Constraint:
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sAlwaysDeny
metadata:
name: pod-always-deny
spec:
match:
kinds:
- apiGroups: [""]
kinds: ["Pod"]
parameters:
message: "ACCESS DENIED!"
Apply both files, and verify
> k get K8sAlwaysDeny
> k describe K8sAlwaysDeny
Pod Security Policy
PodSecurityPolicy is an admission controller: if the pod spec does not comply with the policy we have configured, the request is rejected, for example to restrict the securityContext and deny privileged mode or runAsUser: 0. Enabling the PodSecurityPolicy admission controller works like any other admission controller:
> grep admission /etc/kubernetes/manifests/kube-apiserver.yaml
- --enable-admission-plugins=PodSecurityPolicy
Now we can create a PodSecurityPolicy configuration, for example to deny privileged mode:
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
name: example-psp
spec:
privileged: false
seLinux:
rule: RunAsAny
runAsUser:
rule: RunAsAny
supplementalGroups:
rule: RunAsAny
fsGroup:
rule: RunAsAny
Note: The other fields shown are mandatory.
Check the psp
> kubectl get psp
NAME PRIV CAPS SELINUX RUNASUSER FSGROUP SUPGROUP READONLYROOTFS VOLUMES
example-psp false RunAsAny RunAsAny RunAsAny RunAsAny false configMap,secret,emptyDir,hostPath
Now we need to authorize the pod's service account to use the PodSecurityPolicy. To do that we create a Role that grants the use verb on the policy and bind it to the default SA in that namespace with a RoleBinding (a sketch follows).
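A minimal sketch of that RBAC, assuming the example-psp policy from above and the default SA in the default namespace (resource names are illustrative):
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: psp-example-role
  namespace: default
rules:
- apiGroups: ["policy"]
  resources: ["podsecuritypolicies"]
  resourceNames: ["example-psp"]
  verbs: ["use"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: psp-example-rolebinding
  namespace: default
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: psp-example-role
subjects:
- kind: ServiceAccount
  name: default
  namespace: default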
Now that we have the PodSecurityPolicy admission controller enabled, the policy created, and the RBAC in place, if we try to create a pod with privileged: true it will fail:
Error from server (Forbidden): error when creating "/root/pod.yaml": pods "example-app" is forbidden: PodSecurityPolicy: unable to admit pod: [spec.containers[0].securityContext.privileged: Invalid value: true: Privileged containers are not allowed spec.containers[0].securityContext.capabilities.add: Invalid value: "CAP_SYS_BOOT": capability may not be added]
Manage Kubernetes secrets
Basic Secrets info
Kubernetes Secrets let you store and manage sensitive information, such as passwords, OAuth tokens, and ssh keys. Storing confidential information in a Secret is safer and more flexible than putting it verbatim in a Pod definition or in a container image
Creating Secrets
- Imperative:
> kubectl create secret generic \
app-secret --from-literal=DB_Host=mysql \
--from-literal=DB_User=root
or we can use a file:
> kubectl create secret generic \
app-secret --from-file=app_secret.properties
- Declarative:
> kubectl create -f secret-data.yaml
secret-data.yaml:
apiVersion: v1
kind: Secret
metadata:
name: app-secret
data:
DB_Host: bXlzcWw=
DB_User: cm9vdA==
DB_Password: cGFzd3Jk
To encode the data, for example:
echo -n 'mysql' | base64
To decode the data:
echo -n 'bXlzcWw=' | base64 --decode
View Secrets
> kubectl get secrets
> kubectl describe secrets
To view the values:
> kubectl get secret app-secret -o yaml
Secret in pods
apiVersion: v1
kind: Pod
metadata:
name: simple-webapp-color
labels:
name: simple-webapp-color
spec:
containers:
- name: simple-webapp-color
image: simple-webapp-color
ports:
- containerPort: 8080
envFrom:
- secretRef:
name: app-secret
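For the volume approach, a minimal sketch (the mount path matches the listing below; the volume name is illustrative):
spec:
  containers:
  - name: simple-webapp-color
    image: simple-webapp-color
    volumeMounts:
    - name: app-secret-volume
      mountPath: /opt/app-secret-volumes
      readOnly: true
  volumes:
  - name: app-secret-volume
    secret:
      secretName: app-secret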
When a Secret is mounted in a pod as a volume, we can see the secret files inside the container:
> ls /opt/app-secret-volumes
DB_Host DB_Password DB_User
> cat /opt/app-secret-volumes/DB_Password
paswrd
A note about secrets
Remember that secrets encode data in base64 format. Anyone with the base64 encoded secret can easily decode it. As such the secrets can be considered as not very safe.
Hack Secrets in Container
With access to the container runtime on a node, anyone can read the secrets that are injected into containers as environment variables or mounted files.
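A minimal sketch of what this can look like on a node running Docker (the container ID is a placeholder):
# environment variables passed to the container (may contain secret values)
> docker inspect <container-id> --format '{{ .Config.Env }}'
# files mounted into the container, e.g. the SA token
> docker exec <container-id> cat /run/secrets/kubernetes.io/serviceaccount/token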
Hack Secrets in ETCD
To access a secret in etcd, we first need to find the certificate paths used by the apiserver:
> cat /etc/kubernetes/manifests/kube-apiserver.yaml | grep etcd
Now we can check the etcd endpoint health to verify the connection:
> ETCDCTL_API=3 etcdctl --cert /etc/kubernetes/pki/apiserver-etcd-client.crt --key /etc/kubernetes/pki/apiserver-etcd-client.key --cacert /etc/kubernetes/pki/etcd/ca.crt endpoint health
# --endpoints "https://127.0.0.1:2379" not necessary because we’re on same node
Finally we could get the secret from the ETCD
> ETCDCTL_API=3 etcdctl --cert /etc/kubernetes/pki/apiserver-etcd-client.crt --key /etc/kubernetes/pki/apiserver-etcd-client.key --cacert /etc/kubernetes/pki/etcd/ca.crt get /registry/secrets/default/secret1
Note: default
is the namespace and secret1
is the secret name. The secret is saved in plain text.
Encrypt ETCD
To avoid storing secrets in plain text in etcd, we can encrypt them at rest by passing an EncryptionConfiguration to the kube-apiserver.
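A minimal sketch based on the upstream documentation (the key is a placeholder; generate one with head -c 32 /dev/urandom | base64):
apiVersion: apiserver.config.k8s.io/v1
kind: EncryptionConfiguration
resources:
  - resources:
      - secrets
    providers:
      - aescbc:
          keys:
            - name: key1
              secret: <base64-encoded 32-byte key>
      - identity: {}
The file is referenced from the kube-apiserver manifest with --encryption-provider-config=<path to the file>, and existing secrets only get encrypted after being rewritten, for example with kubectl get secrets -A -o json | kubectl replace -f -.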
Use container runtime sandboxes in multi-tenant environments (e.g. gvisor, kata containers)
Gvisor
gVisor is an application kernel, written in Go, that implements a substantial portion of the Linux system call interface. It provides an additional layer of isolation between running applications and the host operating system.
gVisor provides a virtualized environment in order to sandbox containers. The system interfaces normally implemented by the host kernel are moved into a distinct, per-sandbox application kernel in order to minimize the risk of a container escape exploit.
In Kubernetes, gVisor allows us to isolate Pods.
Katacontainers
Kata Containers is an open source community working to build a secure container runtime with lightweight virtual machines that feel and perform like containers, but provide stronger workload isolation using hardware virtualization technology as a second layer of defense.
Runtimes in Kubernetes
We can specify a runtime in Kubernetes using a RuntimeClass, with Kata Containers or gVisor:
- gVisor = Handler -> runsc
- Kata = Handler -> kata
Example:
apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
name: secure-runtime
handler: runsc
After that we can specify on the pod which runtimeClass to use:
apiVersion: v1
kind: Pod
metadata:
name: nginx-1
labels:
name: nginx
spec:
runtimeClassName: secure-runtime
containers:
- name: nginx
image: nginx
ports:
- containerPort: 8080
Check runtimeclasses
> kubectl describe runtimeclasses gvisor | grep Handler
Handler: runsc
Note: Docker makes use of runc
to start containers.
Implement pod to pod encryption by use of mTLS
What is mTLS?
Mutual TLS, or mTLS for short, is a method for mutual authentication. mTLS ensures that the parties at each end of a network connection are who they claim to be by verifying that they both have the correct private key.
mTLS is often used in a Zero Trust security framework* to verify users, devices, and servers within an organization. It can also help keep APIs secure.
Zero Trust means that no user, device, or network traffic is trusted by default, an approach that helps eliminate many security vulnerabilities.
Pod-to-Pod traffic using mTLS
With mutual authentication, both pods prove to each other that they really are the pods they claim to be.
Istio and Linkerd (service mesh solutions) facilitate mTLS encryption between pods. For example, with Istio (the most popular service mesh), a sidecar container is deployed in every pod; this container encrypts/decrypts the data: App1 -> Sidecar App1 -> Sidecar App2 -> App2
More information about how istio works: https://istio.io/latest/blog/2019/data-plane-setup/
Note: Istio/Linkerd are not part of the CKS, and nothing specific about mTLS tooling is required in the certification.
Supply Chain Security
Minimize base image footprint
In the Dockerfile we can apply some practices to make the base image smaller:
- Use a lighter base image (slim, alpine, minimal)
- Generate fewer layers (chain commands with && in a single RUN)
- Clean up the image (remove shells, package caches, tools)
Another option is a multi-stage build. For example, we can build the application in one stage and run it in a lighter container image in the final stage:
# build container stage 1
FROM ubuntu
ARG DEBIAN_FRONTEND=noninteractive
RUN apt-get update && apt-get install -y golang-go
COPY app.go .
RUN CGO_ENABLED=0 go build app.go
# app container stage 2
FROM alpine
COPY --from=0 /app .
CMD ["./app"]
Secure and harden images
- Use pinned package versions
FROM alpine:3.11.6
RUN apt-get update && apt-get install -y golang-go=2:1.13
- Don't run as root
RUN addgroup -S appgroup && adduser -S appuser -G appgroup -h /home/appuser
COPY --from=0 /app /home/appuser/
USER appuser
- Make the filesystem read-only
RUN chmod a-w /etc
Note: Do the same with all directories that don't need write permission.
- Remove shell access
RUN rm -rf /bin/*
More information: https://docs.docker.com/develop/develop-images/dockerfile_best-practices/
Secure your supply chain: whitelist allowed registries, sign and validate images
Use a Docker private registry
It is good practice to use Docker images from a private registry instead of public images from e.g. Docker Hub.
To use an image from a private registry on Kubernetes, we need to follow these steps:
- Create a secret object (type docker-registry) with the credentials:
> kubectl create secret docker-registry private-reg-cred --docker-username=dock_user --docker-password=dock_password --docker-server=myprivateregistry.com:5000 --docker-email=dock_user@myprivateregistry.com
- Create the pod/deployment using the secret (imagePullSecrets):
spec:
  containers:
  - name: foo
    image: janedoe/awesomeapp:v1
  imagePullSecrets:
  - name: myregistrykey
Whitelist images
Whitelist image repositories using the ImagePolicyWebhook admission controller.
We can restrict pods to images from whitelisted registries using an ImagePolicyWebhook admission controller, limit the use of images with the latest tag, and ensure that all images have tags.
- Deploy an Image Policy Webhook server:
apiVersion: apps/v1
kind: Deployment
...
spec:
  containers:
    - name: image-bouncer-webhook
      imagePullPolicy: Always
      image: "kainlite/kube-image-bouncer:latest"
      args:
        - "--cert=/etc/admission-controller/tls/tls.crt"
        - "--key=/etc/admission-controller/tls/tls.key"
        - "--debug"
        - "--registry-whitelist=docker.io,k8s.gcr.io"
      volumeMounts:
        - name: tls
          mountPath: /etc/admission-controller/tls
  volumes:
    - name: tls
      secret:
        secretName: tls-image-bouncer-webhook
- Create the AdmissionConfiguration:
apiVersion: apiserver.config.k8s.io/v1
kind: AdmissionConfiguration
plugins:
  - name: ImagePolicyWebhook
    configuration:
      imagePolicy:
        kubeConfigFile: /etc/kubernetes/pki/admission_kube_config.yaml
        allowTTL: 50
        denyTTL: 50
        retryBackoff: 500
        defaultAllow: false
With the configuration from admission_kube_config.yaml:
apiVersion: v1
kind: Config
clusters:
- cluster:
    certificate-authority: /etc/kubernetes/pki/server.crt
    server: https://image-bouncer-webhook:30080/image_policy
  name: bouncer_webhook
contexts:
- context:
    cluster: bouncer_webhook
    user: api-server
  name: bouncer_validator
current-context: bouncer_validator
preferences: {}
users:
- name: api-server
  user:
    client-certificate: /etc/kubernetes/pki/apiserver.crt
    client-key: /etc/kubernetes/pki/apiserver.key
Note: The 30080 port is from the image-bouncer-webhook service exposed as Nodeport
- Enable the ImagePolicyWebhook so that our image policy validation takes place in the API server:
> grep admission /etc/kubernetes/manifests/kube-apiserver.yaml
- --enable-admission-plugins=NodeRestriction,ImagePolicyWebhook
- --admission-control-config-file=/etc/kubernetes/pki/admission_configuration.yaml
Note: API server will automatically restart and pickup this configuration.
- Now we can try to create a pod using the latest image:
> kubectl run nginx --image nginx:latest
Error from server (Forbidden): pods "nginx" is forbidden: image policy webhook backend denied one or more images: Images using latest tag are not allowed
Another example, using OPA Gatekeeper: whitelist only images from docker.io and k8s.gcr.io.
Creating the template.yaml:
apiVersion: templates.gatekeeper.sh/v1beta1
kind: ConstraintTemplate
metadata:
name: k8strustedimages
spec:
crd:
spec:
names:
kind: K8sTrustedImages
targets:
- target: admission.k8s.gatekeeper.sh
rego: |
package k8strustedimages
violation[{"msg": msg}] {
image := input.review.object.spec.containers[_].image
not startswith(image, "docker.io/")
not startswith(image, "k8s.gcr.io/")
msg := "not trusted image!"
}
Creating constraint.yaml
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sTrustedImages
metadata:
name: pod-trusted-images
spec:
match:
kinds:
- apiGroups: [""]
kinds: ["Pod"]
Apply
> k create -f template.yaml
> k create -f constraint.yaml
Verify
> k get constrainttemplate
> k get k8strustedimages
> k describe k8strustedimages pod-trusted-images
Use static analysis of user workloads (e.g.Kubernetes resources, Docker files)
We can analyze Kubernetes YAML before creating resources with kubectl, using tools such as Kubesec, which performs static analysis of user workloads.
Kubesec quantifies risk for Kubernetes resources by validating the configuration files and manifest files used for Kubernetes deployments and operations. Kubesec shows a score and the reasons for it.
We can deploy kubesec locally, as a server, or using the SaaS version.
- Installing Kubesec using binary files:
> wget https://github.com/controlplaneio/kubesec/releases/download/v2.11.0/kubesec_linux_amd64.tar.gz
> tar -xvf kubesec_linux_amd64.tar.gz
> mv kubesec /usr/bin/
- Analyze a file using kubesec and generate a report file:
> kubesec scan node.yaml > kubesec_report.json
- View the report:
# Only some lines are shown
[
  {
    "object": "Pod/node.default",
    "valid": true,
    "fileName": "/root/node.yaml",
    "message": "Failed with a score of -27 points",
    "score": -27,
    "scoring": {
      "critical": [
        {
          "id": "Privileged",
          "selector": "containers[] .securityContext .privileged == true",
          "reason": "Privileged containers can allow almost completely unrestricted host access",
          "points": -30
        }
      ],
      "passed": [
        {
          "id": "ServiceAccountName",
          "selector": ".spec .serviceAccountName",
          "reason": "Service accounts restrict Kubernetes API access and should be configured with least privilege",
          "points": 3
        }
      ],
      "advise": [
        {
          "id": "ApparmorAny",
          "selector": ".metadata .annotations .\"container.apparmor.security.beta.kubernetes.io/nginx\"",
          "reason": "Well defined AppArmor policies may provide greater protection from unknown threats. WARNING: NOT PRODUCTION READY",
          "points": 3
        },
        {
          "id": "ReadOnlyRootFilesystem",
          "selector": "containers[] .securityContext .readOnlyRootFilesystem == true",
          "reason": "An immutable root filesystem can prevent malicious binaries being added to PATH and increase attack cost",
          "points": 1
        },
        {
          "id": "RunAsNonRoot",
          "selector": "containers[] .securityContext .runAsNonRoot == true",
          "reason": "Force the running image to run as a non-root user to ensure least privilege",
          "points": 1
        },
        {
          "id": "RunAsUser",
          "selector": "containers[] .securityContext .runAsUser -gt 10000",
          "reason": "Run as a high-UID user to avoid conflicts with the host's user table",
          "points": 1
        }
      ]
    }
  }
]
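Kubesec can also run as a local HTTP server, or we can use the project's hosted service (the v2.kubesec.io endpoint documented on the project page); a quick sketch:
> kubesec http 8080 &
> curl -sSX POST --data-binary @node.yaml https://v2.kubesec.io/scan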
More information: https://kubesec.io/
Scan images for known vulnerabilities
To scan images for know vulnerabilities we can use Trivy. Trivy looks on the CVE, and we can check with diferents Severity Scores from 0 to 10.
CVSS V3.0 Ratings
- Low 0-4
- Medium 4-7
- High 7-9
- Critical 9-10
Trivy
- Install Trivy:
# Add the trivy repo
> apt-get update
> apt-get install wget apt-transport-https gnupg lsb-release
> wget -qO - https://aquasecurity.github.io/trivy-repo/deb/public.key | sudo apt-key add -
> echo deb https://aquasecurity.github.io/trivy-repo/deb $(lsb_release -sc) main | sudo tee -a /etc/apt/sources.list.d/trivy.list
# Update the repo and install trivy
> apt-get update
> apt-get install trivy
- Analyze a Docker image:
> docker pull nginx:1.19.0
> trivy image nginx:1.19.0
nginx (alpine 3.12.3)
======================================
Total: 41 (UNKNOWN: 0, LOW: 2, MEDIUM: 8, HIGH: 28, CRITICAL: 3)
- Analyze with specified severities only:
> trivy image --severity CRITICAL,HIGH nginx:1.19.0
Best Practices
- Continuously rescan images
- K8S Admission Controllers to scan images
- Have your own repository with pre-scanned images ready to go
- Integrate scanning into your CICD
Monitoring, Logging and Runtime Security
Perform behavioral analytics of syscall process and file activities at the host and container level to detect malicious activities
Perform behavioral analytics of syscall process
Falco acts as a security camera detecting unexpected behavior, intrusions, and data theft in real time.
For example, Falco alerts if somebody reads a password file or clears a log file.
Install Falco
Detect Threats using Falco
Check the Falco service logs (config file on /etc/falco/falco.yaml)
> journalctl -fu falco
Note: Rules are read in the order of the files in the list. If the same rule is present in several files, the one in the last file overrides the others.
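As an illustration, a custom rule in /etc/falco/falco_rules.local.yaml could look roughly like this (rule name, condition and output are only a sketch of the rule syntax, not a copy of the shipped rules):
- rule: Read sensitive file in container
  desc: Detect /etc/shadow being opened inside a container
  condition: evt.type in (open, openat) and fd.name = /etc/shadow and container
  output: "Sensitive file opened (user=%user.name command=%proc.cmdline file=%fd.name)"
  priority: WARNING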
Falco hot reload
> kill -1 $(cat /var/run/falco.pid)
Detect threats within physical infrastructure, apps, networks, data, users and workloads
Detect all phases of attack regardless where it occurs and how it spreads
Perform deep analytical investigation and identification of bad actors within environment
Ensure immutability of containers at runtime
To ensure immutability of containers, one way is to use the securityContext and set readOnlyRootFilesystem: true (this may break the application); to avoid that, mount volumes for the paths the application needs to write to, for example cache and run directories.
In summary, the key to ensuring a pod is immutable is to use readOnlyRootFilesystem: true and to use volumes for temporary data where needed. A good practice is to combine this with privileged: false and a non-root user, e.g. runAsUser: 100.
Example
spec:
containers:
- image: httpd
name: apache2
securityContext:
readOnlyRootFilesystem: true
volumeMounts:
- mountPath: /usr/local/apache2/logs
name: log-volume
volumes:
- name: log-volume
emptyDir: {}
Use Audit Logs to monitor access
With Audit Policies, the cluster audits the activities generated by users, by applications that use the Kubernetes API, and by the control plane itself. We can select what data to store.
The first matching rule sets the audit level of the event. The defined audit levels are:
- None - don't log events that match this rule.
- Metadata - log request metadata (requesting user, timestamp, resource, verb, etc.) but not the request or response body.
- Request - log event metadata and request body but not the response body. This does not apply to non-resource requests.
- RequestResponse - log event metadata, request and response bodies. This does not apply to non-resource requests.
Note: The RequestResponse level provides the most verbose logs, including the metadata, the request body and the response body.
You can pass a file with the policy to kube-apiserver using the --audit-policy-file flag. If the flag is omitted, no events are logged. Note that the rules field must be provided in the audit policy file; a policy with no (0) rules is treated as illegal.
Sample Policies:
Logging all metadata events:
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
- level: Metadata
Another big example:
apiVersion: audit.k8s.io/v1 # This is required.
kind: Policy
# Don't generate audit events for all requests in RequestReceived stage.
omitStages:
- "RequestReceived"
rules:
# Log pod changes at RequestResponse level
- level: RequestResponse
resources:
- group: ""
# Resource "pods" doesn't match requests to any subresource of pods,
# which is consistent with the RBAC policy.
resources: ["pods"]
# Log "pods/log", "pods/status" at Metadata level
- level: Metadata
resources:
- group: ""
resources: ["pods/log", "pods/status"]
# Don't log requests to a configmap called "controller-leader"
- level: None
resources:
- group: ""
resources: ["configmaps"]
resourceNames: ["controller-leader"]
# Don't log watch requests by the "system:kube-proxy" on endpoints or services
- level: None
users: ["system:kube-proxy"]
verbs: ["watch"]
resources:
- group: "" # core API group
resources: ["endpoints", "services"]
# Don't log authenticated requests to certain non-resource URL paths.
- level: None
userGroups: ["system:authenticated"]
nonResourceURLs:
- "/api*" # Wildcard matching.
- "/version"
# Log the request body of configmap changes in kube-system.
- level: Request
resources:
- group: "" # core API group
resources: ["configmaps"]
# This rule only applies to resources in the "kube-system" namespace.
# The empty string "" can be used to select non-namespaced resources.
namespaces: ["kube-system"]
# Log configmap and secret changes in all other namespaces at the Metadata level.
- level: Metadata
resources:
- group: "" # core API group
resources: ["secrets", "configmaps"]
# Log all other resources in core and extensions at the Request level.
- level: Request
resources:
- group: "" # core API group
- group: "extensions" # Version of group should NOT be included.
# A catch-all rule to log all other requests at the Metadata level.
- level: Metadata
# Long-running requests like watches that fall under this rule will not
# generate an audit event in RequestReceived.
omitStages:
- "RequestReceived"
Note: The rules are checked in order; the first matching rule sets the "audit level" of the event.
Changes on our Audit Policy
Steps to make changes:
- Change policy file
- Disable audit logging in apiserver, wait till restart
- Enable audit logging in apiserver, wait till restart
- Test
Points 2 and 3 can be replaced by moving the kube-apiserver manifest file out of the manifests directory and then copying it back.
Exercise:
Restrict logged data with an Audit Policy
- Nothing from stage RequestReceived
- Nothing from “get”, “watch”, “list”
- From Secrets, only Metadata level
- Everything else at RequestResponse level
Answer:
apiVersion: audit.k8s.io/v1
kind: Policy
omitStages:
- "RequestReceived"
rules:
- level: None
verbs: ["get","watch","list"]
- level: Metadata
resources:
- group: ""
resources: ["secrets"]
- level: RequestResponse
Exercise: Enable auditing in this kubernetes cluster. Create a new policy file that will only log events based on the below specifications:
Namespace: prod
Operations: delete
Resources: secrets
Log Path: /var/log/prod-secrets.log
Audit file location: /etc/kubernetes/prod-audit.yaml
Maximum days to keep the logs: 30
Answer:
Create /etc/kubernetes/prod-audit.yaml
as below:
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
- level: Metadata
namespaces: ["prod"]
verbs: ["delete"]
resources:
- group: ""
resources: ["secrets"]
Next, make sure to enable audit logging in the api-server:
- --audit-policy-file=/etc/kubernetes/prod-audit.yaml
- --audit-log-path=/var/log/prod-secrets.log
- --audit-log-maxage=30
Then, add volumes and volumeMounts as shown in the snippets below.
volumes:
- name: audit
hostPath:
path: /etc/kubernetes/prod-audit.yaml
type: File
- name: audit-log
hostPath:
path: /var/log/prod-secrets.log
type: FileOrCreate
volumeMounts:
  - mountPath: /etc/kubernetes/prod-audit.yaml
    name: audit
    readOnly: true
  - mountPath: /var/log/prod-secrets.log
    name: audit-log
    readOnly: false
Then save the file and make sure that kube-apiserver
restarts.
Cheatsheet
-
AppArmor is a Linux kernel security module that supplements the standard Linux user and group based permissions to confine programs to a limited set of resources
- Load the AppArmor profile frontend and check that it is loaded:
apparmor_parser -q /etc/apparmor.d/frontend
aa-status | grep frontend
- Add an annotation to the pod metadata:
container.apparmor.security.beta.kubernetes.io/frontend-site: localhost/frontend
Note: Remember to use the profile name, not the file name.
-
Decode a secret
echo YmpCUGJqTkRRRzVJUUdOclRUTT0K | base64 -d
-
Trivy check vulnerabilities only CRITICAL
trivy image --severity CRITICAL kodekloud/webapp-delayed-start
- Seccomp allows us to restrict syscalls.
Enable the profile /var/lib/kubelet/seccomp/profiles/audit.json:
{
  "defaultAction": "SCMP_ACT_ERRNO",
  "architectures": ["SCMP_ARCH_X86_64", "SCMP_ARCH_X86", "SCMP_ARCH_X32"],
  "syscalls": [
    {
      "names": ["accept4", "read", "writev"],
      "action": "SCMP_ACT_ALLOW"
    }
  ]
}
Use it on a pod:
spec:
  securityContext:
    seccompProfile:
      type: Localhost
      localhostProfile: profiles/audit.json
  containers:
  - image: nginx
-
Falco acts as a security camera detecting unexpected behavior, intrusions, and data theft in real time. For example, Falco alerts if somebody wants to see a password file, o clean a log file.
-
Check falco logs
journalctl -u falco
-
Enable file_output in /etc/falco/falco.yaml
file_output:
  enabled: true
  keep_alive: false
  filename: /opt/security_incidents/alerts.log
-
Add a custom rule under /etc/falco/falco_rules.local.yaml (copy original and edit some parameters)
-
Restart Falco
kill -1 $(cat /var/run/falco.pid) # or systemctl restart falco
- Runtime Class: create a pod with gVisor
spec:
  runtimeClassName: gvisor
  containers:
  - image: nginx
- Make sure the service account token is not mounted in the pod:
spec:
  containers:
  - image: nginx
    name: apps-cluster-dash
  serviceAccountName: cluster-view
  automountServiceAccountToken: false
-
Kubesec allows us to analyze yaml, also shows a score and the reason/improvements
kubesec scan node.yaml > kubesec_report.json
- Immutability fails with:
  - privileged: true
  - readOnlyRootFilesystem: false (it should be set to true)
- PodSecurityPolicy is a K8s resource; we can limit privileged containers, volume types, and more. First, enable PodSecurityPolicy on the kube-apiserver (/etc/kubernetes/manifests/kube-apiserver.yaml):
- --enable-admission-plugins=NodeRestriction,PodSecurityPolicy
Create a PodSecurityPolicy resource, disabling privileged mode and limiting the volume types:
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: pod-psp
spec:
  privileged: false
  seLinux:
    rule: RunAsAny
  runAsUser:
    rule: RunAsAny
  supplementalGroups:
    rule: RunAsAny
  fsGroup:
    rule: RunAsAny
  volumes:
  - configMap
  - secret
  - emptyDir
  - hostPath
Now if we need to create a pod, we need to adapt its manifest to satisfy the policy, for example privileged: false.
-
Check users permissions from roles/rolebindings
kubectl auth can-i update pods --as=john --namespace=development
- Limit the use of images with the latest tag and ensure that all images have tags, using an ImagePolicyWebhook (see the step-by-step configuration in the Supply Chain Security section above).