Nahuel Hernandez

Another personal blog about IT, Automation, Cloud, DevOps and Stuff.

CKS Kubernetes Specialist Security Certification

The Certified Kubernetes Security Specialist or CKS is a challenging, performance-based exam that requires solving multiple problems from the command line. I studied for and passed all three Kubernetes certifications (CKA/CKAD/CKS), and I want to share valuable information to help prepare for and pass this exam.

Certified Kubernetes Security Specialist

The Certified Kubernetes Security Specialist or CKS is a hands-on test consisting of a set of performance-based items (15 problems) to be solved from the command line; it is expected to take approximately two (2) hours to complete.

For me it was the most challenging Kubernetes exam. I recommend studying with the Kim Wüstkamp (Killer Shell) course and KodeKloud, and practicing a lot to be very fast. I finished the exam in the last minute.

Prerequisite:

Candidates must have taken and passed the Certified Kubernetes Administrator (CKA) exam prior to attempting the CKS exam.

Exam Objectives:

  • Cluster Setup: 10%
  • Cluster Hardening: 15%
  • System Hardening: 15%
  • Minimize Microservice Vulnerabilities: 20%
  • Supply Chain Security: 20%
  • Monitoring, Logging and Runtime Security: 20%

My notes are from the official Kubernetes documentation, the Killer Shell course and KodeKloud.

Cluster Setup

Use Network security policies to restrict cluster level access

If you want to control traffic flow at the IP address or port level (OSI layer 3 or 4), then you might consider using Kubernetes NetworkPolicies for particular applications in your cluster.

The entities that a Pod can communicate with are identified through a combination of the following 3 identifiers:

  1. Other pods that are allowed (exception: a pod cannot block access to itself)
  2. Namespaces that are allowed
  3. IP blocks (exception: traffic to and from the node where a Pod is running is always allowed, regardless of the IP address of the Pod or the node)

Example:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: test-network-policy
  namespace: default
spec:
  podSelector:
    matchLabels:
      role: db
  policyTypes:
  - Ingress
  - Egress
  ingress:
  - from:
    - ipBlock:
        cidr: 172.17.0.0/16
        except:
        - 172.17.1.0/24
    - namespaceSelector:
        matchLabels:
          project: myproject
    - podSelector:
        matchLabels:
          role: frontend
    ports:
    - protocol: TCP
      port: 6379
  egress:
  - to:
    - ipBlock:
        cidr: 10.0.0.0/24
    ports:
    - protocol: TCP
      port: 5978

Note: POSTing this to the API server for your cluster will have no effect unless your chosen networking solution supports network policy.

Mandatory Fields:

  • podSelector: Each NetworkPolicy includes a podSelector which selects the grouping of pods to which the policy applies. The example policy selects pods with the label “role=db”. An empty podSelector selects all pods in the namespace.
  • policyTypes: Each NetworkPolicy includes a policyTypes list which may include either Ingress, Egress, or both. If no policyTypes are specified on a NetworkPolicy then by default Ingress will always be set and Egress will be set if the NetworkPolicy has any egress rules.
  • ingress: Each NetworkPolicy may include a list of allowed ingress rules. Each rule allows traffic which matches both the from and ports sections.
  • egress: Each NetworkPolicy may include a list of allowed egress rules. Each rule allows traffic which matches both the to and ports sections.

Note: The ingress and egress rule structures are the same; the only difference is that egress rules use to while ingress rules use from.

So, the example NetworkPolicy:

  1. isolates “role=db” pods in the “default” namespace for both ingress and egress traffic (if they weren’t already isolated)
  2. (Ingress rules) allows connections to all pods in the “default” namespace with the label “role=db” on TCP port 6379 from:
    • any pod in the “default” namespace with the label “role=frontend”
    • any pod in a namespace with the label “project=myproject”
    • IP addresses in the ranges 172.17.0.0–172.17.0.255 and 172.17.2.0–172.17.255.255 (ie, all of 172.17.0.0/16 except 172.17.1.0/24)
  3. (Egress rules) allows connections from any pod in the “default” namespace with the label “role=db” to CIDR 10.0.0.0/24 on TCP port 5978

Note: By default, if no policies exist in a namespace, then all ingress and egress traffic is allowed to and from pods in that namespace.
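
A common starting point (taken from the Kubernetes docs) is a default-deny policy that selects every pod in the namespace and allows nothing; specific allow policies are then added on top of it:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
spec:
  podSelector: {}
  policyTypes:
  - Ingress
  - Egress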

To view the Policies

> kubectl get netpol

Important: The 3 selectors for netpol are:

  • podSelector
  • namespaceSelector
  • ipBlock

We can combine these selectors in a single rule (logical AND) or list them as separate rules (logical OR). For example, podSelector: frontend AND namespaceSelector: prod

    - namespaceSelector:
        matchLabels:
          project: prod
      podSelector:
        matchLabels:
          role: frontend

is not the same as podSelector: frontend OR namespaceSelector: prod

    - namespaceSelector:
        matchLabels:
          project: prod
    - podSelector:
        matchLabels:
          role: frontend

Note: If we want to create a NetworkPolicy using a namespaceSelector, we first need to add a label to the namespace.
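
For example, assuming the prod namespace should match the project: myproject selector used above:

> kubectl label namespace prod project=myproject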

Use CIS benchmark to review the security configuration of Kubernetes components

The Center for Internet Security (CIS) releases benchmarks for best practice security recommendations. The CIS Kubernetes Benchmark is a set of recommendations for configuring Kubernetes to support a strong security posture. The Benchmark is tied to a specific Kubernetes release.

Run CIS-CAT Benchmark on Linux and generate a report:

> sh ./Assessor-CLI.sh -i -rd /var/www/html/ -nts -rp index

More information at https://ccpa-docs.readthedocs.io/en/latest/

CIS Benchmark for Kubernetes:

  • etcd
  • kubelet
  • kubedns
  • kubeapi

Note: CIS-CAT Lite covers Windows 10, Ubuntu and macOS. If we want to run the CIS Benchmark against Kubernetes we need the CIS-CAT Pro version. However, we can use alternative open-source and free tools that run on K8s.

Kube-bench

kube-bench is a tool from Aqua Security that checks whether Kubernetes is deployed securely by running the checks documented in the CIS Kubernetes Benchmark. It’s open-source and free.

We can deploy Kube-bench as:

  • Docker container
  • POD in K8S
  • Binaries / Compile from source

Install on master node:

> curl -L https://github.com/aquasecurity/kube-bench/releases/download/v0.6.5/kube-bench_0.6.5_linux_amd64.tar.gz -o kube-bench_0.6.5_linux_amd64.tar.gz
> tar -xvf kube-bench_0.6.5_linux_amd64.tar.gz

Run the assessment and review the results

 > ./kube-bench --config-dir `pwd`/cfg --config `pwd`/cfg/config.yaml 
== Summary ==
43 checks PASS
12 checks FAIL
10 checks WARN
0 checks INFO
[INFO] 2 Etcd Node Configuration
[INFO] 2 Etcd Node Configuration Files
[PASS] 2.1 Ensure that the --cert-file and --key-file arguments are set as appropriate (Automated)
[PASS] 2.2 Ensure that the --client-cert-auth argument is set to true (Automated)
[PASS] 2.3 Ensure that the --auto-tls argument is not set to true (Automated)
[PASS] 2.4 Ensure that the --peer-cert-file and --peer-key-file arguments are set as appropriate (Automated)
[PASS] 2.5 Ensure that the --peer-client-cert-auth argument is set to true (Automated)
[PASS] 2.6 Ensure that the --peer-auto-tls argument is not set to true (Automated)
[PASS] 2.7 Ensure that a unique Certificate Authority is used for etcd (Manual)
... # Continue

More information at: https://github.com/aquasecurity/kube-bench#download-and-install-binaries

Properly set up Ingress objects with security control

This topic is about securing an Ingress with TLS using a Secret.
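
If we don't already have a certificate, we can generate a self-signed one with openssl (the CN here matches the host used in the Ingress below):

> openssl req -x509 -newkey rsa:4096 -days 365 -nodes -keyout tls.key -out tls.crt -subj "/CN=secure-ingress.com"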

Creating the secret with certs example:

> kubectl create secret tls tls-secret --key tls.key --cert tls.crt

Creating the ingress with TLS

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: secure-ingress
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /
spec:
  ingressClassName: nginx 
  tls:
  - hosts:
      - secure-ingress.com
    secretName: tls-secret
  rules:
  - host: secure-ingress.com
    http:
      paths:
      - path: /service1
        pathType: Prefix
        backend:
          service:
            name: service1
            port:
              number: 80
      - path: /service2
        pathType: Prefix
        backend:
          service:
            name: service2
            port:
              number: 80

Protect node metadata and endpoints

Metadata can contain cloud credentials for VMs / Nodes, or kubelet credentials.

One way to protect node metadata on cloud providers is to use Network Policies to limit access to the metadata endpoint. For example, on AWS the metadata is served at http://169.254.169.254/latest/meta-data/; if we create a NetworkPolicy blocking the IP 169.254.169.254, we limit access to the metadata.
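
A minimal sketch of such a policy, selecting all pods in the namespace and allowing all egress except the metadata IP (adapt the podSelector to your needs):

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: deny-cloud-metadata
spec:
  podSelector: {}
  policyTypes:
  - Egress
  egress:
  - to:
    - ipBlock:
        cidr: 0.0.0.0/0
        except:
        - 169.254.169.254/32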

Minimize use of, and access to, GUI elements

The key points are:

  • Change NodePort/LoadBalancer services to ClusterIP (see the sketch below)
  • Use kubectl port-forward to access
  • Use credentials and users (do not leave access open to anonymous users)
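
A quick sketch of both points, assuming a dashboard Service named kubernetes-dashboard in the kubernetes-dashboard namespace:

> kubectl -n kubernetes-dashboard patch svc kubernetes-dashboard -p '{"spec": {"type": "ClusterIP"}}'
> kubectl -n kubernetes-dashboard port-forward svc/kubernetes-dashboard 8443:443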

Verify platform binaries before deploying

To do this, we need to go to the Kubernetes webpage and compare with the checksum hash.

For example go to the release webpage and check the Hash of the kubernetes.tar.gz

The webpage shows

ebfe49552bbda02807034488967b3b62bf9e3e507d56245e298c4c19090387136572c1fca789e772a5e8a19535531d01dcedb61980e42ca7b0461d3864df2c14

Now we can check the hash locally

> shasum -a 512 kubernetes.tar.gz
ebfe49552bbda02807034488967b3b62bf9e3e507d56245e298c4c19090387136572c1fca789e772a5e8a19535531d01dcedb61980e42ca7b0461d3864df2c14

Note: The SHA hash of a file changes when its contents are modified, so it should always be compared against the hash on the official pages to ensure the same file was downloaded.

Cluster Hardening

Restrict access to Kubernetes API

API requests come from:

  • A normal user
  • A service account
  • Anonymous requests

Every request must be authenticated, unless it is treated as an anonymous request.

We can connect to the API from:

  • POD
  • Outside
  • Node

Restrictions:

  • Don’t allow anonymous access
  • Close insecure port
  • Don’t expose ApiServer to the outside
  • Restrict access from Nodes to API (NodeRestriction)
  • Prevent unauthorized access
  • Prevent pod from accessing API

Anonymous Access

  • kube-apiserver --anonymous-auth=true|false
  • Anonymous access is enabled by default when an authorization mode other than AlwaysAllow is used

Disabling Anonymous Access: edit /etc/kubernetes/manifests/kube-apiserver.yaml and add the line

--anonymous-auth=false

Note: If we disable anonymous access, the kube-apiserver liveness probe may fail and the pod may keep restarting.

Manual API Request

We can copy the ca, crt and key values from the kubeconfig, or use

> k config view --raw

After that we can make a manual API request

> curl https://10.100.0.20:6445 --cacert ca --cert crt --key key
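
If the kubeconfig embeds the certificates as base64 data, one way to extract them to files (the jsonpath indexes are illustrative and depend on your kubeconfig) is:

> k config view --raw -o jsonpath='{.clusters[0].cluster.certificate-authority-data}' | base64 -d > ca
> k config view --raw -o jsonpath='{.users[0].user.client-certificate-data}' | base64 -d > crt
> k config view --raw -o jsonpath='{.users[0].user.client-key-data}' | base64 -d > key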

NodeRestriction AdmissionController

The NodeRestriction admission controller limits the Node labels a kubelet can modify, so a kubelet may only modify the labels of its own node. This is useful to ensure secure workload isolation via labels: no one can pretend to be a "secure" node and have secure pods scheduled onto it.

Enable it on the kube-apiserver (enabled by default on a kubeadm cluster)

--enable-admission-plugins=NodeRestriction

Verify the NodeRestriction by using the worker node kubelet kubeconfig to set labels.

First we need to point the KUBECONFIG variable at the kubelet config

> export KUBECONFIG=/etc/kubernetes/kubelet.conf

Testing

Worker > k get ns 
Error from server
Worker > k get node
NAME
cks-master
cks-worker

We are not allowed to get namespaces but we are allowed to get nodes for example.

Now, from the worker, we can try to set a label on the master node.

> k label node cks-master cks=yes
Error from server

Testing the same on the worker's own node

> k label node cks-worker cks=yes
node/cks-worker labeled

Note: Another restriction is that a kubelet cannot set labels with the node-restriction.kubernetes.io prefix, not even on its own node.

Kubelet Security

If we want to see how the kubelet is configured, we can check the kubelet process with ps -aux | grep kubelet and look at the configuration file with cat /var/lib/kubelet/config.yaml (the config file path is shown in the process arguments). The kubelet uses two ports:

  • 10250: Serves an API that allows full access
  • 10255: Serves an API that allows unauthenticated read-only access

For example, if anonymous access is enabled and we want to list the pods using the API, we can do it:

> curl -sk https://localhost:10250/pods
{"kind":"PodList","apiVersion":"v1","metadata":{},"items":[{"metadata":{"name":"kube-scheduler-controlplane","namespace":"kube-system","selfLink":"/api/v1/namespaces/kube-system/pods/kube-scheduler-controlplane",... ->

To prevent access to the kubelet, any request needs to be authenticated and then authorized. To do this we configure kubelet.service with the parameter --anonymous-auth=false. We can also configure this parameter in kubelet-config.yaml:

kind: KubeletConfiguration
authentication:
  anonymous:
    enabled: false

If we try to list the pods again we will see an Unauthorized message

> curl -sk https://localhost:10250/pods
Unauthorized

The best practice is to disable anonymous auth and enable an authentication mechanism. By default the kubelet allows all requests without authorization.

  • Certificates (x509): We need to provide the CA file in kubelet.service or in kubelet-config.yaml

    kind: KubeletConfiguration
    authentication:
      x509:
        clientCAFile: /path/ca.crt
    

    Now we can call the API using the client certificates.

    > curl -sk https://localhost:10250/pods/ --key kubelet-key.pem --cert kubelet-cert.pem
    
  • API Bearer Tokens

Kubelet Authorization:

The default authorization mode is AlwaysAllow, which allows all access to the API

kind: KubeletConfiguration
authorization:
  mode: AlwaysAllow

To prevent this we can change the authorization mode to Webhook. In this mode the kubelet calls the kube-apiserver to determine whether each request is authorized or not.

mode: AlwaysAllow -> mode: Webhook

Testing list pods changing the authorization mode:

> curl -sk https://localhost:10250/pods
Forbidden (user=system:anonymous, verb=get, resource=nodes, subresource=proxy)

Note: After any change on the configuration we need to restart kubelet

Disable readonly port:

Testing metrics on readOnlyPort

> curl -sk http://localhost:10255/metrics
# HELP apiserver_audit_event_total [ALPHA] Counter of audit events generated and sent to the audit backend.
# TYPE apiserver_audit_event_total counter
apiserver_audit_event_total 0
# HELP apiserver_audit_requests_rejected_total [ALPHA] Counter of apiserver requests rejected due to an error in audit logging backend.

We can disable the read-only port 10255 (the port where the metrics are exposed) by setting it to 0. When the config file is in use, this flag already defaults to zero.

kind: KubeletConfiguration
readOnlyPort: 0

Kubectl Proxy & Port Forward

One option to communicate with the kube-apiserver is using the kubectl proxy client.

> kubectl proxy

It launches a proxy service locally on port 8001 by default and uses the credentials and certificates from our kubeconfig file. Now we can access the kube-apiserver locally using curl without specifying certificates

> curl -k http://localhost:8001 

Another option to expose a Kubernetes service locally is using port forward

> kubectl port-forward service/nginx 20000:80

Summary:

  • kubectl proxy - Opens proxy port to API server

  • kubectl port-forward - Opens port to target deployment pods (svc, deploy, rc, pods)

Use Role Based Access Controls to minimize exposure

Role-based access control (RBAC) is a method of regulating access to computer or network resources based on the roles of individual users within your organization.

An RBAC Role or ClusterRole contains rules that represent a set of permissions. Permissions are purely additive (there are no “deny” rules).

Roles

A Role always sets permissions within a particular namespace; when you create a Role, you have to specify the namespace it belongs in.

Here’s an example Role in the “default” namespace that can be used to grant read access to pods:

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: default
  name: pod-reader
rules:
- apiGroups: [""] # "" indicates the core API group
  resources: ["pods"]
  verbs: ["get", "watch", "list"]

RoleBinding

A RoleBinding links a user to a role. A role binding grants the permissions defined in a role to a user or set of users. It holds a list of subjects (users, groups, or service accounts), and a reference to the role being granted. A RoleBinding grants permissions within a specific namespace whereas a ClusterRoleBinding grants that access cluster-wide.

Here is an example of a RoleBinding that grants the “pod-reader” Role to the user “jane” within the “default” namespace. This allows “jane” to read pods in the “default” namespace.

apiVersion: rbac.authorization.k8s.io/v1
# This role binding allows "jane" to read pods in the "default" namespace.
# You need to already have a Role named "pod-reader" in that namespace.
kind: RoleBinding
metadata:
  name: read-pods
  namespace: default
subjects:
# You can specify more than one "subject"
- kind: User
  name: jane # "name" is case sensitive
  apiGroup: rbac.authorization.k8s.io
roleRef:
  # "roleRef" specifies the binding to a Role / ClusterRole
  kind: Role #this must be Role or ClusterRole
  name: pod-reader # this must match the name of the Role or ClusterRole you wish to bind to
  apiGroup: rbac.authorization.k8s.io
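
The same Role and RoleBinding can also be created imperatively, which is usually faster during the exam:

> kubectl create role pod-reader --verb=get,watch,list --resource=pods
> kubectl create rolebinding read-pods --role=pod-reader --user=jane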

RBAC Commands

  • View roles:

    > kubectl get roles
    
  • View rolesbinding:

    > kubectl get rolebindings
    
  • Check access (for example, as a user you may want to verify what you are allowed to do, or impersonate another user):

    > kubectl auth can-i create deployments
    > kubectl auth can-i delete nodes
    > kubectl auth can-i create pods --as dev-user
    > kubectl auth can-i create pods --as dev-user --namespace test
    

CSR

Some examples using CSRs

  • Create a CertificateSigningRequest object with the name akshay with the contents of the akshay.csr file

    apiVersion: certificates.k8s.io/v1
    kind: CertificateSigningRequest
    metadata:
      name: akshay
    spec:
      groups:
      - system:authenticated
      request: $FROMTHEFILE   # base64-encoded contents of akshay.csr
      signerName: kubernetes.io/kube-apiserver-client
      usages:
      - client auth
    

    Note: Doc https://kubernetes.io/docs/reference/access-authn-authz/certificate-signing-requests/

  • Check the CSR

    > kubectl get csr
    
  • Approve the CSR Request

    > kubectl certificate approve akshay
    
  • Reject and delete CSR Request

    > kubectl certificate deny agent-smith
    > kubectl delete csr agent-smith
    

Exercise caution in using service accounts e.g. disable defaults, minimize permissions on newly created ones

Pod using custom SA

First we need to create the SA.

> k create sa sa-test

Create a pod with the custom SA.

spec:
  serviceAccountName: sa-test
  containers:
  - image: nginx
    name: nginx

From inside a Pod we can do:

> cat /run/secrets/kubernetes.io/serviceaccount/token
> curl https://kubernetes.default -k -H "Authorization: Bearer SA_TOKEN"

Use SA Token to connect to the API from inside a Pod

First we need to be inside the pod

> k exec -it POD -- bash

Now we can access the API using the SA token

> mount | grep sec
tmpfs on /run/secrets/kubernetes.io/serviceaccount/token
> cat /run/secrets/kubernetes.io/serviceaccount/token # SA_TOKEN
> curl https://kubernetes.default -k -H "Authorization: Bearer SA_TOKEN"

Note: By default the SA is mounted on the POD, and we can access the SA token.

Disabling SA Mounting

If the pod does not need to talk to the Kubernetes API, we can disable it (usually we don’t need it). In version 1.6+, you can opt out of automounting API credentials for a service account by setting automountServiceAccountToken: false on the service account:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: build-robot
automountServiceAccountToken: false
...

In version 1.6+, you can also opt out of automounting API credentials for a particular pod:

apiVersion: v1
kind: Pod
metadata:
  name: my-pod
spec:
  serviceAccountName: build-robot
  automountServiceAccountToken: false
  ...

Now the SA is not mounted on the pod.

Limit SA using RBAC

For example, suppose we want to allow the default SA of the prod namespace to delete secrets. For this job we will use the ClusterRole "edit".

> k create clusterrolebinding crb-test --clusterrole edit --serviceaccount prod:default

Now if we go inside the pod and use the SA token, we can delete secrets in the prod namespace.

Update Kubernetes frequently

Upgrading frequently is good because:

  • Support
  • Security fixes
  • Bug fixes
  • Stay up to date for dependencies

Release Cycles

The Kubernetes project maintains release branches for the most recent three minor releases (1.23, 1.22, 1.21). Kubernetes 1.19 and newer receive approximately 1 year of patch support. Kubernetes 1.18 and older received approximately 9 months of patch support.

Kubernetes versions are expressed as x.y.z, where x is the major version, y is the minor version, and z is the patch version, following Semantic Versioning terminology.

Upgrade Cluster

  1. First upgrade the control plane components
    1. apiserver, controller-manager, scheduler
  2. Then upgrade the worker components
    1. kubelet, kube-proxy

Note: Components should be at the same minor version as the apiserver, or one minor version below.

Upgrade node

  1. kubectl drain
  2. Do the upgrade
  3. kubectl uncordon

For more information: https://kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-upgrade/
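
A rough sketch of the kubeadm flow described above (versions, package names and node names are illustrative; always follow the version-specific upgrade docs):

# on the control plane node
> apt-get update && apt-get install -y kubeadm=1.23.x-00
> kubeadm upgrade plan
> kubeadm upgrade apply v1.23.x
# then, for each node
> kubectl drain cks-worker --ignore-daemonsets
> apt-get install -y kubelet=1.23.x-00 kubectl=1.23.x-00
> systemctl daemon-reload && systemctl restart kubelet
> kubectl uncordon cks-worker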

System Hardening

Minimize host OS footprint (reduce attack surface)

To reduce the attack surface we can use the least privilege principle on the following topics:

  • Limit Node Access

    • Limit control plane access to Internet using VPN
    • Limit access using CIDR ranges
    • Limit access to specified users (only admins)
    • Limit access using /etc/passwd /etc/shadow /etc/group
    • Delete unnecessary users/groups
    • SSH Hardening
      • Disable SSH for root Account PermitRootLogin no
      • Disable password login PasswordAuthentication no
  • RBAC Access

  • Remove obsolete packages & services

    For example, remove the apache server. List services

    > systemctl list-units --type service
    

    List all packages

    > apt list --installed
    

    Disable service

    > systemctl disable --now apache2
    

    Remove package

    > apt remove apache2
    
  • Restrict Network Access

    Use firewalls, such as UFW on Linux

  • Restrict Obsolete Kernel modules

    List current kernel modules

    > lsmod
    

    sctp and dccp are 2 unnecessary modules on K8S

    > cat /etc/modprobe.d/blacklist.conf
    blacklist sctp
    blacklist dccp
    > shutdown -r now
    > lsmod | egrep "sctp|dccp"
    
  • Identify and fix open ports. Check open ports

    > netstat -an | grep -w LISTEN
    

    Check Service

    > cat /etc/services | grep -w 53
    

Minimize IAM roles

Don't use the root user, and create users following the least privilege principle. Assign permissions to groups, not directly to users, and then assign each user to a group.

Minimize external access to the network

Using UFW on Linux (a simple frontend for iptables). Install:

> apt-get update
> apt-get install ufw
> systemctl enable ufw --now

Get status

> ufw status

Add default rule to allow all outbound connections

> ufw default allow outgoing

Add default deny incoming

> ufw default deny incoming

Add an allow rule for a specified IP to SSH

> ufw allow from 192.168.0.20 to any port 22 proto tcp

Note: any here means any destination IP on the server

Add an allow rule for web traffic from a specified CIDR

> ufw allow from 192.168.0.0/24 to any port 80 proto tcp

Activate the firewall and check

> ufw enable
> ufw status

Delete a specified rule

> ufw delete deny 8080

We could delete using the line number

> ufw status numbered # (check the line number)
> ufw delete 5

Stop the firewall and disable it on system startup

> ufw disable

Appropriately use kernel hardening tools such as AppArmor, seccomp

Restrict syscalls using seccomp

Seccomp stands for secure computing mode and has been a feature of the Linux kernel since version 2.6.12. It can be used to sandbox the privileges of a process, restricting the calls it is able to make from userspace into the kernel. Kubernetes lets you automatically apply seccomp profiles loaded onto a Node to your Pods and containers.

Seccomp has 3 modes:

  • mode 0: disabled
  • mode 1: strict
  • mode 2: filtered

It is critical to limit the syscalls available to applications. The default Docker seccomp profile blocks around 60 of the ~300 syscalls, for example:

  • reboot, mount, unmount
  • clock_adjtime, swapoff, stime,

For example, we cannot change the date inside a container because of the seccomp profile configured on it.

https://kubernetes.io/docs/tutorials/clusters/seccomp/

Seccomp in Kubernetes

We can check how contained a pod is (including whether seccomp is applied) using the amicontained image

> kubectl run amicontained --image=r.j3ss.co/amicontained -- amicontained
> kubectl logs amicontained

Kubernetes doesn't apply seccomp by default.

However, we can enable it using the securityContext field.

Examples:

Pod that uses the container runtime default

apiVersion: v1
kind: Pod
metadata:
  name: audit-pod
  labels:
    app: audit-pod
spec:
  securityContext:
    seccompProfile:
      type: RuntimeDefault
  containers:
  - name: test-container
    image: hashicorp/http-echo:0.2.3
    args:
    - "-text=just made some syscalls!"
    securityContext:
      allowPrivilegeEscalation: false

With custom profile

spec:
  securityContext:
    seccompProfile:
      type: Localhost
      localhostProfile: profiles/violation.json

violation.json (we need to create the violation.json previously)

{
    "defaultAction": "SCMP_ACT_ERRNO"
}

It is a good practice to disable allowPrivilegeEscalation

The logs for seccomp are at /var/log/syslog

Seccomp profile location by default is set to /var/lib/kubelet/seccomp.

AppArmor

AppArmor is a Linux kernel security module that supplements the standard Linux user and group based permissions to confine programs to a limited set of resources. AppArmor can be configured for any application to reduce its potential attack surface and provide greater in-depth defense. It is configured through profiles tuned to allow the access needed by a specific program or container, such as Linux capabilities, network access, file permissions, etc. Each profile can be run in either enforcing mode, which blocks access to disallowed resources, or complain mode, which only reports violations.

AppArmor profiles:

AppArmor profiles are specified per-container. To specify the AppArmor profile to run a Pod container with, add an annotation to the Pod’s metadata:

container.apparmor.security.beta.kubernetes.io/<container_name>: <profile_ref>

Where <container_name> is the name of the container to apply the profile to, and <profile_ref> specifies the profile to apply. The profile_ref can be one of:

  • runtime/default to apply the runtime’s default profile
  • localhost/<profile_name> to apply the profile loaded on the host with the name <profile_name>
  • unconfined to indicate that no profiles will be loaded

Check if AppArmor module is loaded

> aa-status
56 profiles are loaded.
19 profiles are in enforce mode.

Example: Load profile usr.sbin.nginx located in the default AppArmor profiles directory.

apparmor_parser -q /etc/apparmor.d/usr.sbin.nginx

Since we don’t know where the Pod will be scheduled, we’ll need to load the profile on all our nodes. For this example we’ll use SSH to install the profiles, but other approaches are discussed in Setting up nodes with profiles.

NODES=(
    # The SSH-accessible domain names of your nodes
    gke-test-default-pool-239f5d02-gyn2.us-central1-a.my-k8s
    gke-test-default-pool-239f5d02-x1kf.us-central1-a.my-k8s
    gke-test-default-pool-239f5d02-xwux.us-central1-a.my-k8s)
for NODE in ${NODES[*]}; do ssh $NODE 'sudo apparmor_parser -q <<EOF
#include <tunables/global>

profile k8s-apparmor-example-deny-write flags=(attach_disconnected) {
  #include <abstractions/base>

  file,

  # Deny all file writes.
  deny /** w,
}
EOF'
done

Example:

apiVersion: v1
kind: Pod
metadata:
  name: hello-apparmor
  annotations:
    # Tell Kubernetes to apply the AppArmor profile "k8s-apparmor-example-deny-write".
    # Note that this is ignored if the Kubernetes node is not running version 1.4 or greater.
    container.apparmor.security.beta.kubernetes.io/hello: localhost/k8s-apparmor-example-deny-write
spec:
  containers:
  - name: hello
    image: busybox
    command: [ "sh", "-c", "echo 'Hello AppArmor!' && sleep 1h" ]

Check

> kubectl get events | grep hello-apparmor
> kubectl exec hello-apparmor cat /proc/1/attr/current
k8s-apparmor-example-deny-write (enforce)
> kubectl exec hello-apparmor touch /tmp/test
touch: /tmp/test: Permission denied

https://kubernetes.io/docs/tutorials/clusters/apparmor/

Minimize Microservice Vulnerabilities

Setup appropriate OS level security domains e.g. using PSP, OPA, security contexts

Admission Controllers

An admission controller is a piece of code that intercepts requests to the Kubernetes API server prior to persistence of the object, but after the request is authenticated and authorized. The controllers consist of the list below, are compiled into the kube-apiserver binary, and may only be configured by the cluster administrator. In that list, there are two special controllers: MutatingAdmissionWebhook and ValidatingAdmissionWebhook. These execute the mutating and validating (respectively) admission control webhooks which are configured in the API.

Admission controllers may be “validating”, “mutating”, or both. Mutating controllers may modify the objects they admit; validating controllers may not.

Admission controllers limit requests to create, delete, modify or connect to (proxy). They do not support read requests.

The admission control process proceeds in two phases. In the first phase, mutating admission controllers are run. In the second phase, validating admission controllers are run. Note again that some of the controllers are both.

If any of the controllers in either phase reject the request, the entire request is rejected immediately and an error is returned to the end-user.

Finally, in addition to sometimes mutating the object in question, admission controllers may sometimes have side effects, that is, mutate related resources as part of request processing. Incrementing quota usage is the canonical example of why this is necessary. Any such side-effect needs a corresponding reclamation or reconciliation process, as a given admission controller does not know for sure that a given request will pass all of the other admission controllers.

Why do I need them?

Many advanced features in Kubernetes require an admission controller to be enabled in order to properly support the feature. As a result, a Kubernetes API server that is not properly configured with the right set of admission controllers is an incomplete server and will not support all the features you expect.

Request to create a pod

Kubectl -> Authentication -> Authorization -> Admission Controllers -> Create pod

View Enabled Admissions controllers

Some admission controllers are enabled by default; we can check them using the apiserver binary

> kube-apiserver -h | grep enable-admission-plugins

Or we can use

> kubectl exec kube-apiserver-controlplane -n kube-system -- kube-apiserver -h | grep enable-admission-plugins
--enable-admission-plugins strings       admission plugins that should be enabled in addition to default enabled ones (NamespaceLifecycle, LimitRanger, ServiceAccount, TaintNodesByCondition, Priority, DefaultTolerationSeconds, DefaultStorageClass, StorageObjectInUseProtection, PersistentVolumeClaimResize, RuntimeClass, CertificateApproval, CertificateSigning, CertificateSubjectRestriction, DefaultIngressClass, MutatingAdmissionWebhook, ValidatingAdmissionWebhook, ResourceQuota). Comma-delimited list of admission plugins: AlwaysAdmit, AlwaysDeny, AlwaysPullImages, CertificateApproval, CertificateSigning, CertificateSubjectRestriction, DefaultIngressClass, DefaultStorageClass, DefaultTolerationSeconds, DenyEscalatingExec, DenyExecOnPrivileged, EventRateLimit, ExtendedResourceToleration, ImagePolicyWebhook, LimitPodHardAntiAffinityTopology, LimitRanger, MutatingAdmissionWebhook, NamespaceAutoProvision, NamespaceExists, NamespaceLifecycle, NodeRestriction, OwnerReferencesPermissionEnforcement, PersistentVolumeClaimResize, PersistentVolumeLabel, PodNodeSelector, PodSecurityPolicy, PodTolerationRestriction, Priority, ResourceQuota, RuntimeClass, SecurityContextDeny, ServiceAccount, StorageObjectInUseProtection, TaintNodesByCondition, ValidatingAdmissionWebhook.

Add new admission controller

We need to add to the kube-apiserver.yaml

> grep NamespaceAuto /etc/kubernetes/manifests/kube-apiserver.yaml 
    - --enable-admission-plugins=NodeRestriction,NamespaceAutoProvision

Disable an admission controller

> grep admission /etc/kubernetes/manifests/kube-apiserver.yaml 
    - --enable-admission-plugins=NodeRestriction,NamespaceAutoProvision
    - --disable-admission-plugins=DefaultStorageClass

Note: the NamespaceExists and NamespaceAutoProvision admission controllers are deprecated and now replaced by NamespaceLifecycle admission controller. The NamespaceLifecycle admission controller will make sure that requests to a non-existent namespace is rejected and that the default namespaces such as default, kube-system and kube-public cannot be deleted.

Validating and Mutating Admission Controllers

Mutating: can change the request (the mutating admission controllers run first)
Validating: can only validate and reject requests to create, delete or modify

Custom admission controllers can be created using:

  • MutatingAdmissionWebhook
  • ValidatingAdmissionWebhook

NOTE: This is a complicated topic, especially writing custom admission controllers. However, it is very useful.

OPA

More often than not organizations need to apply various kinds of policies on the environments where they run their applications. These policies might be required to meet compliance requirements, achieve a higher degree of security, achieve standardization across multiple environments, etc. This calls for an automated/declarative way to define and enforce these policies. Policy engines like OPA help us achieve the same.

Opa in Kubernetes

We can integrate OPA with admission controllers, for example by creating a custom admission controller (MutatingWebhook / ValidatingWebhook) that validates requests against an OPA server.

Steps:

  1. Install Opa server on K8S (or externally)
  2. Creating the validating (or mutating) webhook config

The kube-mgmt agent is deployed alongside OPA (as a sidecar container) and is responsible for:

  • Replicate Kubernetes resources to OPA
  • Load policies into OPA via Kubernetes

To enable kube-mgmt to automatically identify policies defined in Kubernetes and load them into OPA, we need to create ConfigMaps with the label openpolicyagent.org/policy set to rego.

Configmap info

package kubernetes.admission

deny[msg] {
  input.request.kind.kind == "Pod"
  image := input.request.object.spec.containers[_].image
  not startswith(image, "hooli.com/")
  msg := sprintf("image '%v' comes from untrusted registry", [image])
}

Creating CM

> kubectl create configmap untrusted-registry --from-file=untrusted-registry.rego

Deny All Policy

First list the crd

> k get crd
> k get constrainttemplates

Create the ConstraintTemplate

apiVersion: templates.gatekeeper.sh/v1beta1
kind: ConstraintTemplate
metadata:
  name: k8salwaysdeny
spec:
  crd:
    spec:
      names:
        kind: K8sAlwaysDeny
      validation:
        # Schema for the `parameters` field
        openAPIV3Schema:
          properties:
            message:
              type: string
  targets:
    - target: admission.k8s.gatekeeper.sh
      rego: |
        package k8salwaysdeny
        violation[{"msg": msg}] {
          1 > 0
          msg := input.parameters.message
        }

Now we can create resources of type K8sAlwaysDeny

Create the Constraint

apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sAlwaysDeny
metadata:
  name: pod-always-deny
spec:
  match:
    kinds:
      - apiGroups: [""]
        kinds: ["Pod"]
  parameters:
    message: "ACCESS DENIED!"

Apply both files, and verify

> k get K8sAlwaysDeny
> k describe K8sAlwaysDeny

Pod Security Policy

PodSecurityPolicy is an admission controller: if a request matches a restriction we have configured, the request is rejected, for example limiting the securityContext to deny privileged mode and runAsUser 0. Enabling the PodSecurityPolicy admission controller works like the other admission controllers.

> grep admission /etc/kubernetes/manifests/kube-apiserver.yaml 
    - --enable-admission-plugins=PodSecurityPolicy

Now we can create a pod security policy configuration, for example to deny the privileged mode.

apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: example-psp
spec:
  privileged: false
  seLinux:
    rule: RunAsAny
  runAsUser:
    rule: RunAsAny
  supplementalGroups:
    rule: RunAsAny
  fsGroup:
    rule: RunAsAny

Note: The other fields are mandatory.

Check the psp

> kubectl get psp
NAME          PRIV    CAPS   SELINUX    RUNASUSER   FSGROUP    SUPGROUP   READONLYROOTFS   VOLUMES
example-psp   false          RunAsAny   RunAsAny    RunAsAny   RunAsAny   false            configMap,secret,emptyDir,hostPath

Now we need to authorize pods to use the PodSecurityPolicy. To do that, we create a new Role that allows the use verb on this policy and bind it to the default SA in that namespace with a RoleBinding, for example:
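
A minimal sketch using imperative commands (the role and binding names are illustrative):

> kubectl create role psp-role --verb=use --resource=podsecuritypolicies --resource-name=example-psp
> kubectl create rolebinding psp-rolebinding --role=psp-role --serviceaccount=default:default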

Now that the PodSecurityPolicy admission controller is enabled, the policy is created and the roles are created, if we try to create a pod with privileged: true it will fail.

Error from server (Forbidden): error when creating "/root/pod.yaml": pods "example-app" is forbidden: PodSecurityPolicy: unable to admit pod: [spec.containers[0].securityContext.privileged: Invalid value: true: Privileged containers are not allowed spec.containers[0].securityContext.capabilities.add: Invalid value: "CAP_SYS_BOOT": capability may not be added]

Manage Kubernetes secrets

Basic Secrets info

Kubernetes Secrets let you store and manage sensitive information, such as passwords, OAuth tokens, and ssh keys. Storing confidential information in a Secret is safer and more flexible than putting it verbatim in a Pod definition or in a container image

Creating Secrets

  • Imperative:
> kubectl create secret generic \ 
	app-secret --from-literal=DB_Host=mysql \
			   --from-literal=DB_User=root

or we can use a file:

> kubectl create secret generic \
	app-secret --from-file=app_secret.properties
  • Declarative:
> kubectl create -f secret-data.yaml

secret-data.yaml:

apiVersion: v1
kind: Secret
metadata:
  name: app-secret
data:
  DB_Host: bXlzcWw=
  DB_User: cm9vdA==
  DB_Password: cGFzd3Jk

To encode the data, for example:

echo -n 'mysql' | base64

To decode the data:

echo -n 'bXlzcWw=' | base64 --decode

View Secrets

> kubectl get secrets
> kubectl describe secrets

To view the values:

> kubectl get secret app-secret -o yaml

Secret in pods

apiVersion: v1
kind: Pod
metadata:
  name: simple-webapp-color
  labels:
    name: simple-webapp-color
spec:
  containers:
  - name: simple-webapp-color
    image: simple-webapp-color
    ports:
    - containerPort: 8080
    envFrom:
    - secretRef:
        name: app-secret

If we mount the Secret in the pod as a volume, we can see each key as a file inside the container
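
A minimal sketch of that volume definition, reusing the app-secret from above and the mount path shown in the listing below:

spec:
  containers:
  - name: simple-webapp-color
    image: simple-webapp-color
    volumeMounts:
    - name: app-secret-volume
      mountPath: /opt/app-secret-volumes
      readOnly: true
  volumes:
  - name: app-secret-volume
    secret:
      secretName: app-secret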

> ls /opt/app-secret-volumes
DB_Host		DB_Password		DB_User
> cat /opt/app-secret-volumes/DB_Password
paswrd

A note about secrets

Remember that secrets encode data in base64 format. Anyone with the base64 encoded secret can easily decode it. As such the secrets can be considered as not very safe.

Hack Secrets in Container

COMPLETE

Hack Secrets in ETCD

To access a secret in etcd, first we need to find the certificate paths

> cat /etc/kubernetes/manifests/kube-apiserver.yaml | grep etcd

Now we can query the etcd health endpoint to verify the connection

> ETCDCTL_API=3 etcdctl --cert /etc/kubernetes/pki/apiserver-etcd-client.crt --key /etc/kubernetes/pki/apiserver-etcd-client.key --cacert /etc/kubernetes/pki/etcd/ca.crt endpoint health
# --endpoints "https://127.0.0.1:2379" not necessary because we’re on same node

Finally we could get the secret from the ETCD

> ETCDCTL_API=3 etcdctl --cert /etc/kubernetes/pki/apiserver-etcd-client.crt --key /etc/kubernetes/pki/apiserver-etcd-client.key --cacert /etc/kubernetes/pki/etcd/ca.crt get /registry/secrets/default/secret1

Note: default is the namespace and secret1 is the secret name. The secret is saved in plain text.

Encrypt ETCD

COMPLETE
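
A minimal sketch, based on the Kubernetes documentation on encrypting Secret data at rest (paths and the key are illustrative): create an EncryptionConfiguration, point the apiserver at it, and rewrite existing secrets so they are stored encrypted.

apiVersion: apiserver.config.k8s.io/v1
kind: EncryptionConfiguration
resources:
  - resources:
      - secrets
    providers:
      - aescbc:
          keys:
            - name: key1
              secret: <base64-encoded 32-byte key>
      - identity: {}

Add the flag to /etc/kubernetes/manifests/kube-apiserver.yaml (and mount the file into the apiserver pod):

--encryption-provider-config=/etc/kubernetes/enc/enc.yaml

Then re-create the secrets so they are encrypted in etcd:

> kubectl get secrets -A -o json | kubectl replace -f -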

Use container runtime sandboxes in multi-tenant environments (e.g. gvisor, kata containers)

Gvisor

gVisor is an application kernel, written in Go, that implements a substantial portion of the Linux system call interface. It provides an additional layer of isolation between running applications and the host operating system.

gVisor provides a virtualized environment in order to sandbox containers. The system interfaces normally implemented by the host kernel are moved into a distinct, per-sandbox application kernel in order to minimize the risk of a container escape exploit.

In Kubernetes, gVisor allows us to isolate Pods.

Katacontainers

Kata Containers is an open source community working to build a secure container runtime with lightweight virtual machines that feel and perform like containers, but provide stronger workload isolation using hardware virtualization technology as a second layer of defense.

Runtimes in Kubernetes

We can specify a runtime in Kubernetes, using Kata Containers or gVisor

  • gVisor = Handler -> runsc
  • Kata = Handler -> kata

Example:

apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
    name: secure-runtime
handler: runsc

After that, we can specify on the pod which runtimeClass to use.

apiVersion: v1
kind: Pod
metadata:
    name: nginx-1
    labels:
        name: nginx
spec:
   runtimeClassName: secure-runtime
   containers:
     - name: nginx
       image: nginx
       ports:
        - containerPort: 8080

Check runtimeclasses

> kubectl describe runtimeclasses gvisor  | grep Handler
Handler:      runsc

Note: Docker makes use of runc to start containers.

Implement pod to pod encryption by use of mTLS

What is mTLS?

Mutual TLS, or mTLS for short, is a method for mutual authentication. mTLS ensures that the parties at each end of a network connection are who they claim to be by verifying that they both have the correct private key.

mTLS is often used in a Zero Trust security framework* to verify users, devices, and servers within an organization. It can also help keep APIs secure.

Zero Trust means that no user, device, or network traffic is trusted by default, an approach that helps eliminate many security vulnerabilities.

Pod to Pod using mTLS

With mutual authentication, both pods prove to each other that they are indeed the real pods

Istio and Linkerd (service mesh solutions) facilitate mTLS encryption between pods. For example, if we use Istio (the most popular service mesh), Istio deploys a sidecar container in every pod, and this container encrypts/decrypts the data: App1 -> Sidecar App1 -> Sidecar App2 -> App2

More information about how istio works: https://istio.io/latest/blog/2019/data-plane-setup/

Note: Istio/Linkerd is not part of the CKS and nothing is specified in the certification about mTLS

Supply Chain Security

Minimize base image footprint

In the Dockerfile we can configure some parameters to make the base image smaller:

  • Use a lighter base image (slim, alpine, minimal)
  • Generate fewer layers (chain commands in a single RUN with &&)
  • Clean the image (remove shells, tools)

Another option is using multi-stage builds.

For example, we can build the application in one stage and run it from a lighter image in a second stage.

# build container stage 1
FROM ubuntu
ARG DEBIAN_FRONTEND=noninteractive
RUN apt-get update && apt-get install -y golang-go
COPY app.go .
RUN CGO_ENABLED=0 go build app.go

# app container stage 2
FROM alpine
COPY --from=0 /app .
CMD ["./app"]

Secure and harden images

  • Use specified package versions

    FROM alpine:3.11.6
    RUN apt-get update && apt-get install -y golang-go=2:1.13
    
  • Don’t run as root

    RUN addgroup -S appgroup && adduser -S appuser -G appgroup -h /home/appuser
    COPY --from=0 /app /home/appuser/
    USER appuser
    
  • Make filesystem read only

    RUN chmod a-w /etc
    

    Note: Do the same with all directories we don’t need write permission.

  • Remove shell access

    RUN rm -rf /bin/*
    

More information: https://docs.docker.com/develop/develop-images/dockerfile_best-practices/

Secure your supply chain: whitelist allowed registries, sign and validate images

Use a Docker private registry

It is a good practice to use Docker images from a private registry instead of public images from Docker Hub.

To use an image from a private registry in Kubernetes, we need to follow these steps:

  • Create a secret object (type docker-registry) with the credentials

    > kubectl create secret docker-registry private-reg-cred --docker-username=dock_user --docker-password=dock_password --docker-server=myprivateregistry.com:5000 --docker-email=dock_user@myprivateregistry.com
    
  • Create the pod/deploy using the secret (imagePullSecrets)

    spec:
      containers:
        - name: foo
          image: janedoe/awesomeapp:v1
      imagePullSecrets:
        - name: private-reg-cred
    

Whitelist images

Whitelist image repositories using the ImagePolicyWebhook Admission Controller

We can restrict the cluster to only use images from whitelisted repositories using an ImagePolicyWebhook admission controller.

It can also be used to forbid the latest tag and to ensure that all images have tags.

  1. Deploy an Image Policy Webhook server

    apiVersion: apps/v1
    kind: Deployment
    ...
        spec:
          containers:
            - name: image-bouncer-webhook
              imagePullPolicy: Always
              image: "kainlite/kube-image-bouncer:latest"
              args:
                - "--cert=/etc/admission-controller/tls/tls.crt"
                - "--key=/etc/admission-controller/tls/tls.key"
                - "--debug"
                - "--registry-whitelist=docker.io,k8s.gcr.io"
              volumeMounts:
                - name: tls
                  mountPath: /etc/admission-controller/tls
          volumes:
            - name: tls
              secret:
                secretName: tls-image-bouncer-webhook
    
  2. Create the AdmissionConfiguration

    apiVersion: apiserver.config.k8s.io/v1
    kind: AdmissionConfiguration
    plugins:
    - name: ImagePolicyWebhook
      configuration:
        imagePolicy:
          kubeConfigFile: /etc/kubernetes/pki/admission_kube_config.yaml
          allowTTL: 50
          denyTTL: 50
          retryBackoff: 500
          defaultAllow: false
    

    With the configuration from admission_kube_config.yaml

    apiVersion: v1
    kind: Config
    clusters:
    - cluster:
        certificate-authority: /etc/kubernetes/pki/server.crt
        server: https://image-bouncer-webhook:30080/image_policy
      name: bouncer_webhook
    contexts:
    - context:
        cluster: bouncer_webhook
        user: api-server
      name: bouncer_validator
    current-context: bouncer_validator
    preferences: {}
    users:
    - name: api-server
      user:
        client-certificate: /etc/kubernetes/pki/apiserver.crt
        client-key:  /etc/kubernetes/pki/apiserver.key
    

    Note: The 30080 port is from the image-bouncer-webhook service exposed as Nodeport

  3. Enable the ImagePolicyWebhook so that our image policy validation can take place in API server

    > grep admission /etc/kubernetes/manifests/kube-apiserver.yaml 
        - --enable-admission-plugins=NodeRestriction,ImagePolicyWebhook
        - --admission-control-config-file=/etc/kubernetes/pki/admission_configuration.yaml
    

    Note: API server will automatically restart and pickup this configuration.

  4. Now we can try to create a pod using the latest image

    > kubectl run nginx --image nginx:latest
    Error from server (Forbidden): pods "nginx" is forbidden: image policy webhook backend denied one or more images: Images using latest tag are not allowed
    

Another example using OPA:

Whitelist only images from docker.io and k8s.grc.io

Creating the template.yaml

apiVersion: templates.gatekeeper.sh/v1beta1
kind: ConstraintTemplate
metadata:
  name: k8strustedimages
spec:
  crd:
    spec:
      names:
        kind: K8sTrustedImages
  targets:
    - target: admission.k8s.gatekeeper.sh
      rego: |
        package k8strustedimages
        violation[{"msg": msg}] {
          image := input.review.object.spec.containers[_].image
          not startswith(image, "docker.io/")
          not startswith(image, "k8s.gcr.io/")
          msg := "not trusted image!"
        }

Creating constraint.yaml

apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sTrustedImages
metadata:
  name: pod-trusted-images
spec:
  match:
    kinds:
      - apiGroups: [""]
        kinds: ["Pod"]

Apply

> k create -f template.yaml
> k create -f constraint.yaml

Verify

> k get constrainttemplate
> k get k8strustedimages
> k describe k8strustedimages pod-trusted-images

Use static analysis of user workloads (e.g.Kubernetes resources, Docker files)

We can analyze Kubernetes YAML before creating resources with kubectl, using tools such as Kubesec that perform static analysis of user workloads.

Kubesec quantifies risk for Kubernetes resources by validating the configuration files and manifest files used for Kubernetes deployments and operations. Kubesec shows a score and the reason.

We can deploy kubesec locally, as a server, or using the SaaS version.

  1. Installing Kubesec using binary files

    > wget https://github.com/controlplaneio/kubesec/releases/download/v2.11.0/kubesec_linux_amd64.tar.gz
    > tar -xvf  kubesec_linux_amd64.tar.gz
    > mv kubesec /usr/bin/
    
  2. Analyze a file using kubesec and generate a report file

    > kubesec scan node.yaml  > kubesec_report.json
    
  3. View report

    [ # Only show some lines
      {
        "object": "Pod/node.default",
        "valid": true,
        "fileName": "/root/node.yaml",
        "message": "Failed with a score of -27 points",
        "score": -27,
        "scoring": {
          "critical": [
            {
              "id": "Privileged",
              "selector": "containers[] .securityContext .privileged == true",
              "reason": "Privileged containers can allow almost completely unrestricted host access",
              "points": -30
            }
          ],
          "passed": [
            {
              "id": "ServiceAccountName",
              "selector": ".spec .serviceAccountName",
              "reason": "Service accounts restrict Kubernetes API access and should be configured with least privilege",
              "points": 3
            }
          ],
          "advise": [
            {
              "id": "ApparmorAny",
              "selector": ".metadata .annotations .\"container.apparmor.security.beta.kubernetes.io/nginx\"",
              "reason": "Well defined AppArmor policies may provide greater protection from unknown threats. WARNING: NOT PRODUCTION READY",
              "points": 3
            },
            {
              "id": "ReadOnlyRootFilesystem",
              "selector": "containers[] .securityContext .readOnlyRootFilesystem == true",
              "reason": "An immutable root filesystem can prevent malicious binaries being added to PATH and increase attack cost",
              "points": 1
            },
            {
              "id": "RunAsNonRoot",
              "selector": "containers[] .securityContext .runAsNonRoot == true",
              "reason": "Force the running image to run as a non-root user to ensure least privilege",
              "points": 1
            },
            {
              "id": "RunAsUser",
              "selector": "containers[] .securityContext .runAsUser -gt 10000",
              "reason": "Run as a high-UID user to avoid conflicts with the host's user table",
              "points": 1
            }
          ]
        }
      }
    ]
    

More information: https://kubesec.io/

Scan images for known vulnerabilities

To scan images for known vulnerabilities we can use Trivy. Trivy looks up CVEs, and we can filter by severity scores from 0 to 10.

CVSS V3.0 Ratings

  • Low 0-4
  • Medium 4-7
  • High 7-9
  • Critical 9-10

Trivy

  1. Install Trivy

    # Add the trivy-repo
    > apt-get  update
    > apt-get install wget apt-transport-https gnupg lsb-release
    > wget -qO - https://aquasecurity.github.io/trivy-repo/deb/public.key | sudo apt-key add -
    > echo deb https://aquasecurity.github.io/trivy-repo/deb $(lsb_release -sc) main | sudo tee -a /etc/apt/sources.list.d/trivy.list
    # Update Repo and Install trivy
    > apt-get update
    > apt-get install trivy
    
  2. Analyze Docker Image

    > docker pull nginx:1.19.0
    > trivy image nginx:1.19.0
    Nginx (alpine 3.12.3)
    ======================================
    Total: 41 (UNKNOWN: 0, LOW: 2, MEDIUM: 8, HIGH: 28, CRITICAL: 3)
    
  3. Analyze with a specified rating

    > trivy image --severity CRITICAL,HIGH nginx:1.19.0
    

Best Practices

  • Continuously rescan images
  • K8S Admission Controllers to scan images
  • Have your own repository with pre-scanned images ready to go
  • Integrate scanning into your CICD

Monitoring, Logging and Runtime Security

Perform behavioral analytics of syscall process and file activities at the host and container level to detect malicious activities

Perform behavioral analytics of syscall process

Falco acts as a security camera detecting unexpected behavior, intrusions, and data theft in real time.

For example, Falco alerts if somebody reads a password file or clears a log file.

Install Falco
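
A sketch of installing Falco from its apt repository (the key and repo URLs follow the Falco docs at the time and may change, so check the current documentation):

> curl -s https://falco.org/repo/falcosecurity-3672BA8F.asc | apt-key add -
> echo "deb https://download.falco.org/packages/deb stable main" | tee -a /etc/apt/sources.list.d/falcosecurity.list
> apt-get update && apt-get -y install linux-headers-$(uname -r) falco
> systemctl enable falco --now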

Detect Threats using Falco

Check the Falco service logs (config file on /etc/falco/falco.yaml)

> journalctl -fu falco

Note: Rules are read in the order of files in the list. If the same rule is present in all the files, the one in the last file overrides the others.

Falco hot reload

> kill -1 $(cat /var/run/falco.pid)

Detect threats within physical infrastructure, apps, networks, data, users and workloads

Detect all phases of attack regardless where it occurs and how it spreads

Perform deep analytical investigation and identification of bad actors within environment

Ensure immutability of containers at runtime

To ensure immutability of containers, one way is to use a security context with readOnlyRootFilesystem: true (this may break the application); another way is to add volumes for the paths the application needs to write to, for example cache and run directories.

In summary, the key to ensuring a pod is immutable is to use readOnlyRootFilesystem: true and use volumes for temporary data where needed. A good practice is to combine this with privileged: false and runAsUser: 100

Example

spec:
  containers:
  - image: httpd
    name: apache2
    securityContext:
      readOnlyRootFilesystem: true
    volumeMounts:
    - mountPath: /usr/local/apache2/logs
      name: log-volume
  volumes:
  - name: log-volume
    emptyDir: {}

Use Audit Logs to monitor access

With Audit Policies, the cluster audits the activities generated by users, by applications that use the Kubernetes API, and by the control plane itself. We can select what data to store.

The first matching rule sets the audit level of the event. The defined audit levels are:

  • None - don’t log events that match this rule.
  • Metadata - log request metadata (requesting user, timestamp, resource, verb, etc.) but not request or response body.
  • Request - log event metadata and request body but not response body. This does not apply for non-resource requests.
  • RequestResponse - log event metadata, request and response bodies. This does not apply for non-resource requests.

Note: The RequestResponse level provides the most verbose logs that includes the Metadata, request body and the response body.

You can pass a file with the policy to kube-apiserver using the --audit-policy-file flag. If the flag is omitted, no events are logged. Note that the rules field must be provided in the audit policy file. A policy with no (0) rules is treated as illegal.
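
On a kubeadm cluster this is usually wired into the kube-apiserver static pod manifest, roughly like this (paths and retention values are illustrative, and the policy file and log directory also have to be mounted into the apiserver pod):

    - --audit-policy-file=/etc/kubernetes/audit-policy.yaml
    - --audit-log-path=/var/log/kubernetes/audit/audit.log
    - --audit-log-maxage=30
    - --audit-log-maxbackup=10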

Sample Policies:

Logging all metadata events:

apiVersion: audit.k8s.io/v1
kind: Policy
rules:
  - level: Metadata

Another big example:

apiVersion: audit.k8s.io/v1 # This is required.
kind: Policy
# Don't generate audit events for all requests in RequestReceived stage.
omitStages:
  - "RequestReceived"
rules:
  # Log pod changes at RequestResponse level
  - level: RequestResponse
    resources:
    - group: ""
      # Resource "pods" doesn't match requests to any subresource of pods,
      # which is consistent with the RBAC policy.
      resources: ["pods"]
  # Log "pods/log", "pods/status" at Metadata level
  - level: Metadata
    resources:
    - group: ""
      resources: ["pods/log", "pods/status"]

  # Don't log requests to a configmap called "controller-leader"
  - level: None
    resources:
    - group: ""
      resources: ["configmaps"]
      resourceNames: ["controller-leader"]

  # Don't log watch requests by the "system:kube-proxy" on endpoints or services
  - level: None
    users: ["system:kube-proxy"]
    verbs: ["watch"]
    resources:
    - group: "" # core API group
      resources: ["endpoints", "services"]

  # Don't log authenticated requests to certain non-resource URL paths.
  - level: None
    userGroups: ["system:authenticated"]
    nonResourceURLs:
    - "/api*" # Wildcard matching.
    - "/version"

  # Log the request body of configmap changes in kube-system.
  - level: Request
    resources:
    - group: "" # core API group
      resources: ["configmaps"]
    # This rule only applies to resources in the "kube-system" namespace.
    # The empty string "" can be used to select non-namespaced resources.
    namespaces: ["kube-system"]

  # Log configmap and secret changes in all other namespaces at the Metadata level.
  - level: Metadata
    resources:
    - group: "" # core API group
      resources: ["secrets", "configmaps"]

  # Log all other resources in core and extensions at the Request level.
  - level: Request
    resources:
    - group: "" # core API group
    - group: "extensions" # Version of group should NOT be included.

  # A catch-all rule to log all other requests at the Metadata level.
  - level: Metadata
    # Long-running requests like watches that fall under this rule will not
    # generate an audit event in RequestReceived.
    omitStages:
      - "RequestReceived"

Note: The rules are checked in order; the first matching rule sets the audit level of the event.

Changes on our Audit Policy

Steps to make changes:

  1. Change policy file
  2. Disable audit logging in apiserver, wait till restart
  3. Enable audit logging in apiserver, wait till restart
  4. Test

Steps 2 and 3 can be replaced by moving the kube-apiserver manifest out of the manifests directory and then copying it back, which forces the kubelet to recreate the static pod (a rough sketch follows below).
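A rough sketch of that manifest trick (paths follow kubeadm defaults):

> mv /etc/kubernetes/manifests/kube-apiserver.yaml /tmp/
> watch crictl ps        # wait until the kube-apiserver container is gone
> mv /tmp/kube-apiserver.yaml /etc/kubernetes/manifests/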

Exercise:

Restrict logged data with an Audit Policy

  • Nothing from stage RequestReceived
  • Nothing from “get”, “watch”, “list”
  • From Secrets only Metadata level
  • Everything else at RequestResponse level

Answer:

apiVersion: audit.k8s.io/v1
kind: Policy
omitStages:
  - "RequestReceived"
rules:

  - level: None
    verbs: ["get","watch","list"]
    
  - level: Metadata
    resources:
    - group: "" 
      resources: ["secrets"]
  
  - level: RequestResponse

Exercise: Enable auditing in this kubernetes cluster. Create a new policy file that will only log events based on the below specifications:

  • Namespace: prod
  • Operations: delete
  • Resources: secrets
  • Log Path: /var/log/prod-secrets.log
  • Audit file location: /etc/kubernetes/prod-audit.yaml
  • Maximum days to keep the logs: 30

Answer: Create /etc/kubernetes/prod-audit.yaml as below:

apiVersion: audit.k8s.io/v1
kind: Policy
rules:
- level: Metadata
  namespaces: ["prod"]
  verbs: ["delete"]
  resources:
  - group: ""
    resources: ["secrets"]

Next, enable audit logging in the kube-apiserver manifest:

 - --audit-policy-file=/etc/kubernetes/prod-audit.yaml
 - --audit-log-path=/var/log/prod-secrets.log
 - --audit-log-maxage=30

Then, add volumes and volume mounts to the kube-apiserver manifest as shown in the snippets below.

Volumes:

  - name: audit
    hostPath:
      path: /etc/kubernetes/prod-audit.yaml
      type: File

  - name: audit-log
    hostPath:
      path: /var/log/prod-secrets.log
      type: FileOrCreate

Volume Mounts:

  - mountPath: /etc/kubernetes/prod-audit.yaml
    name: audit
    readOnly: true

  - mountPath: /var/log/prod-secrets.log
    name: audit-log
    readOnly: false

Then save the file and make sure that kube-apiserver restarts.


Cheatsheet

  • AppArmor is a Linux kernel security module that supplements the standard Linux user and group based permissions to confine programs to a limited set of resources

    • Load the AppArmor profile frontend and check that it is loaded

      apparmor_parser -q /etc/apparmor.d/frontend
      aa-status | grep frontend 
      
    • Add to the pod

      metadata:
        annotations:
          container.apparmor.security.beta.kubernetes.io/frontend-site: localhost/frontend

      Note: Remember to reference the profile name (as shown by aa-status), not the file name. A complete pod example follows below.
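      A complete (hypothetical) pod manifest using the annotation, assuming a profile named frontend is loaded on the node and the container is named frontend-site:

      apiVersion: v1
      kind: Pod
      metadata:
        name: frontend-site
        annotations:
          container.apparmor.security.beta.kubernetes.io/frontend-site: localhost/frontend
      spec:
        containers:
        - name: frontend-site
          image: nginx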

  • Decode a secret

    echo YmpCUGJqTkRRRzVJUUdOclRUTT0K | base64 -d
    
  • Trivy check vulnerabilities only CRITICAL

    trivy image --severity CRITICAL kodekloud/webapp-delayed-start
    
  • Seccomp allows us to restrict syscalls.

    Create the profile at /var/lib/kubelet/seccomp/profiles/audit.json

    {
        "defaultAction": "SCMP_ACT_ERRNO",
        "architectures": [
            "SCMP_ARCH_X86_64",
            "SCMP_ARCH_X86",
            "SCMP_ARCH_X32"
        ],
        "syscalls": [
            {
                "names": [
                    "accept4",
                    "read"
                    "writev"
                ],
                "action": "SCMP_ACT_ALLOW"
            }
        ]
    }
    

    Use on a pod

    spec:
      securityContext:
        seccompProfile:
          type: Localhost
          localhostProfile: profiles/audit.json
      containers:
      - image: nginx
        name: nginx
    
  • Falco acts as a security camera detecting unexpected behavior, intrusions, and data theft in real time. For example, Falco alerts if somebody tries to read a password file or clear a log file.

    • Check falco logs

      journalctl -u falco
      
    • Enable file_output in /etc/falco/falco.yaml

      file_output:
        enabled: true
        keep_alive: false
        filename: /opt/security_incidents/alerts.log
      
    • Add a custom rule under /etc/falco/falco_rules.local.yaml (copy original and edit some parameters)

    • Restart Falco

      kill -1 $(cat /var/run/falco.pid)
      # or systemctl restart falco
      
  • Runtime Class

    • Create a pod with gvisor

      spec:
        runtimeClassName: gvisor
        containers:
        - image: nginx
          name: nginx
      
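      The RuntimeClass object referenced by runtimeClassName would look roughly like this (assuming the node's container runtime is configured with a runsc handler for gVisor):

      apiVersion: node.k8s.io/v1
      kind: RuntimeClass
      metadata:
        name: gvisor
      handler: runsc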
  • Make sure that the service account token is not mounted into the pod (a verification sketch follows below)

     spec:
       containers:
       - image: nginx
         name: apps-cluster-dash
       serviceAccountName: cluster-view
       automountServiceAccountToken: false
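     To verify (the pod name is a placeholder), the standard token path should be absent inside the container:

     # expected: "No such file or directory" when the token is not mounted
     kubectl exec <pod-name> -- ls /var/run/secrets/kubernetes.io/serviceaccount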
    
  • Kubesec allows us to analyze a YAML manifest; it returns a score along with the reasons and suggested improvements

    kubesec scan node.yaml  > kubesec_report.json
    
  • Immutability fails with:

    • privileged: true
    • readOnlyRootFilesystem: false (or unset)
  • PodSecurityPolicy is a K8s resource that lets us restrict privileged containers, volume types, and more. First, enable the PodSecurityPolicy admission plugin in the kube-apiserver manifest /etc/kubernetes/manifests/kube-apiserver.yaml

        - --enable-admission-plugins=NodeRestriction,PodSecurityPolicy
    

    Create a PodSecurityPolicy resource, disabling privileged containers and limiting the allowed volume types

    apiVersion: policy/v1beta1
    kind: PodSecurityPolicy
    metadata:
      name: pod-psp
    spec:
      privileged: false
      seLinux:
        rule: RunAsAny
      runAsUser:
        rule: RunAsAny
      supplementalGroups:
        rule: RunAsAny
      fsGroup:
        rule: RunAsAny
      volumes:
      - configMap
      - secret
      - emptyDir
      - hostPath
    

    Now, when creating a pod, its spec must comply with the PodSecurityPolicy, for example privileged: false. Remember that the pod's service account also needs RBAC permission to use the policy (see the sketch below).
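    A minimal sketch of the RBAC needed so that service accounts are allowed to use the policy (the role and binding names are illustrative):

    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRole
    metadata:
      name: psp-pod-psp-use
    rules:
    - apiGroups: ["policy"]
      resources: ["podsecuritypolicies"]
      resourceNames: ["pod-psp"]
      verbs: ["use"]
    ---
    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRoleBinding
    metadata:
      name: psp-pod-psp-use-all
    roleRef:
      apiGroup: rbac.authorization.k8s.io
      kind: ClusterRole
      name: psp-pod-psp-use
    subjects:
    - kind: Group
      apiGroup: rbac.authorization.k8s.io
      name: system:serviceaccounts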

  • Check a user's permissions granted by roles/rolebindings (more variants in the sketch below)

    kubectl auth can-i update pods --as=john --namespace=development
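    Two related checks that are often handy (the user and service account names are placeholders):

    kubectl auth can-i --list --as=john
    kubectl auth can-i delete secrets --as=system:serviceaccount:default:webapp-sa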
    
  • Limit the use of images with the latest tag and ensure that all images have explicit tags.

    1. Deploy an Image Policy Webhook server

      apiVersion: apps/v1
      kind: Deployment
      ...
          spec:
            containers:
              - name: image-bouncer-webhook
                imagePullPolicy: Always
                image: "kainlite/kube-image-bouncer:latest"
                args:
                  - "--cert=/etc/admission-controller/tls/tls.crt"
                  - "--key=/etc/admission-controller/tls/tls.key"
                  - "--debug"
                  - "--registry-whitelist=docker.io,k8s.gcr.io"
                volumeMounts:
                  - name: tls
                    mountPath: /etc/admission-controller/tls
            volumes:
              - name: tls
                secret:
                  secretName: tls-image-bouncer-webhook
      
    2. Create the AdmissionConfiguration

      apiVersion: apiserver.config.k8s.io/v1
      kind: AdmissionConfiguration
      plugins:
      - name: ImagePolicyWebhook
        configuration:
          imagePolicy:
            kubeConfigFile: /etc/kubernetes/pki/admission_kube_config.yaml
            allowTTL: 50
            denyTTL: 50
            retryBackoff: 500
            defaultAllow: false
      

      With the configuration from admission_kube_config.yaml

      apiVersion: v1
      kind: Config
      clusters:
      - cluster:
          certificate-authority: /etc/kubernetes/pki/server.crt
          server: https://image-bouncer-webhook:30080/image_policy
        name: bouncer_webhook
      contexts:
      - context:
          cluster: bouncer_webhook
          user: api-server
        name: bouncer_validator
      current-context: bouncer_validator
      preferences: {}
      users:
      - name: api-server
        user:
          client-certificate: /etc/kubernetes/pki/apiserver.crt
          client-key:  /etc/kubernetes/pki/apiserver.key
      

      Note: The 30080 port is from the image-bouncer-webhook service exposed as Nodeport

    3. Enable the ImagePolicyWebhook so that our image policy validation can take place in the API server

      grep admission /etc/kubernetes/manifests/kube-apiserver.yaml 
          - --enable-admission-plugins=NodeRestriction,ImagePolicyWebhook
          - --admission-control-config-file=/etc/kubernetes/pki/admission_configuration.yaml
      

      Note: API server will automatically restart and pickup this configuration.

    4. Now we can try to create a pod using the latest image

      kubectl run nginx --image nginx:latest
      The request should now be rejected by the ImagePolicyWebhook, because nginx:latest uses the latest tag.
      
