Nahuel Hernandez

Another personal blog about IT, Automation, Cloud, DevOps and Stuff.

Building a Production-Ready Kubernetes Cluster on AWS EKS

A comprehensive guide to setting up a production-grade Kubernetes cluster on Amazon EKS. Learn how to integrate essential components like Karpenter for efficient node provisioning, Istio for service mesh, ArgoCD for GitOps deployments, AWS Load Balancer Controller for external access, and EFS CSI Driver for persistent storage.

In this guide, we’ll walk through setting up a production-ready Kubernetes cluster on Amazon EKS (Elastic Kubernetes Service). We’ll look at each component and why it is essential for a robust, scalable, and maintainable infrastructure.

Introduction

Creating a production-ready Kubernetes cluster requires more than just spinning up a basic EKS cluster. It needs careful consideration of scalability, security, monitoring, and operational efficiency. In this guide, we’ll set up a cluster with essential components that make it production-ready.

Core Components

1. Amazon EKS

Amazon Elastic Kubernetes Service (EKS) provides a managed Kubernetes control plane, handling the complexity of running Kubernetes on AWS. It offers:

  • High availability across multiple Availability Zones
  • Integration with AWS services
  • Managed control plane updates
  • Security and compliance features

2. Karpenter

Karpenter is an open-source node provisioning project built for Kubernetes. It provides:

  • Faster node provisioning compared to Cluster Autoscaler
  • Better resource utilization
  • Support for multiple instance types
  • Cost optimization through efficient scaling

3. AWS Load Balancer Controller

The AWS Load Balancer Controller manages AWS Elastic Load Balancers for a Kubernetes cluster. It:

  • Provisions Application Load Balancers (ALB) and Network Load Balancers (NLB)
  • Handles SSL/TLS termination
  • Provides advanced routing capabilities
  • Integrates with AWS WAF and Shield

4. ArgoCD

ArgoCD is a declarative GitOps continuous delivery tool for Kubernetes. It offers:

  • Automated deployment of applications
  • Git-based configuration management
  • Rollback capabilities
  • Multi-cluster support
  • UI for application management

5. Istio

Istio is a service mesh that provides:

  • Traffic management
  • Security features
  • Observability
  • Load balancing
  • Service-to-service communication

6. EFS CSI Driver

The Amazon EFS CSI Driver enables Kubernetes to use Amazon EFS as persistent storage. It provides:

  • Persistent volume support
  • Shared file system access
  • Dynamic provisioning
  • Multi-AZ support

Infrastructure Setup

Let’s examine the key scripts that set up our infrastructure:

1. Variables Configuration

# 0-setting-variables.sh
export KARPENTER_NAMESPACE="kube-system"
export KARPENTER_VERSION="1.1.2"
export K8S_VERSION="1.32"
export AWS_PARTITION="aws"
export CLUSTER_NAME="nhernandez-poc"
export AWS_DEFAULT_REGION="us-west-2"
export AWS_ACCOUNT_ID="$(aws sts get-caller-identity --query Account --output text)"
export TEMPOUT="$(mktemp)"
export ARM_AMI_ID="$(aws ssm get-parameter --name /aws/service/eks/optimized-ami/${K8S_VERSION}/amazon-linux-2-arm64/recommended/image_id --query Parameter.Value --output text)"
export AMD_AMI_ID="$(aws ssm get-parameter --name /aws/service/eks/optimized-ami/${K8S_VERSION}/amazon-linux-2/recommended/image_id --query Parameter.Value --output text)"
export GPU_AMI_ID="$(aws ssm get-parameter --name /aws/service/eks/optimized-ami/${K8S_VERSION}/amazon-linux-2-gpu/recommended/image_id --query Parameter.Value --output text)"

This script sets up essential variables for our cluster configuration, including:

  • Kubernetes version
  • Component versions
  • AWS region and partition
  • Cluster name
  • AMI IDs for different architectures (ARM, AMD, GPU)
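
One caveat with this pattern: export VAR="$(command)" succeeds even when the command inside fails, so expired AWS credentials leave AWS_ACCOUNT_ID or the AMI IDs empty without any error. A minimal sketch of a guard follows. The helper name require_vars is an assumption for illustration and is not part of the original scripts.

```shell
# require_vars: hypothetical helper, not part of the original scripts.
# Prints every listed variable that is unset or empty and returns non-zero
# if at least one is missing.
require_vars() {
  local name value missing=0
  for name in "$@"; do
    # Look up the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing required variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

Calling require_vars CLUSTER_NAME AWS_ACCOUNT_ID ARM_AMI_ID AMD_AMI_ID || exit 1 at the top of each later script would stop the setup before any AWS resources are created.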

2. EKS Cluster Creation

# 1-create-eks-cluster.sh
curl -fsSL https://raw.githubusercontent.com/aws/karpenter-provider-aws/v"${KARPENTER_VERSION}"/website/content/en/preview/getting-started/getting-started-with-karpenter/cloudformation.yaml > "${TEMPOUT}" \
&& aws cloudformation deploy \
  --stack-name "Karpenter-${CLUSTER_NAME}" \
  --template-file "${TEMPOUT}" \
  --capabilities CAPABILITY_NAMED_IAM \
  --parameter-overrides "ClusterName=${CLUSTER_NAME}"

eksctl create cluster -f - <<EOF
---
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: ${CLUSTER_NAME}
  region: ${AWS_DEFAULT_REGION}
  version: "${K8S_VERSION}"
  tags:
    karpenter.sh/discovery: ${CLUSTER_NAME}
# The remainder of the ClusterConfig (IAM settings, managed node groups) is not shown here.
EOF

This script:

  1. Deploys the Karpenter CloudFormation template, which creates the IAM resources and the interruption queue that Karpenter needs
  2. Creates the EKS cluster with proper IAM roles and configurations
  3. Sets up managed node groups
  4. Configures security settings

3. Component Installation

Each component is installed using Helm charts with specific configurations:

Karpenter Installation

# 2-install-karpenter.sh
helm upgrade --install karpenter oci://public.ecr.aws/karpenter/karpenter \
  --version "${KARPENTER_VERSION}" \
  --namespace "${KARPENTER_NAMESPACE}" \
  --create-namespace \
  --set "settings.clusterName=${CLUSTER_NAME}" \
  --set "settings.interruptionQueue=${CLUSTER_NAME}" \
  --set controller.resources.requests.cpu=1 \
  --set controller.resources.requests.memory=1Gi \
  --set controller.resources.limits.cpu=1 \
  --set controller.resources.limits.memory=1Gi \
  --wait
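
Installing the chart only deploys the controller; Karpenter launches nodes once a NodePool and an EC2NodeClass exist. The sketch below prints both for the v1 API, reusing the AMI IDs from 0-setting-variables.sh. The function name render_nodepool, the resource name default, the instance categories, and the CPU limit are illustrative assumptions. The node role name assumes the KarpenterNodeRole-${CLUSTER_NAME} role created by the CloudFormation template.

```shell
# render_nodepool: hypothetical helper; manifest values are illustrative.
# Prints a Karpenter v1 NodePool and EC2NodeClass to stdout.
# Expects CLUSTER_NAME, ARM_AMI_ID and AMD_AMI_ID from 0-setting-variables.sh.
render_nodepool() {
  cat <<EOF
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: default
spec:
  template:
    spec:
      requirements:
        - key: kubernetes.io/arch
          operator: In
          values: ["amd64", "arm64"]
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["on-demand"]
        - key: karpenter.k8s.aws/instance-category
          operator: In
          values: ["c", "m", "r"]
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default
      expireAfter: 720h
  limits:
    cpu: 1000
  disruption:
    consolidationPolicy: WhenEmptyOrUnderutilized
    consolidateAfter: 1m
---
apiVersion: karpenter.k8s.aws/v1
kind: EC2NodeClass
metadata:
  name: default
spec:
  amiFamily: AL2
  role: "KarpenterNodeRole-${CLUSTER_NAME}"
  subnetSelectorTerms:
    - tags:
        karpenter.sh/discovery: "${CLUSTER_NAME}"
  securityGroupSelectorTerms:
    - tags:
        karpenter.sh/discovery: "${CLUSTER_NAME}"
  amiSelectorTerms:
    - id: "${ARM_AMI_ID}"
    - id: "${AMD_AMI_ID}"
EOF
}
```

Piping the output into kubectl apply -f - registers both resources. The selector terms rely on the karpenter.sh/discovery tag that the cluster config applies.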

Load Balancer Controller Installation

# 3-install-lb-controller.sh
# The eks/ chart prefix assumes the eks-charts Helm repository has been added.
helm repo add eks https://aws.github.io/eks-charts
helm repo update

# serviceAccount.create=false assumes the aws-load-balancer-controller service
# account (with its IAM role) already exists in kube-system.
helm upgrade --install aws-load-balancer-controller eks/aws-load-balancer-controller \
  -n kube-system \
  --set clusterName="${CLUSTER_NAME}" \
  --set serviceAccount.create=false \
  --set serviceAccount.name=aws-load-balancer-controller \
  --set region="${AWS_DEFAULT_REGION}" \
  --set vpcId="$(aws eks describe-cluster --name "${CLUSTER_NAME}" --query "cluster.resourcesVpcConfig.vpcId" --output text)" \
  --set image.repository="602401143452.dkr.ecr.${AWS_DEFAULT_REGION}.amazonaws.com/amazon/aws-load-balancer-controller" \
  --set image.tag=v2.11.0 \
  --set enableShield=false \
  --set enableWaf=false \
  --set enableWafv2=false \
  --version 1.11.0
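
Once the controller is running, it provisions an ALB for each Ingress that uses its ingress class. The sketch below prints such an Ingress. The function name render_alb_ingress and its arguments are hypothetical, and it assumes the chart created the alb IngressClass and that the backend Service already exists.

```shell
# render_alb_ingress: hypothetical helper, not part of the original scripts.
# $1 = Ingress name, $2 = backend Service name, $3 = Service port (default 80)
render_alb_ingress() {
  cat <<EOF
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: $1
  annotations:
    alb.ingress.kubernetes.io/scheme: internet-facing
    alb.ingress.kubernetes.io/target-type: ip
spec:
  ingressClassName: alb
  rules:
    - http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: $2
                port:
                  number: ${3:-80}
EOF
}
```

For example, render_alb_ingress web web-svc 8080 | kubectl apply -f - would request an internet-facing ALB that routes directly to pod IPs.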

ArgoCD Installation

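# 4-install-argocd.sh
# ARGOCD_NAMESPACE and HA_ENABLED are not set in 0-setting-variables.sh; the defaults below are assumptions.
ARGOCD_NAMESPACE="${ARGOCD_NAMESPACE:-argocd}"
HA_ENABLED="${HA_ENABLED:-false}"
helm repo add argo https://argoproj.github.io/argo-helm
helm repo update
helm upgrade --install argocd argo/argo-cd \
  -n "$ARGOCD_NAMESPACE" \
  --create-namespace \
  --set redis-ha.enabled="$HA_ENABLED"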
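
With ArgoCD installed, each workload is described by an Application resource that points at a Git repository. The sketch below prints one with automated sync enabled. The function name render_argocd_app and its arguments are hypothetical, and the namespace falls back to argocd when ARGOCD_NAMESPACE is unset, which is an assumption.

```shell
# render_argocd_app: hypothetical helper, not part of the original scripts.
# $1 = app name, $2 = Git repo URL, $3 = path inside the repo, $4 = target namespace
render_argocd_app() {
  cat <<EOF
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: $1
  namespace: ${ARGOCD_NAMESPACE:-argocd}
spec:
  project: default
  source:
    repoURL: $2
    targetRevision: HEAD
    path: $3
  destination:
    server: https://kubernetes.default.svc
    namespace: $4
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
    syncOptions:
      - CreateNamespace=true
EOF
}
```

The output can be piped into kubectl apply -f -. With selfHeal enabled, ArgoCD reverts manual changes in the cluster back to what Git declares, so Git stays the single source of truth.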

Istio Installation

# 5-install-istio.sh
helm repo add istio https://istio-release.storage.googleapis.com/charts
helm repo update

helm install istio-base istio/base -n istio-system --create-namespace
helm install istiod istio/istiod -n istio-system --wait
helm install istio-ingress istio/gateway -n istio-system --wait

Istio provides:

  • Traffic management between services
  • Security with mTLS
  • Observability with metrics, logs, and traces
  • Load balancing
  • Circuit breaking
  • Fault injection

EFS CSI Driver Installation

# 6-install-efs-driver.sh
helm repo add aws-efs-csi-driver https://kubernetes-sigs.github.io/aws-efs-csi-driver/
helm repo update

# Assumes the AmazonEKS_EFS_CSI_DriverRole IAM role already exists and trusts the cluster's OIDC provider.
helm upgrade --install aws-efs-csi-driver aws-efs-csi-driver/aws-efs-csi-driver \
  --namespace kube-system \
  --set controller.serviceAccount.create=true \
  --set controller.serviceAccount.annotations."eks\.amazonaws\.com/role-arn"="arn:${AWS_PARTITION}:iam::${AWS_ACCOUNT_ID}:role/AmazonEKS_EFS_CSI_DriverRole" \
  --set controller.serviceAccount.name=efs-csi-controller-sa

The EFS CSI Driver enables:

  • Dynamic provisioning of EFS volumes
  • Persistent storage for stateful applications
  • Shared file system access across pods
  • Multi-AZ support for high availability
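
Dynamic provisioning works through a StorageClass that creates an EFS access point for each claim. The sketch below prints a StorageClass and a ReadWriteMany claim. The function name render_efs_storage, the class name efs-sc, and the requested size are illustrative assumptions; the EFS file system itself must already exist and be reachable from the cluster subnets.

```shell
# render_efs_storage: hypothetical helper, not part of the original scripts.
# $1 = existing EFS file system ID, $2 = PersistentVolumeClaim name
render_efs_storage() {
  cat <<EOF
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: efs-sc
provisioner: efs.csi.aws.com
parameters:
  provisioningMode: efs-ap
  fileSystemId: $1
  directoryPerms: "700"
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: $2
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: efs-sc
  resources:
    requests:
      storage: 5Gi
EOF
}
```

The storage request is required by the Kubernetes API, but EFS is elastic and does not enforce it. Pods in different Availability Zones can mount the same claim at the same time.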

Component Integration

The components work together to provide a complete production environment:

  1. Infrastructure Layer

    • EKS provides the Kubernetes control plane
    • Karpenter handles node provisioning
    • AWS Load Balancer Controller manages external access
  2. Service Mesh Layer

    • Istio provides service-to-service communication
    • Traffic management and security
    • Observability and monitoring
  3. Storage Layer

    • EFS CSI Driver provides persistent storage
    • Shared file system access
    • Multi-AZ support
  4. Deployment Layer

    • ArgoCD manages application deployments
    • GitOps workflow
    • Rollbacks to any configuration committed in Git
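
Because each layer depends on the ones before it, the numbered scripts have to run in order. A minimal sketch of a runner follows, assuming all scripts live in one directory and use single-digit prefixes as shown above. The function name run_setup_scripts is hypothetical.

```shell
# run_setup_scripts: hypothetical runner, not part of the original scripts.
# Sources every "<digit>-*.sh" file in a directory in glob (lexical) order,
# so exports from 0-setting-variables.sh are visible to the later scripts.
# $1 = directory containing the scripts (defaults to the current directory)
run_setup_scripts() {
  local dir="${1:-.}" script
  for script in "$dir"/[0-9]-*.sh; do
    [ -f "$script" ] || continue   # the glob matched nothing
    echo "==> ${script##*/}"
    # Stop at the first script whose final command reports failure.
    . "$script" || { echo "failed: ${script##*/}" >&2; return 1; }
  done
}
```

Sourcing only reports the exit status of the last command in each script, so a failure in the middle of a script would go unnoticed unless the scripts themselves check for errors.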

Best Practices

  1. Security

    • Use IAM roles for service accounts
    • Implement network policies
    • Regular security updates
    • Secrets management
  2. Scalability

    • Implement proper resource limits
    • Use Karpenter for efficient scaling
    • Configure HPA for applications
  3. Monitoring

    • Set up logging
    • Implement metrics collection
    • Configure alerts
  4. Maintenance

    • Regular updates
    • Backup procedures
    • Disaster recovery plan
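
Two of these practices can be sketched as manifests: a default-deny network policy and a Horizontal Pod Autoscaler. The function names, replica counts, and CPU target are illustrative assumptions. Network policies are only enforced when the cluster's CNI supports them, and the HPA needs a metrics source such as metrics-server, which this guide does not install.

```shell
# Hypothetical helpers, not part of the original scripts.

# Prints a NetworkPolicy that blocks all ingress traffic to pods in a
# namespace until more specific policies allow it.
# $1 = namespace
render_default_deny() {
  cat <<EOF
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: $1
spec:
  podSelector: {}
  policyTypes:
    - Ingress
EOF
}

# Prints an HPA that scales a Deployment on average CPU utilization.
# $1 = Deployment name, $2 = namespace
render_hpa() {
  cat <<EOF
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: $1
  namespace: $2
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: $1
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
EOF
}
```

Both outputs can be piped into kubectl apply -f -. The HPA adds pods as load grows, and Karpenter then adds nodes when those pods no longer fit on the existing ones.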

Together, these components provide a complete solution for running production workloads on Kubernetes in AWS:

  • EKS for the control plane
  • Karpenter for efficient scaling
  • Istio for service mesh
  • EFS CSI Driver for storage
  • ArgoCD for deployments
  • AWS Load Balancer Controller for external access

About

Over 15 years of experience in the IT industry, working in SysOps, DevOps, and Architecture roles on mission-critical systems across a wide range of industries. Wide experience with AWS, Terraform, Kubernetes, containers, CI/CD pipelines, and Linux. Always keeping up with the latest technologies. Passionate about automating the run of the mill, with a strong focus on problem-solving.