If you've ever waited 5-10 minutes for Kubernetes Cluster Autoscaler to spin up new nodes while your pods sat in Pending state, you know the pain. Karpenter is AWS's open-source node provisioner that replaces Cluster Autoscaler with something dramatically faster and smarter. It provisions the right nodes in under 60 seconds, handles spot interruptions automatically, and can cut your compute costs by 40-60%.

What is Karpenter?

Karpenter is an open-source, high-performance Kubernetes node lifecycle manager. Unlike Cluster Autoscaler (which works with pre-defined node groups), Karpenter directly provisions compute capacity from the cloud provider based on the actual requirements of your pending pods.

Karpenter vs Cluster Autoscaler

                        Karpenter                 Cluster Autoscaler
Provisioning            < 60 seconds              5-10 minutes
Instance selection      Best-fit per pod          Pre-defined groups
Spot support            Native + consolidation    Limited
Node groups             Not needed                Required (ASGs)
Bin packing             Automatic                 Poor
Spot interruption       Auto-replacement          Manual handling

How Karpenter Works

Karpenter Provisioning Flow

1. A pod enters Pending: the Kubernetes scheduler cannot place it on any existing node.
2. Karpenter analyzes the pod's requirements (CPU, memory, GPU, topology) and selects the optimal instance type, then launches it through EC2.
3. The EC2 instance is ready in under 60 seconds and joins the cluster as a node.
4. The scheduler places the pod on the new node and it starts running.
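
You can watch this flow end to end by creating a pod whose requests exceed the free capacity of every current node. A minimal sketch; the pod name, image, and sizes are illustrative:

# demo-pod.yaml — a pod big enough that no existing node can hold it
apiVersion: v1
kind: Pod
metadata:
  name: karpenter-demo        # illustrative name
spec:
  containers:
    - name: app
      image: public.ecr.aws/docker/library/nginx:latest
      resources:
        requests:
          cpu: "4"            # larger than the free CPU on any current node
          memory: 8Gi

Within seconds of the pod going Pending, a NodeClaim should appear (kubectl get nodeclaims), with a node joining shortly after.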

Installation

Karpenter is installed via a Helm chart and runs as a controller inside your EKS cluster. Here's a production-ready setup:

# Prerequisites:
# - EKS cluster (1.25+)
# - IAM roles for service accounts (IRSA) configured
# - aws CLI, kubectl, helm installed

# Set your cluster variables
export CLUSTER_NAME="my-production-cluster"
export AWS_REGION="us-east-1"
export KARPENTER_VERSION="1.1.0"
export AWS_ACCOUNT_ID="$(aws sts get-caller-identity --query Account --output text)"

# Create the IAM roles for Karpenter
# (Karpenter needs permission to create/terminate EC2 instances)
aws cloudformation deploy \
  --stack-name "Karpenter-${CLUSTER_NAME}" \
  --template-file karpenter-cloudformation.yaml \
  --capabilities CAPABILITY_NAMED_IAM \
  --parameter-overrides "ClusterName=${CLUSTER_NAME}"

# Install Karpenter via Helm
helm upgrade --install karpenter oci://public.ecr.aws/karpenter/karpenter \
  --version "${KARPENTER_VERSION}" \
  --namespace kube-system \
  --set "settings.clusterName=${CLUSTER_NAME}" \
  --set "settings.interruptionQueue=${CLUSTER_NAME}" \
  --set controller.resources.requests.cpu=1 \
  --set controller.resources.requests.memory=1Gi \
  --set controller.resources.limits.cpu=1 \
  --set controller.resources.limits.memory=1Gi \
  --wait

# Verify Karpenter is running
kubectl get pods -n kube-system -l app.kubernetes.io/name=karpenter
# NAME                         READY   STATUS    RESTARTS   AGE
# karpenter-5f4b8c8d9f-xxxxx   1/1     Running   0          2m

Core Concepts

Karpenter Resource Hierarchy

- NodePool: defines WHAT to provision. Instance types, zones, capacity type (spot/on-demand), limits.
- EC2NodeClass: defines HOW to provision. AMI, subnets, security groups, user data, block devices.
- NodeClaim: auto-created by Karpenter. Represents a single provisioned node (like a Pod, but for nodes).
- EC2 Instance + Node: the actual cloud instance that joins the cluster and runs your pods.

NodePool: Define What to Provision

A NodePool tells Karpenter what kind of nodes it can create. Think of it as a set of constraints and preferences:

# nodepool.yaml — Production-ready NodePool
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: default
spec:
  # Template for nodes created by this pool
  template:
    metadata:
      labels:
        environment: production
        team: platform
    spec:
      # Which EC2NodeClass to use for AWS-specific config
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default

      # Instance type requirements
      requirements:
        # Architecture: amd64 or arm64 (Graviton)
        - key: kubernetes.io/arch
          operator: In
          values: ["amd64", "arm64"]

        # Capacity type: spot first, on-demand as fallback
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot", "on-demand"]

        # Instance categories: general purpose + compute optimized
        - key: karpenter.k8s.aws/instance-category
          operator: In
          values: ["c", "m", "r"]

        # Instance sizes: medium to 8xlarge
        - key: karpenter.k8s.aws/instance-size
          operator: In
          values: ["medium", "large", "xlarge", "2xlarge", "4xlarge", "8xlarge"]

        # Availability zones
        - key: topology.kubernetes.io/zone
          operator: In
          values: ["us-east-1a", "us-east-1b", "us-east-1c"]

      # Taints (optional — restrict what can run on these nodes)
      # taints:
      #   - key: workload-type
      #     value: compute-heavy
      #     effect: NoSchedule

  # Resource limits — cap total provisioned capacity
  limits:
    cpu: "1000"        # Max 1000 vCPUs across all nodes
    memory: "2000Gi"   # Max 2TB RAM

  # Disruption policy — how Karpenter consolidates/replaces nodes
  disruption:
    # Consolidation: merge underutilized nodes to save money
    consolidationPolicy: WhenEmptyOrUnderutilized
    consolidateAfter: 30s

    # Budget: max nodes that can be disrupted simultaneously
    budgets:
      - nodes: "10%"    # Disrupt at most 10% of nodes at once

  # Priority relative to other NodePools (higher = preferred)
  weight: 10

EC2NodeClass: Define How to Provision

# ec2nodeclass.yaml — AWS-specific configuration
apiVersion: karpenter.k8s.aws/v1
kind: EC2NodeClass
metadata:
  name: default
spec:
  # IAM role for the nodes
  role: "KarpenterNodeRole-my-production-cluster"

  # AMI selection — use the latest EKS-optimized AMI
  amiSelectorTerms:
    - alias: al2023@latest   # Amazon Linux 2023 (recommended)

  # Subnet discovery — find subnets by tag
  subnetSelectorTerms:
    - tags:
        karpenter.sh/discovery: "my-production-cluster"

  # Security group discovery
  securityGroupSelectorTerms:
    - tags:
        karpenter.sh/discovery: "my-production-cluster"

  # Block device mappings (root volume)
  blockDeviceMappings:
    - deviceName: /dev/xvda
      ebs:
        volumeSize: 100Gi
        volumeType: gp3
        iops: 3000
        throughput: 125
        encrypted: true
        deleteOnTermination: true

  # Tags applied to all EC2 instances
  tags:
    Environment: production
    ManagedBy: karpenter
    Team: platform

  # User data (optional — bootstrap scripts)
  # userData: |
  #   #!/bin/bash
  #   echo "Custom bootstrap logic here"

  # Metadata options (IMDSv2 required for security)
  metadataOptions:
    httpEndpoint: enabled
    httpProtocolIPv6: disabled
    httpPutResponseHopLimit: 2
    httpTokens: required  # Enforce IMDSv2

Spot Instance Optimization

Karpenter's spot handling is one of its killer features. It automatically diversifies across instance types and handles interruptions gracefully:

Karpenter Spot Instance Strategy

- Spot first: 60-90% savings over on-demand
- Diversify: spread across 15+ instance types
- Interruption: act on the 2-minute warning
- Replace: auto-provision a new node
- Fallback: on-demand if spot capacity is unavailable
# Spot-optimized NodePool — maximize savings
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: spot-compute
spec:
  template:
    spec:
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default
      requirements:
        # SPOT ONLY for this pool
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot"]

        # Wide instance diversity = fewer interruptions
        # (instance-category takes single-letter categories; to target
        # specific families like c6i, use karpenter.k8s.aws/instance-family)
        - key: karpenter.k8s.aws/instance-category
          operator: In
          values: ["c", "m", "r"]

        - key: karpenter.k8s.aws/instance-size
          operator: In
          values: ["large", "xlarge", "2xlarge", "4xlarge"]

        # Use Graviton (arm64) for 20% better price-performance
        - key: kubernetes.io/arch
          operator: In
          values: ["arm64"]

  # Disruption: enable consolidation for further savings
  disruption:
    consolidationPolicy: WhenEmptyOrUnderutilized
    consolidateAfter: 30s

---
# On-demand fallback pool (lower priority)
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: on-demand-fallback
spec:
  template:
    spec:
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["on-demand"]
        - key: karpenter.k8s.aws/instance-category
          operator: In
          values: ["m", "c"]
  weight: 1  # Lower priority than spot pool (weight: 10)

Consolidation: Automatic Cost Optimization

Karpenter continuously watches for underutilized nodes and consolidates workloads to fewer, better-fitting instances. This happens automatically — no cron jobs, no manual intervention.

How Consolidation Works: monitor utilization → detect wasted capacity → repack pods onto better-fitting nodes → terminate the old ones → save.
# Example: Consolidation in action
# Before consolidation:
#   Node 1 (m5.2xlarge — 8 vCPU, 32GB): using 1 vCPU, 4GB (12% CPU utilized)
#   Node 2 (m5.2xlarge — 8 vCPU, 32GB): using 2 vCPU, 8GB (25% CPU utilized)
#   Total cost: 2x m5.2xlarge = ~$0.384/hr * 2 = $0.768/hr

# After consolidation (Karpenter automatically):
#   1. Launches m5.xlarge (4 vCPU, 16GB) — fits both workloads
#   2. Cordons Node 1 and Node 2
#   3. Drains pods (respecting PDBs)
#   4. Terminates old nodes
#   Result: 1x m5.xlarge = $0.192/hr (75% savings!)

# Monitor consolidation events
kubectl get events --field-selector reason=DisruptionInitiated -n kube-system
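
Because the drain step respects Pod Disruption Budgets, a PDB is your guardrail against consolidation evicting too many replicas at once. A minimal sketch; the name and label selector are illustrative and should match your own Deployment:

# web-pdb.yaml — keep at least 2 replicas up during any voluntary disruption
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: web-pdb               # illustrative name
spec:
  minAvailable: 2             # never evict below 2 ready pods
  selector:
    matchLabels:
      app: web                # illustrative label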

GPU Workloads

Karpenter can provision GPU instances for ML/AI workloads just as easily:

# GPU NodePool for ML training
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: gpu-training
spec:
  template:
    metadata:
      labels:
        workload-type: gpu
    spec:
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: gpu-class
      requirements:
        - key: karpenter.k8s.aws/instance-category
          operator: In
          values: ["g", "p"]  # GPU instance families
        - key: karpenter.k8s.aws/instance-gpu-count
          operator: Gt
          values: ["0"]
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot", "on-demand"]
      taints:
        - key: nvidia.com/gpu
          value: "true"
          effect: NoSchedule
  limits:
    cpu: "200"
    memory: "800Gi"
    nvidia.com/gpu: "16"  # Max 16 GPUs across all nodes

---
# Pod requesting a GPU
apiVersion: v1
kind: Pod
metadata:
  name: ml-training-job
spec:
  tolerations:
    - key: nvidia.com/gpu
      operator: Exists
  containers:
    - name: trainer
      image: my-registry/ml-trainer:latest
      resources:
        requests:
          nvidia.com/gpu: 1
          memory: 16Gi
          cpu: 4
        limits:
          nvidia.com/gpu: 1
          memory: 32Gi
# Karpenter sees this pod Pending, provisions a g5.xlarge (1 GPU),
# and the pod starts in under 90 seconds

Multi-Architecture (ARM64 / Graviton)

AWS Graviton processors offer 20% better price-performance than x86. Karpenter makes it easy to use both:

# NodePool allowing both architectures
requirements:
  - key: kubernetes.io/arch
    operator: In
    values: ["amd64", "arm64"]  # Karpenter picks the cheapest

# Your pods need multi-arch images:
# docker buildx build --platform linux/amd64,linux/arm64 -t my-app:latest .

# Karpenter's decision process:
# 1. Pod requests 2 vCPU, 4GB memory
# 2. Karpenter evaluates: m6i.large (amd64) = $0.096/hr
#                          m6g.large (arm64) = $0.077/hr
# 3. Picks m6g.large (arm64) — 20% cheaper, same performance
# 4. Only if your image doesn't support arm64, falls back to amd64
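
Once an image is built for linux/arm64, you can also force a specific workload onto Graviton with a nodeSelector rather than leaving the choice to Karpenter. A minimal sketch; the name and image are illustrative:

# arm-api.yaml — pin a multi-arch workload to Graviton nodes
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-arm               # illustrative name
spec:
  replicas: 3
  selector:
    matchLabels:
      app: api-arm
  template:
    metadata:
      labels:
        app: api-arm
    spec:
      nodeSelector:
        kubernetes.io/arch: arm64          # Graviton only
      containers:
        - name: api
          image: my-registry/api:latest    # must include a linux/arm64 build
          resources:
            requests:
              cpu: "2"
              memory: 4Gi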

Monitoring & Observability

# Karpenter exposes Prometheus metrics out of the box

# Key metrics to monitor:
# karpenter_nodes_total              — Current node count by pool
# karpenter_nodeclaims_terminated    — Node terminations (consolidation, expiry)
# karpenter_pods_startup_duration    — Time from Pending to Running
# karpenter_provisioner_scheduling   — Scheduling decisions per second
# karpenter_interruption_received    — Spot interruption events

# Grafana dashboard (community):
# https://github.com/aws/karpenter/tree/main/charts/karpenter/dashboards

# Useful kubectl commands:
# See all NodeClaims (Karpenter-managed nodes)
kubectl get nodeclaims
# NAME              TYPE          ZONE         NODE              READY   AGE
# default-abc123    m6g.xlarge    us-east-1a   ip-10-0-1-45      True    2h
# spot-xyz789       c6g.2xlarge   us-east-1b   ip-10-0-2-78      True    45m

# See NodePool status (capacity used vs limits)
kubectl get nodepool
# NAME       NODECLASS   NODES   READY   AGE
# default    default     12      12      30d
# spot       default     8       8       30d

# Describe a NodeClaim for details
kubectl describe nodeclaim default-abc123
# Shows: instance type, zone, capacity type, allocatable resources, pods running

# Check for disruption events
kubectl get events -A --field-selector reason=DisruptionInitiated

# Logs
kubectl logs -n kube-system -l app.kubernetes.io/name=karpenter -f
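
If you run the Prometheus Operator, a ServiceMonitor can scrape these metrics. This is a sketch under assumptions: the Service label and the http-metrics port name follow the defaults of the Helm install above and may differ in your setup:

# karpenter-servicemonitor.yaml — scrape Karpenter's metrics endpoint
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: karpenter
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: karpenter    # assumed chart Service label
  endpoints:
    - port: http-metrics                   # assumed metrics port name
      interval: 30s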

Production Best Practices

Karpenter Production Checklist

- Set resource limits on NodePools: prevent runaway scaling by capping CPU and memory per pool.
- Use Pod Disruption Budgets (PDBs): protect availability during consolidation so at least one replica is always running.
- Diversify instance types widely: allowing 15+ instance types for spot reduces interruption probability by 90%.
- Set pod resource requests accurately: Karpenter bin-packs on requests (not limits), so wrong requests mean wasted capacity.
- Enable the SQS interruption queue: graceful handling of spot interruptions, maintenance events, and rebalance recommendations.
- Use multiple NodePools: separate pools for general workloads, GPU, spot-only, and on-demand critical workloads, with different rules for each.
- Monitor consolidation aggressiveness: start with consolidateAfter: 60s and tune based on workload stability; see the sketch after this list.
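
Beyond consolidateAfter, disruption budgets can block voluntary disruption entirely during sensitive hours. A sketch of the NodePool disruption stanza; the cron window is illustrative:

# NodePool spec fragment — pause consolidation during business hours
disruption:
  consolidationPolicy: WhenEmptyOrUnderutilized
  consolidateAfter: 60s
  budgets:
    - nodes: "10%"                 # normally disrupt at most 10% at once
    - nodes: "0"                   # block all voluntary disruption...
      schedule: "0 9 * * mon-fri"  # ...starting 09:00 on weekdays
      duration: 8h                 # ...for 8 hours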

Migrating from Cluster Autoscaler

# Migration strategy — run both side-by-side, then decommission CA

# Step 1: Install Karpenter alongside Cluster Autoscaler
# (They can coexist — Karpenter handles new provisioning,
# CA manages existing node groups)

# Step 2: Create NodePool + EC2NodeClass
kubectl apply -f nodepool.yaml
kubectl apply -f ec2nodeclass.yaml

# Step 3: Taint existing CA-managed node groups
# This prevents new pods from scheduling on CA nodes
kubectl taint nodes -l eks.amazonaws.com/nodegroup=old-ng \
  legacy=cluster-autoscaler:PreferNoSchedule

# Step 4: Gradually drain CA node groups
# Karpenter will provision replacement capacity automatically
kubectl cordon <node-name>
kubectl drain <node-name> --ignore-daemonsets --delete-emptydir-data

# Step 5: Once all workloads are on Karpenter nodes:
# - Scale CA node groups to 0
# - Uninstall Cluster Autoscaler
# - Delete old ASGs/node groups

# Step 6: Celebrate your 40-60% cost reduction 🎉

Cost Comparison

Typical Monthly Cost Savings (100-node cluster)

[Chart comparing monthly compute cost across four setups: on-demand with no autoscaling, Cluster Autoscaler, Karpenter (spot + consolidation), and Karpenter + Graviton.]

Karpenter is the most impactful cost optimization tool in the Kubernetes ecosystem. It's faster than Cluster Autoscaler, smarter about instance selection, and aggressively consolidates underutilized capacity. If you're running EKS in production, migrating to Karpenter is one of the highest-ROI infrastructure changes you can make.