Optimizing EKS costs requires a structured approach that balances quick wins with sustainable practices. This 30/60/90 day plan walks you through measurement, right-sizing, autoscaling, and Spot adoption: the four levers that, combined, can reduce EKS spend by 50-80% in real-world implementations.
Key Takeaways
- Weeks 1-2 focus on cost visibility: deploy Kubecost and enable AWS Cost & Usage Reports
- Weeks 3-6 tackle right-sizing using VPA recommendations and actual resource usage data
- Weeks 7-10 implement autoscaling with HPA and Cluster Autoscaler or Karpenter
- Weeks 11-14 introduce Spot instances with safety guardrails for non-critical workloads
- Monthly reviews and cleanup automation sustain savings long-term
Days 1-14: Establish Cost Visibility
You can’t optimize what you can’t measure. The first two weeks focus entirely on instrumentation—no optimization yet.
Deploy Kubecost or OpenCost
Kubecost provides per-pod, per-namespace cost attribution by mapping Kubernetes resource requests to AWS pricing data.
```shell
helm repo add kubecost https://kubecost.github.io/cost-analyzer/
helm upgrade --install kubecost kubecost/cost-analyzer \
  --namespace kubecost --create-namespace \
  --set kubecostToken="your-token-here"
```

For OpenCost (the open-source alternative):
```shell
kubectl apply -f https://raw.githubusercontent.com/opencost/opencost/develop/kubernetes/opencost.yaml
```

Access the Kubecost UI:
```shell
kubectl port-forward -n kubecost svc/kubecost-cost-analyzer 9090:9090
```

Navigate to http://localhost:9090 to see cost breakdowns by namespace, deployment, and pod.
Enable AWS Cost & Usage Reports
Cost & Usage Reports (CUR) are the authoritative record of your AWS usage and costs. Kubecost integrates with CUR for accurate cost allocation.
- Go to AWS Billing Console → Cost & Usage Reports
- Create a new report with hourly granularity
- Enable resource IDs and split cost allocation
- Configure S3 bucket for report delivery
- Update Kubecost configuration to point to your CUR S3 bucket
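With AWS CLI v2, the same report can also be created non-interactively via `aws cur put-report-definition --cli-input-yaml file://cur-report.yaml`. A hedged sketch of the input file is below; the bucket name, prefix, and region are placeholders, and the bucket needs a policy allowing the billing service to write to it:

```yaml
# cur-report.yaml - input for `aws cur put-report-definition --cli-input-yaml`
# S3Bucket, S3Prefix, and S3Region are placeholders for your own values.
ReportDefinition:
  ReportName: eks-cost-report
  TimeUnit: HOURLY                    # hourly granularity, as in step 2
  Format: Parquet
  Compression: Parquet
  AdditionalSchemaElements:
    - RESOURCES                       # include resource IDs (step 3)
    - SPLIT_COST_ALLOCATION_DATA      # split cost allocation (step 3)
  S3Bucket: my-cur-bucket
  S3Prefix: cur/
  S3Region: us-east-1
  RefreshClosedReports: true
  ReportVersioning: OVERWRITE_REPORT
```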
Tag Everything
Tags enable cost allocation by team, environment, and application. Apply tags to:
- EC2 instances (via node group tags)
- EBS volumes (via StorageClass parameters)
- Load balancers (via Service annotations)
Example node group tags in eksctl:
```yaml
nodeGroups:
  - name: production-nodes
    tags:
      Environment: production
      Team: platform
      CostCenter: engineering
```

Activate cost allocation tags in AWS Billing preferences so they appear in Cost Explorer.
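EBS volumes and load balancers can be tagged the same way. A hedged sketch, assuming the EBS CSI driver (which accepts `tagSpecification_N` parameters) and a Service fronted by an AWS-provisioned load balancer; tag keys and values are examples:

```yaml
# StorageClass: tags applied to every volume the EBS CSI driver provisions
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gp3-tagged
provisioner: ebs.csi.aws.com
parameters:
  type: gp3
  tagSpecification_1: "Team=platform"
  tagSpecification_2: "Environment=production"
---
# Service: propagate tags to the provisioned load balancer
apiVersion: v1
kind: Service
metadata:
  name: my-app
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-additional-resource-tags: "Team=platform,Environment=production"
spec:
  type: LoadBalancer
  selector:
    app: my-app
  ports:
    - port: 80
```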
Baseline Current Costs
Document your starting point:
```shell
aws ce get-cost-and-usage \
  --time-period Start=2025-01-01,End=2025-01-31 \
  --granularity DAILY \
  --metrics UnblendedCost \
  --group-by Type=DIMENSION,Key=SERVICE
```

Record EC2, EBS, data transfer, and EKS control plane costs. This becomes your benchmark for measuring progress.
Days 15-45: Right-Size Resources
Right-sizing eliminates the gap between reserved resources (requests) and actual usage—the single biggest cost waste in most clusters.
Install Metrics Server
Metrics Server provides the foundation for autoscaling and right-sizing decisions:
```shell
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
```

Verify it’s working:
```shell
kubectl top nodes
kubectl top pods --all-namespaces
```

Analyze Resource Slack
Compare pod requests to actual usage:
```shell
kubectl get pods --all-namespaces -o custom-columns=\
'NAMESPACE:.metadata.namespace,NAME:.metadata.name,CPU_REQ:.spec.containers[*].resources.requests.cpu,MEM_REQ:.spec.containers[*].resources.requests.memory'
```

In Kubecost, navigate to the “Savings” tab to see rightsizing recommendations. Look for pods with <20% CPU utilization or excessive memory requests.
Deploy Vertical Pod Autoscaler (Recommendation Mode)
VPA analyzes historical usage and recommends optimal requests and limits:
```shell
git clone https://github.com/kubernetes/autoscaler.git
cd autoscaler/vertical-pod-autoscaler
./hack/vpa-up.sh
```

Create a VPA in recommendation-only mode:
```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app-vpa
spec:
  targetRef:
    apiVersion: "apps/v1"
    kind: Deployment
    name: my-app
  updatePolicy:
    updateMode: "Off"
```

After 24-48 hours, check recommendations:
```shell
kubectl describe vpa my-app-vpa
```

Apply recommendations incrementally: reduce requests by 10-30% initially and monitor for OOMKilled events or CPU throttling.
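Applying a recommendation just means editing the workload's requests. For example, if VPA's target for a container were 250m CPU and 300Mi memory, the Deployment fragment might become the following (all values are hypothetical):

```yaml
# Deployment fragment after adopting a VPA target recommendation
# (values are hypothetical; keep limits comfortably above requests at first)
spec:
  template:
    spec:
      containers:
        - name: my-app
          resources:
            requests:
              cpu: 250m
              memory: 300Mi
            limits:
              memory: 512Mi   # headroom while validating; watch for OOMKilled
```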
Set Resource Limits and Quotas
Prevent future over-provisioning with LimitRanges and ResourceQuotas:
```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: default-limits
  namespace: production
spec:
  limits:
    - default:
        cpu: 500m
        memory: 512Mi
      defaultRequest:
        cpu: 100m
        memory: 128Mi
      type: Container
```

Days 46-75: Implement Autoscaling
Autoscaling eliminates manual capacity management and ensures you pay only for what you use.
Configure Horizontal Pod Autoscaler
HPA scales pod replicas based on CPU, memory, or custom metrics:
```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```

Monitor HPA decisions:
```shell
kubectl get hpa --watch
```

Deploy Cluster Autoscaler or Karpenter
Cluster Autoscaler scales node groups based on pending pods. It’s mature and works well with managed node groups.
Create an IAM role with autoscaling permissions and annotate the ServiceAccount:
```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: cluster-autoscaler
  namespace: kube-system
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::ACCOUNT:role/cluster-autoscaler
```

Deploy Cluster Autoscaler with the `--balance-similar-node-groups` flag to distribute scale-outs evenly across AZs:
```shell
helm repo add autoscaler https://kubernetes.github.io/autoscaler
helm upgrade --install cluster-autoscaler autoscaler/cluster-autoscaler \
  --namespace kube-system \
  --set autoDiscovery.clusterName=my-cluster \
  --set extraArgs.balance-similar-node-groups=true
```

Karpenter is a newer alternative that provisions right-sized nodes faster and supports more aggressive consolidation. It’s ideal for dynamic workloads.
Choose based on your operational maturity—Cluster Autoscaler for stability, Karpenter for optimization.
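If you opt for Karpenter, provisioning is driven by a NodePool rather than node groups. A minimal hedged sketch against the v1 API; the referenced EC2NodeClass is assumed to be defined separately, and all names and limits are examples:

```yaml
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: general-purpose
spec:
  template:
    spec:
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default              # assumed to exist; defines AMI, subnets, etc.
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["on-demand"]
        - key: kubernetes.io/arch
          operator: In
          values: ["amd64"]
  disruption:
    consolidationPolicy: WhenEmptyOrUnderutilized   # aggressive bin-packing
    consolidateAfter: 1m
  limits:
    cpu: "100"                     # cap total provisioned capacity
```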
Test Autoscaling Behavior
Generate load to trigger scaling:
```shell
kubectl run -i --tty load-generator --rm --image=busybox --restart=Never -- \
  /bin/sh -c "while sleep 0.01; do wget -q -O- http://my-app; done"
```

Watch HPA scale pods and Cluster Autoscaler provision nodes. Verify that scale-down happens after the load subsides (default: 10 minutes).
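If scale-in is too twitchy (or too slow) for your workload, `autoscaling/v2` lets you tune it per HPA via the `behavior` field; a hedged sketch of an HPA fragment:

```yaml
# HPA fragment: slow down scale-in to avoid flapping after bursty load
spec:
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 600   # wait 10 min of low usage before scaling in
      policies:
        - type: Percent
          value: 50                     # remove at most 50% of replicas per period
          periodSeconds: 60
```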
Days 76-90: Introduce Spot Instances
Spot instances can reduce compute costs by up to 90%, but they require careful handling to avoid disruptions.
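One guardrail worth adding from the start: Spot interruptions hit one capacity pool at a time, so spreading a node group across several similarly sized instance types reduces the chance of losing many nodes at once. A hedged eksctl config sketch; group and instance type names are examples:

```yaml
# eksctl ClusterConfig fragment: a Spot managed node group diversified
# across several similarly sized instance types (names are examples)
managedNodeGroups:
  - name: spot-diverse
    spot: true
    instanceTypes: ["m5.large", "m5a.large", "m4.large"]
    minSize: 1
    maxSize: 10
    desiredCapacity: 3
```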
Create Mixed Node Groups
Start with a small Spot node group for non-critical workloads:
```shell
eksctl create nodegroup \
  --cluster=my-cluster \
  --name=spot-nodes \
  --node-type=m5.large \
  --nodes=3 \
  --nodes-min=1 \
  --nodes-max=10 \
  --spot \
  --node-labels="kubernetes.io/lifecycle=preemptible"
```

Keep a separate On-Demand node group for critical workloads:
```shell
eksctl create nodegroup \
  --cluster=my-cluster \
  --name=ondemand-nodes \
  --node-type=m5.large \
  --nodes=2 \
  --node-labels="kubernetes.io/lifecycle=essential"
```

Install Spot Termination Handler
AWS sends a 2-minute warning before reclaiming Spot instances. A termination handler cordons and drains nodes gracefully:
```shell
kubectl apply -f https://github.com/aws/aws-node-termination-handler/releases/download/v1.19.0/all-resources.yaml
```

Verify it’s running:
```shell
kubectl get daemonset -n kube-system | grep aws-node-termination-handler
```

Use Node Affinity to Pin Critical Pods
Ensure databases and stateful workloads stay on On-Demand nodes:
```yaml
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
        - matchExpressions:
            - key: "kubernetes.io/lifecycle"
              operator: "In"
              values:
                - essential
```

For batch jobs, prefer Spot:
```yaml
affinity:
  nodeAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        preference:
          matchExpressions:
            - key: "kubernetes.io/lifecycle"
              operator: "In"
              values:
                - preemptible
```

Set Pod Disruption Budgets
PDBs prevent too many pods from being evicted simultaneously:
```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: my-app-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: my-app
```

Ongoing: Sustain and Improve
Cost optimization isn’t a one-time project. Establish monthly rituals:
Monthly Cost Review
- Review Kubecost “Savings” tab for new rightsizing opportunities
- Check for orphaned EBS volumes and unused load balancers
- Analyze Cost Explorer for anomalies and trends
- Adjust Savings Plans coverage based on baseline usage
Automate Cleanup
Find and delete orphaned volumes:
```shell
aws ec2 describe-volumes \
  --filters Name=status,Values=available \
  --query 'Volumes[*].[VolumeId,Size,CreateTime]' \
  --output table
```

Schedule non-production cluster shutdowns using kube-downscaler:
```shell
helm repo add kube-downscaler https://charts.kiwigrid.com
helm upgrade --install kube-downscaler kube-downscaler/kube-downscaler \
  --set env.DEFAULT_UPTIME="Mon-Fri 08:00-18:00 America/New_York"
```

Measuring Success
Track these metrics monthly:
- Total EKS spend (EC2 + EBS + data transfer + control plane)
- Cost per pod (from Kubecost)
- Node utilization (target: >60% average CPU/memory)
- Spot instance percentage (target: 50-70% for tolerant workloads)
- Orphaned resource count (target: zero)
Expect 15-25% savings from right-sizing alone, 10-20% from autoscaling, and 30-50% from Spot adoption. These levers compound multiplicatively rather than additively: at mid-range values, 0.80 × 0.85 × 0.60 ≈ 0.41 of the original bill remains, roughly a 60% reduction, which is how combined savings can reach the 50-80% range.
Conclusion
This 90-day plan provides a structured path from measurement to meaningful savings. Start with visibility in weeks 1-2, tackle right-sizing in weeks 3-6, implement autoscaling in weeks 7-10, and carefully introduce Spot in weeks 11-14. The key is incremental progress with validation at each step—don’t skip measurement, don’t apply VPA in auto mode without testing, and don’t put critical workloads on Spot without fallback capacity. By day 90, you’ll have the foundation for sustainable cost optimization that adapts as your cluster grows.