Cluster Autoscaler
A component that automatically adds nodes when Pods are unschedulable and removes nodes when they are underutilized.
What is Cluster Autoscaler?
The Cluster Autoscaler (CA) is a Kubernetes component that adjusts the number of nodes in a cluster based on Pod scheduling demand. It monitors for Pods in Pending state — Pods that the scheduler cannot place because no node has sufficient available CPU, memory, or other resources. When pending Pods are detected, CA adds nodes to the cluster (by scaling up the underlying node group or autoscaling group) until the Pods can be scheduled. Conversely, CA identifies underutilized nodes (where all Pods could fit on other nodes) and safely drains and terminates them.
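For illustration, a Pod whose resource requests exceed the free capacity of every existing node stays Pending, which is the signal CA reacts to. A minimal sketch (the Pod name and request sizes are hypothetical):

```yaml
# Hypothetical Pod: if no node has 4 CPU / 8Gi of unallocated capacity,
# the scheduler leaves it Pending and Cluster Autoscaler scales up a
# node group whose instance type can fit the requests.
apiVersion: v1
kind: Pod
metadata:
  name: demo-big-pod   # hypothetical name
spec:
  containers:
  - name: app
    image: nginx:1.25
    resources:
      requests:
        cpu: "4"
        memory: 8Gi
```

Note that CA reasons about *requests*, not actual usage: a Pod with no requests set never triggers a scale-up, no matter how much it consumes.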
CA makes scale-down decisions conservatively: a node must be underutilized for a configurable duration (default 10 minutes), and CA checks that all evictable Pods on that node can be rescheduled elsewhere, respecting PodDisruptionBudgets, local storage, and system Pod constraints. Nodes with Pods that cannot be evicted (e.g., Pods with no controller, or Pods with PDBs that would be violated) are never removed by CA.
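Two of the constraints mentioned above can be expressed directly in manifests. A sketch with hypothetical names, using the real `cluster-autoscaler.kubernetes.io/safe-to-evict` annotation and a standard PodDisruptionBudget:

```yaml
# 1. Annotation that tells CA it must not evict this Pod, so the
#    node it runs on will not be scaled down.
apiVersion: v1
kind: Pod
metadata:
  name: stateful-worker          # hypothetical name
  annotations:
    cluster-autoscaler.kubernetes.io/safe-to-evict: "false"
spec:
  containers:
  - name: worker
    image: busybox:1.36
    command: ["sleep", "infinity"]
---
# 2. PodDisruptionBudget: CA will not drain a node if doing so would
#    drop the number of available matching Pods below minAvailable.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: web-pdb                  # hypothetical name
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: web
```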
CA works with cloud-provider-specific node groups (AWS Auto Scaling Groups, GCE Managed Instance Groups, Azure VMSS). It requires the cluster's nodes to be organized into node groups, each sharing the same instance type and configuration. Karpenter is a newer alternative that provisions nodes individually rather than through pre-configured node groups, enabling more flexible and faster scaling.
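On AWS, for example, CA is typically pointed at its node groups via ASG tags rather than hard-coded names. A container-spec fragment from a CA Deployment, sketched with a hypothetical cluster name and image version:

```yaml
# Fragment of a Cluster Autoscaler Deployment on AWS: node groups are
# discovered by Auto Scaling Group tags instead of being listed explicitly.
      containers:
      - name: cluster-autoscaler
        image: registry.k8s.io/autoscaling/cluster-autoscaler:v1.29.0  # version is illustrative
        command:
        - ./cluster-autoscaler
        - --cloud-provider=aws
        - --node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled,k8s.io/cluster-autoscaler/my-cluster
```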
Example
# Check cluster autoscaler status
kubectl -n kube-system describe configmap cluster-autoscaler-status
# View CA logs
kubectl -n kube-system logs -l app=cluster-autoscaler --tail=100
# Check for pending pods that triggered scale-up
kubectl get pods --all-namespaces --field-selector=status.phase=Pending
Cost & Waste Implications
Without Cluster Autoscaler, clusters must be over-provisioned for peak load — paying for idle nodes 24/7. With CA, off-peak hours can run with significantly fewer nodes, saving 30–60% of node costs for workloads with variable traffic patterns. Poorly tuned CA (scale-down delay too high, underutilization threshold too low) leaves idle nodes running longer than necessary.
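Tuning these knobs means adjusting CA's command-line flags. A hypothetical fragment of a CA container spec that makes scale-down more aggressive for bursty workloads (values are illustrative, defaults noted in comments):

```yaml
# Hypothetical tuning: reclaim idle nodes sooner after a traffic burst.
        - --scale-down-delay-after-add=5m         # default 10m
        - --scale-down-unneeded-time=5m           # default 10m
        - --scale-down-utilization-threshold=0.6  # default 0.5; higher = more nodes eligible for removal
```

Shortening these windows saves money faster but risks thrashing (scale-down followed immediately by scale-up) if traffic is spiky, so changes should be validated against real load patterns.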
How KorPro Helps
KorPro analyzes cluster node utilization trends over time and identifies clusters where Cluster Autoscaler is absent or misconfigured, estimating the cost difference between static and dynamic node provisioning.
Related Terms
Node (Core Concepts): A physical or virtual machine in a Kubernetes cluster that runs Pods under the direction of the control plane.
HorizontalPodAutoscaler (HPA) (Scaling): A controller that automatically scales the replica count of a Deployment or StatefulSet based on observed metrics.
Karpenter (Scaling): An open-source Kubernetes node provisioner that launches the optimal nodes for pending Pods in seconds, without pre-configured node groups.
PodDisruptionBudget (PDB) (Operations): A policy object that limits how many Pods of a deployment can be simultaneously unavailable during voluntary disruptions.