Cost Optimization

Kubernetes Cost Audit Checklist for EKS, GKE, and AKS

A practical Kubernetes cost audit checklist covering idle workloads, orphaned storage, stale namespaces, and ownership gaps across EKS, GKE, and AKS. Built for platform teams who need to recover real spend.

KorPro Team
May 6, 2026
10 min read
Kubernetes · Cost Optimization · EKS · GKE · AKS · Audit · FinOps · Cloud Spend

Every Kubernetes cluster older than a few months has resources costing money without doing useful work. The culprits are rarely obvious: a PersistentVolumeClaim from a deleted staging deployment, a LoadBalancer service that outlived its Ingress, a namespace full of completed jobs no one cleaned up. Across EKS, GKE, and AKS, these patterns are nearly universal.

A structured cost audit turns invisible waste into a recoverable list. This checklist covers everything a platform team, FinOps practitioner, or cloud engineer needs to run a thorough Kubernetes cost audit — with concrete detection commands, cloud-specific callouts, and guidance on how to validate findings before acting.

Why Managed Clusters Accumulate Waste Faster Than You Think

The operational model for managed Kubernetes encourages rapid iteration. Teams deploy frequently, spin up feature environments, scale during peak events, and migrate between services. The cloud provider handles the control plane; the team handles the rest.

The problem is cleanup rarely keeps pace with deployment. When a team deletes a Deployment, the associated PersistentVolumeClaim is not automatically deleted unless the StorageClass has a Delete reclaim policy. When a Namespace is marked for deletion but a finalizer holds it open, resources inside it may remain billable. When a LoadBalancer service outlives its application, the cloud provider keeps provisioning the external load balancer and charging accordingly.
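
One quick check follows from this: list each StorageClass with its reclaim policy, so you know which classes will leave disks behind on PVC deletion. A minimal sketch using kubectl and jq:

```bash
# List each StorageClass with its reclaim policy. "Retain" means that
# deleting a PVC leaves the underlying cloud disk provisioned and billable.
kubectl get storageclass -o json | \
  jq -r '.items[] | [.metadata.name, (.reclaimPolicy // "Delete")] | @tsv'
```

Anything showing Retain deserves a documented cleanup owner, since deleting the PVC alone will not release the disk.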

EKS, GKE, and AKS each add their own wrinkles: on EKS, EBS volumes do not follow pods across availability zones; on GKE, Autopilot bills at the pod resource request level; on AKS, pricing is tied to the VM SKU rather than the container workload.

A cost audit surfaces these issues systematically before your FinOps review turns them into a budget conversation.

The Complete Kubernetes Cost Audit Checklist

1. Idle and Oversized Workloads

Idle workloads include Deployments and StatefulSets with replicas set to zero, pods with CPU and memory utilization consistently below 5%, and Jobs or CronJobs that have not run successfully in weeks.

Oversized workloads — those with resource requests far exceeding actual usage — do not always waste money directly, but on managed node groups and Autopilot-style pricing, over-requesting blocks bin-packing and forces unnecessary node provisioning.

Detection:

```bash
# Find Deployments with zero replicas
kubectl get deployments --all-namespaces -o json | \
  jq '.items[] | select(.spec.replicas == 0) | {name: .metadata.name, namespace: .metadata.namespace}'

# Find pods with at least one container missing a CPU request
kubectl get pods --all-namespaces -o json | \
  jq '.items[] | select(any(.spec.containers[]; .resources.requests.cpu == null)) | {name: .metadata.name, namespace: .metadata.namespace}'
```

What to look for: Workloads with zero replicas that have been scaled down for more than 30 days are strong cleanup candidates. Pods with no resource requests set are a node efficiency risk.

2. Orphaned PersistentVolumeClaims and PVs

This is consistently the highest-value line item in a Kubernetes cost audit. PersistentVolumeClaims become orphaned when the pod that mounted them is deleted without deleting the claim. The underlying cloud disk — EBS, Persistent Disk, or Managed Disk — continues to bill at the full provisioned size.

PersistentVolumes enter Released status when their claim is deleted but the reclaim policy is Retain. The volume remains provisioned and billable until manually deleted.

Detection:

```bash
# Find PVCs not in Bound phase
kubectl get pvc --all-namespaces -o json | \
  jq '.items[] | select(.status.phase != "Bound") | {name: .metadata.name, namespace: .metadata.namespace, status: .status.phase, size: .spec.resources.requests.storage}'

# Find Released PVs
kubectl get pv -o json | \
  jq '.items[] | select(.status.phase == "Released") | {name: .metadata.name, capacity: .spec.capacity.storage, storageClass: .spec.storageClassName}'
```

For a deep dive on detection, validation, and safe deletion steps, see how to find orphaned PVCs and PVs before they inflate your cloud bill.

3. Unused Services, LoadBalancers, and Ingresses

LoadBalancer services provision a cloud load balancer on creation. If the application backing the service is gone or the service was created for a one-time test, the load balancer continues running — typically $15–$20/month per resource on most cloud providers.

ClusterIP and NodePort services that have no pods matching their selector are functionally useless and often signal incomplete cleanup.

Detection:

```bash
# List all LoadBalancer services across namespaces
kubectl get services --all-namespaces --field-selector spec.type=LoadBalancer

# Find services with no matching endpoints
kubectl get endpoints --all-namespaces -o json | \
  jq '.items[] | select(.subsets == null or (.subsets | length == 0)) | {name: .metadata.name, namespace: .metadata.namespace}'
```

What to look for: Services with empty endpoint lists that have persisted for more than 7 days are strong cleanup candidates.

4. Stale Namespaces and Forgotten Environments

Feature branches, pull request environments, QA environments, and load test namespaces are a major source of accumulated waste. When a sprint ends or a feature ships, the namespace often stays.

Detection:

```bash
# List namespaces sorted by creation date
kubectl get namespaces -o json | \
  jq '.items | sort_by(.metadata.creationTimestamp) | .[] | {name: .metadata.name, created: .metadata.creationTimestamp}'

# Find namespaces with no running pods: emit active namespaces as raw
# lines (not a JSON array) so the whole-line grep match works
kubectl get pods --all-namespaces -o json | \
  jq -r '[.items[].metadata.namespace] | unique | .[]' > /tmp/active_namespaces.txt
kubectl get namespaces -o jsonpath='{.items[*].metadata.name}' | \
  tr ' ' '\n' | grep -Fvxf /tmp/active_namespaces.txt
```

What to look for: Namespaces older than 60 days with no active pods and naming patterns like pr-*, test-*, staging-v*, or dev-*.

5. Unused ConfigMaps and Secrets

ConfigMaps and Secrets that are not mounted by any pod or referenced by any deployment accumulate over time without contributing to running workloads. While storage cost is negligible, unused Secrets represent a security hygiene risk — credentials and tokens that are no longer needed but remain accessible.

For detection at scale, see how to find and remove orphaned ConfigMaps and finding unused Kubernetes secrets.
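
As a first pass, the cross-reference can be sketched directly with kubectl and jq. This checks volume mounts, env, and envFrom references only; operators and CRDs may reference ConfigMaps in other ways, so treat the output as review candidates, not a deletion list:

```bash
# ConfigMaps referenced by any pod (volumes, env, envFrom)
kubectl get pods --all-namespaces -o json | \
  jq -r '.items[] | .metadata.namespace as $ns |
    ([.spec.volumes[]? | .configMap.name]
     + [.spec.containers[].env[]? | .valueFrom.configMapKeyRef.name]
     + [.spec.containers[].envFrom[]? | .configMapRef.name])
    | .[] | select(. != null) | "\($ns)/\(.)"' | sort -u > /tmp/used_cms.txt

# All ConfigMaps in namespace/name form
kubectl get configmaps --all-namespaces -o json | \
  jq -r '.items[] | "\(.metadata.namespace)/\(.metadata.name)"' | sort > /tmp/all_cms.txt

# ConfigMaps no pod references
comm -23 /tmp/all_cms.txt /tmp/used_cms.txt
```

The same pattern applies to Secrets by swapping the jq paths for secret, secretKeyRef, and secretRef.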

6. Ownership and Label Hygiene

Resources without owner, team, or cost-center labels cannot be attributed to a team or project. This makes cost allocation guesswork and means cleanup approvals have no clear owner to contact.

Detection:

```bash
# Find Deployments without an 'owner' label
kubectl get deployments --all-namespaces -o json | \
  jq '.items[] | select(.metadata.labels.owner == null) | {name: .metadata.name, namespace: .metadata.namespace}'
```

What to look for: Any resource class with more than 20% of objects missing standard ownership labels needs a labeling policy before the next audit.
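
That 20% threshold can be measured directly. A sketch for Deployments and an owner label; swap in your own label keys and resource kinds:

```bash
# Integer percentage of Deployments missing an 'owner' label
kubectl get deployments --all-namespaces -o json | \
  jq '(.items | length) as $total
      | ([.items[] | select(.metadata.labels.owner == null)] | length) as $missing
      | if $total == 0 then 0 else ($missing * 100 / $total | floor) end'
```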

7. Node Utilization and Right-Sizing

Nodes running at consistently low CPU and memory utilization indicate over-provisioning of the node group. On EKS and AKS with reserved instances or committed-use discounts, this means paying for capacity that pods cannot fill.

Detection:

```bash
# Node resource usage (requires metrics-server)
kubectl top nodes

# Node capacity vs allocatable
kubectl get nodes -o json | \
  jq '.items[] | {name: .metadata.name, cpu: .status.capacity.cpu, memory: .status.capacity.memory, allocatable_cpu: .status.allocatable.cpu, allocatable_memory: .status.allocatable.memory}'
```

What to look for: Nodes consistently below 30% CPU and 40% memory utilization for more than two weeks are candidates for right-sizing or bin-packing review.

EKS-Specific Audit Points

  • EBS volumes in the wrong AZ: EBS volumes are availability-zone-locked. If a pod reschedules to a different AZ, a new volume is provisioned and the old one may be orphaned. Check for PVs in Released state and cross-reference with EBS volumes in your AWS console.
  • NAT Gateway traffic costs: Pods communicating across AZs incur NAT Gateway charges that do not appear in Kubernetes-side cost tools. Check AWS Cost Explorer for inter-AZ data transfer.
  • Unused EKS node groups: Node groups created for specific workloads but currently empty still incur reservation costs on on-demand capacity.
  • Fargate idle pods: Fargate pods bill per second of vCPU and memory. Idle or long-running Completed pods still bill. Check for Fargate pods in Pending state.
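
To cross-reference Released PVs with the EBS side, unattached volumes can be listed with the AWS CLI. A sketch, assuming credentials with the ec2:DescribeVolumes permission:

```bash
# EBS volumes in "available" state are attached to nothing and still bill
aws ec2 describe-volumes \
  --filters Name=status,Values=available \
  --output json | \
  jq -r '.Volumes[] | [.VolumeId, "\(.Size) GiB", .CreateTime] | @tsv'
```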

GKE-Specific Audit Points

  • Autopilot pricing: GKE Autopilot charges on pod resource requests, not node capacity. Over-requested pods directly increase your bill. Audit pod resource requests against actual utilization from Cloud Monitoring.
  • Regional vs. zonal clusters: Regional clusters provision three control plane instances. If you are running regional GKE clusters for development workloads, consider switching them to zonal.
  • Persistent Disk reclaim policies: Verify your StorageClasses have the expected reclaim policy. GKE does not always default to Delete — a Retain policy means disks stay after PVC deletion.
  • Dataplane V2 logging costs: Verbose flow logs can generate unexpected Cloud Logging costs. Confirm log sinks are configured correctly.
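
The disk check can be run from the gcloud side as well. A sketch: the `-users:*` filter matches Persistent Disks attached to no instance:

```bash
# Persistent Disks with no users (not attached to any instance)
gcloud compute disks list --filter="-users:*" --format=json | \
  jq -r '.[] | [.name, "\(.sizeGb) GB", (.zone | split("/") | last)] | @tsv'
```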

AKS-Specific Audit Points

  • Stopped vs. deallocated node pools: AKS allows stopping node pools, but stopped pools still bill for their managed disks. Verify that stopped pools are either fully deleted or have no associated disks.
  • Azure Load Balancer idle charges: Azure Load Balancers in idle state still accrue a small hourly charge. Unused LoadBalancer-type services in AKS should be deleted, not just left with zero replicas behind them.
  • Azure Disk retention on cluster delete: When deleting AKS clusters, Managed Disks associated with PVCs are not always automatically deleted. Check your resource group for orphaned disks after cluster teardowns.
  • System node pool overhead: AKS requires a system node pool at all times. Verify the system node pool VM SKU is not overprovisioned for the actual system workload it carries.
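
A sketch for finding those orphaned disks with the Azure CLI, assuming az login and read access to the subscription:

```bash
# Managed Disks not attached to any VM
az disk list --output json | \
  jq -r '.[] | select(.diskState == "Unattached") |
    [.name, "\(.diskSizeGb) GiB", .resourceGroup] | @tsv'
```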

Audit Checklist Summary

| Audit Item | Detection Method | Potential Business Impact |
| --- | --- | --- |
| Zero-replica Deployments | `spec.replicas == 0` filter | Node slot waste; active LB services billing with no workload |
| Orphaned PVCs (unbound or Pending) | PVC phase filter | Direct storage billing per provisioned GB |
| Released PVs | PV phase filter | Direct storage billing per provisioned GB |
| Empty-endpoint Services | Endpoints subsets check | LB provisioning cost per service |
| Stale namespaces | Namespace age + pod activity | Accumulated waste across all resource types |
| Unused Secrets and ConfigMaps | Cross-reference with mounted volumes | Security risk and hygiene debt |
| Unlabeled resources | Label selector absence filter | Cost attribution gaps and slow cleanup cycles |
| Low-utilization nodes | `kubectl top nodes` | Over-provisioned node groups and wasted reserved capacity |

Turning Audit Findings Into Action

A good audit generates a prioritized list of cleanup opportunities — not a deletion script. Before acting on any finding, validate:

  1. Is the resource still referenced? Check for indirect references — HorizontalPodAutoscalers, VolumeSnapshotContents, Helm release metadata.
  2. Who owns this? Check labels, annotations, and recent events. If ownership is unclear, check Git history for who created the namespace or Helm release.
  3. What is the actual cost? Estimate monthly impact before deciding urgency. A 10 GB EBS volume is roughly $1/month; a 2 TB volume is a different conversation.
  4. Can it be safely deleted without data loss? For PVCs especially, confirm no application is in a failed-restart loop that needs the data to recover.
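
The first validation step can be partially automated. For example, workloads targeted by a HorizontalPodAutoscaler should not be treated as idle even at low replica counts. A sketch using kubectl and jq:

```bash
# Deployments referenced by an HPA: the autoscaler may scale these
# back up, so exclude them from zero-replica cleanup candidates
kubectl get hpa --all-namespaces -o json | \
  jq -r '.items[] | select(.spec.scaleTargetRef.kind == "Deployment") |
    "\(.metadata.namespace)/\(.spec.scaleTargetRef.name)"'
```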

The audit-first workflow — discover with read-only access, review findings with owners, act with confidence — protects against cleanup incidents and creates a repeatable process.

For namespace-level cost breakdown, see how to audit Kubernetes costs by namespace.


Start Your Cost Recovery Audit

KorPro scans all seven checklist categories above automatically — across EKS, GKE, and AKS — without requiring cloud credentials or write access to your cluster.

Create your free KorPro account | Contact our team

