Kubernetes Cost Recovery: Reclaim 20–40% of Wasted Cloud Spend

The average Kubernetes cluster wastes between 20% and 40% of its monthly cloud bill.

That number comes up consistently across FinOps audits, cloud provider usage reports, and teams that run their first serious cleanup sweep. The waste doesn't happen because teams are careless. It happens because Kubernetes makes it easy to create resources and hard to track whether they're still needed.

This guide explains the cost recovery process: where waste hides, how to find it systematically, how to quantify it in dollar terms, and how to reclaim it without risking production stability.

What "Cost Recovery" Actually Means in Kubernetes

Cost recovery in a Kubernetes context means identifying cloud resources that are allocated but not providing business value, then safely reclaiming the spend associated with them.

This is distinct from cost optimization (right-sizing workloads, choosing spot instances, etc.), which focuses on making active resources cheaper. Cost recovery targets resources that shouldn't exist at all.

The typical split in a cluster audit:

Cost optimization targets: 30–40% of findings (reduce cost of existing workloads)
Cost recovery targets: 60–70% of findings (eliminate cost of unused resources entirely)

Cost recovery almost always has a higher ROI: deleting a resource that is confirmed unused takes seconds and carries little risk, while right-sizing an active workload requires testing.

Where Kubernetes Waste Hides

1. Orphaned PersistentVolumeClaims

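PVCs are often the most expensive category of Kubernetes waste. When a StatefulSet or Deployment is deleted, its PVCs are normally left in place, so the underlying volumes keep billing. When the StorageClass uses the Retain reclaim policy, the PersistentVolume and its cloud disk survive even after the PVC itself is deleted. Dynamically provisioned volumes default to Delete, so check which policy each of your StorageClasses actually sets.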

yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: standard
provisioner: kubernetes.io/gce-pd
reclaimPolicy: Retain  # PV survives even after PVC deletion

A 100 GB SSD PVC on GKE costs roughly $17/month. A 100 GB EBS gp3 on EKS costs roughly $8/month. A cluster with 20 orphaned PVCs averaging 50 GB each (1,000 GB in total) adds $80–$170/month in pure waste, for storage that serves no application.
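The Retain behavior described above leaves PersistentVolumes in the Released phase once their claim is gone. A minimal sketch of how to total them up, assuming the input is the "items" list from `kubectl get pv -o json`; the function names and the inline sample are illustrative, and the $0.08/GB price is the gp3 figure from the text:

```python
def parse_gib(quantity: str) -> float:
    """Convert a Kubernetes storage quantity such as '100Gi' to GiB."""
    factors = {"Ki": 1 / 1024**2, "Mi": 1 / 1024, "Gi": 1.0, "Ti": 1024.0}
    for suffix, factor in factors.items():
        if quantity.endswith(suffix):
            return float(quantity[: -len(suffix)]) * factor
    raise ValueError(f"unsupported quantity: {quantity}")


def released_volumes(pvs: list[dict]) -> list[tuple[str, float]]:
    """Return (name, size in GiB) for PVs whose claim has been deleted."""
    return [
        (pv["metadata"]["name"], parse_gib(pv["spec"]["capacity"]["storage"]))
        for pv in pvs
        if pv.get("status", {}).get("phase") == "Released"
    ]


# Inline sample shaped like the items of `kubectl get pv -o json`.
sample_pvs = [
    {"metadata": {"name": "pv-a"},
     "spec": {"capacity": {"storage": "100Gi"}, "persistentVolumeReclaimPolicy": "Retain"},
     "status": {"phase": "Released"}},
    {"metadata": {"name": "pv-b"},
     "spec": {"capacity": {"storage": "50Gi"}, "persistentVolumeReclaimPolicy": "Delete"},
     "status": {"phase": "Bound"}},
]

leftovers = released_volumes(sample_pvs)
# Treats GiB and GB as equal, which is close enough for an estimate.
wasted = sum(size for _, size in leftovers) * 0.08
print(leftovers, f"${wasted:.2f}/month")
```

A Released volume is a strong candidate for cleanup, but the data on it may still matter, so confirm before deleting the disk.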

2. Unused Services and LoadBalancers

A Kubernetes Service of type LoadBalancer provisions a cloud load balancer. On GKE, that's ~$18/month. On EKS with NLB, ~$16/month. On AKS, ~$18/month.

Services frequently outlive the Deployments they front. The application is gone but the Service — and its cloud load balancer — keeps billing you.

How common is this? On clusters older than 6 months with active development, it's normal to find 5–15 orphaned Services. At $18/month each, that's $90–$270/month.
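One way to spot these Services is to compare each LoadBalancer Service's selector with the labels of the pods in its namespace. The sketch below assumes the inputs are the "items" lists from `kubectl get services -A -o json` and `kubectl get pods -A -o json`; the function name and sample data are hypothetical:

```python
def loadbalancers_without_pods(services: list[dict], pods: list[dict]) -> list[str]:
    """Return 'namespace/name' for LoadBalancer Services whose selector matches no pod."""
    orphans = []
    for svc in services:
        spec = svc.get("spec", {})
        selector = spec.get("selector")
        # Services without a selector have manually managed endpoints; skip them.
        if spec.get("type") != "LoadBalancer" or not selector:
            continue
        ns = svc["metadata"]["namespace"]
        matched = any(
            pod["metadata"]["namespace"] == ns
            and selector.items() <= (pod["metadata"].get("labels") or {}).items()
            for pod in pods
        )
        if not matched:
            orphans.append(f"{ns}/{svc['metadata']['name']}")
    return orphans


# Inline sample: "old-api" has no pods behind it, "web" does.
sample_services = [
    {"metadata": {"namespace": "shop", "name": "old-api"},
     "spec": {"type": "LoadBalancer", "selector": {"app": "old-api"}}},
    {"metadata": {"namespace": "shop", "name": "web"},
     "spec": {"type": "LoadBalancer", "selector": {"app": "web"}}},
]
sample_pods = [
    {"metadata": {"namespace": "shop", "name": "web-1",
                  "labels": {"app": "web", "tier": "frontend"}}},
]

orphaned_lbs = loadbalancers_without_pods(sample_services, sample_pods)
print(orphaned_lbs, f"~${len(orphaned_lbs) * 18}/month")  # $18 is the per-LB figure from the text
```

A match on labels says nothing about traffic, so treat the result as a list of candidates rather than a deletion list.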

3. Orphaned ConfigMaps and Secrets

These don't carry a direct storage cost (Kubernetes stores them in etcd, which is typically fixed overhead). However, orphaned Secrets carry a serious indirect cost: compliance exposure.

An orphaned Secret containing a database password or API key from a decommissioned service is an attack surface. In environments audited against SOC 2, HIPAA, or PCI-DSS, undocumented credentials are an audit finding. Remediation costs (incident response, audit fees, policy updates) can dwarf the cloud bill impact.

4. Unused Deployments and ReplicaSets

Workloads created for one-off tasks, A/B testing, debugging sessions, or staging experiments that never got cleaned up. These consume CPU and memory reservations even at zero traffic.

A 3-replica Deployment with requests of cpu: 200m and memory: 256Mi reserves 600m CPU and 768Mi RAM. If that capacity costs $0.048/hour on the node, the bill is ~$35/month (730 hours) for an application nobody is using.
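The reservation arithmetic can be reproduced in a few lines. This is a sketch only: the helper names are made up for illustration, the parsers handle just the common suffixes, and the $0.048/hour rate is the figure quoted above:

```python
HOURS_PER_MONTH = 730  # 8,760 hours per year / 12


def cpu_millicores(quantity: str) -> int:
    """'200m' -> 200, '1' -> 1000."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return int(float(quantity) * 1000)


def memory_mib(quantity: str) -> float:
    """'256Mi' -> 256.0, '1Gi' -> 1024.0 (binary suffixes only)."""
    factors = {"Ki": 1 / 1024, "Mi": 1.0, "Gi": 1024.0}
    for suffix, factor in factors.items():
        if quantity.endswith(suffix):
            return float(quantity[:-2]) * factor
    raise ValueError(f"unsupported quantity: {quantity}")


def reserved(replicas: int, cpu: str, memory: str) -> tuple[int, float]:
    """Total requested CPU (millicores) and memory (MiB) across all replicas."""
    return replicas * cpu_millicores(cpu), replicas * memory_mib(memory)


cpu_m, mem_mi = reserved(3, "200m", "256Mi")
monthly = 0.048 * HOURS_PER_MONTH
print(f"{cpu_m}m CPU, {mem_mi:.0f}Mi RAM, ${monthly:.2f}/month")
```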

5. Stale Namespaces

Entire namespaces created for a project, feature branch, or team that no longer exists. These can contain dozens of orphaned resources. The namespace itself doesn't cost anything, but its contents do.
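A rough first filter for stale namespaces is to list those with no running or pending pods. The sketch assumes namespace names and the "items" list from `kubectl get pods -A -o json` as input; the function name and the set of namespaces to skip are assumptions for illustration:

```python
# Namespaces that exist on every cluster and should not be reported.
SYSTEM_NAMESPACES = {"default", "kube-system", "kube-public", "kube-node-lease"}


def namespaces_without_pods(namespaces: list[str], pods: list[dict]) -> list[str]:
    """Return non-system namespaces that have no Running or Pending pods."""
    active = {
        pod["metadata"]["namespace"]
        for pod in pods
        if pod.get("status", {}).get("phase") in ("Running", "Pending")
    }
    return sorted(
        ns for ns in namespaces
        if ns not in active and ns not in SYSTEM_NAMESPACES
    )


sample_namespaces = ["default", "kube-system", "shop", "feature-login-v2", "team-x"]
sample_pods = [
    {"metadata": {"namespace": "shop", "name": "web-1"}, "status": {"phase": "Running"}},
    {"metadata": {"namespace": "team-x", "name": "job-1"}, "status": {"phase": "Succeeded"}},
]

stale_candidates = namespaces_without_pods(sample_namespaces, sample_pods)
print(stale_candidates)
```

A namespace with no live pods can still hold a CronJob or a scaled-down workload, so the output is a starting point for review, not proof of abandonment.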

The Cost Recovery Process

Step 1: Get a Complete Inventory With Cost Attribution

You can't recover what you can't see. The challenge with manual auditing is that orphan detection isn't just "does this resource exist" — it requires checking whether each resource is referenced by anything else.

A ConfigMap that looks unused might be mounted by a CronJob that only runs monthly. A Service that appears to have no pods might be a valid headless service for StatefulSet discovery. Naive scripts that just list resources with no traffic will create false positives.

Proper orphan detection requires building a dependency graph: which resources reference which other resources, so you can identify nodes in the graph with no inbound references from active workloads.
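A minimal sketch of that idea, covering only ConfigMaps, Secrets, and PVCs referenced from pod specs. The field names are standard pod spec fields; the function names and sample objects are hypothetical, and a real graph would also need Ingress TLS secrets, projected volumes, image pull secrets, and more:

```python
def pod_spec_of(obj: dict) -> dict:
    """Return the pod spec embedded in a Pod, CronJob, or template-based workload."""
    spec = obj.get("spec", {})
    if obj.get("kind") == "Pod":
        return spec
    if obj.get("kind") == "CronJob":
        spec = spec.get("jobTemplate", {}).get("spec", {})
    return spec.get("template", {}).get("spec", {})


def references(obj: dict) -> set:
    """Collect (kind, namespace, name) for everything the pod spec points at."""
    ns = obj["metadata"]["namespace"]
    spec = pod_spec_of(obj)
    refs = set()
    for vol in spec.get("volumes", []):
        if "configMap" in vol:
            refs.add(("ConfigMap", ns, vol["configMap"]["name"]))
        if "secret" in vol:
            refs.add(("Secret", ns, vol["secret"]["secretName"]))
        if "persistentVolumeClaim" in vol:
            refs.add(("PersistentVolumeClaim", ns, vol["persistentVolumeClaim"]["claimName"]))
    for ctr in spec.get("containers", []) + spec.get("initContainers", []):
        for src in ctr.get("envFrom", []):
            if "configMapRef" in src:
                refs.add(("ConfigMap", ns, src["configMapRef"]["name"]))
            if "secretRef" in src:
                refs.add(("Secret", ns, src["secretRef"]["name"]))
        for env in ctr.get("env", []):
            source = env.get("valueFrom", {})
            if "configMapKeyRef" in source:
                refs.add(("ConfigMap", ns, source["configMapKeyRef"]["name"]))
            if "secretKeyRef" in source:
                refs.add(("Secret", ns, source["secretKeyRef"]["name"]))
    return refs


def find_orphans(resources: list, workloads: list) -> list:
    """Return resources with no inbound reference from any workload."""
    used = set()
    for workload in workloads:
        used |= references(workload)
    return [
        r for r in resources
        if (r["kind"], r["metadata"]["namespace"], r["metadata"]["name"]) not in used
    ]


# A monthly CronJob keeps "monthly-report" alive even though no pod is running.
cronjob = {
    "kind": "CronJob",
    "metadata": {"namespace": "billing", "name": "report"},
    "spec": {"jobTemplate": {"spec": {"template": {"spec": {
        "containers": [{"name": "run"}],
        "volumes": [{"name": "cfg", "configMap": {"name": "monthly-report"}}],
    }}}}},
}
configmaps = [
    {"kind": "ConfigMap", "metadata": {"namespace": "billing", "name": "monthly-report"}},
    {"kind": "ConfigMap", "metadata": {"namespace": "billing", "name": "old-flags"}},
]

orphans = find_orphans(configmaps, [cronjob])
print([o["metadata"]["name"] for o in orphans])
```

Feeding in workload definitions rather than running pods is what keeps the monthly CronJob case from becoming a false positive.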

To do this manually for a production cluster takes days. The KorPro Inspector does it automatically:

bash
helm install korpro-inspector oci://ghcr.io/kortechnologies/charts/korpro-inspector \
  --namespace korpro-system \
  --create-namespace \
  --set licenseKey="<your-key>"

After the first scan (1–3 minutes), you get a full breakdown in the dashboard: every orphaned resource, its type, namespace, cost estimate, and whether it's a direct or transitive orphan.

Step 2: Classify by Risk Level

Not all orphans are equal. Before deleting anything, classify findings into risk tiers:

Low risk — safe to delete immediately:

ConfigMaps and Secrets confirmed not mounted anywhere
Services with no matching pods and no traffic for 30+ days
Completed Jobs and their associated pods
Old ReplicaSets with 0 desired replicas (left behind by Deployments after rollout; deleting one removes that revision from the Deployment's rollback history)

Medium risk — verify before deleting:

PVCs not mounted by any active pod (check if any CronJob or suspended workload needs them)
ServiceAccounts not assigned to any workload (check if any external system uses the token)
Deployments with 0 replicas (might be intentionally scaled down, not abandoned)

High risk — require team review:

Resources in the default namespace (often poorly documented, may have implicit dependencies)
Resources with no owner labels (no way to determine original creator or purpose)
Ingress resources (might be needed for SSL cert management even without backend traffic)
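The tiers above can be encoded as a small rule function. This sketch assumes every finding passed in has already been confirmed as orphaned; the `owner` label key is a hypothetical convention, and sending unknown kinds to team review is a conservative choice added here, not something the tiers prescribe:

```python
OWNER_LABEL = "owner"  # hypothetical label key; substitute your own convention


def risk_tier(finding: dict) -> str:
    """Map a confirmed-orphaned resource to 'low', 'medium', or 'high'."""
    kind = finding["kind"]
    meta = finding["metadata"]
    labels = meta.get("labels") or {}
    replicas = finding.get("spec", {}).get("replicas")

    # High risk: default namespace, no owner label, or an Ingress.
    if meta.get("namespace") == "default" or OWNER_LABEL not in labels or kind == "Ingress":
        return "high"
    # Medium risk: verify before deleting.
    if kind in ("PersistentVolumeClaim", "ServiceAccount"):
        return "medium"
    if kind == "Deployment" and replicas == 0:
        return "medium"
    # Low risk: safe to delete once confirmed unused.
    if kind in ("ConfigMap", "Secret", "Service", "Job"):
        return "low"
    if kind == "ReplicaSet" and replicas == 0:
        return "low"
    return "high"  # unknown kinds go to team review


findings = [
    {"kind": "ConfigMap", "metadata": {"namespace": "shop", "name": "old-flags",
                                       "labels": {"owner": "payments"}}},
    {"kind": "PersistentVolumeClaim", "metadata": {"namespace": "shop", "name": "data-old",
                                                   "labels": {"owner": "payments"}}},
    {"kind": "Secret", "metadata": {"namespace": "shop", "name": "db-pass"}},
    {"kind": "ConfigMap", "metadata": {"namespace": "default", "name": "misc",
                                       "labels": {"owner": "platform"}}},
]

tiers = [risk_tier(f) for f in findings]
print(tiers)
```

The high-risk checks run first so that a missing owner label or the default namespace overrides an otherwise low-risk kind.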

Step 3: Quantify the Dollar Value

For cost recovery to get organizational support, it needs a dollar figure. Break it down by resource type:

Orphaned PVCs:       12 × avg 80 GB × $0.08/GB/month = $76.80/month
Orphaned Services:    7 × LoadBalancer              = $126.00/month
Idle Deployments:     4 × avg $35/month             = $140.00/month
─────────────────────────────────────────────────────────────────────
Total recoverable:                                   = $342.80/month
Annual:                                              = $4,113.60/year

That's a number you can take to a manager or a budget review.
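The same breakdown as a short script, so the figures can be regenerated when counts change. The numbers are the ones from the table above; the variable names are illustrative:

```python
# Monthly cost per category, using the figures from the breakdown.
line_items = {
    "Orphaned PVCs": 12 * 80 * 0.08,    # count x avg GB x $/GB/month
    "Orphaned Services": 7 * 18.00,     # count x load balancer $/month
    "Idle Deployments": 4 * 35.00,      # count x avg $/month
}

monthly_total = round(sum(line_items.values()), 2)
annual_total = round(monthly_total * 12, 2)

for name, cost in line_items.items():
    print(f"{name:<20} ${cost:>8.2f}/month")
print(f"{'Total recoverable':<20} ${monthly_total:>8.2f}/month")
print(f"{'Annual':<20} ${annual_total:>8,.2f}/year")
```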

Step 4: Delete in Dependency Order

Transitive orphans must be deleted in the right order. If Resource B is orphaned because Resource A (its parent workload) is also orphaned, delete A first. Deleting B first sometimes works, but while A still exists it may recreate B (a Deployment recreates its ReplicaSets and pods) or fail on its next restart because B is missing.

For PVCs, the safe sequence:

Confirm no pod is currently mounting the PVC (kubectl get pods -A -o json | grep <pvc-name>)
Delete the PVC: kubectl delete pvc <name> -n <namespace>
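Verify what happened to the PV: with the Delete reclaim policy it is removed along with the cloud disk; with Retain it moves to Released and stays there, still billing, until you delete the PV and the disk manually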
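The grep in the first step matches any occurrence of the name, including pods in other namespaces and claims that merely share a prefix. A stricter check, sketched here under the assumption that the input is the "items" list from `kubectl get pods -A -o json` (the function name and sample pods are hypothetical):

```python
def pods_mounting(pods: list[dict], namespace: str, claim: str) -> list[str]:
    """Names of pods in `namespace` whose volumes reference exactly the PVC `claim`."""
    return [
        pod["metadata"]["name"]
        for pod in pods
        if pod["metadata"]["namespace"] == namespace
        and any(
            vol.get("persistentVolumeClaim", {}).get("claimName") == claim
            for vol in pod["spec"].get("volumes", [])
        )
    ]


sample_pods = [
    # Same namespace, but a different claim that shares the prefix "cache".
    {"metadata": {"namespace": "shop", "name": "db-0"},
     "spec": {"volumes": [{"name": "data", "persistentVolumeClaim": {"claimName": "cache-v2"}}]}},
    # Same claim name, but in another namespace.
    {"metadata": {"namespace": "dev", "name": "tool"},
     "spec": {"volumes": [{"name": "data", "persistentVolumeClaim": {"claimName": "cache"}}]}},
]

in_use_by = pods_mounting(sample_pods, "shop", "cache")
print(in_use_by or "no pod in shop mounts 'cache'")
```

An empty result only covers running pods; CronJobs and scaled-down workloads that reference the claim still need the manual check from the risk tiers.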

For LoadBalancer Services:

Check your DNS records and the cloud provider console to confirm nothing external still points at the load balancer's IP or hostname
Delete the Service: kubectl delete service <name> -n <namespace>
Verify the cloud load balancer is deprovisioned (check the cloud console, or confirm the charge is gone from the next billing report)

Step 5: Prevent Recurrence With Continuous Monitoring

A cluster doesn't stay clean after a one-time audit. New resources get created, workloads get decommissioned, and within 3–6 months you're back where you started.

Continuous monitoring runs the dependency graph analysis on a schedule and alerts you when new orphans appear — before they accumulate into a large cost problem. The KorPro Inspector's CronJob does this every 6 hours by default, with scan history tracking your cleanup progress over time.
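At its core, alerting on new orphans is a comparison between two scan results. The sketch below illustrates that comparison only; it makes no claim about how the KorPro Inspector implements it, and the `kind/namespace/name` identifier format is an assumption:

```python
def new_orphans(previous: set[str], current: set[str]) -> set[str]:
    """Orphans present in the current scan but not in the previous one."""
    return current - previous


def cleaned_up(previous: set[str], current: set[str]) -> set[str]:
    """Orphans from the previous scan that are gone now."""
    return previous - current


last_scan = {"ConfigMap/shop/old-flags", "Service/shop/old-api"}
this_scan = {"Service/shop/old-api", "PersistentVolumeClaim/shop/data-old"}

appeared = new_orphans(last_scan, this_scan)
resolved = cleaned_up(last_scan, this_scan)
print(f"new: {sorted(appeared)}, resolved: {sorted(resolved)}")
```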

Realistic Recovery Benchmarks

Based on typical cluster profiles:

Cluster Age	Expected Orphan Rate	Typical Monthly Recovery
< 3 months	5–10% of resources	$50–$200
3–12 months	15–25% of resources	$200–$800
1–2 years	25–40% of resources	$500–$2,000
2+ years	35–50% of resources	$1,000–$5,000+

Multi-cluster organizations multiply these numbers by cluster count. A company running 10 clusters with an average age of 18 months falls in the 1–2 year row, so it can realistically recover $5,000–$20,000/month after a systematic audit.

The Most Common Mistake: Auditing Namespaces in Isolation

The most common DIY cost recovery mistake is auditing namespace by namespace and treating each resource in isolation. This misses cross-namespace and cluster-scoped dependencies. For example, a ClusterRole may be bound by a RoleBinding in one namespace to a ServiceAccount used by a workload in another. An isolated namespace scan would flag the ClusterRole as orphaned; a full-cluster dependency graph would correctly identify it as active.

Always audit at the cluster level, not the namespace level.
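The difference shows up with even a tiny example. The sketch assumes binding objects shaped like the output of `kubectl get rolebindings,clusterrolebindings -A -o json`; the role names, namespaces, and function name are made up:

```python
def orphaned_cluster_roles(cluster_roles: list[str], bindings: list[dict]) -> list[str]:
    """ClusterRoles that none of the given bindings reference."""
    used = {
        b["roleRef"]["name"]
        for b in bindings
        if b["roleRef"]["kind"] == "ClusterRole"
    }
    return [name for name in cluster_roles if name not in used]


cluster_roles = ["metrics-reader", "legacy-admin"]
all_bindings = [
    {
        "kind": "RoleBinding",
        "metadata": {"namespace": "team-b", "name": "read-metrics"},
        "roleRef": {"kind": "ClusterRole", "name": "metrics-reader"},
        "subjects": [{"kind": "ServiceAccount", "name": "collector", "namespace": "team-a"}],
    },
]

# Namespace-by-namespace audit: only sees bindings that live in team-a.
team_a_only = [b for b in all_bindings if b["metadata"]["namespace"] == "team-a"]
isolated_view = orphaned_cluster_roles(cluster_roles, team_a_only)

# Cluster-level audit: sees every binding.
cluster_view = orphaned_cluster_roles(cluster_roles, all_bindings)

print("isolated scan flags:", isolated_view)
print("cluster scan flags: ", cluster_view)
```

The isolated scan flags both roles, while the cluster-wide scan flags only the one that is genuinely unused.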

Getting Started

If you want to run a cost recovery audit on your cluster today:

Create a free KorPro account — no credit card required
Get your license key from Settings → Inspector
Deploy the Inspector with the Helm command above
Trigger an immediate scan: kubectl create job cost-audit --from=cronjob/korpro-inspector -n korpro-system
View your full findings, cost breakdown, and prioritized deletion list in the dashboard

The free tier covers 1 cluster. Most teams see their first cost recovery opportunities within 5 minutes of the first scan completing.

Start Your Cost Recovery Audit

How much is your cluster spending on resources nobody is using? Deploy the KorPro Inspector for free and find out in under 5 minutes. For larger organizations looking to audit multiple clusters or integrate findings into FinOps workflows, contact our team for a guided assessment.
