Resource Requests and Limits
Per-container declarations of guaranteed CPU/memory (requests) and hard maximums (limits) that drive scheduling and enforcement.
What are Resource Requests and Limits?
Resource requests and limits are the primary mechanism by which Kubernetes manages compute resources at the container level. Requests are the amount of CPU and memory the scheduler reserves for a container: a node must have at least that much free allocatable capacity for the Pod to be scheduled there. Limits are the hard ceilings: a container exceeding its CPU limit is throttled, while a container exceeding its memory limit is terminated with reason OOMKilled and restarted.
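To confirm a memory-limit kill after the fact, the container's last terminated state records the reason (the pod name below is a placeholder):
# Inspect why the previous container instance was terminated
kubectl get pod api-7f9c -n production \
  -o jsonpath='{.status.containerStatuses[0].lastState.terminated.reason}'
# Prints OOMKilled if the container breached its memory limit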
The relationship between requests and limits determines a Pod's Quality of Service (QoS) class, which governs eviction order under node pressure. Guaranteed QoS (requests equal limits for CPU and memory in every container) makes the Pod the last to be evicted. Burstable QoS (at least one container sets a request or limit, but the Pod doesn't meet the Guaranteed criteria) is the middle tier. BestEffort QoS (no requests or limits on any container) is evicted first and gives the scheduler nothing to reserve.
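The class Kubernetes assigned to a running Pod is recorded in its status and can be read directly (pod and namespace names are placeholders):
# Print the QoS class assigned to a running Pod
kubectl get pod api-7f9c -n production -o jsonpath='{.status.qosClass}'
# Output is Guaranteed, Burstable, or BestEffort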
Setting requests too high wastes cluster capacity — reserved but unused CPU and memory can't be used by other Pods. Setting requests too low causes over-scheduling, node pressure, and OOMKill cascades. The Vertical Pod Autoscaler (VPA) analyzes historical usage and recommends (or automatically applies) right-sized requests, closing the gap between reserved and used resources.
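As a sketch, a minimal VPA manifest in recommendation-only mode might look like this (the Deployment name is a placeholder, and the VPA controller and its CRDs must already be installed in the cluster):
# VPA that recommends request sizes without applying them
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: api-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api            # placeholder workload name
  updatePolicy:
    updateMode: "Off"    # recommend only; "Auto" would apply changes
The resulting recommendations appear under the object's status and can be read with kubectl describe vpa api-vpa.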
Example
# A container spec with both requests and limits set
containers:
- name: api
  image: my-org/api:v3
  resources:
    requests:
      cpu: "250m"      # 0.25 cores guaranteed
      memory: "512Mi"  # 512 MiB guaranteed
    limits:
      cpu: "1000m"     # 1 core maximum
      memory: "1Gi"    # 1 GiB maximum (OOMKilled if exceeded)

# Check actual usage against requests, per container
kubectl top pods -n production --containers
Cost & Waste Implications
Kubernetes cloud costs are determined by node capacity, not Pod utilization: the scheduler reserves whatever Pods request, so if Pods request 4Gi of memory each but use only 500Mi, the cluster holds roughly 8x the node memory it actually needs. Studies show average cluster memory utilization around 20% — meaning 80% of provisioned memory is paid for but unused. Rightsizing requests to match p95 actual usage typically reduces cluster node count by 30–50%.
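A rough way to eyeball that gap by hand, before reaching for dedicated tooling (the namespace is a placeholder):
# Declared requests per Pod...
kubectl get pods -n production -o custom-columns=\
'NAME:.metadata.name,CPU_REQ:.spec.containers[*].resources.requests.cpu,MEM_REQ:.spec.containers[*].resources.requests.memory'
# ...versus live usage reported by the metrics server
kubectl top pods -n production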
How KorPro Helps
KorPro analyzes actual CPU and memory utilization from cluster metrics and compares it against declared resource requests, surfacing over-provisioned workloads with estimated monthly savings from rightsizing.
Related Terms
Pod
Core Concepts
The smallest deployable unit in Kubernetes — one or more containers that share a network namespace and storage volumes.
Vertical Pod Autoscaler (VPA)
Scaling
A controller that recommends or automatically adjusts CPU and memory resource requests for Pods based on observed usage.
Horizontal Pod Autoscaler (HPA)
Scaling
A controller that automatically scales the replica count of a Deployment or StatefulSet based on observed metrics.
Kubernetes Cost Optimization
FinOps
The practice of reducing Kubernetes infrastructure spend while maintaining performance and reliability.
Stop Wasting Money on Orphaned Kubernetes Resources
KorPro connects to your clusters across GCP, AWS, and Azure — no agents, no installation — and surfaces every orphaned resource with its monthly cost estimate.