Deployment
A controller that manages a ReplicaSet to keep a specified number of identical Pod replicas running and handles rolling updates.
What is a Deployment?
A Deployment is the standard way to run stateless, replicated workloads in Kubernetes. You declare the desired state — container image, replica count, update strategy, resource requests — and the Deployment controller continuously reconciles actual state to match it. If a Pod crashes, the controller replaces it. If you update the image, it performs a rolling update: creating new Pods with the new image while terminating old ones, maintaining availability throughout.
Deployments manage Pods indirectly through a ReplicaSet. Each rollout creates a new ReplicaSet; the Deployment controller scales up the new ReplicaSet and scales down the old one according to the configured maxSurge and maxUnavailable parameters. Old ReplicaSets are retained (up to revisionHistoryLimit, 10 by default) to enable rollbacks with kubectl rollout undo.
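As a sketch, the rollout lifecycle described above can be driven and inspected with kubectl. The commands assume the web-api Deployment from the example on this page; the v2.2.0 tag is an illustrative placeholder, not a real release.

```shell
# Trigger a rolling update by changing the container image
kubectl set image deployment/web-api api=my-org/web-api:v2.2.0 -n production

# Watch the new ReplicaSet scale up while the old one scales down
kubectl rollout status deployment/web-api -n production

# List retained revisions (bounded by revisionHistoryLimit)
kubectl rollout history deployment/web-api -n production

# Roll back to the previous revision if the new image misbehaves
kubectl rollout undo deployment/web-api -n production
```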
HorizontalPodAutoscaler (HPA) and VerticalPodAutoscaler (VPA) both target Deployments. HPA adjusts the replica count based on metrics such as CPU utilization or custom metrics; VPA adjusts the CPU and memory requests of the containers in the Pod template. Combining a Deployment with HPA is the most common pattern for production autoscaling of stateless services.
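As a hedged sketch, an HPA targeting the Deployment from the example on this page might look like the manifest below. The replica bounds and the 70% CPU target are illustrative assumptions, not recommendations.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-api
  namespace: production
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-api        # must match the Deployment's metadata.name
  minReplicas: 3         # floor; HPA never scales below this
  maxReplicas: 10        # ceiling; HPA never scales above this
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70   # target average CPU, as % of requests
```

Note that the utilization target is computed against the containers' CPU requests, which is one reason accurate requests on the Deployment matter.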
Example
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-api
  namespace: production
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web-api
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
  template:
    metadata:
      labels:
        app: web-api
    spec:
      containers:
      - name: api
        image: my-org/web-api:v2.1.0
        resources:
          requests:
            cpu: "250m"
            memory: "512Mi"
          limits:
            cpu: "500m"
            memory: "1Gi"
Cost & Waste Implications
Deployments with over-provisioned replica counts or resource requests are the most common source of compute waste in Kubernetes. A Deployment with replicas: 10 when actual traffic only requires 3 wastes 70% of its allocated compute budget. Without HPA, replicas are never automatically reduced during off-peak hours.
How KorPro Helps
KorPro identifies Deployments with consistently low CPU and memory utilization relative to their requests, flags replicas that have been idle for extended periods, and estimates monthly savings from rightsizing.
Related Terms
ReplicaSet
Workloads: A controller that ensures a specified number of Pod replicas are running at any given time.
HorizontalPodAutoscaler (HPA)
Scaling: A controller that automatically scales the replica count of a Deployment or StatefulSet based on observed metrics.
Resource Requests and Limits
Configuration: Per-container declarations of guaranteed CPU/memory (requests) and hard maximums (limits) that drive scheduling and enforcement.
Pod
Core Concepts: The smallest deployable unit in Kubernetes — one or more containers that share a network namespace and storage volumes.