Deployment
A controller that manages a ReplicaSet to keep a specified number of identical Pod replicas running and handles rolling updates.
What is a Deployment?
A Deployment is the standard way to run stateless, replicated workloads in Kubernetes. You declare the desired state — container image, replica count, update strategy, resource requests — and the Deployment controller continuously reconciles actual state to match it. If a Pod crashes, the controller replaces it. If you update the image, it performs a rolling update: creating new Pods with the new image while terminating old ones, maintaining availability throughout.
Deployments manage Pods indirectly through a ReplicaSet. Each rollout creates a new ReplicaSet; the Deployment controller scales up the new ReplicaSet and scales down the old one according to the configured maxSurge and maxUnavailable parameters. Old ReplicaSets are retained (up to revisionHistoryLimit, 10 by default) to enable rollbacks with kubectl rollout undo.
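As a sketch, the rollout lifecycle described above can be driven and inspected with kubectl. The commands assume the web-api Deployment from the example on this page; the v2.2.0 tag is an illustrative placeholder, not a real release.

```shell
# Trigger a rolling update by changing the container image
kubectl set image deployment/web-api api=my-org/web-api:v2.2.0 -n production

# Watch the new ReplicaSet scale up while the old one scales down
kubectl rollout status deployment/web-api -n production

# List retained revisions (bounded by revisionHistoryLimit)
kubectl rollout history deployment/web-api -n production

# Roll back to the previous revision if the new image misbehaves
kubectl rollout undo deployment/web-api -n production
```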
HorizontalPodAutoscaler (HPA) and VerticalPodAutoscaler (VPA) both target Deployments. HPA adjusts the replica count based on metrics such as CPU utilization or custom metrics; VPA adjusts the CPU and memory requests of the containers in the Pod template. Combining a Deployment with HPA is the most common pattern for production autoscaling of stateless services.
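As a hedged sketch, an HPA targeting the Deployment from the example on this page might look like the manifest below. The replica bounds and the 70% CPU target are illustrative assumptions, not recommendations.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-api
  namespace: production
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-api        # must match the Deployment's metadata.name
  minReplicas: 3         # floor; HPA never scales below this
  maxReplicas: 10        # ceiling; HPA never scales above this
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70   # target average CPU, as % of requests
```

Note that the utilization target is computed against the containers' CPU requests, which is one reason accurate requests on the Deployment matter.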
Example
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-api
  namespace: production
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web-api
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
  template:
    metadata:
      labels:
        app: web-api
    spec:
      containers:
      - name: api
        image: my-org/web-api:v2.1.0
        resources:
          requests:
            cpu: "250m"
            memory: "512Mi"
          limits:
            cpu: "500m"
            memory: "1Gi"
Cost & Waste Implications
Deployments with over-provisioned replica counts or resource requests are the most common source of compute waste in Kubernetes. A Deployment with replicas: 10 when actual traffic only requires 3 wastes 70% of its allocated compute budget. Without HPA, replicas are never automatically reduced during off-peak hours.
How KorPro Helps
KorPro identifies Deployments with consistently low CPU and memory utilization relative to their requests, flags replicas that have been idle for extended periods, and estimates monthly savings from rightsizing.
Related Terms
ReplicaSet
Workloads: A controller that ensures a specified number of Pod replicas are running at any given time.
HorizontalPodAutoscaler (HPA)
Scaling: A controller that automatically scales the replica count of a Deployment or StatefulSet based on observed metrics.
Resource Requests and Limits
Configuration: Per-container declarations of guaranteed CPU/memory (requests) and hard maximums (limits) that drive scheduling and enforcement.
Pod
Core Concepts: The smallest deployable unit in Kubernetes — one or more containers that share a network namespace and storage volumes.