Workloads

Job

A workload resource that runs one or more Pods to completion, retrying failures until a specified number of Pods have terminated successfully or the retry limit is reached.

What is a Job?

A Kubernetes Job creates one or more Pods and tracks their completion. Unlike a Deployment, which keeps Pods running indefinitely, a Job runs a task to successful completion: a database migration, a batch data-processing step, a one-off report. The Job controller retries failed Pods until the number of retries reaches backoffLimit (default 6), at which point it marks the Job Failed. If the required number of Pods succeed first, it marks the Job Complete.

Jobs support parallelism: spec.parallelism sets how many Pods may run at the same time, and spec.completions sets how many must succeed in total. The work-queue pattern sets parallelism but leaves spec.completions unset: Pods pull items from a queue and exit when the queue is empty. Indexed completion mode (spec.completionMode: Indexed) assigns each Pod a stable index from 0 to completions-1, exposed through the JOB_COMPLETION_INDEX environment variable, for sharding batch workloads.
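
The two parallel patterns described above can be sketched as manifests. This is a minimal illustration, not a production configuration: the Job names, the example.com image, and the echo command are hypothetical placeholders, and the queue client logic inside the worker image is assumed rather than shown.

```yaml
# Work-queue pattern: parallelism is set, completions is left unset.
# Each Pod is expected to pull items from an external queue and exit
# successfully once the queue is empty.
apiVersion: batch/v1
kind: Job
metadata:
  name: queue-worker            # hypothetical name
spec:
  parallelism: 3                # up to 3 worker Pods at a time
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: worker
        image: example.com/queue-worker:v1   # hypothetical image
---
# Indexed completion mode: 5 completions, at most 2 Pods at a time.
# Each Pod receives a distinct index (0 through 4) in JOB_COMPLETION_INDEX.
apiVersion: batch/v1
kind: Job
metadata:
  name: sharded-batch           # hypothetical name
spec:
  completionMode: Indexed
  completions: 5
  parallelism: 2
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: shard
        image: busybox:1.36
        # Placeholder command; a real workload would use the index to select its shard.
        command: ["sh", "-c", "echo processing shard $JOB_COMPLETION_INDEX"]
```

In the first Job, leaving completions unset is what makes it a work queue. In the second, the index is the only thing that distinguishes one Pod from another, so the workload itself must map the index to its share of the data.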

Finished Jobs (Complete or Failed) and their Pods are not garbage-collected by default. The ttlSecondsAfterFinished field (GA since Kubernetes 1.23) enables automatic cleanup a configurable number of seconds after the Job finishes. Without it, finished Jobs and their Pods accumulate indefinitely, bloating etcd and the output of kubectl get pods.

Example

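apiVersion: batch/v1
kind: Job
metadata:
  name: db-migration-v4
  namespace: production
spec:
  # Delete the Job and its Pods one hour after it finishes
  ttlSecondsAfterFinished: 3600
  # Mark the Job Failed once the retry limit is reached
  backoffLimit: 3
  template:
    spec:
      # Restart the container in place on failure; restarts count toward backoffLimit
      restartPolicy: OnFailure
      containers:
      - name: migrator
        image: my-org/migrator:v4
        resources:
          requests:
            cpu: "500m"
            memory: "256Mi"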

Cost & Waste Implications

Completed and failed Jobs with no ttlSecondsAfterFinished configured accumulate indefinitely. Clusters running CI/CD pipelines through Jobs can accumulate thousands of completed Job and Pod objects over months, degrading API server performance and adding unnecessary etcd load. The Pods themselves no longer consume compute, but the API objects consume etcd storage and slow list operations.

How KorPro Helps

KorPro detects clusters with large counts of accumulated completed or failed Jobs and estimates the etcd bloat they cause, recommending ttlSecondsAfterFinished configurations or batch cleanup.