Day-17: Mastering Resource Limits and Requests in Kubernetes

Kubernetes Resource Limits and Requests: How to Manage CPU & Memory Effectively

Introduction:

In Kubernetes, containers are lightweight, but without resource controls a single container can:

  • Consume all memory and crash the node
  • Starve other containers of CPU
  • Break production apps due to resource contention

That's why resource requests and limits are vital.

In this post, you'll learn:

  • What resource requests and limits are
  • How Kubernetes uses them for scheduling and throttling
  • How to set them correctly with real-world YAMLs
  • Best practices for managing resources at scale

What Are Resource Requests and Limits?

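A resource request is the amount of CPU or memory that Kubernetes reserves for a container; the scheduler only places the Pod on a node with that much unreserved capacity. A resource limit is the maximum amount the container is allowed to use.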


CPU and Memory Units

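  • CPU is measured in cores. 1 CPU equals 1000m (millicores), so 250m is a quarter of a core and "0.5" is the same as 500m.
  • Memory is measured in bytes. Binary suffixes (Ki, Mi, Gi) are powers of 1024; decimal suffixes (k, M, G) are powers of 1000. 64Mi is 67,108,864 bytes, while 64M is 64,000,000 bytes.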

Example: Setting Limits & Requests

apiVersion: v1
kind: Pod
metadata:
  name: resource-demo
spec:
  containers:
  - name: busybox
    image: busybox
    command: ["sh", "-c", "while true; do echo running; sleep 5; done"]
    resources:
      requests:
        memory: "64Mi"
        cpu: "250m"
      limits:
        memory: "128Mi"
        cpu: "500m"

Save the manifest as pod.yaml, then apply and inspect it:

kubectl apply -f pod.yaml
kubectl describe pod resource-demo

How It Works

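  • The kube-scheduler uses requests to decide where to place the Pod: it picks a node whose unreserved capacity covers the sum of the Pod's requests.
  • The kubelet hands limits to the container runtime, which enforces them through Linux cgroups:
      • CPU: the container is throttled (not killed) when it tries to use more than its limit
      • Memory: the container is killed (OOMKilled) if it exceeds its limit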
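To make the memory case concrete, here is roughly what the status section of kubectl get pod <name> -o yaml tends to look like after a container has been killed for exceeding its memory limit. This is an illustrative excerpt, not captured output: the container name and restart count are assumptions, and real output carries more fields. CPU throttling leaves no comparable trace in the Pod status.

```yaml
# Illustrative excerpt only; values are assumed for the example
status:
  containerStatuses:
  - name: busybox
    restartCount: 1
    lastState:
      terminated:
        reason: OOMKilled   # killed for exceeding the memory limit
        exitCode: 137       # 128 + 9, i.e. terminated by SIGKILL
```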

Check Usage with kubectl

kubectl top pod resource-demo
kubectl describe pod resource-demo

kubectl top needs metrics-server installed in the cluster; without it the command fails instead of showing usage figures.

Resource Requests & Limits in Deployments

resources:
  requests:
    cpu: "200m"
    memory: "256Mi"
  limits:
    cpu: "400m"
    memory: "512Mi"

Add this block under each container entry (spec.template.spec.containers) in a Deployment; every replica then gets the same requests and limits.
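For orientation, a minimal sketch of a complete Deployment with the block in place. The name web, the app: web label, the replica count, and the nginx image are placeholders chosen for this example, not values from a real workload.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web              # hypothetical name
spec:
  replicas: 2
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: web
        image: nginx     # placeholder image
        resources:       # applies to this container in every replica
          requests:
            cpu: "200m"
            memory: "256Mi"
          limits:
            cpu: "400m"
            memory: "512Mi"
```

Requests are reserved per replica, so with two replicas this sketch reserves 400m CPU and 512Mi memory across the cluster.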

What Happens If You Don't Set Them?

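  • The container can use as much CPU and memory as the node has available
  • A memory-hungry container can push the node into memory pressure, so neighboring containers risk being OOM-killed or evicted
  • The scheduler has no request to account for, so it may pack too many Pods onto one node
  • No control over fair distribution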

Best Practices

  • Always define requests and limits for production workloads
  • Set realistic requests based on observed usage: too low and nodes get overcommitted, too high and capacity sits idle
  • Use the Vertical Pod Autoscaler (VPA) to auto-tune values (advanced)
  • Monitor usage with Prometheus, Grafana, or kubectl top
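As a starting point for the VPA suggestion, a sketch of a recommendation-only VerticalPodAutoscaler. It assumes the VPA components and CRDs are already installed in the cluster (they are not part of a default Kubernetes install) and that a Deployment named web exists; both names are hypothetical.

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: web-vpa            # hypothetical name
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web              # hypothetical target Deployment
  updatePolicy:
    updateMode: "Off"      # only compute recommendations, never change Pods
```

With updateMode set to "Off", the recommended values can be read with kubectl describe vpa web-vpa and copied into the manifest by hand, which is a cautious way to try VPA before letting it update Pods itself.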

Hands-On Lab

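  1. Deploy a pod with a 64Mi memory limit and force it to consume more. dd keeps one block of size bs in memory, so the block size has to exceed the limit (bs=1M with count=512 would only ever hold about 1 MiB):

dd if=/dev/zero of=/dev/null bs=512M count=1

Watch the pod get OOMKilled.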
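One possible manifest for this step, as a sketch. The pod and container names are made up, and it assumes busybox's dd accepts the M suffix and reads into a single buffer of size bs, so a 512 MiB block is what pushes the container past its 64Mi limit.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: memory-demo        # hypothetical name
spec:
  restartPolicy: Never     # keep the terminated state visible instead of restarting
  containers:
  - name: hog
    image: busybox
    # One 512 MiB block is buffered in memory, far above the 64Mi limit
    command: ["dd", "if=/dev/zero", "of=/dev/null", "bs=512M", "count=1"]
    resources:
      requests:
        memory: "32Mi"
      limits:
        memory: "64Mi"
```

After applying it, kubectl get pod memory-demo should show the failure, and kubectl describe pod memory-demo gives the termination reason.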

  2. Try a CPU-throttled container. Run this in a container that has a CPU limit:

yes > /dev/null

→ CPU usage is capped at the limit, but the container keeps running.
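A matching sketch for the CPU exercise. The names and the 100m limit are arbitrary choices for illustration.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: cpu-demo           # hypothetical name
spec:
  containers:
  - name: spinner
    image: busybox
    # yes would happily use a full core; the limit below holds it back
    command: ["sh", "-c", "yes > /dev/null"]
    resources:
      requests:
        cpu: "50m"
      limits:
        cpu: "100m"
```

Once metrics-server has collected a sample, kubectl top pod cpu-demo should report usage close to the 100m limit rather than a full core, and the pod stays in the Running state.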

Namespace-Level Resource Quotas

You can enforce resource caps at the namespace level:

apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-quota
  namespace: dev
spec:
  hard:
    requests.cpu: "2"
    requests.memory: 1Gi
    limits.cpu: "4"
    limits.memory: 2Gi
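One consequence worth knowing: once a quota covers requests and limits for CPU and memory, Pods in that namespace that omit those values may be rejected at creation time. A LimitRange can supply defaults so that such Pods are still admitted. The sketch below assumes the same dev namespace; the object name and the default values are illustrative.

```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: team-defaults      # hypothetical name
  namespace: dev
spec:
  limits:
  - type: Container
    defaultRequest:        # used when a container sets no requests
      cpu: 100m
      memory: 128Mi
    default:               # used when a container sets no limits
      cpu: 200m
      memory: 256Mi
```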

Quick Recap:

  • Requests = minimum guaranteed resources
  • Limits = maximum allowed resources
  • Setting both helps Kubernetes schedule and throttle effectively
  • Critical for performance, stability, and fairness in multi-tenant clusters