OOM Killed: What It Means and How to Fix It
OOM killed means the Linux kernel's Out of Memory killer terminated your process because the system (or the process's cgroup) ran out of available memory. The kernel had two options — freeze the entire machine or kill something. It killed your process. If you're seeing exit code 137, an OOM kill is almost certainly the reason.
# How the Linux OOM killer works
Linux overcommits memory by default. When a process calls malloc(), the kernel hands back a virtual address range without actually reserving physical pages. It assumes most processes won't use everything they request. This works well until too many processes actually touch their allocated memory at the same time, and the kernel has no free pages left to back the promises it made.
At that point, the OOM killer activates. It scores every running process, picks the one with the highest score, and sends it SIGKILL (signal 9). No warning, no chance to clean up. The process dies instantly.
How processes get scored
Every process has an OOM score visible at /proc/<pid>/oom_score. The score is roughly proportional to the percentage of physical memory the process consumes — a process using 10% of RAM gets a score around 100, one using 50% gets around 500. The range is 0 to 1000.
The kernel also factors in whether the process is privileged (root processes score slightly lower) and how the administrator has tuned the score via oom_score_adj.
```bash
# Check a process's OOM score
cat /proc/$(pidof my-app)/oom_score

# Check its adjustment value
cat /proc/$(pidof my-app)/oom_score_adj
```

The oom_score_adj parameter ranges from -1000 to +1000. Setting it to -1000 makes the process effectively immune to the OOM killer. Setting it to +1000 makes it the first target. Only root can decrease the value.
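The two per-process files can be combined into a quick survey of which processes the kernel would target first. A sketch, assuming a Linux host with /proc mounted:

```bash
# Rank processes by OOM score, highest (first to be killed) on top
for p in /proc/[0-9]*; do
  printf "%5s %s\n" "$(cat "$p/oom_score" 2>/dev/null)" \
                    "$(cat "$p/comm" 2>/dev/null)"
done | sort -rn | head -5
```

Run it when memory pressure starts building and you can usually predict the kernel's next victim before it happens.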
```bash
# Protect a critical process (requires root)
echo -500 | sudo tee /proc/$(pidof postgres)/oom_score_adj

# Make a process more likely to be killed
echo 500 > /proc/$(pidof cache-warmer)/oom_score_adj
```

Memory overcommit modes
The kernel's overcommit behavior is controlled by vm.overcommit_memory:
- 0 (default) — Heuristic overcommit. The kernel guesses whether a memory allocation is reasonable and allows most requests, even if physical memory is not fully available.
- 1 — Always overcommit. Every malloc() succeeds regardless of available memory. The OOM killer becomes the only safety net.
- 2 — Never overcommit. The kernel refuses allocations that would push total commitments past swap plus a configurable percentage of physical RAM (vm.overcommit_ratio). Processes get clean allocation failures instead of OOM kills — but applications that rely on overcommit will break.
```bash
# Check current overcommit mode
cat /proc/sys/vm/overcommit_memory

# Switch to strict mode (no overcommit)
sudo sysctl vm.overcommit_memory=2
```

Most containerized workloads run with mode 0, and memory limits are enforced via cgroups rather than overcommit settings.
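Under mode 2, the ceiling the kernel enforces is CommitLimit, visible in /proc/meminfo next to the running total of promises already made (Linux only):

```bash
# CommitLimit = swap + RAM * vm.overcommit_ratio%; Committed_AS = total committed so far
grep -E "^(CommitLimit|Committed_AS)" /proc/meminfo
```

When Committed_AS approaches CommitLimit in mode 2, new allocations start failing cleanly instead of triggering the OOM killer.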
# OOM killed in Docker
Docker uses Linux cgroups to enforce memory limits. When you pass --memory to docker run, the kernel creates a cgroup with a hard memory ceiling. If the container's processes collectively exceed that ceiling, the kernel's OOM killer terminates the offending process inside the cgroup — not on the host.
```bash
# Run with a 512MB memory limit
docker run --memory=512m --memory-swap=512m my-app
```

When --memory-swap equals --memory, the container gets no swap space. If you omit --memory-swap, the container can use swap equal to its memory limit (so 512m memory + 512m swap = 1024m total). Setting --memory-swap=-1 gives unlimited swap, which defeats the purpose of memory limits in most cases.
Confirming a Docker OOM kill
```bash
# Did the OOM killer get this container?
docker inspect my-container --format='{{.State.OOMKilled}}'
# true

# Full state output
docker inspect my-container --format='{{json .State}}' | jq
```

The inspect output looks like this when OOM is the cause:
```json
{
  "Status": "exited",
  "Running": false,
  "OOMKilled": true,
  "ExitCode": 137
}
```

If OOMKilled is false but the exit code is still 137, something else sent SIGKILL — docker kill, a health check timeout, or host-level memory pressure killed the container from the outside.
Monitoring container memory
```bash
# Live memory usage for all containers
docker stats

# One-shot snapshot
docker stats --no-stream

# Host kernel logs showing OOM events
dmesg | grep -i "oom\|killed process"
```

The dmesg output for an OOM kill looks roughly like this:
```
[123456.789] my-app invoked oom-killer: gfp_mask=0xcc0, order=0
[123456.790] Memory cgroup out of memory: Killed process 4521 (node)
             total-vm:1548672kB, anon-rss:524288kB, file-rss:12288kB
```
That anon-rss value is the resident memory at the time of the kill. Compare it to your --memory limit.
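The kernel reports sizes in kB, so a quick division shows how close the process was to its ceiling. For the anon-rss above:

```bash
# 524288 kB of anonymous resident memory, converted to MB:
echo "$((524288 / 1024)) MB"   # 512 MB — exactly at a 512m --memory limit
```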
# OOM killed in Kubernetes
Kubernetes wraps Docker's (or containerd's) cgroup limits with its own resource model. You declare resources.requests and resources.limits in your pod spec, and the kubelet translates those into cgroup constraints.
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: my-app
spec:
  containers:
    - name: app
      image: my-app:latest
      resources:
        requests:
          memory: "256Mi"
        limits:
          memory: "512Mi"
```

When the container exceeds 512Mi, the cgroup OOM killer terminates it. Kubernetes detects the SIGKILL, marks the container as OOMKilled, and — depending on your restartPolicy — restarts it. This is where the CrashLoopBackOff spiral begins if the process immediately consumes the same amount of memory on restart.
Diagnosing a Kubernetes OOM kill
```bash
# See the OOMKilled reason and exit code
kubectl describe pod my-app
```

The relevant section in the output:
```
Last State:     Terminated
  Reason:       OOMKilled
  Exit Code:    137
  Started:      Wed, 05 Mar 2026 10:23:00 +0000
  Finished:     Wed, 05 Mar 2026 10:24:12 +0000
```
More useful commands:
```bash
# Logs from the previous container instance (before OOM)
kubectl logs my-app --previous

# Current memory usage per container
kubectl top pod my-app --containers

# Node-level memory pressure
kubectl describe node $(kubectl get pod my-app -o jsonpath='{.spec.nodeName}') | grep -A5 "Conditions"

# All OOM events in the cluster
kubectl get events --field-selector reason=OOMKilling --sort-by='.lastTimestamp'
```

QoS classes and eviction priority
Kubernetes assigns a Quality of Service class to each pod based on how you define resources. This determines the order in which pods get killed when the node itself is under memory pressure — distinct from a container hitting its own limit.
- Guaranteed — Requests equal limits for every container. Gets oom_score_adj of -997. Last to be evicted under node pressure.
- Burstable — Requests are set but lower than limits. Gets a calculated oom_score_adj between 2 and 999 based on the ratio of request to node capacity. Evicted after BestEffort.
- BestEffort — No requests or limits defined. Gets oom_score_adj of 1000. First to die when the node runs low.
```bash
# Check a pod's QoS class
kubectl get pod my-app -o jsonpath='{.status.qosClass}'
```

If your pods keep getting OOM killed and they are BestEffort, adding even minimal resource requests changes their eviction priority significantly. The jump from oom_score_adj=1000 to oom_score_adj=-997 is the difference between being the first and last process the kernel targets.
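For Burstable pods, the kubelet derives the adjustment from the ratio of memory request to node capacity, roughly 1000 - 1000 * request / nodeCapacity, clamped to [2, 999]. A sketch for a 256Mi request on an 8Gi node (illustrative values, integer arithmetic):

```bash
req_mib=256; node_mib=8192
adj=$((1000 - 1000 * req_mib / node_mib))
# Clamp to the valid Burstable range [2, 999]
[ "$adj" -lt 2 ] && adj=2
[ "$adj" -gt 999 ] && adj=999
echo "$adj"   # 969
```

The larger the request relative to the node, the lower (safer) the score, which is why even a small request is a meaningful improvement over BestEffort's flat 1000.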
# Finding which process was OOM killed
The kernel logs every OOM kill event. Where you find those logs depends on the system.
The kernel log entry names the killed process, its PID, and the memory stats at the time of death. It also dumps a table of all running processes and their memory consumption leading up to the kill — this table is your best diagnostic tool because it shows exactly what was consuming memory across the system.
For containers, the cgroup path in the log identifies which container was involved. In Kubernetes, the cgroup path includes the pod UID, making it traceable back to a specific pod.
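A hypothetical cgroup path from such a log line can be picked apart with shell parameter expansion. The path, pod UID, and container ID below are made up for illustration:

```bash
# Hypothetical cgroup path as it might appear in a kernel OOM log
path="/kubepods/burstable/pod1a2b3c4d-5e6f-7081-92a3-b4c5d6e7f809/3f2a0c1d9e8b"

uid=${path#*/pod}   # strip everything up to and including "/pod"
uid=${uid%%/*}      # strip the container ID after the UID
echo "$uid"
```

The extracted UID can then be matched against `kubectl get pods -A -o jsonpath='{range .items[*]}{.metadata.uid}{"\t"}{.metadata.name}{"\n"}{end}'` to name the pod.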
# How to fix and prevent OOM kills
Rule out memory leaks first
Raising limits is the obvious fix, but it just delays the crash if the application leaks memory. Profile first, then set limits.
Node.js — V8's heap grows until the OS kills the process unless you cap it. The --max-old-space-size flag sets a hard ceiling on the V8 heap. Without it, a Node.js process in a container with 512MB will happily try to allocate 1.5GB.
```bash
node --max-old-space-size=384 app.js
```

Python — tracemalloc tracks allocations back to source lines. Common leaks: global lists that accumulate records, Django querysets that cache entire result sets in memory, and C extensions allocating outside Python's allocator.
```python
import tracemalloc

tracemalloc.start()

# ... run your workload ...

snapshot = tracemalloc.take_snapshot()
for stat in snapshot.statistics("lineno")[:10]:
    print(stat)
```

Java — The JVM needs memory well beyond the heap. Metaspace, thread stacks, JIT compiler buffers, and GC overhead can consume 25-30% of total memory. Never set -Xmx equal to the container memory limit.
```bash
# Let the JVM calculate heap based on the cgroup limit
java -XX:MaxRAMPercentage=75.0 -jar app.jar
```

Go — Since Go 1.19, GOMEMLIMIT tells the runtime to GC aggressively before hitting a hard limit. Set it to 80-90% of the container memory limit and the garbage collector will work harder to stay within bounds.
```bash
GOMEMLIMIT=400MiB ./my-app
```

See the Node.js deploy guide, Python deploy guide, and Go deploy guide for full configuration details.
Set container memory limits
Running containers without memory limits is asking for trouble. A single process can consume all available memory on a node, affecting every other workload.
Docker:
```bash
docker run -d \
  --memory=512m \
  --memory-swap=512m \
  --name my-app \
  my-app:latest
```

Kubernetes:
```yaml
resources:
  requests:
    memory: "256Mi"   # What the app normally uses (scheduler guarantee)
  limits:
    memory: "512Mi"   # Hard ceiling — set to 1.5-2x the request
```

Set requests to your application's steady-state memory consumption. Set limits to accommodate spikes with some headroom. If requests and limits are equal for every resource, the pod gets Guaranteed QoS — lowest eviction priority.
Right-size your limits with data
Guessing memory limits leads to either wasted resources or OOM kills. Measure actual usage under load, then add headroom.
```bash
# Docker — watch peak memory over time
docker stats --format "table {{.Name}}\t{{.MemUsage}}\t{{.MemPerc}}"

# Kubernetes — current usage (requires metrics-server)
kubectl top pod --containers

# Prometheus query for peak memory over 24h
max_over_time(container_memory_usage_bytes{pod="my-app"}[24h])
```

Run your application under realistic load (not just startup), observe the peak, and set limits to 1.5x that peak. For JVM applications, account for non-heap memory by adding 30% above max heap.
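The 1.5x rule is simple integer arithmetic. For a measured peak of 340MiB (an illustrative number):

```bash
peak_mib=340
limit_mib=$((peak_mib * 3 / 2))   # 1.5x headroom over observed peak
echo "memory limit: ${limit_mib}Mi"   # memory limit: 510Mi
```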
Configure application-level memory limits
The container cgroup limit is the last line of defense. Application-level limits give the runtime a chance to GC, shed load, or fail gracefully before the kernel steps in.
| Runtime | Flag | Effect |
|---|---|---|
| Node.js | --max-old-space-size=384 | Caps V8 heap at 384MB |
| Java | -XX:MaxRAMPercentage=75.0 | Sets heap to 75% of cgroup limit |
| Go | GOMEMLIMIT=400MiB | Triggers aggressive GC at 400MB |
| Python | No built-in limit | Use monitoring + process managers |
| .NET | System.GC.HeapHardLimit | Hard heap ceiling |
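For .NET, the setting in the table lives in the app's runtimeconfig.json. A sketch for a 400MB hard heap ceiling (the byte value is 400 * 1024 * 1024):

```json
{
  "runtimeOptions": {
    "configProperties": {
      "System.GC.HeapHardLimit": 419430400
    }
  }
}
```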
Use multi-stage Docker builds
Dev dependencies, build tools, and package managers left in the final image don't directly cause OOM, but they add memory overhead at runtime if they load shared libraries or background processes. Multi-stage builds keep the production image lean. Our Docker deploy guide covers this in detail.
# OOM killed vs exit code 137
They are the same event seen from different perspectives. The OOM killer is the cause; exit code 137 is the symptom.
When the kernel's OOM killer sends SIGKILL to a process, the process exits with code 137 (calculated as 128 + signal 9). Docker reports OOMKilled: true in the container state. Kubernetes sets the termination reason to OOMKilled. The dmesg log shows Killed process <pid>.
Not every exit code 137 is an OOM kill, though. Any SIGKILL — manual kill -9, docker kill, a forced pod deletion — also produces exit code 137. The distinction matters for debugging. Check docker inspect for the OOMKilled flag, or kubectl describe pod for the OOMKilled reason, to confirm memory was the cause. For a deep dive on all the sources of exit code 137, see the companion article on exit code 137.
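The 128 + 9 arithmetic is easy to reproduce in a shell: kill a background process with signal 9 and read its exit status (any long-running command works; sleep is used here):

```bash
sleep 30 &
pid=$!
kill -9 "$pid"
# Capture the status via || so the non-zero exit doesn't abort under set -e
wait "$pid" || status=$?
echo "exit code: $status"   # exit code: 137 (128 + 9)
```

This is exactly the status Docker and Kubernetes observe; they just can't tell from the number alone whether the SIGKILL came from the OOM killer or somewhere else.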