Exit Code 137: What It Means and How to Fix It




Exit code 137 means your process received SIGKILL (signal 9) and was terminated immediately. The exit code is calculated as 128 + 9 = 137. In nearly every case, the Linux kernel's OOM (Out of Memory) killer is responsible. Your process used more memory than it was allowed, and the kernel killed it to protect the system.

#What exit code 137 actually means

Unix processes that terminate due to a signal produce an exit code of 128 plus the signal number. SIGKILL is signal 9 — unblockable, uncatchable. Unlike SIGTERM (signal 15, exit code 143), SIGKILL gives the process no opportunity to clean up, flush buffers, or shut down gracefully. The kernel ends it immediately.

# Kill a process with SIGKILL
kill -9 <pid>
 
# The process exits with code 137 (128 + 9)
echo $?
# 137

When you see exit code 137 in Docker, Kubernetes, or a CI/CD pipeline, the cause is almost always the same: the process exceeded its memory limit, and the Linux OOM killer sent SIGKILL.

The OOM killer is a kernel mechanism that activates when the system (or a cgroup) runs out of available memory. It scores each process based on memory consumption and priority, then kills the highest-scoring process. In containerized environments, the container's memory limit defines the boundary — exceed it, and the OOM killer targets your process.
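You can watch this scoring from the outside on any Linux host by reading procfs — a quick sketch, assuming `/proc` is mounted:

```shell
# Each process has an OOM score; the highest-scoring process is killed first
cat /proc/self/oom_score

# oom_score_adj (-1000 to 1000) biases the score; -1000 exempts a process entirely
cat /proc/self/oom_score_adj
```

Container runtimes and the kubelet set `oom_score_adj` on your behalf — that is the mechanism behind the Kubernetes QoS eviction ordering discussed later.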

#Common causes

Docker memory limits exceeded

When you run a container with --memory, Docker sets a hard cgroup limit. If the process inside exceeds that limit, the kernel kills it.

# Run with a 256MB memory limit
docker run --memory=256m my-app
 
# If the process uses more than 256MB → exit code 137

You can confirm this by inspecting the container:

docker inspect <container_id> --format='{{.State.OOMKilled}}'
# true

The full inspect output shows the kill:

{
  "State": {
    "Status": "exited",
    "Running": false,
    "OOMKilled": true,
    "ExitCode": 137
  }
}

If OOMKilled is false but the exit code is still 137, the process was killed by something else sending SIGKILL — possibly docker kill, a health check timeout, or host-level memory pressure.

Kubernetes OOM killed

Kubernetes enforces memory limits through cgroups. When a container exceeds its resources.limits.memory, the kernel's OOM killer terminates the process and the kubelet reports the container as OOMKilled.

apiVersion: v1
kind: Pod
metadata:
  name: my-app
spec:
  containers:
    - name: app
      image: my-app:latest
      resources:
        requests:
          memory: "128Mi"
        limits:
          memory: "256Mi"

When the container exceeds 256Mi, kubectl describe pod shows:

Last State:     Terminated
  Reason:       OOMKilled
  Exit Code:    137

Kubernetes also distinguishes between container-level OOM (container hit its own limit) and node-level OOM (node ran out of memory, eviction triggered). Both produce exit code 137, but the events differ:

# Check pod events
kubectl describe pod my-app | grep -A5 "Events"
 
# Check node-level OOM events
kubectl get events --field-selector reason=OOMKilling

CI/CD runner out of memory

GitHub Actions runners provide about 7 GB of RAM (per GitHub's documentation for ubuntu-latest). GitLab shared runners vary. If your build, test suite, or bundler exceeds the runner's memory limit, the kernel kills the process.

Common culprits:

  • npm run build on a large Next.js project (webpack/turbopack can spike to 4+ GB)
  • Jest running tests in parallel with --maxWorkers set too high
  • Java builds without -Xmx tuned for the CI environment
  • Docker-in-Docker builds with no memory constraints

There is no OOMKilled flag to inspect in CI. Check the kernel logs if you have access, or reduce memory consumption and see if the error disappears.
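One common mitigation is capping parallelism and heap in the CI step itself. A hypothetical GitHub Actions step — the heap size and worker count here are illustrative placeholders, not recommendations:

```yaml
# Hypothetical CI step: keep Node's heap and Jest's parallelism
# below the runner's available memory
- name: Run tests
  env:
    NODE_OPTIONS: --max-old-space-size=4096   # cap V8 heap at ~4 GB
  run: npx jest --maxWorkers=2                # limit parallel test workers
```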

Manual kill (less common)

Exit code 137 can also result from someone or something explicitly sending SIGKILL:

# Manual kill
kill -9 <pid>
 
# Docker kill (sends SIGKILL by default)
docker kill <container_id>
 
# Kubernetes — deleting a pod that doesn't terminate within grace period
kubectl delete pod my-app --grace-period=0 --force

If you see exit code 137 but memory usage was well within limits, check whether another process, an orchestrator, or a health check timeout sent the kill signal.

#How to diagnose the cause

Start with the environment your process runs in and work outward.

Docker

# Check if OOM killed the container
docker inspect <container_id> --format='{{.State.OOMKilled}}'
 
# View memory stats at time of death
docker stats --no-stream <container_id>
 
# Check host kernel logs for OOM events
dmesg | grep -i "oom\|killed process"
 
# Or on systemd-based hosts
journalctl -k | grep -i "oom\|killed process"

Kubernetes

# Pod status and OOM events
kubectl describe pod <pod-name>
 
# Previous container logs (before it was killed)
kubectl logs <pod-name> --previous
 
# Node-level memory pressure
kubectl describe node <node-name> | grep -A5 "Conditions"
 
# Cluster-wide OOM events
kubectl get events --field-selector reason=OOMKilling --sort-by='.lastTimestamp'

Memory profiling by language

If you know the OOM killer is responsible, the next step is understanding why your process uses so much memory.
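For Python, the standard library's tracemalloc can rank allocations by source line. A minimal sketch — the growing list stands in for a hypothetical leak:

```python
import tracemalloc

tracemalloc.start()

# Simulated leak: an unbounded list accumulating distinct strings
leak = []
chunk = 100
for i in range(10_000):
    leak.append("x" * chunk + str(i))  # roughly 1 MB total

# Snapshot current allocations and rank them by source line
snapshot = tracemalloc.take_snapshot()
for stat in snapshot.statistics("lineno")[:3]:
    print(stat)  # file:line, total size, allocation count
```

The top entries point at the lines allocating the most memory. Node.js (heap snapshots) and Go (pprof) offer equivalent views.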

#How to fix it

Set appropriate memory limits

Before raising limits, rule out a memory leak. Bumping a limit just delays the next crash if the process is leaking.
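One way to rule a leak in or out: sample the process's resident set size over time, assuming a Linux host with procfs. The PID and interval are placeholders:

```shell
# Print <time> <VmRSS in kB> every 5 seconds until the process exits
pid=12345   # placeholder: the PID you are watching
while kill -0 "$pid" 2>/dev/null; do
  rss=$(awk '/VmRSS/ {print $2}' "/proc/$pid/status")
  echo "$(date +%T) ${rss} kB"
  sleep 5
done
```

RSS that climbs steadily and never plateaus suggests a leak; a plateau just above the limit suggests the limit is simply too low.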

Docker:

# Run with memory limit and swap disabled
docker run --memory=512m --memory-swap=512m my-app
 
# Monitor actual usage
docker stats my-app

Kubernetes:

resources:
  requests:
    memory: "256Mi"   # Scheduler guarantee — what your app normally uses
  limits:
    memory: "512Mi"   # Hard ceiling — peak usage with headroom

Set requests to your application's steady-state usage. Set limits to 1.5–2x the request to accommodate spikes. If requests and limits are equal, the pod gets Guaranteed QoS and is evicted last under node memory pressure.

Understand Kubernetes QoS classes

Kubernetes assigns a QoS class to each pod based on its resource configuration. This determines eviction order when the node runs low on memory:

  • Guaranteed — requests equal limits for all containers. Last to be evicted. Lowest oom_score_adj (-997).
  • Burstable — requests set but lower than limits. Evicted after BestEffort pods.
  • BestEffort — no requests or limits set. First to be evicted. Highest oom_score_adj (1000).

If your pods keep getting OOM killed, check whether they are BestEffort. Setting even modest resource requests changes the eviction priority.

# Check QoS class
kubectl get pod my-app -o jsonpath='{.status.qosClass}'

Fix memory leaks

Raising limits is a stopgap if the application has a genuine memory leak. Common patterns:

Node.js — unbounded caches, event listener accumulation, closures holding references to large objects. Use --max-old-space-size to set a hard V8 heap limit:

node --max-old-space-size=384 app.js

Python — global lists that accumulate data, circular references not caught by the GC, C extensions that allocate outside Python's allocator. Use tracemalloc to find the source.

Go — goroutine leaks (goroutines that never exit), sync.Pool misuse, []byte buffers that grow but never shrink. Use pprof to identify allocations.

Java — set -XX:MaxRAMPercentage=75.0 instead of a fixed -Xmx. This adapts to the container's cgroup limit. Never set -Xmx equal to the container memory limit — the JVM uses 20-30% of memory for non-heap regions.

Strip unnecessary runtime dependencies

Larger images do not directly cause OOM, but dev dependencies and build tools left in the final image can consume memory at runtime. Use multi-stage builds to keep only what the application needs. See our Docker deploy guide for examples.
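A multi-stage sketch for a hypothetical Node.js service with a `dist/` build output — the stage layout is the point, not the specific commands:

```dockerfile
# Build stage: needs compilers and devDependencies
FROM node:20 AS build
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build

# Runtime stage: ships only production artifacts
FROM node:20-slim
WORKDIR /app
COPY --from=build /app/package*.json ./
RUN npm ci --omit=dev
COPY --from=build /app/dist ./dist
CMD ["node", "dist/index.js"]
```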

#Prevention strategies

  • Set memory limits on every container. Never run without limits. An unbounded process can take down the entire node.
  • Monitor memory over time. A slow leak that grows 1 MB/hour will not show up in a quick smoke test. Use docker stats, Prometheus with container_memory_usage_bytes, or your platform's built-in metrics.
  • Load test before production. Run realistic traffic against your application with memory limits set. If it OOMs during load testing, it will OOM in production.
  • Set application-level limits. Node.js --max-old-space-size, Java -XX:MaxRAMPercentage, Go GOMEMLIMIT (Go 1.19+). These give the runtime a chance to GC aggressively before hitting the hard cgroup limit.
  • Use requests and limits correctly in Kubernetes. Set requests to steady-state usage and limits to peak usage plus 20% headroom. Equal requests and limits give you Guaranteed QoS.
  • Watch for CI-specific issues. Parallelize less aggressively in CI. Set --maxWorkers=2 for Jest, limit webpack parallelism with JOBS=2, and set JVM heap limits below the runner's available memory.

#Managed platforms and OOM

On managed container platforms — AZIN, GKE Autopilot, Google Cloud Run, AWS ECS — node-level memory is abstracted away. GKE Autopilot, for example, right-sizes nodes to match pod requests, so you never manually provision capacity. You still need correct resources.requests and resources.limits in your spec, and the same kubectl describe pod / OOMKilled diagnostics apply. The difference is that node-level OOM (the cluster itself starving) is the platform's problem, not yours.

For language-specific deployment guides, see Deploy Node.js or Deploy Python.
