Critical severity · Intermediate · Fix: 15-30 min

RuntimeError

docker.errors.ContainerError

What this error means

The container running the model inference process exceeded the memory available to it and was killed by the kernel (exit code 137, OOMKilled), so inference failed.

Stack trace

traceback

docker.errors.ContainerError: 137 "OOMKilled" - The container was killed because it ran out of memory during model inference.

QUICK FIX

Raise the container's memory limit with the --memory flag (mem_limit in the Docker SDK for Python) so that inference has enough headroom to avoid OOM kills.

Why it happens

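By default a container has no memory limit and can use as much memory as the host (or the Docker Desktop VM) provides; a cap applies only when it is set explicitly with --memory / mem_limit or imposed by an orchestrator. When the inference process needs more memory than is available to it, the Linux OOM killer sends SIGKILL to the process. The container then exits with code 137 (128 + 9, the SIGKILL signal number) and Docker marks it as OOMKilled.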

Detection

Monitor container logs and Docker events for exit code 137 or 'OOMKilled' status; set up alerts on container restarts or memory usage spikes during inference.
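The check described above can be sketched as follows, assuming the Docker SDK for Python: after a container exits, container.attrs["State"] mirrors the State object from docker inspect, which carries the ExitCode and OOMKilled fields. The helper name classify_exit is hypothetical. Exit code 137 on its own only says the process received SIGKILL, which docker kill also produces, so this sketch treats the OOMKilled flag as the stronger signal.

```python
# 128 + signal number is the conventional exit code for a process killed by a
# signal; SIGKILL is signal 9 on Linux, hence 137.
SIGKILL_EXIT_CODE = 128 + 9


def classify_exit(state: dict) -> str:
    """Classify a container's final state from its inspect `State` mapping.

    `state` is expected to look like container.attrs["State"] in docker-py,
    e.g. {"ExitCode": 137, "OOMKilled": True, ...}.
    """
    if state.get("OOMKilled"):
        return "oom-killed"
    exit_code = state.get("ExitCode")
    if exit_code == SIGKILL_EXIT_CODE:
        # SIGKILL without the OOM flag: could also be `docker kill` or another
        # external SIGKILL, so it needs further investigation.
        return "sigkill"
    if exit_code == 0:
        return "ok"
    return "failed"


# Usage with docker-py (requires a running Docker daemon):
#   container.reload()  # refresh the cached attributes
#   print(classify_exit(container.attrs["State"]))
```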

Causes & fixes

Insufficient memory allocated to the Docker container for the model inference workload

✓ Fix

Increase the container memory limit using Docker's --memory flag or equivalent resource constraints in orchestration platforms.

Model or batch size too large for available container memory

✓ Fix

Reduce the model size, use model quantization, or decrease batch size to fit within memory limits.
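One way to pick a batch size that fits is to work backwards from the memory limit. The sketch below is a rough estimate under stated assumptions: model_bytes and per_sample_bytes are hypothetical figures you would measure by profiling your own model, and the 20% headroom is an arbitrary safety margin. The helper parse_cgroup_limit (also a hypothetical name) interprets the contents of the cgroup v2 file /sys/fs/cgroup/memory.max, which holds either a byte count or the literal "max" when no limit is set.

```python
from typing import Optional


def parse_cgroup_limit(text: str) -> Optional[int]:
    """Parse the contents of /sys/fs/cgroup/memory.max (cgroup v2).

    Returns the limit in bytes, or None when the file says "max" (no limit).
    """
    value = text.strip()
    return None if value == "max" else int(value)


def max_batch_size(limit_bytes: int, model_bytes: int,
                   per_sample_bytes: int, headroom: float = 0.2) -> int:
    """Estimate the largest batch that fits under the container limit.

    Reserves `headroom` (a fraction of the limit) for the runtime and
    temporary buffers, subtracts the resident model size, and divides the
    remainder by the measured per-sample cost. Returns 0 if nothing fits.
    """
    usable = int(limit_bytes * (1 - headroom)) - model_bytes
    return max(usable // per_sample_bytes, 0)


# Usage inside a container on a cgroup v2 host:
#   with open("/sys/fs/cgroup/memory.max") as fh:
#       limit = parse_cgroup_limit(fh.read())
#   if limit is not None:
#       batch = max_batch_size(limit, model_bytes=2 * 1024**3,
#                              per_sample_bytes=64 * 1024**2)
```

The estimate is only as good as the measured inputs, so it is best treated as a starting point to validate under load.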

Memory leaks or inefficient memory usage in the inference code inside the container

✓ Fix

Profile and optimize the inference code to release unused memory and avoid leaks.
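A minimal profiling sketch, assuming the inference code is Python: the standard-library tracemalloc module can compare two snapshots and list the source lines whose allocations grew between them. The names top_allocation_growth and leaky_inference are hypothetical; leaky_inference is a deliberately leaky stand-in for a real inference call. Note that tracemalloc only sees memory allocated through Python's allocators, so growth inside native libraries may not show up.

```python
import tracemalloc

_leaked = []  # simulates a cache that is never cleared


def leaky_inference():
    """Stand-in for an inference call that retains 100 kB per invocation."""
    _leaked.append(bytearray(100_000))


def top_allocation_growth(workload, iterations=3, limit=5):
    """Run `workload` repeatedly and return the lines with the most growth.

    One warm-up call runs before the first snapshot so that one-time
    allocations (imports, model loading) are not reported as leaks.
    """
    tracemalloc.start()
    try:
        workload()  # warm-up
        before = tracemalloc.take_snapshot()
        for _ in range(iterations):
            workload()
        after = tracemalloc.take_snapshot()
    finally:
        tracemalloc.stop()
    # StatisticDiff entries, sorted by absolute size difference (largest first)
    return after.compare_to(before, "lineno")[:limit]


# Usage:
#   for stat in top_allocation_growth(leaky_inference):
#       print(stat)  # shows file:line, size difference and count difference
```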

Code: broken vs fixed

Broken - triggers the error

python

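import docker

client = docker.from_env()
# Foreground run (detach=False, the default): run() blocks until the container
# exits and raises ContainerError on a non-zero exit status such as 137.
output = client.containers.run(
    "my-ml-model:latest",
    command="python inference.py",
    mem_limit="512m",  # Too small for the model: the kernel OOM-kills inference
)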

Fixed - works correctly

python

import docker

client = docker.from_env()
container = client.containers.run(
    "my-ml-model:latest",
    detach=True,
    mem_limit="4g",  # 4 GB cap; size this above the profiled peak usage
    command="python inference.py",
)
# Detached runs never raise ContainerError; check container.wait()["StatusCode"] later.
print(f"Container started with ID: {container.id}")

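Setting mem_limit to a value above the model's peak usage gives inference enough room, so the kernel no longer OOM-kills the process. Note that mem_limit is a cap, not a reservation: the host must actually have that much memory available.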

⚠ Workaround

For foreground runs (detach=False), catch docker.errors.ContainerError, check that its exit_status is 137, then rerun the container with a higher memory limit or a smaller batch size.
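The retry idea can be sketched as below. This is an illustration under assumptions, not a definitive implementation: run_with_escalating_memory and run_inference are hypothetical names, the limit ladder ("2g", "4g", "8g") is arbitrary, and the retry logic takes the runner as a parameter so it can be exercised without a Docker daemon. It relies on docker.errors.ContainerError exposing an exit_status attribute, and it treats 137 as a likely OOM kill even though any SIGKILL produces the same code.

```python
OOM_EXIT_STATUS = 137  # 128 + SIGKILL (9)


def run_with_escalating_memory(run_once, mem_limits=("2g", "4g", "8g")):
    """Call run_once(mem_limit) with increasing limits until it succeeds.

    run_once must raise an exception carrying an `exit_status` attribute on
    failure, as docker.errors.ContainerError does for foreground runs.
    """
    last_error = None
    for mem_limit in mem_limits:
        try:
            return run_once(mem_limit)
        except Exception as err:
            if getattr(err, "exit_status", None) != OOM_EXIT_STATUS:
                raise  # not a SIGKILL exit: do not mask unrelated failures
            last_error = err
    if last_error is None:
        raise ValueError("mem_limits must not be empty")
    raise last_error  # every limit was tried and each run was killed


def run_inference(mem_limit):
    """Run the image from this page in the foreground with the given limit."""
    import docker  # lazy import: only needed when actually talking to Docker

    client = docker.from_env()
    return client.containers.run(
        "my-ml-model:latest",
        command="python inference.py",
        mem_limit=mem_limit,
        remove=True,  # clean up the exited container after each attempt
    )


# Usage (requires a running Docker daemon and the image above):
#   logs = run_with_escalating_memory(run_inference)
```

Capping the ladder matters: without an upper bound, a genuine leak would keep consuming more host memory on every retry.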

✓ Prevention

Set explicit memory limits based on profiling inference memory usage; use orchestration tools to monitor and auto-scale resources; optimize model and batch sizes for container constraints.
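To base a limit on measurements, one option is to sample the container's memory statistics while it serves realistic traffic. The sketch below assumes the dict returned by container.stats(stream=False) in the Docker SDK for Python, which contains a memory_stats section with usage and limit in bytes. The helper names and the 30% default headroom are hypothetical choices. The usage figure includes page cache, so it should be read as an upper bound on what the process itself needs.

```python
def memory_utilization(stats: dict) -> float:
    """Return memory usage as a fraction of the limit for one stats sample.

    `stats` is expected to look like container.stats(stream=False) in
    docker-py: {"memory_stats": {"usage": <bytes>, "limit": <bytes>, ...}}.
    """
    mem = stats["memory_stats"]
    return mem["usage"] / mem["limit"]


def recommend_limit(peak_usage_bytes: int, headroom: float = 0.3) -> int:
    """Suggest a mem_limit in bytes: the observed peak plus a safety margin."""
    return int(peak_usage_bytes * (1 + headroom))


# Usage with docker-py (requires a running container):
#   sample = container.stats(stream=False)
#   if memory_utilization(sample) > 0.9:
#       print("container is close to its memory limit")
#   # mem_limit accepts an integer number of bytes as well as strings like "4g"
#   new_limit = recommend_limit(sample["memory_stats"]["usage"])
```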

Python 3.9+ · docker >=5.0.0 · tested on 6.0.0

Verified 2026-04
