High severity · Beginner · Fix: 2-5 min

ConnectionError

requests.exceptions.ConnectionError

What this error means

The Python client could not open a connection to the llama.cpp server endpoint, so requests raised a ConnectionError before any HTTP request was sent.

Stack trace

traceback

Traceback (most recent call last):
  File "app.py", line 42, in <module>
    response = client.generate(prompt)
  File "/usr/local/lib/python3.9/site-packages/llamacpp/client.py", line 88, in generate
    resp = requests.post(self.endpoint, json=payload, timeout=10)
  File "/usr/local/lib/python3.9/site-packages/requests/api.py", line 119, in post
    return request('post', url, data=data, json=json, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/requests/api.py", line 61, in request
    return session.request(method=method, url=url, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/requests/sessions.py", line 530, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/local/lib/python3.9/site-packages/requests/sessions.py", line 643, in send
    raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPConnectionPool(host='localhost', port=5000): Max retries exceeded with url: /generate (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f8c2a4d2d60>: Failed to establish a new connection: [Errno 111] Connection refused'))

Quick fix

Verify the llama.cpp server is running and the client endpoint URL matches the server address and port.

Why it happens

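This error occurs because the Python client cannot establish a TCP connection to the llama.cpp server endpoint. In the stack trace above, [Errno 111] Connection refused means the connection attempt to localhost port 5000 was actively rejected, typically because no process is listening there. Common reasons include the server not running, an incorrect endpoint host or port, a firewall blocking the connection, or other network issues.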

Detection

Catch connection exceptions raised by the client library and log each failure with the endpoint host, port, and path, so that an unreachable server shows up in your logs before it crashes the app.
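
As a minimal sketch of that logging, the wrapper below catches the connection failure, records the endpoint details, and re-raises. The function name post_logged and the logger name llama_client are hypothetical, chosen only for illustration; the endpoint URL is the one from the stack trace.

```python
import logging
from urllib.parse import urlsplit

import requests

logger = logging.getLogger("llama_client")  # hypothetical logger name


def post_logged(endpoint, payload, timeout=10):
    """POST JSON to endpoint; on connection failure, log host/port/path and re-raise."""
    try:
        return requests.post(endpoint, json=payload, timeout=timeout)
    except requests.exceptions.ConnectionError as exc:
        parts = urlsplit(endpoint)
        logger.error(
            "connection failed host=%s port=%s path=%s error=%s",
            parts.hostname, parts.port, parts.path, exc,
        )
        raise  # let the caller decide whether to retry or abort
```

Calling post_logged("http://localhost:5000/generate", {"prompt": "Hello"}) behaves like requests.post, except that a refused connection leaves one error line in the log before the exception propagates.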

Causes & fixes

llama.cpp server process is not running or crashed

✓ Fix

Start or restart the llama.cpp server process and verify it is listening on the expected port.
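
One way to verify that something is listening on the expected port is a plain TCP connection attempt. This is a sketch using only the standard library; the helper name is_port_open is hypothetical, and the host and port are taken from the stack trace above, so adjust them to your setup.

```python
import socket


def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers ConnectionRefusedError (nothing listening), timeouts,
        # and name-resolution failures.
        return False


if __name__ == "__main__":
    state = "listening" if is_port_open("localhost", 5000) else "not reachable"
    print(f"localhost:5000 is {state}")
```

A True result only shows that some process accepted the connection; it does not confirm that the process is the llama.cpp server or that it is healthy.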

Incorrect server endpoint URL or port configured in the client

✓ Fix

Check and correct the endpoint URL and port in your client configuration to match the running server.

Firewall or network rules blocking connection to the server port

✓ Fix

Ensure firewall rules allow traffic on the server port and that no network policies block localhost or remote connections.

Server is overloaded or temporarily unreachable

✓ Fix

Implement retry logic with exponential backoff in the client and monitor server health to handle transient unavailability.
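
Retry with backoff can be delegated to the transport layer instead of hand-written loops. The sketch below mounts urllib3's Retry on a requests Session; the helper name make_session and the retry counts are assumptions for illustration, not values recommended by the server.

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry


def make_session(connect_retries=3, backoff_factor=0.5):
    """Build a Session that retries failed connection attempts with growing delays."""
    retry = Retry(
        total=connect_retries,
        connect=connect_retries,        # retries for failures while connecting
        backoff_factor=backoff_factor,  # delays grow roughly exponentially
    )
    adapter = HTTPAdapter(max_retries=retry)
    session = requests.Session()
    session.mount("http://", adapter)
    session.mount("https://", adapter)
    return session
```

Use it as session = make_session() followed by session.post(endpoint, json=payload, timeout=10). With these settings the retries apply to failures while connecting. A POST that was already sent is not re-sent by default, which avoids triggering the same generation twice. Once the retries are exhausted, the same ConnectionError is raised.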

Code: broken vs fixed

Broken - triggers the error

python

import requests

endpoint = "http://localhost:5001/generate"  # Wrong port: nothing is listening on 5001
payload = {"prompt": "Hello"}
response = requests.post(endpoint, json=payload)  # Raises ConnectionError (connection refused)
print(response.json())

Fixed - works correctly

python

import requests

endpoint = "http://localhost:5000/generate"  # Host and port where the server is listening
payload = {"prompt": "Hello"}
response = requests.post(endpoint, json=payload, timeout=10)  # Connects; same timeout as in the stack trace
response.raise_for_status()  # Surface HTTP errors such as 404 for a wrong path
print(response.json())  # Prints the server response

Corrected the endpoint so it points at the host and port where the server is listening. Note that a wrong path on a reachable server typically returns an HTTP error such as 404 rather than raising ConnectionError, so check the host and port first.

Workaround

Wrap the request call in try/except ConnectionError, log the failure, and retry after a short delay to handle temporary server downtime.
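
A minimal sketch of this workaround follows, assuming a fixed delay between attempts. The function name post_with_retry and its default values are hypothetical.

```python
import logging
import time

import requests

logger = logging.getLogger("llama_client")  # hypothetical logger name


def post_with_retry(endpoint, payload, attempts=3, delay=2.0, timeout=10):
    """POST JSON, retrying on ConnectionError with a fixed delay between attempts."""
    for attempt in range(1, attempts + 1):
        try:
            return requests.post(endpoint, json=payload, timeout=timeout)
        except requests.exceptions.ConnectionError as exc:
            logger.warning(
                "attempt %d/%d to %s failed: %s", attempt, attempts, endpoint, exc
            )
            if attempt == attempts:
                raise  # out of attempts: surface the original error
            time.sleep(delay)  # short pause before the next attempt
```

This only hides brief outages, such as a server restart. If the endpoint host or port is wrong, every attempt fails and the final ConnectionError is still raised.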

Prevention

Implement health checks and monitoring for the llama.cpp server process and validate client endpoint configuration during deployment to avoid connection failures.
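
Endpoint validation at deployment time can be as simple as parsing the configured URL and failing fast when it is malformed. The sketch below uses only the standard library; the helper name parse_endpoint is hypothetical, and the default ports are the usual ones for http and https.

```python
from urllib.parse import urlsplit


def parse_endpoint(endpoint):
    """Return (host, port) for an http(s) endpoint URL, or raise ValueError."""
    parts = urlsplit(endpoint)
    if parts.scheme not in ("http", "https"):
        raise ValueError(f"endpoint must start with http:// or https://: {endpoint!r}")
    if not parts.hostname:
        raise ValueError(f"endpoint has no host: {endpoint!r}")
    # Accessing .port raises ValueError itself if the port is not a valid number.
    port = parts.port
    if port is None:
        port = 443 if parts.scheme == "https" else 80
    return parts.hostname, port
```

Running this once at startup turns a typo such as a missing scheme into an immediate configuration error, and the returned host and port can then be passed to whatever health check you use for the server.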

Python 3.9+ · requests >=2.0.0 · tested on 2.31.0

Verified 2026-04