How to serve a model API on RunPod
Quick answer
Use the runpod Python package to call a model API served on RunPod: set the RUNPOD_API_KEY environment variable, create an Endpoint instance with your endpoint ID, and call run_sync with your input payload. The SDK supports both synchronous and asynchronous inference calls against your deployed model.
Prerequisites
- Python 3.8+
- RunPod API key (set the RUNPOD_API_KEY environment variable)
- pip install runpod
Setup
Install the runpod Python package and set your API key as an environment variable for authentication.
pip install runpod
Output:
Collecting runpod
  Downloading runpod-1.0.0-py3-none-any.whl (10 kB)
Installing collected packages: runpod
Successfully installed runpod-1.0.0
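The setup step relies on RUNPOD_API_KEY being present, and a missing variable otherwise surfaces as a bare KeyError. A minimal sketch of a friendlier lookup follows; load_api_key is a hypothetical helper name, not part of the runpod package.

```python
import os


def load_api_key(env=None, var_name="RUNPOD_API_KEY"):
    """Return the API key from the environment, or raise a readable error.

    load_api_key is a hypothetical helper for this guide, not an SDK function.
    """
    # Fall back to the real process environment when no mapping is passed in
    env = os.environ if env is None else env
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError(
            f"{var_name} is not set; export it in your shell before running this script"
        )
    return key
```

With this in place, the assignment runpod.api_key = load_api_key() could stand in for the direct os.environ lookup.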
Step by step
This example shows how to synchronously call a deployed RunPod model endpoint using the runpod SDK.
import os
import runpod

# Read the RunPod API key from the RUNPOD_API_KEY environment variable
runpod.api_key = os.environ["RUNPOD_API_KEY"]

# Replace with your actual RunPod endpoint ID
endpoint_id = "YOUR_ENDPOINT_ID"

# Create an Endpoint instance
endpoint = runpod.Endpoint(endpoint_id)

# Define the input payload for the model
input_data = {"prompt": "Hello, RunPod!"}

# Call the endpoint synchronously; run_sync blocks until the job finishes
# and returns the handler's output itself, not a wrapper with an "output" key
result = endpoint.run_sync({"input": input_data})
print("Model output:", result)
Output (the exact text depends on your deployed handler):
Model output: Hello, RunPod! This is your model responding.
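To make the synchronous call less opaque, the sketch below builds (without sending) the kind of HTTP request such a call corresponds to. It assumes RunPod's public serverless route https://api.runpod.ai/v2/<endpoint_id>/runsync with a Bearer token; build_runsync_request is a hypothetical helper, and the SDK's internals may differ.

```python
import json
import urllib.request


def build_runsync_request(endpoint_id, api_key, input_data):
    """Build, but do not send, a synchronous-run HTTP request.

    Assumption: the route and header layout follow RunPod's serverless REST API.
    """
    url = f"https://api.runpod.ai/v2/{endpoint_id}/runsync"
    # The model input travels under the "input" key of the JSON body
    body = json.dumps({"input": input_data}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
```

Inspecting the request object (URL, method, body) is a quick way to check the payload shape before involving the network.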
Common variations
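- Asynchronous calls: endpoint.run({"input": ...}) submits the job and returns a job handle immediately. Check progress with job.status() and collect the result with job.output(). For asyncio code, the SDK also exposes AsyncioEndpoint, which is used together with an aiohttp session.
- Streaming: if the deployed handler yields partial results, the job handle returned by endpoint.run() can be iterated with job.stream(); otherwise only the final output is available.
- Different models: Deploy your preferred model on RunPod and use its endpoint ID.
import os
import runpod

runpod.api_key = os.environ["RUNPOD_API_KEY"]
endpoint = runpod.Endpoint("YOUR_ENDPOINT_ID")
input_data = {"prompt": "Async call example"}

# run() submits the job and returns a job handle without waiting for the result
job = endpoint.run({"input": input_data})

# output() blocks until the job finishes, then returns the handler's output
print("Async model output:", job.output())
Output (the exact text depends on your deployed handler):
Async model output: Async call example response from your model.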
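When several prompts need to be sent at once, one option is to fan blocking calls out over a thread pool. The sketch below is illustrative: run_many is a hypothetical helper, and call stands for any function that takes a payload and returns an output, such as endpoint.run_sync.

```python
from concurrent.futures import ThreadPoolExecutor


def run_many(call, prompts, max_workers=4):
    """Send one request per prompt in parallel; return outputs in input order.

    run_many is a hypothetical helper; `call` is injected so the function
    works with endpoint.run_sync or with a stub during testing.
    """
    payloads = [{"input": {"prompt": p}} for p in prompts]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves the order of the input payloads
        return list(pool.map(call, payloads))
```

A call such as run_many(endpoint.run_sync, ["first prompt", "second prompt"]) would then return the two outputs in the same order as the prompts.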
Troubleshooting
- If you get authentication errors, verify that your RUNPOD_API_KEY environment variable is set correctly.
- If the endpoint ID is invalid, confirm that it matches your deployed model's endpoint in the RunPod dashboard.
- For network timeouts, check your internet connection and RunPod service status.
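For transient network timeouts, retrying with exponential backoff is a common mitigation. The following is a sketch under stated assumptions: call_with_retry is a hypothetical helper, and the default exception types are a guess, so check which exceptions your SDK version actually raises and pass them via retry_on.

```python
import time


def call_with_retry(call, payload, attempts=3, base_delay=1.0,
                    retry_on=(TimeoutError, ConnectionError), sleep=time.sleep):
    """Retry call(payload) on the given exceptions with exponential backoff.

    Hypothetical helper; `retry_on` defaults are an assumption, not SDK facts.
    """
    for attempt in range(attempts):
        try:
            return call(payload)
        except retry_on:
            # Give up and re-raise once the last attempt has failed
            if attempt == attempts - 1:
                raise
            # Wait base_delay, then 2x, 4x, ... before the next attempt
            sleep(base_delay * 2 ** attempt)
```

For example, call_with_retry(endpoint.run_sync, {"input": {"prompt": "Hello"}}) would make up to three attempts before surfacing the error.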
Key Takeaways
- Set your RunPod API key in the RUNPOD_API_KEY environment variable before using the SDK.
- Use runpod.Endpoint(endpoint_id).run_sync() for simple synchronous model inference calls.
- Use asynchronous calls when you do not want to block while a job runs.
- Always verify that your endpoint ID matches the deployed model on RunPod to avoid errors.
- Troubleshoot common issues by checking the API key, endpoint ID, and network connectivity.