Replicate predictions API explained
Quick answer
The Replicate predictions API lets you run inference on machine learning models hosted on Replicate's platform by sending input parameters and receiving output predictions. You use the replicate Python package or the HTTP API to create prediction jobs and retrieve results synchronously or asynchronously.
Prerequisites
- Python 3.8+
- A Replicate API token (set the REPLICATE_API_TOKEN environment variable)
- pip install replicate
Setup
Install the official replicate Python package and set your API token as an environment variable for authentication.
pip install replicate

output

Collecting replicate
  Downloading replicate-0.10.0-py3-none-any.whl (20 kB)
Installing collected packages: replicate
Successfully installed replicate-0.10.0
Step by step
This example shows how to create a prediction job with the Replicate Python SDK, wait for completion, and print the output.
import os
import replicate
# Ensure your API token is set in the environment
# export REPLICATE_API_TOKEN="your_token_here"
client = replicate.Client()
# Specify the model and input parameters
model = client.models.get("stability-ai/stable-diffusion")
version = model.versions.get("db21e45d73e9ab0a5b1a5c6a1b3f9a7f3b9a5e1a4a5a6a7a8a9a0a1a2a3a4a5a")
inputs = {
    "prompt": "A futuristic cityscape at sunset",
    "width": 512,
    "height": 512,
    "num_inference_steps": 50,
}
# Create a prediction
prediction = client.predictions.create(version=version, input=inputs)
# Wait for the prediction to complete
prediction.wait()
# Print the output URL(s)
print("Prediction output:", prediction.output)

output

Prediction output: ['https://replicate.delivery/pbxt/abc123...']
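If you prefer explicit control over waiting instead of calling prediction.wait(), you can poll the prediction's status yourself. The sketch below is a hypothetical helper, not part of the SDK; it assumes the prediction object exposes a status attribute and a reload() method, as the SDK's prediction objects do.

```python
import time

# Terminal states reported by the Replicate predictions API
TERMINAL_STATES = {"succeeded", "failed", "canceled"}

def poll_prediction(prediction, interval=1.0, timeout=300.0):
    """Poll until the prediction reaches a terminal state or times out.

    `prediction` is assumed to expose `.status` and `.reload()`,
    like the SDK's prediction objects.
    """
    waited = 0.0
    while prediction.status not in TERMINAL_STATES:
        if waited >= timeout:
            raise TimeoutError("prediction did not finish in time")
        time.sleep(interval)
        waited += interval
        prediction.reload()  # refresh status from the API
    return prediction.status
```

A timeout like this is useful in production code so a stuck prediction cannot block your process indefinitely.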
Common variations
You can use the replicate.run() shortcut for synchronous calls, or use the HTTP API directly with requests. Async usage requires custom async wrappers since the SDK is synchronous.
import replicate

# replicate.run() reads REPLICATE_API_TOKEN from the environment
# Synchronous shortcut
output = replicate.run(
"stability-ai/stable-diffusion:db21e45d73e9ab0a5b1a5c6a1b3f9a7f3b9a5e1a4a5a6a7a8a9a0a1a2a3a4a5a",
input={"prompt": "A dragon flying over mountains"}
)
print("Output URL:", output)

output

Output URL: ['https://replicate.delivery/pbxt/xyz789...']
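Calling the HTTP API directly is a matter of POSTing a JSON body to the predictions endpoint with your token in the Authorization header. The sketch below uses the standard library's urllib instead of requests so it has no extra dependency; the "Token" auth scheme and endpoint URL match Replicate's HTTP API docs at the time of writing, so check the current docs if authentication fails.

```python
import json
import os
import urllib.request

API_URL = "https://api.replicate.com/v1/predictions"

def build_request(version, inputs):
    """Assemble headers and JSON body for a create-prediction call."""
    token = os.environ.get("REPLICATE_API_TOKEN", "")
    headers = {
        "Authorization": f"Token {token}",
        "Content-Type": "application/json",
    }
    body = {"version": version, "input": inputs}
    return headers, body

def create_prediction(version, inputs):
    """POST the prediction request and return the decoded JSON response."""
    headers, body = build_request(version, inputs)
    req = urllib.request.Request(
        API_URL, data=json.dumps(body).encode(), headers=headers
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

The response JSON includes an id and status you can use to poll the prediction's detail endpoint until it completes.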
Troubleshooting
- If you get authentication errors, verify that your REPLICATE_API_TOKEN environment variable is set correctly.
- For timeout or network errors, check your internet connection and retry.
- If the prediction status is "failed", inspect prediction.error for details.
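The status and error checks above can be bundled into a small helper. This is a hypothetical summarize function, not part of the SDK; it assumes the prediction object exposes status, error, and output attributes, as the SDK's prediction objects do.

```python
def summarize(prediction):
    """Return a short human-readable summary of a prediction's state.

    Assumes the object exposes `.status`, `.error`, and `.output`,
    like the SDK's prediction objects.
    """
    if prediction.status == "succeeded":
        return f"succeeded: {prediction.output}"
    if prediction.status == "failed":
        return f"failed: {prediction.error}"
    return f"still {prediction.status}"
```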
Key Takeaways
- Use the official replicate Python package with your API token set in REPLICATE_API_TOKEN.
- Create predictions by specifying a model version and input parameters, then wait for completion to get results.
- The replicate.run() method offers a simple synchronous interface for quick predictions.
- Check prediction.error and your environment variables if you encounter failures or authentication issues.