Replicate pricing
Quick answer
Replicate pricing is usage-based: depending on the model, you are charged per compute second or per inference. Some models offer free usage tiers, but most require payment via credits purchased on the Replicate platform. To run models from Python, use the replicate package with your API token set in REPLICATE_API_TOKEN, and review usage and costs on your Replicate dashboard.
Prerequisites
- Python 3.8+
- Replicate API token (set the REPLICATE_API_TOKEN environment variable)
- pip install replicate
Setup
Install the official replicate Python package and set your API token as an environment variable to authenticate requests.
pip install replicate
Output:
Collecting replicate
  Downloading replicate-0.10.0-py3-none-any.whl (30 kB)
Installing collected packages: replicate
Successfully installed replicate-0.10.0
Step by step
Use the replicate package to run a model and monitor usage. Pricing depends on the model: some are billed per second of GPU or CPU time, others per inference. Free usage is limited and varies by model.
import replicate

# The client reads the token from the REPLICATE_API_TOKEN environment variable:
# export REPLICATE_API_TOKEN="your_token_here"
client = replicate.Client()

# Run a model (example: meta/meta-llama-3-8b-instruct)
output = client.run(
    "meta/meta-llama-3-8b-instruct",
    input={"prompt": "Hello, how are you?", "max_tokens": 50},
)

# Text models may return output in chunks; join() handles a list or an iterator
print("Model output:", "".join(output))

# Note: the cost of this run depends on how the model is billed
# (compute time or per inference).
Output:
Model output: Hello! I'm doing well, thank you. How can I assist you today?
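For models billed per compute second, the cost of a run is roughly the billed seconds multiplied by the per-second rate. The sketch below shows that arithmetic only. The function name estimate_run_cost and the rate EXAMPLE_PRICE_PER_SECOND_USD are hypothetical, chosen for illustration; they are not part of the replicate package, and the rate is not a real Replicate price. Take actual rates from the model's page on Replicate.

```python
def estimate_run_cost(predict_seconds: float, price_per_second_usd: float) -> float:
    """Estimate the cost of one run that is billed per compute second."""
    if predict_seconds < 0 or price_per_second_usd < 0:
        raise ValueError("seconds and price must be non-negative")
    return predict_seconds * price_per_second_usd


# Hypothetical rate for illustration only; not a real Replicate price.
EXAMPLE_PRICE_PER_SECOND_USD = 0.001

if __name__ == "__main__":
    cost = estimate_run_cost(4.0, EXAMPLE_PRICE_PER_SECOND_USD)
    print(f"Estimated cost: ${cost:.4f}")
```

This covers per-second billing only. A model billed per inference would need a different calculation, and the figures on your Replicate dashboard remain the authoritative record.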
Common variations
You can run image generation models or other types by changing the model name and input parameters. Async usage is supported via asyncio with the replicate package. Pricing varies by model type (text, image, video) and compute resources used.
import asyncio
import replicate

async def main():
    client = replicate.Client()
    # Some models may need a pinned version, written as "owner/name:version"
    output = await client.async_run(
        "stability-ai/stable-diffusion",
        input={"prompt": "A futuristic cityscape at sunset"},
    )
    print("Image URL:", output[0])

asyncio.run(main())
Output:
Image URL: https://replicate.delivery/your-generated-image.png
Troubleshooting
- If you get authentication errors, verify that REPLICATE_API_TOKEN is set correctly.
- For quota exceeded errors, check your usage dashboard on Replicate and consider upgrading your plan or reducing usage.
- Model-specific errors may require checking the model documentation on Replicate.
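One way to catch the authentication problem early is to check the environment variable before any request is made. The helper below is a minimal sketch using only the standard library. The name get_replicate_token is hypothetical and is not part of the replicate package; the optional env argument exists only so the check can be exercised without touching the real environment.

```python
import os


def get_replicate_token(env=None) -> str:
    """Return the Replicate API token, or raise a clear error if it is missing."""
    env = os.environ if env is None else env
    token = env.get("REPLICATE_API_TOKEN", "").strip()
    if not token:
        raise RuntimeError(
            "REPLICATE_API_TOKEN is not set or is empty; export it before running."
        )
    return token


if __name__ == "__main__":
    # Demonstration with a placeholder mapping instead of the real environment.
    print(get_replicate_token({"REPLICATE_API_TOKEN": "example-token"}))
```

Calling get_replicate_token() with no arguments before creating the client makes a missing token fail with an explicit message rather than an authentication error from the API.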
Key Takeaways
- Replicate pricing is usage-based, billed per compute second or inference.
- Set your API token in the environment variable REPLICATE_API_TOKEN before using the SDK.
- Pricing varies by model type and compute resources; check Replicate's dashboard for detailed costs.
- Use the official replicate Python package for easy integration and usage tracking.