How to beginner · 3 min read

Replicate free tier limits

Quick answer
The Replicate API offers a free tier with usage limits that include a monthly compute time quota and rate limits on API calls. Free tier users can run models with some restrictions on concurrency and total compute, ideal for development and light usage. Check replicate.com/pricing for the latest free tier details.

PREREQUISITES

  • Python 3.8+
  • pip install replicate
  • Replicate API token set as REPLICATE_API_TOKEN environment variable

Setup

Install the official replicate Python package and set your API token as an environment variable.

bash
pip install replicate
output
Collecting replicate
  Downloading replicate-0.10.0-py3-none-any.whl (20 kB)
Installing collected packages: replicate
Successfully installed replicate-0.10.0

Step by step

Use the replicate Python client to run a model within the free tier limits. This example runs the meta/meta-llama-3-8b-instruct model with a prompt.

python
import os
import replicate

# Ensure your API token is set in the environment
# export REPLICATE_API_TOKEN="your_token_here"

client = replicate.Client()

output = client.run(
    "meta/meta-llama-3-8b-instruct",
    input={"prompt": "Hello, how are you?", "max_tokens": 50}
)

print("Model output:", output)
output
Model output: Hello! I'm doing well, thank you. How can I assist you today?

Common variations

You can run other models or use asynchronous calls with replicate. The free tier applies across all models and usage combined.

python
import asyncio
import os
import replicate

async def async_run():
    client = replicate.Client()
    output = await client.async_run(
        "stability-ai/stable-diffusion",
        input={"prompt": "A futuristic cityscape at sunset"}
    )
    print("Async model output URL:", output)

asyncio.run(async_run())
output
Async model output URL: https://replicate.delivery/abcd1234.png

Troubleshooting

  • If you exceed free tier limits, you may receive rate limit or quota errors. Upgrade your plan or reduce usage.
  • Ensure REPLICATE_API_TOKEN is correctly set to avoid authentication errors.
  • Check replicate.com/pricing for updated limits and policies.

Key Takeaways

  • Replicate offers a free tier with monthly compute time and rate limits suitable for development.
  • Use the official replicate Python client with your API token set in REPLICATE_API_TOKEN.
  • Free tier limits apply across all models and usage combined, so monitor your usage to avoid interruptions.
Verified 2026-04 · meta/meta-llama-3-8b-instruct, stability-ai/stable-diffusion
Verify ↗