How to run custom models on Replicate

Quick answer

Use the replicate Python package to run custom models by specifying the model's repository name and input parameters in replicate.run(). Authenticate with your API token via the REPLICATE_API_TOKEN environment variable to execute the model and retrieve outputs.

Prerequisites

Python 3.8+
pip install replicate
Replicate API token set as REPLICATE_API_TOKEN environment variable

Setup

Install the official replicate Python package and set your Replicate API token as an environment variable for authentication.

bash

pip install replicate

# On Linux/macOS
export REPLICATE_API_TOKEN="your_token_here"

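# On Windows (PowerShell), current session only
$env:REPLICATE_API_TOKEN = "your_token_here"

# On Windows, persisted for new terminal sessions (not the current one)
setx REPLICATE_API_TOKEN "your_token_here"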

output

Collecting replicate
  Downloading replicate-0.10.0-py3-none-any.whl (20 kB)
Installing collected packages: replicate
Successfully installed replicate-0.10.0

# The version number you see will likely differ.
# export prints nothing; setx prints a short confirmation line.
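
Before calling the API, it can help to confirm that the token is actually visible to Python. The sketch below uses only the standard library. Note that token_status is a hypothetical helper name, not part of the replicate package, and the masking format is an arbitrary choice.

```python
import os

def token_status(env=None):
    """Return a short, safe-to-print description of the token state."""
    env = os.environ if env is None else env
    token = env.get("REPLICATE_API_TOKEN", "").strip()
    if not token:
        return "missing"
    # Show only the last four characters so the full secret stays out of logs.
    return "set (ends with ..." + token[-4:] + ")"

print("REPLICATE_API_TOKEN:", token_status())
```

If this prints "missing" right after you set the variable, the script is probably running in a different shell session from the one where the variable was exported.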

Step by step

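Run a model on Replicate by passing its identifier and an input dictionary to replicate.run(). Identifiers take the form owner/name; models you pushed yourself, and many community models, typically also need a version suffix (owner/name:version), which you can copy from the model's page. The example below runs the meta/meta-llama-3-8b-instruct model with a prompt and a max_tokens limit.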

python

import os
import replicate

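# Fail early with a clear message if the token is not set in the environment
if not os.environ.get("REPLICATE_API_TOKEN"):
    raise SystemExit("Set REPLICATE_API_TOKEN before running this script.")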

def run_custom_model():
    model_name = "meta/meta-llama-3-8b-instruct"
    prompt = "Write a short poem about AI."

    output = replicate.run(
        model_name,
        input={"prompt": prompt, "max_tokens": 128}
    )
    # Language models usually return text in chunks, so join them before printing
    print("Model output:", "".join(output))

if __name__ == "__main__":
    run_custom_model()

output

Model output: AI dances in circuits, bright and free,
Crafting dreams in code's deep sea...
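
Language models on Replicate often return their text in chunks rather than as a single string, so printing the raw return value may show a list or an iterator object. Below is a minimal sketch of a normalizer, assuming the output is either a string or an iterable of string-like chunks. The name collect_text is a hypothetical helper, not part of the replicate package.

```python
def collect_text(output):
    """Join model output into one string, whatever shape it arrived in."""
    if isinstance(output, str):
        return output
    # Lists, generators, and other iterables of chunks are concatenated in order.
    return "".join(str(chunk) for chunk in output)

# Stand-in values; in practice, pass the return value of replicate.run(...)
print(collect_text(["AI ", "dances ", "in circuits"]))
print(collect_text(chunk for chunk in ("bright ", "and free")))
```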

Common variations

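To avoid blocking, use replicate.async_run(), which takes the same arguments as replicate.run() and must be awaited. To run a different model, change the identifier string. Input fields and output types depend on the model: image models typically take a prompt and return a list of image files or URLs rather than text.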

python

import asyncio
import replicate

async def run_async_model():
    # If this identifier is rejected, append the version from the model page ("owner/name:version")
    model_name = "stability-ai/stable-diffusion"
    prompt = "A futuristic cityscape at sunset"

    output = await replicate.async_run(
        model_name,
        input={"prompt": prompt}
    )
    print("Image URL:", output[0])

if __name__ == "__main__":
    asyncio.run(run_async_model())

output

Image URL: https://replicate.delivery/v1/.../output.png
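
The main benefit of the async form is that several calls can be in flight at once. The sketch below shows that pattern with asyncio.gather. To stay runnable offline it takes the call function as a parameter and uses a stand-in; run_many and fake_run are hypothetical names, and the "owner/model" identifier is a placeholder.

```python
import asyncio

async def run_many(run_fn, model_name, prompts):
    """Start one call per prompt and wait for all of them together."""
    calls = [run_fn(model_name, input={"prompt": p}) for p in prompts]
    # gather returns results in the same order as `prompts`
    return await asyncio.gather(*calls)

async def fake_run(model_name, input):
    """Stand-in for replicate.async_run so the sketch runs without network access."""
    await asyncio.sleep(0.01)
    return [f"{model_name} <- {input['prompt']}"]

results = asyncio.run(run_many(fake_run, "owner/model", ["a cat", "a dog"]))
for result in results:
    print(result[0])
```

To run this against the real API, pass replicate.async_run as run_fn in place of fake_run. Keep the number of concurrent prompts modest, since each one starts a separate prediction.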

Troubleshooting

If you get an authentication error, verify that REPLICATE_API_TOKEN is set in the same shell session that runs the script and that the value was copied in full.
If the model repository name is invalid, check the exact model name on replicate.com/models.
For input validation errors, consult the model's input schema on Replicate.
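
A malformed identifier can be caught locally before any request is sent. The sketch below assumes references look like owner/name, optionally followed by :version. The exact set of characters Replicate permits is an assumption here, and parse_model_ref is a hypothetical helper, not part of the replicate package. It checks only the shape of the string, not whether the model exists.

```python
import re

# owner/name with an optional :version suffix (permitted characters are an assumption)
_MODEL_REF = re.compile(
    r"^(?P<owner>[^/:\s]+)/(?P<name>[^/:\s]+)(?::(?P<version>[^/:\s]+))?$"
)

def parse_model_ref(ref):
    """Split a model reference into (owner, name, version); version may be None."""
    match = _MODEL_REF.match(ref.strip())
    if match is None:
        raise ValueError(
            f"expected 'owner/name' or 'owner/name:version', got {ref!r}"
        )
    return match.group("owner"), match.group("name"), match.group("version")

print(parse_model_ref("meta/meta-llama-3-8b-instruct"))
```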

Key Takeaways

Use the official replicate Python package and set your API token via environment variable.
Run custom models by specifying the model repository and input parameters in replicate.run().
Async execution is supported with replicate.async_run() for non-blocking calls.
Always verify model names and input schemas on replicate.com/models to avoid errors.
Authentication errors usually mean your REPLICATE_API_TOKEN is missing or incorrect.

Verified 2026-04 · meta/meta-llama-3-8b-instruct, stability-ai/stable-diffusion
