Replicate Cog explained
Quick answer
The replicate package is Replicate's Python client. It runs AI models hosted on Replicate's platform through a simple interface. Cog is a separate open-source tool from Replicate that packages models into containers; models built with Cog and pushed to Replicate are called through the same client. You install the client with pip install replicate, then run a model by calling replicate.run() with the model name and inputs, and receive the output directly in Python.
Prerequisites
- Python 3.8+
- pip install replicate
- Replicate API token set as the REPLICATE_API_TOKEN environment variable
Setup
Install the replicate Python package and set your API token as an environment variable to authenticate with Replicate's API.
pip install replicate
# On Linux/macOS
export REPLICATE_API_TOKEN="your_token_here"
# On Windows (PowerShell); setx applies to newly opened terminals, not the current one
setx REPLICATE_API_TOKEN "your_token_here"
Output
Collecting replicate
  Downloading replicate-0.10.0-py3-none-any.whl (20 kB)
Installing collected packages: replicate
Successfully installed replicate-0.10.0
# The version number will differ. export prints nothing; setx prints a short confirmation.
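Before running a model, it can help to confirm that the token is actually visible to Python. The following is a minimal sketch using only the standard library; find_api_token and describe_token are hypothetical helper names, not part of the replicate package.

```python
import os
from typing import Mapping, Optional


def find_api_token(env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the Replicate token from the environment, or None if unset or blank."""
    # Accept an explicit mapping so the check can be exercised without touching os.environ.
    env = os.environ if env is None else env
    token = env.get("REPLICATE_API_TOKEN", "").strip()
    return token or None


def describe_token(token: Optional[str]) -> str:
    """Describe the token state without printing the secret itself."""
    if token is None:
        return "REPLICATE_API_TOKEN is not set"
    return f"REPLICATE_API_TOKEN is set ({len(token)} characters)"
```

Calling print(describe_token(find_api_token())) in a fresh terminal shows whether the export or setx command took effect, without echoing the token.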
Step by step
Use the replicate.run() function to run a model by specifying its name (optionally with a version) and its input parameters. The call blocks until the prediction finishes and returns the model's output as ordinary Python values.
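import replicate

# The client reads REPLICATE_API_TOKEN from the environment.
# Run the Stable Diffusion model to generate an image from a prompt.
# The version hash below is a placeholder (it is not valid hexadecimal);
# copy the current version string from the model's page on replicate.com.
output = replicate.run(
    "stability-ai/stable-diffusion:db21e45d73e3a2b4a8a6f0a1e1f5a7b3d9f9f7a3f9a1b2c3d4e5f6g7h8i9j0k",
    input={"prompt": "A futuristic cityscape at sunset"}
)
# Depending on the model and client version, output may be a single value or a list.
print("Output URL:", output)
Output
Output URL: https://replicate.delivery/pbxt/abc123def456ghi789jkl0/image.png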
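Since the value returned by replicate.run() can be a single item or a list depending on the model, a small normalizing step keeps downstream code simple. This is a sketch under the assumption that each item is either a string or an object whose str() form is its URL; output_to_strings is a hypothetical helper, not a replicate function.

```python
from typing import Any, List


def output_to_strings(output: Any) -> List[str]:
    """Normalize a model output into a flat list of strings."""
    if output is None:
        return []
    # Lists and tuples are treated as multiple results; anything else is one result.
    if isinstance(output, (list, tuple)):
        items = list(output)
    else:
        items = [output]
    # Assumption: non-string items convert to something useful (such as a URL) via str().
    return [str(item) for item in items]
```

With this in place, a loop such as "for url in output_to_strings(output): print(url)" handles both the single-URL and the list-of-URLs case.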
Common variations
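You can run different models by changing the model name string. Async usage is supported with replicate.async_run(). The output shape depends on the model: image models usually return URLs (or file objects in recent client versions), while text models return strings, sequences of string chunks, or JSON-like structures.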
import asyncio
import replicate

async def main():
    output = await replicate.async_run(
        "meta/meta-llama-3-8b-instruct",
        input={"prompt": "Explain RAG in AI"}
    )
    print("Async output:", output)

asyncio.run(main())
Output
Async output: RAG (Retrieval-Augmented Generation) is a technique that combines retrieval of documents with generative models to improve accuracy and relevance.
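The main benefit of the async call is running several predictions at once. The sketch below fans out a list of prompts with asyncio.gather; run_prompts and replicate_runner are hypothetical names, and the runner is injectable so the pattern can be tried without network access. It assumes the model accepts a "prompt" input, as in the example above.

```python
import asyncio
from typing import Any, Awaitable, Callable, List

# A runner takes (model, prompt) and returns an awaitable producing the model output.
Runner = Callable[[str, str], Awaitable[Any]]


async def replicate_runner(model: str, prompt: str) -> Any:
    """Default runner: one call to replicate.async_run per prompt."""
    import replicate  # imported lazily so this sketch loads without the package installed

    return await replicate.async_run(model, input={"prompt": prompt})


async def run_prompts(
    model: str, prompts: List[str], runner: Runner = replicate_runner
) -> List[Any]:
    """Run all prompts concurrently and return results in the same order as prompts."""
    tasks = [runner(model, prompt) for prompt in prompts]
    return list(await asyncio.gather(*tasks))
```

A call such as asyncio.run(run_prompts("meta/meta-llama-3-8b-instruct", ["Explain RAG in AI", "Explain fine-tuning"])) would issue both predictions concurrently. Note that asyncio.gather raises the first exception it encounters by default, so one failed prediction fails the whole batch.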
Troubleshooting
- If you get authentication errors, verify that your REPLICATE_API_TOKEN environment variable is set correctly.
- For model not found errors, check the exact model name and version string on replicate.com.
- Network timeouts may require retry logic or checking your internet connection.
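The retry logic mentioned above could look like the following minimal sketch, which retries a callable with exponential backoff. call_with_retries is a hypothetical helper; which exception types the replicate client raises on network failures is an assumption here, so pass the ones you actually observe via retry_on.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_retries(
    func: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 1.0,
    retry_on: Tuple[Type[BaseException], ...] = (ConnectionError, TimeoutError),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func, retrying on the given exceptions with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return func()
        except retry_on:
            # Re-raise once the final attempt has failed.
            if attempt == attempts - 1:
                raise
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable")
```

Usage would wrap the model call in a zero-argument function, for example call_with_retries(lambda: replicate.run(model_name, input=inputs)). Authentication and model-not-found errors are not worth retrying, which is why only the exception types listed in retry_on trigger another attempt.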
Key Takeaways
- Use pip install replicate and set REPLICATE_API_TOKEN to authenticate.
- Run models with replicate.run(model_name, input=...) for simple synchronous inference.
- Async inference is supported via replicate.async_run() for non-blocking calls.
- Model outputs vary by type: images return URLs, text returns strings or JSON.
- Check model names and tokens carefully to avoid common errors.