How-to · Beginner · 3 min read

How to use Replicate with LangChain

Quick answer
Use the replicate Python package to run models hosted on Replicate, then either wrap the calls in a custom LangChain LLM subclass or use LangChain's built-in Replicate wrapper. Authenticate by setting the REPLICATE_API_TOKEN environment variable and call the model via replicate.run() or through LangChain's LLM interface.

Prerequisites

  • Python 3.8+
  • Replicate API token (set REPLICATE_API_TOKEN environment variable)
  • pip install replicate langchain

Setup

Install the replicate and langchain packages and set your Replicate API token as an environment variable.

  • Install packages: pip install replicate langchain
  • Set environment variable: export REPLICATE_API_TOKEN='your_token_here' (Linux/macOS) or setx REPLICATE_API_TOKEN "your_token_here" (Windows)
bash
pip install replicate langchain
output
Collecting replicate
Collecting langchain
Successfully installed replicate-0.10.0 langchain-0.0.200
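A missing token only surfaces later as an authentication error at request time, so it can help to fail fast at startup. A minimal sketch using only the standard library (this check is our own convenience helper, not part of the replicate client):

```python
import os


def require_replicate_token() -> str:
    """Return the Replicate API token, or raise a clear error if unset."""
    token = os.environ.get("REPLICATE_API_TOKEN", "").strip()
    if not token:
        raise RuntimeError(
            "REPLICATE_API_TOKEN is not set; "
            "export it before calling Replicate models."
        )
    return token
```

Call require_replicate_token() once at startup so misconfiguration is reported immediately rather than partway through a pipeline.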

Step by step

This example shows how to call a Replicate model directly and then how to wrap it in a LangChain LLM subclass for integration.

python
import replicate
from typing import Any, List, Optional
from langchain.llms.base import LLM

# Replace the version hash below with the exact hash shown on the
# model's page at replicate.com; predictions fail on a stale hash.
MODEL_NAME = "stability-ai/stable-diffusion"
VERSION_ID = "db21e45d73e3c6a620a0a0a4e3f3b7f7f8f8f8f8f8f8f8f8f8f8f8f8f8f8f8f8"

# Direct Replicate usage: replicate.run() reads REPLICATE_API_TOKEN from
# the environment and blocks until the prediction completes.
def replicate_inference(prompt: str) -> Any:
    return replicate.run(f"{MODEL_NAME}:{VERSION_ID}", input={"prompt": prompt})

# LangChain LLM wrapper for Replicate. LangChain LLMs are pydantic
# models, so configuration belongs in class-level fields rather than
# a custom __init__ (setting attributes directly there would fail).
class ReplicateLLM(LLM):
    model_name: str
    version_id: str

    def _call(self, prompt: str, stop: Optional[List[str]] = None) -> str:
        output = replicate.run(
            f"{self.model_name}:{self.version_id}", input={"prompt": prompt}
        )
        # Image models return a list of URLs; join or convert as needed.
        if isinstance(output, list):
            return ", ".join(str(item) for item in output)
        return str(output)

    @property
    def _identifying_params(self) -> dict:
        return {"model_name": self.model_name, "version_id": self.version_id}

    @property
    def _llm_type(self) -> str:
        return "replicate"


if __name__ == "__main__":
    # Example direct call
    prompt = "A fantasy landscape, trending on artstation"
    print("Direct Replicate output:")
    result = replicate_inference(prompt)
    print(result)

    # Example LangChain usage
    llm = ReplicateLLM(
        model_name="stability-ai/stable-diffusion",
        version_id="db21e45d73e3c6a620a0a0a4e3f3b7f7f8f8f8f8f8f8f8f8f8f8f8f8f8f8f8f8"
    )
    print("\nLangChain ReplicateLLM output:")
    output = llm(prompt)
    print(output)
output
Direct Replicate output:
['https://replicate.delivery/pbxt/abc123/image1.png']

LangChain ReplicateLLM output:
https://replicate.delivery/pbxt/abc123/image1.png
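Note that _call above accepts a stop argument but ignores it; LangChain passes stop sequences when a chain needs generation to end at a marker. A sketch of honoring it by truncating the returned text (the helper name is ours; classic LangChain ships a similar enforce_stop_tokens utility):

```python
from typing import List, Optional


def enforce_stop_tokens(text: str, stop: Optional[List[str]]) -> str:
    """Truncate text at the earliest occurrence of any stop sequence."""
    if not stop:
        return text
    cut = len(text)
    for token in stop:
        idx = text.find(token)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut]
```

Inside _call, return enforce_stop_tokens(str(output), stop) instead of str(output) so chains that rely on stop sequences behave as expected.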

Common variations

You can run inference asynchronously with the replicate package, or target other Replicate models by changing model_name and version_id. LangChain also supports chaining and prompt templates with this custom LLM.

python
import asyncio
import replicate

# replicate.async_run() (available in recent versions of the replicate
# client) awaits the prediction without blocking the event loop.
async def async_replicate_inference(prompt: str):
    return await replicate.async_run(
        "stability-ai/stable-diffusion:db21e45d73e3c6a620a0a0a4e3f3b7f7f8f8f8f8f8f8f8f8f8f8f8f8f8f8f8f8",
        input={"prompt": prompt},
    )

if __name__ == "__main__":
    prompt = "A futuristic city skyline at sunset"
    result = asyncio.run(async_replicate_inference(prompt))
    print(result)
output
['https://replicate.delivery/pbxt/def456/image2.png']
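The prompt templates mentioned above are essentially named fill-in-the-blank strings; LangChain's PromptTemplate adds variable validation on top of the same idea. A stdlib-only sketch of the pattern (this class is illustrative, not LangChain's API):

```python
import string


class SimplePromptTemplate:
    """A fill-in-the-blank template: the pattern behind PromptTemplate."""

    def __init__(self, template: str):
        self.template = template
        # Collect the {placeholders} the template expects.
        self.variables = {
            name for _, name, _, _ in string.Formatter().parse(template) if name
        }

    def format(self, **kwargs: str) -> str:
        missing = self.variables - kwargs.keys()
        if missing:
            raise KeyError(f"missing template variables: {sorted(missing)}")
        return self.template.format(**kwargs)
```

With the custom LLM above, llm(template.format(subject="a castle")) wires templating into the same call path that LangChain's chains use.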

Troubleshooting

  • If you get authentication errors, ensure REPLICATE_API_TOKEN is set correctly in your environment.
  • Model version IDs must be exact; check the model's page on Replicate for the latest version hash.
  • For network errors, verify your internet connection and Replicate service status.
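For transient network errors, a small retry loop with exponential backoff often resolves the problem without manual intervention. A hedged stdlib sketch (this wrapper is our own, not part of the replicate client):

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retries(
    fn: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 1.0,
) -> T:
    """Run fn, retrying on ConnectionError with exponential backoff."""
    for attempt in range(attempts):
        try:
            return fn()
        except ConnectionError:
            if attempt == attempts - 1:
                raise  # out of retries; surface the error
            time.sleep(base_delay * (2 ** attempt))
    raise RuntimeError("unreachable")
```

For example, call_with_retries(lambda: replicate_inference(prompt)) retries the direct call from the step-by-step example.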

Key Takeaways

  • Use the official replicate Python package with your API token for direct model calls.
  • Wrap Replicate calls in a LangChain LLM subclass to integrate with LangChain workflows.
  • Always specify exact model and version IDs from Replicate for consistent results.
  • Async calls are supported by the replicate package for non-blocking inference.
  • Check environment variables and network connectivity if you encounter authentication or connection errors.
Verified 2026-04 · stability-ai/stable-diffusion