How-to · Beginner · 3 min read

How to use Replicate with LangChain

Quick answer
Use the replicate Python package to run models hosted on Replicate, then either wrap the calls in a custom LangChain LLM subclass or use LangChain's built-in Replicate wrapper. Authenticate by setting the REPLICATE_API_TOKEN environment variable and call the model via replicate.run() or through LangChain's LLM interface.

Prerequisites

  • Python 3.8+
  • Replicate API token (set REPLICATE_API_TOKEN environment variable)
  • pip install replicate langchain

Setup

Install the replicate and langchain packages and set your Replicate API token as an environment variable.

  • Install packages: pip install replicate langchain
  • Set environment variable: export REPLICATE_API_TOKEN='your_token_here' (Linux/macOS) or setx REPLICATE_API_TOKEN "your_token_here" (Windows)
bash
pip install replicate langchain
output
Collecting replicate
Collecting langchain
Successfully installed replicate-0.10.0 langchain-0.0.200
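A missing token only surfaces later as an authentication error at request time, so it can help to fail fast at startup. A minimal sketch using only the standard library (this check is our own convenience helper, not part of the replicate client):

```python
import os


def require_replicate_token() -> str:
    """Return the Replicate API token, or raise a clear error if unset."""
    token = os.environ.get("REPLICATE_API_TOKEN", "").strip()
    if not token:
        raise RuntimeError(
            "REPLICATE_API_TOKEN is not set; "
            "export it before calling Replicate models."
        )
    return token
```

Call require_replicate_token() once at startup so misconfiguration is reported immediately rather than partway through a pipeline.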

Step by step

This example shows how to call a Replicate model directly and then how to wrap it in a LangChain LLM subclass for integration.

python
import replicate
from typing import Any, List, Optional
from langchain.llms.base import LLM

# Replace the version hash below with the exact hash shown on the
# model's page at replicate.com; predictions fail on a stale hash.
MODEL_NAME = "stability-ai/stable-diffusion"
VERSION_ID = "db21e45d73e3c6a620a0a0a4e3f3b7f7f8f8f8f8f8f8f8f8f8f8f8f8f8f8f8f8"

# Direct Replicate usage: replicate.run() reads REPLICATE_API_TOKEN from
# the environment and blocks until the prediction completes.
def replicate_inference(prompt: str) -> Any:
    return replicate.run(f"{MODEL_NAME}:{VERSION_ID}", input={"prompt": prompt})

# LangChain LLM wrapper for Replicate. LangChain LLMs are pydantic
# models, so configuration belongs in class-level fields rather than
# a custom __init__ (setting attributes directly there would fail).
class ReplicateLLM(LLM):
    model_name: str
    version_id: str

    def _call(self, prompt: str, stop: Optional[List[str]] = None) -> str:
        output = replicate.run(
            f"{self.model_name}:{self.version_id}", input={"prompt": prompt}
        )
        # Image models return a list of URLs; join or convert as needed.
        if isinstance(output, list):
            return ", ".join(str(item) for item in output)
        return str(output)

    @property
    def _identifying_params(self) -> dict:
        return {"model_name": self.model_name, "version_id": self.version_id}

    @property
    def _llm_type(self) -> str:
        return "replicate"


if __name__ == "__main__":
    # Example direct call
    prompt = "A fantasy landscape, trending on artstation"
    print("Direct Replicate output:")
    result = replicate_inference(prompt)
    print(result)

    # Example LangChain usage
    llm = ReplicateLLM(
        model_name="stability-ai/stable-diffusion",
        version_id="db21e45d73e3c6a620a0a0a4e3f3b7f7f8f8f8f8f8f8f8f8f8f8f8f8f8f8f8f8"
    )
    print("\nLangChain ReplicateLLM output:")
    output = llm(prompt)
    print(output)
output
Direct Replicate output:
['https://replicate.delivery/pbxt/abc123/image1.png']

LangChain ReplicateLLM output:
https://replicate.delivery/pbxt/abc123/image1.png
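Note that _call above accepts a stop argument but ignores it; LangChain passes stop sequences when a chain needs generation to end at a marker. A sketch of honoring it by truncating the returned text (the helper name is ours; classic LangChain ships a similar enforce_stop_tokens utility):

```python
from typing import List, Optional


def enforce_stop_tokens(text: str, stop: Optional[List[str]]) -> str:
    """Truncate text at the earliest occurrence of any stop sequence."""
    if not stop:
        return text
    cut = len(text)
    for token in stop:
        idx = text.find(token)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut]
```

Inside _call, return enforce_stop_tokens(str(output), stop) instead of str(output) so chains that rely on stop sequences behave as expected.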

Common variations

You can run inference asynchronously with the replicate package, or target other Replicate models by changing model_name and version_id. LangChain also supports chaining and prompt templates with this custom LLM.

python
import asyncio
import replicate

# replicate.async_run() (available in recent versions of the replicate
# client) awaits the prediction without blocking the event loop.
async def async_replicate_inference(prompt: str):
    return await replicate.async_run(
        "stability-ai/stable-diffusion:db21e45d73e3c6a620a0a0a4e3f3b7f7f8f8f8f8f8f8f8f8f8f8f8f8f8f8f8f8",
        input={"prompt": prompt},
    )

if __name__ == "__main__":
    prompt = "A futuristic city skyline at sunset"
    result = asyncio.run(async_replicate_inference(prompt))
    print(result)
output
['https://replicate.delivery/pbxt/def456/image2.png']
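The prompt templates mentioned above are essentially named fill-in-the-blank strings; LangChain's PromptTemplate adds variable validation on top of the same idea. A stdlib-only sketch of the pattern (this class is illustrative, not LangChain's API):

```python
import string


class SimplePromptTemplate:
    """A fill-in-the-blank template: the pattern behind PromptTemplate."""

    def __init__(self, template: str):
        self.template = template
        # Collect the {placeholders} the template expects.
        self.variables = {
            name for _, name, _, _ in string.Formatter().parse(template) if name
        }

    def format(self, **kwargs: str) -> str:
        missing = self.variables - kwargs.keys()
        if missing:
            raise KeyError(f"missing template variables: {sorted(missing)}")
        return self.template.format(**kwargs)
```

With the custom LLM above, llm(template.format(subject="a castle")) wires templating into the same call path that LangChain's chains use.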

Troubleshooting

  • If you get authentication errors, ensure REPLICATE_API_TOKEN is set correctly in your environment.
  • Model version IDs must be exact; check the model's page on Replicate for the latest version hash.
  • For network errors, verify your internet connection and Replicate service status.
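For transient network errors, a small retry loop with exponential backoff often resolves the problem without manual intervention. A hedged stdlib sketch (this wrapper is our own, not part of the replicate client):

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retries(
    fn: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 1.0,
) -> T:
    """Run fn, retrying on ConnectionError with exponential backoff."""
    for attempt in range(attempts):
        try:
            return fn()
        except ConnectionError:
            if attempt == attempts - 1:
                raise  # out of retries; surface the error
            time.sleep(base_delay * (2 ** attempt))
    raise RuntimeError("unreachable")
```

For example, call_with_retries(lambda: replicate_inference(prompt)) retries the direct call from the step-by-step example.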

Key Takeaways

  • Use the official replicate Python package with your API token for direct model calls.
  • Wrap Replicate calls in a LangChain LLM subclass to integrate with LangChain workflows.
  • Always specify exact model and version IDs from Replicate for consistent results.
  • Async calls are supported by the replicate package for non-blocking inference.
  • Check environment variables and network connectivity if you encounter authentication or connection errors.
Verified 2026-04 · stability-ai/stable-diffusion