How to use Whisper with LangChain
Quick answer
Use the `openai` Python SDK to transcribe audio with the `whisper-1` model, then feed the transcription into a LangChain pipeline (for example, a summarization prompt or a custom chain). This enables automated audio-to-text workflows within LangChain.

Prerequisites
- Python 3.8+
- OpenAI API key
- `pip install "openai>=1.0" "langchain>=0.2"`
Setup
Install the required packages and set your OpenAI API key as an environment variable.
- Install packages: `pip install "openai>=1.0" "langchain>=0.2" langchain-openai`
- Set the environment variable: `export OPENAI_API_KEY='your_api_key'` (Linux/macOS) or `setx OPENAI_API_KEY "your_api_key"` (Windows)
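Before running anything, it can help to confirm the key is actually visible to Python. A minimal sanity-check sketch (the helper name is illustrative):

```python
import os

def api_key_configured() -> bool:
    """Return True if OPENAI_API_KEY is set and non-empty."""
    return bool(os.environ.get("OPENAI_API_KEY"))

if __name__ == "__main__":
    print("API key configured:", api_key_configured())
```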
Step by step
This example shows how to transcribe an audio file using OpenAI's Whisper model via the openai SDK, then use LangChain to process the transcription text.
```python
import os

from openai import OpenAI
from langchain_core.prompts import PromptTemplate
from langchain_openai import ChatOpenAI  # requires: pip install langchain-openai

# Initialize the OpenAI client
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

# Transcribe an audio file with Whisper
with open("audio.mp3", "rb") as audio_file:
    transcript_response = client.audio.transcriptions.create(
        model="whisper-1",
        file=audio_file,
    )

transcription = transcript_response.text
print("Transcription:", transcription)

# Use LangChain to process the transcription (e.g., summarize it)
prompt = PromptTemplate(
    input_variables=["text"],
    template="Summarize the following transcription:\n{text}",
)
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)

# Compose the prompt and model into a chain, then run it
chain = prompt | llm
summary = chain.invoke({"text": transcription}).content
print("Summary:", summary)
```

Output
Transcription: Hello, this is a sample audio transcription from Whisper.
Summary: This is a sample audio transcription.
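One practical caveat: OpenAI's hosted transcription endpoint rejects audio uploads larger than 25 MB, so it is worth checking the file size before sending it. A minimal sketch (the helper name and constant are illustrative):

```python
import os

# OpenAI's hosted audio endpoints reject uploads larger than 25 MB
MAX_UPLOAD_BYTES = 25 * 1024 * 1024

def check_audio_size(path: str) -> int:
    """Return the file size in bytes, raising if it exceeds the API limit."""
    size = os.path.getsize(path)
    if size > MAX_UPLOAD_BYTES:
        raise ValueError(
            f"{path} is {size} bytes; split it before sending to whisper-1"
        )
    return size
```

Files over the limit need to be split into smaller chunks (for example, by silence or fixed duration) and transcribed piece by piece.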
Common variations
- Use async calls with `asyncio` and `await` for transcription and LangChain calls.
- Stream transcription results if supported by your SDK version.
- Use the `OpenAIWhisperParser` document loader from `langchain_community` for more integrated audio workflows.
- Swap `gpt-4o-mini` for other OpenAI models for different LLM capabilities.
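The async variation can be sketched with the SDK's `AsyncOpenAI` client (assuming `openai>=1.0`; the helper names are illustrative, and the import is done inside the function so the sketch loads even without the SDK installed):

```python
import asyncio
import os

async def transcribe_async(path: str) -> str:
    """Transcribe one audio file with whisper-1 using the async client."""
    from openai import AsyncOpenAI  # assumes openai>=1.0 is installed

    client = AsyncOpenAI(api_key=os.environ["OPENAI_API_KEY"])
    with open(path, "rb") as audio_file:
        response = await client.audio.transcriptions.create(
            model="whisper-1",
            file=audio_file,
        )
    return response.text

async def transcribe_many(paths: list[str]) -> list[str]:
    # gather runs the transcription requests concurrently
    return list(await asyncio.gather(*(transcribe_async(p) for p in paths)))

if __name__ == "__main__":
    print(asyncio.run(transcribe_async("audio.mp3")))
```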
Troubleshooting
- If you get authentication errors, verify that your `OPENAI_API_KEY` environment variable is set correctly.
- For file errors, ensure the audio file path is correct and the file format is supported (mp3, wav, m4a, etc.).
- If transcription is slow, check your network connection and API rate limits.
- Update the `openai` and `langchain` packages regularly to avoid compatibility issues.
Key Takeaways
- Use OpenAI's `whisper-1` model via the `openai` SDK for accurate audio transcription.
- Integrate Whisper transcription results into LangChain pipelines for automated audio processing workflows.
- Set environment variables and install up-to-date `openai` and `langchain` packages to ensure compatibility.