
How to load JSON file with LangChain

Quick answer
Parse the JSON file with Python's json module (or read it as raw text with LangChain's TextLoader), then wrap the content in LangChain Document objects. You can then pass these documents to LangChain chains or vector stores.

PREREQUISITES

  • Python 3.8+
  • pip install "langchain>=0.2" "openai>=1.0" (quote the version specifiers so the shell does not treat > as a redirect)
  • OpenAI API key (only needed once you pass documents to an OpenAI model; the loading code in this guide does not call the API)

Setup

Install LangChain and the OpenAI Python SDK, then set your OpenAI API key as an environment variable.

bash
pip install langchain openai

# Set the environment variable in your shell (replace the placeholder with your own key)
# export OPENAI_API_KEY="your-api-key"  # Linux/macOS
# setx OPENAI_API_KEY "your-api-key"  # Windows (takes effect in new terminals)
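
To confirm that Python can see the variable after you set it, a small check like the one below may help. This is a minimal sketch: the helper name get_api_key and its error message are hypothetical, not part of LangChain or the OpenAI SDK.

```python
import os


def get_api_key(env=None, name="OPENAI_API_KEY"):
    """Return the API key from the environment, or raise a clear error.

    get_api_key is a hypothetical helper written for this guide.
    """
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; export it in your shell first.")
    return key


if __name__ == "__main__":
    try:
        get_api_key()
        print("OPENAI_API_KEY is set")
    except RuntimeError as err:
        print(err)
```

Passing a plain dict as env makes the check easy to exercise without touching your real environment.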

Step by step

Parse the JSON file with json.load, serialize the result back to a formatted string, and wrap that string in a LangChain Document.

python
import json
# langchain_core is installed alongside langchain; langchain.schema is the older import path
from langchain_core.documents import Document

# Path to your JSON file
json_file_path = "data.json"

# Load JSON content
with open(json_file_path, "r", encoding="utf-8") as f:
    data = json.load(f)

# Convert JSON data to string or extract relevant fields
json_text = json.dumps(data, indent=2)

# Create a LangChain Document
documents = [Document(page_content=json_text)]

# Print the document content
print(documents[0].page_content)
output
{
  "key1": "value1",
  "key2": 123,
  "key3": ["item1", "item2"]
}
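
Dumping the whole file into page_content is the simplest option. If only some fields matter, you might extract them instead and keep the rest as metadata. The sketch below assumes a flat JSON object like the sample output above; record_to_text_and_metadata and its parameters are hypothetical names used for illustration, and the code uses only the standard library.

```python
import json


def record_to_text_and_metadata(record, text_fields, source="data.json"):
    """Split one JSON object into readable text and a metadata dict.

    text_fields lists the keys whose values become the document text.
    Remaining scalar values (str, int, float, bool) go into metadata.
    """
    lines = []
    for key in text_fields:
        if key in record:
            value = record[key]
            # Keep strings as-is; serialize lists, dicts, and numbers as JSON.
            rendered = value if isinstance(value, str) else json.dumps(value)
            lines.append(f"{key}: {rendered}")
    metadata = {"source": source}
    for key, value in record.items():
        if key not in text_fields and isinstance(value, (str, int, float, bool)):
            metadata[key] = value
    return "\n".join(lines), metadata


record = {"key1": "value1", "key2": 123, "key3": ["item1", "item2"]}
text, metadata = record_to_text_and_metadata(record, ["key1", "key3"])
print(text)
print(metadata)
```

The two return values can then be passed on as Document(page_content=text, metadata=metadata).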

Common variations

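If the JSON file holds a list, you can create one Document per item. You can also read the file asynchronously with aiofiles (pip install aiofiles) so that the read does not block an event loop. This helps concurrency rather than memory use, because the whole file is still parsed at once.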

python
import json
# langchain_core is installed alongside langchain; langchain.schema is the older import path
from langchain_core.documents import Document

# Load JSON file and create multiple documents if JSON is a list
with open("data_list.json", "r", encoding="utf-8") as f:
    data_list = json.load(f)

# Assume data_list is a list of dicts
documents = [Document(page_content=json.dumps(item)) for item in data_list]

for doc in documents:
    print(doc.page_content)

# Async example (requires: pip install aiofiles)
import aiofiles
import asyncio

async def load_json_async(path):
    async with aiofiles.open(path, mode='r', encoding='utf-8') as f:
        content = await f.read()
    return json.loads(content)

async def main():
    data = await load_json_async("data.json")
    print(data)

# asyncio.run(main())  # Uncomment to run async example
output
{...}  # Prints each JSON object as a string
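
The list example above assumes the top level of the file is a list. When you do not know the shape in advance, one possible approach is to normalize it first and record where each item came from. In this sketch, iter_records and to_document_kwargs are hypothetical helpers, and the "source" and "index" metadata keys are an assumed convention rather than a LangChain requirement.

```python
import json


def iter_records(data):
    """Yield (index, record) pairs from parsed JSON of either shape.

    A top-level list yields one pair per element; any other value
    (object, string, number) is treated as a single record.
    """
    if isinstance(data, list):
        for index, item in enumerate(data):
            yield index, item
    else:
        yield 0, data


def to_document_kwargs(data, source):
    """Build keyword arguments for one Document per record."""
    return [
        {
            "page_content": json.dumps(item, ensure_ascii=False),
            "metadata": {"source": source, "index": index},
        }
        for index, item in iter_records(data)
    ]


kwargs_list = to_document_kwargs([{"id": 1}, {"id": 2}], "data_list.json")
print(len(kwargs_list))
print(kwargs_list[1])
```

Each dict can be unpacked into a document with Document(**kwargs), which keeps the shape handling testable without importing LangChain.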

Troubleshooting

  • If you see FileNotFoundError, verify the JSON file path is correct.
  • If JSON parsing fails with json.JSONDecodeError, check the file for invalid syntax such as trailing commas or single-quoted strings.
  • For encoding errors, ensure the file is UTF-8 encoded.
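
The three failure modes above can be caught in one place. The following is a minimal sketch using only the standard library; load_json_checked is a hypothetical helper name and the messages are illustrative.

```python
import json


def load_json_checked(path):
    """Load a UTF-8 JSON file, re-raising failures with clearer messages."""
    try:
        with open(path, "r", encoding="utf-8") as f:
            return json.load(f)
    except FileNotFoundError as err:
        raise FileNotFoundError(
            f"No file at {path!r}; check the path and working directory."
        ) from err
    except json.JSONDecodeError as err:
        # lineno and colno point at the position where parsing failed.
        raise ValueError(
            f"{path!r} is not valid JSON "
            f"(line {err.lineno}, column {err.colno}): {err.msg}"
        ) from err
    except UnicodeDecodeError as err:
        raise ValueError(
            f"{path!r} is not UTF-8 encoded; re-save it as UTF-8."
        ) from err
```

Calling load_json_checked("data.json") in place of the open and json.load pair keeps the rest of the loading code unchanged.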

Key Takeaways

  • Use Python's built-in json module to parse JSON files before creating LangChain Document objects.
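  • LangChain also ships a JSONLoader (in langchain_community, which requires the jq package); the manual approach shown here avoids that extra dependency and gives you full control over how JSON becomes text or multiple documents.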
  • For large JSON files, split JSON arrays into one document per item so each record can be processed separately. Async file reading keeps an event loop responsive but does not reduce memory use.
Verified 2026-04