How to use SimpleDirectoryReader in LlamaIndex
Quick answer
Use SimpleDirectoryReader from llama_index to load all documents from a directory into a list of Document objects. Instantiate it with the directory path, then call load_data() to read the files for indexing or processing.
PREREQUISITES
- Python 3.8+
- pip install "llama-index>=0.6.0"
- Basic knowledge of Python file paths
Setup
Install the llama-index package via pip and prepare your environment.
pip install "llama-index>=0.6.0"
Step by step
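Use SimpleDirectoryReader to load documents from a directory and print their contents. The import below matches the 0.6.x package layout; from llama-index 0.10 onward the class is imported from llama_index.core instead.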
from llama_index import SimpleDirectoryReader
# Specify the directory containing your text files
directory_path = './data'
# Initialize the reader
reader = SimpleDirectoryReader(directory_path)
# Load documents from the directory
documents = reader.load_data()
# Print the content of each loaded document
for i, doc in enumerate(documents):
    print(f"Document {i+1} content:\n{doc.get_text()}\n---")
output
Document 1 content:
This is the content of the first document.
---
Document 2 content:
This is the content of the second document.
---
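To reproduce output like the above without preparing files by hand, the following sketch writes two sample text files to a temporary directory and, if a 0.6.x-style llama-index install is importable, loads them. The helper name make_sample_dir and the file names doc1.txt and doc2.txt are assumptions made for illustration; they are not part of LlamaIndex.

```python
import tempfile
from pathlib import Path


def make_sample_dir() -> Path:
    """Create a temporary directory holding two small text files (hypothetical helper)."""
    root = Path(tempfile.mkdtemp(prefix="sdr_demo_"))
    (root / "doc1.txt").write_text(
        "This is the content of the first document.", encoding="utf-8"
    )
    (root / "doc2.txt").write_text(
        "This is the content of the second document.", encoding="utf-8"
    )
    return root


if __name__ == "__main__":
    sample_dir = make_sample_dir()
    print(f"Sample files written to {sample_dir}")
    try:
        # Import path used in this article (0.6.x layout).
        from llama_index import SimpleDirectoryReader
    except ImportError:
        print("llama_index (0.6.x layout) is not importable; skipping the load step.")
    else:
        documents = SimpleDirectoryReader(str(sample_dir)).load_data()
        print(f"Loaded {len(documents)} documents.")
```

The order in which documents come back depends on how the reader lists the directory, so avoid relying on doc1.txt being first.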
Common variations
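- Use SimpleDirectoryReader with a file_extractor parameter to customize file parsing.
- Use the required_exts parameter to load only files with specific extensions.
- Combine with llama_index indexes like GPTVectorStoreIndex for retrieval.
- load_data() is a blocking call; in async code, run it in a worker thread, or use an async loader if your installed version provides one.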
from llama_index import SimpleDirectoryReader
# Filter by extension with required_exts (e.g., only .txt files).
# file_extractor is not a filter: it maps file extensions to custom parsers.
reader = SimpleDirectoryReader(
    './data',
    required_exts=['.txt']
)
documents = reader.load_data()
print(f"Loaded {len(documents)} text files.")
output
Loaded 3 text files.
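For the async variation, one option is to push the blocking load_data() call onto a worker thread so the event loop stays responsive. The sketch below assumes only that the reader exposes a synchronous load_data() method. The helper load_data_async and the FakeReader stand-in are hypothetical names used so the example runs without llama-index installed; in practice you would pass a SimpleDirectoryReader instance.

```python
import asyncio
from functools import partial


async def load_data_async(reader, *args, **kwargs):
    """Run reader.load_data() in the default thread pool (hypothetical helper)."""
    loop = asyncio.get_running_loop()
    call = partial(reader.load_data, *args, **kwargs)
    return await loop.run_in_executor(None, call)


class FakeReader:
    """Stand-in with the same load_data() shape as a directory reader."""

    def load_data(self):
        return ["doc one", "doc two"]


if __name__ == "__main__":
    docs = asyncio.run(load_data_async(FakeReader()))
    print(f"Loaded {len(docs)} documents.")
```

run_in_executor is used here rather than asyncio.to_thread because the latter requires Python 3.9, while this article targets Python 3.8+.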
Troubleshooting
- If no documents load, verify the directory path is correct and contains supported files.
- Ensure files are readable text formats; depending on the installed version, binary or unsupported files may be skipped or read as garbled text.
- Check for permission errors on the directory or files.
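The checks above can be sketched as a small pre-flight function that runs before the reader is constructed. It uses only the standard library; the name check_input_dir, the NUL-byte heuristic for spotting binary files, and the 4096-byte sample size are assumptions made for illustration, not LlamaIndex behavior. It inspects only the top level of the directory.

```python
import os
from pathlib import Path


def check_input_dir(directory):
    """Return a list of problems found in `directory`; empty means it looks loadable."""
    root = Path(directory)
    if not root.exists():
        return [f"{root} does not exist"]
    if not root.is_dir():
        return [f"{root} is not a directory"]
    if not os.access(root, os.R_OK | os.X_OK):
        return [f"{root} is not readable by the current user"]

    problems = []
    files = sorted(p for p in root.iterdir() if p.is_file())
    if not files:
        problems.append(f"{root} contains no files")
    for path in files:
        if not os.access(path, os.R_OK):
            problems.append(f"{path.name}: permission denied")
            continue
        # Heuristic: text files rarely contain NUL bytes, binary files usually do.
        with path.open("rb") as handle:
            if b"\x00" in handle.read(4096):
                problems.append(f"{path.name}: looks binary (contains NUL bytes)")
    return problems


if __name__ == "__main__":
    for problem in check_input_dir("./data"):
        print(problem)
```

An empty result does not guarantee a successful load; it only rules out the most common causes listed above.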
Key Takeaways
- Use SimpleDirectoryReader to quickly load all documents from a directory for indexing.
- Filter files by extension with the required_exts parameter; use file_extractor to customize how each file type is parsed.
- Always verify the directory path and file permissions to avoid empty or failed loads.