High severity beginner · Fix: 2-5 min

FileNotFoundError

langchain_community.document_loaders.pdf.PyPDFLoader: FileNotFoundError: [Errno 2] No such file or directory

What this error means
PyPDFLoader raises FileNotFoundError when the PDF file path doesn't exist, is relative to the wrong working directory, or contains typos.

Stack trace

traceback
FileNotFoundError: [Errno 2] No such file or directory: '/path/to/document.pdf'
  File "/usr/local/lib/python3.11/site-packages/langchain_community/document_loaders/pdf.py", line 45, in load
    with open(self.file_path, 'rb') as f:
FileNotFoundError: [Errno 2] No such file or directory: '/path/to/document.pdf'
QUICK FIX
Replace relative paths with absolute paths using pathlib: loader = PyPDFLoader(str(Path(__file__).parent / 'documents' / 'file.pdf'))

Why it happens

PyPDFLoader attempts to open a PDF file at the specified path using Python's open() function. If the path doesn't exist, is misspelled, uses a relative path that doesn't resolve from the current working directory, or the file was moved/deleted between script start and load time, Python raises FileNotFoundError. This is especially common when mixing absolute and relative paths, or when running scripts from different directories than expected.

Detection

Before calling loader.load(), verify the file exists using os.path.exists(file_path) or pathlib.Path(file_path).is_file(). Log the working directory and resolved absolute path to catch path resolution issues early in your pipeline.

Causes & fixes

1

Using a relative file path when the current working directory is different from the script location

✓ Fix

Use an absolute path constructed from __file__: from pathlib import Path; pdf_path = Path(__file__).parent / 'documents' / 'file.pdf'

2

File path contains a typo or the file doesn't exist at that location

✓ Fix

Verify the path exists before loading: if not Path(pdf_path).is_file(): raise FileNotFoundError(f'PDF not found at {pdf_path}')

3

File is located in a different directory (e.g., data/ or downloads/) but path assumes it's in the current directory

✓ Fix

Use absolute paths or construct paths relative to the script: pdf_path = Path(__file__).parent / 'data' / 'documents' / 'file.pdf'

4

Running the script from a different directory than where the relative path assumes (e.g., from project root vs subdirectory)

✓ Fix

Use os.chdir() to set working directory explicitly, or always use absolute paths: os.chdir(Path(__file__).parent); loader = PyPDFLoader('./documents/file.pdf')

Code: broken vs fixed

Broken - triggers the error
python
import os
from langchain_community.document_loaders import PyPDFLoader

# BROKEN: relative path that fails if working directory is wrong
loader = PyPDFLoader('documents/report.pdf')  # FileNotFoundError if not run from correct dir
docs = loader.load()
print(f'Loaded {len(docs)} pages')
Fixed - works correctly
python
import os
from pathlib import Path
from langchain_community.document_loaders import PyPDFLoader

# FIXED: absolute path using __file__ — works from any directory
script_dir = Path(__file__).parent
pdf_path = script_dir / 'documents' / 'report.pdf'

# Verify file exists before loading
if not pdf_path.is_file():
    raise FileNotFoundError(f'PDF not found at {pdf_path}')

loader = PyPDFLoader(str(pdf_path))
docs = loader.load()
print(f'Loaded {len(docs)} pages from {pdf_path}')
Changed from relative path 'documents/report.pdf' to absolute path using Path(__file__).parent, which resolves correctly regardless of where the script is executed from, and added pre-load validation with is_file().

Workaround

If you can't immediately fix the path resolution, use try/except to catch FileNotFoundError, log the attempted path and current working directory for debugging, then check sys.argv[0] or os.getcwd() to understand where the script is actually running from. Alternatively, accept the PDF path as a command-line argument: pdf_path = sys.argv[1] if len(sys.argv) > 1 else 'documents/report.pdf'

Prevention

Always use absolute paths derived from __file__ for file operations in Python packages and command-line tools. For web applications, accept file paths as configuration (environment variables or config files) rather than hardcoding them. For data pipelines, make the input file path an explicit parameter (function argument or CLI flag) rather than assuming a fixed location.

Python 3.9+ · langchain-community >=0.0.10 · tested on 0.1.x
Verified 2026-04
Verify ↗

Community Notes

No notes yetBe the first to share a version-specific fix or tip.