AttributeError: module 'PyPDF2' has no attribute 'PdfFileReader'
Stack trace
Traceback (most recent call last):
  File "script.py", line 5, in <module>
    pdf = PyPDF2.PdfFileReader(open('document.pdf', 'rb'))
  File "/usr/local/lib/python3.x/site-packages/PyPDF2/__init__.py", line 42, in __getattr__
    raise AttributeError(f"module '{__name__}' has no attribute '{name}'")
AttributeError: module 'PyPDF2' has no attribute 'PdfFileReader'
Suggestion: use PdfReader instead. For example: `from PyPDF2 import PdfReader`
Why it happens
PyPDF2 3.0.0 (released in December 2022) removed the deprecated `PdfFileReader` and `PdfFileWriter` classes as part of a major cleanup. They were replaced by `PdfReader` and `PdfWriter`, which follow the library's newer naming conventions. If you run PyPDF2 v3.0 or later with code written for the older API, the old class names are no longer usable and the first reference to one of them fails. Depending on the release, the failure can also surface as a `DeprecationError` that names the replacement class; the fix is the same.
Detection
Scan your codebase for `PdfFileReader` and `PdfFileWriter`, whether they are imported by name or accessed as `PyPDF2.PdfFileReader`. Run `pip show PyPDF2` to check the installed version; at 3.0.0 or later, the old API is gone. Add a CI check that imports your PDF module early in the test run so the failure shows up before production deployment.
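The scan and version check described above can be sketched with the standard library alone. This is a minimal illustration, not a complete linter: the helper names `find_old_names`, `scan_tree`, and `installed_major` are hypothetical, and the pattern only covers the two class names this page mentions.

```python
import re
from importlib import metadata
from pathlib import Path

# Class names removed in PyPDF2 3.0, as listed on this page.
OLD_NAMES = re.compile(r"\b(PdfFileReader|PdfFileWriter)\b")


def find_old_names(source: str) -> list[tuple[int, str]]:
    """Return (line_number, name) pairs for each old class name in the text."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for match in OLD_NAMES.finditer(line):
            hits.append((lineno, match.group(1)))
    return hits


def scan_tree(root: str) -> dict[str, list[tuple[int, str]]]:
    """Scan every .py file under root and collect the hits per file."""
    report = {}
    for path in Path(root).rglob("*.py"):
        text = path.read_text(encoding="utf-8", errors="ignore")
        hits = find_old_names(text)
        if hits:
            report[str(path)] = hits
    return report


def installed_major(dist_name: str = "PyPDF2"):
    """Return the installed major version, or None if the package is absent."""
    try:
        return int(metadata.version(dist_name).split(".")[0])
    except metadata.PackageNotFoundError:
        return None
```

Calling `scan_tree(".")` from the project root returns a mapping from file path to the offending lines. If `installed_major()` returns 3 or more and the report is not empty, the code needs migrating.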
Causes & fixes
Cause: Running PyPDF2 v3.0+ with code that still imports `PdfFileReader`
Fix: Replace `from PyPDF2 import PdfFileReader` with `from PyPDF2 import PdfReader` and change every `PdfFileReader()` call to `PdfReader()`.
Cause: Upgrading PyPDF2 to v3.0+ without updating imports and method calls
Fix: After upgrading, search and replace all occurrences: `PdfFileReader` → `PdfReader`, `PdfFileWriter` → `PdfWriter`, and `writer.addPage()` → `writer.add_page()`.
Cause: Assuming the open file object passed to the constructor is what broke
Fix: The argument is not the problem. `PdfReader` accepts a file path string, a `Path` object, or a binary file object, so `PdfReader(open('document.pdf', 'rb'))` works once the class name is updated. `PdfReader('document.pdf')` is simply shorter and leaves the file handling to the library.
Cause: requirements.txt pins PyPDF2 v2.x, but the environment has v3.0+ installed (for example after a global upgrade)
Fix: Either reinstall from the pin (`pip install -r requirements.txt`, ideally in a virtual environment) or update requirements.txt to `PyPDF2>=3.0.0` and update all imports and method calls to match the v3.0+ API.
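The search-and-replace step can be sketched as a small script. This is an illustrative sketch, not an official migration tool: `RENAMES` and `migrate_source` are hypothetical names, and the table only holds the three one-to-one renames listed above.

```python
import re

# Old name -> new name, limited to the renames mentioned on this page.
RENAMES = {
    "PdfFileReader": "PdfReader",
    "PdfFileWriter": "PdfWriter",
    "addPage": "add_page",
}

# Match whole identifiers only, so names that merely contain an old name
# (such as my_addPage_helper) are left untouched.
_PATTERN = re.compile(r"\b(" + "|".join(map(re.escape, RENAMES)) + r")\b")


def migrate_source(source: str) -> str:
    """Rewrite whole-word occurrences of the old names to the new ones."""
    return _PATTERN.sub(lambda m: RENAMES[m.group(1)], source)
```

Changes that alter the call shape, such as `reader.getPage(0)` becoming `reader.pages[0]` or `reader.numPages` becoming `len(reader.pages)`, are not plain renames and are better handled by hand. Review the diff before committing.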
Code: broken vs fixed
# Broken: fails with AttributeError in PyPDF2 v3.0+
import PyPDF2

pdf_file = open('document.pdf', 'rb')
reader = PyPDF2.PdfFileReader(pdf_file)  # ❌ PdfFileReader no longer exists
page = reader.getPage(0)  # ❌ getPage() was removed in v3.0+
num_pages = reader.numPages  # ❌ numPages was removed in v3.0+
pdf_file.close()

# Fixed: code for PyPDF2 v3.0+
from PyPDF2 import PdfReader

pdf_path = 'document.pdf'  # ✅ A path works; an open binary file object is also accepted
reader = PdfReader(pdf_path)  # ✅ Use PdfReader instead of PdfFileReader
page = reader.pages[0]  # ✅ Index the pages list instead of calling getPage()
num_pages = len(reader.pages)  # ✅ Use len(reader.pages) instead of numPages
print(f'PDF has {num_pages} pages')
Workaround
If you can't update the code immediately, pin PyPDF2 to a 2.x release in requirements.txt (`PyPDF2==2.11.1`) as a temporary patch. The old code keeps running, but the pin is a stopgap rather than a production fix: plan the migration to v3.0+ promptly, for example within a sprint. Alternatively, write a small compatibility wrapper that detects the installed PyPDF2 version and calls the matching API.
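One possible shape for such a compatibility wrapper is sketched below. The function names `uses_new_api`, `open_reader`, and `page_count` are hypothetical, and the sketch assumes the only differences you need to bridge are the reader class name and the page count. Extend it for any other calls your code makes.

```python
import io
from importlib import metadata
from pathlib import Path


def uses_new_api(version: str) -> bool:
    """True when the version string has a major version of 3 or higher."""
    return int(version.split(".")[0]) >= 3


def open_reader(path: str):
    """Return (reader, new_api) using the class that matches the installed PyPDF2."""
    import PyPDF2  # imported lazily so this module loads even without PyPDF2

    new_api = uses_new_api(metadata.version("PyPDF2"))
    # Read the file into memory so no file handle is left open afterwards.
    stream = io.BytesIO(Path(path).read_bytes())
    # Only the chosen branch is evaluated, so the missing name is never touched.
    reader_cls = PyPDF2.PdfReader if new_api else PyPDF2.PdfFileReader
    return reader_cls(stream), new_api


def page_count(reader, new_api: bool) -> int:
    """Return the page count through whichever attribute the API provides."""
    return len(reader.pages) if new_api else reader.numPages
```

A caller would write `reader, new_api = open_reader('document.pdf')` and then `page_count(reader, new_api)`. Keeping the version check in one place makes the wrapper easy to delete once the migration is finished.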
Prevention
In your CI/CD pipeline, add a test that imports and exercises your PDF module early, before PRs are merged. Use `pyproject.toml` or `requirements.txt` with explicit version constraints (`PyPDF2>=3.0.0,<4.0.0`) so that neither an accidental downgrade nor an unplanned major upgrade changes the API under you. Record breaking changes from library upgrades in your team's migration checklist. For new projects, consider adopting `pypdf`, the maintained successor to PyPDF2, where development of the library now continues.