bs4.FeatureNotFound
Stack trace
Traceback (most recent call last):
  File "app.py", line 42, in <module>
    soup = BeautifulSoup(html_chunk, 'html.parser')
  File "/usr/local/lib/python3.9/site-packages/bs4/__init__.py", line 312, in __init__
    self.builder = builder_registry.lookup(parser)
  File "/usr/local/lib/python3.9/site-packages/bs4/builder/_registry.py", line 44, in lookup
    raise FeatureNotFound(
bs4.FeatureNotFound: Couldn't find a tree builder with the features you requested: html.parser. Do you need to install a parser library?
Why it happens
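BeautifulSoup delegates parsing to a backend (a "tree builder") such as 'html.parser', 'lxml', or 'html5lib'. 'html.parser' ships with Python's standard library; 'lxml' and 'html5lib' are separate packages that must be installed. If the requested parser is not available in the environment, the BeautifulSoup constructor raises FeatureNotFound. Malformed input is a separate problem: when chunk boundaries split HTML tags, the parser does not raise FeatureNotFound, but it may silently build an incomplete or incorrect tree.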
Detection
Catch bs4.FeatureNotFound around the BeautifulSoup constructor during chunk parsing, and log the requested parser name together with the HTML chunk content. This makes it possible to tell a missing parser apart from malformed HTML before the app crashes.
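The detection step can be sketched as follows. This is a minimal example, not a definitive implementation: the function name parse_chunk, the logger setup, and the 80-character preview length are assumptions made for illustration.

```python
import logging

from bs4 import BeautifulSoup, FeatureNotFound

logger = logging.getLogger(__name__)


def parse_chunk(html_chunk, parser="html.parser"):
    """Parse one chunk; log context if the parser is unavailable, then re-raise."""
    try:
        return BeautifulSoup(html_chunk, parser)
    except FeatureNotFound:
        # Log the requested parser and a short preview of the chunk so the
        # failure can be traced to a missing library rather than bad markup.
        logger.error(
            "Parser %r is not available; chunk starts with %r",
            parser,
            html_chunk[:80],
        )
        raise
```

Re-raising keeps the failure visible to the caller; the log entry only adds the context needed to diagnose it.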
Causes & fixes
The specified parser (e.g., 'lxml' or 'html5lib') is not installed or available in the environment
Install the required parser library (e.g., for 'lxml' run 'pip install lxml') or switch to 'html.parser', which is part of Python's standard library and needs no extra install
HTML chunks are split mid-tag, so BeautifulSoup receives malformed HTML (this does not raise FeatureNotFound, but it produces an incorrect tree)
Ensure chunk boundaries do not split HTML tags by chunking on safe delimiters or reassembling partial tags before parsing
Using an outdated or incompatible BeautifulSoup version that lacks support for the requested parser
Upgrade BeautifulSoup to a recent version with 'pip install --upgrade beautifulsoup4' to ensure parser compatibility
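For the mid-tag split case, one way to reassemble partial tags before parsing is to hold back any trailing text that starts a tag but has not yet closed it. The sketch below is a heuristic under stated assumptions: the names split_at_tag_boundary and iter_safe_chunks are hypothetical, and the check only compares the positions of the last '<' and '>' so it does not handle '<' inside attribute values, comments, or script bodies.

```python
def split_at_tag_boundary(buffer):
    """Return (complete, remainder), where remainder is an unfinished tag."""
    last_open = buffer.rfind("<")
    last_close = buffer.rfind(">")
    if last_open > last_close:
        # A tag was opened after the last '>' and never closed: hold it back.
        return buffer[:last_open], buffer[last_open:]
    return buffer, ""


def iter_safe_chunks(chunks):
    """Yield text pieces that never end in the middle of a tag."""
    pending = ""
    for chunk in chunks:
        complete, pending = split_at_tag_boundary(pending + chunk)
        if complete:
            yield complete
    if pending:
        # Input ended inside a tag; pass the leftover on unchanged.
        yield pending


pieces = list(iter_safe_chunks(['<div><p cl', 'ass="x">hi</p></d', 'iv>']))
print(pieces)  # ['<div>', '<p class="x">hi</p>', '</div>']
```

Each yielded piece can then be passed to BeautifulSoup, or the pieces can be joined and parsed once, depending on whether elements may span several chunks.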
Code: broken vs fixed
# Broken: requests lxml, which may not be installed
from bs4 import BeautifulSoup
html_chunk = '<div><p>Broken chunk'
soup = BeautifulSoup(html_chunk, 'lxml')  # Raises FeatureNotFound if lxml is not installed
print(soup.prettify())

# Fixed: uses the parser bundled with Python
from bs4 import BeautifulSoup
html_chunk = '<div><p>Broken chunk'
soup = BeautifulSoup(html_chunk, 'html.parser')  # Changed to built-in parser to fix error
print(soup.prettify())
Workaround
Wrap BeautifulSoup parsing in try/except bs4.FeatureNotFound and fall back to 'html.parser'. Separately, sanitize or reassemble chunks before parsing so that malformed HTML does not reach the parser.
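A minimal sketch of the fallback, assuming a hypothetical helper named make_soup whose preferred and fallback parameters are illustrative defaults:

```python
from bs4 import BeautifulSoup, FeatureNotFound


def make_soup(markup, preferred="lxml", fallback="html.parser"):
    """Try the preferred parser first; use the fallback if it is unavailable."""
    try:
        return BeautifulSoup(markup, preferred)
    except FeatureNotFound:
        # The preferred parser is not installed in this environment.
        return BeautifulSoup(markup, fallback)


soup = make_soup("<p>hello</p>")
print(soup.p.get_text())
```

Different parsers can build different trees from the same invalid markup, so a silent fallback may change results between environments; logging which parser was actually used is a reasonable addition.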
Prevention
Use chunking logic that never splits an HTML tag, and make sure the required parser libraries are installed and compatible with your BeautifulSoup version before the app starts parsing.
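One way to verify parser availability up front is a startup check that uses only the standard library. The sketch below assumes that a parser is usable when its Python module can be located; the PARSER_MODULES mapping and the function names are hypothetical, and the check is a proxy rather than a query of BeautifulSoup's own registry.

```python
import importlib.util

# bs4 feature name -> module that must be importable (assumed mapping)
PARSER_MODULES = {
    "html.parser": "html.parser",
    "lxml": "lxml",
    "html5lib": "html5lib",
}


def installed_parsers():
    """Return the parser names whose backing module can be found."""
    return [
        name
        for name, module in PARSER_MODULES.items()
        if importlib.util.find_spec(module) is not None
    ]


def pick_parser(preferences):
    """Return the first preferred parser that is installed, or raise."""
    available = installed_parsers()
    for name in preferences:
        if name in available:
            return name
    raise RuntimeError(f"None of the parsers {list(preferences)!r} is installed")


print(pick_parser(["lxml", "html5lib", "html.parser"]))
```

Calling pick_parser once at startup turns a missing dependency into an immediate, clearly worded failure instead of a FeatureNotFound raised in the middle of chunk processing.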