How to build a web scraping MCP server
Quick answer
Build a web scraping MCP server by implementing the mcp.server interface in Python and exposing web scraping functions as MCP tools. Use mcp.server.stdio.stdio_server to run the server, enabling AI agents to invoke scraping tasks via the MCP protocol.
Prerequisites
- Python 3.10+
- pip install mcp requests beautifulsoup4
- Basic knowledge of web scraping and Python async programming
Setup
Install the required Python packages for MCP and web scraping:
- mcp for the Model Context Protocol server
- requests for HTTP requests
- beautifulsoup4 for HTML parsing
Set up your environment with Python 3.10 or higher, which the mcp package requires.
pip install mcp requests beautifulsoup4
Step by step
Create a Python MCP server that exposes a web scraping tool. The tool fetches a URL and extracts the page title. Use mcp.server.stdio.stdio_server to run the server over standard IO, which is the recommended transport for MCP.
import asyncio
import requests
from bs4 import BeautifulSoup
import mcp.types as types
from mcp.server import Server
from mcp.server.stdio import stdio_server

class WebScraper:
    async def scrape_title(self, url: str) -> str:
        # requests is blocking; acceptable for a simple example
        response = requests.get(url, timeout=10)
        response.raise_for_status()
        soup = BeautifulSoup(response.text, 'html.parser')
        return soup.title.string if soup.title else 'No title found'

scraper = WebScraper()
server = Server('web-scraper')

@server.list_tools()
async def list_tools() -> list[types.Tool]:
    # Advertise the tool so MCP clients can discover it
    return [types.Tool(
        name='scrape_title',
        description='Fetch a URL and return the page title',
        inputSchema={'type': 'object',
                     'properties': {'url': {'type': 'string'}},
                     'required': ['url']},
    )]

@server.call_tool()
async def call_tool(name: str, arguments: dict) -> list[types.TextContent]:
    if name != 'scrape_title':
        raise ValueError(f'Unknown tool: {name}')
    title = await scraper.scrape_title(arguments['url'])
    return [types.TextContent(type='text', text=title)]

async def main():
    async with stdio_server() as (read_stream, write_stream):
        await server.run(read_stream, write_stream,
                         server.create_initialization_options())

if __name__ == '__main__':
    asyncio.run(main())
Common variations
You can extend the MCP server to scrape other elements by adding more async scraping methods and registering them as tools. For example, scrape meta descriptions or links.
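For instance, extracting links comes down to collecting href attributes with BeautifulSoup. A minimal sketch of the parsing step (the function name is illustrative):

```python
from bs4 import BeautifulSoup

def extract_links(html: str) -> list[str]:
    # Collect the href attribute of every anchor tag that has one
    soup = BeautifulSoup(html, 'html.parser')
    return [a['href'] for a in soup.find_all('a', href=True)]

# Anchors without an href are skipped
html = '<a href="/one">1</a><a>no link</a><a href="https://example.com">2</a>'
print(extract_links(html))  # ['/one', 'https://example.com']
```

A tool method would fetch the page with requests first, then hand response.text to a parser like this.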
Use other MCP transports such as SSE over HTTP if needed, but stdio_server is simplest for local or containerized deployments.
Integrate with AI agents by connecting the MCP server's stdio to the agent's MCP client.
    async def scrape_meta_description(self, url: str) -> str:
        response = requests.get(url, timeout=10)
        response.raise_for_status()
        soup = BeautifulSoup(response.text, 'html.parser')
        meta = soup.find('meta', attrs={'name': 'description'})
        return meta['content'] if meta and 'content' in meta.attrs else 'No meta description found'
Troubleshooting
- If the server does not start, ensure Python 3.10+ is used and all dependencies are installed.
- If scraping fails, check network connectivity and that the target website allows scraping.
- Handle exceptions in scraping methods to avoid server crashes.
- Use logging inside the MCP server to debug requests and responses; with the stdio transport, log to stderr, since stdout is reserved for protocol messages.
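The last two points can be combined in a small wrapper. A sketch (the helper name is illustrative) that logs to stderr and turns scraping failures into error strings instead of crashing the server:

```python
import logging
import sys

# With the stdio transport, stdout carries protocol messages, so log to stderr
logging.basicConfig(stream=sys.stderr, level=logging.INFO)
logger = logging.getLogger('web-scraper')

def safe_scrape(scrape, url: str) -> str:
    # Wrap a scraping call so a failure is logged and reported,
    # rather than propagating and killing the server process
    try:
        return scrape(url)
    except Exception as exc:
        logger.warning('scrape failed for %s: %s', url, exc)
        return f'Error: {exc}'

# Illustrative use with a scraper that always fails
def broken(url):
    raise ValueError('connection refused')

print(safe_scrape(broken, 'https://example.com'))  # Error: connection refused
```

The same try/except pattern can wrap each tool handler in the server above.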
Key Takeaways
- Use the mcp Python package to build MCP servers exposing web scraping tools.
- Run the MCP server with stdio_server for simple, robust communication.
- Implement async scraping methods to handle web requests and parsing efficiently.
- Extend the server with multiple scraping functions to support diverse AI agent queries.
- Handle errors gracefully and log activity for easier troubleshooting.