
AI for pathology slide analysis

Quick answer
Use specialized AI models like convolutional neural networks (CNNs) or vision transformers with frameworks such as PyTorch or TensorFlow to analyze pathology slides. Pretrained models or custom training on annotated whole-slide images enable automated detection and classification of tissue abnormalities.

PREREQUISITES

  • Python 3.8+
  • pip install torch torchvision numpy matplotlib openslide-python scikit-learn
  • Access to pathology slide image datasets (e.g., whole-slide images in .svs or .tiff format)

Setup

Install essential Python libraries for pathology slide analysis, including torch for deep learning, openslide-python for reading whole-slide images, and matplotlib for visualization.

bash
pip install torch torchvision numpy matplotlib openslide-python scikit-learn
output
Collecting torch
Collecting torchvision
Collecting numpy
Collecting matplotlib
Collecting openslide-python
Collecting scikit-learn
Successfully installed torch torchvision numpy matplotlib openslide-python scikit-learn
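
Because pip can report success while a package still fails to import, a quick environment check may help before moving on. The following is a minimal sketch and not part of the original setup: the helper name missing_packages is hypothetical, and the list uses each package's import name (sklearn for scikit-learn, openslide for openslide-python).

```python
from importlib.util import find_spec

# Import names (not pip names) for the packages installed above.
REQUIRED = ["torch", "torchvision", "numpy", "matplotlib", "openslide", "sklearn"]

def missing_packages(names):
    """Return the names that Python cannot locate on the import path."""
    return [name for name in names if find_spec(name) is None]

if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages were found.")
```

This only confirms that each package can be located. Importing openslide can still fail if the native OpenSlide library is absent from the system.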

Step by step

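This example loads a whole-slide image, extracts one patch from its center, and runs that patch through an ImageNet-pretrained ResNet18. Because ImageNet contains no tissue classes, the predictions are not pathology labels. The example only demonstrates the core workflow (read, patch, preprocess, infer) that a pathology-trained model would plug into.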

python
import os
import numpy as np
import torch
import torchvision.transforms as transforms
from torchvision.models import resnet18
import openslide
import matplotlib.pyplot as plt

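# Load whole-slide image; the path is read from the PATHOLOGY_SLIDE_PATH environment variable
slide_path = os.environ.get('PATHOLOGY_SLIDE_PATH')
if not slide_path:
    raise SystemExit('Set PATHOLOGY_SLIDE_PATH to a .svs or .tiff slide file')
slide = openslide.OpenSlide(slide_path)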

# Define patch extraction parameters
patch_size = 224
level = 0  # highest resolution

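# Extract a patch from the center.
# read_region takes the location in level-0 coordinates and the size in pixels
# at the requested level; the two frames coincide here only because level = 0.
width, height = slide.level_dimensions[level]
x = width // 2 - patch_size // 2
y = height // 2 - patch_size // 2
# read_region returns an RGBA PIL image, so convert it to RGB for the model
patch = slide.read_region((x, y), level, (patch_size, patch_size)).convert('RGB')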

# Preprocess patch
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225])
])
input_tensor = transform(patch).unsqueeze(0)  # batch dimension

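# Load an ImageNet-pretrained CNN (ResNet18) for demonstration.
# torchvision >= 0.13 uses the weights argument; pretrained=True is deprecated.
model = resnet18(weights='IMAGENET1K_V1')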
model.eval()

# Inference
with torch.no_grad():
    output = model(input_tensor)
    probabilities = torch.nn.functional.softmax(output[0], dim=0)

# Show top 3 predicted ImageNet classes (for demo only)

# Download ImageNet class labels
labels_url = 'https://raw.githubusercontent.com/pytorch/hub/master/imagenet_classes.txt'
labels_path = 'imagenet_classes.txt'
if not os.path.exists(labels_path):
    import urllib.request
    urllib.request.urlretrieve(labels_url, labels_path)

with open(labels_path) as f:
    labels = [line.strip() for line in f.readlines()]

# Print top 3 predictions
_, indices = torch.topk(probabilities, 3)
print('Top 3 predictions for patch:')
for idx in indices:
    print(f'{labels[idx]}: {probabilities[idx].item():.4f}')

# Visualize patch
plt.imshow(patch)
plt.title('Extracted patch from slide')
plt.axis('off')
plt.show()
output
Top 3 predictions for patch:
chain mail: 0.1234
chain saw: 0.0987
chainlink fence: 0.0765
# (illustrative values; the actual classes and probabilities depend on the slide)
# (a window showing the extracted patch also opens)

Common variations

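  • Use pathology-specific models instead of ImageNet weights, for example CLAM for weakly supervised slide-level classification or HoVer-Net for nuclear segmentation and classification.
  • Run patch-level inference over a sliding-window grid to cover the entire slide.
  • Move the model and input tensors to a GPU with .to('cuda') when torch.cuda.is_available() returns True.
  • Apply data augmentation during training (for example flips, rotations, and color jitter) to improve model robustness.
  • Load patches in background worker processes, for example with torch.utils.data.DataLoader and num_workers > 0, when working with large datasets.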
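
The sliding-window variation above can be sketched as a generator of patch coordinates. This is a minimal illustration under stated assumptions: the function name tile_origins and its parameters are hypothetical, only patches that fit fully inside the image are produced, and background filtering is left out.

```python
def tile_origins(width, height, patch_size=224, stride=224):
    """Yield (x, y) top-left corners of patches that fit fully inside the image.

    Coordinates are in the pixel frame of the dimensions passed in; pass the
    level-0 dimensions to get locations usable with read_region at level 0.
    """
    if patch_size <= 0 or stride <= 0:
        raise ValueError("patch_size and stride must be positive")
    for y in range(0, height - patch_size + 1, stride):
        for x in range(0, width - patch_size + 1, stride):
            yield (x, y)

# Example with a small hypothetical 500 x 300 pixel region
origins = list(tile_origins(500, 300, patch_size=224, stride=224))
print(origins)  # [(0, 0), (224, 0)]
```

Each origin could then be passed to slide.read_region((x, y), 0, (patch_size, patch_size)) as in the main example. A stride smaller than patch_size yields overlapping patches, and pixels at the right and bottom edges that do not fill a whole patch are skipped.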

Troubleshooting

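  • If openslide.OpenSlideError occurs, check that the file is in a format OpenSlide supports and is not corrupted. If import openslide itself fails, verify that the OpenSlide library is installed on your system.
  • For CUDA errors, ensure the GPU driver supports the CUDA version your PyTorch build targets (see torch.version.cuda).
  • Low accuracy? ImageNet weights are not trained on tissue; fine-tune the model on annotated pathology datasets.
  • Memory errors? Read the slide patch by patch instead of loading it whole, or use a lower-resolution level (level > 0).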
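
When switching to a lower-resolution level, note that OpenSlide's read_region expects the location in level-0 coordinates while the size is given in pixels at the requested level. The sketch below shows one way to convert a centered patch position; the helper name center_patch_location is hypothetical, and the dimensions and downsample factor would come from slide.level_dimensions[level] and slide.level_downsamples[level].

```python
def center_patch_location(level_dims, downsample, patch_size=224):
    """Return the level-0 (x, y) location of a patch centered at a given level.

    level_dims: (width, height) of the chosen level
    downsample: that level's downsample factor relative to level 0
    """
    level_w, level_h = level_dims
    # Top-left corner of the centered patch, in the chosen level's pixel frame
    x_level = max(level_w // 2 - patch_size // 2, 0)
    y_level = max(level_h // 2 - patch_size // 2, 0)
    # read_region expects the location in level-0 coordinates
    return (int(x_level * downsample), int(y_level * downsample))

# Hypothetical level that is 4x smaller than level 0
print(center_patch_location((1000, 800), 4.0))  # (1552, 1152)
```

The result could be used as slide.read_region(location, level, (patch_size, patch_size)). At a downsample factor of 4, a 224-pixel patch covers an 896 x 896 pixel area of level 0, so far fewer patches are needed to cover the slide.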

Key Takeaways

  • Use openslide-python to read whole-slide pathology images efficiently.
  • Extract fixed-size patches and run them through a CNN; use weights trained or fine-tuned on pathology data for meaningful tissue classification.
  • Leverage domain-specific models and GPU acceleration for better performance.
  • Handle large slide data by patching and multi-resolution analysis.
  • Troubleshoot common errors by verifying file formats and environment setup.
Verified 2026-04 · resnet18, CLAM, HoVer-Net