How-to · Intermediate · 4 min read

How to use TorchServe to serve a PyTorch model

Quick answer
Deploy PyTorch models with TorchServe in two steps: first package your model into a .mar archive with torch-model-archiver, then start the server with torchserve to expose the model over REST APIs. This gives you scalable, production-ready inference for PyTorch models.

PREREQUISITES

  • Python 3.8+
  • PyTorch installed (compatible version)
  • Java 11+ (required by the TorchServe frontend)
  • pip install torchserve torch-model-archiver
  • A trained PyTorch model file (.pt or .pth)

Set up the TorchServe environment

Install torchserve and torch-model-archiver via pip. Prepare your trained PyTorch model and optionally a handler script for custom inference logic.

bash
pip install torchserve torch-model-archiver
output
Collecting torchserve
Collecting torch-model-archiver
Successfully installed torchserve-0.6.0 torch-model-archiver-0.6.0

Step-by-step model packaging and serving

1. Export your PyTorch model to a .mar archive with torch-model-archiver.
2. Start TorchServe to serve the model.
3. Query the model via the REST API.

Note that the packaging step below assumes resnet18.pt is a TorchScript export; for an eager-mode checkpoint, torch-model-archiver also needs the model definition passed via --model-file.

bash
MODEL_NAME=resnet18
MODEL_FILE=resnet18.pt
HANDLER=imagenet_handler.py  # custom handler script, or a built-in name such as image_classifier

# Step 1: Create MAR file
torch-model-archiver --model-name $MODEL_NAME \
  --version 1.0 \
  --serialized-file $MODEL_FILE \
  --handler $HANDLER \
  --export-path model_store \
  --force

# Step 2: Start TorchServe
torchserve --start --model-store model_store --models $MODEL_NAME.mar

# Step 3: Query the model
curl -X POST http://127.0.0.1:8080/predictions/$MODEL_NAME -T sample_image.jpg
output
Model resnet18 archived successfully to model_store/resnet18.mar
TorchServe started.
Model resnet18 is loaded.
{"class":"tabby_cat","probability":0.95}

Common variations

  • Use custom handlers for preprocessing/postprocessing by specifying --handler.
  • Serve multiple models by listing them in the --models parameter.
  • Run TorchServe in Docker for containerized deployment.
  • Use torchserve --stop to stop the server.
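A custom handler is a Python class that TorchServe loads from the file passed to --handler. Real handlers extend ts.torch_handler.base_handler.BaseHandler and run the model on tensors; the sketch below stubs out the model call in plain Python so the shape of the preprocess/inference/postprocess pipeline is visible. The class name, label table, and scores are illustrative:

```python
# Sketch of a TorchServe-style handler. A real handler extends
# ts.torch_handler.base_handler.BaseHandler and calls self.model;
# here the model call is stubbed so the pipeline shape is clear.
class ImageNetHandler:
    # illustrative label table; a real handler loads index_to_name.json
    LABELS = {0: "tabby_cat", 1: "golden_retriever"}

    def preprocess(self, data):
        # TorchServe passes a list of request dicts; the payload sits
        # under "body" or "data" depending on the client.
        return [row.get("body") or row.get("data") for row in data]

    def inference(self, batch):
        # stub: a real handler would run self.model(batch) here
        return [[0.95, 0.05] for _ in batch]

    def postprocess(self, outputs):
        # map the highest-scoring index to a class name, per request
        results = []
        for scores in outputs:
            best = max(range(len(scores)), key=scores.__getitem__)
            results.append({"class": self.LABELS[best],
                            "probability": scores[best]})
        return results
```

Package the real version with --handler my_handler.py, exactly as in the archiving step above.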

Troubleshooting tips

  • If the model fails to load, verify the .mar file and handler script paths.
  • Check the logs/ directory for detailed errors.
  • Ensure ports 8080 (inference) and 8081 (management) are free, or set different ports in a config file passed via --ts-config.
  • Run torchserve --stop before restarting to avoid port conflicts.
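To move TorchServe off the default ports, point --ts-config at a properties file. Here is a minimal sketch; the addresses below are illustrative, so adjust them to your environment:

```properties
# config.properties — override the default listener addresses
inference_address=http://127.0.0.1:9080
management_address=http://127.0.0.1:9081
metrics_address=http://127.0.0.1:9082
```

Then start the server with torchserve --start --model-store model_store --ts-config config.properties.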

Key Takeaways

  • Package PyTorch models into .mar archives using torch-model-archiver before serving.
  • Start TorchServe with the model store path and model archive to enable REST API inference.
  • Customize inference logic with handler scripts for preprocessing and postprocessing.
  • Use logs and torchserve commands to troubleshoot common deployment issues.
Verified 2026-04 · torchserve, torch-model-archiver