How-to · Intermediate · 4 min read

How to use TorchServe to serve a PyTorch model

Quick answer
Deploy PyTorch models with TorchServe in two steps: first package your model into a .mar archive with torch-model-archiver, then start the server with torchserve to expose the model over REST APIs. This gives you scalable, production-ready inference for PyTorch models.

PREREQUISITES

  • Python 3.8+
  • PyTorch installed (compatible version)
  • Java 11+ (required by the TorchServe frontend)
  • pip install torchserve torch-model-archiver
  • A trained PyTorch model file (.pt or .pth)

Set up the TorchServe environment

Install torchserve and torch-model-archiver via pip. Prepare your trained PyTorch model and optionally a handler script for custom inference logic.

bash
pip install torchserve torch-model-archiver
output
Collecting torchserve
Collecting torch-model-archiver
Successfully installed torchserve-0.6.0 torch-model-archiver-0.6.0

Step-by-step model packaging and serving

1. Export your PyTorch model to a .mar archive with torch-model-archiver.
2. Start TorchServe to serve the model.
3. Query the model via the REST API.

Note that the packaging step below assumes resnet18.pt is a TorchScript export; for an eager-mode checkpoint, torch-model-archiver also needs the model definition passed via --model-file.

bash
MODEL_NAME=resnet18
MODEL_FILE=resnet18.pt
HANDLER=imagenet_handler.py  # custom handler script, or a built-in name such as image_classifier

# Step 1: Create MAR file
torch-model-archiver --model-name $MODEL_NAME \
  --version 1.0 \
  --serialized-file $MODEL_FILE \
  --handler $HANDLER \
  --export-path model_store \
  --force

# Step 2: Start TorchServe
torchserve --start --model-store model_store --models $MODEL_NAME.mar

# Step 3: Query the model
curl -X POST http://127.0.0.1:8080/predictions/$MODEL_NAME -T sample_image.jpg
output
Model resnet18 archived successfully to model_store/resnet18.mar
TorchServe started.
Model resnet18 is loaded.
{"class":"tabby_cat","probability":0.95}

Common variations

  • Use custom handlers for preprocessing/postprocessing by specifying --handler.
  • Serve multiple models by listing them in the --models parameter.
  • Run TorchServe in Docker for containerized deployment.
  • Use torchserve --stop to stop the server.
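A custom handler is a Python class that TorchServe loads from the file passed to --handler. Real handlers extend ts.torch_handler.base_handler.BaseHandler and run the model on tensors; the sketch below stubs out the model call in plain Python so the shape of the preprocess/inference/postprocess pipeline is visible. The class name, label table, and scores are illustrative:

```python
# Sketch of a TorchServe-style handler. A real handler extends
# ts.torch_handler.base_handler.BaseHandler and calls self.model;
# here the model call is stubbed so the pipeline shape is clear.
class ImageNetHandler:
    # illustrative label table; a real handler loads index_to_name.json
    LABELS = {0: "tabby_cat", 1: "golden_retriever"}

    def preprocess(self, data):
        # TorchServe passes a list of request dicts; the payload sits
        # under "body" or "data" depending on the client.
        return [row.get("body") or row.get("data") for row in data]

    def inference(self, batch):
        # stub: a real handler would run self.model(batch) here
        return [[0.95, 0.05] for _ in batch]

    def postprocess(self, outputs):
        # map the highest-scoring index to a class name, per request
        results = []
        for scores in outputs:
            best = max(range(len(scores)), key=scores.__getitem__)
            results.append({"class": self.LABELS[best],
                            "probability": scores[best]})
        return results
```

Package the real version with --handler my_handler.py, exactly as in the archiving step above.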

Troubleshooting tips

  • If the model fails to load, verify the .mar file and handler script paths.
  • Check the logs/ directory for detailed errors.
  • Ensure ports 8080 (inference) and 8081 (management) are free, or set different ports in a config file passed via --ts-config.
  • Run torchserve --stop before restarting to avoid port conflicts.
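To move TorchServe off the default ports, point --ts-config at a properties file. Here is a minimal sketch; the addresses below are illustrative, so adjust them to your environment:

```properties
# config.properties — override the default listener addresses
inference_address=http://127.0.0.1:9080
management_address=http://127.0.0.1:9081
metrics_address=http://127.0.0.1:9082
```

Then start the server with torchserve --start --model-store model_store --ts-config config.properties.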

Key Takeaways

  • Package PyTorch models into .mar archives using torch-model-archiver before serving.
  • Start TorchServe with the model store path and model archive to enable REST API inference.
  • Customize inference logic with handler scripts for preprocessing and postprocessing.
  • Use logs and torchserve commands to troubleshoot common deployment issues.
Verified 2026-04 · torchserve, torch-model-archiver