Code Advanced hard · 8 min

What TorchScript is: portable model format

What you will learn
TorchScript is a statically-typed intermediate representation of PyTorch models that runs without the Python interpreter, enabling deployment on servers, mobile devices, and edge hardware.

Why this matters

In production, you often can't ship Python + PyTorch to every inference environment: mobile apps, embedded systems, low-latency servers. TorchScript compiles your model to a portable format that runs wherever the PyTorch C++ runtime (LibTorch) is available, eliminating the Python dependency. The JIT can also apply graph-level optimizations, but any speedup depends on the model and hardware, so measure it rather than assume it.

Skip if: Your forward pass relies on Python features TorchScript can't compile, such as calls into non-torch libraries or highly dynamic Python objects. Tensor-dependent branching is supported by torch.jit.script() but is not captured correctly by torch.jit.trace(). For models that don't convert cleanly, keeping Python inference is usually safer, and ONNX export is another option to evaluate. Also skip TorchScript if you're only deploying on servers where Python and PyTorch are already installed: the conversion and testing overhead is rarely worth it.

Explanation

What it is: TorchScript is PyTorch's way to serialize a model into a self-contained, portable format that includes both the model architecture and weights. It's compiled to an intermediate representation that can run via PyTorch's C++ runtime, entirely independent of Python. Think of it as a compiled program instead of an interpreted script.

How it works mechanically: You convert a model to TorchScript using either torch.jit.trace() (runs the model on an example input and records the tensor operations) or torch.jit.script() (compiles the Python source into a statically-typed subset of Python). You then save the result, conventionally as a .pt file, which contains a complete, deployable model. When loaded elsewhere, PyTorch's C++ interpreter executes it without needing Python installed. The JIT compiler can then apply optimizations such as operator fusion, dead code elimination, and constant folding that eager-mode execution does not perform.

When to reach for it: Use TorchScript for mobile deployment, C++ inference servers, embedded systems, or any environment where shipping Python is infeasible. It's also valuable for locking model behavior: a saved TorchScript model no longer depends on the Python source, so later edits to that code can't change it by accident.
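The two conversion paths can be sketched side by side. This is a minimal illustration, not a recommended workflow; `TinyNet` and its layer sizes are hypothetical names chosen for the example.

```python
import torch
import torch.nn as nn

class TinyNet(nn.Module):  # hypothetical module, for illustration only
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(4, 2)

    def forward(self, x):
        return torch.relu(self.linear(x))

model = TinyNet().eval()
example = torch.randn(3, 4)

# Tracing runs the model once on the example and records the tensor ops.
traced = torch.jit.trace(model, example)

# Scripting compiles the source of forward() without running it.
scripted = torch.jit.script(model)

# Both results are ScriptModules; .code shows the recovered TorchScript source.
print(isinstance(traced, torch.jit.ScriptModule),
      isinstance(scripted, torch.jit.ScriptModule))
print(scripted.code)

# All three versions should agree on the same input.
with torch.no_grad():
    expected = model(example)
    traced_out = traced(example)
    scripted_out = scripted(example)
print(torch.allclose(expected, traced_out), torch.allclose(expected, scripted_out))
```

Printing `.code` (or `.graph`) is a quick way to see what was actually captured before you save the model.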

Analogy

TorchScript is like compiling Python to a binary. Your `.py` file is human-readable but needs an interpreter. Your `.pt` TorchScript file is a compiled artifact that runs anywhere the runtime is installed, with optimizations baked in.

Code

python
import torch
import torch.nn as nn

class SimpleNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear1 = nn.Linear(10, 5)
        self.linear2 = nn.Linear(5, 2)
    
    def forward(self, x):
        x = torch.relu(self.linear1(x))
        return self.linear2(x)

model = SimpleNet()
model.eval()

input_tensor = torch.randn(1, 10)
output_eager = model(input_tensor)
print(f"Eager mode output shape: {output_eager.shape}")
print(f"Eager mode output:\n{output_eager}")

scripted_model = torch.jit.script(model)
output_scripted = scripted_model(input_tensor)
print(f"\nScripted mode output shape: {output_scripted.shape}")
print(f"Scripted mode output:\n{output_scripted}")
print(f"\nOutputs match: {torch.allclose(output_eager, output_scripted)}")

torch.jit.save(scripted_model, "/tmp/model.pt")
print("\nModel saved to /tmp/model.pt")

loaded_model = torch.jit.load("/tmp/model.pt")
output_loaded = loaded_model(input_tensor)
print(f"Loaded model output matches: {torch.allclose(output_eager, output_loaded)}")
Output
Eager mode output shape: torch.Size([1, 2])
Eager mode output:
tensor([[-0.1234,  0.5678]], grad_fn=<LinearBackward0>)

Scripted mode output shape: torch.Size([1, 2])
Scripted mode output:
tensor([[-0.1234,  0.5678]], grad_fn=<LinearBackward0>)

Outputs match: True

Model saved to /tmp/model.pt
Loaded model output matches: True

What just happened?

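We defined a simple two-layer neural network, ran it in eager mode (normal PyTorch), then converted it to TorchScript using `torch.jit.script()`. The scripted version produced the same output as the eager one. It also still tracks gradients: `model.eval()` only changes layer behavior such as dropout and batch norm, so both outputs carry a `grad_fn` (its exact name varies by PyTorch version and backend) unless the call is wrapped in `torch.no_grad()`. The numbers shown are illustrative, since the weights and input are random. We then saved the scripted model to disk as a portable `.pt` file and reloaded it with `torch.jit.load()`, which does not need the `SimpleNet` class definition, demonstrating that TorchScript models persist and work independently of the original Python code.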

Common gotcha

The biggest trap: the two conversion paths fail in opposite ways. torch.jit.script() compiles only a subset of Python, so it raises a compile-time error on unsupported syntax, unannotated non-tensor arguments, or calls into non-torch libraries. It does handle control flow that depends on tensor values (e.g., if x.sum() > 0: ...). torch.jit.trace() accepts almost any Python because it only records the tensor operations executed for one example input. That also means it captures a single execution path, so data-dependent branches and shape-dependent logic are frozen to whatever the example triggered, which can cause silent bugs at inference time.
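A small sketch makes the difference concrete. The function `double_or_negate` is a hypothetical example with a branch that depends on tensor values; scripting keeps the branch, while tracing records only the path taken by the example input.

```python
import torch

def double_or_negate(x: torch.Tensor) -> torch.Tensor:
    # The branch depends on the values inside the tensor.
    if x.sum() > 0:
        r = x * 2
    else:
        r = x * -1
    return r

pos = torch.ones(3)
neg = -torch.ones(3)

# Scripting compiles both branches into the TorchScript program.
scripted = torch.jit.script(double_or_negate)

# Tracing with a positive example records only the "x * 2" path
# (PyTorch emits a TracerWarning about the tensor-to-bool conversion).
traced = torch.jit.trace(double_or_negate, pos)

print(scripted(neg))  # takes the else branch: tensor([1., 1., 1.])
print(traced(neg))    # replays the recorded path: tensor([-2., -2., -2.])
```

The traced function raises no error on the negative input; it simply returns the wrong result, which is why this failure mode is easy to miss.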

Error recovery

RuntimeError: 'NoneType' object is not a Tensor
TorchScript received a value of a type it did not expect where a tensor was required, such as None, a list, or a dict. Unannotated arguments are assumed to be tensors, so add type annotations (List[torch.Tensor], Dict[str, torch.Tensor], Optional[torch.Tensor], int, float, bool, str) for anything else, and handle None explicitly.
TypeError: unsupported operand type(s)
You're using a Python library (numpy, PIL, external module) inside forward(). TorchScript can't compile or serialize non-PyTorch code. Move that logic before the model call or rewrite it in pure torch.
torch.jit.trace() succeeds but output differs
Your model has data-dependent control flow, and the trace recorded only the branch taken by the example input. Use torch.jit.script() instead, or restructure to avoid if statements that depend on tensor values.
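For the first case, one way to keep lists, dicts, and scalars in a scripted forward() is to annotate them so the compiler does not assume they are tensors. The sketch below uses a hypothetical `SumScaled` module; the names and shapes are assumptions made for illustration.

```python
from typing import Dict, List

import torch
import torch.nn as nn

class SumScaled(nn.Module):  # hypothetical module, for illustration only
    def forward(self, xs: List[torch.Tensor], scale: float) -> Dict[str, torch.Tensor]:
        # Without the annotations, xs and scale would be treated as Tensors
        # and scripting would fail on the loop below.
        total = torch.zeros_like(xs[0])
        for x in xs:
            total = total + x
        return {"sum": total, "scaled": total * scale}

scripted = torch.jit.script(SumScaled())
out = scripted([torch.ones(2), torch.ones(2)], 0.5)
print(out["sum"])     # tensor([2., 2.])
print(out["scaled"])  # tensor([1., 1.])
```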

Experienced dev note

In practice, torch.jit.trace() is often the easier path for production models because it sidesteps the static type checker. The tradeoff is that tracing only records one path through your model: if you have if/else that changes with input values or shape, you'll silently get wrong answers. The main win is deploying to C++ servers, where you eliminate Python GIL contention. JIT optimizations may add a speedup on top, but its size depends on the model and hardware, so benchmark rather than assume. Test your TorchScript model exhaustively against the original on real data before shipping.
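A parity check of this kind might look like the sketch below. The helper `max_abs_diff`, the toy model, and the `1e-5` tolerance are all assumptions for illustration; a real check would use representative production inputs instead of random tensors.

```python
import torch
import torch.nn as nn

def max_abs_diff(eager, converted, inputs):
    """Hypothetical helper: largest elementwise gap between two models."""
    worst = 0.0
    with torch.no_grad():
        for x in inputs:
            diff = (eager(x) - converted(x)).abs().max().item()
            worst = max(worst, diff)
    return worst

torch.manual_seed(0)
model = nn.Sequential(nn.Linear(10, 5), nn.ReLU(), nn.Linear(5, 2)).eval()
traced = torch.jit.trace(model, torch.randn(1, 10))

# Vary the batch size so the check covers shapes the trace never saw.
batches = [torch.randn(n, 10) for n in (1, 4, 32)]
worst = max_abs_diff(model, traced, batches)
print(f"max abs diff: {worst:.2e}")
```

Running the same helper against the reloaded `.pt` file as well catches problems introduced by saving and loading, not just by conversion.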

Check your understanding

The eager output shows a grad_fn. Does a scripted or loaded TorchScript model track gradients too, and how do you turn tracking off for inference?

Show answer hint

TorchScript modules participate in autograd like eager modules: if the parameters require gradients and grad mode is on, outputs carry a `grad_fn` and you can call `.backward()` through a loaded model. `.eval()` and `.train()` only switch layer behavior such as dropout and batch norm; they do not control gradient tracking. For inference, wrap the call in `torch.no_grad()`. Training through TorchScript is rarely done, since it is designed for deployment.
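A quick way to check how gradients behave with a loaded TorchScript model is to run one forward and backward pass. This sketch saves to an in-memory buffer instead of a file; the layer sizes are arbitrary.

```python
import io

import torch
import torch.nn as nn

# Script a small layer, save it to an in-memory buffer, and load it back.
buffer = io.BytesIO()
torch.jit.save(torch.jit.script(nn.Linear(3, 1)), buffer)
buffer.seek(0)
loaded = torch.jit.load(buffer)

x = torch.randn(4, 3)

# With grad mode on, the loaded module builds an autograd graph.
out = loaded(x)
print(out.requires_grad)  # True
out.sum().backward()
has_grads = all(p.grad is not None for p in loaded.parameters())
print(has_grads)  # True

# torch.no_grad() is what disables tracking for inference.
with torch.no_grad():
    no_grad_out = loaded(x)
print(no_grad_out.requires_grad)  # False
```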

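VERSION torch.jit.script() and torch.jit.trace() have been available throughout the PyTorch 1.x and 2.x series. torch.compile(), introduced in PyTorch 2.0, is an alternative for in-Python JIT speedup, but it does not produce a Python-free artifact. TorchScript is not the only route to non-Python runtimes (ONNX export and torch.export are others), and recent PyTorch documentation steers new projects toward torch.export, so check the status of torch.jit for the version you target.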
NEXT

Next, explore `torch.jit.trace()` versus `torch.jit.script()` tradeoffs: when to use each and how to debug shape mismatches in traced models.
