What TorchScript is: portable model format
Why this matters
In production, you often can't ship Python + PyTorch to every inference environment: mobile apps, embedded systems, low-latency servers. TorchScript compiles your model to a portable format that runs wherever PyTorch's C++ runtime (LibTorch) is available, eliminating the Python dependency. JIT optimizations can also speed up inference, but the gain depends heavily on the model and hardware, so measure it rather than assume a fixed multiple.
Explanation
What it is: TorchScript is PyTorch's way to serialize a model into a self-contained, portable format that includes both the model architecture and weights. The model is compiled to an intermediate representation that PyTorch's C++ runtime can execute, entirely independent of Python. Think of it as a compiled program instead of an interpreted script.
How it works mechanically: You convert a model to TorchScript using either torch.jit.trace() (runs the model once on an example input and records the tensor operations) or torch.jit.script() (compiles your Python source into a statically typed subset of Python). The result can be saved as a .pt file containing a complete, deployable model. When loaded elsewhere, PyTorch's C++ interpreter executes it without needing Python installed. The JIT compiler can then apply optimizations such as kernel fusion, dead code elimination, and constant folding, which eager op-by-op execution does not perform.
When to reach for it: Use TorchScript for mobile deployment, C++ inference servers, embedded systems, or any environment where shipping Python is infeasible. It's also valuable for locking model behavior: scripted models are harder to accidentally modify.
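The two conversion routes described above can be sketched side by side. This is a minimal illustration, not a recommended deployment recipe: `TinyNet` is a hypothetical model invented for the example, and the model is saved to an in-memory buffer instead of a file so the snippet leaves nothing on disk.

```python
import io

import torch
import torch.nn as nn


class TinyNet(nn.Module):
    """Hypothetical one-layer model, used only for illustration."""

    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(4, 2)

    def forward(self, x):
        return torch.relu(self.linear(x))


model = TinyNet().eval()
example = torch.randn(1, 4)

# Tracing runs the model once on `example` and records the tensor ops it sees.
traced = torch.jit.trace(model, example)

# Scripting compiles the Python source of forward() instead of running it.
scripted = torch.jit.script(model)

# Both routes produce a ScriptModule.
both_are_script_modules = isinstance(traced, torch.jit.ScriptModule) and isinstance(
    scripted, torch.jit.ScriptModule
)
print(both_are_script_modules)

# The TorchScript source generated for forward() can be inspected.
print(scripted.code)

# Serialize to an in-memory buffer (a file path works the same way).
buffer = io.BytesIO()
torch.jit.save(scripted, buffer)
buffer.seek(0)
reloaded = torch.jit.load(buffer)

with torch.no_grad():
    same = torch.allclose(model(example), reloaded(example))
print(same)
```

Printing `scripted.code` is a quick way to see what the compiler actually captured before shipping the artifact.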
Analogy
TorchScript is like compiling Python to a binary. Your `.py` file is human-readable but needs an interpreter. Your `.pt` TorchScript file is a compiled artifact that runs anywhere the runtime is installed, with optimizations baked in.
Code
import torch
import torch.nn as nn


class SimpleNet(nn.Module):
    """Two-layer fully connected network used as the demo model."""

    def __init__(self):
        super().__init__()
        self.linear1 = nn.Linear(10, 5)
        self.linear2 = nn.Linear(5, 2)

    def forward(self, x):
        x = torch.relu(self.linear1(x))
        return self.linear2(x)


model = SimpleNet()
model.eval()  # inference behavior for layers such as dropout and batch norm

# Run the model in eager mode (normal PyTorch).
input_tensor = torch.randn(1, 10)
output_eager = model(input_tensor)
print(f"Eager mode output shape: {output_eager.shape}")
print(f"Eager mode output:\n{output_eager}")

# Compile the module to TorchScript and run the same input through it.
scripted_model = torch.jit.script(model)
output_scripted = scripted_model(input_tensor)
print(f"\nScripted mode output shape: {output_scripted.shape}")
print(f"Scripted mode output:\n{output_scripted}")
print(f"\nOutputs match: {torch.allclose(output_eager, output_scripted)}")

# Save the scripted model, then reload it without using the SimpleNet class.
torch.jit.save(scripted_model, "/tmp/model.pt")
print("\nModel saved to /tmp/model.pt")
loaded_model = torch.jit.load("/tmp/model.pt")
output_loaded = loaded_model(input_tensor)
print(f"Loaded model output matches: {torch.allclose(output_eager, output_loaded)}")
Example output (values vary from run to run because the weights are randomly initialized):
Eager mode output shape: torch.Size([1, 2])
Eager mode output:
tensor([[-0.1234, 0.5678]], grad_fn=<LinearBackward0>)

Scripted mode output shape: torch.Size([1, 2])
Scripted mode output:
tensor([[-0.1234, 0.5678]], grad_fn=<...>)

Outputs match: True

Model saved to /tmp/model.pt
Loaded model output matches: True
What just happened?
We defined a simple 2-layer neural network, ran it in eager mode (normal PyTorch), then converted it to TorchScript using <code>torch.jit.script()</code>. The scripted version produced the same outputs as the eager model. Scripting does not switch off autograd: the parameters still require gradients, so the scripted output carries a <code>grad_fn</code> too (its printed name can differ by PyTorch version), and you would wrap inference in <code>torch.no_grad()</code> to skip graph building. We then saved the scripted model to disk as a portable <code>.pt</code> file and reloaded it, demonstrating that the saved model carries its own code and weights and no longer needs the original Python class definition.
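Gradient tracking is easier to check directly than to read off a printed tensor. The sketch below uses a single `nn.Linear` layer as a stand-in for a small model (an assumption made for brevity) and inspects `requires_grad` on the scripted output with and without `torch.no_grad()`.

```python
import torch
import torch.nn as nn

# Stand-in for a small model; any scriptable module would do.
model = nn.Linear(10, 2).eval()
scripted = torch.jit.script(model)
x = torch.randn(1, 10)

# eval() only changes layers such as dropout and batch norm; autograd stays on,
# so the output is tracked because the layer's parameters require gradients.
out_tracked = scripted(x)
print(out_tracked.requires_grad)

# For pure inference, disable graph building explicitly.
with torch.no_grad():
    out_inference = scripted(x)
print(out_inference.requires_grad)
```

The first print is expected to show `True` and the second `False`, since only the second call runs with autograd disabled.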
Common gotcha
The biggest trap is picking the wrong converter for a model with control flow. torch.jit.script() compiles your Python source, so it preserves branches and loops that depend on tensor values (e.g., if x.sum() > 0: ...), but it is strict: it accepts only a statically typed subset of Python and fails with a compile error on constructs it does not support. torch.jit.trace() sidesteps those restrictions by recording the tensor operations of one example run, which means it captures only the path that run took. A data-dependent branch or a shape-dependent loop is frozen into the trace and can cause silent bugs at inference time. PyTorch often emits a TracerWarning in these cases, so do not ignore it.
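The difference between the two converters on a value-dependent branch can be seen in a few lines. This is a minimal sketch: `shift_by_sign` is a hypothetical function written for the demonstration, and the tracer warning is silenced only to keep the output short.

```python
import warnings

import torch


def shift_by_sign(x: torch.Tensor) -> torch.Tensor:
    # The branch depends on the values in x, not only on its shape or dtype.
    if x.sum() > 0:
        return x + 1
    return x - 1


positive = torch.ones(3)
negative = -torch.ones(3)

# Scripting compiles the if statement itself, so both branches survive.
scripted = torch.jit.script(shift_by_sign)

# Tracing executes once with `positive` and records only the branch taken.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # hides the TracerWarning about the branch
    traced = torch.jit.trace(shift_by_sign, positive)

eager_out = shift_by_sign(negative)
scripted_out = scripted(negative)
traced_out = traced(negative)

print(eager_out)     # x - 1 branch: tensor([-2., -2., -2.])
print(scripted_out)  # same branch as eager: tensor([-2., -2., -2.])
print(traced_out)    # frozen x + 1 branch: tensor([0., 0., 0.])
```

The traced function raises no error on the negative input; it simply replays the `x + 1` path recorded during tracing, which is why this class of bug is easy to miss.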
Error recovery
RuntimeError: 'NoneType' object is not a Tensor
TypeError: unsupported operand type(s)
torch.jit.script() succeeds but output differs
Experienced dev note
In practice, torch.jit.trace() is often the easier route for production models because it sidesteps TorchScript's static type checker. The tradeoff is that tracing records only one path through your model: if you have an if/else that depends on input values or input shape, you'll silently get wrong answers. The real win is deploying to C++ servers, where you eliminate Python GIL contention and may pick up extra speed from JIT optimizations. A 30-50% inference speedup is possible but not guaranteed, so benchmark your own model on your own hardware. Test your TorchScript model exhaustively against the original on real data before shipping.
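One way to make the "test against the original" advice concrete is a small parity check. The helper below, `check_parity`, is a hypothetical name introduced here, the tolerances are assumptions to tune per model, and random tensors stand in for the real data you would use in practice.

```python
import torch
import torch.nn as nn


def check_parity(eager, compiled, batches, rtol=1e-5, atol=1e-5):
    """Compare eager and TorchScript outputs on each batch; return batches checked."""
    eager.eval()
    compiled.eval()
    checked = 0
    with torch.no_grad():
        for batch in batches:
            # Raises AssertionError describing the mismatch on the first failure.
            torch.testing.assert_close(
                compiled(batch), eager(batch), rtol=rtol, atol=atol
            )
            checked += 1
    return checked


model = nn.Sequential(nn.Linear(10, 5), nn.ReLU(), nn.Linear(5, 2)).eval()
traced = torch.jit.trace(model, torch.randn(1, 10))

# Vary the batch size so shape-dependent behavior has a chance to show up.
batches = [torch.randn(n, 10) for n in (1, 3, 8, 32)]
num_checked = check_parity(model, traced, batches)
print(f"{num_checked} batches matched")
```

Feeding inputs that differ in shape and value range from the tracing example matters most, because those are the cases a single recorded path can get wrong.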
Check your understanding
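Does converting a model with torch.jit.script() turn off gradient tracking, and how would you run a loaded TorchScript model with or without gradients?
Show answer hint
No. TorchScript modules still support autograd: their parameters keep <code>requires_grad=True</code>, so outputs are tracked unless you run the model inside <code>torch.no_grad()</code>. Note that <code>.eval()</code> and <code>.train()</code> only switch layers such as dropout and batch norm between modes; they do not enable or disable gradients. Training through a loaded TorchScript model is possible but rarely done: TorchScript is designed for deployment, not training.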
PyTorch 2.x also offers torch.compile() as an alternative for local JIT speedup inside Python, but it does not produce a Python-free artifact; for non-Python runtimes you still need an exported format such as TorchScript.