Code beginner · 3 min read

How to use AutoTokenizer in Python

Direct answer
Use AutoTokenizer.from_pretrained() to load a tokenizer by model name, then call tokenizer(text) to tokenize input text in Python.

Setup

Install
bash
pip install transformers
Imports
python
from transformers import AutoTokenizer

Examples

in: Hello, how are you?
out: {'input_ids': [101, 7592, 1010, 2129, 2024, 2017, 102], 'attention_mask': [1, 1, 1, 1, 1, 1, 1]}
in: Transformers are amazing for NLP tasks.
out: {'input_ids': [101, 19081, 2024, 6429, 2005, 17953, 1012, 102], 'attention_mask': [1, 1, 1, 1, 1, 1, 1, 1]}
in: (empty string)
out: {'input_ids': [101, 102], 'attention_mask': [1, 1]}

Integration steps

  1. Import AutoTokenizer from transformers.
  2. Load a pretrained tokenizer using AutoTokenizer.from_pretrained with the model name.
  3. Call the tokenizer on your input text to get tokenized output.
  4. Use the tokenized output for model input or further processing.

Full code

python
from transformers import AutoTokenizer

# Load tokenizer for a pretrained model, e.g., bert-base-uncased
tokenizer = AutoTokenizer.from_pretrained('bert-base-uncased')

# Example input text
text = "Hello, how are you?"

# Tokenize the input text
encoded_input = tokenizer(text)

# Print the tokenized output
print(encoded_input)
output
{'input_ids': [101, 7592, 1010, 2129, 2024, 2017, 102], 'token_type_ids': [0, 0, 0, 0, 0, 0, 0], 'attention_mask': [1, 1, 1, 1, 1, 1, 1]}

API trace

Request
json
{"model_name_or_path": "bert-base-uncased", "text": "Hello, how are you?"}
Response
json
{"input_ids": [101, 7592, 1010, 2129, 2024, 2017, 102], "token_type_ids": [0, 0, 0, 0, 0, 0, 0], "attention_mask": [1, 1, 1, 1, 1, 1, 1]}
Extract: encoded_input = tokenizer(text); use encoded_input['input_ids'] or encoded_input directly
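
The extracted input_ids can also be mapped back to text with tokenizer.decode. A minimal sketch, assuming the bert-base-uncased tokenizer can be downloaded (or is cached locally):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained('bert-base-uncased')
encoded_input = tokenizer("Hello, how are you?")

# Map the ids back to text; skip_special_tokens drops [CLS] and [SEP]
decoded = tokenizer.decode(encoded_input['input_ids'], skip_special_tokens=True)
print(decoded)  # hello, how are you?
```

Note that bert-base-uncased lowercases its input, so the round trip returns lowercase text.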

Variants

Batch Tokenization

Use when tokenizing multiple texts at once for efficient batch processing.

python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained('bert-base-uncased')
texts = ["Hello, how are you?", "Transformers are great!"]
encoded_batch = tokenizer(texts, padding=True, truncation=True, return_tensors='pt')
print(encoded_batch)
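
As a quick sanity check (assuming PyTorch is installed, since return_tensors='pt' is used), the padded batch comes back as a single tensor with one row per input text:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained('bert-base-uncased')
texts = ["Hello, how are you?", "Transformers are great!"]

# padding=True pads every text to the longest one in the batch
encoded_batch = tokenizer(texts, padding=True, truncation=True, return_tensors='pt')
print(encoded_batch['input_ids'].shape[0])  # 2 — one row per input text
```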
Tokenization with Padding and Truncation

Use when you need fixed-length inputs with padding and truncation for model compatibility.

python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained('bert-base-uncased')
text = "This is a longer text that might need truncation."
encoded = tokenizer(text, padding='max_length', truncation=True, max_length=10)
print(encoded)
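
With padding='max_length', every encoding comes back exactly max_length ids long, padded or truncated as needed. A short check, assuming the same bert-base-uncased tokenizer:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained('bert-base-uncased')

# A short text gets padded up to max_length with pad token ids (0 for BERT)
encoded = tokenizer("Short text.", padding='max_length', truncation=True, max_length=10)
print(len(encoded['input_ids']))  # 10
```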
Tokenization with Return Tensors

Use when you want the output as PyTorch tensors directly for model input.

python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained('bert-base-uncased')
text = "Hello, how are you?"
encoded = tokenizer(text, return_tensors='pt')
print(encoded)
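
With return_tensors='pt' the output values are PyTorch tensors with a leading batch dimension, so a single text becomes a 1-row tensor. A sketch, assuming PyTorch is installed:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained('bert-base-uncased')
encoded = tokenizer("Hello, how are you?", return_tensors='pt')

# Batch dimension of 1, then one column per token (7 ids for this sentence)
print(encoded['input_ids'].shape)  # torch.Size([1, 7])
```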

Performance

Latency: ~10-50 ms per single text, depending on text length
Cost: free; runs locally without API calls
Rate limits: none; local library usage
  • Use batch tokenization to reduce overhead when processing multiple texts.
  • Enable truncation to limit token length and reduce memory usage.
  • Use fast tokenizers (default in transformers) for better performance.
| Approach | Latency | Cost/call | Best for |
| Single text tokenization | ~10-50 ms | Free | Quick tokenization of individual texts |
| Batch tokenization | ~20-100 ms | Free | Efficient processing of multiple texts |
| Tokenization with return_tensors | ~15-60 ms | Free | Direct input to PyTorch or TensorFlow models |
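
You can confirm that the fast (Rust-backed) tokenizer was loaded via the is_fast property. A small check, assuming bert-base-uncased is available:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained('bert-base-uncased')

# True when the Rust-backed fast tokenizer is in use (the default when available)
print(tokenizer.is_fast)
```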

Quick tip

Always specify padding and truncation parameters when tokenizing batches to ensure consistent input sizes.
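
The tip above can be checked directly: with padding=True, texts of different lengths come back as id lists of equal length. A sketch assuming bert-base-uncased:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained('bert-base-uncased')
texts = ["Hi!", "A much longer sentence for the batch."]

# Without padding, these would tokenize to different lengths
encoded = tokenizer(texts, padding=True, truncation=True)
lengths = [len(ids) for ids in encoded['input_ids']]
print(lengths)  # both entries are the same length after padding
```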

Common mistake

Instantiating AutoTokenizer directly (AutoTokenizer()) instead of calling AutoTokenizer.from_pretrained() raises an error.
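
To see the failure mode, AutoTokenizer raises an EnvironmentError when constructed directly, since it is only a factory for from_pretrained():

```python
from transformers import AutoTokenizer

# Wrong: AutoTokenizer is a factory class and cannot be instantiated directly
try:
    tokenizer = AutoTokenizer()
    raised = False
except EnvironmentError as err:
    raised = True
    print(err)  # explains that from_pretrained() must be used instead
```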

Verified 2026-04 · bert-base-uncased