How to use AutoTokenizer in Python
Direct answer
Use AutoTokenizer.from_pretrained() to load a tokenizer by model name, then call tokenizer(text) to tokenize input text in Python.

Setup
Install
pip install transformers

Imports
from transformers import AutoTokenizer

Examples
in: Hello, how are you?
out: {'input_ids': [101, 7592, 1010, 2129, 2024, 2017, 102], 'attention_mask': [1, 1, 1, 1, 1, 1, 1]}
in: Transformers are amazing for NLP tasks.
out: {'input_ids': [101, 19081, 2024, 6429, 2005, 17953, 1012, 102], 'attention_mask': [1, 1, 1, 1, 1, 1, 1, 1]}
in: (empty string)
out: {'input_ids': [101, 102], 'attention_mask': [1, 1]}
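The pairs above can be reproduced with a short script. The ids shown assume the bert-base-uncased vocabulary; other checkpoints produce different ids.

```python
from transformers import AutoTokenizer

# bert-base-uncased matches the ids shown above; downloads on first use
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

for text in ["Hello, how are you?", "Transformers are amazing for NLP tasks.", ""]:
    enc = tokenizer(text)
    # Every encoding is wrapped in [CLS] (id 101) and [SEP] (id 102);
    # an empty string yields only those two special tokens.
    print(repr(text), "->", enc["input_ids"])
```

Note that BERT checkpoints also return a token_type_ids list alongside input_ids and attention_mask.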
Integration steps
- Import AutoTokenizer from transformers.
- Load a pretrained tokenizer using AutoTokenizer.from_pretrained with the model name.
- Call the tokenizer on your input text to get tokenized output.
- Use the tokenized output for model input or further processing.
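For the last step, "further processing" often means mapping ids back to subword tokens or readable text. A minimal sketch (bert-base-uncased is just an example checkpoint):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
enc = tokenizer("Hello, how are you?")

# Map ids back to the subword tokens the model actually sees
tokens = tokenizer.convert_ids_to_tokens(enc["input_ids"])
print(tokens)  # includes the [CLS] and [SEP] special tokens

# Or reconstruct a readable string, dropping the special tokens
text = tokenizer.decode(enc["input_ids"], skip_special_tokens=True)
print(text)
```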
Full code
from transformers import AutoTokenizer
# Load tokenizer for a pretrained model, e.g., bert-base-uncased
tokenizer = AutoTokenizer.from_pretrained('bert-base-uncased')
# Example input text
text = "Hello, how are you?"
# Tokenize the input text
encoded_input = tokenizer(text)
# Print the tokenized output
print(encoded_input)

output
{'input_ids': [101, 7592, 1010, 2129, 2024, 2017, 102], 'token_type_ids': [0, 0, 0, 0, 0, 0, 0], 'attention_mask': [1, 1, 1, 1, 1, 1, 1]}

API trace
Request
{"model_name_or_path": "bert-base-uncased", "text": "Hello, how are you?"} Response
{"input_ids": [101, 7592, 1010, 2129, 2024, 2017, 102], "token_type_ids": [0, 0, 0, 0, 0, 0, 0], "attention_mask": [1, 1, 1, 1, 1, 1, 1]} Extract
encoded_input = tokenizer(text); use encoded_input['input_ids'] or pass encoded_input directly

Variants
Batch Tokenization ›
Use when tokenizing multiple texts at once for efficient batch processing.
from transformers import AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained('bert-base-uncased')
texts = ["Hello, how are you?", "Transformers are great!"]
encoded_batch = tokenizer(texts, padding=True, truncation=True, return_tensors='pt')
print(encoded_batch)

Tokenization with Padding and Truncation ›
Use when you need fixed-length inputs with padding and truncation for model compatibility.
from transformers import AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained('bert-base-uncased')
text = "This is a longer text that might need truncation."
encoded = tokenizer(text, padding='max_length', truncation=True, max_length=10)
print(encoded)

Tokenization with Return Tensors ›
Use when you want the output as PyTorch tensors directly for model input.
from transformers import AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained('bert-base-uncased')
text = "Hello, how are you?"
encoded = tokenizer(text, return_tensors='pt')
print(encoded)

Performance
Latency: ~10-50 ms per single text, depending on text length
Cost: Free; runs locally without API calls
Rate limits: None; local library usage
- Use batch tokenization to reduce overhead when processing multiple texts.
- Enable truncation to limit token length and reduce memory usage.
- Use fast tokenizers (default in transformers) for better performance.
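The batching advice can be checked with a quick, unscientific timing sketch; absolute numbers depend on hardware, text length, and whether a fast (Rust-backed) tokenizer is in use:

```python
import time
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
texts = ["Hello, how are you?"] * 1000

# One tokenizer call per text
start = time.perf_counter()
for t in texts:
    tokenizer(t)
per_text = time.perf_counter() - start

# A single batched call over the same texts
start = time.perf_counter()
batch = tokenizer(texts)
batched = time.perf_counter() - start

print(f"loop: {per_text:.3f}s  batch: {batched:.3f}s")
```

Fast tokenizers process batched inputs in Rust, so the batched call usually pulls ahead as input size grows.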
| Approach | Latency | Cost/call | Best for |
|---|---|---|---|
| Single text tokenization | ~10-50ms | Free | Quick tokenization of individual texts |
| Batch tokenization | ~20-100ms | Free | Efficient processing of multiple texts |
| Tokenization with return_tensors | ~15-60ms | Free | Direct input to PyTorch or TensorFlow models |
Quick tip
Always specify `padding` and `truncation` parameters when tokenizing batches to ensure consistent input sizes.
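A quick way to see why: without padding, the sequences in a batch come back with different lengths and cannot be stacked into a single tensor (sketch assuming bert-base-uncased):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
texts = ["Hi!", "A much longer sentence that produces more tokens."]

no_pad = tokenizer(texts)
print([len(ids) for ids in no_pad["input_ids"]])  # unequal lengths

padded = tokenizer(texts, padding=True, truncation=True)
lengths = [len(ids) for ids in padded["input_ids"]]
print(lengths)  # equal: shorter sequences are filled with the pad token id
```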
Common mistake
Forgetting to call `from_pretrained()` and trying to instantiate `AutoTokenizer` directly causes errors.
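For reference, the failure mode looks like this: AutoTokenizer is a factory for concrete tokenizer classes, not a tokenizer itself, and raises an error on direct instantiation.

```python
from transformers import AutoTokenizer

try:
    tokenizer = AutoTokenizer()  # wrong: AutoTokenizer cannot be instantiated directly
except OSError as err:
    # transformers raises EnvironmentError (an alias of OSError) that
    # points you to AutoTokenizer.from_pretrained(...)
    print(err)

# correct: let from_pretrained select and load the right tokenizer class
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
```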