
How to set training arguments in Hugging Face

Quick answer
Use the TrainingArguments class from transformers to set training parameters like batch size, learning rate, epochs, and output directory. Pass this object to the Trainer to control the training process.

PREREQUISITES

  • Python 3.8+
  • pip install transformers datasets
  • Basic knowledge of PyTorch or TensorFlow

Setup

Install the Hugging Face transformers and datasets libraries to access the training utilities.

bash
pip install transformers datasets

Step by step

Define your training arguments using TrainingArguments and pass them to the Trainer for fine-tuning a model.

python
from transformers import TrainingArguments, Trainer, AutoModelForSequenceClassification, AutoTokenizer
from datasets import load_dataset

# Load dataset and tokenizer
raw_datasets = load_dataset('glue', 'mrpc')
tokenizer = AutoTokenizer.from_pretrained('bert-base-uncased')

# Tokenize dataset
def preprocess_function(examples):
    return tokenizer(examples['sentence1'], examples['sentence2'], truncation=True, padding='max_length')

encoded_datasets = raw_datasets.map(preprocess_function, batched=True)

# Load model
model = AutoModelForSequenceClassification.from_pretrained('bert-base-uncased', num_labels=2)

# Set training arguments
training_args = TrainingArguments(
    output_dir='./results',          # output directory
    evaluation_strategy='epoch',     # evaluate each epoch
    learning_rate=2e-5,               # learning rate
    per_device_train_batch_size=16,  # batch size for training
    per_device_eval_batch_size=16,   # batch size for evaluation
    num_train_epochs=3,               # total epochs
    weight_decay=0.01,                # weight decay
    save_total_limit=2,               # max checkpoints to save
    save_strategy='epoch',            # save checkpoint each epoch
    logging_dir='./logs',             # directory for logs
    logging_steps=10,                 # log every 10 steps
)

# Initialize Trainer
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=encoded_datasets['train'],
    eval_dataset=encoded_datasets['validation'],
)

# Train model
trainer.train()
output
***** Running training *****
  Num examples = 3668
  Num Epochs = 3
  Instantaneous batch size per device = 16
  Total train batch size (w. parallel, distributed & accumulation) = 16
  Gradient Accumulation steps = 1
  Total optimization steps = 687
...
Saving model checkpoint to ./results/checkpoint-229
...
Training completed.
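The step count in the log follows directly from the dataset size, batch size, and epoch count. A rough back-of-envelope check (the exact handling of the final partial batch depends on the dataloader settings, so treat this as an approximation):

```python
# Back-of-envelope check of the numbers in the training log above.
num_examples = 3668                  # MRPC train split size from the log
per_device_train_batch_size = 16
num_train_epochs = 3

steps_per_epoch = num_examples // per_device_train_batch_size
total_steps = steps_per_epoch * num_train_epochs
print(steps_per_epoch, total_steps)  # 229 687 -- matches the log above
```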

Common variations

You can customize TrainingArguments for different use cases:

  • Use fp16=True for mixed precision training on GPUs.
  • Change evaluation_strategy to steps for more frequent evaluation.
  • Adjust learning_rate, per_device_train_batch_size, and num_train_epochs to tune training.
  • Use load_best_model_at_end=True to automatically load the best checkpoint.

python
training_args = TrainingArguments(
    output_dir='./results',
    evaluation_strategy='steps',
    eval_steps=500,
    save_steps=500,                    # should be a multiple of eval_steps for load_best_model_at_end
    learning_rate=3e-5,
    per_device_train_batch_size=32,
    num_train_epochs=5,
    fp16=True,
    load_best_model_at_end=True,
    metric_for_best_model='accuracy',  # requires a compute_metrics function that reports 'accuracy'
)
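Note that metric_for_best_model='accuracy' only works if the Trainer is also given a compute_metrics function that returns that key. A minimal sketch (the function body and metric choice here are illustrative, not part of the library):

```python
import numpy as np

# Illustrative compute_metrics: the Trainer passes (predictions, label_ids);
# return a dict whose keys include whatever metric_for_best_model names.
def compute_metrics(eval_pred):
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)   # pick the highest-scoring class
    return {'accuracy': float((preds == labels).mean())}
```

Pass it when constructing the trainer, e.g. Trainer(model=model, args=training_args, ..., compute_metrics=compute_metrics).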

Troubleshooting

If training is slow or runs out of memory, reduce per_device_train_batch_size (optionally raising gradient_accumulation_steps to keep the effective batch size the same) or enable fp16=True for mixed precision. If checkpoints are not saving, verify that save_strategy and output_dir are set correctly. For evaluation errors, ensure an eval_dataset is passed to Trainer. Also note that recent transformers releases renamed evaluation_strategy to eval_strategy; if you see an unexpected-keyword error on that argument, use the new name.
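A common memory-saving move is to halve the per-device batch size and double gradient_accumulation_steps: the effective batch size is their product (times the number of devices), so optimization behavior stays roughly the same. The values below are illustrative:

```python
# Illustrative values: halving the per-device batch while doubling
# accumulation keeps the effective batch size at 16.
per_device_train_batch_size = 8    # halved from 16 to reduce memory use
gradient_accumulation_steps = 2    # accumulate gradients over 2 steps
num_devices = 1                    # single-GPU setup assumed

effective_batch = per_device_train_batch_size * gradient_accumulation_steps * num_devices
print(effective_batch)  # 16, same effective batch as the original settings
```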

Key Takeaways

  • Use TrainingArguments to configure all key training parameters in Hugging Face.
  • Pass TrainingArguments to Trainer to control training, evaluation, and checkpointing.
  • Adjust batch size, learning rate, epochs, and evaluation strategy to optimize training performance.
  • Enable fp16=True for faster training on compatible GPUs with mixed precision.
  • Always specify output_dir to save checkpoints and logs.
Verified 2026-04 · bert-base-uncased