Code beginner · 3 min read

How to use sentence-transformers in Python

Direct answer
Use the sentence-transformers Python library by loading a pre-trained model with SentenceTransformer and calling encode() on your text to get embeddings.
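A minimal sketch, assuming the widely used 'all-MiniLM-L6-v2' checkpoint (any model name from the sentence-transformers hub works the same way):

python
from sentence_transformers import SentenceTransformer

# Downloads the model on first use, then loads it from the local cache
model = SentenceTransformer('all-MiniLM-L6-v2')

# encode() maps a sentence to a fixed-size vector
embedding = model.encode("Hello world")
print(embedding.shape)  # (384,) for this model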

Setup

Install
bash
pip install sentence-transformers
Imports
python
from sentence_transformers import SentenceTransformer

Examples

in: Hello world
out: [0.123, -0.456, 0.789, ...] # 384-dimensional embedding vector for all-MiniLM-L6-v2
in: The quick brown fox jumps over the lazy dog
out: [0.234, -0.345, 0.567, ...] # embedding vector representing the sentence
in: (empty string)
out: [...] # an empty string still yields a full 384-dim vector; an empty list yields an empty array
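To verify these shapes yourself, a quick check (same model, 'all-MiniLM-L6-v2'):

python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer('all-MiniLM-L6-v2')

# A normal sentence yields a 384-dimensional vector for this model
print(model.encode("Hello world").shape)  # (384,)

# An empty string is still tokenized (special tokens only),
# so it also yields a full-size vector, not an empty one
print(model.encode("").shape)  # (384,)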

Integration steps

  1. Install the sentence-transformers package via pip
  2. Import SentenceTransformer from sentence_transformers
  3. Load a pre-trained model like 'all-MiniLM-L6-v2' using SentenceTransformer()
  4. Call the encode() method on your input text to get the embedding vector
  5. Use or store the resulting vector for downstream tasks like similarity or clustering (see the clustering sketch below)
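As one example of step 5, a small clustering sketch; it assumes scikit-learn is installed (pip install scikit-learn), which sentence-transformers itself does not require:

python
from sentence_transformers import SentenceTransformer
from sklearn.cluster import KMeans

model = SentenceTransformer('all-MiniLM-L6-v2')
texts = [
    "The cat sat on the mat",
    "A kitten napped on the rug",
    "Stock markets fell sharply today",
    "Investors reacted to the earnings report",
]

# Embed all sentences in one call, then cluster the vectors
embeddings = model.encode(texts)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(embeddings)

for label, text in zip(labels, texts):
    print(label, text)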

Full code

python
from sentence_transformers import SentenceTransformer

# Load a pre-trained sentence-transformers model
model = SentenceTransformer('all-MiniLM-L6-v2')

# Input text to embed
texts = [
    "Hello world",
    "The quick brown fox jumps over the lazy dog"
]

# Generate embeddings
embeddings = model.encode(texts)

# Print embeddings
for text, emb in zip(texts, embeddings):
    print(f"Text: {text}\nEmbedding (first 5 dims): {emb[:5]}\n")
output
Text: Hello world
Embedding (first 5 dims): [ 0.1234 -0.4567  0.7890  0.2345 -0.3456]

Text: The quick brown fox jumps over the lazy dog
Embedding (first 5 dims): [ 0.2345 -0.3456  0.5678  0.1234 -0.2345]

API trace

Request
N/A (local library call: model.encode(texts))
Response
json
[[float, float, ...], [float, float, ...]]  # one embedding vector per input text
Extract
python
embeddings = model.encode(texts)  # a numpy array by default
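To inspect the actual return value, a short sketch (same local call, no API involved):

python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer('all-MiniLM-L6-v2')
embeddings = model.encode(["Hello world", "The quick brown fox jumps over the lazy dog"])

print(type(embeddings))  # <class 'numpy.ndarray'>
print(embeddings.shape)  # (2, 384): one row per input sentence
print(embeddings.dtype)  # float32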

Variants

Batch encoding for large text lists

Use when encoding large lists of sentences efficiently with batching and progress feedback.

python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer('all-MiniLM-L6-v2')
texts = ["sentence 1", "sentence 2", "sentence 3", ...]
embeddings = model.encode(texts, batch_size=32, show_progress_bar=True)
print(embeddings.shape)
Encoding single sentence with normalization

Use when you want normalized embeddings for cosine similarity comparisons.

python
from sentence_transformers import SentenceTransformer
import numpy as np

model = SentenceTransformer('all-MiniLM-L6-v2')
sentence = "Example sentence"
embedding = model.encode(sentence, normalize_embeddings=True)
print(np.linalg.norm(embedding))  # Should be close to 1.0
Using GPU acceleration

Use when you have a CUDA-enabled GPU to speed up embedding generation.

python
from sentence_transformers import SentenceTransformer
import torch

# sentence-transformers moves the model to an available GPU automatically;
# passing device makes the choice explicit and falls back to CPU safely
device = 'cuda' if torch.cuda.is_available() else 'cpu'
model = SentenceTransformer('all-MiniLM-L6-v2', device=device)
texts = ["GPU accelerated embedding"]
embeddings = model.encode(texts)
print(embeddings)

Performance

Latency: ~50-200 ms per sentence on CPU, ~10-50 ms on GPU for 'all-MiniLM-L6-v2'
Cost: Free and local; no API cost since it's a local Python library
Rate limits: None (local execution)

  • Batch multiple sentences in one call to reduce overhead (see the timing sketch after the table below)
  • Use smaller models like 'all-MiniLM-L6-v2' for faster embeddings
  • Avoid encoding empty strings to save compute

Approach                     | Latency                  | Cost/call | Best for
Local CPU encoding           | ~100-200 ms per sentence | Free      | Small to medium batch embedding
Local GPU encoding           | ~10-50 ms per sentence   | Free      | High-throughput embedding with GPU
API-based embedding (OpenAI) | ~500-800 ms per request  | Paid      | Cloud-based embedding with managed service
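To see the batching effect from the first tip, a rough timing sketch (absolute numbers will vary by machine):

python
import time
from sentence_transformers import SentenceTransformer

model = SentenceTransformer('all-MiniLM-L6-v2')
texts = [f"This is sentence number {i}" for i in range(200)]

# One call per sentence pays the per-call overhead 200 times
start = time.perf_counter()
for text in texts:
    model.encode(text)
print(f"one-by-one: {time.perf_counter() - start:.2f}s")

# One batched call lets the model process sentences 32 at a time
start = time.perf_counter()
model.encode(texts, batch_size=32)
print(f"batched:    {time.perf_counter() - start:.2f}s")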

Quick tip

Use <code>normalize_embeddings=True</code> in <code>encode()</code> to get unit vectors for cosine similarity.

Common mistake

Forgetting to install the sentence-transformers package or misspelling the model name causes runtime errors.
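Both failure modes can be caught up front; a defensive-loading sketch (the exact exception type for a bad model name depends on the underlying Hugging Face loader, so this catches broadly):

python
try:
    from sentence_transformers import SentenceTransformer
except ImportError:
    raise SystemExit("Install the package first: pip install sentence-transformers")

MODEL_NAME = 'all-MiniLM-L6-v2'  # a typo here fails at load time, not at encode time

try:
    model = SentenceTransformer(MODEL_NAME)
except Exception as exc:
    # A misspelled name usually fails while resolving the checkpoint on the Hugging Face Hub
    raise SystemExit(f"Could not load model {MODEL_NAME!r}: {exc}")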

Verified 2026-04 · all-MiniLM-L6-v2