How to use text-embedding-3-small in Python
Direct answer
Use the OpenAI Python SDK's client.embeddings.create method with model="text-embedding-3-small" and your input text to generate embeddings.

Setup
Install
pip install openai
Env vars
OPENAI_API_KEY
Imports
import os
from openai import OpenAI

Examples
in: The quick brown fox jumps over the lazy dog.
out: [0.0123, -0.0456, 0.0789, ...] # vector of floats representing the embedding
in: How to integrate OpenAI embeddings in Python?
out: [0.0345, -0.0234, 0.0567, ...] # embedding vector for the query
in: "" (empty string)
out: error # the API rejects empty input with a validation error
Integration steps
- Install the OpenAI Python SDK and set your OPENAI_API_KEY environment variable.
- Import the OpenAI client and initialize it with your API key from os.environ.
- Call the embeddings.create method with model='text-embedding-3-small' and input text.
- Receive the response containing the embedding vector.
- Extract the embedding vector from response.data[0].embedding for use in your application.
Full code
import os
from openai import OpenAI

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
response = client.embeddings.create(
    model="text-embedding-3-small",
    input="The quick brown fox jumps over the lazy dog."
)
embedding_vector = response.data[0].embedding
print("Embedding vector:", embedding_vector)

Output
Embedding vector: [0.012345, -0.045678, 0.078912, ...]
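Once extracted, the vector is usually compared against other embeddings, most commonly by cosine similarity. A minimal pure-Python sketch (toy 2-dimensional vectors stand in for the real 1536-dimensional output of text-embedding-3-small):

```python
import math

def cosine_similarity(a, b):
    # cosine(a, b) = dot(a, b) / (|a| * |b|)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy vectors for illustration; in practice pass two
# response.data[i].embedding lists here.
print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # → 1.0 (identical direction)
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # → 0.0 (orthogonal)
```

Similarity close to 1.0 means the two texts are semantically close; values near 0 mean they are unrelated.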
API trace
Request
{"model": "text-embedding-3-small", "input": "The quick brown fox jumps over the lazy dog."}
Response
{"data": [{"embedding": [0.012345, -0.045678, 0.078912, ...], "index": 0}], "usage": {"prompt_tokens": 9, "total_tokens": 9}}
Extract
response.data[0].embedding

Variants
Async Embedding Request ›
Use when you want to perform embedding requests concurrently or integrate with async frameworks. The SDK's async client is AsyncOpenAI; you await the same embeddings.create method (there is no acreate method).
import os
import asyncio
from openai import AsyncOpenAI

async def main():
    client = AsyncOpenAI(api_key=os.environ["OPENAI_API_KEY"])
    response = await client.embeddings.create(
        model="text-embedding-3-small",
        input="Async call example text."
    )
    embedding = response.data[0].embedding
    print("Async embedding vector:", embedding)

asyncio.run(main())

Batch Embedding Multiple Inputs ›
Use to embed multiple texts in a single API call for efficiency.
import os
from openai import OpenAI
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
texts = [
    "First text to embed.",
    "Second text for embedding.",
    "Third example input."
]
response = client.embeddings.create(
    model="text-embedding-3-small",
    input=texts
)
embeddings = [item.embedding for item in response.data]
for i, emb in enumerate(embeddings):
    print(f"Embedding {i}:", emb)

Alternative Model: text-embedding-3-large ›
Use when you need higher quality embeddings at the cost of higher latency and compute.
import os
from openai import OpenAI
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
response = client.embeddings.create(
    model="text-embedding-3-large",
    input="Using a larger embedding model for better accuracy."
)
embedding = response.data[0].embedding
print("Large model embedding vector:", embedding)

Performance
Latency: ~300-500ms per embedding call for a single input on text-embedding-3-small
Cost: ~$0.02 per 1M tokens (~$0.00002 per 1K tokens) for text-embedding-3-small, at the time of writing; check openai.com/pricing for current rates
Rate limits: vary by account usage tier; check the limits page in your OpenAI dashboard
- Trim unnecessary whitespace and boilerplate to reduce token count; avoid aggressive stopword removal, which can change the text's meaning and hurt embedding quality.
- Batch multiple inputs to amortize overhead per call.
- Use smaller embedding models for less critical tasks to save cost.
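As a rough illustration of the cost figures above, a small helper can turn the usage.total_tokens field of an embeddings response into an estimated dollar cost. The per-million-token prices below are assumptions based on published pricing at the time of writing, not values returned by the API:

```python
# Assumed prices in USD per 1M tokens (verify against openai.com/pricing).
PRICE_PER_MILLION_TOKENS = {
    "text-embedding-3-small": 0.02,
    "text-embedding-3-large": 0.13,
}

def estimate_cost(total_tokens, model="text-embedding-3-small"):
    """Estimate the USD cost of an embeddings call from its token count."""
    return total_tokens / 1_000_000 * PRICE_PER_MILLION_TOKENS[model]

# The 9-token example sentence from the API trace above:
print(f"${estimate_cost(9):.8f}")
```

In practice you would pass response.usage.total_tokens from a real embeddings response.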
| Approach | Latency | Cost | Best for |
|---|---|---|---|
| Single input embedding | ~300-500ms | ~$0.02 per 1M tokens | Quick embeddings for one text |
| Batch embedding multiple texts | ~500-800ms | ~$0.02 per 1M tokens, amortized over the batch | Efficient bulk embedding |
| Using text-embedding-3-large | ~700-1000ms | ~$0.13 per 1M tokens | Higher-quality embeddings at higher cost |
Quick tip
When embedding many texts, pass them as a list in a single embeddings request; batching amortizes per-request overhead and reduces total latency.
Common mistake
Passing an empty string, or a non-string value, to the embeddings endpoint triggers a validation error; check inputs before calling the API.
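One way to guard against this is a small validation step before the API call. The validate_embedding_input helper below is a hypothetical sketch, not part of the OpenAI SDK:

```python
def validate_embedding_input(texts):
    """Normalize input to a list of strings, raising ValueError for
    values the embeddings endpoint would reject."""
    if isinstance(texts, str):
        texts = [texts]
    for i, t in enumerate(texts):
        if not isinstance(t, str):
            raise ValueError(f"Input {i} is not a string: {t!r}")
        if not t.strip():
            raise ValueError(f"Input {i} is empty or whitespace-only")
    return texts

# Usage: clean = validate_embedding_input(user_texts)
#        client.embeddings.create(model="text-embedding-3-small", input=clean)
```

Running this check locally surfaces bad inputs immediately instead of spending a round trip to get a 4xx error back.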