How to · Beginner · 3 min read

How to store OpenAI embeddings in a database

Quick answer
Use the OpenAI SDK to generate embeddings with a model such as text-embedding-3-large, serialize the vector to a storable format (e.g., a JSON string or binary blob), and save it in a column suited to your database, such as FLOAT8[] in PostgreSQL or a TEXT/BLOB column in SQLite.

PREREQUISITES

  • Python 3.8+
  • OpenAI API key (free tier works)
  • pip install "openai>=1.0"
  • Basic knowledge of SQL and a database (e.g., SQLite, PostgreSQL)

Setup

Install the openai Python package and set your API key as an environment variable for secure access.

bash
pip install "openai>=1.0"
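
The code below reads the key from the OPENAI_API_KEY environment variable. A minimal way to set it in a POSIX shell (the key value here is a placeholder, not a real key):

bash
export OPENAI_API_KEY="sk-your-key-here"

On Windows, use setx OPENAI_API_KEY "..." in a command prompt instead.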

Step by step

This example shows how to generate an embedding using OpenAI's text-embedding-3-large model and store it in a SQLite database as a JSON string.

python
import os
import json
import sqlite3
from openai import OpenAI

# Initialize OpenAI client
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

# Connect to SQLite database (or create it)
conn = sqlite3.connect("embeddings.db")
cursor = conn.cursor()

# Create table to store embeddings
cursor.execute('''
CREATE TABLE IF NOT EXISTS embeddings (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    text TEXT NOT NULL,
    embedding TEXT NOT NULL
)
''')

# Text to embed
text_to_embed = "OpenAI provides powerful embedding models."

# Generate embedding
response = client.embeddings.create(
    model="text-embedding-3-large",
    input=text_to_embed
)
embedding_vector = response.data[0].embedding

# Convert embedding vector to JSON string for storage
embedding_json = json.dumps(embedding_vector)

# Insert into database
cursor.execute(
    "INSERT INTO embeddings (text, embedding) VALUES (?, ?)",
    (text_to_embed, embedding_json)
)
conn.commit()

# Query and print stored embedding
cursor.execute("SELECT id, text, embedding FROM embeddings")
rows = cursor.fetchall()
for row in rows:
    stored_id, stored_text, stored_embedding_json = row
    stored_embedding = json.loads(stored_embedding_json)
    print(f"ID: {stored_id}, Text: {stored_text}, Embedding vector length: {len(stored_embedding)}")

conn.close()

output
ID: 1, Text: OpenAI provides powerful embedding models., Embedding vector length: 3072
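
Once embeddings are stored, the usual next step is comparing them with cosine similarity at query time. A minimal sketch of that computation, using short made-up vectors in place of real 3072-dimensional embeddings loaded with json.loads:

python
import json
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Made-up stand-ins for embedding JSON strings read from the database
v1 = json.loads("[0.1, 0.2, 0.3]")
v2 = json.loads("[0.1, 0.25, 0.3]")

print(cosine_similarity(v1, v2))  # close to 1.0 for similar vectors

For more than a handful of rows, load all embeddings, score each against the query embedding, and sort by similarity; beyond that scale, a vector database does this for you.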

Common variations

  • Use PostgreSQL with FLOAT8[] column type for native vector storage.
  • Store embeddings as binary blobs for efficiency.
  • Use async OpenAI client calls for high throughput.
  • Use different embedding models like text-embedding-3-small for smaller vectors.
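
The binary-blob variation can be sketched with Python's struct module, packing the vector as little-endian float32 values (roughly half the size of a JSON string of float64s, at a small precision cost; the three-element embedding here is a made-up stand-in for a real API result):

python
import sqlite3
import struct

# Stand-in embedding; real vectors come from the embeddings API
embedding = [0.1, -0.2, 0.3]

# Pack as little-endian float32 values into a bytes object
blob = struct.pack(f"<{len(embedding)}f", *embedding)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE embeddings (id INTEGER PRIMARY KEY, embedding BLOB)")
conn.execute("INSERT INTO embeddings (embedding) VALUES (?)", (blob,))

# Read the blob back and unpack (4 bytes per float32)
raw = conn.execute("SELECT embedding FROM embeddings").fetchone()[0]
restored = list(struct.unpack(f"<{len(raw) // 4}f", raw))
conn.close()

Note that float32 round-trips are approximate: restored values match the originals only to about 7 significant digits, which is ample for similarity search.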

Troubleshooting

  • If you get API key errors, verify OPENAI_API_KEY is set correctly in your environment.
  • For database insertion errors, ensure your table schema matches the data types.
  • If embeddings seem empty or incorrect, check the model name and API response structure.

Key Takeaways

  • Use the OpenAI SDK's embeddings endpoint with a current model like text-embedding-3-large.
  • Store embeddings as JSON strings or native array types depending on your database capabilities.
  • Always secure your API key via environment variables and never hardcode it.
  • SQLite is good for prototyping; use PostgreSQL or specialized vector DBs for production.
  • Validate your database schema matches the embedding data format to avoid insertion errors.
Verified 2026-04 · text-embedding-3-large