How to use text to speech with OpenAI API
Quick answer
Use the OpenAI API's audio.speech.create endpoint with a supported TTS model like tts-1 to convert text to speech. Send your text in the request and receive binary audio data (MP3 by default) that you can play back or save to a file.
Prerequisites
- Python 3.8+
- OpenAI API key (free tier works)
- pip install "openai>=1.0"
Setup
Install the official OpenAI Python SDK and set your API key as an environment variable.
- Run pip install openai to install the SDK.
- Set your API key in your shell: export OPENAI_API_KEY='your_api_key_here' (Linux/macOS) or setx OPENAI_API_KEY "your_api_key_here" (Windows).
pip install openai
Step by step
This example generates speech audio from text using the OpenAI Python SDK and the tts-1 model. The response body is the binary audio itself (MP3 by default), so the example writes it straight to a file.
import os
from openai import OpenAI

# Create a client using the API key from the environment
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

# Request speech for the given text
response = client.audio.speech.create(
    model="tts-1",
    voice="alloy",
    input="Hello, this is a text to speech example using OpenAI API."
)

# The response holds raw audio bytes, not a URL; save them to disk
with open("output.mp3", "wb") as f:
    f.write(response.content)
print("Audio saved as output.mp3")
output
Audio saved as output.mp3
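The call above can be wrapped in a small reusable helper. The following is a minimal sketch, not part of the SDK: the function name synthesize_to_file is hypothetical, and it assumes the object returned by audio.speech.create exposes the raw audio bytes through a content attribute, as the 1.x SDK's binary response does.

```python
from pathlib import Path
from typing import Any


def synthesize_to_file(client: Any, text: str, path: str,
                       model: str = "tts-1", voice: str = "alloy") -> Path:
    """Request speech for `text` and write the returned audio bytes to `path`.

    `client` is expected to be an openai.OpenAI instance (or any object that
    offers the same audio.speech.create method). Hypothetical helper.
    """
    if not text.strip():
        raise ValueError("input text must not be empty")
    response = client.audio.speech.create(model=model, voice=voice, input=text)
    out = Path(path)
    # Assumption: the response exposes the raw audio bytes as .content
    out.write_bytes(response.content)
    return out
```

With a real client this would be called as synthesize_to_file(OpenAI(), "Hello", "hello.mp3"). Passing the client in as a parameter keeps the helper easy to exercise with a stand-in object during testing.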
Common variations
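You can customize the voice by changing the voice parameter, choose another output format with response_format, or use different TTS models if available. The SDK also provides an AsyncOpenAI client for async calls and a with_streaming_response helper for streaming the audio as it is generated. The example below streams WAV audio directly to a local file.
from openai import OpenAI

# The client reads OPENAI_API_KEY from the environment by default
client = OpenAI()

# Stream the audio to disk instead of buffering it in memory
with client.audio.speech.with_streaming_response.create(
    model="tts-1",
    voice="nova",
    input="Hello, this is a text to speech example using OpenAI API.",
    response_format="wav",
) as response:
    response.stream_to_file("output.wav")
print("Audio saved as output.wav")
output
Audio saved as output.wav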
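When voice and format come from user input or configuration, it can help to validate them before sending a request. The sketch below is illustrative only: build_speech_request is a hypothetical helper, and the voice and format sets are assumptions based on the values documented for tts-1 at the time of writing, so they may be incomplete or out of date.

```python
# Assumed values: check the current API reference before relying on these sets.
KNOWN_VOICES = {"alloy", "echo", "fable", "onyx", "nova", "shimmer"}
KNOWN_FORMATS = {"mp3", "opus", "aac", "flac", "wav", "pcm"}


def build_speech_request(text, voice="alloy", response_format="mp3", model="tts-1"):
    """Return keyword arguments for client.audio.speech.create (hypothetical helper)."""
    if voice not in KNOWN_VOICES:
        raise ValueError(f"unknown voice: {voice!r}")
    if response_format not in KNOWN_FORMATS:
        raise ValueError(f"unknown response_format: {response_format!r}")
    return {
        "model": model,
        "voice": voice,
        "input": text,
        "response_format": response_format,
    }
```

The returned dictionary can be unpacked into the SDK call, for example client.audio.speech.create(**build_speech_request("Hi", voice="nova", response_format="wav")).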
Troubleshooting
- If you get authentication errors, verify your OPENAI_API_KEY environment variable is set correctly.
- If the request is rejected or the saved file will not play, check that you are using a supported TTS model like tts-1 and that the file extension matches the requested audio format.
- For network errors or timeouts during the request, ensure your internet connection is stable.
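The first check in the list above can be automated so that a missing key fails early with a clear message. This is a minimal sketch; get_api_key is a hypothetical helper, and the optional env parameter exists only so the lookup can be tried against a plain dictionary.

```python
import os


def get_api_key(env=None):
    """Return the API key from the environment or raise a descriptive error."""
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "OPENAI_API_KEY is not set; export it before running the script"
        )
    return key
```

The result can then be passed to the client explicitly, as in OpenAI(api_key=get_api_key()).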
Key Takeaways
- Use the OpenAI Python SDK's audio.speech.create method with model 'tts-1' for text to speech.
- Set your API key securely via environment variables to authenticate requests.
- The API returns binary audio data (MP3 by default), which you can save to a file or stream for playback.
- Customize voice and input text parameters to tailor speech output.
- Check model availability and API docs regularly as TTS features evolve.