How to build a chatbot with the OpenAI Chat Completions API
Setup
Install the SDK:
pip install openai
Set the OPENAI_API_KEY environment variable to your API key, then import the client:
import os
from openai import OpenAI
Integration steps
- Import the OpenAI SDK and initialize the client with the API key from os.environ.
- Construct a messages list with user input and optionally system or assistant messages.
- Call client.chat.completions.create with the model 'gpt-4o' and the messages array.
- Extract the chatbot's reply from response.choices[0].message.content.
- Display or process the chatbot's response as needed.
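The second step above (building the messages list) can be sketched as a small helper. This is a minimal illustration, not part of the SDK: build_messages, system_prompt, and history are hypothetical names introduced here, and the only thing assumed about the API is the role/content message shape already used in this guide.

```python
from typing import Optional


def build_messages(
    user_input: str,
    system_prompt: Optional[str] = None,
    history: Optional[list[dict]] = None,
) -> list[dict]:
    """Assemble a Chat Completions `messages` list (hypothetical helper)."""
    messages: list[dict] = []
    if system_prompt:
        # An optional system message goes first and sets the assistant's behavior.
        messages.append({"role": "system", "content": system_prompt})
    # Earlier user/assistant turns, oldest first; the caller's list is not modified.
    messages.extend(history or [])
    # The new user input is always the last message.
    messages.append({"role": "user", "content": user_input})
    return messages
```

The returned list can be passed directly as the messages argument of client.chat.completions.create.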
Full code
import os
from openai import OpenAI

# Initialize OpenAI client with API key from environment
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

def chat_with_openai(user_input: str) -> str:
    messages = [
        {"role": "user", "content": user_input}
    ]
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=messages
    )
    return response.choices[0].message.content

if __name__ == "__main__":
    user_message = input("You: ")
    bot_reply = chat_with_openai(user_message)
    print(f"Assistant: {bot_reply}")

Example output:
You: Hello, who are you?
Assistant: I am an AI assistant powered by OpenAI. How can I help you today?
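The full code above sends a single message, so the model has no memory of earlier turns. One possible way to hold a multi-turn conversation is to keep a history list and resend it on every call. In this sketch chat_turn is a hypothetical helper, and client is assumed to be the OpenAI client created in the full code; it is passed in as a parameter so the function itself has no import-time dependencies.

```python
def chat_turn(client, history: list[dict], user_input: str, model: str = "gpt-4o") -> str:
    """Send one turn and record both sides of it in `history` (hypothetical helper)."""
    history.append({"role": "user", "content": user_input})
    response = client.chat.completions.create(model=model, messages=history)
    reply = response.choices[0].message.content or ""
    # Storing the reply lets the next call see the whole conversation so far.
    history.append({"role": "assistant", "content": reply})
    return reply
```

A chat loop would start with an empty list and call chat_turn(client, history, input("You: ")) until the user quits. Because the history grows with every turn, it is worth capping its length in longer sessions.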
API trace
Request:
{"model": "gpt-4o", "messages": [{"role": "user", "content": "Hello, who are you?"}]}
Response:
{"choices": [{"message": {"content": "I am an AI assistant powered by OpenAI. How can I help you today?"}}], "usage": {"total_tokens": 20}}
Extract:
response.choices[0].message.content

Variants
Streaming Chatbot
Use streaming to provide real-time token-by-token responses for better user experience in chat interfaces.
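When streaming, it is often useful to keep the full reply as well as print it, for example to store it in the conversation history. The sketch below assumes each streamed chunk exposes choices[0].delta.content, which can be None on chunks that carry no text, and that some chunks may have an empty choices list. iter_text and print_and_collect are hypothetical helper names; chunks is whatever iterable the streaming call returns.

```python
from typing import Iterable, Iterator


def iter_text(chunks: Iterable) -> Iterator[str]:
    """Yield the text fragments carried by streamed chat-completion chunks."""
    for chunk in chunks:
        # Some chunks may carry no choices at all; skip them.
        if not chunk.choices:
            continue
        # delta.content is None on chunks without text (for example the final one).
        text = chunk.choices[0].delta.content
        if text:
            yield text


def print_and_collect(chunks: Iterable) -> str:
    """Print fragments as they arrive and return the full reply."""
    parts = []
    for text in iter_text(chunks):
        print(text, end="", flush=True)
        parts.append(text)
    print()  # finish the line once the stream ends
    return "".join(parts)
```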
import os
from openai import OpenAI

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

def chat_stream(user_input: str):
    messages = [{"role": "user", "content": user_input}]
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=messages,
        stream=True
    )
    for chunk in response:
        # Skip chunks that carry no choices
        if not chunk.choices:
            continue
        # delta is an object, not a dict; content is None on chunks without text
        print(chunk.choices[0].delta.content or "", end='', flush=True)
    print()  # End the line once the stream finishes

if __name__ == "__main__":
    user_message = input("You: ")
    print("Assistant: ", end='')
    chat_stream(user_message)

Async Chatbot
Use async calls when integrating the chatbot into asynchronous applications or frameworks to improve concurrency.
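The concurrency benefit shows up when several requests are in flight at once. As a rough sketch, assuming client is an AsyncOpenAI instance from the openai package (passed in as a parameter here), asyncio.gather can run a batch of prompts concurrently. ask and ask_many are hypothetical helper names.

```python
import asyncio


async def ask(client, prompt: str, model: str = "gpt-4o") -> str:
    """Send one prompt through an async client and return the reply text."""
    response = await client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content or ""


async def ask_many(client, prompts: list[str]) -> list[str]:
    """Run all requests concurrently; replies come back in prompt order."""
    return await asyncio.gather(*(ask(client, p) for p in prompts))
```

Calling asyncio.run(ask_many(client, ["First question", "Second question"])) would then take roughly as long as the slowest single request rather than the sum of all of them, subject to the account's rate limits.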
import os
import asyncio
from openai import AsyncOpenAI

# The async client exposes the same methods as OpenAI, but they are awaitable
client = AsyncOpenAI(api_key=os.environ["OPENAI_API_KEY"])

async def chat_async(user_input: str) -> str:
    messages = [{"role": "user", "content": user_input}]
    response = await client.chat.completions.create(
        model="gpt-4o",
        messages=messages
    )
    return response.choices[0].message.content

async def main():
    user_message = input("You: ")
    bot_reply = await chat_async(user_message)
    print(f"Assistant: {bot_reply}")

if __name__ == "__main__":
    asyncio.run(main())

Use a Smaller Model for Cost Efficiency
Use smaller models like gpt-4o-mini to reduce cost and latency when high fidelity is not critical.
import os
from openai import OpenAI

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

def chat_with_smaller_model(user_input: str) -> str:
    messages = [{"role": "user", "content": user_input}]
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=messages
    )
    return response.choices[0].message.content

if __name__ == "__main__":
    user_message = input("You: ")
    bot_reply = chat_with_smaller_model(user_message)
    print(f"Assistant: {bot_reply}")

Performance
- Limit conversation history to recent relevant messages to reduce tokens.
- Use smaller models like gpt-4o-mini for less critical tasks.
- Avoid unnecessary system messages or verbose prompts.
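The first tip above can be sketched as a simple trimming step applied before each request. This is one possible approach, not a prescribed one: trim_history and max_messages are hypothetical names, and the cut-off counts messages rather than tokens, which is only a rough proxy for the actual token budget.

```python
def trim_history(messages: list[dict], max_messages: int = 6) -> list[dict]:
    """Keep system messages plus the most recent `max_messages` other messages."""
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    # Slicing from the end keeps the newest turns; older ones are dropped.
    recent = rest[-max_messages:] if max_messages > 0 else []
    return system + recent
```

System messages are kept regardless of age because they usually carry instructions that should apply to the whole conversation.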
| Approach | Latency | Cost/call | Best for |
|---|---|---|---|
| Standard Chat (gpt-4o) | ~800ms | ~$0.003 | High-quality chatbot responses |
| Streaming Chat | Starts immediately, ~800ms total | ~$0.003 | Real-time user interaction |
| Async Chat | ~800ms | ~$0.003 | Concurrent or async app integration |
| Smaller Model (gpt-4o-mini) | ~400ms | ~$0.001 | Cost-sensitive or lightweight tasks |
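The latency figures in the table are approximate and will vary with prompt length, network, and load, so it can help to measure them in your own environment. A minimal timing wrapper, with timed_call as a hypothetical helper name, might look like this:

```python
import time
from typing import Any, Callable


def timed_call(fn: Callable[..., Any], *args: Any, **kwargs: Any) -> tuple[Any, float]:
    """Call `fn` and return (result, elapsed milliseconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return result, elapsed_ms
```

For example, reply, ms = timed_call(chat_with_openai, "Hello") would time one call of the function from the full code.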
Quick tip
Give every message an explicit role, with the user's input under the "user" role, and keep the conversation history concise to limit token usage and keep the context relevant.
Common mistake
Beginners often forget to set the API key in the environment, or they call legacy (pre-1.0) SDK methods such as openai.ChatCompletion.create(), which no longer work in current versions of the SDK.
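To surface a missing key early with a readable message instead of a bare KeyError, a small check can run at startup. This is a sketch under the assumption that the key lives in the OPENAI_API_KEY environment variable, as in the rest of this guide; require_api_key is a hypothetical helper, and the env parameter exists only to make it easy to test.

```python
import os
from typing import Mapping


def require_api_key(env: Mapping[str, str] = os.environ, name: str = "OPENAI_API_KEY") -> str:
    """Return the API key or raise a readable error if it is missing or blank."""
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; export it before starting the chatbot.")
    return key
```

The result can be passed to the client constructor, as in OpenAI(api_key=require_api_key()).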