How to classify news articles by topic
Quick answer
Use a large language model such as gpt-4o via the OpenAI Python SDK to classify news articles by topic. Send the article text in a prompt with a clear instruction to return only the topic label.
Prerequisites
- Python 3.8+
- OpenAI API key (free tier works)
- pip install "openai>=1.0" (the quotes stop the shell from treating > as a redirect)
Setup
Install the openai Python package and set your API key as an environment variable.
- Run pip install openai to install the SDK.
- Set your API key in your shell: export OPENAI_API_KEY='your_api_key' (Linux/macOS) or setx OPENAI_API_KEY "your_api_key" (Windows).
Output of pip install openai:
Collecting openai
  Downloading openai-1.x.x-py3-none-any.whl (xx kB)
Installing collected packages: openai
Successfully installed openai-1.x.x
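Before making any API calls, it can help to confirm that the key is actually visible to Python. The following is a minimal sketch; `require_api_key` is a hypothetical helper name used for illustration, not part of the OpenAI SDK.

```python
import os


def require_api_key(env=None, name="OPENAI_API_KEY"):
    """Return the API key from the environment, or raise a readable error.

    Hypothetical helper for this article; it is not an SDK function.
    """
    # Allow a plain dict to be passed in so the check can be tested offline.
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(
            f"{name} is not set. Export it in your shell before running the script."
        )
    return key
```

Calling `require_api_key()` at the top of a script gives a clear message up front instead of an authentication error on the first request.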
Step by step
This example shows how to classify a news article by topic using the gpt-4o model. The prompt instructs the model to return a concise topic label.
import os

from openai import OpenAI

# Create the client; the key is read from the environment variable set earlier.
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

article_text = "The stock market rallied today as tech shares surged amid strong earnings reports."

# A single user message that asks for a one-word topic label.
messages = [
    {"role": "user", "content": f"Classify the following news article into a topic category in one word:\n\n{article_text}"}
]

response = client.chat.completions.create(
    model="gpt-4o",
    messages=messages,
)

# The label is the text of the first (and only) choice.
topic = response.choices[0].message.content.strip()
print(f"Topic: {topic}")
Output:
Topic: Finance
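A free-form "one word" instruction can produce different labels for similar articles (for example "Finance" in one run and "Markets" in another). One way to keep labels consistent, sketched here under assumptions, is to list the allowed topics in the prompt and map the reply back onto that list. The topic list and the helper names `build_prompt` and `normalize_topic` are illustrative, not part of the SDK.

```python
# Illustrative label set; replace it with the categories your application needs.
ALLOWED_TOPICS = [
    "Politics", "Finance", "Technology", "Science",
    "Sports", "Health", "Entertainment",
]


def build_prompt(article_text, topics=ALLOWED_TOPICS):
    """Build a prompt that restricts the model to a fixed set of labels."""
    labels = ", ".join(topics)
    return (
        "Classify the following news article into exactly one of these topics: "
        f"{labels}. Reply with the topic name only.\n\n{article_text}"
    )


def normalize_topic(raw, topics=ALLOWED_TOPICS, fallback="Other"):
    """Map the model's reply onto an allowed label, or return the fallback."""
    # Drop surrounding whitespace, quotes, and a trailing period before comparing.
    cleaned = raw.strip().strip(".\"'").lower()
    for topic in topics:
        if cleaned == topic.lower():
            return topic
    return fallback
```

To use it with the example above, pass `build_prompt(article_text)` as the user message content and call `normalize_topic` on `response.choices[0].message.content`.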
Common variations
You can use other models like gpt-4o-mini for faster, cheaper classification or claude-3-5-sonnet-20241022 via the Anthropic SDK. Async calls and streaming are also supported for large-scale or real-time classification.
import asyncio
import os

from openai import AsyncOpenAI  # async client; its methods return awaitables


async def classify_article_async(article: str) -> str:
    client = AsyncOpenAI(api_key=os.environ["OPENAI_API_KEY"])
    messages = [{"role": "user", "content": f"Classify the news article into a topic in one word:\n\n{article}"}]
    response = await client.chat.completions.create(
        model="gpt-4o-mini",
        messages=messages,
    )
    return response.choices[0].message.content.strip()


async def main():
    article = "NASA announced a new mission to explore Jupiter's moons."
    topic = await classify_article_async(article)
    print(f"Async topic: {topic}")


asyncio.run(main())
Output:
Async topic: Space
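Async calls mainly pay off when many articles are classified at once. The sketch below shows one possible pattern: run the requests concurrently with `asyncio.gather` and cap the number in flight with a semaphore. `classify_many` and `fake_classify` are hypothetical names, and `fake_classify` is an offline stand-in for a real API call such as `classify_article_async`.

```python
import asyncio


async def classify_many(articles, classify, max_concurrency=5):
    """Classify articles concurrently, returning labels in input order.

    `classify` is any coroutine function that takes article text and
    returns a label.
    """
    semaphore = asyncio.Semaphore(max_concurrency)

    async def bounded(article):
        # Only `max_concurrency` calls run at the same time.
        async with semaphore:
            return await classify(article)

    # gather preserves the order of its arguments in the result list.
    return await asyncio.gather(*(bounded(a) for a in articles))


async def fake_classify(article):
    """Offline stand-in for a real model call, so the sketch runs without a key."""
    await asyncio.sleep(0)
    return "Space" if "NASA" in article else "Other"
```

In real use, pass the article's own async function as `classify`, and pick `max_concurrency` to stay within your account's rate limits.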
Troubleshooting
- If you get an authentication error, verify that your OPENAI_API_KEY environment variable is set correctly.
- If the model returns unexpected output, refine your prompt to be more explicit about returning a single topic word.
- For rate limits, consider using smaller models like gpt-4o-mini or batching requests.
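Another common way to handle rate limits is to retry with exponential backoff. The following is a generic sketch, not the SDK's own mechanism: `call_with_backoff` is a hypothetical helper, and the delay values are assumptions. With the OpenAI SDK you would likely pass `retry_on=(openai.RateLimitError,)`. The client also accepts a `max_retries` setting, so a wrapper like this may only be needed when you want longer waits.

```python
import random
import time


def call_with_backoff(fn, retry_on=(Exception,), max_attempts=5,
                      base_delay=1.0, sleep=time.sleep):
    """Call fn(), retrying with exponential backoff plus jitter.

    Only exceptions listed in `retry_on` trigger a retry. `sleep` is
    injectable so the behavior can be tested without real waiting.
    """
    for attempt in range(max_attempts):
        try:
            return fn()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            # Delays grow as 1x, 2x, 4x ... base_delay, plus random jitter.
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            sleep(delay)
```

A call would then look like `call_with_backoff(lambda: client.chat.completions.create(model="gpt-4o-mini", messages=messages), retry_on=(openai.RateLimitError,))`.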
Key Takeaways
- Use the OpenAI Python SDK with gpt-4o for accurate news topic classification.
- Craft clear prompts instructing the model to return concise topic labels.
- Async calls and smaller models like gpt-4o-mini enable scalable classification.
- Always set your API key via environment variables to avoid authentication errors.