How to post-edit machine translation
Quick answer
Post-edit machine translation by first generating a raw translation with a model like gpt-4o or claude-3-5-sonnet-20241022, then prompting the model to refine that output: improve fluency, fix errors, and adapt style. Use chat.completions.create with clear instructions to guide the post-editing step.
Prerequisites
- Python 3.8+
- OpenAI API key (the account needs access to gpt-4o)
- pip install "openai>=1.0"
Setup
Install the openai Python package and set your API key as an environment variable for secure authentication.
pip install "openai>=1.0"
Output
Collecting openai
  Downloading openai-1.x.x-py3-none-any.whl (xx kB)
Installing collected packages: openai
Successfully installed openai-1.x.x
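Before making any calls, it can help to confirm that the key is actually visible to Python. A minimal sketch, assuming the variable name OPENAI_API_KEY used throughout this guide; `require_api_key` is a hypothetical helper, not part of the openai package:

```python
import os


def require_api_key(var_name: str = "OPENAI_API_KEY") -> str:
    """Return the API key from the environment or fail with a clear message.

    Hypothetical helper for illustration; not part of the openai package.
    """
    key = os.environ.get(var_name, "").strip()
    if not key:
        raise RuntimeError(
            f"{var_name} is not set. Export it in your shell before running the examples."
        )
    return key
```

The returned value can be passed as the api_key argument when constructing the client.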
Step by step
This example shows how to generate a machine translation and then post-edit it by prompting the model to improve the translation quality.
import os
from openai import OpenAI

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

# Step 1: Generate raw machine translation
source_text = "Hello, how are you today?"
raw_translation_prompt = [
    {"role": "user", "content": f"Translate this to French: {source_text}"}
]
raw_response = client.chat.completions.create(
    model="gpt-4o",
    messages=raw_translation_prompt
)
raw_translation = raw_response.choices[0].message.content
print("Raw translation:", raw_translation)

# Step 2: Post-edit the translation
post_edit_prompt = [
    {"role": "user", "content": (
        f"Here is a machine translation: '{raw_translation}'. "
        "Please improve the translation for fluency, correctness, and natural style. "
        "Return only the improved translation."
    )}
]
post_edit_response = client.chat.completions.create(
    model="gpt-4o",
    messages=post_edit_prompt
)
post_edited_translation = post_edit_response.choices[0].message.content
print("Post-edited translation:", post_edited_translation)
Output
Raw translation: Bonjour, comment êtes-vous aujourd'hui ?
Post-edited translation: Bonjour, comment allez-vous aujourd'hui ?
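The post-editing prompt above shows the model only the raw translation. Including the source text lets the model check meaning as well as fluency. A sketch of a message builder under that assumption; `build_post_edit_messages` and its parameters are hypothetical names, not part of the OpenAI SDK:

```python
from typing import Dict, List


def build_post_edit_messages(
    source_text: str,
    raw_translation: str,
    target_language: str = "French",
) -> List[Dict[str, str]]:
    """Build chat messages that give the model both the source and the draft."""
    system = (
        f"You are a professional post-editor for {target_language} translations. "
        "Fix mistranslations, grammar, and unnatural phrasing. "
        "Return only the improved translation."
    )
    user = (
        f"Source text:\n{source_text}\n\n"
        f"Machine translation:\n{raw_translation}"
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": user},
    ]


messages = build_post_edit_messages(
    "Hello, how are you today?",
    "Bonjour, comment êtes-vous aujourd'hui ?",
)
print(messages[1]["content"])
```

The returned list can be passed as the messages argument of client.chat.completions.create.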
Common variations
- Use claude-3-5-sonnet-20241022 with the Anthropic SDK as an alternative post-editing model.
- Implement asynchronous calls with asyncio for higher throughput.
- Stream the post-editing output for real-time feedback using stream=True in chat.completions.create.
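For the first variation, here is a sketch of the post-editing step with the Anthropic Python SDK. It assumes the anthropic package is installed and ANTHROPIC_API_KEY is set; `post_edit_with_claude` is a hypothetical helper, and the max_tokens value is an arbitrary choice:

```python
def post_edit_with_claude(
    client,
    raw_translation: str,
    model: str = "claude-3-5-sonnet-20241022",
) -> str:
    """Post-edit a draft translation through the Anthropic Messages API.

    `client` is expected to be an anthropic.Anthropic() instance.
    """
    response = client.messages.create(
        model=model,
        max_tokens=1024,  # required by the Messages API; value chosen arbitrarily
        messages=[
            {
                "role": "user",
                "content": (
                    "Improve this translation for fluency and style: "
                    f"'{raw_translation}'. Return only the improved text."
                ),
            }
        ],
    )
    # The reply is a list of content blocks; the first one holds the text.
    return response.content[0].text


# Usage (requires network access and an API key):
#   import anthropic
#   client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
#   print(post_edit_with_claude(client, "Bonjour, comment êtes-vous aujourd'hui ?"))
```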
# Async client with streaming: combines the asyncio and stream=True variations.
import os
import asyncio
from openai import AsyncOpenAI


async def async_post_edit():
    # AsyncOpenAI is required here: the synchronous OpenAI client's
    # create() call returns a plain object that cannot be awaited.
    client = AsyncOpenAI(api_key=os.environ["OPENAI_API_KEY"])
    raw_translation = "Bonjour, comment êtes-vous aujourd'hui ?"
    post_edit_prompt = [
        {"role": "user", "content": (
            f"Improve this translation for fluency and style: '{raw_translation}'. "
            "Return only the improved text."
        )}
    ]
    stream = await client.chat.completions.create(
        model="gpt-4o",
        messages=post_edit_prompt,
        stream=True
    )
    print("Streaming post-edited translation:", end=" ")
    # Print each text fragment as soon as it arrives
    async for chunk in stream:
        delta = chunk.choices[0].delta.content or ""
        print(delta, end="", flush=True)
    print()


asyncio.run(async_post_edit())
Output
Streaming post-edited translation: Bonjour, comment allez-vous aujourd'hui ?
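The example above post-edits a single string, so async on its own adds little. Throughput improves when several drafts are processed concurrently. A sketch using asyncio.gather with a semaphore to cap in-flight requests; `post_edit_many`, `post_edit_one`, and the limit of 5 are assumptions for illustration, and `post_edit_one` stands in for any coroutine that calls the API:

```python
import asyncio
from typing import Awaitable, Callable, List


async def post_edit_many(
    drafts: List[str],
    post_edit_one: Callable[[str], Awaitable[str]],
    max_concurrency: int = 5,
) -> List[str]:
    """Post-edit drafts concurrently, returning results in input order."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def guarded(draft: str) -> str:
        # Limit how many requests run at once to stay under rate limits.
        async with semaphore:
            return await post_edit_one(draft)

    # gather preserves the order of its arguments in the result list.
    return await asyncio.gather(*(guarded(d) for d in drafts))


async def fake_post_edit(draft: str) -> str:
    """Stand-in for a real API call so the sketch runs offline."""
    await asyncio.sleep(0)
    return draft.replace("êtes-vous", "allez-vous")


results = asyncio.run(
    post_edit_many(["Bonjour, comment êtes-vous ?", "Merci"], fake_post_edit)
)
print(results)
```

In real use, `fake_post_edit` would be replaced by a coroutine that awaits the chat completion call and returns the message content.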
Troubleshooting
- If the post-edited output is too verbose, add instructions like "Return only the translation without explanations."
- If the model output is unchanged, make the prompt more specific (for example, name the target register or audience) or confirm that you are calling a capable model such as gpt-4o.
- For API errors, verify that your OPENAI_API_KEY environment variable is set correctly.
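The first two symptoms can be detected automatically before a human looks at the output. A rough sketch; `diagnose_post_edit` and the length-ratio threshold of 2.0 are hypothetical choices, not an established metric:

```python
from typing import List


def diagnose_post_edit(
    raw: str,
    edited: str,
    max_length_ratio: float = 2.0,
) -> List[str]:
    """Flag post-edits that are unchanged or suspiciously long."""
    issues = []
    # Compare whitespace-normalized, case-folded text to catch no-op edits.
    if " ".join(raw.split()).casefold() == " ".join(edited.split()).casefold():
        issues.append("unchanged")
    # A reply much longer than the draft often contains explanations.
    if raw and len(edited) > max_length_ratio * len(raw):
        issues.append("too_verbose")
    return issues


print(diagnose_post_edit("Bonjour", "Bonjour"))
print(diagnose_post_edit("Bonjour", "Here is the improved translation: Bonjour !"))
```

A flagged result is only a hint to tighten the prompt and retry; short drafts in particular can legitimately grow or stay the same.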
Key takeaways
- Use a two-step approach: generate raw translation, then post-edit with clear instructions.
- Leverage streaming and async calls for efficient post-editing workflows.
- Choose models like gpt-4o or claude-3-5-sonnet-20241022 for best post-editing quality.