How to build an AI-powered resume parser
Quick answer
Use a large language model like gpt-4o to parse resumes by prompting it to extract structured fields such as name, skills, and experience. Send the resume text to client.chat.completions.create with a clear system prompt to guide the extraction.

Prerequisites

- Python 3.8+
- OpenAI API key
- pip install "openai>=1.0"
Setup
Install the OpenAI Python SDK and set your API key as an environment variable for secure access.
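On macOS or Linux, setting the key as an environment variable looks like this (the key value below is a placeholder, not a real key):

```shell
# Store the key in the environment so the script can read os.environ["OPENAI_API_KEY"]
export OPENAI_API_KEY="sk-your-key-here"
```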
pip install "openai>=1.0"

Step by step
This example shows how to send resume text to gpt-4o and parse key fields such as name, email, skills, and experience.
import os
from openai import OpenAI
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
resume_text = """
John Doe
Email: john.doe@example.com
Phone: (555) 123-4567
Skills: Python, JavaScript, Machine Learning
Experience:
- Software Engineer at TechCorp (2019-2023)
- Data Analyst at DataWorks (2017-2019)
"""
system_prompt = (
    "You are an expert resume parser. Extract the following fields from the resume text: "
    "Name, Email, Phone, Skills (as a list), Experience (as a list of job titles with dates). "
    "Return the result as a JSON object."
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": resume_text},
    ],
)
parsed_resume = response.choices[0].message.content
print(parsed_resume)

Output
{
"Name": "John Doe",
"Email": "john.doe@example.com",
"Phone": "(555) 123-4567",
"Skills": ["Python", "JavaScript", "Machine Learning"],
"Experience": [
"Software Engineer at TechCorp (2019-2023)",
"Data Analyst at DataWorks (2017-2019)"
]
}

Common variations
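A common first step after the call above is converting the returned string into a Python dict. The helper below is a sketch (the name extract_json is my own); it also strips the markdown code fences the model sometimes wraps around its answer:

```python
import json

def extract_json(reply: str) -> dict:
    """Parse the model's reply, tolerating optional markdown code fences around the JSON."""
    text = reply.strip()
    if text.startswith("```"):
        # Drop the opening fence line (with its optional "json" tag) and the closing fence
        text = text.split("\n", 1)[1].rsplit("```", 1)[0]
    return json.loads(text)

parsed = extract_json('{"Name": "John Doe", "Skills": ["Python"]}')
print(parsed["Name"])  # → John Doe
```

json.loads raises json.JSONDecodeError if the reply still is not valid JSON, which is a useful signal to retry or tighten the prompt.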
You can use asynchronous calls for better performance when parsing many resumes, or try other models such as claude-3-5-sonnet-20241022 (note that Claude models use Anthropic's separate anthropic SDK, not the openai client). Streaming responses can be used to show partial results for large resumes.
import os
import asyncio
from openai import AsyncOpenAI

# Use the async client: openai>=1.0 has no chat.completions.acreate method;
# AsyncOpenAI exposes the same .create call as an awaitable.
client = AsyncOpenAI(api_key=os.environ["OPENAI_API_KEY"])

async def parse_resume_async(resume_text: str):
    response = await client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": "Extract name, email, phone, skills, and experience as JSON."},
            {"role": "user", "content": resume_text},
        ],
    )
    return response.choices[0].message.content

resume_text = "John Doe\nEmail: john.doe@example.com\nSkills: Python, JavaScript"
parsed = asyncio.run(parse_resume_async(resume_text))
print(parsed)

Output
{
"Name": "John Doe",
"Email": "john.doe@example.com",
"Skills": ["Python", "JavaScript"]
}

Troubleshooting
- If the output is not valid JSON, instruct the model to return only JSON, or pass response_format={"type": "json_object"} to enforce JSON output.
- If the model misses fields, increase prompt clarity or use few-shot examples.
- For rate limits, implement exponential backoff retries.
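The backoff suggestion above can be sketched as a small wrapper (the helper name with_backoff and the delay values are illustrative); it retries any callable on failure, doubling the wait each attempt:

```python
import random
import time

def with_backoff(fn, max_retries=5, base_delay=1.0):
    """Call fn(); on exception, sleep base_delay * 2**attempt (plus jitter) and retry."""
    for attempt in range(max_retries):
        try:
            return fn()
        except Exception:
            if attempt == max_retries - 1:
                raise  # out of retries, surface the error
            time.sleep(base_delay * (2 ** attempt) + random.random() * 0.1)

# Usage with the parser above would look like:
# parsed = with_backoff(lambda: client.chat.completions.create(...))
```

In a real script you would catch openai.RateLimitError specifically rather than bare Exception, so that non-retryable errors (bad requests, auth failures) fail fast.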
Key Takeaways
- Use clear system prompts to guide the LLM to extract structured resume data.
- The gpt-4o model is effective for parsing resumes into JSON format.
- Async calls and streaming can improve performance for large or multiple resumes.