How to integrate Ollama into a web app
Quick answer
To integrate Ollama into a web app, use its local API by sending HTTP requests to the Ollama server running on your machine or server. You can call the API from Python or JavaScript by POSTing prompts and receiving AI-generated completions as JSON.

Prerequisites
- Python 3.8+
- Ollama installed and running locally
- pip install requests
- Basic knowledge of HTTP APIs
Setup
Install Ollama on your local machine from https://ollama.com and ensure the Ollama daemon is running. Install the requests Python package to make HTTP calls.
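On Linux, the setup can be sketched with the commands below (the install script URL follows ollama.com's documented one-liner, and llama2 is just the model used in this article's examples; on macOS and Windows, use the installer from https://ollama.com instead):

```shell
# Install Ollama (Linux one-liner; macOS/Windows use the graphical installer)
curl -fsSL https://ollama.com/install.sh | sh

# Start the daemon if it is not already running (it listens on port 11434)
ollama serve &

# Download the model you plan to call from your app
ollama pull llama2

# Install the Python HTTP client used in the examples below
pip install requests
```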
Run this command to install requests:
pip install requests

Step by step
Use Python to send a POST request to the Ollama local API endpoint http://localhost:11434/api/generate with your prompt and model name. By default this endpoint streams the result as newline-delimited JSON; setting "stream": false returns a single JSON object containing the generated text.
import requests

# Ollama's generate endpoint (the daemon listens on port 11434 by default)
OLLAMA_API_URL = "http://localhost:11434/api/generate"

# Example prompt
prompt = "Write a short poem about spring."

# Model to use (must already be pulled, e.g. with `ollama pull llama2`)
model = "llama2"

# Prepare the payload; "stream": False returns one JSON object instead of a stream
payload = {
    "model": model,
    "prompt": prompt,
    "stream": False,
}

# Send POST request to the Ollama API
response = requests.post(OLLAMA_API_URL, json=payload)

# Check response status
if response.status_code == 200:
    data = response.json()
    # The generated text is in the "response" field
    generated_text = data.get("response", "")
    print("Generated text:\n", generated_text)
else:
    print(f"Error: {response.status_code} - {response.text}")

output
Generated text: Spring whispers softly, blooms awake, Colors dance on every lake.
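If you omit "stream": false, the endpoint instead streams newline-delimited JSON chunks, each carrying a fragment of the text in its "response" field and a "done" flag on the final chunk. A sketch of consuming that stream (the function names and the haiku prompt are illustrative):

```python
import json


def collect_stream(lines):
    """Join the "response" fragments from newline-delimited JSON chunks."""
    parts = []
    for line in lines:
        if not line:
            continue  # skip keep-alive blank lines
        chunk = json.loads(line)
        parts.append(chunk.get("response", ""))
        if chunk.get("done"):
            break  # final chunk reached
    return "".join(parts)


def stream_generate(prompt, model="llama2"):
    """Call the local Ollama daemon and assemble the streamed response."""
    import requests  # deferred import so the parsing helper has no dependencies

    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={"model": model, "prompt": prompt},
        stream=True,
    )
    resp.raise_for_status()
    return collect_stream(resp.iter_lines(decode_unicode=True))
```

For example, print(stream_generate("Write a haiku about rain.")) waits for the full text; a chat-style UI would instead forward each fragment to the page as it arrives.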
Common variations
You can integrate Ollama into JavaScript web apps by using fetch to call the local API. You can also specify different models or adjust generation parameters such as temperature via the payload's options field.
async function callOllama(prompt) {
  const response = await fetch('http://localhost:11434/api/generate', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ model: 'llama2', prompt: prompt, stream: false })
  });
  if (response.ok) {
    const data = await response.json();
    console.log('Generated text:', data.response);
  } else {
    console.error('Error:', response.status, await response.text());
  }
}

callOllama('Explain quantum computing in simple terms.');

output
Generated text: Quantum computing uses quantum bits to perform complex calculations faster than classical computers.
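Generation parameters go in an "options" object inside the payload. A minimal Python sketch (the helper names and the default values 0.7 and 128 are illustrative choices, not Ollama defaults):

```python
def build_payload(model, prompt, temperature=0.7, num_predict=128):
    """Build a /api/generate payload with generation options.

    Keys under "options" tune the model runtime: temperature controls
    randomness, num_predict caps the number of tokens generated.
    """
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,
        "options": {"temperature": temperature, "num_predict": num_predict},
    }


def generate(model, prompt, **options):
    """POST the payload to the local Ollama daemon and return the text."""
    import requests  # deferred import; only needed for the live call

    resp = requests.post(
        "http://localhost:11434/api/generate",
        json=build_payload(model, prompt, **options),
    )
    resp.raise_for_status()
    return resp.json().get("response", "")
```

For instance, generate("llama2", "Summarize this page.", temperature=0.2) favors more deterministic output, which suits summarization better than creative writing.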
Troubleshooting
- If you get connection errors, ensure the Ollama daemon is running locally and listening on port 11434.
- Check firewall or network settings that might block localhost requests.
- Verify the model name is correct and installed in Ollama (ollama list shows installed models).
- If you call the API from browser JavaScript, the request is subject to CORS; allow your page's origin via the OLLAMA_ORIGINS environment variable.
- Use curl to test the API endpoint manually for debugging:

curl -X POST http://localhost:11434/api/generate -H "Content-Type: application/json" -d '{"model":"llama2","prompt":"Hello","stream":false}'

output
{
  "response": "Hello! How can I assist you today?"
}

Key Takeaways
- Use Ollama's local HTTP API at http://localhost:11434/api/generate to integrate AI into your web app.
- Send POST requests with JSON payloads specifying the model and prompt to get completions.
- Test connectivity and model availability before integrating into production.
- You can call Ollama from Python, JavaScript, or any HTTP-capable client.
- Ensure the Ollama daemon is running locally and accessible on the expected port (11434 by default).
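One way to test connectivity and model availability before going to production is to query /api/tags, which lists the models installed in the local daemon. A sketch, assuming the response shape {"models": [{"name": "llama2:latest"}, ...]} and a hypothetical check_model helper:

```python
def installed_models(tags_payload):
    """Extract model names from an /api/tags response body."""
    return [m.get("name", "") for m in tags_payload.get("models", [])]


def check_model(model="llama2", base_url="http://localhost:11434"):
    """Return True if the daemon is reachable and the model is pulled."""
    import requests  # deferred import; only needed for the live check

    resp = requests.get(base_url + "/api/tags", timeout=5)
    resp.raise_for_status()
    # Model names include a tag suffix such as ":latest"; compare the base name
    return any(name.split(":")[0] == model for name in installed_models(resp.json()))
```

Running check_model("llama2") at application startup lets you fail fast with a clear message (e.g. "run ollama pull llama2") instead of surfacing a confusing generation error later.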