How-to · Beginner · 3 min read

How to set up local AI for privacy

Quick answer
Use Ollama to run AI models locally on your machine, keeping your data private because prompts and responses never leave it. Install the Ollama CLI, pull a model, and interact with it via the CLI or the local HTTP API; no internet connection is needed once the model is downloaded.

PREREQUISITES

  • macOS or Linux machine
  • Python 3.8+
  • Ollama CLI installed
  • pip install requests (plus aiohttp for the async example)

Set up the Ollama CLI

Install the Ollama CLI to run AI models locally. Visit https://ollama.com/download to download and install the CLI for your OS. After installation, verify it works by running ollama --version in your terminal.

bash
brew install ollama
ollama --version
output
ollama version is 0.1.0

Step-by-step local AI usage

Download a model with ollama pull and run it entirely on your machine so that prompts and responses stay private. Use Python to interact with the local model via Ollama's HTTP API (the /api/generate endpoint).

python
import requests

# Ensure the Ollama daemon is running locally (start it with: ollama serve)
# Pull a model (e.g., llama2) once via CLI:
# ollama pull llama2

url = 'http://localhost:11434/api/generate'
data = {
    'model': 'llama2',
    'prompt': 'Explain how local AI improves privacy.',
    'stream': False,                  # return one JSON object instead of a stream
    'options': {'num_predict': 100}   # cap the response length
}

response = requests.post(url, json=data)
response.raise_for_status()
print(response.json()['response'])
output
Local AI improves privacy by processing data entirely on your device, preventing sensitive information from leaving your environment.
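
By default, the /api/generate endpoint streams its reply as newline-delimited JSON chunks; the example above disables this with 'stream': False. If you want tokens printed as the model produces them, here is a minimal streaming sketch, assuming the documented chunk format in which each line carries a 'response' fragment and a final 'done' flag:

python
import json
import requests

url = 'http://localhost:11434/api/generate'
data = {'model': 'llama2', 'prompt': 'Explain how local AI improves privacy.'}

# With streaming enabled (the default), Ollama sends one JSON object per
# line until the chunk with "done": true arrives.
with requests.post(url, json=data, stream=True) as response:
    for line in response.iter_lines():
        if not line:
            continue
        chunk = json.loads(line)
        print(chunk.get('response', ''), end='', flush=True)
        if chunk.get('done'):
            break
print()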

Common variations

You can run other models locally by pulling them with ollama pull <model-name>. For asynchronous calls, use Python's asyncio with aiohttp, as shown below. Ollama supports CLI, HTTP API, and SDK integrations for flexible usage; an SDK sketch follows the async example.

python
import asyncio
import aiohttp

async def query_ollama():
    url = 'http://localhost:11434/api/generate'
    data = {
        'model': 'llama2',
        'prompt': 'What are benefits of local AI?',
        'stream': False,                 # single JSON response
        'options': {'num_predict': 50}   # cap the response length
    }
    async with aiohttp.ClientSession() as session:
        async with session.post(url, json=data) as resp:
            result = await resp.json()
            print(result['response'])

asyncio.run(query_ollama())
output
Local AI benefits include enhanced privacy, no internet dependency, and faster response times.
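
For the SDK route, Ollama also publishes an official Python package (pip install ollama) that wraps the same local API. A minimal sketch, assuming that package is installed and the daemon is running as above:

python
import ollama

# Wraps POST /api/generate on localhost:11434; llama2 must already be
# pulled, exactly as in the HTTP examples above.
result = ollama.generate(model='llama2', prompt='Why run AI models locally?')
print(result['response'])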

Troubleshooting

  • If requests.post to localhost:11434 fails, the Ollama daemon is probably not running; start it with ollama serve (a quick sanity check is sketched after this list).
  • If the model is not found, pull it first using ollama pull <model-name>.
  • Check firewall settings that might block local HTTP requests.
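
As a quick sanity check for the first two issues, the sketch below probes the daemon's root endpoint (which replies "Ollama is running" when the server is up) and lists locally pulled models via /api/tags:

python
import requests

BASE = 'http://localhost:11434'

try:
    # The daemon answers "Ollama is running" on its root endpoint.
    print(requests.get(BASE + '/', timeout=2).text)

    # /api/tags lists every model already pulled to this machine.
    tags = requests.get(BASE + '/api/tags', timeout=2).json()
    print('Local models:', [m['name'] for m in tags.get('models', [])])
except requests.exceptions.ConnectionError:
    print('Ollama daemon not reachable; start it with `ollama serve`.')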

Key Takeaways

  • Use Ollama to run AI models locally for maximum data privacy.
  • Interact with local models via Ollama's HTTP API or CLI; no internet connection is needed once a model is pulled.
  • Pull and manage models locally to avoid cloud data exposure.
Verified 2026-04 · llama2