How to run Phi-3 locally
Quick answer
To run Phi-3 locally, install the Ollama CLI and pull the model with ollama pull phi3 (the model is named phi3 in Ollama's library). Then run it with ollama run phi3, or integrate it via the local Ollama HTTP API.

Prerequisites

- macOS or Linux machine
- Ollama CLI installed (https://ollama.com/docs/installation)
- Python 3.8+ for API integration
- pip install requests
Setup Ollama CLI
Install the Ollama CLI on your local machine to manage and run models like Phi-3. Ollama supports macOS and Linux: on macOS you can install it with Homebrew, and on Linux the official install script (curl -fsSL https://ollama.com/install.sh | sh) is the usual route.
Visit the official installation guide at https://ollama.com/docs/installation for the latest instructions.
brew install ollama

output:
==> Downloading https://github.com/ollama/ollama/releases/download/vX.Y.Z/ollama-darwin.zip
==> Installing Ollama CLI
==> Installation successful
Step by step to run Phi-3 locally
After installing Ollama, pull the phi3 model, then run it locally via the CLI or the HTTP API.
ollama pull phi3
ollama run phi3 "Hello, how are you?"

output:
Pulling phi3 model...
Model phi3 downloaded successfully.
> Hello, how are you?
Hello! I'm Phi-3, your local AI assistant. How can I help you today?
Run Phi-3 locally via Python API
Use Python requests to call the local Ollama API endpoint while the server (ollama serve) is running.
import requests

# Ensure the Ollama server is running first: ollama serve
url = "http://localhost:11434/api/generate"
headers = {"Content-Type": "application/json"}
data = {
    "model": "phi3",
    "prompt": "Write a short poem about AI.",
    "stream": False,  # return one JSON object instead of a token stream
    "options": {"num_predict": 100},  # cap the number of generated tokens
}
response = requests.post(url, json=data, headers=headers)
print(response.json()["response"])

output:
AI whispers softly,
In circuits and in code,
Dreams of silicon skies,
Where thoughts freely flow.
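When "stream" is left at its default, /api/generate instead streams its reply as newline-delimited JSON chunks, each carrying a "response" fragment and a final "done": true marker. Below is a minimal sketch of assembling a streamed reply; the join_stream helper and the sample chunks are ours, while the endpoint and field names follow Ollama's API.

```python
import json

def join_stream(lines):
    """Concatenate the "response" fragments from Ollama's
    newline-delimited JSON stream into one string."""
    parts = []
    for line in lines:
        chunk = json.loads(line)
        parts.append(chunk.get("response", ""))
        if chunk.get("done"):
            break
    return "".join(parts)

# Against a live server (ollama serve), the stream can be read like this:
# import urllib.request
# req = urllib.request.Request(
#     "http://localhost:11434/api/generate",
#     data=json.dumps({"model": "phi3", "prompt": "Hi"}).encode(),
#     headers={"Content-Type": "application/json"},
# )
# with urllib.request.urlopen(req) as resp:
#     print(join_stream(resp))

# Offline demonstration with two sample chunks:
sample = [
    '{"response": "Hello", "done": false}',
    '{"response": " there.", "done": true}',
]
print(join_stream(sample))  # Hello there.
```

Streaming lets you display tokens as they are generated instead of waiting for the whole completion.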
Common variations
- Run the Ollama server in the background with ollama serve for API access.
- Use different prompts, or adjust options.num_predict (the max-tokens setting) in API calls.
- Run the model interactively via the CLI with ollama run phi3.
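Another variation worth knowing: Ollama also exposes a chat-style endpoint, /api/chat, which takes a list of role-tagged messages instead of a flat prompt; with "stream": false the answer arrives in the reply's message.content field. A hedged sketch of building such a payload (the build_chat_payload helper name is our own, not part of Ollama):

```python
import json

def build_chat_payload(model, user_text, system_text=None):
    """Build a request body for Ollama's /api/chat endpoint.
    The helper is ours; the field layout follows Ollama's API."""
    messages = []
    if system_text:
        # Optional system message steers the model's behavior.
        messages.append({"role": "system", "content": system_text})
    messages.append({"role": "user", "content": user_text})
    return {"model": model, "messages": messages, "stream": False}

payload = build_chat_payload("phi3", "Summarize Phi-3 in one line.",
                             system_text="Answer briefly.")
print(json.dumps(payload, indent=2))

# POST it with requests once the server is running:
# reply = requests.post("http://localhost:11434/api/chat", json=payload).json()
# print(reply["message"]["content"])
```

The chat endpoint is the natural fit for multi-turn conversations, since prior exchanges are simply appended to the messages list.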
Troubleshooting
- If ollama pull phi3 fails, check your internet connection and Ollama version.
- If API calls to localhost:11434 fail, ensure the server is running (ollama serve).
- For permission errors, run CLI commands with appropriate user privileges.
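Before digging into failed API calls, it helps to confirm the server is actually listening: a plain GET on http://localhost:11434 returns a short status body when Ollama is up. Here is a sketch of a reachability check using only the standard library (the function name is ours):

```python
import urllib.error
import urllib.request

def server_reachable(base_url="http://localhost:11434", timeout=2.0):
    """Return True if an HTTP server answers at base_url."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, or timeout: server not reachable.
        return False

if server_reachable():
    print("Ollama server is up at localhost:11434")
else:
    print("No server at localhost:11434 -- start it with: ollama serve")
```

Running this before your API script turns a confusing connection traceback into a clear "start the server" message.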
Key Takeaways
- Install Ollama CLI to manage and run Phi-3 locally.
- Use ollama pull phi3 to download the model before running it.
- Run the model via the CLI, or call the local API at http://localhost:11434.
- Keep ollama serve running for API access.
- Use Python requests to integrate Phi-3 into your applications.