How-to · Beginner · 3 min read

How to use the Ollama Python library

Quick answer
Use the ollama Python library to run local AI models: install it via pip, make sure the Ollama app is running, then create a client with ollama.Client() and call its chat() method with your model and messages to get AI-generated responses.

PREREQUISITES

  • Python 3.8+
  • pip install ollama
  • Ollama app installed and running locally

Setup

Install the ollama Python package and ensure the Ollama app is installed and running on your machine. The Ollama app manages local AI models and serves requests.

bash
pip install ollama
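The library talks to models the Ollama app has already downloaded. If you have not pulled one yet, the Ollama CLI can fetch it (llama2 is the model used throughout this guide):

```shell
# Download the llama2 model so it can be served locally
ollama pull llama2

# Confirm it appears among the installed models
ollama list
```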

Step by step

Create a Client from the ollama package and use it to send prompts to a local model such as llama2. The example below shows a simple chat call; note that the reply text lives under the message key, not in an OpenAI-style choices list.

python
import ollama

client = ollama.Client()

response = client.chat(model="llama2", messages=[{"role": "user", "content": "Hello, how are you?"}])
print(response['message']['content'])
output
Hello! I'm doing great, thank you. How can I assist you today?
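The call above can be wrapped in a small reusable helper. This is a sketch: ask_model is an illustrative name, and the client is passed in so any object with a compatible chat method works:

```python
def ask_model(client, model, prompt):
    """Send a single user prompt and return the reply text."""
    response = client.chat(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    # The ollama library returns the reply under message -> content
    return response["message"]["content"]
```

With a running Ollama app you would call ask_model(ollama.Client(), "llama2", "Hello, how are you?").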

Common variations

  • Change the model parameter to use different local models installed in Ollama.
  • Pass stream=True to client.chat() to iterate over the response as it is generated.
  • Pass model options such as temperature or num_predict (the response token limit) via the options dictionary to customize output.
python
response = client.chat(
    model="llama2",
    messages=[{"role": "user", "content": "Explain quantum computing in simple terms."}],
    options={"temperature": 0.7, "num_predict": 150}
)
print(response['message']['content'])
output
Quantum computing uses quantum bits or qubits that can be in multiple states at once, enabling powerful parallel computations...
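Streaming, mentioned in the variations above, can be sketched as a generator. Here stream_chat is an illustrative helper name; with stream=True, client.chat() yields chunks whose text sits under message -> content:

```python
def stream_chat(client, model, prompt):
    """Yield the reply piece by piece as the model generates it."""
    stream = client.chat(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        stream=True,
    )
    for chunk in stream:
        yield chunk["message"]["content"]
```

Printing each piece with print(part, end="", flush=True) reproduces the familiar typing effect in a terminal.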

Troubleshooting

  • If you get connection errors, ensure the Ollama app is running locally.
  • Verify your model name matches one installed in Ollama by running ollama list in your terminal.
  • Update the ollama Python package if you encounter unexpected errors.
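The first check can also be handled in code. A minimal sketch, assuming safe_chat is an illustrative name and that a failed connection surfaces as a ConnectionError:

```python
def safe_chat(client, model, prompt):
    """Send a prompt, turning a missing Ollama app into a readable message."""
    try:
        response = client.chat(
            model=model,
            messages=[{"role": "user", "content": prompt}],
        )
        return response["message"]["content"]
    except ConnectionError:
        # Most often means the Ollama app is not running locally
        return "Error: could not connect to Ollama; is the app running?"
```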

Key Takeaways

  • Install the Ollama app and Python package to run local AI models easily.
  • Use the Ollama client’s chat method to send prompts and receive responses.
  • Customize the model and options such as temperature and num_predict for tailored outputs.
  • Ensure the Ollama app is running to avoid connection errors.
Verified 2026-04 · llama2