How-to · Beginner · 3 min read

How to run LLMs locally on your laptop

Quick answer
Use Ollama to run large language models locally on your laptop: install the ollama CLI, pull a supported model, and run it. Inference then happens entirely offline, with no API calls, which makes it a good fit for privacy-sensitive and low-latency applications.

PREREQUISITES

  • macOS or Windows laptop
  • Python 3.8+ (optional for scripting)
  • The ollama CLI, available from https://ollama.com/download
  • Basic terminal or command prompt usage

Set up the Ollama CLI

Install the ollama command-line interface (CLI) to manage and run local LLMs. Ollama supports macOS and Windows. Download the installer from the official site and follow the installation instructions.

bash
brew install ollama   # macOS via Homebrew; on Windows, run the installer
# or download the installer from https://ollama.com/download
ollama --version
output
ollama version is 0.1.0

Run a local LLM step by step

After installing, you can pull a model and run it locally. Ollama provides models like llama2 and others optimized for local use.

bash
ollama pull llama2
ollama run llama2 "Hello, how can I run LLMs locally?"
output
Hello! You can run LLMs locally using Ollama by pulling models and running them offline on your laptop.

Common variations

  • Use Ollama Python SDK for programmatic access to local models.
  • Run different models by specifying their names in ollama run.
  • Stream output interactively using the CLI or SDK.
python
import ollama

# Requires the Ollama server to be running and llama2 to be pulled.
response = ollama.chat(
    model="llama2",
    messages=[{"role": "user", "content": "Explain local LLM usage."}],
)
# The reply text lives under response["message"]["content"].
print(response["message"]["content"])
output
Local LLMs run on your laptop without internet, providing fast and private AI responses.
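Besides the Python SDK, Ollama also serves a REST API on your laptop (by default at http://localhost:11434), so you can talk to a local model using nothing but the standard library. Below is a minimal sketch, assuming the Ollama server is running (`ollama serve`) and the model has been pulled; the helper names `build_request` and `stream_chat` are illustrative, not part of Ollama:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/chat"  # Ollama's default local endpoint

def build_request(prompt, model="llama2", stream=True):
    """Build the JSON body expected by Ollama's /api/chat endpoint."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": stream,
    }

def stream_chat(prompt, model="llama2"):
    """Print a reply token by token from a locally running Ollama server."""
    body = json.dumps(build_request(prompt, model)).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        # Streaming replies arrive as one JSON object per line.
        for line in resp:
            chunk = json.loads(line)
            print(chunk["message"]["content"], end="", flush=True)
            if chunk.get("done"):
                break

# Example (requires a running Ollama server):
# stream_chat("Explain local LLM usage.")
```

Setting "stream" to false in the request body makes the server return the whole reply as a single JSON object instead of one chunk per line.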

Troubleshooting

  • If model download fails, check your internet connection and disk space.
  • For performance issues, ensure your laptop meets minimum RAM and CPU requirements.
  • Use ollama help to explore commands and options.

Key Takeaways

  • Install Ollama CLI to manage and run local LLMs easily on your laptop.
  • Pull and run supported models offline for privacy and low latency.
  • Use Ollama Python SDK for integrating local LLMs into your applications.
  • Troubleshoot by verifying system resources and network connectivity.
  • Ollama supports macOS and Windows with simple CLI commands.
Verified 2026-04 · llama2