How to use CrewAI with local LLMs
Quick answer
Install the CrewAI Python SDK and configure it to connect to your local LLM instance, such as llama.cpp or GPT4All. Initialize the local model client within CrewAI to run inference offline, without remote API calls.

Prerequisites
- Python 3.8+
- pip install crewai
- A local LLM installed (e.g., llama.cpp, GPT4All)
- Basic knowledge of Python
Setup
Install the crewai Python package and ensure your local LLM is properly installed and accessible. Set up environment variables if needed for configuration.
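The environment-variable configuration mentioned above might look like the following sketch; the variable name `CREWAI_MODEL_PATH` is an assumption for illustration, not an official setting:

```python
import os

def resolve_model_path(default="/path/to/local/model.bin"):
    """Return the model path from CREWAI_MODEL_PATH, falling back to a default.

    The variable name is a hypothetical choice for this sketch, not an
    official crewai setting; adapt it to your own configuration.
    """
    return os.environ.get("CREWAI_MODEL_PATH", default)
```

Keeping the path in an environment variable lets you switch models without editing code.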
pip install crewai

Step by step
Here is a complete example to initialize CrewAI with a local LLM client and generate text:
from crewai import CrewAI, LocalLLMClient
# Initialize local LLM client (example for llama.cpp or GPT4All)
local_llm = LocalLLMClient(model_path="/path/to/local/model.bin")
# Initialize CrewAI with the local LLM client
client = CrewAI(llm_client=local_llm)
# Generate text
prompt = "Explain the benefits of using local LLMs with CrewAI."
response = client.generate(prompt=prompt, max_tokens=150)
print(response.text)

Output
Explain the benefits of using local LLMs with CrewAI. Using local LLMs with CrewAI allows you to run AI models offline, ensuring data privacy, reducing latency, and avoiding API costs while maintaining flexible integration.
Common variations
- Use different local LLMs by changing the model_path in LocalLLMClient.
- Enable streaming output if supported by your local LLM.
- Swap LocalLLMClient for an API client to integrate CrewAI with remote LLMs.
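If your local backend supports streaming, consumption typically follows a generator pattern; in the sketch below, stream_generate is a stub standing in for whatever streaming method your client exposes, not a crewai API:

```python
from typing import Iterator

def stream_generate(prompt: str) -> Iterator[str]:
    """Stub standing in for a streaming client method; yields tokens one at a time."""
    for token in ["Local ", "models ", "stream ", "tokens."]:
        yield token

def consume_stream(prompt: str) -> str:
    """Print tokens as they arrive and return the assembled text."""
    pieces = []
    for token in stream_generate(prompt):
        print(token, end="", flush=True)  # incremental display, no waiting for the full reply
        pieces.append(token)
    return "".join(pieces)
```

Streaming improves perceived latency: the first tokens appear immediately instead of after the full generation finishes.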
# Async example
import asyncio

from crewai import CrewAI, LocalLLMClient
async def async_generate():
    local_llm = LocalLLMClient(model_path="/path/to/local/model.bin")
    client = CrewAI(llm_client=local_llm)
    response = await client.generate_async(prompt="Hello from async local LLM!", max_tokens=50)
    print(response.text)
asyncio.run(async_generate())

Output
Hello from async local LLM! This demonstrates asynchronous generation using CrewAI with a local model.
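Async pays off when you overlap several requests rather than await one at a time; the sketch below uses asyncio.gather with generate_stub as a stand-in for a call like client.generate_async, so it runs without any model installed:

```python
import asyncio

async def generate_stub(prompt: str) -> str:
    """Stub standing in for an async LLM call; a short sleep simulates inference I/O."""
    await asyncio.sleep(0.01)
    return f"response to: {prompt}"

async def generate_many(prompts):
    """Run several generations concurrently instead of one after another."""
    return await asyncio.gather(*(generate_stub(p) for p in prompts))

results = asyncio.run(generate_many(["a", "b", "c"]))
```

With three real requests, the total wall time approaches the slowest single request rather than the sum of all three.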
Troubleshooting
- If you see "Model not found", verify the model_path is correct and accessible.
- For "Permission denied" errors, check file permissions on the local model files.
- If generation is slow, make sure your hardware meets the local LLM's requirements.
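The first two checks can be automated as a pre-flight step before handing the path to the client, using only the standard library:

```python
import os

def check_model_file(path: str) -> str:
    """Return 'ok' if the model file exists and is readable; otherwise name the problem."""
    if not os.path.exists(path):
        return "model not found"    # wrong or inaccessible path
    if not os.access(path, os.R_OK):
        return "permission denied"  # file exists but is not readable by this user
    return "ok"
```

Running this before initialization turns vague load failures into a specific, actionable message.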
Key Takeaways
- Install CrewAI and a compatible local LLM to run AI models offline.
- Initialize CrewAI with a local LLM client by specifying the model path.
- Use async generation to run multiple requests concurrently with local models.
- Check file paths and permissions if local model loading fails.