How to convert a model to GGUF format
Quick answer
To convert a LLaMA or compatible model to GGUF format for llama.cpp, use the official convert-llama-to-gguf.py script from the llama.cpp repository. This Python script converts original checkpoint files into the efficient .gguf format required by llama.cpp.
Prerequisites
- Python 3.8+
- Git installed
- Basic command-line usage
- Access to the original LLaMA model checkpoint files
- pip install torch numpy
Setup
Clone the llama.cpp repository and install required Python packages for conversion.
- Ensure you have Python 3.8 or higher installed.
- Install torch and numpy for the conversion script.
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
python3 -m pip install torch numpy

Output:
Cloning into 'llama.cpp'...
...
Requirement already satisfied: torch in /usr/local/lib/python3.10/site-packages (1.14.0)
Requirement already satisfied: numpy in /usr/local/lib/python3.10/site-packages (1.25.0)
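Once the repository is cloned and the packages are installed, you can verify the setup programmatically. The snippet below is a small standalone sketch (check_environment is an illustrative helper, not part of llama.cpp) that checks the interpreter version and that the required packages are importable:

```python
import importlib.util
import sys

def check_environment(min_version=(3, 8), packages=("torch", "numpy")):
    """Return a list of setup problems; an empty list means you are ready to convert."""
    problems = []
    if sys.version_info[:2] < min_version:
        problems.append(
            f"Python {min_version[0]}.{min_version[1]}+ required, "
            f"found {sys.version_info.major}.{sys.version_info.minor}"
        )
    for name in packages:
        # find_spec returns None when a top-level module cannot be imported
        if importlib.util.find_spec(name) is None:
            problems.append(f"missing package: {name}")
    return problems

if __name__ == "__main__":
    issues = check_environment()
    print("environment OK" if not issues else "\n".join(issues))
```

Running this before the conversion catches a bad environment immediately instead of partway through a long conversion.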
Step by step
Use the convert-llama-to-gguf.py script to convert your original model checkpoint files (.bin or .pth) to .gguf format.
Example command:
python3 convert-llama-to-gguf.py --input-dir /path/to/original/checkpoints --output /path/to/output/model.gguf

Output:
Loading model from /path/to/original/checkpoints
Converting to GGUF format...
Saving GGUF model to /path/to/output/model.gguf
Conversion completed successfully.
Common variations
You can specify additional options such as:
- --model-size to indicate the model size (e.g., 7B, 13B)
- --vocab-only to convert only the tokenizer vocabulary
- Use different Python environments or virtualenvs for isolation
For example, to convert a 13B model:
python3 convert-llama-to-gguf.py --input-dir /models/llama-13b --output llama-13b.gguf --model-size 13B

Output:
Loading model from /models/llama-13b
Converting 13B model to GGUF format...
Saving GGUF model to llama-13b.gguf
Conversion completed successfully.
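If you have several checkpoint directories to convert, the per-model command can be scripted. The sketch below (build_command and convert_all are illustrative helpers, not shipped with llama.cpp) assembles the same command line shown above and runs it once per model directory:

```python
import subprocess
from pathlib import Path

def build_command(input_dir, output_path,
                  script="convert-llama-to-gguf.py", model_size=None):
    """Assemble the conversion command line used in the examples above."""
    cmd = ["python3", script,
           "--input-dir", str(input_dir),
           "--output", str(output_path)]
    if model_size is not None:
        cmd += ["--model-size", model_size]
    return cmd

def convert_all(model_dirs, out_dir, **kwargs):
    """Run the converter once per checkpoint directory (e.g. llama-7b, llama-13b)."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for src in map(Path, model_dirs):
        cmd = build_command(src, out / f"{src.name}.gguf", **kwargs)
        # check=True raises CalledProcessError if a conversion fails, stopping the batch
        subprocess.run(cmd, check=True)
```

For example, convert_all(["/models/llama-7b", "/models/llama-13b"], "gguf-out") would produce gguf-out/llama-7b.gguf and gguf-out/llama-13b.gguf, assuming the script and flags behave as described above.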
Troubleshooting
- If you see ModuleNotFoundError, ensure torch and numpy are installed.
- If the script fails to find checkpoint files, verify that the --input-dir path is correct and contains the original model files.
- Check Python version compatibility; use Python 3.8 or newer.
- For permission errors, run the command with appropriate user rights or check file permissions.
Key Takeaways
- Use the official convert-llama-to-gguf.py script from the llama.cpp repo to convert models.
- Install torch and numpy before running the conversion script.
- Specify the correct input directory and output file path for successful conversion.
- Python 3.8+ is required for compatibility with the conversion script.