How to convert a model to GGUF format
Quick answer
To convert a LLaMA or compatible model to GGUF format for llama.cpp, use the official convert-llama-to-gguf.py script from the llama.cpp repository. This Python script converts original checkpoint files into the efficient .gguf format required by llama.cpp.
Prerequisites
- Python 3.8+
- Git installed
- Basic command-line usage
- Access to the original LLaMA model checkpoint files
- pip install torch numpy
Setup
Clone the llama.cpp repository and install required Python packages for conversion.
- Ensure you have Python 3.8 or higher installed.
- Install torch and numpy for the conversion script.
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
python3 -m pip install torch numpy

Output:
Cloning into 'llama.cpp'...
...
Requirement already satisfied: torch in /usr/local/lib/python3.10/site-packages (1.14.0)
Requirement already satisfied: numpy in /usr/local/lib/python3.10/site-packages (1.25.0)
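Once the repository is cloned and the packages are installed, you can verify the setup programmatically. The snippet below is a small standalone sketch (check_environment is an illustrative helper, not part of llama.cpp) that checks the interpreter version and that the required packages are importable:

```python
import importlib.util
import sys

def check_environment(min_version=(3, 8), packages=("torch", "numpy")):
    """Return a list of setup problems; an empty list means you are ready to convert."""
    problems = []
    if sys.version_info[:2] < min_version:
        problems.append(
            f"Python {min_version[0]}.{min_version[1]}+ required, "
            f"found {sys.version_info.major}.{sys.version_info.minor}"
        )
    for name in packages:
        # find_spec returns None when a top-level module cannot be imported
        if importlib.util.find_spec(name) is None:
            problems.append(f"missing package: {name}")
    return problems

if __name__ == "__main__":
    issues = check_environment()
    print("environment OK" if not issues else "\n".join(issues))
```

Running this before the conversion catches a bad environment immediately instead of partway through a long conversion.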
Step by step
Use the convert-llama-to-gguf.py script to convert your original model checkpoint files (.bin or .pth) to .gguf format.
Example command:
python3 convert-llama-to-gguf.py --input-dir /path/to/original/checkpoints --output /path/to/output/model.gguf

Output:
Loading model from /path/to/original/checkpoints
Converting to GGUF format...
Saving GGUF model to /path/to/output/model.gguf
Conversion completed successfully.
Common variations
You can specify additional options such as:
- --model-size to indicate the model size (e.g., 7B, 13B)
- --vocab-only to convert only the tokenizer vocabulary
- Use different Python environments or virtualenvs for isolation
For example, to convert a 13B model:
python3 convert-llama-to-gguf.py --input-dir /models/llama-13b --output llama-13b.gguf --model-size 13B

Output:
Loading model from /models/llama-13b
Converting 13B model to GGUF format...
Saving GGUF model to llama-13b.gguf
Conversion completed successfully.
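If you have several checkpoint directories to convert, the per-model command can be scripted. The sketch below (build_command and convert_all are illustrative helpers, not shipped with llama.cpp) assembles the same command line shown above and runs it once per model directory:

```python
import subprocess
from pathlib import Path

def build_command(input_dir, output_path,
                  script="convert-llama-to-gguf.py", model_size=None):
    """Assemble the conversion command line used in the examples above."""
    cmd = ["python3", script,
           "--input-dir", str(input_dir),
           "--output", str(output_path)]
    if model_size is not None:
        cmd += ["--model-size", model_size]
    return cmd

def convert_all(model_dirs, out_dir, **kwargs):
    """Run the converter once per checkpoint directory (e.g. llama-7b, llama-13b)."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for src in map(Path, model_dirs):
        cmd = build_command(src, out / f"{src.name}.gguf", **kwargs)
        # check=True raises CalledProcessError if a conversion fails, stopping the batch
        subprocess.run(cmd, check=True)
```

For example, convert_all(["/models/llama-7b", "/models/llama-13b"], "gguf-out") would produce gguf-out/llama-7b.gguf and gguf-out/llama-13b.gguf, assuming the script and flags behave as described above.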
Troubleshooting
- If you see ModuleNotFoundError, ensure torch and numpy are installed.
- If the script fails to find checkpoint files, verify that the --input-dir path is correct and contains the original model files.
- Check Python version compatibility; use Python 3.8 or newer.
- For permission errors, run the command with appropriate user rights or check file permissions.
Key Takeaways
- Use the official convert-llama-to-gguf.py script from the llama.cpp repo to convert models.
- Install torch and numpy before running the conversion script.
- Specify the correct input directory and output file path for successful conversion.
- Python 3.8+ is required for compatibility with the conversion script.