This repository helps convert Large Language Models (LLMs) into the GGUF format.
To get started, initialize the submodules, set up a conda environment, and build llama.cpp:

```bash
git submodule update --init --recursive
conda create -n gguf python=3.10
conda activate gguf
cd llama.cpp && make && pip install -r requirements.txt
```
To convert a model to GGUF format, run `gguf.py` with the following arguments:
- `--model-name`: the Hugging Face model ID to convert, e.g. `Qwen/Qwen1.5-1.8B`.
- `--methods`: comma-separated quantization methods to apply, e.g. `q2_k`, `q3_k_m`, `q4_0`, `q4_k_m`, `q5_0`, `q5_k_m`, `q6_k`, `q8_0`. Default: `q4_0`.
- `--local-dir`: directory where the original model downloaded from the Hugging Face Hub is stored. Default: `./original_model/`.
- `--quantized-dir`: directory where the quantized models are written. Default: `./quantized_model/`.
For example:

```bash
python gguf.py "Qwen/Qwen1.5-1.8B" --methods "q4_0"
```
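
For reference, below is a minimal sketch of the conversion pipeline such a script typically drives: download the original model, convert it to an FP16 GGUF file, then quantize once per requested method. It assumes llama.cpp's `convert_hf_to_gguf.py` script and `llama-quantize` binary (names vary across llama.cpp versions); the paths and structure here are illustrative, not necessarily the repository's actual implementation.

```python
# Illustrative sketch only; assumes llama.cpp provides convert_hf_to_gguf.py
# and a llama-quantize binary (names differ between llama.cpp versions).
import argparse
import subprocess
from pathlib import Path

from huggingface_hub import snapshot_download  # pip install huggingface-hub


def main() -> None:
    parser = argparse.ArgumentParser(description="Convert an HF model to GGUF.")
    parser.add_argument("model_name", help="Hugging Face model ID, e.g. Qwen/Qwen1.5-1.8B")
    parser.add_argument("--methods", default="q4_0", help="comma-separated quantization methods")
    parser.add_argument("--local-dir", default="./original_model/")
    parser.add_argument("--quantized-dir", default="./quantized_model/")
    args = parser.parse_args()

    # 1. Download the original model from the Hugging Face Hub.
    snapshot_download(repo_id=args.model_name, local_dir=args.local_dir)

    # 2. Convert the HF checkpoint to an intermediate FP16 GGUF file.
    out_dir = Path(args.quantized_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    fp16_path = out_dir / "model-f16.gguf"
    subprocess.run(
        ["python", "llama.cpp/convert_hf_to_gguf.py", args.local_dir,
         "--outfile", str(fp16_path), "--outtype", "f16"],
        check=True,
    )

    # 3. Quantize the FP16 GGUF once per requested method.
    for method in args.methods.split(","):
        method = method.strip()
        out_path = out_dir / f"model-{method}.gguf"
        subprocess.run(
            ["llama.cpp/llama-quantize", str(fp16_path), str(out_path), method],
            check=True,
        )


if __name__ == "__main__":
    main()
```

Each method listed in `--methods` produces its own `.gguf` file, so several quantized variants can be generated in a single run.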