Llama 2 (7B) finetuned on a 50k-example instruction-tuning dataset generated with GPT-4. The data is from here. llama-finetune.ipynb contains code that finetunes 8 of the 32 layers of Llama 2 7B while keeping the rest frozen. Training stats were logged with Weights and Biases, and the corresponding report can be found here. Finetuning took ~1 hr on a runpod A100 with 80 GB VRAM.
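The partial-finetuning idea above (train 8 of 32 layers, freeze the rest) can be sketched in plain PyTorch. This is a minimal illustration with a toy stand-in model, not the notebook's actual code; the real model is Llama 2 7B loaded via transformers, and `N_TRAINABLE` mirrors the 8-layer choice:

```python
import torch.nn as nn

# Toy stand-in for a 32-layer decoder-only LM (the notebook uses Llama 2 7B).
model = nn.ModuleDict({
    "layers": nn.ModuleList([nn.Linear(16, 16) for _ in range(32)]),
    "lm_head": nn.Linear(16, 16),
})

N_TRAINABLE = 8  # finetune only the last 8 of 32 layers

# Freeze everything first...
for p in model.parameters():
    p.requires_grad = False

# ...then unfreeze the last N_TRAINABLE layers and the LM head.
for layer in model["layers"][-N_TRAINABLE:]:
    for p in layer.parameters():
        p.requires_grad = True
for p in model["lm_head"].parameters():
    p.requires_grad = True

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"trainable params: {trainable} / {total}")
```

Freezing the bulk of the network like this is what keeps the memory footprint and training time low enough for a single A100.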
The model can be found on Huggingface.
The GPT4-based eval can be seen here.
- GPT4-based eval
- Documentation
- Parameter-efficient finetuning using QLoRA
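For the QLoRA item above, a typical setup quantizes the base model to 4-bit and attaches low-rank adapters via peft. This is a minimal sketch of that stack, assuming the transformers/bitsandbytes/peft APIs; the hyperparameters and target modules are illustrative, not necessarily the notebook's exact values:

```python
# Minimal QLoRA setup sketch; r, alpha, dropout, and target_modules
# are illustrative assumptions, not the notebook's exact config.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,               # quantize base weights to 4-bit
    bnb_4bit_quant_type="nf4",       # NormalFloat4, as in the QLoRA paper
    bnb_4bit_use_double_quant=True,  # also quantize the quantization constants
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",
    quantization_config=bnb_config,
    device_map="auto",
)

lora_config = LoraConfig(
    r=16,                                 # rank of the LoRA update matrices
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # adapters on attention projections
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the adapter weights are trainable
```

The frozen 4-bit base plus small bf16 adapters is what lets a 7B model finetune within a single GPU's memory budget.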
I followed several tutorials and references for this, starting from Maxime Labonne's LLM Course. The code draws on a few sources: this Weights and Biases tutorial, Maxime Labonne's blog, and this repo here.