sleekmike / finetune_gpt-j_6b_8-bit

Fine-tuning GPT-J-6B on colab or equivalent PC GPU with your custom datasets: 8-bit weights with low-rank adaptors (LoRA)

License: MIT License

Python 6.18% Jupyter Notebook 93.82%

Contributors: sleekmike

finetune_gpt-j_6b_8-bit's Issues

Fine Tuning Pythia 12B

Hi @sleekmike
Great work on the notebook. I just wanted to check on the possibility of fine-tuning Pythia 12B or a smaller variant.
I have some specific use cases for which I want to explore Pythia 12B.
Do you know whether the code you published in this repo would also work with Pythia variants? If not, could you help create a blog post or notebook for that as well?

Running without CUDA (bitsandbytes)

Heya,

Almost got this working with DirectML on Windows, but bitsandbytes requires CUDA on Linux. I'm gonna run out of VRAM if I don't do the monkey patching seen in the repo, so I was wondering if you had any ideas for not depending on bitsandbytes.

Thanks!

RuntimeError: Output 0 of DequantizeAndLinearBackward is a view when running finetuning example

When I run this example in Jupyter Lab and start fine-tuning on the codeparrot example, I get the following error message:

RuntimeError: Output 0 of DequantizeAndLinearBackward is a view and is being modified inplace. This view was created inside a custom Function (or because an input was returned as-is) and the autograd logic to handle view+inplace would override the custom backward associated with the custom Function, leading to incorrect gradients. This behavior is forbidden. You can fix this by cloning the output of the custom Function.

Can you please help me?
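
The error message itself suggests cloning the output of the custom Function before modifying it in place. Below is a minimal sketch of that pattern, assuming PyTorch is installed. `ToyLinearFunction` and `forward_with_adapter` are hypothetical stand-ins written for illustration (no quantization involved); they are not the repo's `DequantizeAndLinear` code, and the shapes are arbitrary.

```python
import torch


class ToyLinearFunction(torch.autograd.Function):
    """Hypothetical stand-in for a custom linear Function with a frozen weight."""

    @staticmethod
    def forward(ctx, x, weight):
        ctx.save_for_backward(weight)
        # Flatten the leading dims, multiply, then reshape back.
        # The final .view() means the returned tensor is a view.
        out = x.reshape(-1, x.shape[-1]) @ weight.t()
        return out.view(*x.shape[:-1], weight.shape[0])

    @staticmethod
    def backward(ctx, grad_output):
        (weight,) = ctx.saved_tensors
        # Gradient w.r.t. the input only; the weight is treated as frozen.
        return grad_output @ weight, None


def forward_with_adapter(x, weight, adapter):
    out = ToyLinearFunction.apply(x, weight)
    # Doing `out += adapter(x)` directly on the Function output is the kind of
    # in-place update the error message complains about.
    out = out.clone()   # the fix suggested by the error message
    out += adapter(x)   # in-place add on the clone
    return out


torch.manual_seed(0)
x = torch.randn(2, 3, 4, requires_grad=True)
weight = torch.randn(5, 4)                     # frozen base weight
adapter = torch.nn.Linear(4, 5, bias=False)    # small trainable adapter

y = forward_with_adapter(x, weight, adapter)
y.sum().backward()
```

An out-of-place add (`out = out + adapter(x)`) avoids the in-place modification as well; either way, gradients still flow through the custom backward.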

Finetuning error

Hi,

I get this error when fine-tuning the model:

RuntimeError: Output 0 of DequantizeAndLinearBackward is a view and is being modified inplace. This view was created inside a custom Function (or because an input was returned as-is) and the autograd logic to handle view+inplace would override the custom backward associated with the custom Function, leading to incorrect gradients. This behavior is forbidden. You can fix this by cloning the output of the custom Function.

Do I need to save a checkpoint to avoid this? I'm not sure how that would work.

Thanks,
a

Wow this is awesome

Able to finetune 6B on an RTX 3080 with this! Had to change batch to 32 but that's about it.
