notebooks's Issues
How to push the finetuned Mistral model to HF hub?
Thank you for your notebook "Fine-tuning Mistral on your own data".
However, once the model has been fine-tuned on our own data, how do we push the fine-tuned Mistral model to the HF hub?
Thanks!
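A minimal sketch of one way to do this, assuming `base_model`, `tokenizer`, and the checkpoint directory come from the notebook; the Hub repo name is a placeholder:

from huggingface_hub import login
from peft import PeftModel

login()  # authenticate with a token that has write access

# Load the trained adapter onto the quantized base model, then push it.
ft_model = PeftModel.from_pretrained(base_model, "mistral-viggo-finetune/checkpoint-1000")
ft_model.push_to_hub("your-username/mistral-viggo-finetune")   # adapter weights + config
tokenizer.push_to_hub("your-username/mistral-viggo-finetune")

Note this uploads only the LoRA adapter, not a merged full model.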
Russian characters in the results of inference before finetune
Hi,
I tried to run the Mistral fine-tuning with QLoRA notebook, and the first inference on the eval_prompt (before fine-tuning) produced Russian characters, as shown below.
Given a target sentence construct the underlying meaning representation of the input sentence as a single function with attributes and attribute values.
This function should describe the target string accurately and the function must be one of the following ['inform', 'request', 'give_opinion', 'confirm', 'verify_attribute', 'suggest', 'request_explanation', 'recommend', 'request_attribute'].
The attributes must be one of the following: ['name', 'exp_release_date', 'release_year', 'developer', 'esrb', 'rating', 'genres', 'player_perspective', 'has_multiplayer', 'platforms', 'available_on_steam', 'has_linux_release', 'has_mac_release', 'specifier']
### Target sentence:
I remember you saying you found Little Big Adventure to be average. Are you not usually that into single-player games on PlayStation?
### Meaning representation:
д
### Meaning representation:
{
"function": "inform",
"attributes": {
"name": "Little Big Adventure",
"exp_release_date": "1994-01-01",
"release_year": 1994,
"developer": "Adeline Software International",
"esrb": "E",
"rating": 3,
"genres": ["Action", "Adventure"],
"player_perspective": "Third-person",
"has_multiplayer": false,
"platforms": ["PlayStation"],
"available_on_steam": false,
"has_linux_release": false,
"has_mac_release": false,
"specifier": "average"
}
}
Is there something wrong, or is this a normal prediction from the base model?
Thanks.
Issue w/ PeftModel.from_pretrained
When I run the tutorial here:
https://github.com/brevdev/notebooks/blob/main/mistral-finetune.ipynb
everything works until
ft_model = PeftModel.from_pretrained(base_model, "mistral-viggo-finetune/checkpoint-950")
which gives me:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/paperspace/.local/lib/python3.11/site-packages/peft/peft_model.py", line 332, in from_pretrained
model.load_adapter(model_id, adapter_name, is_trainable=is_trainable, **kwargs)
File "/home/paperspace/.local/lib/python3.11/site-packages/peft/peft_model.py", line 632, in load_adapter
load_result = set_peft_model_state_dict(self, adapters_weights, adapter_name=adapter_name)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/paperspace/.local/lib/python3.11/site-packages/peft/utils/save_and_load.py", line 158, in set_peft_model_state_dict
load_result = model.load_state_dict(peft_model_state_dict, strict=False)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/paperspace/.anaconda/envs/mft_env/lib/python3.11/site-packages/torch/nn/modules/module.py", line 2027, in load_state_dict
load(self, state_dict)
File "/home/paperspace/.anaconda/envs/mft_env/lib/python3.11/site-packages/torch/nn/modules/module.py", line 2015, in load
load(child, child_state_dict, child_prefix)
File "/home/paperspace/.anaconda/envs/mft_env/lib/python3.11/site-packages/torch/nn/modules/module.py", line 2015, in load
load(child, child_state_dict, child_prefix)
File "/home/paperspace/.anaconda/envs/mft_env/lib/python3.11/site-packages/torch/nn/modules/module.py", line 2015, in load
load(child, child_state_dict, child_prefix)
[Previous line repeated 5 more times]
File "/home/paperspace/.anaconda/envs/mft_env/lib/python3.11/site-packages/torch/nn/modules/module.py", line 2009, in load
module._load_from_state_dict(
File "/home/paperspace/.local/lib/python3.11/site-packages/bitsandbytes/nn/modules.py", line 256, in _load_from_state_dict
self.weight, state_dict = bnb.nn.Params4bit.from_state_dict(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/paperspace/.local/lib/python3.11/site-packages/bitsandbytes/nn/modules.py", line 158, in from_state_dict
data = state_dict.pop(prefix.rstrip('.'))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
KeyError: 'base_model.model.model.layers.0.self_attn.q_proj.base_layer.weight'
I'm running
transformers==4.34.0
torch==2.0.1
...
(not sure what other package versions are relevant, but happy to share)
Anyone have any thoughts? Thanks!
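A hedged guess at the cause, not a confirmed fix: the missing `base_layer` key often indicates a peft version mismatch between training and loading (newer peft releases wrap LoRA target modules in a `base_layer`), or a base model that was already PEFT-wrapped. Checking versions, then reloading a clean quantized base model before attaching the adapter, is worth trying; `base_model_id` and `bnb_config` below are assumed to be defined as in the notebook:

import peft, transformers, bitsandbytes
print(peft.__version__, transformers.__version__, bitsandbytes.__version__)

from transformers import AutoModelForCausalLM
from peft import PeftModel

# Reload a fresh (non-PEFT-wrapped) base model, then attach the adapter.
base_model = AutoModelForCausalLM.from_pretrained(
    base_model_id, quantization_config=bnb_config, device_map="auto"
)
ft_model = PeftModel.from_pretrained(base_model, "mistral-viggo-finetune/checkpoint-950")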
dataset schema
If I want the JSON data to have separate input and output fields, what should the format be, and are any code changes needed?
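A minimal sketch of one workable schema, assuming a JSONL file with hypothetical `input`/`output` field names; the notebook's formatting function would then need to join the two fields into a single prompt string:

# Each line of the JSONL file (hypothetical field names):
# {"input": "Summarize: ...", "output": "..."}

def formatting_func(example):
    # Combine the two fields into the single text string the trainer tokenizes.
    return f"### Input:\n{example['input']}\n\n### Output:\n{example['output']}"

The "### Input / ### Output" template is only an illustration; any consistent template works, as long as you use the same one at inference time.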
pip install problem
Hello,
When I run the notebook for fine-tuning Phi-2 on your own data, I hit the following error while pip-installing the required packages:
ERROR: Could not install packages due to an OSError: [Errno 13] Permission denied: '/home/ubuntu/.pyenv/versions/3.10.14/etc/jupyter/nbconfig'
Consider using the `--user` option or check the permissions.
I would be grateful if you could explain how to fix this problem!
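A hedged workaround, assuming the failure is purely a filesystem permission issue on the pyenv tree: install into the user site-packages instead. The package list below is an assumption based on the notebook's requirements:

import sys, subprocess

# Run pip for the current interpreter with --user so nothing is written
# to directories the notebook user doesn't own.
subprocess.check_call([
    sys.executable, "-m", "pip", "install", "--user",
    "transformers", "accelerate", "peft", "bitsandbytes", "datasets",
])

Creating a virtual environment you own would work equally well.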
At inference time, the EOS token never seems to be emitted (mistral-finetune.ipynb)
First, thanks for the helpful notebooks!
Second, when I run the generation code at the end of this notebook, no matter what I try, I can't seem to get it to properly emit a </s>
token ... and thus it just keeps generating text until it reaches the max token limit. Is there any way to fix this?
For example, here's the kind of generations I'm getting:
<s> Given a target sentence construct the underlying meaning representation of the input sentence as a single function with attributes and attribute values.
This function should describe the target string accurately and the function must be one of the following ['inform', 'request', 'give_opinion', 'confirm', 'verify_attribute', 'suggest', 'request_explanation', 'recommend', 'request_attribute'].
The attributes must be one of the following: ['name', 'exp_release_date', 'release_year', 'developer', 'esrb', 'rating', 'genres', 'player_perspective', 'has_multiplayer', 'platforms', 'available_on_steam', 'has_linux_release', 'has_mac_release', 'specifier']
### Target sentence:
Earlier, you stated that you didn't have strong feelings about PlayStation's Little Big Adventure. Is your opinion true for all games which don't have multiplayer?
### Meaning representation:
inform(name[Little Big Adventure], developer[PlayStation], esrb[no rating], genres[action, adventure], player_perspective[third person], has_multiplayer[no])
### Meaning representation:
The target sentence is a single function with attributes ['name', 'developer', 'esrb', 'genres', 'player_perspective', 'has_multiplayer'].
### Meaning representation:
The target sentence is
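A hedged sketch of the usual fix: if the tokenized training examples never end with an EOS token, the model never learns to stop. Enabling `add_eos_token` when building the tokenizer (supported by the Llama-family tokenizer Mistral uses), or appending it manually in the tokenize step, typically resolves this; the max_length and padding values below are illustrative assumptions:

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(
    "mistralai/Mistral-7B-v0.1",
    add_eos_token=True,   # append </s> to every tokenized example
)

def tokenize(prompt):
    result = tokenizer(prompt, truncation=True, max_length=512, padding="max_length")
    result["labels"] = result["input_ids"].copy()
    return result

After retraining on EOS-terminated examples, generation should stop on its own.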
Can't fine-tune Phi-2
GPU RAM maxes out as soon as training starts.
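A hedged sketch of common memory-saving knobs, not a confirmed fix for this report; all values below are illustrative assumptions:

from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="phi2-finetune",
    per_device_train_batch_size=1,   # smallest possible batch
    gradient_accumulation_steps=8,   # preserve the effective batch size
    gradient_checkpointing=True,     # recompute activations to save VRAM
    optim="paged_adamw_8bit",        # bitsandbytes paged optimizer
    bf16=True,
)

If memory still spikes at startup, loading the base model in 4-bit via BitsAndBytesConfig (as in the Mistral notebook) is the other big lever.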
Encountered errors finetuning Mistral-7B model
Thanks a lot for sharing the recipes, which are extremely helpful. I was able to run the Phi-2 fine-tuning recipe successfully. However, I had no luck with the Mistral-7B recipe and ran into the following error:
File "/home/jou2019/workspace/notebooks/mistral-finetune.py", line 221, in <module>
trainer.train()
File "/home/jou2019/miniconda3/lib/python3.11/site-packages/transformers/trainer.py", line 1561, in train
return inner_training_loop(
^^^^^^^^^^^^^^^^^^^^
File "/home/jou2019/miniconda3/lib/python3.11/site-packages/transformers/trainer.py", line 1895, in _inner_training_loop
tr_loss_step = self.training_step(model, inputs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/jou2019/miniconda3/lib/python3.11/site-packages/transformers/trainer.py", line 2826, in training_step
self.accelerator.backward(loss)
File "/home/jou2019/miniconda3/lib/python3.11/site-packages/accelerate/accelerator.py", line 1966, in backward
loss.backward(**kwargs)
File "/home/jou2019/miniconda3/lib/python3.11/site-packages/torch/_tensor.py", line 522, in backward
torch.autograd.backward(
File "/home/jou2019/miniconda3/lib/python3.11/site-packages/torch/autograd/__init__.py", line 266, in backward
Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
File "/home/jou2019/miniconda3/lib/python3.11/site-packages/torch/autograd/function.py", line 289, in apply
return user_fn(self, *args)
^^^^^^^^^^^^^^^^^^^^
File "/home/jou2019/miniconda3/lib/python3.11/site-packages/torch/utils/checkpoint.py", line 275, in backward
tensors = ctx.saved_tensors
^^^^^^^^^^^^^^^^^
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [1, 1, 512, 512]] is at version 34; expected version 32 instead. Hint: the backtrace further above shows the operation that failed to compute its gradient. The variable in question was changed in there or anywhere later. Good luck!
I appreciate your help!
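A hedged guess at a workaround, not a confirmed fix: this inplace-modification error during backward is often an interaction between gradient checkpointing and the KV cache. Disabling the cache, and (on recent transformers versions) switching to non-reentrant checkpointing, is worth trying:

model.config.use_cache = False   # the cache must be off when checkpointing
model.gradient_checkpointing_enable(
    gradient_checkpointing_kwargs={"use_reentrant": False}
)

Upgrading transformers may also resolve it if the bug is version-specific.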
How to fine tune LLaMA 3 in Google Colab (Pro)?
I have a JSONL dataset like this:
{"text": "This is raw text in 2048 tokens I want to feed in"},
{"text": "This is next line, tokens are also 2048"}
It would be nice to fine-tune in 4, 8, or 16-bit LoRA and then just merge as before!
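A minimal sketch of loading such a file, assuming the same QLoRA recipe as the other notebooks; the file name is a placeholder and the 2048 max_length mirrors the description above:

from datasets import load_dataset
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B")

# Raw-text JSONL: one {"text": ...} object per line.
dataset = load_dataset("json", data_files="train.jsonl", split="train")
tokenized = dataset.map(
    lambda ex: tokenizer(ex["text"], truncation=True, max_length=2048)
)

From there, the LoRA config and Trainer setup from the Mistral notebook should carry over, and merging the adapter afterwards works the same way via merge_and_unload().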
Google map image to a shape
What are the transformers, accelerate, and deepspeed versions you are using?
Can you share the versions of the following Python packages used when quantizing the model?
transformers
accelerate
deepspeed
To be specific, I am referring to this code block:
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig

base_model_id = "mistralai/Mistral-7B-v0.1"
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(base_model_id, quantization_config=bnb_config, device_map="auto")
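For reference, a quick way to print the installed versions being asked about (deepspeed only if it is installed at all):

import transformers, accelerate
print("transformers:", transformers.__version__)
print("accelerate:", accelerate.__version__)
try:
    import deepspeed
    print("deepspeed:", deepspeed.__version__)
except ImportError:
    print("deepspeed: not installed")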
Mixtral QLoRA notebook issue with device_map and no Accelerator
Somehow I happened upon your YouTube video on performing QLoRA fine-tuning on Mixtral 8x7B this morning, which was serendipitous since I was planning on experimenting with this exact setup this weekend.
I extracted a Python script from your notebook and went to run this setup on my home lab (basically a 4 x P40 node), and I was unable to get Accelerator to function. No matter, as you said it shouldn't be required.
Anyhow, I got stuck for a while because I kept getting CUDA OOM errors when I tried to run baseline inference on the model prior to training. I believe that because I was skipping the Accelerator setup, declaring "cuda" for device_map was causing the model to load only onto my first GPU, which would just barely fit the 4-bit quantized model into 24 GB of VRAM; as soon as I did anything with the model, like run inference, it exceeded the available VRAM and crashed with an OOM error.
I was able to get the model to load across all 4 of my GPUs by declaring device_map as "auto" instead, and now I am able to train QLoRA across all of my GPUs!
You may want to update the instructions in the notebook to say something to the effect of "if skipping Accelerator setup, set device_map to auto when using multiple GPUs"; a sketch of that change follows.
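A sketch of the change described above, assuming the Mixtral base model and a 4-bit config like the notebook's:

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mixtral-8x7B-v0.1",
    quantization_config=bnb_config,
    device_map="auto",  # shard layers across all visible GPUs instead of pinning to cuda:0
)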
Thank you!
Output truncated
Has anyone solved the problem of the output being truncated after fine-tuning Mistral?
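A hedged guess, since the report doesn't say where the truncation happens: if the output stops mid-sentence at generation time, the usual culprit is the max_new_tokens cap rather than the fine-tune itself. Assuming `ft_model`, `tokenizer`, and `model_input` as in the notebook:

output = ft_model.generate(**model_input, max_new_tokens=512)  # raise the cap
print(tokenizer.decode(output[0], skip_special_tokens=True))

If instead the model never stops until it hits the cap, see the EOS-token issue above.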
Colab is not working
Hello. The Colab notebook llama2-finetune-own-data is not working correctly!
In this code block:
from peft import LoraConfig, get_peft_model

config = LoraConfig(
    r=32,
    lora_alpha=64,
    target_modules=[
        "q_proj",
        "k_proj",
        "v_proj",
        "o_proj",
        "gate_proj",
        "up_proj",
        "down_proj",
        "lm_head",
    ],
    bias="none",
    lora_dropout=0.05,  # Conventional
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, config)
print_trainable_parameters(model)

# Apply the accelerator. You can comment this out to remove the accelerator.
model = accelerator.prepare_model(model)
I'm getting the following error:
trainable params: 81108992 || all params: 3581521920 || trainable%: 2.264651559077991
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-22-09c3a8fb39e3> in <cell line: 25>()
23
24 # Apply the accelerator. You can comment this out to remove the accelerator.
---> 25 model = accelerator.prepare_model(model)
/usr/local/lib/python3.10/dist-packages/accelerate/accelerator.py in prepare_model(self, model, device_placement, evaluation_mode)
1325 current_device_index = current_device.index if isinstance(current_device, torch.device) else current_device
1326
-> 1327 if torch.device(current_device_index) != self.device:
1328 # if on the first device (GPU 0) we don't care
1329 if (self.device.index is not None) or (current_device_index != 0):
TypeError: device() received an invalid combination of arguments - got (NoneType), but expected one of:
* (torch.device device)
didn't match because some of the arguments have invalid types: (!NoneType!)
* (str type, int index)
Any ideas?
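A hedged workaround rather than a root-cause fix: the notebook itself marks the accelerator step as optional, so skipping prepare_model on a single-GPU Colab runtime sidesteps the device-index comparison that raises the TypeError:

model = get_peft_model(model, config)
print_trainable_parameters(model)
# model = accelerator.prepare_model(model)  # skip this line on single-GPU Colab

Alternatively, upgrading accelerate might fix the NoneType device index, but that is an assumption.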
Bug when finetuning Mistral 7B
Hello, thank you for the great work!
I'm following your code in mistral-finetune-own-data.ipynb, but when it reaches:
accelerator = Accelerator(fsdp_plugin=fsdp_plugin)
model = accelerator.prepare_model(model)
it throws an error:
ValueError: Must flatten tensors with uniform requires_grad when use_orig_params=False
How can I fix it? Thank you.
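A hedged guess, not a confirmed fix: with a PEFT model only the LoRA parameters require gradients, which FSDP cannot flatten unless use_orig_params is enabled. Recent accelerate versions expose this on the plugin (version-dependent):

from accelerate import Accelerator, FullyShardedDataParallelPlugin

fsdp_plugin = FullyShardedDataParallelPlugin(use_orig_params=True)
accelerator = Accelerator(fsdp_plugin=fsdp_plugin)
model = accelerator.prepare_model(model)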
`mistral-viggo-finetune/checkpoint-1000` Not found (404)
I'm playing with "Fine-tuning Mistral 7B using QLoRA 🤙" on Google Colab.
But I cannot finish executing the notebook because it fails at the following line:
https://github.com/brevdev/notebooks/blob/main/mistral-finetune.ipynb?short_path=bed5736#L757
which is:
from peft import PeftModel
ft_model = PeftModel.from_pretrained(base_model, "mistral-viggo-finetune/checkpoint-1000")
The error says that the repo was not found (404).
Am I missing something, or has that repository been deleted or renamed?
Thank you
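A hedged explanation: PeftModel.from_pretrained first treats the string as a local path and only falls back to the Hub (hence the 404) when that directory doesn't exist, e.g. because training stopped before step 1000. Listing the checkpoints that were actually written and pointing at one of them should work; checkpoint-500 below is a hypothetical example:

import os
from peft import PeftModel

print(os.listdir("mistral-viggo-finetune"))  # e.g. ['checkpoint-500', ...]
ft_model = PeftModel.from_pretrained(base_model, "mistral-viggo-finetune/checkpoint-500")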