notebooks's People

Contributors

anarkoic, athreesh, harper-carroll, ishandhanani, naderkhalil, samlhuillier, thefong, tylerfong

notebooks's Issues

Russian characters in the results of inference before finetune

Hi,

I tried to run the Mistral fine-tuning with QLoRA notebook, and the first inference on eval_prompt produced Russian characters, as follows.

Given a target sentence construct the underlying meaning representation of the input sentence as a single function with attributes and attribute values.
This function should describe the target string accurately and the function must be one of the following ['inform', 'request', 'give_opinion', 'confirm', 'verify_attribute', 'suggest', 'request_explanation', 'recommend', 'request_attribute'].
The attributes must be one of the following: ['name', 'exp_release_date', 'release_year', 'developer', 'esrb', 'rating', 'genres', 'player_perspective', 'has_multiplayer', 'platforms', 'available_on_steam', 'has_linux_release', 'has_mac_release', 'specifier']

### Target sentence:
I remember you saying you found Little Big Adventure to be average. Are you not usually that into single-player games on PlayStation?

### Meaning representation:
д

### Meaning representation:
{
  "function": "inform",
  "attributes": {
    "name": "Little Big Adventure",
    "exp_release_date": "1994-01-01",
    "release_year": 1994,
    "developer": "Adeline Software International",
    "esrb": "E",
    "rating": 3,
    "genres": ["Action", "Adventure"],
    "player_perspective": "Third-person",
    "has_multiplayer": false,
    "platforms": ["PlayStation"],
    "available_on_steam": false,
    "has_linux_release": false,
    "has_mac_release": false,
    "specifier": "average"
  }
}

Is something wrong, or is this a normal prediction at this stage?

Thanks.

Issue w/ PeftModel.from_pretrained

When I run the tutorial here:
https://github.com/brevdev/notebooks/blob/main/mistral-finetune.ipynb
everything works until

ft_model = PeftModel.from_pretrained(base_model, "mistral-viggo-finetune/checkpoint-950")

which gives me:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/paperspace/.local/lib/python3.11/site-packages/peft/peft_model.py", line 332, in from_pretrained
    model.load_adapter(model_id, adapter_name, is_trainable=is_trainable, **kwargs)
  File "/home/paperspace/.local/lib/python3.11/site-packages/peft/peft_model.py", line 632, in load_adapter
    load_result = set_peft_model_state_dict(self, adapters_weights, adapter_name=adapter_name)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/paperspace/.local/lib/python3.11/site-packages/peft/utils/save_and_load.py", line 158, in set_peft_model_state_dict
    load_result = model.load_state_dict(peft_model_state_dict, strict=False)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/paperspace/.anaconda/envs/mft_env/lib/python3.11/site-packages/torch/nn/modules/module.py", line 2027, in load_state_dict
    load(self, state_dict)
  File "/home/paperspace/.anaconda/envs/mft_env/lib/python3.11/site-packages/torch/nn/modules/module.py", line 2015, in load
    load(child, child_state_dict, child_prefix)
  File "/home/paperspace/.anaconda/envs/mft_env/lib/python3.11/site-packages/torch/nn/modules/module.py", line 2015, in load
    load(child, child_state_dict, child_prefix)
  File "/home/paperspace/.anaconda/envs/mft_env/lib/python3.11/site-packages/torch/nn/modules/module.py", line 2015, in load
    load(child, child_state_dict, child_prefix)
  [Previous line repeated 5 more times]
  File "/home/paperspace/.anaconda/envs/mft_env/lib/python3.11/site-packages/torch/nn/modules/module.py", line 2009, in load
    module._load_from_state_dict(
  File "/home/paperspace/.local/lib/python3.11/site-packages/bitsandbytes/nn/modules.py", line 256, in _load_from_state_dict
    self.weight, state_dict = bnb.nn.Params4bit.from_state_dict(
                              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/paperspace/.local/lib/python3.11/site-packages/bitsandbytes/nn/modules.py", line 158, in from_state_dict
    data = state_dict.pop(prefix.rstrip('.'))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
KeyError: 'base_model.model.model.layers.0.self_attn.q_proj.base_layer.weight'

I'm running

transformers==4.34.0
torch==2.0.1
...

(not sure what other package versions are relevant, but happy to share)

Anyone have any thoughts? Thanks!
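
Since it is unclear which package versions matter here, a minimal sketch for collecting them in one go may help when reporting. The package list is an assumption based on the libraries that appear in the traceback (peft, bitsandbytes, torch) plus transformers and accelerate, and `report_versions` is a hypothetical helper name.

```python
from importlib.metadata import PackageNotFoundError, version

# Assumed list: libraries seen in the traceback plus transformers and accelerate.
PACKAGES = ["transformers", "torch", "peft", "bitsandbytes", "accelerate"]


def report_versions(packages=PACKAGES):
    """Return {package: version string}, or 'not installed' when missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = "not installed"
    return found


if __name__ == "__main__":
    for name, ver in report_versions().items():
        print(f"{name}: {ver}")
```

Running this in the same environment as the failing call matters, because the traceback shows packages loaded from two different site-packages locations.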

dataset schema

If I want to add input and output fields to the data in JSON, what should the format be, and are any code changes needed?
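
One possible layout, sketched under assumptions: each record is a JSON object on its own line with `input` and `output` keys, and a formatting function joins the two fields into a single training string. The sample records, the prompt template, and the helper names are illustrative and not taken from the notebook.

```python
import json

# Hypothetical records: one JSON object per line, with "input" and "output" keys.
SAMPLE_JSONL = """\
{"input": "What is QLoRA?", "output": "Fine-tuning a quantized model with LoRA adapters."}
{"input": "What is LoRA?", "output": "Low-rank adaptation of model weights."}
"""


def formatting_func(example):
    """Join one record into a single training string (illustrative template)."""
    return f"### Question: {example['input']}\n### Answer: {example['output']}"


def load_examples(jsonl_text):
    """Parse JSONL text into a list of dicts, skipping blank lines."""
    return [json.loads(line) for line in jsonl_text.splitlines() if line.strip()]


examples = load_examples(SAMPLE_JSONL)
prompts = [formatting_func(ex) for ex in examples]
print(prompts[0])
```

With this shape, the code change would likely be limited to whatever function builds the prompt string before tokenization, since that is the only place the field names appear.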

pip install problem

Hello,

When I run the notebook for fine-tuning Phi-2 on your own data, I get the following error when I pip install the required packages:
ERROR: Could not install packages due to an OSError: [Errno 13] Permission denied: '/home/ubuntu/.pyenv/versions/3.10.14/etc/jupyter/nbconfig'
Consider using the --user option or check the permissions.

I would be grateful if you could show me how to fix this problem!

At inference time, the EOS token never seems to be emitted (mistral-finetune.ipynb)

Re: mistral-finetune.ipynb

First, thanks for the helpful notebooks!

Second, when I run the generation code at the end of this notebook, no matter what I try, I can't seem to get it to emit a </s> token properly, so it just keeps generating text until it reaches the max tokens. Is there any way to fix this?

For example, here's the kind of generations I'm getting:

<s> Given a target sentence construct the underlying meaning representation of the input sentence as a single function with attributes and attribute values.
This function should describe the target string accurately and the function must be one of the following ['inform', 'request', 'give_opinion', 'confirm', 'verify_attribute', 'suggest', 'request_explanation', 'recommend', 'request_attribute'].
The attributes must be one of the following: ['name', 'exp_release_date', 'release_year', 'developer', 'esrb', 'rating', 'genres', 'player_perspective', 'has_multiplayer', 'platforms', 'available_on_steam', 'has_linux_release', 'has_mac_release', 'specifier']

### Target sentence:
Earlier, you stated that you didn't have strong feelings about PlayStation's Little Big Adventure. Is your opinion true for all games which don't have multiplayer?

### Meaning representation:
inform(name[Little Big Adventure], developer[PlayStation], esrb[no rating], genres[action, adventure], player_perspective[third person], has_multiplayer[no])

### Meaning representation:
The target sentence is a single function with attributes ['name', 'developer', 'esrb', 'genres', 'player_perspective', 'has_multiplayer'].

### Meaning representation:
The target sentence is
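
Until the model emits `</s>` reliably, one workaround is to keep only the text between the first and second occurrence of the section header. This is post-processing of the decoded string, not a fix for the missing EOS token, and it assumes the prompt contains the header exactly once. `first_completion` is a hypothetical helper name.

```python
HEADER = "### Meaning representation:"


def first_completion(decoded, marker=HEADER):
    """Return the text after the first `marker`, cut before the next one."""
    parts = decoded.split(marker)
    if len(parts) < 2:
        # Marker not present: nothing to cut, return the text unchanged.
        return decoded.strip()
    return parts[1].strip()


decoded = (
    "<s> ### Target sentence:\nSome sentence.\n\n"
    "### Meaning representation:\n"
    "inform(name[Little Big Adventure], has_multiplayer[no])\n\n"
    "### Meaning representation:\nThe target sentence is"
)
print(first_completion(decoded))
```

On the output shown above, this would keep the `inform(...)` line and discard the repeated sections that follow it.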

Encountered errors finetuning Mistral-7B model

Thanks a lot for sharing the recipes, which are extremely helpful. I was able to run the Phi-2 fine-tuning recipe successfully. However, I had no luck with the Mistral-7B recipe and ran into the following error:

  File "/home/jou2019/workspace/notebooks/mistral-finetune.py", line 221, in <module>
    trainer.train()
  File "/home/jou2019/miniconda3/lib/python3.11/site-packages/transformers/trainer.py", line 1561, in train
    return inner_training_loop(
           ^^^^^^^^^^^^^^^^^^^^
  File "/home/jou2019/miniconda3/lib/python3.11/site-packages/transformers/trainer.py", line 1895, in _inner_training_loop
    tr_loss_step = self.training_step(model, inputs)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jou2019/miniconda3/lib/python3.11/site-packages/transformers/trainer.py", line 2826, in training_step
    self.accelerator.backward(loss)
  File "/home/jou2019/miniconda3/lib/python3.11/site-packages/accelerate/accelerator.py", line 1966, in backward
    loss.backward(**kwargs)
  File "/home/jou2019/miniconda3/lib/python3.11/site-packages/torch/_tensor.py", line 522, in backward
    torch.autograd.backward(
  File "/home/jou2019/miniconda3/lib/python3.11/site-packages/torch/autograd/__init__.py", line 266, in backward
    Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
  File "/home/jou2019/miniconda3/lib/python3.11/site-packages/torch/autograd/function.py", line 289, in apply
    return user_fn(self, *args)
           ^^^^^^^^^^^^^^^^^^^^
  File "/home/jou2019/miniconda3/lib/python3.11/site-packages/torch/utils/checkpoint.py", line 275, in backward
    tensors = ctx.saved_tensors
              ^^^^^^^^^^^^^^^^^
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [1, 1, 512, 512]] is at version 34; expected version 32 instead. Hint: the backtrace further above shows the operation that failed to compute its gradient. The variable in question was changed in there or anywhere later. Good luck!

I appreciate your help!

How to fine tune LLaMA 3 in Google Colab (Pro)?

I have a JSONL dataset like this:

{"text": "This is raw text in 2048 tokens I want to feed in"},
{"text": "This is next line, tokens are also 2048"}

It would be nice to fine-tune in 4, 8, or 16-bit LoRA and then just merge as before!
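
For the data loading step only, a minimal sketch, assuming one JSON object per line with a single `text` key. JSON Lines does not use separators between records, and `json.loads` rejects the trailing comma shown after the first record above, so the sketch strips it. `read_text_records` is a hypothetical helper name.

```python
import json


def read_text_records(lines):
    """Parse lines of the form {"text": ...} and return the text values."""
    texts = []
    for raw in lines:
        # Drop surrounding whitespace and any trailing comma after the object.
        line = raw.strip().rstrip(",")
        if not line:
            continue
        record = json.loads(line)
        texts.append(record["text"])
    return texts


sample = [
    '{"text": "This is raw text in 2048 tokens I want to feed in"},',
    '{"text": "This is next line, tokens are also 2048"}',
]
texts = read_text_records(sample)
print(len(texts))
```

The resulting list of strings can then be tokenized directly, since each record is already raw text with no separate prompt and answer fields.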

What versions of transformers, accelerate and deepspeed are you using?

Hi @harper-carroll

Can you share the versions of the following Python packages that you used when quantizing the model?
transformers
accelerate
deepspeed

To be specific, I am referring to this code block:

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig

base_model_id = "mistralai/Mistral-7B-v0.1"
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16
)

model = AutoModelForCausalLM.from_pretrained(base_model_id, quantization_config=bnb_config, device_map="auto")

Mixtral QLoRA notebook issue with device_map and no Accelerator

Somehow I happened upon your YouTube video on performing QLoRA fine-tuning on Mixtral 8x7B this morning, which was serendipitous since I was planning on experimenting with this exact setup this weekend.

I extracted a Python script from your notebook and went to run this setup on my home lab (basically a 4 x P40 node), and I was unable to get Accelerator to function. No matter, as you said it shouldn't be required.

Anyhow, I got stuck for a while because I kept getting CUDA OOM errors when I tried to run baseline inference on the model, prior to training. I believe that because I was skipping Accelerator setup, declaring cuda for device_map was causing the model to load only on my first GPU, where it would just barely fit at 4-bit quantization into 24 GB of VRAM; as soon as I did something with the model, like running inference, it exceeded the available VRAM and crashed with an OOM error.

I was able to get this to load to all 4 of my GPUs by declaring device_map as auto instead, and now I am able to train QLoRA across all of my GPUs!

You may want to update the instructions in the notebook to state something to the effect that "if skipping Accelerator setup, set device_map to auto if using multiple GPUs".

Thank you!
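
The rule suggested above can be sketched as a small helper. `pick_device_map` is a hypothetical name, and the `"cuda"` fallback simply mirrors the value this issue says was declared in the notebook; in practice the GPU count could come from `torch.cuda.device_count()`.

```python
def pick_device_map(gpu_count: int, using_accelerator: bool) -> str:
    """Pick a device_map string following the rule suggested in this issue."""
    if gpu_count > 1 and not using_accelerator:
        # "auto" lets the loader place layers across every visible GPU.
        return "auto"
    # Single GPU, or placement handled elsewhere: keep the single-device value.
    return "cuda"


# Example: the 4 x P40 node described above, with Accelerator setup skipped.
device_map = pick_device_map(gpu_count=4, using_accelerator=False)
print(device_map)

# The result would then be passed to the loader, for example:
# AutoModelForCausalLM.from_pretrained(model_id,
#                                      quantization_config=bnb_config,
#                                      device_map=device_map)
```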

Output truncated

Has anyone solved the problem of truncated output after fine-tuning Mistral?

Colab is not working

Hello. Colab notebook: llama2-finetune-own-data is not working correctly!

In the code block:

from peft import LoraConfig, get_peft_model

config = LoraConfig(
    r=32,
    lora_alpha=64,
    target_modules=[
        "q_proj",
        "k_proj",
        "v_proj",
        "o_proj",
        "gate_proj",
        "up_proj",
        "down_proj",
        "lm_head",
    ],
    bias="none",
    lora_dropout=0.05,  # Conventional
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, config)
print_trainable_parameters(model)

# Apply the accelerator. You can comment this out to remove the accelerator.
model = accelerator.prepare_model(model)

I'm getting the following error:

trainable params: 81108992 || all params: 3581521920 || trainable%: 2.264651559077991
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-22-09c3a8fb39e3> in <cell line: 25>()
     23 
     24 # Apply the accelerator. You can comment this out to remove the accelerator.
---> 25 model = accelerator.prepare_model(model)

/usr/local/lib/python3.10/dist-packages/accelerate/accelerator.py in prepare_model(self, model, device_placement, evaluation_mode)
   1325             current_device_index = current_device.index if isinstance(current_device, torch.device) else current_device
   1326 
-> 1327             if torch.device(current_device_index) != self.device:
   1328                 # if on the first device (GPU 0) we don't care
   1329                 if (self.device.index is not None) or (current_device_index != 0):

TypeError: device() received an invalid combination of arguments - got (NoneType), but expected one of:
 * (torch.device device)
      didn't match because some of the arguments have invalid types: (!NoneType!)
 * (str type, int index)

Any ideas?

Bug when finetuning Mistral 7B

Hello, thank you for the great work.
I'm following your notebook mistral-finetune-own-data.ipynb, but when the code reaches:
accelerator = Accelerator(fsdp_plugin=fsdp_plugin)
model = accelerator.prepare_model(model)

It throws an error:
ValueError: Must flatten tensors with uniform requires_grad when use_orig_params=False

How can I fix it?
Thank you.

`mistral-viggo-finetune/checkpoint-1000` Not found (404)

I'm playing with Fine-tuning Mistral 7B using QLoRA 🤙 on Google Colab.

But I cannot finish executing the notebook because it's failing at the following line:
https://github.com/brevdev/notebooks/blob/main/mistral-finetune.ipynb?short_path=bed5736#L757
Which is:

from peft import PeftModel

ft_model = PeftModel.from_pretrained(base_model, "mistral-viggo-finetune/checkpoint-1000")

The error says that the repo is not found (404).

Am I missing something, or has that repository been deleted or renamed?

Thank you
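
`PeftModel.from_pretrained` accepts either a local directory or a Hub repository id, so a checkpoint path that does not exist on disk may end up being looked up on the Hub, which could explain a 404. A minimal sketch for checking which checkpoints the training run actually wrote, assuming they sit in `checkpoint-N` subdirectories of the output directory named in this issue; `latest_checkpoint` is a hypothetical helper name.

```python
import re
from pathlib import Path


def latest_checkpoint(output_dir):
    """Return the checkpoint-N subdirectory with the largest N, or None."""
    root = Path(output_dir)
    if not root.is_dir():
        return None
    best_step, best_path = -1, None
    for child in root.iterdir():
        match = re.fullmatch(r"checkpoint-(\d+)", child.name)
        if child.is_dir() and match and int(match.group(1)) > best_step:
            best_step, best_path = int(match.group(1)), child
    return best_path


if __name__ == "__main__":
    # "mistral-viggo-finetune" is the output directory named in the issue.
    print(latest_checkpoint("mistral-viggo-finetune"))
```

If this prints `None` or a step number other than 1000, the path passed to `from_pretrained` would need to point at a checkpoint that exists locally.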
