notebooks's Issues
How to push the finetuned Mistral model to HF hub?
Thank you for your notebook "Fine-tuning Mistral on your own data".
However, once the model has been fine-tuned on our own data, how do we push the fine-tuned Mistral model to the HF hub?
Thanks!
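A minimal sketch of one way to do this, assuming `base_model`, `tokenizer`, and the checkpoint directory come from the notebook; the Hub repo name is a placeholder:

from huggingface_hub import login
from peft import PeftModel

login()  # authenticate with a token that has write access

# Load the trained adapter onto the quantized base model, then push it.
ft_model = PeftModel.from_pretrained(base_model, "mistral-viggo-finetune/checkpoint-1000")
ft_model.push_to_hub("your-username/mistral-viggo-finetune")   # adapter weights + config
tokenizer.push_to_hub("your-username/mistral-viggo-finetune")

Note this uploads only the LoRA adapter, not a merged full model.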
Russian characters in the results of inference before finetune
Hi,
I tried to run the Mistral fine-tuning with QLoRA notebook, and the first inference on the eval_prompt (before fine-tuning) produced Russian characters, as shown below.
Given a target sentence construct the underlying meaning representation of the input sentence as a single function with attributes and attribute values.
This function should describe the target string accurately and the function must be one of the following ['inform', 'request', 'give_opinion', 'confirm', 'verify_attribute', 'suggest', 'request_explanation', 'recommend', 'request_attribute'].
The attributes must be one of the following: ['name', 'exp_release_date', 'release_year', 'developer', 'esrb', 'rating', 'genres', 'player_perspective', 'has_multiplayer', 'platforms', 'available_on_steam', 'has_linux_release', 'has_mac_release', 'specifier']
### Target sentence:
I remember you saying you found Little Big Adventure to be average. Are you not usually that into single-player games on PlayStation?
### Meaning representation:
д
### Meaning representation:
{
"function": "inform",
"attributes": {
"name": "Little Big Adventure",
"exp_release_date": "1994-01-01",
"release_year": 1994,
"developer": "Adeline Software International",
"esrb": "E",
"rating": 3,
"genres": ["Action", "Adventure"],
"player_perspective": "Third-person",
"has_multiplayer": false,
"platforms": ["PlayStation"],
"available_on_steam": false,
"has_linux_release": false,
"has_mac_release": false,
"specifier": "average"
}
}
Is there something wrong, or is this a normal prediction from the base model?
Thanks.
Issue w/ PeftModel.from_pretrained
When I run the tutorial here:
https://github.com/brevdev/notebooks/blob/main/mistral-finetune.ipynb
everything works until
ft_model = PeftModel.from_pretrained(base_model, "mistral-viggo-finetune/checkpoint-950")
which gives me:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/paperspace/.local/lib/python3.11/site-packages/peft/peft_model.py", line 332, in from_pretrained
model.load_adapter(model_id, adapter_name, is_trainable=is_trainable, **kwargs)
File "/home/paperspace/.local/lib/python3.11/site-packages/peft/peft_model.py", line 632, in load_adapter
load_result = set_peft_model_state_dict(self, adapters_weights, adapter_name=adapter_name)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/paperspace/.local/lib/python3.11/site-packages/peft/utils/save_and_load.py", line 158, in set_peft_model_state_dict
load_result = model.load_state_dict(peft_model_state_dict, strict=False)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/paperspace/.anaconda/envs/mft_env/lib/python3.11/site-packages/torch/nn/modules/module.py", line 2027, in load_state_dict
load(self, state_dict)
File "/home/paperspace/.anaconda/envs/mft_env/lib/python3.11/site-packages/torch/nn/modules/module.py", line 2015, in load
load(child, child_state_dict, child_prefix)
File "/home/paperspace/.anaconda/envs/mft_env/lib/python3.11/site-packages/torch/nn/modules/module.py", line 2015, in load
load(child, child_state_dict, child_prefix)
File "/home/paperspace/.anaconda/envs/mft_env/lib/python3.11/site-packages/torch/nn/modules/module.py", line 2015, in load
load(child, child_state_dict, child_prefix)
[Previous line repeated 5 more times]
File "/home/paperspace/.anaconda/envs/mft_env/lib/python3.11/site-packages/torch/nn/modules/module.py", line 2009, in load
module._load_from_state_dict(
File "/home/paperspace/.local/lib/python3.11/site-packages/bitsandbytes/nn/modules.py", line 256, in _load_from_state_dict
self.weight, state_dict = bnb.nn.Params4bit.from_state_dict(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/paperspace/.local/lib/python3.11/site-packages/bitsandbytes/nn/modules.py", line 158, in from_state_dict
data = state_dict.pop(prefix.rstrip('.'))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
KeyError: 'base_model.model.model.layers.0.self_attn.q_proj.base_layer.weight'
I'm running
transformers==4.34.0
torch==2.0.1
...
(not sure what other package versions are relevant, but happy to share)
Anyone have any thoughts? Thanks!
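A hedged guess at the cause, not a confirmed fix: the missing `base_layer` key often indicates a peft version mismatch between training and loading (newer peft releases wrap LoRA target modules in a `base_layer`), or a base model that was already PEFT-wrapped. Checking versions, then reloading a clean quantized base model before attaching the adapter, is worth trying; `base_model_id` and `bnb_config` below are assumed to be defined as in the notebook:

import peft, transformers, bitsandbytes
print(peft.__version__, transformers.__version__, bitsandbytes.__version__)

from transformers import AutoModelForCausalLM
from peft import PeftModel

# Reload a fresh (non-PEFT-wrapped) base model, then attach the adapter.
base_model = AutoModelForCausalLM.from_pretrained(
    base_model_id, quantization_config=bnb_config, device_map="auto"
)
ft_model = PeftModel.from_pretrained(base_model, "mistral-viggo-finetune/checkpoint-950")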
dataset schema
If I want the JSON data to have separate input and output fields, what should the format be, and are any code changes needed?
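A minimal sketch of one workable schema, assuming a JSONL file with hypothetical `input`/`output` field names; the notebook's formatting function would then need to join the two fields into a single prompt string:

# Each line of the JSONL file (hypothetical field names):
# {"input": "Summarize: ...", "output": "..."}

def formatting_func(example):
    # Combine the two fields into the single text string the trainer tokenizes.
    return f"### Input:\n{example['input']}\n\n### Output:\n{example['output']}"

The "### Input / ### Output" template is only an illustration; any consistent template works, as long as you use the same one at inference time.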
pip install problem
Hello,
When I run the notebook for fine-tuning Phi-2 on your own data, I hit the following error while pip-installing the required packages:
ERROR: Could not install packages due to an OSError: [Errno 13] Permission denied: '/home/ubuntu/.pyenv/versions/3.10.14/etc/jupyter/nbconfig'
Consider using the `--user` option or check the permissions.
I would be grateful if you could explain how to fix this problem!
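A hedged workaround, assuming the failure is purely a filesystem permission issue on the pyenv tree: install into the user site-packages instead. The package list below is an assumption based on the notebook's requirements:

import sys, subprocess

# Run pip for the current interpreter with --user so nothing is written
# to directories the notebook user doesn't own.
subprocess.check_call([
    sys.executable, "-m", "pip", "install", "--user",
    "transformers", "accelerate", "peft", "bitsandbytes", "datasets",
])

Creating a virtual environment you own would work equally well.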
At inference time, the EOS token never seems to be emitted (mistral-finetune.ipynb)
First, thanks for the helpful notebooks!
Second, when I run the generation code at the end of this notebook, no matter what I try, I can't seem to get it to properly emit a </s>
token ... and thus it just keeps generating text until it reaches the max token limit. Is there any way to fix this?
For example, here's the kind of generations I'm getting:
<s> Given a target sentence construct the underlying meaning representation of the input sentence as a single function with attributes and attribute values.
This function should describe the target string accurately and the function must be one of the following ['inform', 'request', 'give_opinion', 'confirm', 'verify_attribute', 'suggest', 'request_explanation', 'recommend', 'request_attribute'].
The attributes must be one of the following: ['name', 'exp_release_date', 'release_year', 'developer', 'esrb', 'rating', 'genres', 'player_perspective', 'has_multiplayer', 'platforms', 'available_on_steam', 'has_linux_release', 'has_mac_release', 'specifier']
### Target sentence:
Earlier, you stated that you didn't have strong feelings about PlayStation's Little Big Adventure. Is your opinion true for all games which don't have multiplayer?
### Meaning representation:
inform(name[Little Big Adventure], developer[PlayStation], esrb[no rating], genres[action, adventure], player_perspective[third person], has_multiplayer[no])
### Meaning representation:
The target sentence is a single function with attributes ['name', 'developer', 'esrb', 'genres', 'player_perspective', 'has_multiplayer'].
### Meaning representation:
The target sentence is
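A hedged sketch of the usual fix: if the tokenized training examples never end with an EOS token, the model never learns to stop. Enabling `add_eos_token` when building the tokenizer (supported by the Llama-family tokenizer Mistral uses), or appending it manually in the tokenize step, typically resolves this; the max_length and padding values below are illustrative assumptions:

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(
    "mistralai/Mistral-7B-v0.1",
    add_eos_token=True,   # append </s> to every tokenized example
)

def tokenize(prompt):
    result = tokenizer(prompt, truncation=True, max_length=512, padding="max_length")
    result["labels"] = result["input_ids"].copy()
    return result

After retraining on EOS-terminated examples, generation should stop on its own.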
Can't fine-tune Phi-2
GPU RAM maxes out as soon as training starts.
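A hedged sketch of common memory-saving knobs, not a confirmed fix for this report; all values below are illustrative assumptions:

from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="phi2-finetune",
    per_device_train_batch_size=1,   # smallest possible batch
    gradient_accumulation_steps=8,   # preserve the effective batch size
    gradient_checkpointing=True,     # recompute activations to save VRAM
    optim="paged_adamw_8bit",        # bitsandbytes paged optimizer
    bf16=True,
)

If memory still spikes at startup, loading the base model in 4-bit via BitsAndBytesConfig (as in the Mistral notebook) is the other big lever.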
Encountered errors finetuning Mistral-7B model
Thanks a lot for sharing the recipes, which are extremely helpful. I was able to run the Phi-2 fine-tuning recipe successfully. However, I had no luck with the Mistral-7B recipe and ran into the following error:
File "/home/jou2019/workspace/notebooks/mistral-finetune.py", line 221, in <module>
trainer.train()
File "/home/jou2019/miniconda3/lib/python3.11/site-packages/transformers/trainer.py", line 1561, in train
return inner_training_loop(
^^^^^^^^^^^^^^^^^^^^
File "/home/jou2019/miniconda3/lib/python3.11/site-packages/transformers/trainer.py", line 1895, in _inner_training_loop
tr_loss_step = self.training_step(model, inputs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/jou2019/miniconda3/lib/python3.11/site-packages/transformers/trainer.py", line 2826, in training_step
self.accelerator.backward(loss)
File "/home/jou2019/miniconda3/lib/python3.11/site-packages/accelerate/accelerator.py", line 1966, in backward
loss.backward(**kwargs)
File "/home/jou2019/miniconda3/lib/python3.11/site-packages/torch/_tensor.py", line 522, in backward
torch.autograd.backward(
File "/home/jou2019/miniconda3/lib/python3.11/site-packages/torch/autograd/__init__.py", line 266, in backward
Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
File "/home/jou2019/miniconda3/lib/python3.11/site-packages/torch/autograd/function.py", line 289, in apply
return user_fn(self, *args)
^^^^^^^^^^^^^^^^^^^^
File "/home/jou2019/miniconda3/lib/python3.11/site-packages/torch/utils/checkpoint.py", line 275, in backward
tensors = ctx.saved_tensors
^^^^^^^^^^^^^^^^^
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [1, 1, 512, 512]] is at version 34; expected version 32 instead. Hint: the backtrace further above shows the operation that failed to compute its gradient. The variable in question was changed in there or anywhere later. Good luck!
I appreciate your help!
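A hedged guess at a workaround, not a confirmed fix: this inplace-modification error during backward is often an interaction between gradient checkpointing and the KV cache. Disabling the cache, and (on recent transformers versions) switching to non-reentrant checkpointing, is worth trying:

model.config.use_cache = False   # the cache must be off when checkpointing
model.gradient_checkpointing_enable(
    gradient_checkpointing_kwargs={"use_reentrant": False}
)

Upgrading transformers may also resolve it if the bug is version-specific.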
How to fine tune LLaMA 3 in Google Colab (Pro)?
I have a JSONL dataset like this:
{"text": "This is raw text in 2048 tokens I want to feed in"},
{"text": "This is next line, tokens are also 2048"}
It would be nice to fine-tune in 4, 8, or 16-bit LoRA and then just merge as before!
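A minimal sketch of loading such a file, assuming the same QLoRA recipe as the other notebooks; the file name is a placeholder and the 2048 max_length mirrors the description above:

from datasets import load_dataset
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B")

# Raw-text JSONL: one {"text": ...} object per line.
dataset = load_dataset("json", data_files="train.jsonl", split="train")
tokenized = dataset.map(
    lambda ex: tokenizer(ex["text"], truncation=True, max_length=2048)
)

From there, the LoRA config and Trainer setup from the Mistral notebook should carry over, and merging the adapter afterwards works the same way via merge_and_unload().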
Google map image to a shape
What are the transformers, accelerate, and deepspeed versions you are using?
Can you share the versions of the following Python packages used when quantizing the model?
transformers
accelerate
deepspeed
To be specific, I am referring to this code block:
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig

base_model_id = "mistralai/Mistral-7B-v0.1"
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(base_model_id, quantization_config=bnb_config, device_map="auto")
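For reference, a quick way to print the installed versions being asked about (deepspeed only if it is installed at all):

import transformers, accelerate
print("transformers:", transformers.__version__)
print("accelerate:", accelerate.__version__)
try:
    import deepspeed
    print("deepspeed:", deepspeed.__version__)
except ImportError:
    print("deepspeed: not installed")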
Mixtral QLoRA notebook issue with device_map and no Accelerator
Somehow I happened upon your YouTube video on performing QLoRA fine-tuning on Mixtral 8x7B this morning, which was serendipitous since I was planning on experimenting with this exact setup this weekend.
I extracted a Python script from your notebook and went to run this setup on my home lab (basically a 4 x P40 node), and I was unable to get Accelerator to function. No matter, as you said it shouldn't be required.
Anyhow, I got stuck for a while because I kept getting CUDA OOM errors when I tried to run baseline inference on the model prior to training. I believe that because I was skipping the Accelerator setup, declaring "cuda" for device_map was causing the model to load only onto my first GPU, which would just barely fit the 4-bit quantized model into 24 GB of VRAM; as soon as I did anything with the model, like run inference, it exceeded the available VRAM and crashed with an OOM error.
I was able to get the model to load across all 4 of my GPUs by declaring device_map as "auto" instead, and now I am able to train QLoRA across all of my GPUs!
You may want to update the instructions in the notebook to say something to the effect of "if skipping Accelerator setup, set device_map to auto when using multiple GPUs"; a sketch of that change follows.
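A sketch of the change described above, assuming the Mixtral base model and a 4-bit config like the notebook's:

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mixtral-8x7B-v0.1",
    quantization_config=bnb_config,
    device_map="auto",  # shard layers across all visible GPUs instead of pinning to cuda:0
)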
Thank you!
Output truncated
Has anyone solved the problem of the output being truncated after fine-tuning Mistral?
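A hedged guess, since the report doesn't say where the truncation happens: if the output stops mid-sentence at generation time, the usual culprit is the max_new_tokens cap rather than the fine-tune itself. Assuming `ft_model`, `tokenizer`, and `model_input` as in the notebook:

output = ft_model.generate(**model_input, max_new_tokens=512)  # raise the cap
print(tokenizer.decode(output[0], skip_special_tokens=True))

If instead the model never stops until it hits the cap, see the EOS-token issue above.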
Colab is not working
Hello. The Colab notebook llama2-finetune-own-data is not working correctly!
In this code block:
from peft import LoraConfig, get_peft_model

config = LoraConfig(
    r=32,
    lora_alpha=64,
    target_modules=[
        "q_proj",
        "k_proj",
        "v_proj",
        "o_proj",
        "gate_proj",
        "up_proj",
        "down_proj",
        "lm_head",
    ],
    bias="none",
    lora_dropout=0.05,  # Conventional
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, config)
print_trainable_parameters(model)

# Apply the accelerator. You can comment this out to remove the accelerator.
model = accelerator.prepare_model(model)
I'm getting the following error:
trainable params: 81108992 || all params: 3581521920 || trainable%: 2.264651559077991
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-22-09c3a8fb39e3> in <cell line: 25>()
23
24 # Apply the accelerator. You can comment this out to remove the accelerator.
---> 25 model = accelerator.prepare_model(model)
/usr/local/lib/python3.10/dist-packages/accelerate/accelerator.py in prepare_model(self, model, device_placement, evaluation_mode)
1325 current_device_index = current_device.index if isinstance(current_device, torch.device) else current_device
1326
-> 1327 if torch.device(current_device_index) != self.device:
1328 # if on the first device (GPU 0) we don't care
1329 if (self.device.index is not None) or (current_device_index != 0):
TypeError: device() received an invalid combination of arguments - got (NoneType), but expected one of:
* (torch.device device)
didn't match because some of the arguments have invalid types: (!NoneType!)
* (str type, int index)
Any ideas?
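A hedged workaround rather than a root-cause fix: the notebook itself marks the accelerator step as optional, so skipping prepare_model on a single-GPU Colab runtime sidesteps the device-index comparison that raises the TypeError:

model = get_peft_model(model, config)
print_trainable_parameters(model)
# model = accelerator.prepare_model(model)  # skip this line on single-GPU Colab

Alternatively, upgrading accelerate might fix the NoneType device index, but that is an assumption.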
Bug when finetuning Mistral 7B
Hello, thank you for the great work!
I'm following your code in mistral-finetune-own-data.ipynb, but when it reaches:
accelerator = Accelerator(fsdp_plugin=fsdp_plugin)
model = accelerator.prepare_model(model)
it throws an error:
ValueError: Must flatten tensors with uniform requires_grad when use_orig_params=False
How can I fix it? Thank you.
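A hedged guess, not a confirmed fix: with a PEFT model only the LoRA parameters require gradients, which FSDP cannot flatten unless use_orig_params is enabled. Recent accelerate versions expose this on the plugin (version-dependent):

from accelerate import Accelerator, FullyShardedDataParallelPlugin

fsdp_plugin = FullyShardedDataParallelPlugin(use_orig_params=True)
accelerator = Accelerator(fsdp_plugin=fsdp_plugin)
model = accelerator.prepare_model(model)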
`mistral-viggo-finetune/checkpoint-1000` Not found (404)
I'm playing with "Fine-tuning Mistral 7B using QLoRA 🤙" on Google Colab.
But I cannot finish executing the notebook because it fails at the following line:
https://github.com/brevdev/notebooks/blob/main/mistral-finetune.ipynb?short_path=bed5736#L757
which is:
from peft import PeftModel
ft_model = PeftModel.from_pretrained(base_model, "mistral-viggo-finetune/checkpoint-1000")
The error says that the repo was not found (404).
Am I missing something, or has that repository been deleted or renamed?
Thank you
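A hedged explanation: PeftModel.from_pretrained first treats the string as a local path and only falls back to the Hub (hence the 404) when that directory doesn't exist, e.g. because training stopped before step 1000. Listing the checkpoints that were actually written and pointing at one of them should work; checkpoint-500 below is a hypothetical example:

import os
from peft import PeftModel

print(os.listdir("mistral-viggo-finetune"))  # e.g. ['checkpoint-500', ...]
ft_model = PeftModel.from_pretrained(base_model, "mistral-viggo-finetune/checkpoint-500")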