patil-suraj / exploring-t5

167.0 3.0 43.0 168 KB

A repo to explore different NLP tasks which can be solved using T5

Jupyter Notebook 100.00%

exploring-t5's Issues

AttributeError: can't set attribute

Hi,

I am having trouble setting hyperparameters, as shown below. I understand that, because of a version change, this is no longer the way to pass the parameters, but I couldn't figure out what to do instead. Any suggestions would be great.

class T5FineTuner(pl.LightningModule):
    def __init__(self, hparams):
        super(T5FineTuner, self).__init__()

        self.hparams = hparams

Cell In[163], line 5, in T5FineTuner.__init__(self, hparams)
2 def __init__(self, hparams):
3 super(T5FineTuner, self).__init__()
----> 5 self.hparams = hparams
7 self.adam_epsilon=1e-08
8 self.data_dir='aclImdb'

File ~/.local/lib/python3.8/site-packages/torch/nn/modules/module.py:1313, in Module.__setattr__(self, name, value)
1311 buffers[name] = value
1312 else:
-> 1313 super().__setattr__(name, value)

AttributeError: can't set attribute
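
One likely cause, offered as an assumption rather than a confirmed diagnosis: in newer PyTorch Lightning releases `hparams` is exposed as a read-only property, so plain assignment fails, and the usual replacement is `self.save_hyperparameters(hparams)`. The sketch below reproduces the mechanism with a stand-in class (`FakeModule` is hypothetical and is not part of Lightning) so that it runs without any dependencies.

```python
from argparse import Namespace


class FakeModule:
    """Hypothetical stand-in that mimics a module with a read-only `hparams`."""

    def __init__(self):
        self._hparams = Namespace()

    @property
    def hparams(self):
        # No setter is defined, so `self.hparams = ...` raises AttributeError.
        return self._hparams

    def save_hyperparameters(self, hparams):
        # Simplified imitation of the Lightning helper of the same name:
        # copy the values into the internal container.
        self._hparams = Namespace(**vars(hparams))


class FineTuner(FakeModule):
    def __init__(self, hparams):
        super().__init__()
        self.save_hyperparameters(hparams)  # instead of `self.hparams = hparams`


def assignment_fails(module, value):
    """Return True if direct assignment to `hparams` is rejected."""
    try:
        module.hparams = value
    except AttributeError:
        return True
    return False


args = Namespace(learning_rate=3e-4, adam_epsilon=1e-8)
model = FineTuner(args)
```

In the real class the only change would be the single line inside `__init__`; the values are then still readable as `self.hparams.learning_rate` and so on.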

t5-large does not recognize my GPU

I was able to use the t5-base model without any problems, but when I tried to run the same code just changing the model to t5-large, the following message appears after running trainer = pl.Trainer(**train_params)

MisconfigurationException Traceback (most recent call last)
in
----> 1 trainer = pl.Trainer(**train_params)

~/anaconda3/envs/Transformers/lib/python3.6/site-packages/pytorch_lightning/trainer/trainer.py in __init__(self, logger, checkpoint_callback, early_stop_callback, callbacks, default_root_dir, gradient_clip_val, process_position, num_nodes, num_processes, gpus, auto_select_gpus, num_tpu_cores, log_gpu_memory, progress_bar_refresh_rate, overfit_pct, track_grad_norm, check_val_every_n_epoch, fast_dev_run, accumulate_grad_batches, max_epochs, min_epochs, max_steps, min_steps, train_percent_check, val_percent_check, test_percent_check, val_check_interval, log_save_interval, row_log_interval, add_row_log_interval, distributed_backend, precision, print_nan_grads, weights_summary, weights_save_path, num_sanity_val_steps, truncated_bptt_steps, resume_from_checkpoint, profiler, benchmark, reload_dataloaders_every_epoch, auto_lr_find, replace_sampler_ddp, progress_bar_callback, amp_level, default_save_path, gradient_clip, nb_gpu_nodes, max_nb_epochs, min_nb_epochs, use_amp, show_progress_bar, nb_sanity_val_steps, terminate_on_nan, **kwargs)
436 self.gpus = gpus
437
--> 438 self.data_parallel_device_ids = parse_gpu_ids(self.gpus)
439 self.root_gpu = determine_root_gpu_device(self.data_parallel_device_ids)
440 self.root_device = torch.device("cpu")

~/anaconda3/envs/Transformers/lib/python3.6/site-packages/pytorch_lightning/trainer/distrib_parts.py in parse_gpu_ids(gpus)
710 gpus = normalize_parse_gpu_string_input(gpus)
711 gpus = normalize_parse_gpu_input_to_list(gpus)
--> 712 gpus = sanitize_gpu_ids(gpus)
713
714 if not gpus:

~/anaconda3/envs/Transformers/lib/python3.6/site-packages/pytorch_lightning/trainer/distrib_parts.py in sanitize_gpu_ids(gpus)
676 You requested GPUs: {gpus}
677 But your machine only has: {all_available_gpus}
--> 678 """)
679 return gpus
680

MisconfigurationException:
You requested GPUs: [0]
But your machine only has: []

If I run this code:
args = argparse.Namespace(**args_dict)
print(args_dict)
The result is:
{'output_dir': '/home/mydir', 'model_name_or_path': 't5-large', 'tokenizer_name_or_path': 't5-large', 'max_seq_length': 90, 'learning_rate': 0.0003, 'weight_decay': 0.0, 'adam_epsilon': 1e-08, 'warmup_steps': 0, 'train_batch_size': 8, 'eval_batch_size': 8, 'num_train_epochs': 2, 'gradient_accumulation_steps': 8, 'n_gpu': 1, 'early_stop_callback': False, 'fp_16': False, 'opt_level': 'O1', 'max_grad_norm': 1.0, 'seed': 42}

I don't understand why my GPU isn't being found now. I have an RTX 2080 Ti.
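
The message says that PyTorch reports no CUDA devices in this environment, which would not normally depend on the model size; a CPU-only PyTorch build in the active conda environment or a hidden device (for example via `CUDA_VISIBLE_DEVICES`) are common causes, though that is only a guess here. A minimal sketch for checking before building the trainer follows. `visible_gpu_count` and `resolve_gpus` are hypothetical helper names, and the import is guarded so the snippet also runs where PyTorch is not installed.

```python
import importlib.util


def visible_gpu_count() -> int:
    """Number of CUDA devices PyTorch can see (0 if PyTorch is missing)."""
    if importlib.util.find_spec("torch") is None:
        return 0
    import torch

    return torch.cuda.device_count() if torch.cuda.is_available() else 0


def resolve_gpus(requested: int, available: int) -> int:
    """Never request more GPUs than the machine actually exposes."""
    return min(requested, available)


available = visible_gpu_count()
gpus = resolve_gpus(requested=1, available=available)
```

If `available` comes back as 0 on a machine with a GPU, the PyTorch installation or the driver setup is the place to look, not the notebook.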

load checkpoints and general fine tuning advice

I fine-tuned t5-large for paraphrase generation for 2 epochs and the generated paraphrases looked good. When I trained for 11 epochs, the model seemed overfitted (the generated paraphrases are nearly identical to the original sentences).

1. I want to check the performance of the saved checkpoints, but I don't know how to do it.
I tried
PATH = './t5_paraphrase/checkpointepoch=10.ckpt'
model = T5ForConditionalGeneration.from_pretrained(PATH)

which gives the error:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 0: invalid start byte

I also tried load_from_checkpoint (https://pytorch-lightning.readthedocs.io/en/latest/weights_loading.html):
model = T5ForConditionalGeneration.load_from_checkpoint(PATH)
which fails with:
AttributeError: type object 'T5ForConditionalGeneration' has no attribute 'load_from_checkpoint'.

2. Do you have any recommendations for fine-tuning T5? I know this question is too broad. I have already used all the datasets I have. For hyperparameters, I read the PyTorch Lightning docs and found that auto_lr_find, auto_scale_batch_size and fast_dev_run might be worth trying. However, because of the way t_total is defined in train_dataloader, these raise errors. So maybe there are no more tricks to try on this side.
3. For paraphrase generation using T5 as a text-to-text task, I don't know how to make direct use of negative examples. Any recommendations? I plan to first fine-tune T5-large on paraphrase identification with my dataset (which has positive and negative examples) and then fine-tune that version further on paraphrase generation. I am still investigating how to do this, so any help will be appreciated.
4. In your example, you use T5ForConditionalGeneration (https://huggingface.co/transformers/model_doc/t5.html#t5forconditionalgeneration). It is not very clear to me when I need to use T5Model rather than T5ForConditionalGeneration. Are there any resources on this?

Thanks!!
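
On the checkpoint question: a `.ckpt` file written by PyTorch Lightning is a pickled dictionary rather than a Hugging Face model directory, which is presumably why `from_pretrained` fails on it, and `load_from_checkpoint` belongs to the LightningModule (the notebook's `T5FineTuner`), not to `T5ForConditionalGeneration`. Assuming the wrapper stores the network as `self.model`, the keys in the checkpoint's `state_dict` carry a `model.` prefix that has to be removed before calling `load_state_dict`. The helper name `strip_prefix` and the toy dictionary are made up for illustration.

```python
def strip_prefix(state_dict, prefix="model."):
    """Return a copy of state_dict with `prefix` removed from matching keys."""
    return {
        (key[len(prefix):] if key.startswith(prefix) else key): value
        for key, value in state_dict.items()
    }


# Toy stand-in for `torch.load(PATH, map_location="cpu")["state_dict"]`.
lightning_state = {
    "model.shared.weight": [0.1, 0.2],
    "model.lm_head.weight": [0.3, 0.4],
}
hf_state = strip_prefix(lightning_state)

# With the real libraries one would then call, for example:
#   model = T5ForConditionalGeneration.from_pretrained("t5-large")
#   model.load_state_dict(hf_state)
```

Loading each saved epoch this way and generating on a held-out set is one way to compare checkpoints.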

Predict all classes

I have applied your code to an emotion recognition task and got really good results. Thank you for this resource. I have 8 emotions, and for the test data I would like to print each emotion with its probability, like ['happy': 0.061, 'surprise': 0.148, 'fear': 0.031]. The prediction prints only one emotion. When I increased the "num_return_sequences" parameter to 8, I got some outputs that are not in the emotions list. I know my question is a bit strange :)
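
One way to get a probability per emotion, sketched here under the assumption that each label has already been scored with the log-likelihood the model assigns to that label's text as the target: normalise the scores over the fixed label set with a softmax. This avoids sampling sequences that fall outside the label list. The scores below are invented for illustration, and `label_probabilities` is a hypothetical helper.

```python
import math


def label_probabilities(log_likelihoods):
    """Softmax over per-label log-likelihoods, restricted to the known labels."""
    top = max(log_likelihoods.values())  # subtract the max for numerical stability
    exps = {label: math.exp(score - top) for label, score in log_likelihoods.items()}
    total = sum(exps.values())
    return {label: value / total for label, value in exps.items()}


# Made-up scores for three of the eight emotions.
scores = {"happy": -2.8, "surprise": -1.9, "fear": -3.5}
probs = label_probabilities(scores)
best = max(probs, key=probs.get)
```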

T5FineTuner issue "in training_epoch_end avg_train_loss = torch.stack([x["loss"] for x in outputs]).mean() "

Hi Suraj,
I am trying to use your T5FineTuner class to learn how fine-tuning works.
Unfortunately, when I ran the program in my environment, I got this error:

in training_epoch_end
avg_train_loss = torch.stack([x["loss"] for x in outputs]).mean()
RuntimeError: stack expects a non-empty TensorList

I tried to track down the cause and found that "training_step" is never called.
I think it may be related to the "ImdbDataSet" used for the train_dataloader, but I debugged it and it seems all right.
I have only just started with deep learning, so I may be missing something obvious.

Do you have any idea what may be causing it?
Thank you; I look forward to any feedback.

Best Regards

t5 training notebook issue

Hello Suraj,
Thanks for sharing this T5 fine-tuning notebook.
I am trying to use it as is, but when I run it on Colab it throws the following error:

trainer = pl.Trainer(**train_params)
TypeError: __init__() got an unexpected keyword argument 'early_stop_callback'

I would appreciate your help in resolving this issue.

thanks
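
`early_stop_callback` was removed from the `Trainer` constructor in later PyTorch Lightning releases; early stopping is passed through `callbacks=[EarlyStopping(...)]` instead. The sketch below separates out any keys that a constructor does not accept, so they can be dropped or migrated by hand. `filter_kwargs` and `fake_trainer` are hypothetical names, and the stand-in function replaces `pl.Trainer` (with a heavily shortened, assumed signature) so that the example runs on its own.

```python
import inspect


def filter_kwargs(func, kwargs):
    """Split kwargs into those `func` accepts and those it would reject."""
    params = inspect.signature(func).parameters
    accepts_any = any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())
    accepted = {k: v for k, v in kwargs.items() if accepts_any or k in params}
    rejected = sorted(set(kwargs) - set(accepted))
    return accepted, rejected


def fake_trainer(max_epochs=1, precision=32, callbacks=None):
    """Stand-in for a newer Trainer signature (assumption, not the real one)."""
    return {"max_epochs": max_epochs, "precision": precision, "callbacks": callbacks}


train_params = {"max_epochs": 2, "precision": 32, "early_stop_callback": False}
accepted, rejected = filter_kwargs(fake_trainer, train_params)
trainer = fake_trainer(**accepted)
```

Printing `rejected` against the real `pl.Trainer` shows which entries of the notebook's `train_params` need to be renamed or removed for the installed version.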

difference between decoder_input_ids and lm_labels

Hi, Suraj! Thanks for making this notebook.

I am new to this. I couldn't find a clear definition of 'lm_labels' in the Hugging Face documentation. I noticed that in your notebook you put the target ids into 'lm_labels' rather than 'decoder_input_ids'. Could you let me know why? I am trying to use your code to fine-tune on paraphrase generation. Thanks!
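
In short, and as far as the Transformers T5 implementation goes: `lm_labels` (renamed `labels` in later releases) holds the target token ids used to compute the loss, while `decoder_input_ids` is what the decoder reads. When only labels are given, the model derives the decoder inputs by shifting the labels one position to the right and prepending the decoder start token, which for T5 is the pad token (id 0). The pure-Python sketch below imitates that step on a list of ids; the example ids are made up.

```python
def shift_right(labels, decoder_start_token_id=0, pad_token_id=0, ignore_index=-100):
    """Build decoder inputs from labels: prepend start token, drop the last id."""
    shifted = [decoder_start_token_id] + list(labels[:-1])
    # Positions masked out of the loss (-100) are fed to the decoder as padding.
    return [pad_token_id if token == ignore_index else token for token in shifted]


labels = [1465, 1, -100, -100]  # target ids, with padding masked as -100
decoder_input_ids = shift_right(labels)
```

So passing the targets as labels is enough; supplying `decoder_input_ids` by hand is only needed when the shifted inputs should differ from what this rule produces.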

Adding ByT5 notebook

Hi! I used your notebook as a starting point for fine-tuning a T5-based model (ByT5) with the latest versions of PyTorch Lightning, Transformers, etc. I also use the Datasets library instead of downloading the data from Stanford, so it's a little more adaptable. Feel free to update it, or let me know if this can be added as a new example notebook.

https://colab.research.google.com/drive/1syXmhEQ5s7C59zU8RtHVru0wAvMXTSQ8

fill = pipeline('fill-mask', model='tamil_bert', tokenizer='tamil_bert')

ValueError Traceback (most recent call last)
/tmp/ipykernel_3356/3905977959.py in
----> 1 fill = pipeline('fill-mask', model='tamil_bert', tokenizer='tamil_bert')

~/.local/lib/python3.8/site-packages/transformers/pipelines/__init__.py in pipeline(task, model, config, tokenizer, feature_extractor, framework, revision, use_fast, use_auth_token, model_kwargs, **kwargs)
452 # Will load the correct model if possible
453 model_classes = {"tf": targeted_task["tf"], "pt": targeted_task["pt"]}
--> 454 framework, model = infer_framework_load_model(
455 model,
456 model_classes=model_classes,

~/.local/lib/python3.8/site-packages/transformers/pipelines/base.py in infer_framework_load_model(model, config, model_classes, task, framework, **model_kwargs)
156
157 if isinstance(model, str):
--> 158 raise ValueError(f"Could not load model {model} with any of the following classes: {class_tuple}.")
159
160 framework = "tf" if model.__class__.__name__.startswith("TF") else "pt"

ValueError: Could not load model tamil_bert with any of the following classes: (<class 'transformers.models.auto.modeling_auto.AutoModelForMaskedLM'>,).

TypeError: function() argument 1 must be code, not str

I was running all the steps of the T5_on_TPU.ipynb notebook in Colab and got the error in the title while executing the "Write training script" step.

AttributeError: 'Trainer' object has no attribute 'proc_rank'

Hi, when I try to run the code I get this error:
AttributeError: 'Trainer' object has no attribute 'proc_rank'
in is_logger:
    return self.trainer.proc_rank <= 0

Thanks for your help.
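
`proc_rank` appears to have been renamed `global_rank` in later PyTorch Lightning releases (stated from memory, so please check the changelog for your version). A small compatibility shim for the notebook's `is_logger` could look like the sketch below; `process_rank` and the `SimpleNamespace` stand-in trainers are hypothetical.

```python
from types import SimpleNamespace


def process_rank(trainer):
    """Return the trainer's rank under either the new or the old attribute name."""
    for name in ("global_rank", "proc_rank"):
        if hasattr(trainer, name):
            return getattr(trainer, name)
    return 0  # assume a single process when neither attribute exists


def is_logger(trainer):
    """Only the first process should log."""
    return process_rank(trainer) <= 0


new_trainer = SimpleNamespace(global_rank=0)
old_trainer = SimpleNamespace(proc_rank=1)
```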

Fine-tune Any Models?

Hello Suraj, I have really been enjoying reading your code! I am wondering whether, for a beginner like me, there is any tutorial that teaches how to fine-tune almost any published model for a specific task, e.g. fine-tuning T5 on SQuAD to answer open-book questions, where a context and a question are used to predict an answer. I think T5 has that task implemented, but I am having a very difficult time doing the same with non-Transformer models.

Thank you!

RuntimeError: Input, output and indices must be on the current device

Hi, I am using your code to fine-tune my language model, and so far I have saved the fine-tuned model. After that, an error occurred while I was trying to evaluate it.

outs = model.model.generate(input_ids=batch['source_ids'].cuda(),
                            attention_mask=batch['source_mask'].cuda(),
                            max_length=2)  # error occurs here

dec = [tokenizer.decode(ids) for ids in outs]

texts = [tokenizer.decode(ids) for ids in batch['source_ids']]
targets = [tokenizer.decode(ids) for ids in batch['target_ids']]

#-----ERROR

6 frames
/usr/local/lib/python3.7/dist-packages/torch/nn/functional.py in embedding(input, weight, padding_idx, max_norm, norm_type, scale_grad_by_freq, sparse)
1914 # remove once script supports set_grad_enabled
1915 _no_grad_embedding_renorm_(weight, input, max_norm, norm_type)
-> 1916 return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
1917
1918

RuntimeError: Input, output and indices must be on the current device
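
This error usually means that the model's weights and the input tensors live on different devices, for instance a freshly loaded model still on the CPU while the batch is moved with `.cuda()`. A hedged sketch of one fix: move the batch to wherever the model is, for example with `device = next(model.parameters()).device` (or, the other way round, move the model with `model.to("cuda")`). The `FakeTensor` class below is a stand-in so that the example runs without PyTorch, and `move_batch` is a hypothetical helper.

```python
class FakeTensor:
    """Minimal stand-in for a tensor: remembers which device it is on."""

    def __init__(self, device="cpu"):
        self.device = device

    def to(self, device):
        # Like a real tensor, return a new object on the target device.
        return FakeTensor(device)


def move_batch(batch, device):
    """Move every tensor-like value in the batch to `device`; leave others alone."""
    return {
        key: value.to(device) if hasattr(value, "to") else value
        for key, value in batch.items()
    }


batch = {"source_ids": FakeTensor(), "source_mask": FakeTensor(), "text": "a review"}
moved = move_batch(batch, "cuda:0")
```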

I modify a lot to adapt to the new version of pytorch_lightning

The first example's result.
I modified a lot to adapt to the new version of pytorch_lightning (the result below is for t5-small).

TypeError: cannot unpack non-iterable NoneType object

Hi,
I am running fine-tuning on IMDB and I am getting this error. Thanks for your help.


  File "main.py", line 70, in train
    trainer.fit(model)
  File "/opt/conda/envs/test/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 440, in fit
    results = self.accelerator_backend.train()
  File "/opt/conda/envs/test/lib/python3.7/site-packages/pytorch_lightning/accelerators/gpu_accelerator.py", line 54, in train
    results = self.train_or_test()
  File "/opt/conda/envs/test/lib/python3.7/site-packages/pytorch_lightning/accelerators/accelerator.py", line 68, in train_or_test
    results = self.trainer.train()
  File "/opt/conda/envs/test/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 485, in train
    self.train_loop.run_training_epoch()
  File "/opt/conda/envs/test/lib/python3.7/site-packages/pytorch_lightning/trainer/training_loop.py", line 544, in run_training_epoch
    batch_output = self.run_training_batch(batch, batch_idx, dataloader_idx)
  File "/opt/conda/envs/test/lib/python3.7/site-packages/pytorch_lightning/trainer/training_loop.py", line 713, in run_training_batch
    self.optimizer_step(optimizer, opt_idx, batch_idx, train_step_and_backward_closure)
  File "/opt/conda/envs/test/lib/python3.7/site-packages/pytorch_lightning/trainer/training_loop.py", line 453, in optimizer_step
    optimizer, batch_idx, opt_idx, train_step_and_backward_closure
  File "/opt/conda/envs/test/lib/python3.7/site-packages/pytorch_lightning/accelerators/accelerator.py", line 122, in optimizer_step
    using_lbfgs=is_lbfgs
TypeError: optimizer_step() got an unexpected keyword argument 'using_native_amp'
Exception ignored in: <function tqdm.__del__ at 0x7fc3337da9e0>
Traceback (most recent call last):
  File "/opt/conda/envs/test/lib/python3.7/site-packages/tqdm/std.py", line 1128, in __del__
  File "/opt/conda/envs/test/lib/python3.7/site-packages/tqdm/std.py", line 1341, in close
  File "/opt/conda/envs/test/lib/python3.7/site-packages/tqdm/std.py", line 1520, in display
  File "/opt/conda/envs/test/lib/python3.7/site-packages/tqdm/std.py", line 1131, in __repr__
  File "/opt/conda/envs/test/lib/python3.7/site-packages/tqdm/std.py", line 1481, in format_dict
TypeError: cannot unpack non-iterable NoneType object
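
The first `TypeError` in the traceback looks like the root cause: the installed PyTorch Lightning calls `optimizer_step` with keywords (`using_native_amp`, `using_lbfgs`) that an older-style override in the module does not accept, while the tqdm message at the end is probably a side effect of shutdown. The sketch below illustrates the mismatch with plain functions. The old signature shown is an assumption about the notebook's override, and all function names are hypothetical; removing the override entirely and returning the scheduler from `configure_optimizers` is another option.

```python
def old_optimizer_step(epoch, batch_idx, optimizer, optimizer_idx, second_order_closure=None):
    """Older-style override: rejects keyword arguments it does not know."""
    return "stepped"


def tolerant_optimizer_step(epoch, batch_idx, optimizer, optimizer_idx,
                            second_order_closure=None, **extra):
    """Same override, but extra keywords from newer callers are accepted and ignored."""
    return "stepped"


def call_like_newer_lightning(step_fn):
    # `using_native_amp` is taken from the traceback; the other values are placeholders.
    try:
        return step_fn(epoch=0, batch_idx=0, optimizer=None, optimizer_idx=0,
                       using_native_amp=False)
    except TypeError as error:
        return f"TypeError: {error}"


old_result = call_like_newer_lightning(old_optimizer_step)
new_result = call_like_newer_lightning(tolerant_optimizer_step)
```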
