
exploring-t5's Introduction

exploring-T5

A repo to explore different NLP tasks which can be solved using T5

exploring-t5's People

Contributors

patil-suraj


exploring-t5's Issues

AttributeError: can't set attribute

Hi,

I am having an issue with setting the hyperparameters, as shown below. I understand that, due to a version change, this is no longer the way to pass the parameters, but I couldn't figure out what to do instead. Any suggestions would be great.

class T5FineTuner(pl.LightningModule):
    def __init__(self, hparams):
        super(T5FineTuner, self).__init__()

        self.hparams = hparams

Cell In[163], line 5, in T5FineTuner.__init__(self, hparams)
      2 def __init__(self, hparams):
      3     super(T5FineTuner, self).__init__()
----> 5     self.hparams = hparams
      7 self.adam_epsilon = 1e-08
      8 self.data_dir = 'aclImdb'

File ~/.local/lib/python3.8/site-packages/torch/nn/modules/module.py:1313, in Module.__setattr__(self, name, value)
   1311     buffers[name] = value
   1312 else:
-> 1313     super().__setattr__(name, value)

AttributeError: can't set attribute
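
In recent PyTorch Lightning versions, hparams is a read-only property, so assigning to it raises exactly this error. A minimal sketch of a workaround (assuming a recent Lightning version; attribute names other than hparams are taken from the notebook) is to store the values with save_hyperparameters() instead:

import pytorch_lightning as pl
from transformers import T5ForConditionalGeneration, T5Tokenizer

class T5FineTuner(pl.LightningModule):
    def __init__(self, hparams):
        super().__init__()
        # hparams became a read-only property; save_hyperparameters() stores the
        # values and still exposes them as self.hparams.<name>.
        self.save_hyperparameters(hparams)
        self.model = T5ForConditionalGeneration.from_pretrained(self.hparams.model_name_or_path)
        self.tokenizer = T5Tokenizer.from_pretrained(self.hparams.tokenizer_name_or_path)

Accesses such as self.hparams.learning_rate elsewhere in the class then keep working unchanged.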

t5-large does not recognize my GPU

I was able to use the t5-base model without any problems, but when I run the same code with the model changed to t5-large, the following message appears after running trainer = pl.Trainer(**train_params):

MisconfigurationException                Traceback (most recent call last)
<ipython-input> in <module>
----> 1 trainer = pl.Trainer(**train_params)

~/anaconda3/envs/Transformers/lib/python3.6/site-packages/pytorch_lightning/trainer/trainer.py in __init__(self, logger, checkpoint_callback, early_stop_callback, callbacks, default_root_dir, gradient_clip_val, process_position, num_nodes, num_processes, gpus, auto_select_gpus, num_tpu_cores, log_gpu_memory, progress_bar_refresh_rate, overfit_pct, track_grad_norm, check_val_every_n_epoch, fast_dev_run, accumulate_grad_batches, max_epochs, min_epochs, max_steps, min_steps, train_percent_check, val_percent_check, test_percent_check, val_check_interval, log_save_interval, row_log_interval, add_row_log_interval, distributed_backend, precision, print_nan_grads, weights_summary, weights_save_path, num_sanity_val_steps, truncated_bptt_steps, resume_from_checkpoint, profiler, benchmark, reload_dataloaders_every_epoch, auto_lr_find, replace_sampler_ddp, progress_bar_callback, amp_level, default_save_path, gradient_clip, nb_gpu_nodes, max_nb_epochs, min_nb_epochs, use_amp, show_progress_bar, nb_sanity_val_steps, terminate_on_nan, **kwargs)
    436     self.gpus = gpus
    437
--> 438     self.data_parallel_device_ids = parse_gpu_ids(self.gpus)
    439     self.root_gpu = determine_root_gpu_device(self.data_parallel_device_ids)
    440     self.root_device = torch.device("cpu")

~/anaconda3/envs/Transformers/lib/python3.6/site-packages/pytorch_lightning/trainer/distrib_parts.py in parse_gpu_ids(gpus)
    710     gpus = normalize_parse_gpu_string_input(gpus)
    711     gpus = normalize_parse_gpu_input_to_list(gpus)
--> 712     gpus = sanitize_gpu_ids(gpus)
    713
    714     if not gpus:

~/anaconda3/envs/Transformers/lib/python3.6/site-packages/pytorch_lightning/trainer/distrib_parts.py in sanitize_gpu_ids(gpus)
    676                 You requested GPUs: {gpus}
    677                 But your machine only has: {all_available_gpus}
--> 678             """)
    679     return gpus
    680

MisconfigurationException:
You requested GPUs: [0]
But your machine only has: []

If I run this code:
args = argparse.Namespace(**args_dict)
print(args_dict)

The result is:
{'output_dir': '/home/mydir', 'model_name_or_path': 't5-large', 'tokenizer_name_or_path': 't5-large', 'max_seq_length': 90, 'learning_rate': 0.0003, 'weight_decay': 0.0, 'adam_epsilon': 1e-08, 'warmup_steps': 0, 'train_batch_size': 8, 'eval_batch_size': 8, 'num_train_epochs': 2, 'gradient_accumulation_steps': 8, 'n_gpu': 1, 'early_stop_callback': False, 'fp_16': False, 'opt_level': 'O1', 'max_grad_norm': 1.0, 'seed': 42}

I don't understand why my GPU isn't being found now. I have an RTX 2080 Ti.
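
Lightning is reporting that PyTorch cannot see any CUDA device at all, which is independent of the model size; switching to t5-large only changes memory use. A quick check (a sketch, not specific to this repo) before building the Trainer:

import torch

# If this prints False / 0, PyTorch cannot see the RTX 2080 Ti (driver or CUDA-build
# mismatch, or the wrong environment), and pl.Trainer(gpus=1) will raise the
# MisconfigurationException no matter which T5 size is used.
print(torch.cuda.is_available())
print(torch.cuda.device_count())

# Fall back to CPU only when no GPU is visible, assuming train_params is built
# from args.n_gpu as in the notebook.
args_dict['n_gpu'] = 1 if torch.cuda.is_available() else 0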

load checkpoints and general fine tuning advice

I fine-tuned t5-large for paraphrase generation for 2 epochs and the generated paraphrases look good. When I trained for 11 epochs, the model seems to overfit (the generated paraphrases are almost identical to the original sentence).

  1. I want to check the performance of the saved checkpoints, but I don't know how to do it (see the sketch after this list).
     I tried

     PATH = './t5_paraphrase/checkpointepoch=10.ckpt'
     model = T5ForConditionalGeneration.from_pretrained(PATH)

     which gives the error:
     UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 0: invalid start byte

     I also tried (following https://pytorch-lightning.readthedocs.io/en/latest/weights_loading.html):

     model = T5ForConditionalGeneration.load_from_checkpoint(PATH)

     which raises:
     AttributeError: type object 'T5ForConditionalGeneration' has no attribute 'load_from_checkpoint'.

  2. Do you have any recommendations for fine-tuning T5? I know this question is very broad. I have explored all the data sets I have. For hyperparameters, I read the PyTorch Lightning docs and found that auto_lr_find, auto_scale_batch_size, and fast_dev_run could be fun to try. However, because of how t_total is defined in train_dataloader, these raise errors, so maybe there are no more tricks on this side.

  3. For paraphrase generation using T5 as a text-to-text task, I don't know how to use the negative examples directly. Any recommendations? I plan to first fine-tune T5-large on paraphrase identification with my data set (which has positive and negative examples) and then use this fine-tuned version for further fine-tuning on paraphrase generation. I am still investigating how to do this, so any help will be appreciated.

  4. In your example, you use T5ForConditionalGeneration (https://huggingface.co/transformers/model_doc/t5.html#t5forconditionalgeneration). It is not very clear to me when I should use T5Model rather than T5ForConditionalGeneration. Any resources on this?

Thanks!!
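
On the checkpoint question (item 1 above): the .ckpt file is a Lightning checkpoint, not a Hugging Face model directory, so from_pretrained cannot read it, and load_from_checkpoint is a LightningModule method, not a method of T5ForConditionalGeneration. A sketch of two options, assuming the T5FineTuner class from the notebook stores the T5 model as self.model:

import torch
from transformers import T5ForConditionalGeneration

PATH = './t5_paraphrase/checkpointepoch=10.ckpt'

# Option 1: restore the whole LightningModule and use its inner T5 model
# (depending on how the checkpoint was saved you may also need to pass hparams).
finetuner = T5FineTuner.load_from_checkpoint(PATH)
model = finetuner.model

# Option 2: load only the weights into a fresh T5, stripping the "model." prefix
# that the LightningModule adds to the parameter names.
ckpt = torch.load(PATH, map_location='cpu')
state_dict = {k[len('model.'):]: v for k, v in ckpt['state_dict'].items() if k.startswith('model.')}
model = T5ForConditionalGeneration.from_pretrained('t5-large', state_dict=state_dict)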

Predict all classes

I have applied your code to the emotion recognition task and got a really good result; thank you for this resource. I have 8 emotions, and for the test data I would like to print each emotion with its probability, like ['happy': 0.061, 'surprise': 0.148, 'fear': 0.031]. The prediction only prints a single emotion. I increased the num_return_sequences parameter to 8, but then I got some outputs that are not in the emotion list. I know my question is a bit strange :)
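
Since T5 generates free text, increasing num_return_sequences only samples more generations and can produce strings outside the label set. A hedged alternative (a sketch assuming the fine-tuned model and tokenizer from this repo, a recent transformers version where the forward call returns an object with .loss, and a hypothetical emotion list) is to score each candidate label by its log-likelihood and normalise:

import torch
import torch.nn.functional as F

emotions = ['happy', 'surprise', 'fear']  # hypothetical subset of the 8 labels

def class_probabilities(model, tokenizer, text, labels):
    input_ids = tokenizer(text, return_tensors='pt').input_ids
    scores = []
    for label in labels:
        label_ids = tokenizer(label, return_tensors='pt').input_ids
        with torch.no_grad():
            # loss is the mean negative log-likelihood of the label tokens
            loss = model(input_ids=input_ids, labels=label_ids).loss
        scores.append(-loss)
    # softmax over the per-label scores gives an approximate class distribution
    return dict(zip(labels, F.softmax(torch.stack(scores), dim=0).tolist()))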

T5FineTuner issue "in training_epoch_end avg_train_loss = torch.stack([x["loss"] for x in outputs]).mean() "

Hi, Suraj,
I am trying to use your T5FineTuner class to study fine-tuning.
Unfortunately, when I run the program in my environment, I get this error:

in training_epoch_end
avg_train_loss = torch.stack([x["loss"] for x in outputs]).mean()
RuntimeError: stack expects a non-empty TensorList

I tried to track down the cause and found that training_step is never called.
I think it may be related to the ImdbDataset used for the train_dataloader, but I debugged it and it seems all right.
I am just getting started with deep learning, so maybe something obvious is going wrong that I don't see.

Do you have any idea what may cause this?
Thank you, and I am looking forward to any feedback.

Best Regards
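
If training_step is never called, the training dataloader is almost certainly yielding zero batches, so training_epoch_end receives an empty outputs list and torch.stack fails. A quick sanity check (argument names are assumed from the notebook; adjust to your setup):

dataset = ImdbDataset(tokenizer, 'aclImdb', 'train', max_len=512)
print(len(dataset))                    # 0 means the data directory or file pattern is wrong

print(len(model.train_dataloader()))   # number of batches the Trainer will actually see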

t5 training notebook issue

Hello Suraj ,
Thanks for sharing this fine tuning of t5 notebook.
I am trying to use it as-is, but when I run it on Colab it throws the following error:

trainer = pl.Trainer(**train_params)
TypeError: __init__() got an unexpected keyword argument 'early_stop_callback'

I would appreciate your time in resolving this issue.

thanks
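
The early_stop_callback argument was removed from pl.Trainer in newer PyTorch Lightning releases. A sketch of the current style, assuming you still want early stopping on the validation loss, is to drop that key from train_params and pass an EarlyStopping callback instead:

from pytorch_lightning.callbacks import EarlyStopping

train_params = dict(
    gpus=1,
    max_epochs=2,
    callbacks=[EarlyStopping(monitor='val_loss', patience=3)],  # replaces early_stop_callback
)
trainer = pl.Trainer(**train_params)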

difference between decoder_input_ids and lm_labels

Hi, Suraj! Thanks for making this notebook.

I am new to this. I couldn't find a clear definition of lm_labels in the Hugging Face docs. I noticed that in your notebook you put the target ids into lm_labels rather than decoder_input_ids. Could you let me know why? I am trying to use your code to fine-tune on paraphrase generation. Thanks!
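
In short: when the target ids are passed as lm_labels (called labels in current transformers), T5 shifts them one position to the right internally to build decoder_input_ids and also uses them to compute the cross-entropy loss; you only pass decoder_input_ids yourself when you want to control the decoder inputs directly. A small sketch with current argument names:

from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained('t5-small')
model = T5ForConditionalGeneration.from_pretrained('t5-small')

inputs = tokenizer('paraphrase: The weather is nice today.', return_tensors='pt')
targets = tokenizer('Today the weather is pleasant.', return_tensors='pt')

# Passing labels is enough for training: the model right-shifts them internally
# to create decoder_input_ids and returns the LM loss.
outputs = model(input_ids=inputs.input_ids,
                attention_mask=inputs.attention_mask,
                labels=targets.input_ids)
print(outputs.loss)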

fill = pipeline('fill-mask', model='tamil_bert', tokenizer='tamil_bert')


ValueError                                Traceback (most recent call last)
/tmp/ipykernel_3356/3905977959.py in <module>
----> 1 fill = pipeline('fill-mask', model='tamil_bert', tokenizer='tamil_bert')

~/.local/lib/python3.8/site-packages/transformers/pipelines/__init__.py in pipeline(task, model, config, tokenizer, feature_extractor, framework, revision, use_fast, use_auth_token, model_kwargs, **kwargs)
    452     # Will load the correct model if possible
    453     model_classes = {"tf": targeted_task["tf"], "pt": targeted_task["pt"]}
--> 454     framework, model = infer_framework_load_model(
    455         model,
    456         model_classes=model_classes,

~/.local/lib/python3.8/site-packages/transformers/pipelines/base.py in infer_framework_load_model(model, config, model_classes, task, framework, **model_kwargs)
    156
    157     if isinstance(model, str):
--> 158         raise ValueError(f"Could not load model {model} with any of the following classes: {class_tuple}.")
    159
    160     framework = "tf" if model.__class__.__name__.startswith("TF") else "pt"

ValueError: Could not load model tamil_bert with any of the following classes: (<class 'transformers.models.auto.modeling_auto.AutoModelForMaskedLM'>,).
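
This usually means the local path (or Hub id) tamil_bert does not contain weights that AutoModelForMaskedLM can load, e.g. a missing or incompatible config, or TF-only weights. A sketch that loads the pieces explicitly, which typically surfaces a more specific error than the generic pipeline message:

from transformers import AutoModelForMaskedLM, AutoTokenizer, pipeline

# Loading step by step shows the underlying problem (missing config.json,
# TF checkpoint, wrong architecture, ...) instead of the generic pipeline error.
tokenizer = AutoTokenizer.from_pretrained('tamil_bert')
model = AutoModelForMaskedLM.from_pretrained('tamil_bert')  # may need from_tf=True for a TF checkpoint

fill = pipeline('fill-mask', model=model, tokenizer=tokenizer)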

Fine-tune Any Models?

Hello Suraj, I have really been enjoying reading your code! I am wondering, for a beginner like me, is there any kind of tutorial I can follow to learn how to fine-tune almost any published model for a given task? For example, fine-tuning T5 on SQuAD to answer open-book questions, i.e. using a context and a question to predict an answer. I think T5 has that task implemented, but I am having a very difficult time doing the same with non-Transformer models.

Thank you!

RuntimeError: Input, output and indices must be on the current device

Hi, I am using your code to fine-tune my language model, and I saved the fine-tuned model. After that, while trying to evaluate, the following error occurred.

outs = model.model.generate(input_ids=batch['source_ids'].cuda(),
                            attention_mask=batch['source_mask'].cuda(),
                            max_length=2)  # <-- error occurs here

dec = [tokenizer.decode(ids) for ids in outs]

texts = [tokenizer.decode(ids) for ids in batch['source_ids']]
targets = [tokenizer.decode(ids) for ids in batch['target_ids']]

# ----- ERROR

6 frames
/usr/local/lib/python3.7/dist-packages/torch/nn/functional.py in embedding(input, weight, padding_idx, max_norm, norm_type, scale_grad_by_freq, sparse)
   1914     # remove once script supports set_grad_enabled
   1915     _no_grad_embedding_renorm_(weight, input, max_norm, norm_type)
-> 1916     return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
   1917
   1918

RuntimeError: Input, output and indices must be on the current device
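
The error means the inputs are on the GPU while the reloaded model is still on the CPU (or vice versa). A sketch of a fix, assuming the T5FineTuner wrapper from the notebook:

import torch

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = model.to(device)  # move the reloaded model to the same device as the inputs

outs = model.model.generate(input_ids=batch['source_ids'].to(device),
                            attention_mask=batch['source_mask'].to(device),
                            max_length=2)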

TypeError: cannot unpack non-iterable NoneType object

Hi,
I am running the IMDB fine-tuning and I am getting this error. Thanks for your help.


 File "main.py", line 70, in train
    trainer.fit(model)
  File "/opt/conda/envs/test/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 440, in fit
    results = self.accelerator_backend.train()
  File "/opt/conda/envs/test/lib/python3.7/site-packages/pytorch_lightning/accelerators/gpu_accelerator.py", line 54, in train
    results = self.train_or_test()
  File "/opt/conda/envs/test/lib/python3.7/site-packages/pytorch_lightning/accelerators/accelerator.py", line 68, in train_or_test
    results = self.trainer.train()
  File "/opt/conda/envs/test/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 485, in train
    self.train_loop.run_training_epoch()
  File "/opt/conda/envs/test/lib/python3.7/site-packages/pytorch_lightning/trainer/training_loop.py", line 544, in run_training_epoch
    batch_output = self.run_training_batch(batch, batch_idx, dataloader_idx)
  File "/opt/conda/envs/test/lib/python3.7/site-packages/pytorch_lightning/trainer/training_loop.py", line 713, in run_training_batch
    self.optimizer_step(optimizer, opt_idx, batch_idx, train_step_and_backward_closure)
  File "/opt/conda/envs/test/lib/python3.7/site-packages/pytorch_lightning/trainer/training_loop.py", line 453, in optimizer_step
    optimizer, batch_idx, opt_idx, train_step_and_backward_closure
  File "/opt/conda/envs/test/lib/python3.7/site-packages/pytorch_lightning/accelerators/accelerator.py", line 122, in optimizer_step
    using_lbfgs=is_lbfgs
TypeError: optimizer_step() got an unexpected keyword argument 'using_native_amp'
Exception ignored in: <function tqdm.__del__ at 0x7fc3337da9e0>
Traceback (most recent call last):
  File "/opt/conda/envs/test/lib/python3.7/site-packages/tqdm/std.py", line 1128, in __del__
  File "/opt/conda/envs/test/lib/python3.7/site-packages/tqdm/std.py", line 1341, in close
  File "/opt/conda/envs/test/lib/python3.7/site-packages/tqdm/std.py", line 1520, in display
  File "/opt/conda/envs/test/lib/python3.7/site-packages/tqdm/std.py", line 1131, in __repr__
  File "/opt/conda/envs/test/lib/python3.7/site-packages/tqdm/std.py", line 1481, in format_dict
TypeError: cannot unpack non-iterable NoneType object
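
The traceback points to a version mismatch: the installed PyTorch Lightning calls optimizer_step with newer keyword arguments (using_native_amp, using_lbfgs) that the notebook's override does not accept. Two hedged options are to pin pytorch-lightning to the version the notebook was written against, or to make the override tolerant of extra keywords, roughly as follows (the exact argument set depends on your installed Lightning version and is an assumption, not taken from the repo):

class T5FineTuner(pl.LightningModule):
    ...
    def optimizer_step(self, epoch, batch_idx, optimizer, optimizer_idx,
                       optimizer_closure=None, **kwargs):
        # **kwargs swallows version-specific arguments such as on_tpu,
        # using_native_amp and using_lbfgs.
        if optimizer_closure is not None:
            optimizer.step(closure=optimizer_closure)
        else:
            optimizer.step()
        optimizer.zero_grad()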
