Hi, I'm having a problem reproducing your code: the module optim_orders cannot be found. It would be very helpful if you could provide instructions for obtaining the missing files. Thanks again for making your code public.
python main.py --data_path ../data/ --dataset seed5 --model_name_or_path t5-base --output_dir ../outputs/unified/top5_seed5 --num_train_epochs 20 --save_top_k 0 --task unified --top_k 5 --ctrl_token post --multi_path --num_path 5 --seed 5 --train_batch_size 16 --gradient_accumulation_steps 1 --learning_rate 1e-4 --lowercase --sort_label --data_ratio 1.0 --check_val_every_n_epoch 10 --agg_strategy vote --eval_batch_size 64 --constrained_decode --multi_task --do_train
Some weights of the model checkpoint at t5-base were not used when initializing MyT5ForConditionalGeneration: ['decoder.block.0.layer.1.EncDecAttention.relative_attention_bias.weight']
This IS expected if you are initializing MyT5ForConditionalGeneration from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
This IS NOT expected if you are initializing MyT5ForConditionalGeneration from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of MyT5ForConditionalGeneration were not initialized from the model checkpoint at t5-base and are newly initialized: ['encoder.embed_tokens.weight', 'lm_head.weight', 'decoder.embed_tokens.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
/root/miniconda3/envs/mvp/lib/python3.8/site-packages/pytorch_lightning/trainer/connectors/logger_connector/logger_connector.py:67: UserWarning: Starting from v1.9.0, tensorboardX has been removed as a dependency of the pytorch_lightning package, due to potential conflicts with other packages in the ML ecosystem. For this reason, logger=True will use CSVLogger as the default logger, unless the tensorboard or tensorboardX packages are found. Please pip install lightning[extra] or one of them to enable TensorBoard support by default
warning_cache.warn(
You are using a CUDA device ('NVIDIA GeForce RTX 3090') that has Tensor Cores. To properly utilize them, you should set torch.set_float32_matmul_precision('medium' | 'high') which will trade-off precision for performance. For more details, read https://pytorch.org/docs/stable/generated/torch.set_float32_matmul_precision.html#torch.set_float32_matmul_precision
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
| Name | Type | Params
0 | model | MyT5ForConditionalGeneration | 222 M
222 M Trainable params
0 Non-trainable params
222 M Total params
891.614 Total estimated model params size (MB)
I would like to fine-tune the model on a custom dataset. Could you please let me know which parameters and files I need to modify? What does the 'const.py' file contain? Do I need to rebuild it for my custom dataset? My current guess is sketched below.
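For context, my guess is that const.py holds the fixed label vocabularies (sentiment words, aspect categories, and so on) that constrained decoding is allowed to emit, which is why I suspect a custom dataset needs its own version. This is only a sketch of what I imagine it looks like; the names SENTIMENT_WORDS, CATEGORY_LIST, and FORCE_TOKENS are my own placeholders, not necessarily what your const.py actually defines:

# const.py -- hypothetical sketch; all names below are my placeholders,
# not necessarily the identifiers used in this repo.

# Closed set of sentiment words the decoder may generate.
SENTIMENT_WORDS = ["great", "ok", "bad"]

# Aspect categories for my custom domain (replacing the original dataset's lists).
CATEGORY_LIST = [
    "food quality",
    "service general",
    "ambience general",
]

# Structure markers that constrained decoding may force at element positions,
# matching the bracketed markers visible in the model output.
FORCE_TOKENS = ["[A]", "[O]", "[S]", "[C]"]

If that's roughly right, I assume I'd swap in my own label sets here and keep the rest of the pipeline unchanged, but please correct me if const.py plays a different role.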
I really want to test the performance of the model without having to fine-tune it for a specific task.
I tried to follow your code with something like this:
from transformers import T5Tokenizer
# MyT5ForConditionalGeneration and T5FineTuner come from this repo's source.

tokenizer = T5Tokenizer.from_pretrained(model_path)
tfm_model = MyT5ForConditionalGeneration.from_pretrained(model_path)
model = T5FineTuner(config, tfm_model, tokenizer)

text = "I will be back, I love the sushi badly!"
input_tokenized = tokenizer(text, return_tensors="pt")
summary_ids = model.model.generate(input_tokenized["input_ids"])
output = tokenizer.decode(summary_ids[0], skip_special_tokens=True)
print(output)
# Output: [I will be back [e] love sushi [I love badly sushi
But I'm not 100% sure how to set up the config object, and I'm getting weird results.
If you could provide an example, it would be fantastic! In the meantime, here's what I'm currently trying, in case my decoding arguments are the problem:
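This uses only standard transformers generate() arguments (max_length, num_beams, attention_mask), but the values are my own guesses, not settings taken from your paper or configs, and it doesn't go through your --constrained_decode path:

# Hypothetical decoding settings -- values are guesses, not the repo's defaults.
summary_ids = model.model.generate(
    input_tokenized["input_ids"],
    attention_mask=input_tokenized["attention_mask"],
    max_length=128,  # the generate() default may truncate the structured output
    num_beams=1,     # plain greedy decoding, without your constrained decoding
)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))

If the expected usage is different (e.g., generation must go through the constrained-decoding code used by main.py), a pointer to the right entry point would already help a lot.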