
multi-view-prompting's Introduction

Hi there, this is Zhibin!👋

  • 🎓 M.S. student at THU, with a CS background from BUPT.
  • 🔭 Diving into large language models and reasoning, with a long-term aim of AGI.
  • 🏃 Avid runner, fitness buff, and badminton player.
  • 💬 Open for chat and collaboration – don't hesitate to reach out!

multi-view-prompting's People

Contributors

beeevita, zubingou


multi-view-prompting's Issues

prefix_allowed_tokens_fn

Hello, what does this step, def prefix_allowed_tokens_fn(self, task, data_name, source_ids, batch_id, input_ids):, mean? Is it added to every layer of T5, or only to the embedding part?
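
For context: prefix_allowed_tokens_fn is a hook passed to Hugging Face's generate(), not something inserted into T5's layers or embeddings. At every decoding step, generate() calls it with the tokens decoded so far and only samples from the token ids it returns; the extra arguments in the repo's version (task, data_name, source_ids) are presumably bound before the function is handed to generate(). A minimal sketch with a made-up constraint (restricting the output to tokens of the source sentence), not the repo's actual constraint logic:

import torch
from transformers import T5Tokenizer, T5ForConditionalGeneration

tokenizer = T5Tokenizer.from_pretrained("t5-base")
model = T5ForConditionalGeneration.from_pretrained("t5-base")

# Toy constraint: only allow tokens that appear in the source sentence
# (the tokenized source already includes </s>, so decoding can terminate).
source = "I love the sushi"
allowed = list(set(tokenizer(source)["input_ids"]))

def prefix_allowed_tokens_fn(batch_id, input_ids):
    # Called once per decoding step; input_ids is the prefix decoded so far.
    return allowed

input_ids = tokenizer(source, return_tensors="pt")["input_ids"]
out = model.generate(input_ids, prefix_allowed_tokens_fn=prefix_allowed_tokens_fn)
print(tokenizer.decode(out[0], skip_special_tokens=True))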

No module named 'optim_orders'

Hi, I'm having a bit of a problem trying to reproduce your code: the module optim_orders cannot be found. It would be very helpful if you could provide instructions for getting the missing files. Thanks again for making your code public.

Code Issue

I would like to fine-tune on a custom dataset. Could you please let me know which parameters and files I need to modify? What does the 'const.py' file represent, and do I need to build it from my custom dataset?
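
Not an official answer, but for orientation: the ABSA quad datasets this line of work builds on usually store one example per line, with the sentence and its label list separated by ####. The element order inside each quad below is an assumption; verify it against the files under data/ before building your own:

import ast

# Hypothetical line in the common ABSA-quad layout; check data/ for the
# real element order (aspect/category/sentiment/opinion is a guess here).
line = "It is fast and simple to use .####[['use', 'usability', 'positive', 'simple']]"

sentence, labels = line.split("####")
quads = ast.literal_eval(labels)
print(sentence.strip())
print(quads)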

Model prediction

How can I use a model I have trained myself to run prediction on a single sentence?

Please provide an easier model inference method

I really want to test the performance of the model without having to fine-tune it for a specific task.

I tried to follow your code, something like this:

from transformers import T5Tokenizer
# MyT5ForConditionalGeneration and T5FineTuner are classes from this repo's
# src/ directory; the exact import path depends on where this script lives.
# model_path and config are my placeholders (config is the part I'm unsure about).

tokenizer = T5Tokenizer.from_pretrained(model_path)
tfm_model = MyT5ForConditionalGeneration.from_pretrained(model_path)
model = T5FineTuner(config, tfm_model, tokenizer)

text = "I will be back, I love the sushi badly!"

input_tokenized = tokenizer(text, return_tensors="pt")
summary_ids = model.model.generate(input_tokenized["input_ids"])
output = tokenizer.decode(summary_ids[0], skip_special_tokens=True)
print(output)

# Output: [I will be back [e] love sushi [I love badly sushi

But I'm not 100% sure about the config file and I'm getting weird results.

If you could provide an example, it would be fantastic!
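
Not the authors' answer, but one hedged guess at the weird output: the training command elsewhere in these issues passes --ctrl_token post and --multi_path, so the checkpoint likely expects an element-order control prompt appended to the sentence, and a bare sentence is out of distribution. A sketch with explicit generation settings; the "[O] [A] [C] [S]" suffix is a hypothetical placeholder that must be replaced with whatever prompt format training actually used:

# Hypothetical control prompt; replace with the format used at training time.
prompt = text + " [O] [A] [C] [S]"

input_ids = tokenizer(prompt, return_tensors="pt")["input_ids"]
summary_ids = model.model.generate(
    input_ids,
    max_length=128,  # give the quad template room to decode fully
    num_beams=1,
)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))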

Training on a Chinese dataset

Hello, if I switch to training on a Chinese dataset, how should I adjust the network architecture?
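
Not an official answer, but a hedged suggestion: since the backbone is selected via --model_name_or_path (see the training command in a later issue), swapping t5-base for a multilingual checkpoint such as google/mt5-base is probably the first step, as t5-base's vocabulary does not cover Chinese. Whether the repo's MyT5ForConditionalGeneration wrapper works unchanged with mT5 is an assumption to verify:

from transformers import MT5ForConditionalGeneration, T5Tokenizer

# mT5 is the multilingual T5 variant; its vocabulary covers Chinese.
# The repo's MyT5ForConditionalGeneration would need the same swap mirrored.
tokenizer = T5Tokenizer.from_pretrained("google/mt5-base")
model = MT5ForConditionalGeneration.from_pretrained("google/mt5-base")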

code issue

Hello, when I run bash scripts/run_unified.sh the program stops responding. There is no error message and no GPU resources are being used. What could be going on?
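
A quick sanity check that often localizes this kind of silent hang: confirm PyTorch can see the GPU at all before digging into the training script.

import torch

# If this prints False or 0, the problem is the environment or driver,
# not main.py.
print(torch.cuda.is_available())
print(torch.cuda.device_count())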

code and data

Hello, I have read your paper; the experimental results are amazing, and I look forward to the code release.

Training progress not visible

Training gets stuck at the step below, and I cannot see any training progress:
(mvp) root@autodl-container-a34a11a952-3cb61709:~/autodl-tmp/absa/multi-view-prompting# bash scripts/run_unified.sh
+ export CUDA_VISIBLE_DEVICES=0
+ CUDA_VISIBLE_DEVICES=0
+ cd src
+ for SEED in 5 10 15 20 25
+ K=5
+ INFER_PATH=5
+ CTRL_TOKEN=post
+ TASK=unified
+ OUT_DIR=../outputs/unified/top5_seed5
+ mkdir -p ../outputs/unified/top5_seed5
+ python main.py --data_path ../data/ --dataset seed5 --model_name_or_path t5-base --output_dir ../outputs/unified/top5_seed5 --num_train_epochs 20 --save_top_k 0 --task unified --top_k 5 --ctrl_token post --multi_path --num_path 5 --seed 5 --train_batch_size 16 --gradient_accumulation_steps 1 --learning_rate 1e-4 --lowercase --sort_label --data_ratio 1.0 --check_val_every_n_epoch 10 --agg_strategy vote --eval_batch_size 64 --constrained_decode --multi_task --do_train
Some weights of the model checkpoint at t5-base were not used when initializing MyT5ForConditionalGeneration: ['decoder.block.0.layer.1.EncDecAttention.relative_attention_bias.weight']
- This IS expected if you are initializing MyT5ForConditionalGeneration from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing MyT5ForConditionalGeneration from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of MyT5ForConditionalGeneration were not initialized from the model checkpoint at t5-base and are newly initialized: ['encoder.embed_tokens.weight', 'lm_head.weight', 'decoder.embed_tokens.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
/root/miniconda3/envs/mvp/lib/python3.8/site-packages/pytorch_lightning/trainer/connectors/logger_connector/logger_connector.py:67: UserWarning: Starting from v1.9.0, tensorboardX has been removed as a dependency of the pytorch_lightning package, due to potential conflicts with other packages in the ML ecosystem. For this reason, logger=True will use CSVLogger as the default logger, unless the tensorboard or tensorboardX packages are found. Please pip install lightning[extra] or one of them to enable TensorBoard support by default
  warning_cache.warn(
You are using a CUDA device ('NVIDIA GeForce RTX 3090') that has Tensor Cores. To properly utilize them, you should set torch.set_float32_matmul_precision('medium' | 'high') which will trade-off precision for performance. For more details, read https://pytorch.org/docs/stable/generated/torch.set_float32_matmul_precision.html#torch.set_float32_matmul_precision
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name  | Type                         | Params
--------------------------------------------------
0 | model | MyT5ForConditionalGeneration | 222 M
--------------------------------------------------
222 M     Trainable params
0         Non-trainable params
222 M     Total params
891.614   Total estimated model params size (MB)
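
An aside on the Tensor Cores warning in this log: it is harmless, but it can be acted on with one line of standard PyTorch before the trainer is built (this is not a flag the repo exposes):

import torch

# Responds to the warning above: trade a little float32 matmul precision
# for Tensor Core throughput on Ampere GPUs such as the RTX 3090.
torch.set_float32_matmul_precision("high")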
