
fudan_mtl_reviews's People

Contributors

frankwork, jaysonalbert


fudan_mtl_reviews's Issues

How is the batch count of 82 determined?

Hello, I have read through this code many times, and today I noticed something. I mainly want to understand the following loop:

for batch in range(82):
    train_fetch = [m_train.tensors, m_train.train_ops, merged_train]

Where does the "82" come from?
Thanks, I hope we can discuss this!
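For what it's worth, a hard-coded constant like this usually comes from the number of training batches per epoch, i.e. `num_examples // batch_size`. The figures below are purely hypothetical, chosen only to illustrate the arithmetic:

```python
# Purely hypothetical figures, just to show where such a constant
# usually comes from: the number of training batches per epoch.
num_examples = 1312   # made-up, not taken from the repo's data
batch_size = 16
batches_per_epoch = num_examples // batch_size
print(batches_per_epoch)  # -> 82
```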

Command-line arguments are not recognized

When I pass arguments in the standard TensorFlow style, e.g. python src/main.py --build_data True --adv True, I always get an "unknown flag" error. What is the correct command-line form? Thanks!

Questions about the code

Hi, I am confused by:

pad_id = -1
word_embed[pad_id] = np.zeros([word_dim])

Why is pad_id set to -1? I think these two lines should be removed.

Note: this piece of code is in util.py, in the trim_embeddings function.
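One observation (my reading, not the author's explanation): in NumPy an index of -1 refers to the last row, so these two lines zero out the final embedding vector, presumably reserving it as the padding vector. A minimal sketch:

```python
import numpy as np

word_dim = 4
# A toy embedding table with 3 words; the values are arbitrary.
word_embed = np.ones([3, word_dim])

pad_id = -1                                # negative index = last row
word_embed[pad_id] = np.zeros([word_dim])  # zeroes the last embedding

print(word_embed[-1])  # -> [0. 0. 0. 0.]
print(word_embed[0])   # -> [1. 1. 1. 1.]
```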

About the train function in main.py

Hello, in the train function:

for batch in range(82):
    for i in range(n_task):

What does the batch loop mean? Are acc_loss and all_acc averages over all samples?
Thanks!

Missing validation sets

Hello. After unpacking the archive I found that every class in the multi-task dataset is missing its validation set, which the original paper does use. Could you please add them? Many thanks!

Some questions about domain_loss

Hello. In our code we use gradient reversal to obtain the adversarial loss, but this loss stays flat at around 0.7, so the adversarial part clearly is not working, whereas in your experiments domain_loss drops from 0.6 to 0.2. What could be causing this? Are there any tricks for adversarial training? Thanks!
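One possible (speculative) reading of the flat value: a discriminator choosing among K classes that has learned nothing predicts the uniform distribution, giving a cross-entropy of -ln(1/K). For a binary discriminator that is about 0.693, close to the observed 0.7, which would mean the discriminator is stuck at chance:

```python
import math

# Cross-entropy of a discriminator that predicts the uniform
# distribution over K classes (i.e. it has learned nothing).
def chance_level_ce(K):
    return -math.log(1.0 / K)

print(round(chance_level_ce(2), 3))   # -> 0.693, close to the observed 0.7
```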

About data missing

Good afternoon! First, thanks for implementing the paper; it has helped me learn more about Adversarial Multi-task Learning. While studying the code, I found that two files, embed300.google.npy and google_words.lst, are missing from the data/pretrained directory. Without them I cannot run the program or reproduce the results. If it is convenient for you, I sincerely hope you can share these two files here so that more students can learn from the project. Thanks, and sorry to bother you on a weekday!

Performance issues in fudan_mtl_reviews/blob/master/src/inputs/util.py(P2)

Hello, I found a performance issue in the definition of read_tfrecord in
fudan_mtl_reviews/blob/master/src/inputs/util.py:
dataset = dataset.map(parse_func) is called without num_parallel_calls.
I think adding it would improve the efficiency of your program.

Here is the TensorFlow documentation that supports this.

Looking forward to your reply. BTW, I would be glad to create a PR to fix it if you are too busy.
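For reference, the suggested change would look roughly like this — a sketch against the tf.data API, not a tested patch to the repo (in TF 1.x the constant is tf.data.experimental.AUTOTUNE):

```python
import tensorflow as tf

def parse_func(x):
    # Stand-in for the repo's real tfrecord parser.
    return x * 2

dataset = tf.data.Dataset.range(8)
# num_parallel_calls lets tf.data run parse_func on several elements
# concurrently; AUTOTUNE picks the degree of parallelism automatically.
dataset = dataset.map(parse_func, num_parallel_calls=tf.data.AUTOTUNE)
```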

UnrecognizedFlagError!!

Traceback (most recent call last):
File "src/main.py", line 7, in
from inputs import util
File "/home/young/Downloads/fudan_mtl_reviews-master/src/inputs/util.py", line 50, in
def write_vocab(vocab, vocab_file=FLAGS.vocab_file):
File "/home/young/Downloads/ENTER/envs/deeplearning/lib/python3.5/site-packages/tensorflow/python/platform/flags.py", line 84, in __getattr__
wrapped(_sys.argv)
File "/home/young/Downloads/ENTER/envs/deeplearning/lib/python3.5/site-packages/absl/flags/_flagvalues.py", line 630, in __call__
name, value, suggestions=suggestions)
absl.flags._exceptions.UnrecognizedFlagError: Unknown command line flag 'build_data'

any idea how to fix it?
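My interpretation of the traceback: FLAGS.vocab_file is read as a default argument value, i.e. at import time, which forces absl to parse sys.argv before all flags (such as build_data) have been defined. The underlying Python mechanism, with hypothetical names standing in for the real flags machinery, can be shown without absl at all:

```python
# Default argument values are evaluated once, at function definition
# time -- not each time the function is called.  "config" here is a
# hypothetical stand-in for a FLAGS object that is parsed later.
config = {}

def bad(path=config.get('vocab_file', 'MISSING')):
    # The default was computed at definition time, before "parsing".
    return path

config['vocab_file'] = 'vocab.txt'   # "flag parsing" happens later

def good(path=None):
    # Defer the lookup until call time, after parsing has happened.
    return config['vocab_file'] if path is None else path

print(bad())   # -> MISSING
print(good())  # -> vocab.txt
```

Applied to the repo, the analogous fix would be giving write_vocab a None default and reading FLAGS.vocab_file inside the function body.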

How does the test set work in multi-task learning?

After training a multi-task text-classification model, does each test example need to be assigned to its task before testing? Or, for completely unlabeled data, does the multi-task model first decide which task the example belongs to and then classify it?

Loss linear combination

Are the coefficients used while summing up the losses fixed? How did you choose them?
In the original paper, they would be alpha_t, where t is the particular task.

Thank you!
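As I read the paper, the combined objective is a weighted sum over tasks, L = Σ_t α_t · L_t. A sketch with made-up numbers (not the repo's actual values):

```python
# Hypothetical weighted combination of per-task losses,
# L = sum over tasks t of alpha_t * L_t.  All numbers are made up.
task_losses = {'books': 0.42, 'dvd': 0.37, 'kitchen': 0.51}
alphas      = {'books': 1.0,  'dvd': 1.0,  'kitchen': 1.0}  # uniform weights

total = sum(alphas[t] * task_losses[t] for t in task_losses)
print(round(total, 2))  # -> 1.3
```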

About the code

Is there a PyTorch version of the code? I tried porting the code to PyTorch, but the model never performs well.

Experimental results for adv+diff vs. adv

Hello, two questions.

  1. Re-running the three experiments, the error rates (%) I get are:
    mtl 13.77 / your result 13.75
    mtl + adv 12.73 / your result 12.79
    mtl + adv + diff 12.86 / your result 12.70

    As you can see, adding diff is actually slightly worse than not adding it. Based on your code, my three objectives in mtl_model.py are:
    mtl: loss = loss_ce + FLAGS.l2_coef * loss_l2
    mtl + adv: loss = loss_ce + 0.05 * loss_adv + FLAGS.l2_coef * (loss_l2 + loss_adv_l2)
    mtl + adv + diff: loss = loss_ce + 0.05 * loss_adv + FLAGS.l2_coef * (loss_l2 + loss_adv_l2) + loss_diff
    Is setting these three objective functions all that is needed?
    2. Since the experimental data has no validation set, the best acc obtained during training is actually the acc on the test set, so the two should be identical. But for two datasets I processed myself, after training finishes, reloading the checkpoint and running test gives an acc that never matches the one from training, and I cannot find the reason. Could you give me a hint? (The loaded ckpt file is indeed the best-epoch result produced by each training run.)
    Many thanks!

I wonder why, with batch size = 512 (16 in the code), the loss keeps increasing?

Line 872: Epoch 0 all_batch_num 510  loss 2.18 acc 0.61 0.8349 time 225.83
Line 1418: Epoch 1 all_batch_num 510  loss 2.70 acc 0.75 0.8872 time 224.98
Line 1964: Epoch 2 all_batch_num 510  loss 2.43 acc 0.83 0.9069 time 221.33
Line 2510: Epoch 3 all_batch_num 510  loss 2.96 acc 0.87 0.9104 time 221.77
Line 3056: Epoch 4 all_batch_num 510  loss 3.74 acc 0.88 0.8987 time 216.13
Line 3602: Epoch 5 all_batch_num 510  loss 4.38 acc 0.89 0.9288 time 210.82
Line 4148: Epoch 6 all_batch_num 510  loss 5.18 acc 0.90 0.9268 time 218.51
Line 4694: Epoch 7 all_batch_num 510  loss 6.28 acc 0.90 0.9312 time 210.94
Line 5240: Epoch 8 all_batch_num 510  loss 7.62 acc 0.91 0.9326 time 218.76
Line 5786: Epoch 9 all_batch_num 510  loss 8.98 acc 0.91 0.9245 time 221.23
Line 6332: Epoch 10 all_batch_num 510  loss 10.36 acc 0.91 0.9182 time 212.32
Line 6878: Epoch 11 all_batch_num 510  loss 11.74 acc 0.92 0.9292 time 211.89
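One speculative explanation: if the printed loss includes an L2 regularization term, the total can rise across epochs even while accuracy improves, because the weight norms keep growing faster than the cross-entropy falls. A toy illustration with made-up numbers:

```python
# Made-up numbers: cross-entropy shrinks each epoch while the
# L2 penalty grows faster, so the reported total still increases.
ce = [2.0, 1.2, 0.8, 0.5, 0.4]    # falling classification loss
l2 = [0.2, 1.5, 2.2, 3.2, 4.0]    # growing weight-norm penalty

total = [round(c + r, 1) for c, r in zip(ce, l2)]
print(total)  # -> [2.2, 2.7, 3.0, 3.7, 4.4]  (monotonically increasing)
```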

typical patterns captured by task-specific layer

Hi, I want to extract the typical patterns captured by the task-specific layer. How can I do that?
In the paper, Task-Movie's typical patterns are well-directed, pointless, cut, cheap, infantile,
and Task-Baby's are cute, safety, mild, broken, simple.
I want to get typical patterns like those. Thanks!

Is the adv loss supported by the paper?

Hi, I want to know how the adv loss differs from the domain loss.
In other words, the adv loss in the paper "Adversarial Multi-task Learning for Text Classification" is not described clearly, so I want to know what its equation is.
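As far as I can tell, this repo realizes the paper's adversarial term through a gradient reversal layer: the forward pass is the identity, while the backward pass multiplies the incoming gradient by -λ, so the shared encoder is pushed to *maximize* the task-discriminator's loss. A framework-free conceptual sketch (not the repo's TensorFlow implementation):

```python
import numpy as np

class GradientReversal:
    """Identity on forward; scales the gradient by -lam on backward.
    A conceptual sketch, not the repo's actual TensorFlow code."""
    def __init__(self, lam=1.0):
        self.lam = lam

    def forward(self, x):
        return x                        # pass features through unchanged

    def backward(self, grad_output):
        return -self.lam * grad_output  # reverse (and scale) the gradient

grl = GradientReversal(lam=0.5)
x = np.array([1.0, 2.0])
print(grl.forward(x))   # -> [1. 2.]
print(grl.backward(x))  # -> [-0.5 -1. ]
```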

About single-task training

Hello. Your code includes an MTL implementation with adv and diff removed, but according to the code:

# TODO NOTICE: in the non-adversarial case only the private features
# are used, i.e. single-task training; the tasks' data is not trained together
if self.adv:
    feature = tf.concat([conv_out, shared_out], axis=1)
else:
    feature = conv_out

the MTL model actually makes no use of the shared structure, so is it any different from single-task training? Feeding a single dataset through single-task training gives me results that are somewhat worse than the MTL version (initial lr 0.01 in both cases). What might be the reason? Thanks!

Performance issue in /src/inputs/util.py (by P3)

Hello! I've found a performance issue in /src/inputs/util.py: dataset.batch(batch_size) (line 229) should be called before dataset.map(parse_func) (line 224), which could make your program more efficient.

Here is the TensorFlow document to support it.

Besides, you need to check whether the function parse_func called in dataset = dataset.map(parse_func) is affected, so that the changed code still works properly. For example, if parse_func needed data with shape (x, y, z) as its input before the fix, it will receive data with shape (batch_size, x, y, z) after the fix.

Looking forward to your reply. Btw, I am very glad to create a PR to fix it if you are too busy.
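The shape change described above can be illustrated with a NumPy stand-in for the pipeline (the shapes are hypothetical):

```python
import numpy as np

# Before the fix: parse_func sees one element of shape (x, y, z).
element = np.zeros((2, 3, 4))

# After batching first, the same function receives a whole batch
# with a leading batch dimension: (batch_size, x, y, z).
batch_size = 8
batched = np.zeros((batch_size, 2, 3, 4))

def parse_func(t):
    # A vectorized parse_func must tolerate the extra leading axis.
    return t * 2

print(parse_func(element).shape)  # -> (2, 3, 4)
print(parse_func(batched).shape)  # -> (8, 2, 3, 4)
```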

Is the transpose in diff_loss a problem?

correlation_matrix = tf.matmul(
    task_feat, shared_feat, transpose_a=True)

task_feat has shape (bsz, feature_size) and shared_feat also has shape (bsz, feature_size), so the product comes out as (feature, bsz) x (bsz, feature) --> (feature, feature).
But I personally feel that (bsz, bsz) would make more sense?

My mistake: the feature_size of task_feat and shared_feat are not necessarily the same.
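For concreteness, the shape arithmetic with transpose_a=True can be checked with small NumPy arrays (the sizes below are made up):

```python
import numpy as np

bsz, task_dim, shared_dim = 4, 5, 7        # hypothetical sizes
task_feat   = np.random.randn(bsz, task_dim)
shared_feat = np.random.randn(bsz, shared_dim)

# transpose_a=True in tf.matmul corresponds to task_feat.T here:
# (task_dim, bsz) x (bsz, shared_dim) -> (task_dim, shared_dim)
correlation_matrix = task_feat.T @ shared_feat
print(correlation_matrix.shape)  # -> (5, 7)
```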
