
Comments (23)

whqwill commented on May 1, 2024

I know, but ... it's the same problem ... my memory is limited, so ...

PS. I am Chinese

waynedane commented on May 1, 2024

Could it be a corpus issue? BERT was trained on Wikipedia. I trained a mini BERT on kp20k; its accuracy on the test set is currently 80%. Do you want to try using mine as the encoder?

thomwolf commented on May 1, 2024

Hi guys,
I would like to keep the issues of this repository focused on the package itself.
I also think it's better to keep the conversation in English so everybody can participate.
Please move this conversation to your repository: https://github.com/memray/seq2seq-keyphrase-pytorch or to email.
Thanks, I am closing this discussion.
Best,

waynedane commented on May 1, 2024

Have you tried a Transformer decoder instead of an RNN decoder?

whqwill commented on May 1, 2024

Not yet, I will try. But I think the RNN decoder shouldn't be that bad.

waynedane commented on May 1, 2024

> Not yet, I will try. But I think the RNN decoder shouldn't be that bad.

Emmm, maybe you should use the mean of the last layer to initialize the decoder, not the last token's representation of the last layer.
I am also very interested in the results of using a Transformer decoder. If you are done, can you tell me? Thank you.
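
A minimal sketch of this suggestion, assuming the pytorch-pretrained-bert API; the example sentence and the GRU decoder are illustrative stand-ins, not code from either repository:

```python
import torch
import torch.nn as nn
from pytorch_pretrained_bert import BertModel, BertTokenizer

# Sketch only: mean-pool BERT's last layer to initialize an RNN decoder,
# instead of taking just the last token's representation.
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
bert = BertModel.from_pretrained('bert-base-uncased')
bert.eval()

text = "keyphrase generation with a bert encoder"
tokens = ['[CLS]'] + tokenizer.tokenize(text) + ['[SEP]']
input_ids = torch.tensor([tokenizer.convert_tokens_to_ids(tokens)])

with torch.no_grad():
    # encoded_layers: list of (batch, seq_len, 768) tensors, one per layer
    encoded_layers, _ = bert(input_ids)
last_layer = encoded_layers[-1]

last_token_state = last_layer[:, -1, :]   # what the original setup used
mean_state = last_layer.mean(dim=1)       # the suggested alternative

# Hypothetical decoder: a single-layer GRU whose initial hidden state is the mean.
decoder = nn.GRU(input_size=768, hidden_size=768, batch_first=True)
h0 = mean_state.unsqueeze(0)              # (num_layers, batch, hidden)
```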

waynedane commented on May 1, 2024

I think the batch size of the RNN with BERT is too small. Please see

https://github.com/memray/seq2seq-keyphrase-pytorch/blob/master/pykp/dataloader.py
lines 377-378

whqwill commented on May 1, 2024

I don't know what you mean by giving me this link. I really set it to 10 because of the memory problem. Actually, when the sentence length is 512, the max batch size is only 5; if it is 6 or bigger, my GPU runs out of memory.
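
Not mentioned in the thread, but one common workaround for this kind of memory ceiling is gradient accumulation: keep micro-batches of 5 and only step the optimizer every few batches. A minimal stand-alone sketch with placeholder model and data:

```python
import torch
import torch.nn as nn

# Placeholder model and data; the real encoder-decoder and dataloader go here.
model = nn.Linear(768, 1)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
accumulation_steps = 8            # 8 micro-batches of 5 ~= effective batch of 40

optimizer.zero_grad()
for step in range(80):            # stand-in for iterating the real dataloader
    micro_batch = torch.randn(5, 768)
    # Scale the loss so the accumulated gradient averages over micro-batches.
    loss = model(micro_batch).pow(2).mean() / accumulation_steps
    loss.backward()
    if (step + 1) % accumulation_steps == 0:
        optimizer.step()
        optimizer.zero_grad()
```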

whqwill commented on May 1, 2024

> > Not yet, I will try. But I think the RNN decoder shouldn't be that bad.
>
> Emmm, maybe you should use the mean of the last layer to initialize the decoder, not the last token's representation of the last layer.
> I am also very interested in the results of using a Transformer decoder. If you are done, can you tell me? Thank you.

You are right. Maybe the mean is better, I will try as well. Thanks.

waynedane commented on May 1, 2024

May I ask a question? Are you Chinese? 23333

waynedane commented on May 1, 2024

Because one example has N targets, and we want to put all the targets in the same batch. 10 is too small, so the targets of one example would probably end up in different batches.
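
A small illustration of this batching constraint, with made-up names and a hypothetical helper: every (source, target) pair of one example stays in the same batch rather than being split across batches.

```python
# Hypothetical sketch: batch examples so all targets of one source stay together.
def make_batches(examples, max_pairs_per_batch=10):
    """examples: list of (src, [tgt_1, ..., tgt_N]) tuples."""
    batches, current, current_size = [], [], 0
    for src, targets in examples:
        pairs = [(src, tgt) for tgt in targets]
        # If this example would not fit entirely, start a new batch
        # instead of splitting its targets across two batches.
        if current and current_size + len(pairs) > max_pairs_per_batch:
            batches.append(current)
            current, current_size = [], 0
        current.extend(pairs)
        current_size += len(pairs)
    if current:
        batches.append(current)
    return batches

example_data = [("doc one", ["kp a", "kp b", "kp c"]), ("doc two", ["kp d"] * 8)]
print([len(b) for b in make_batches(example_data)])   # -> [3, 8]
```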

waynedane commented on May 1, 2024

> I know, but ... it's the same problem ... my memory is limited, so ...
>
> PS. I am Chinese

I am as well, hahaha.

waynedane commented on May 1, 2024

The accuracy is for the masked LM and next-sentence tasks, not key phrase generation; I didn't make that clear, sorry. My compute is limited: two P100s, almost a month now, and training still isn't finished. 80% is the current performance.

whqwill commented on May 1, 2024

What do you mean by the mini BERT you mentioned?

whqwill commented on May 1, 2024

I roughly understand what you mean: you essentially re-pretrained a BERT on kp20k. But doing it that way ... does feel like quite a hassle.

waynedane commented on May 1, 2024

> I roughly understand what you mean: you essentially re-pretrained a BERT on kp20k. But doing it that way ... does feel like quite a hassle.

Yes, I used Junseong Kim's code: https://github.com/codertimo/BERT-pytorch . The model is much smaller than Google's BERT-Base Uncased; this one is L-8 H-256 A-8. I'll send you the current training checkpoint and the vocab file.
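
For a sense of scale, a rough sketch of that shape using a pytorch-pretrained-bert style BertConfig; the vocab size and intermediate size are assumptions not stated in the thread, and the actual checkpoint was trained with the BERT-pytorch code linked above:

```python
from pytorch_pretrained_bert import BertConfig, BertModel

# Approximate shape of the mini BERT described above: L-8, H-256, A-8.
# vocab_size and intermediate_size are assumptions, not values from the thread.
config = BertConfig(
    vocab_size_or_config_json_file=30522,
    hidden_size=256,
    num_hidden_layers=8,
    num_attention_heads=8,
    intermediate_size=1024,
)
mini_bert = BertModel(config)
print(sum(p.numel() for p in mini_bert.parameters()))  # far fewer parameters than BERT-Base
```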

whqwill commented on May 1, 2024

But can my version use your checkpoint directly, or do I have to install your version of the code?

whqwill commented on May 1, 2024

You can send it to my email [email protected] , thanks.

waynedane commented on May 1, 2024

> But can my version use your checkpoint directly, or do I have to install your version of the code?

You can create a BERT model from Junseong Kim's code and then load the parameters; you don't necessarily have to install it.
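
A hedged sketch of what that could look like; the import path, constructor arguments, vocab size, and file name are assumptions about the BERT-pytorch code and the shared files:

```python
import torch
from bert_pytorch.model import BERT  # from codertimo/BERT-pytorch; import path assumed

# Rebuild the mini BERT (L-8, H-256, A-8) and load the shared checkpoint.
# vocab_size and the checkpoint file name are placeholders.
vocab_size = 30000
model = BERT(vocab_size, hidden=256, n_layers=8, attn_heads=8)

checkpoint = torch.load("mini_bert.checkpoint", map_location="cpu")
# Depending on how it was saved, the checkpoint may be a state dict or a whole module.
if isinstance(checkpoint, dict):
    model.load_state_dict(checkpoint)
else:
    model = checkpoint
model.eval()
```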

whqwill commented on May 1, 2024

OK then. Send me the checkpoint and I'll give it a try.

InsaneLife commented on May 1, 2024

> The accuracy is for the masked LM and next-sentence tasks, not key phrase generation; I didn't make that clear, sorry. My compute is limited: two P100s, almost a month now, and training still isn't finished. 80% is the current performance.

Hi, could you send me the mini model as well? [email protected], thanks a lot.

Accagain2014 commented on May 1, 2024

Hi @whqwill, I have some doubts about how BERT is used with the RNN.
In the BERT-with-RNN method, I see that you only use the last token's representation (I mean TN's) as the input to the RNN decoder. Why not use the other tokens' representations, like T1 to TN-1? I think the last token's information is too little to represent all of the context.
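
For illustration, a minimal sketch of one way to use all token states T1..TN instead of only TN: simple dot-product attention from the decoder hidden state over every BERT output position. The module name and sizes are made up:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DotProductAttention(nn.Module):
    """Attend over all encoder token states T1..TN instead of using only TN."""
    def forward(self, decoder_state, encoder_states, mask=None):
        # decoder_state: (batch, hidden); encoder_states: (batch, seq_len, hidden)
        scores = torch.bmm(encoder_states, decoder_state.unsqueeze(2)).squeeze(2)
        if mask is not None:
            scores = scores.masked_fill(mask == 0, float('-inf'))
        weights = F.softmax(scores, dim=-1)                       # (batch, seq_len)
        context = torch.bmm(weights.unsqueeze(1), encoder_states).squeeze(1)
        return context, weights

# Toy usage with made-up sizes.
batch, seq_len, hidden = 2, 7, 768
encoder_states = torch.randn(batch, seq_len, hidden)   # e.g. BERT last layer, T1..TN
decoder_state = torch.randn(batch, hidden)              # current RNN decoder hidden state
context, weights = DotProductAttention()(decoder_state, encoder_states)
```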
