Comments (11)

artemisart commented on May 1, 2024 6

Technically it is possible but BERT was not pretrained to handle multiple SEP tokens between sentences and does not have a third token_type, so I think it won't be easy to make it work. You may also want to use a new token for the second separation.
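To make the constraint concrete, here is a minimal pure-Python sketch of the pair layout BERT was pretrained on: two segments, two [SEP] tokens, and only two token types (0 and 1). `build_bert_pair` is a hypothetical helper written for illustration, not a transformers function, and the tokens are plain words rather than real WordPiece output.

```python
def build_bert_pair(tokens_a, tokens_b):
    """Lay out two already-tokenized segments in BERT's pretraining format."""
    tokens = ["[CLS]"] + tokens_a + ["[SEP]"] + tokens_b + ["[SEP]"]
    # [CLS], segment A and its [SEP] get type 0; segment B and its [SEP] get type 1.
    token_type_ids = [0] * (len(tokens_a) + 2) + [1] * (len(tokens_b) + 1)
    return tokens, token_type_ids


tokens, type_ids = build_bert_pair(["how", "are", "you"], ["fine", "thanks"])
for token, type_id in zip(tokens, type_ids):
    print(f"{token:8s} {type_id}")
```

A third segment would need a type id of 2, which has no pretrained embedding, which is why a third separation does not come for free.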

from transformers.

mikelkl commented on May 1, 2024 4

@thedrowsywinger maybe you should try Transformer-XL.

mikelkl commented on May 1, 2024 1

> Technically it is possible but BERT was not pretrained to handle multiple SEP tokens between sentences and does not have a third token_type, so I think it won't be easy to make it work. You may also want to use a new token for the second separation.

Hi artemisart,

Thanks for your reply.

So, if someone wants to feed multiple sentences, say a whole passage, to BertForSequenceClassification, an alternative is to concatenate them into a single "sentence" and feed that in, right?

artemisart commented on May 1, 2024 1

No, it shouldn't.

artemisart commented on May 1, 2024

If you don't have a separation (like question/answer), then yes, you can just concatenate them (but you are still limited to 512 tokens).
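A rough sketch of that concatenation, assuming the sentences are already tokenized: everything goes into one segment and is clipped so that the total, including [CLS] and the final [SEP], stays within 512. `concat_and_truncate` is a hypothetical helper for illustration, not a transformers function.

```python
MAX_LEN = 512  # BERT's maximum sequence length, special tokens included


def concat_and_truncate(sentences, max_len=MAX_LEN):
    """Join tokenized sentences into one segment and clip it to the limit."""
    flat = [token for sentence in sentences for token in sentence]
    budget = max_len - 2  # reserve room for [CLS] and the final [SEP]
    return ["[CLS]"] + flat[:budget] + ["[SEP]"]


passage = [["the", "sky", "is", "blue"], ["it", "is", "sunny"]]
short_input = concat_and_truncate(passage)
long_input = concat_and_truncate([["tok"] * 400, ["tok"] * 400])
print(len(short_input), len(long_input))
```

Anything past the budget is simply dropped here, so a single segment gets at most 510 content tokens.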

thomwolf commented on May 1, 2024

@mikelkl I would also go with the solution and answer of @artemisart.

alphanlp commented on May 1, 2024

@artemisart Hi, if I have a single-sentence classification task, should the maximum sentence length be limited to half of 512, that is, 256?

artemisart commented on May 1, 2024

No, it will be better if you use the full 512 tokens.

thedrowsywinger commented on May 1, 2024

Wouldn't concatenating the whole passage into a single sentence mean losing the context of each sentence? @artemisart

thedrowsywinger commented on May 1, 2024

What if I want to check a huge corpus, where even concatenating everything into one sentence exceeds the 512-token limit? @artemisart
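One common workaround, which nobody in this thread proposed, is to cut the long input into overlapping windows that each fit the limit and classify every window separately. The sketch below assumes pre-tokenized input; `split_into_windows` and its `overlap` parameter are hypothetical names chosen for illustration.

```python
def split_into_windows(tokens, max_len=512, overlap=128):
    """Cut a long token list into overlapping windows that each fit max_len."""
    budget = max_len - 2  # room for [CLS] and [SEP] in every window
    if not 0 <= overlap < budget:
        raise ValueError("overlap must be non-negative and smaller than the budget")
    windows = []
    start = 0
    while True:
        chunk = tokens[start:start + budget]
        windows.append(["[CLS]"] + chunk + ["[SEP]"])
        if start + budget >= len(tokens):
            break  # this window reached the end of the input
        start += budget - overlap  # step forward, re-reading `overlap` tokens
    return windows


long_tokens = [str(i) for i in range(1200)]
windows = split_into_windows(long_tokens)
print([len(window) for window in windows])
```

How the per-window predictions are combined (averaging, majority vote, taking the maximum) is a separate design choice that depends on the task.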

sid8491 commented on May 1, 2024

> If you don't have a separation (like question/answer) then yes you can just concatenate them (but you are still limited to 512 tokens).

I have 3 inputs, one of which contains a conversation (QUERY, ANSWER):

QUERY: I want to ask a question.
ANSWER: Sure, ask away.
QUERY: How is the weather today?
ANSWER: It is nice and sunny.
QUERY: Okay, nice to know.
ANSWER: Would you like to know anything else?

How can I tell the model to separate the turns of the conversation? The model is a classification model.
I was thinking of adding a new special token between the turns but could not get it to work.
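As a rough sketch of the "new token between turns" idea, the conversation can be flattened into one string with a marker between turns before tokenization. `[TURN]` and `flatten_conversation` are hypothetical names for illustration; the marker is not in BERT's pretrained vocabulary.

```python
TURN_TOKEN = "[TURN]"  # hypothetical marker; not part of BERT's pretrained vocabulary


def flatten_conversation(turns, turn_token=TURN_TOKEN):
    """Join (speaker, text) turns into one string with a marker between turns."""
    parts = [f"{speaker}: {text}" for speaker, text in turns]
    return f" {turn_token} ".join(parts)


conversation = [
    ("QUERY", "How is the weather today?"),
    ("ANSWER", "It is nice and sunny."),
    ("QUERY", "Okay, nice to know."),
]
flat = flatten_conversation(conversation)
print(flat)
```

With transformers, the marker would likely need to be registered first, for example with `tokenizer.add_special_tokens({"additional_special_tokens": ["[TURN]"]})` followed by `model.resize_token_embeddings(len(tokenizer))`, so that it is kept as one token. Its embedding starts untrained and is only learned during fine-tuning.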
