Comments (11)
Technically it is possible but BERT was not pretrained to handle multiple SEP tokens between sentences and does not have a third token_type, so I think it won't be easy to make it work. You may also want to use a new token for the second separation.
from transformers.
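To make the token_type point concrete, here is a minimal sketch in plain Python of the two-segment layout BERT was pretrained on. It makes no library calls, and `build_pair_input` is a hypothetical helper used only for illustration. BERT has just two token_type values (0 and 1), so a third segment would have no id of its own.

```python
from typing import List, Tuple


def build_pair_input(tokens_a: List[str], tokens_b: List[str]) -> Tuple[List[str], List[int]]:
    """Lay out two segments as [CLS] A [SEP] B [SEP] with matching token type ids."""
    tokens = ["[CLS]"] + tokens_a + ["[SEP]"] + tokens_b + ["[SEP]"]
    # Type 0 covers [CLS], segment A and the first [SEP];
    # type 1 covers segment B and the final [SEP].
    token_type_ids = [0] * (len(tokens_a) + 2) + [1] * (len(tokens_b) + 1)
    return tokens, token_type_ids


tokens, type_ids = build_pair_input(["how", "are", "you"], ["fine", "thanks"])
print(tokens)
print(type_ids)
```

A third segment would have to reuse type 0 or 1, which is one reason a dedicated new separator token may be the more practical route.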
@thedrowsywinger maybe you should try Transformer-XL.
Technically it is possible but BERT was not pretrained to handle multiple SEP tokens between sentences and does not have a third token_type, so I think it won't be easy to make it work. You may also want to use a new token for the second separation.
Hi artemisart,
Thanks for your reply.
So, if someone wants to feed multiple sentences, say a whole passage, into BertForSequenceClassification, an alternative is to concatenate them into a single "sentence" and pass that in, right?
No it shouldn't
If you don't have a separation (like question/answer) then yes, you can just concatenate them (but you are still limited to 512 tokens).
@mikelkl I would also go with the solution and answer of @artemisart.
@artemisart Hi, if I have a single-sentence classification task, should the maximum sentence length be limited to half of 512, that is, 256?
No, it will be better if you use the full 512 tokens.
Wouldn't concatenating the whole passage into a single sentence mean losing the context of each sentence? @artemisart
What if I want to run this on a huge corpus where even the concatenated text exceeds the 512-token limit? @artemisart
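Nobody in the thread answers this, so take the following as one common workaround rather than a recommended fix: split the token ids into overlapping windows that each fit the limit, classify every window, and combine the per-window predictions afterwards (for example by averaging). The sketch below is plain Python; `chunk_token_ids`, the `stride` of 128 and the two reserved positions for [CLS] and [SEP] are illustrative assumptions.

```python
from typing import List


def chunk_token_ids(
    token_ids: List[int],
    max_len: int = 512,
    stride: int = 128,
    num_special: int = 2,
) -> List[List[int]]:
    """Split token ids into overlapping windows that fit within max_len."""
    window = max_len - num_special  # leave room for [CLS] and [SEP]
    if window <= 0 or not 0 <= stride < window:
        raise ValueError("stride must be non-negative and smaller than the window")
    if not token_ids:
        return []
    step = window - stride  # consecutive windows share `stride` tokens
    chunks = []
    start = 0
    while True:
        chunks.append(token_ids[start:start + window])
        if start + window >= len(token_ids):
            break
        start += step
    return chunks


chunks = chunk_token_ids(list(range(1000)))
print([len(c) for c in chunks])
```

The overlap keeps some context across window boundaries, at the cost of running the model on more windows.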
If you don't have a separation (like question/answer) then yes, you can just concatenate them (but you are still limited to 512 tokens).
I have 3 inputs, one of which contains a conversation (QUERY, ANSWER):
QUERY: I want to ask a question.
ANSWER: Sure, ask away.
QUERY: How is the weather today?
ANSWER: It is nice and sunny.
QUERY: Okay, nice to know.
ANSWER: Would you like to know anything else?
How can I tell the model to separate the turns of the conversation? The model is a classification model.
I was thinking of adding a new special token between the turns, but could not get it to work.
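A minimal sketch of the new-token idea, under a few assumptions: the marker name `[TURN]` and both helper functions are hypothetical, and `register_turn_token` expects a Hugging Face tokenizer and model to be passed in. It relies only on `add_special_tokens` and `resize_token_embeddings`. The embedding for the new token starts out untrained, so the model has to be fine-tuned before the marker carries any meaning.

```python
from typing import List, Tuple

TURN_TOKEN = "[TURN]"  # hypothetical marker name, not part of BERT's vocabulary


def format_conversation(turns: List[Tuple[str, str]], sep: str = TURN_TOKEN) -> str:
    """Join (speaker, text) turns into one string with a marker between turns."""
    return f" {sep} ".join(f"{speaker}: {text}" for speaker, text in turns)


def register_turn_token(tokenizer, model, token: str = TURN_TOKEN) -> int:
    """Add the marker to the tokenizer and grow the model's embedding matrix."""
    num_added = tokenizer.add_special_tokens({"additional_special_tokens": [token]})
    # Without resizing, the new token id would index past the embedding table.
    model.resize_token_embeddings(len(tokenizer))
    return num_added


turns = [
    ("QUERY", "How is the weather today?"),
    ("ANSWER", "It is nice and sunny."),
]
text = format_conversation(turns)
print(text)
```

Forgetting the `resize_token_embeddings` step is a frequent cause of index errors when a new token is added, which may be what went wrong here.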