Comments (4)
Hi @vsocrates
Thanks for the issue!
You are getting that warning because the model's maximum positional embedding stops at 1024 tokens. Some models have fixed positional embeddings (e.g. an nn.Embedding layer), where you cannot exceed the maximum number of tokens by design; for other models it is possible to exceed it, at your own risk, since the model has not been trained on that many tokens. If you are getting consistent / nice generations, I would say there is nothing to worry about; otherwise you might need to use other models that support a longer context length.
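For what it's worth, here is a minimal sketch of where that limit lives, assuming GPT-2 as the example of a model with learned absolute position embeddings capped at 1024 tokens (your model may differ):

```python
# Sketch: inspect the 1024-token position-embedding limit (assumes GPT-2).
from transformers import AutoConfig, AutoModelForCausalLM

config = AutoConfig.from_pretrained("gpt2")
# Inputs longer than this trigger the "maximum sequence length" warning.
print(config.max_position_embeddings)  # 1024

model = AutoModelForCausalLM.from_pretrained("gpt2")
# The limit comes from a fixed-size learned embedding table for positions,
# so positions past 1024 simply have no trained embedding.
print(model.transformer.wpe)  # Embedding(1024, 768)
```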
Understood! I went through some other issues and it looks like T5 uses relative position embeddings, so in theory it should be possible to extend beyond its max context length (512 tokens), potentially with some loss of accuracy / weird generations. Is that correct?
Yes, this is correct. From my experience with flan-T5 that was possible, but with some potential loss of accuracy / inconsistent generation.
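For reference, a minimal sketch of feeding flan-T5 an input longer than 512 tokens; this assumes google/flan-t5-base and an artificially long input built by repetition. Because T5 computes a relative position bias rather than looking up absolute position embeddings, the forward pass runs, but quality beyond the trained length is not guaranteed:

```python
# Sketch: run flan-T5 on an input past its nominal 512-token limit
# (assumes google/flan-t5-base; quality past 512 tokens may degrade).
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-base")
model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-base")

long_text = "Summarize: " + "The quick brown fox jumps over the lazy dog. " * 200
inputs = tokenizer(long_text, return_tensors="pt", truncation=False)
print(inputs["input_ids"].shape)  # well beyond 512 tokens; the tokenizer may warn

# Relative position embeddings let this run end to end.
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```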
Great, thanks, closing this issue!