Comments (1)
Hey! The `generate` function is not meant to be used for training, which is why we don't test past key values together with output router logits. That said, the two aren't really incompatible: you might want to look at the distribution of the router logits during generation.
Do you want to open a PR for a fix?
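To make the "distribution of the router logits" idea concrete, here is a minimal sketch of how one might summarize expert usage once the logits are in hand. It is plain Python with made-up numbers, not the library's API: `expert_usage`, `softmax`, and `example_logits` are hypothetical names for illustration, and how the router logits are actually returned from generation is not shown here.

```python
import math
from collections import Counter


def softmax(logits):
    """Numerically stable softmax over one token's expert logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def expert_usage(router_logits, top_k=2):
    """Fraction of top-k routing slots assigned to each expert.

    router_logits: list of per-token lists, one logit per expert
    (assumed layout, chosen for this sketch only).
    """
    counts = Counter()
    for token_logits in router_logits:
        probs = softmax(token_logits)
        # Indices of the top_k experts for this token, highest probability first.
        top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]
        counts.update(top)
    total = sum(counts.values())
    num_experts = len(router_logits[0])
    return [counts[i] / total for i in range(num_experts)]


# Made-up logits for 3 generated tokens and 4 experts.
example_logits = [
    [2.0, 0.5, -1.0, 0.1],
    [1.5, 1.7, -0.3, 0.0],
    [-0.5, 0.2, 0.1, 3.0],
]
usage = expert_usage(example_logits, top_k=2)
print(usage)
```

A heavily skewed `usage` list would suggest that a few experts handle most generated tokens, which is the kind of thing worth inspecting at generation time.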