Comments (2)
`StaticCache.get_seq_length()` should not even exist. We ought to rely on `cache_positions`.
The roadmap is to deprecate all calls to `get_seq_length`, cc @gante
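The point of the two comments above can be illustrated with a toy sketch. It assumes a pre-allocated cache where the only way to recover the length is to scan for written slots, and compares that with reading the length off the positions passed in at each step. `ToyStaticCache` and `seq_length_from_positions` are hypothetical names made up for illustration, plain Python lists stand in for tensors, and none of this is the actual transformers implementation.

```python
# Toy illustration only: plain Python lists stand in for tensors, and
# ToyStaticCache is a hypothetical class, not the transformers StaticCache.

class ToyStaticCache:
    """Pre-allocated cache with a fixed number of slots."""

    def __init__(self, max_len):
        # Each slot holds one value; 0.0 means "never written" in this toy.
        self.slots = [0.0] * max_len

    def update(self, values, cache_position):
        # Write each new value at its absolute position in the sequence.
        for pos, value in zip(cache_position, values):
            self.slots[pos] = value

    def get_seq_length(self):
        # Infers the length by scanning for non-zero slots: a full pass over
        # the cache, and it miscounts when a written value happens to be 0.0.
        return sum(1 for v in self.slots if v != 0.0)


def seq_length_from_positions(cache_position):
    # The positions already say where the last token was written.
    return cache_position[-1] + 1


cache = ToyStaticCache(max_len=8)

# Prefill with 3 tokens (one of them legitimately 0.0), then decode one more.
prefill_positions = [0, 1, 2]
cache.update([0.5, 0.0, 0.7], prefill_positions)
decode_positions = [3]
cache.update([0.9], decode_positions)

scanned_length = cache.get_seq_length()
tracked_length = seq_length_from_positions(decode_positions)
```

In this toy setup the scan undercounts because one stored value is zero, while the position-based length needs no pass over the cache at all. Whether the same failure mode applies to a given real cache depends on how it tracks written slots.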
Related Issues (20)
- Batch inputs get different result to single input for llama model. HOT 8
- Weights of LlamaForQuestionAnswering were not initialized from the model checkpoint HOT 1
- data did not match any variant of untagged enum PyPreTokenizerTypeWrapper at line 12564 column 3 HOT 4
- [BLIP-2] BitsAndBytes 4 and 8 bit give empty string HOT 8
- Update `make-fixup` to make sure image processor tests present
- Add Llama 3 support to `convert_llama_weights_to_hf()` HOT 6
- Reccurent Gemma forward pass does not match with Google DeepMind original implementation HOT 3
- Idefics2 - sdpa vs flash attn 2 HOT 2
- Add HelpingAI-3B-v2.2: Emotionally Intelligent Conversational AI
- tranformers.Trainer.train() hangs with Llama3 base model HOT 9
- Llama generation with static cache fails in certain sequence lengths HOT 1
- [FSDP] redundant additional allgather during backward when using FSDP FULL_SHARD with gradient checkpointing HOT 1
- Assistant model not working for different sized openai models when using pipeline for ASR HOT 1
- Error trying to use https://huggingface.co/microsoft/BiomedVLP-CXR-BERT-specialized HOT 1
- Assisted decoding results are not correct HOT 7
- Why do the implementation behaviors of official llava and transformers differ? HOT 3
- Gemma's Tokenizer fails to split on spaces
- `StaticCache` Bad generation results with Llama after v4.39.0 HOT 2
- [SegGPT] Loss calculation is broken
- Remove `mps` workaround for `isin()` HOT 1