Comments (2)
It turns out it wasn't me who added this modeling file, it was @NielsRogge 😄
In such composite modeling cases, the encoder and decoder usually come from two different models (though we could use the same model for both with EncoderDecoderModel), so tying the input and output embeddings doesn't really make sense, as @amyeroberts mentioned.
You can definitely tweak the code to suit your custom modeling logic and needs.
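As a sketch of what such a tweak could look like (plain PyTorch with illustrative class and attribute names, not the library's actual modeling code): weight tying simply makes the decoder's input embedding table and its LM head point at the same parameter, which is easy to do, or skip, in a custom module.

```python
import torch
import torch.nn as nn

class TinyDecoder(nn.Module):
    """Minimal causal-LM-style decoder with optionally tied embeddings.

    Illustrative only: a real decoder would have transformer layers
    between the embedding lookup and the LM head.
    """

    def __init__(self, vocab_size: int = 100, hidden: int = 32, tie: bool = True):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)              # input embeddings
        self.lm_head = nn.Linear(hidden, vocab_size, bias=False)   # output embeddings
        if tie:
            # Tying: both modules now share one weight tensor, so the
            # vocabulary projection reuses the input embedding matrix.
            self.lm_head.weight = self.embed.weight

    def forward(self, ids: torch.Tensor) -> torch.Tensor:
        h = self.embed(ids)  # stand-in for the transformer stack
        return self.lm_head(h)

tied = TinyDecoder(tie=True)
untied = TinyDecoder(tie=False)
```

With `tie=True`, `tied.lm_head.weight is tied.embed.weight` holds; with `tie=False` the two parameters are independent. In a composite encoder-decoder, any such tying only makes sense *within* the decoder, since the encoder has no token vocabulary to share.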
from transformers.
Hi, thanks for raising an issue!
This is a question best placed in our forums. We try to reserve the GitHub issues for feature requests and bug reports.
cc @ydshieh who I believe added this model originally
"since the input and output embeddings are from the decoder and the decoder is an XForCausalLM class's object"

I'm not sure this is completely correct - the inputs to the model are pixel values, and the vision encoder has its own embeddings, which are not related to the decoder's embeddings.
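To make that point concrete, here is a hedged sketch in plain PyTorch (module names are illustrative): the vision encoder embeds pixel values through a patch projection, while the decoder embeds token ids through a vocabulary lookup table - the two have different shapes and share no weights, so there is nothing to tie between them.

```python
import torch
import torch.nn as nn

# Encoder side: pixel values -> patch embeddings (a convolution, no vocabulary).
patch_embed = nn.Conv2d(in_channels=3, out_channels=32, kernel_size=16, stride=16)
pixel_values = torch.randn(1, 3, 224, 224)
patches = patch_embed(pixel_values).flatten(2).transpose(1, 2)  # (1, 196, 32)

# Decoder side: token ids -> token embeddings (a lookup table over the vocab).
token_embed = nn.Embedding(num_embeddings=100, embedding_dim=32)
ids = torch.randint(0, 100, (1, 5))
tokens = token_embed(ids)  # (1, 5, 32)

# The conv kernel and the embedding table are different parameters with
# different shapes - tying them to each other would be meaningless.
```

Embedding tying only ever relates a model's *token* input embeddings to its *token* output head; the encoder's pixel-level embeddings sit outside that relationship entirely.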