Comments (3)
Hi @bobox2997, thanks for opening this PR!
What I would suggest is adding the checkpoints, configs, and possibly updated modeling files directly on the Hub, with as much support as we can provide there. That will be easier than integrating this directly into transformers. Here is a tutorial, if that sounds good to you!
from transformers.
I'm not sure I understood that correctly... What checkpoint should I add?
In the TSDAE implementation, the decoder portion is tied to the encoder and is not used at inference. I just need an `is_decoder` argument (and the related config options, of course) in DeBERTa's config, as there is for BERT, RoBERTa, and similar models...
I'm sorry if those are naive or dumb questions, I'm still learning.
Thank you so much for your time!
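For context, a minimal sketch of the asymmetry being described (assuming a recent transformers version): BERT's modeling code acts on `is_decoder`, while DeBERTa's does not, even though the flag can technically be set on any config because it lives on the shared `PretrainedConfig` base class.

```python
from transformers import BertConfig, DebertaConfig

# BERT's config exposes `is_decoder`, which switches the model to causal
# (decoder-style) attention and, together with `add_cross_attention`,
# enables cross-attention -- what a TSDAE-style tied decoder relies on.
bert_cfg = BertConfig(is_decoder=True, add_cross_attention=True)
print(bert_cfg.is_decoder)  # True

# DebertaConfig inherits the flag from PretrainedConfig, so setting it
# does not raise, but DebertaModel's forward pass ignores it: DeBERTa
# has no causal-masking or cross-attention path wired up.
deberta_cfg = DebertaConfig(is_decoder=True)
print(deberta_cfg.is_decoder)  # True, but has no effect on the model
```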
@bobox2997 Ah, OK, I thought there were checkpoints available trained with this method.
In terms of changes in the transformers library, we're very unlikely to accept changes to the architecture or configurations files to add new features like this, especially for older, popular models and anything which doesn't have official checkpoints available.
The great thing about open source is that you are free to build upon and adapt the available code (license permitting) for your own projects. It should be possible to add as a new architecture on the hub, keeping compatibility with the transformers library and allowing you to use the same API.
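The custom-architecture route can be sketched as follows. The names `TsdaeDebertaConfig` and `TsdaeDebertaModel` are hypothetical, and the model body is a placeholder rather than a real TSDAE decoder; the point is the `register_for_auto_class` / `push_to_hub` workflow that keeps the code loadable through the standard Auto API.

```python
import torch.nn as nn
from transformers import PretrainedConfig, PreTrainedModel


class TsdaeDebertaConfig(PretrainedConfig):
    """Hypothetical config for a Hub-hosted custom architecture."""
    model_type = "tsdae-deberta"

    def __init__(self, hidden_size=768, **kwargs):
        self.hidden_size = hidden_size
        # `is_decoder` (and other common flags) pass through **kwargs
        # and are handled by the PretrainedConfig base class.
        super().__init__(**kwargs)


class TsdaeDebertaModel(PreTrainedModel):
    """Placeholder model body; a real version would implement the
    tied decoder with causal masking and cross-attention."""
    config_class = TsdaeDebertaConfig

    def __init__(self, config):
        super().__init__(config)
        self.dense = nn.Linear(config.hidden_size, config.hidden_size)

    def forward(self, hidden_states):
        return self.dense(hidden_states)


# Register the classes so that, once pushed to the Hub alongside the
# modeling code, they load via AutoConfig / AutoModel.
TsdaeDebertaConfig.register_for_auto_class()
TsdaeDebertaModel.register_for_auto_class("AutoModel")
```

After a `push_to_hub()` of the model and its code, anyone could load it with `AutoModel.from_pretrained(..., trust_remote_code=True)` while keeping the usual transformers API.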
If you or anyone else in the community would like to implement this, feel free to share your project here!