Comments (2)
Thanks for the clarification! This addressed all my questions.
I will give this a try!
I agree, this is quite a useful feature of AdapterFusion! For instance, we leverage recent_attention in our AdapterDrop paper for pruning AdapterFusion (§4.2). Hence, in #84, we will clean this up and add documentation on how to read out the fusion weights.
To answer your questions:
Should I understand this to be the attention displayed in the above figure? But how do I get something of shape [num_adapters, num_adapters]?
That is correct. If you consider only one downstream task, you could obtain a tensor of shape [n_layers, n_adapters, seq_len]. Averaging over the last dimension gives you [n_layers, n_adapters]. In the AdapterFusion paper, I believe we also averaged over n_layers, resulting in [n_adapters]. If you now repeat this for several tasks, you get [n_tasks, n_adapters], where n_tasks == n_adapters in the special case you are referring to.
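The averaging described above can be sketched with a dummy attention tensor (the shapes and random values here are stand-ins; in practice the tensor would come from the fusion layer after a forward pass):

```python
import numpy as np

# Hypothetical fusion attention for one downstream task:
# shape [n_layers, n_adapters, seq_len]
n_layers, n_adapters, seq_len = 12, 3, 128
attn = np.random.rand(n_layers, n_adapters, seq_len)

# Average over the sequence dimension -> [n_layers, n_adapters]
per_layer = attn.mean(axis=-1)

# Additionally average over layers -> [n_adapters]
per_adapter = per_layer.mean(axis=0)

print(per_layer.shape)    # (12, 3)
print(per_adapter.shape)  # (3,)
```

Repeating the last step once per task and stacking the results would give the [n_tasks, n_adapters] matrix mentioned above.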
Accessing the stored attention tensors [...] How should I do this?
One way could be:
model.roberta.encoder.layer[layer_i].output.adapter_fusion_layer['<name of the fusion layer>'].recent_attention
We will add a cleaned-up variant of this soon.
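A sketch of collecting these per-layer tensors into one [n_layers, n_adapters] matrix, following the attribute path above. The fusion-layer name and the tensor shapes are stand-ins, and the model's attribute hierarchy is mocked with plain namespaces here so the snippet is self-contained; with a real model you would use `model.roberta.encoder.layer` directly after a forward pass:

```python
import numpy as np
from types import SimpleNamespace

# Mock of model.roberta.encoder.layer[i].output
#   .adapter_fusion_layer[...].recent_attention.
# 'my_fusion' is a hypothetical fusion-layer name.
n_layers, n_adapters, seq_len = 12, 3, 128
layers = [
    SimpleNamespace(output=SimpleNamespace(adapter_fusion_layer={
        "my_fusion": SimpleNamespace(
            recent_attention=np.random.rand(n_adapters, seq_len))
    }))
    for _ in range(n_layers)
]

# Read recent_attention from every layer, average over the
# sequence dimension, and stack -> [n_layers, n_adapters]
fusion_attn = np.stack([
    layer.output.adapter_fusion_layer["my_fusion"]
         .recent_attention.mean(axis=-1)
    for layer in layers
])
print(fusion_attn.shape)  # (12, 3)
```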
Would that address your issue?