Comments (6)
In the `run_squad.py` script, I added the following lines after the training loop:

```python
logger.info("***** Saving fine-tuned model *****")
output_model_file = os.path.join(args.output_dir, "pytorch_model.bin")
if n_gpu > 1:
    torch.save(model.module.bert.state_dict(), output_model_file)
else:
    torch.save(model.bert.state_dict(), output_model_file)
```
The code runs, and I was able to load the model to test on the Adversarial SQuAD datasets. I do not use the other `run_*` scripts, but this may be applicable to them as well.
Edit: the files have been modified in the latest commits, so I think it is now necessary to check how fine-tuned models are loaded in the script.
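For anyone checking the loading side, here is a minimal sketch of the save/load round trip implied by the snippet above, using a toy `nn.Module` in place of the BERT encoder (the class and paths are hypothetical, for illustration only):

```python
import os
import tempfile
import torch
import torch.nn as nn

# Toy stand-in for the BERT encoder; the real script saves model.bert
class TinyEncoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(4, 4)

model = TinyEncoder()
n_gpu = torch.cuda.device_count()
if n_gpu > 1:
    # DataParallel wraps the module, so its weights live under .module
    model = nn.DataParallel(model)

output_dir = tempfile.mkdtemp()
output_model_file = os.path.join(output_dir, "pytorch_model.bin")

# Unwrap DataParallel before saving, mirroring the n_gpu check above
to_save = model.module if hasattr(model, "module") else model
torch.save(to_save.state_dict(), output_model_file)

# Load the weights back into a fresh instance for evaluation
restored = TinyEncoder()
restored.load_state_dict(torch.load(output_model_file, map_location="cpu"))
restored.eval()
```

Saving the unwrapped `state_dict` (rather than the `DataParallel` wrapper's) keeps the checkpoint loadable on a single-GPU or CPU machine without key-name mismatches.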
from transformers.
You are right, this argument was not used. I removed it, thanks. These examples are provided as a starting point for writing training scripts with the package module. I don't plan to update them further (except to fix bugs).
What are your results on Adversarial SQuAD?
At that time I got:

AddSent
- BERT base: 58.7 EM / 66.2 F1
- BERT large: 65.5 EM / 71.9 F1

AddOneSent
- BERT base: 67.0 EM / 74.7 F1
- BERT large: 72.7 EM / 79.1 F1
Thanks a lot! Did you release a paper? I want to cite your results in my paper.
Unfortunately it was not part of a paper, just preliminary results.