Comments (2)
@thomwolf On SQuAD v1.1, BERT (single) scored 85.083 EM and 91.835 F1 as reported in their paper, but when I fine-tuned BERT using run_squad.py
I got {"exact_match": 81.0975, "f1": 88.7005}. Why is there a difference? What am I missing?
from transformers.
Thanks for the details.
This PyTorch repo is starting to be used by a larger community so we would have to be a little more precise than just rough numbers if we want to include such pre-trained weights.
If you want to add your weights to the repo, you should convert them to the PyTorch repo's model format and report evaluation results on SQuAD obtained with the PyTorch model, so everybody has a clear understanding of what they are using. Otherwise I think it's better that people do their own training and know the capabilities of the fine-tuned model they are using.
Feel free to come back and re-open the issue if this is something you would like to do.
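For readers comparing numbers like the ones above: exact_match and f1 are per-question scores averaged over the dev set. Below is a minimal sketch of how they are computed, modelled on the answer normalization used by the SQuAD v1.1 evaluation script. The function names are illustrative assumptions, not the API of run_squad.py or of this repo.

```python
import re
import string
from collections import Counter


def normalize_answer(text):
    """Lowercase, strip punctuation, drop English articles, collapse whitespace."""
    text = text.lower()
    punctuation = set(string.punctuation)
    text = "".join(ch for ch in text if ch not in punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction, ground_truth):
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize_answer(prediction) == normalize_answer(ground_truth))


def f1(prediction, ground_truth):
    """Token-overlap F1 between the normalized prediction and reference."""
    pred_tokens = normalize_answer(prediction).split()
    truth_tokens = normalize_answer(ground_truth).split()
    # Multiset intersection counts each shared token at most as often
    # as it appears in both answers.
    num_same = sum((Counter(pred_tokens) & Counter(truth_tokens)).values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(truth_tokens)
    return 2 * precision * recall / (precision + recall)


def best_over_references(metric, prediction, ground_truths):
    """Each question has several reference answers; the best match counts."""
    return max(metric(prediction, gt) for gt in ground_truths)
```

The dataset-level figures are the mean of these per-question scores, multiplied by 100. Because both metrics depend on the predicted answer span, differences in fine-tuning setup can move them by several points.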