Comments (10)
Hello,
To clarify: are you loading the model at step 67? Is the performance of the model when you load the checkpoint 53, and is the performance of that checkpoint in the log 58?
from t-few.
Hi @dptam, the step is actually 75. As you can see from the log here, in line 20 (epoch 19), the logged accuracy is 0.5812.
But when I run this code, the accuracy is 0.5848.
BTW, step 79 gives 0.5631.
The same thing happens on the COPA dataset: line 221 is 0.62.
But when I tried to rerun steps 883 and 887, the result is 0.54, and step 879 is 0.55.
I'm not sure what the issue is. If you don't mind, could you rerun and add self.global_step to the metrics dictionary here? This should output the global step in the log that matches the global step used to save the model, just to make sure the line number corresponds to the correct checkpoint.
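The suggestion above can be sketched without any t-few specifics. A minimal standalone example (no Lightning dependency; the class and method names here are made up for illustration, not from the repo) of recording the trainer's global step alongside the validation metrics, so each logged line can be matched to a saved checkpoint:

```python
# Minimal sketch: record the current global step in the metrics dict so
# each logged validation line maps unambiguously to a saved checkpoint.
# In a LightningModule this would be the built-in self.global_step.
class ValidationLogger:
    def __init__(self):
        self.global_step = 0
        self.history = []

    def log_validation(self, metrics: dict) -> dict:
        entry = dict(metrics)
        entry["global_step"] = self.global_step  # tie log line to checkpoint
        self.history.append(entry)
        return entry

logger = ValidationLogger()
logger.global_step = 75
print(logger.log_validation({"accuracy": 0.5812}))
# → {'accuracy': 0.5812, 'global_step': 75}
```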
Hi @dptam, actually when I tried to run finish.pt, it does not match the last accuracy in the log.
Is there something wrong with the code? @muqeeth @jmohta @HaokunLiu
@dptam I have added the global step as you suggested, but it still does not match.
What is in pl_test.py? Do you mind sharing what you have there?
Hello,
Thanks for rerunning the code. I'm still not sure why loading and rerunning the model doesn't match the log performance - could you share the command used to train the model?
Regarding the issue of finish.pt not matching the last accuracy in the log, see #11 for details on why.
Hi @HaokunLiu @dptam, actually pl_test is just a copy of train, except for the loading method. I used both your save-model method and the checkpoint method of PyTorch Lightning. See:
I also changed encoderdecoder.py a little.
But here is the thing: the train command is as below.
And the test code is as below; actually pl_train and pl_test produce the same result.
And here is the log, using not finish.pt but 51, as @dptam suggested.
Hi,
I tried to look into it a bit and couldn't figure out the cause, but I found one issue for me at least (not sure if it will be the same for you). Sorry I don't have more time to look into it currently, but maybe you can.
When using t5-small and printing out the norm of self.model.lm_head.weight, the norm is 94070 in the train_step function but 94072 in the predict function. This is due to a precision issue when moving from CPU to GPU, and one remedy was adding
self.weight = torch.clone(self.model.lm_head.weight).double().cuda().float()
at the end of the init function in EncoderDecoder.py, and adding
self.model.lm_head.weight = torch.nn.Parameter(self.weight)
at the beginning of the training_step function.
This causes self.model.lm_head.weight to consistently have norm 94070 in both train_step and predict, but the accuracy from the log and from loading a validation checkpoint still do not match. I'm not sure why, but one potential further analysis is to look at the other weights of the model.
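The precision effect described above can be reproduced without a GPU or torch at all. A small sketch (plain Python, using struct to emulate a float32 cast; this only illustrates the rounding behavior, not the t-few code itself): a value loses precision on the first down-cast to float32, but a second round-trip is stable, which is why pinning one canonical copy of the weight, as in the clone().double().cuda().float() remedy, can make train_step and predict see identical values.

```python
import struct

def to_float32(x: float) -> float:
    # Round a Python float (float64) to float32 precision, emulating the
    # casts that happen when tensors change dtype or move between devices.
    return struct.unpack("f", struct.pack("f", x))[0]

w = 0.1                        # a weight value stored in float64
w32 = to_float32(w)            # first cast: precision is lost...
assert w32 != w                # ...so the two copies no longer agree
assert to_float32(w32) == w32  # but re-casting is stable (idempotent)
```

This is why two code paths that cast the same weight a different number of times can report slightly different norms, while paths that both start from one already-cast copy agree exactly.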