Comments (2)
Thanks a lot for posting, interesting findings!
I've checked the Perplexity implementation and haven't found any issues code-wise.
For DPO, validation loss and perplexity may behave differently, because the DPO calculation uses `policy_chosen_logps - policy_rejected_logps - (reference_chosen_logps - reference_rejected_logps)`, whereas Perplexity uses `chosen_logits` only.
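To illustrate how the two quantities can diverge, here is a minimal sketch in plain Python. It is not the H2O LLM Studio code: the sigmoid form of the loss, the `beta` value, and the function names `dpo_loss` and `chosen_perplexity` are assumptions made for illustration, and the log-probabilities are made-up numbers.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             reference_chosen_logp, reference_rejected_logp, beta=0.1):
    """Sigmoid DPO loss for one pair (assumed form; beta is a hypothetical value)."""
    logits = (policy_chosen_logp - policy_rejected_logp) - (
        reference_chosen_logp - reference_rejected_logp
    )
    # -log(sigmoid(beta * logits)) == log(1 + exp(-beta * logits))
    return math.log1p(math.exp(-beta * logits))


def chosen_perplexity(chosen_token_logps):
    """Perplexity of the chosen answer only: exp of the mean negative token log-prob."""
    return math.exp(-sum(chosen_token_logps) / len(chosen_token_logps))


# Made-up numbers: the policy gets worse on the chosen answer,
# but gets worse on the rejected answer even faster.
chosen_before, chosen_after = [-1.0, -1.0], [-1.5, -1.5]
rejected_before, rejected_after = -3.0, -6.0
ref_chosen, ref_rejected = -2.0, -3.0

loss_before = dpo_loss(sum(chosen_before), rejected_before, ref_chosen, ref_rejected)
loss_after = dpo_loss(sum(chosen_after), rejected_after, ref_chosen, ref_rejected)
ppl_before = chosen_perplexity(chosen_before)
ppl_after = chosen_perplexity(chosen_after)

print(f"DPO loss:   {loss_before:.3f} -> {loss_after:.3f}")
print(f"Perplexity: {ppl_before:.2f} -> {ppl_after:.2f}")
```

In this toy case the DPO loss goes down while the perplexity of the chosen answer goes up, because the loss only depends on the margin between chosen and rejected log-probabilities relative to the reference model.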
It is probably a good idea to add additional train/validation metrics (logged to Neptune), such as CE loss, to better track the experiment.
I've created this branch where `SampleAveragedCrossEntropyLoss` is used as a validation loss and ran some experiments on it. So far, val loss is in sync with Perplexity.
from h2o-llmstudio.
I dug a bit deeper into the loss curves in this branch; cross-entropy is logged for both accepted and rejected samples, along with the corresponding perplexity.
Regarding the different behavior of loss vs. perplexity, my explanation is that evaluation samples the model struggles to predict can disproportionately impact the overall perplexity.
As an example, suppose we have a dataset with 4 samples where the third sample is out of distribution for the SFT model.
- After epoch 1, suppose the model has the following cross-entropy loss for each sample: `(1, 1, 9, 1)` (the third answer is hard to predict).
  Mean cross-entropy (sum of per-sample cross-entropy / num_samples) will be `12/4 = 3`.
  Mean perplexity (sum of per-sample perplexity / num_samples) will be `mean([exp(1), exp(1), exp(9), exp(1)]) ~ 2027.8`.
- After epoch 2, suppose the model has the following cross-entropy loss for each sample: `(4, 4, 4, 4)` (the model adapts to the third sample at the cost of predicting the other samples worse).
  Mean cross-entropy will be `16/4 = 4`.
  Mean perplexity will be `mean([exp(4), exp(4), exp(4), exp(4)]) = exp(4) ~ 54.6`.

So the mean cross-entropy rises from 3 to 4 while the mean perplexity drops from about 2027.8 to about 54.6.
My assumption is that this explains the behavior observed (and it is not caused by a bug).
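The arithmetic in this example can be checked with a short sketch. The helper names `mean_cross_entropy` and `mean_perplexity` are hypothetical and only mirror the per-sample averaging described above; they are not taken from the H2O LLM Studio code.

```python
import math


def mean_cross_entropy(losses):
    """Average of the per-sample cross-entropy values."""
    return sum(losses) / len(losses)


def mean_perplexity(losses):
    """Average of the per-sample perplexities, where perplexity = exp(cross-entropy)."""
    return sum(math.exp(loss) for loss in losses) / len(losses)


epoch_1 = [1, 1, 9, 1]  # third sample is hard to predict
epoch_2 = [4, 4, 4, 4]  # model adapts to it, other samples get worse

for name, losses in [("epoch 1", epoch_1), ("epoch 2", epoch_2)]:
    print(f"{name}: mean CE = {mean_cross_entropy(losses):.1f}, "
          f"mean perplexity = {mean_perplexity(losses):.1f}")
# epoch 1: mean CE = 3.0, mean perplexity = 2027.8
# epoch 2: mean CE = 4.0, mean perplexity = 54.6
```

With these inputs the epoch-2 mean perplexity is `exp(4)`, about 54.6, since all four samples share the same loss. Because `exp` is convex, a single large per-sample loss dominates the averaged perplexity far more than it dominates the averaged cross-entropy.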