Comments (5)
Saving a checkpoint preserves the original parameter dtype. Do you mean you want to train the model in fp32 but save it in fp16? If you train a model in fp16, it will be saved in fp16 by default.
from swissarmytransformer.
Yes. If that were supported, I could choose what I need. When I fine-tune CogVLM, it only supports saving checkpoints in fp32, which takes 60GB+ of storage per model; fp16 may be enough. The official API doesn't seem to support this.
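For a rough sense of the storage involved, here is a back-of-the-envelope sketch. It assumes the checkpoint holds only the weights (no optimizer state), that 1 GB is 10^9 bytes, and that the parameter count is simply whatever makes an fp32 checkpoint come to about 60 GB; `checkpoint_size_gb` and the 15e9 figure are illustrative, not taken from CogVLM or SwissArmyTransformer.

```python
# Bytes used to store one parameter in each dtype.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "bf16": 2}


def checkpoint_size_gb(num_params: float, dtype: str) -> float:
    """Estimate weight-only checkpoint size in GB (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


# Hypothetical parameter count chosen so that fp32 comes to about 60 GB.
num_params = 15e9

for dtype in ("fp32", "fp16", "bf16"):
    print(f"{dtype}: {checkpoint_size_gb(num_params, dtype):.1f} GB")
```

Under these assumptions, saving in fp16 or bf16 halves the on-disk size compared with fp32.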
CogVLM fine-tuning saves the model in bf16, unless you train it in fp32.
Saving the model in a dtype different from the one you train with is memory-consuming, because you need a copy of the whole model to do the conversion.
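To illustrate where that extra copy comes from, here is a minimal sketch in plain PyTorch. It is not SwissArmyTransformer's checkpoint code; `cast_state_dict` is a hypothetical helper. It casts the floating-point tensors of a state dict one by one and places the results in host memory, so the converted copy still exists in full, only on the CPU and at the target precision.

```python
import torch


def cast_state_dict(state_dict, dtype=torch.float16):
    """Return a new dict whose floating-point tensors are CPU copies in `dtype`.

    Hypothetical helper: the original tensors are left untouched, so the
    converted copy costs extra host memory on top of the model itself.
    """
    converted = {}
    for name, value in state_dict.items():
        if torch.is_tensor(value) and value.is_floating_point():
            # Cast one tensor at a time and keep the result on the CPU.
            converted[name] = value.detach().to("cpu", dtype=dtype)
        else:
            # Integer tensors and non-tensor entries are passed through as-is.
            converted[name] = value
    return converted


# Small stand-in for a model's state dict.
state = {"weight": torch.ones(2, 3, dtype=torch.float32), "step": torch.tensor(5)}
half_state = cast_state_dict(state)
print({k: v.dtype for k, v in half_state.items()})
```

The converted dict could then be written with `torch.save`. Whether the result loads back through a given training framework depends on the checkpoint layout that framework expects, which this sketch does not address.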
thanks a lot