Comments (2)
The Zhihu introduction is here: https://zhuanlan.zhihu.com/p/546245070
from fengshenbang-lm.
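Hi authors,
I saw the project introduction and the published leaderboard on Zhihu, and I am very interested in this work.
I have been trying to reproduce it over the past few days, but with the base model my COCO-CN evaluation results are still well below the published numbers. Will you release the training details later?
Here are my own training details first: MoCo + contrastive learning, the Adam optimizer, an initial learning rate of e-4, learning-rate warmup + polynomial decay, multi-node training on 4 * 8 A100s, batch size 256, and roughly 800k training steps. So far I can only reach 80+ on COCO-CN.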
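For readers comparing setups, the "warmup + polynomial decay" schedule mentioned above can be sketched roughly as follows. This is only an illustration, not the poster's actual code: the function name `warmup_poly_lr`, the 10,000 warmup steps, the decay power, and the end learning rate are all assumptions, and `base_lr=1e-4` is one reading of the "e-4" in the post. Only the 800k total steps comes from the post.

```python
def warmup_poly_lr(step, total_steps, base_lr=1e-4, warmup_steps=10_000,
                   power=1.0, end_lr=0.0):
    """Linear warmup followed by polynomial decay.

    All defaults are illustrative assumptions, not values confirmed
    by the thread.
    """
    if step < warmup_steps:
        # Ramp linearly from near zero up to base_lr.
        return base_lr * (step + 1) / warmup_steps
    # Fraction of the decay phase that has elapsed, clipped to [0, 1].
    progress = min(1.0, (step - warmup_steps) / max(1, total_steps - warmup_steps))
    # Polynomial interpolation from base_lr down to end_lr.
    return end_lr + (base_lr - end_lr) * (1.0 - progress) ** power


if __name__ == "__main__":
    total = 800_000  # "roughly 800k steps" from the post
    for s in (0, 10_000, 400_000, 800_000):
        print(s, warmup_poly_lr(s, total))
```

With `power=1.0` this reduces to a linear decay; larger powers drop the learning rate faster early in the decay phase.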
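I did not use MoCo; I used plain, original-style contrastive learning. For the base version, the learning rate is 5e-4 (for the larger model versions the learning rate should be ten times smaller). The batch size is 512, trained on 2 * 8 A100s, with warmup and cosine decay (I don't think the schedule matters much here). It converges after roughly 24 epochs. I trained with the open_clip library, which you can use as a reference.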
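As a rough illustration of the recipe described in this reply, the sketch below shows a CLIP-style symmetric contrastive loss and a warmup + cosine decay schedule in plain Python. It is not the author's training code and does not use open_clip's API. The function names, the temperature (`logit_scale = 1 / 0.07`), and the 2,000 warmup steps are assumptions. Only the 5e-4 base learning rate comes from the reply.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (a zero vector is left unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def log_softmax(row):
    """Numerically stable log-softmax over a list of logits."""
    m = max(row)
    lse = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - lse for x in row]


def symmetric_contrastive_loss(image_feats, text_feats, logit_scale=1 / 0.07):
    """Average of image-to-text and text-to-image cross-entropy.

    Row k of image_feats is assumed to be paired with row k of text_feats;
    every other row in the batch acts as a negative.
    logit_scale is an assumed temperature, not a value from the thread.
    """
    imgs = [l2_normalize(v) for v in image_feats]
    txts = [l2_normalize(v) for v in text_feats]
    n = len(imgs)
    # Scaled cosine similarity between every image and every text.
    logits = [[logit_scale * sum(a * b for a, b in zip(i, t)) for t in txts]
              for i in imgs]
    loss_i2t = -sum(log_softmax(logits[k])[k] for k in range(n)) / n
    columns = [[logits[r][c] for r in range(n)] for c in range(n)]
    loss_t2i = -sum(log_softmax(columns[k])[k] for k in range(n)) / n
    return 0.5 * (loss_i2t + loss_t2i)


def warmup_cosine_lr(step, total_steps, base_lr=5e-4, warmup_steps=2_000):
    """Linear warmup, then cosine decay to zero (warmup_steps is assumed)."""
    if step < warmup_steps:
        return base_lr * (step + 1) / warmup_steps
    progress = min(1.0, (step - warmup_steps) / max(1, total_steps - warmup_steps))
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    matched = symmetric_contrastive_loss([[1, 0], [0, 1]], [[1, 0], [0, 1]])
    swapped = symmetric_contrastive_loss([[1, 0], [0, 1]], [[0, 1], [1, 0]])
    print(matched < swapped)  # correctly paired features give the lower loss
```

In a real run the features would come from the image and text encoders, and the loss would be computed on tensors across all GPUs, so the effective number of negatives depends on the global batch size (512 in this reply, 256 in the question).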