Comments (3)
https://blog.csdn.net/bqw18744018044/article/details/129016658
This idea was prompted by the performance degradation described in the article above. I also ran into a similar problem myself when using load_in_8bit.
We have not run rigorous inference-speed tests with load_in_8bit=True. Benchmark data is welcome.
It is indeed slower, and by quite a lot. Fine-tuning is affected too: training takes considerably longer.
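Since benchmark data was requested above, here is a minimal timing sketch that could be used to collect it. It is not from the BELLE repository. The names `benchmark_generation`, `slowdown`, and `generate_fn` are hypothetical, and `generate_fn` is assumed to wrap one generation call (for example `model.generate(...)` on a model loaded with or without load_in_8bit=True) and to return the number of newly generated tokens.

```python
import statistics
import time
from typing import Callable, Dict


def benchmark_generation(
    generate_fn: Callable[[], int], warmup: int = 1, runs: int = 5
) -> Dict[str, float]:
    """Time a generation callable and report latency and throughput.

    generate_fn is a hypothetical wrapper: it should run one generation call
    and return how many new tokens it produced. When timing GPU work, the
    wrapper should synchronize the device before returning (e.g. with
    torch.cuda.synchronize()), otherwise the clock stops too early.
    """
    for _ in range(warmup):
        generate_fn()  # warm-up runs are excluded from the measurement

    latencies = []
    token_counts = []
    for _ in range(runs):
        start = time.perf_counter()
        n_tokens = generate_fn()
        latencies.append(time.perf_counter() - start)
        token_counts.append(n_tokens)

    total_time = sum(latencies)
    return {
        "mean_latency_s": statistics.mean(latencies),
        "tokens_per_s": sum(token_counts) / total_time if total_time > 0 else float("inf"),
    }


def slowdown(baseline: Dict[str, float], candidate: Dict[str, float]) -> float:
    """Return how many times slower the candidate is than the baseline."""
    return baseline["tokens_per_s"] / candidate["tokens_per_s"]
```

Running `benchmark_generation` once for the fp16 model and once for the 8-bit model, with the same prompt and the same maximum number of new tokens, and then passing both results to `slowdown` would give a single comparable figure to report here.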