Comments (2)
Increase --gpu-memory-utilization. vLLM works differently in that it takes up a large share of the device's memory.
from qwen1.5.
Increase --gpu-memory-utilization. vLLM works differently in that it takes up a large share of the device's memory.
I already use --gpu-memory-utilization 0.95. How can I raise it further?
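To make the exchange above concrete, here is a minimal back-of-the-envelope sketch of why the setting matters. It assumes that --gpu-memory-utilization is the fraction (between 0 and 1) of GPU memory that vLLM reserves for model weights, activations, and the KV cache. The helper name `kv_cache_budget_gib` and all of the numbers are hypothetical and only illustrate the arithmetic; they are not measurements from this issue.

```python
def kv_cache_budget_gib(total_gib, utilization, weights_gib, activation_gib):
    """Rough estimate (in GiB) of what is left for the KV cache.

    Hypothetical helper for illustration only: it assumes the reserved
    budget is total_gib * utilization and that weights and activations
    are subtracted from that budget.
    """
    if not 0.0 < utilization <= 1.0:
        raise ValueError("utilization must be in (0, 1]")
    budget = total_gib * utilization
    return budget - weights_gib - activation_gib


if __name__ == "__main__":
    # Illustrative numbers: a 24 GiB card, 15 GiB of weights,
    # 2 GiB assumed for activations.
    for util in (0.90, 0.95):
        left = kv_cache_budget_gib(24.0, util, 15.0, 2.0)
        print(f"utilization={util:.2f}: about {left:.1f} GiB left for the KV cache")
```

Because the value is a fraction capped at 1.0, there is little headroom left above 0.95 under these assumptions.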
Related Issues (20)
- How can Tongyi Qianwen (Qwen) be used to understand documents and images? HOT 1
- Qwen2-72b-instruct gguf HOT 3
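- Qwen2-57B-A14B-Instruct-GPTQ-Int4 model generates tokens very slowly HOT 1
- Qwen2-7B-Instruct: the inference example code takes abnormally long on a 3090 GPU!
- [Bug in code generation or automated test-case generation scenarios] The model seems extremely stubborn, downright pig-headed?? The prompt explicitly says what not to do, yet every output still ignores the requirements HOT 3
- Could you release a LoRA fine-tuning version that does not use deepspeed (the deepspeed environment keeps causing problems)? HOT 1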
- CUDA extension not installed. HOT 2
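- Qwen2-57B-A14B-Instruct-GPTQ-Int4 inference is extremely slow HOT 2
- Qwen2_7B model: generate prints Setting `pad_token_id` to `eos_token_id`:151643 for open-end generation HOT 1
- Hello, when testing the Qwen2-1.5B-Instruct model on the C-Eval (5-shot) dataset, why does the model output the shot content in its answers?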
- [QA] Number of training tokens
- About the implementation of SWA HOT 1
- Fine tuning for another language HOT 1
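- Qwen-7b model: during question answering, the response also contains answers to unrelated questions. What is the cause? HOT 1
- QWEN2-72B-instruct, zero3, error HOT 1
- Qwen2-7B-Instruct-GPTQ-Int8 cannot run inference properly with Transformer+AutoGPTQ, but runs fine with vllm.
- When testing the official langgraph examples with qwen1.5-32B-chat, function calling does not work. All code follows the langgraph notebook with only the model replaced, and the model is deployed on a local fschat server. Where is the problem?
- Qwen2-7B-instruct reaches 79.9 on HumanEval, but reproducing it with the qwen1 prompts gives only 62. Could some of the qwen2 evaluation prompts be made public?
- Why are logits converted to fp32 before softmax?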
- TypeError: torch.finfo() requires a floating point input type. Use torch.iinfo to handle 'torch.finfo' When using Qwen2-1.5B-Instruct