Comments (3)
Download the model locally:
brew install git-lfs
git lfs install
git clone https://huggingface.co/baichuan-inc/Baichuan-13B-Chat
If you run into network problems, you can download the model files manually:
GIT_LFS_SKIP_SMUDGE=1 git clone https://huggingface.co/baichuan-inc/Baichuan-13B-Chat
Point the model at the local path.
Use MPS acceleration: model = model.to('mps')
Quantization is not supported on Mac; no workaround for now.
Modify both cli_demo.py and web_demo.py as follows:
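Since MPS is only available on Apple-silicon builds of PyTorch, it is safer to fall back to CPU when it is absent. A minimal sketch (the pick_device helper is illustrative, not part of the demo scripts; in real code the flag comes from torch.backends.mps.is_available()):

```python
def pick_device(mps_available: bool) -> str:
    """Return the torch device string to move the model to.

    In a real script, pass torch.backends.mps.is_available() as the flag;
    quantization is off the table on Mac, so only fp16 on MPS or CPU apply.
    """
    return "mps" if mps_available else "cpu"

# e.g. model = model.to(pick_device(torch.backends.mps.is_available()))
```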
import torch
import streamlit as st
from transformers import AutoModelForCausalLM, AutoTokenizer, GenerationConfig

pretrained_model_name = "./model/Baichuan-13B-Chat/"

st.set_page_config(page_title="Baichuan-13B-Chat")
st.title("Baichuan-13B-Chat")

@st.cache_resource
def init_model():
    model = AutoModelForCausalLM.from_pretrained(
        pretrained_model_name,
        torch_dtype=torch.float16,
        # device_map="auto",  # disabled; move to MPS manually instead
        trust_remote_code=True,
    )
    model = model.to('mps')  # Apple-silicon GPU acceleration
    model.generation_config = GenerationConfig.from_pretrained(
        pretrained_model_name
    )
    tokenizer = AutoTokenizer.from_pretrained(
        pretrained_model_name,
        use_fast=False,
        trust_remote_code=True,
    )
    return model, tokenizer
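For reference, Baichuan's remote-code chat() interface takes the conversation as a list of role/content dicts. A sketch of assembling that list from (user, assistant) turn pairs (the build_messages helper is hypothetical, not part of the demo scripts):

```python
def build_messages(history):
    """Convert (user_text, assistant_text) turn pairs into the
    role/content list expected by model.chat(tokenizer, messages).
    Pass None as assistant_text for the turn still awaiting a reply."""
    messages = []
    for user_text, assistant_text in history:
        messages.append({"role": "user", "content": user_text})
        if assistant_text is not None:
            messages.append({"role": "assistant", "content": assistant_text})
    return messages

# e.g. response = model.chat(tokenizer, build_messages([("你好", None)]))
```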
My system:
MacBook Pro M2 Max (12-core CPU, 38-core GPU, 32 GB RAM)
Result: it runs, but it is too slow to be useful.
Peak memory usage exceeds 32 GB, spilling into virtual memory.
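A back-of-the-envelope check supports this: at float16 (2 bytes per parameter), the ~13B weights alone take roughly 24 GiB before any activations or KV cache, so 32 GB of unified memory is quickly exhausted:

```python
params = 13_000_000_000   # ~13B parameters
bytes_per_param = 2       # float16
weight_gib = params * bytes_per_param / 1024**3
print(f"weights alone: {weight_gib:.1f} GiB")  # ~24.2 GiB
```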
from baichuan-13b.
Thanks to the poster above; with your settings I got it running on a Mac, and on an M1 Max with 64 GB RAM the speed is acceptable.
That said, today I saw that someone has made a ggml version:
https://huggingface.co/xuqinyang/baichuan-13b-chat-ggml-int8
Regarding the ggml version mentioned above (https://huggingface.co/xuqinyang/baichuan-13b-chat-ggml-int8): does it work well? My Mac is a MacBook M1 Max with 64 GB RAM; would it run well?