Comments (8)
Update this file: https://huggingface.co/THUDM/glm-4-9b-chat/blob/main/modeling_chatglm.py
from glm-4.
Updating the modeling_chatglm.py file did not solve the error. If I run the model on two GPUs, it still throws the same error. If I run it on a single GPU, it gives the following error:
def forward(
File "/home/ashmal.vayani/anaconda3/envs/arabicmmlu_fsdp2/lib/python3.10/site-packages/torch/_dynamo/eval_frame.py", line 328, in _fn
return fn(*args, **kwargs)
File "/home/ashmal.vayani/anaconda3/envs/arabicmmlu_fsdp2/lib/python3.10/site-packages/torch/_dynamo/external_utils.py", line 17, in inner
return fn(*args, **kwargs)
File "/home/ashmal.vayani/anaconda3/envs/arabicmmlu_fsdp2/lib/python3.10/site-packages/torch/_functorch/aot_autograd.py", line 3905, in forward
return compiled_fn(full_args)
File "/home/ashmal.vayani/anaconda3/envs/arabicmmlu_fsdp2/lib/python3.10/site-packages/torch/_functorch/aot_autograd.py", line 1482, in g
return f(*args)
File "/home/ashmal.vayani/anaconda3/envs/arabicmmlu_fsdp2/lib/python3.10/site-packages/torch/_functorch/aot_autograd.py", line 2533, in runtime_wrapper
all_outs = call_func_with_args(
File "/home/ashmal.vayani/anaconda3/envs/arabicmmlu_fsdp2/lib/python3.10/site-packages/torch/_functorch/aot_autograd.py", line 1506, in call_func_with_args
out = normalize_as_list(f(args))
File "/home/ashmal.vayani/anaconda3/envs/arabicmmlu_fsdp2/lib/python3.10/site-packages/torch/_functorch/aot_autograd.py", line 1594, in rng_functionalization_wrapper
return compiled_fw(args)
File "/home/ashmal.vayani/anaconda3/envs/arabicmmlu_fsdp2/lib/python3.10/site-packages/torch/_inductor/codecache.py", line 374, in __call__
return self.get_current_callable()(inputs)
File "/home/ashmal.vayani/anaconda3/envs/arabicmmlu_fsdp2/lib/python3.10/site-packages/torch/_inductor/codecache.py", line 401, in _run_from_cache
return compiled_graph.compiled_artifact(inputs)
File "/tmp/torchinductor_ashmal.vayani/r5/cr5ok37tr3sngvqz753xvfrrbhakfkffw35lawlukvvbfuy6p2tb.py", line 109, in call
triton_poi_fused_embedding_1.run(arg1_1, arg0_1, buf1, 200704, grid=grid(200704), stream=stream0)
File "/home/ashmal.vayani/anaconda3/envs/arabicmmlu_fsdp2/lib/python3.10/site-packages/torch/_inductor/triton_heuristics.py", line 401, in run
self.autotune_to_one_config(*args, grid=grid)
File "/home/ashmal.vayani/anaconda3/envs/arabicmmlu_fsdp2/lib/python3.10/site-packages/torch/_inductor/triton_heuristics.py", line 326, in autotune_to_one_config
timings = self.benchmark_all_configs(*args, **kwargs)
File "/home/ashmal.vayani/anaconda3/envs/arabicmmlu_fsdp2/lib/python3.10/site-packages/torch/_dynamo/utils.py", line 189, in time_wrapper
r = func(*args, **kwargs)
File "/home/ashmal.vayani/anaconda3/envs/arabicmmlu_fsdp2/lib/python3.10/site-packages/torch/_inductor/triton_heuristics.py", line 302, in benchmark_all_configs
timings = {
File "/home/ashmal.vayani/anaconda3/envs/arabicmmlu_fsdp2/lib/python3.10/site-packages/torch/_inductor/triton_heuristics.py", line 303, in <dictcomp>
launcher: self.bench(launcher, *args, **kwargs)
File "/home/ashmal.vayani/anaconda3/envs/arabicmmlu_fsdp2/lib/python3.10/site-packages/torch/_inductor/triton_heuristics.py", line 282, in bench
return do_bench(kernel_call, rep=40, fast_flush=True)
File "/home/ashmal.vayani/anaconda3/envs/arabicmmlu_fsdp2/lib/python3.10/site-packages/torch/_inductor/utils.py", line 75, in do_bench
return triton_do_bench(*args, **kwargs)[0]
File "/home/ashmal.vayani/anaconda3/envs/arabicmmlu_fsdp2/lib/python3.10/site-packages/triton/testing.py", line 104, in do_bench
fn()
File "/home/ashmal.vayani/anaconda3/envs/arabicmmlu_fsdp2/lib/python3.10/site-packages/torch/_inductor/triton_heuristics.py", line 276, in kernel_call
launcher(
File "<string>", line 13, in launcher
ValueError: Pointer argument (at 0) cannot be accessed from Triton (cpu tensor?)
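The final ValueError suggests that a tensor living on the CPU reached a Triton kernel. A minimal sketch of a check that could be run before the forward pass to find such tensors is shown here. `names_on_cpu` is a hypothetical helper, not part of PyTorch, Triton, or the GLM-4 code, and it only assumes that each object exposes a `.device` attribute, as torch.Tensor does.

```python
from typing import Any, Iterable, List, Tuple


def names_on_cpu(named_tensors: Iterable[Tuple[str, Any]]) -> List[str]:
    """Return the names of tensor-like objects whose .device is a CPU device.

    Hypothetical helper: works with anything that has a `.device` attribute
    (for example torch.Tensor). Objects without `.device` are skipped.
    """
    offenders = []
    for name, tensor in named_tensors:
        device = getattr(tensor, "device", None)
        # str(torch.device("cpu")) is "cpu"; CUDA devices print as "cuda:0" etc.
        if device is not None and str(device).startswith("cpu"):
            offenders.append(name)
    return offenders
```

With PyTorch, this could be called as `names_on_cpu(model.named_parameters())` for the weights and `names_on_cpu(batch.items())` for a dictionary of input tensors. An empty result for both would point away from a simple device mismatch.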
from glm-4.
Update this file: https://huggingface.co/THUDM/glm-4-9b-chat/blob/main/modeling_chatglm.py
This approach worked for me. What was the original problem?
from glm-4.
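Update this file: https://huggingface.co/THUDM/glm-4-9b-chat/blob/main/modeling_chatglm.py
This approach worked for me. What was the original problem?
What exactly did you do to solve this? I downloaded the model after this issue was raised, so I should already have the correct version.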
from glm-4.
I ran into the same problem. I modified class HFClient(Client) in the file GLM-4/composite_demo/src/clients/hf.py:
class HFClient(Client):
    def __init__(self, model_path: str):
        self.tokenizer = AutoTokenizer.from_pretrained(
            model_path, trust_remote_code=True,
        )
        self.model = AutoModelForCausalLM.from_pretrained(
            model_path,
            trust_remote_code=True,
            torch_dtype=torch.bfloat16,
            # device_map="cuda",
            # Changed so the model can run on multiple GPUs
            device_map="auto",
        ).eval()
In this section I changed device_map="cuda" to device_map="auto", after which the model could run on multiple GPUs. I also updated
https://huggingface.co/THUDM/glm-4-9b-chat/blob/main/modeling_chatglm.py
but that had no effect for me. How did you get the model running on multiple GPUs?
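With device_map="auto", the placement of each submodule is chosen automatically, so it can help to see where everything ended up. The sketch below groups an Accelerate-style device map (the kind of dictionary exposed as `model.hf_device_map` after loading with a `device_map`) by device. `group_by_device` is a hypothetical helper and the module names in the usage note are illustrative only. A "cpu" or "disk" group would mean some modules were offloaded. Whether that is related to the error in this thread is an assumption the thread does not confirm.

```python
from collections import defaultdict
from typing import Dict, List, Union


def group_by_device(device_map: Dict[str, Union[int, str]]) -> Dict[str, List[str]]:
    """Group module names by the device they were assigned to.

    Integer entries are treated as GPU indices and shown as "cuda:<index>";
    string entries such as "cpu" or "disk" are kept as they are.
    """
    groups = defaultdict(list)
    for module_name, device in device_map.items():
        key = f"cuda:{device}" if isinstance(device, int) else str(device)
        groups[key].append(module_name)
    return dict(groups)
```

For example, `group_by_device({"transformer.embedding": 0, "transformer.output_layer": "cpu"})` separates the module placed on GPU 0 from the one left on the CPU.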
from glm-4.
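https://huggingface.co/THUDM/glm-4-9b-chat/blob/main/modeling_chatglm.py
What did you update in this file? Normal inference works fine, but I am trying to evaluate this model on https://github.com/mbzuai-nlp/ArabicMMLU, and as soon as the model is loaded it throws an error on both multiple GPUs and a single GPU.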
from glm-4.
I did not modify the https://huggingface.co/THUDM/glm-4-9b-chat/blob/main/modeling_chatglm.py file.
I modified GLM-4/composite_demo/src/clients/hf.py so that the model can be loaded across multiple GPUs, but this error is raised at inference time.
from glm-4.
But changing it in the GitHub repo does not solve the general case, such as loading the model directly with Hugging Face.
from glm-4.