
ChatGLM2 LoRA fine-tuning, loading LoRA parameters: RuntimeError: Expected 4-dimensional input for 4-dimensional weight [3072, 32, 1, 1], but got 3-dimensional input of size [1, 64, 4096] instead (zero_nlp, OPEN)

yilong2001 commented on June 1, 2024
ChatGLM2 LoRA fine-tuning: loading the LoRA parameters fails with RuntimeError: Expected 4-dimensional input for 4-dimensional weight [3072, 32, 1, 1], but got 3-dimensional input of size [1, 64, 4096] instead
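
The error names a 4-dimensional weight of shape [3072, 32, 1, 1], so one possible first debugging step is to find which parameter has that shape. The following is only a minimal sketch: the helper `find_params_with_ndim` and the parameter names in `example` are hypothetical and are not taken from zero_nlp or peft.

```python
from typing import Iterable, List, Sequence, Tuple


def find_params_with_ndim(
    named_shapes: Iterable[Tuple[str, Sequence[int]]], ndim: int = 4
) -> List[Tuple[str, Tuple[int, ...]]]:
    """Return (name, shape) pairs whose shape has exactly `ndim` dimensions."""
    return [(name, tuple(shape)) for name, shape in named_shapes if len(shape) == ndim]


# Stand-in data with hypothetical parameter names; only the 4-D shape
# is copied from the error message.
example = [
    ("block.0.linear.weight", (4096, 4096)),
    ("block.0.linear.lora_B.weight", (3072, 32, 1, 1)),
]

suspects = find_params_with_ndim(example)
for name, shape in suspects:
    print(name, shape)
```

On a loaded model, the same helper could presumably be fed with `((n, p.shape) for n, p in model.named_parameters())`, since a `torch.Size` behaves like a sequence of ints.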

from zero_nlp.

Comments (4)

yuanzhoulvpi2017 commented on June 1, 2024

Your loading method is incorrect. Take a look at this file of mine: https://github.com/yuanzhoulvpi2017/zero_nlp/blob/main/chatglm_v2_6b_lora/infer_lora.ipynb


yilong2001 commented on June 1, 2024

Loading it this way gives the same problem:

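import torch
from peft import PeftModel
from transformers import AutoModel

# model_name_or_path and peft_model_id are defined elsewhere (not shown in the post)
model = AutoModel.from_pretrained(model_name_or_path, trust_remote_code=True, device_map='auto', torch_dtype=torch.bfloat16)

# Wrap the base model with the trained LoRA adapter, then switch to inference mode
model = PeftModel.from_pretrained(model, peft_model_id)
model = model.eval()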


yilong2001 commented on June 1, 2024

If I load it like this instead (calling eval first):

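import torch
from peft import PeftModel
from transformers import AutoModel, AutoTokenizer

# model_name_or_path and peft_model_id are defined elsewhere (not shown in the post)
tokenizer = AutoTokenizer.from_pretrained(model_name_or_path, trust_remote_code=True)
model = AutoModel.from_pretrained(model_name_or_path, trust_remote_code=True, device_map='auto', torch_dtype=torch.bfloat16)
model = model.eval()

model = PeftModel.from_pretrained(model, peft_model_id)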

then the last step (PeftModel.from_pretrained) fails with the following error:

ValueError: We need an `offload_dir` to dispatch this model according to this `device_map`, the following submodules need to be offloaded: base_model.model.transformer.encoder.layers.1,
base_model.model.transformer.encoder.layers.2, base_model.model.transformer.encoder.layers.3, base_model.model.transformer.encoder.layers.4

Error location:
/home/beeservice/.conda/envs/pt/lib/python3.10/site-packages/peft/peft_model.py:177 in from_pretrained

  174                 device_map = infer_auto_device_map(
  175                     model, max_memory=max_memory, no_split_module_classes=no_split_modul
  176                 )
❱ 177             model = dispatch_model(model, device_map=device_map)
  178             hook = AlignDevicesHook(io_same_device=True)
  179             if model.peft_config.peft_type == PeftType.LORA:
  180                 add_hook_to_module(model.base_model.model, hook)
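
This ValueError is raised when the device map assigns some submodules to "disk" but no offload directory was given. As a rough sketch of that check (not the library's own code), the helper below lists the entries of a device map that would need offloading. The function name `modules_needing_offload` and the placement values in `example_map` are made up for illustration; only the module names come from the error message.

```python
from typing import Dict, List, Union


def modules_needing_offload(device_map: Dict[str, Union[int, str]]) -> List[str]:
    """Return the names of submodules that the device map places on disk."""
    return [name for name, device in device_map.items() if device == "disk"]


# Hypothetical device map: which layer lands where is an assumption,
# chosen to resemble the situation described by the error above.
example_map = {
    "base_model.model.transformer.embedding": 0,
    "base_model.model.transformer.encoder.layers.0": 0,
    "base_model.model.transformer.encoder.layers.1": "disk",
    "base_model.model.transformer.encoder.layers.2": "disk",
}

to_offload = modules_needing_offload(example_map)
print(to_offload)
```

A model loaded through transformers with a `device_map` exposes the resulting map as `model.hf_device_map`, which could be passed to this helper. A non-empty result suggests that the available GPU and CPU memory did not fit the whole model, so either an offload directory or a different device placement would be needed.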


yuanzhoulvpi2017 commented on June 1, 2024

I don't know whether you trained with my code. It could also be a version issue with the transformers and peft packages. I suggest updating them and trying again.
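
Before updating, it may help to record which versions are currently installed so they can be reported in the thread. This is a small sketch using only the standard library. The helper name `installed_versions` is hypothetical, and including accelerate and torch in the list is an assumption (the reply above only mentions transformers and peft).

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Dict, Iterable, Optional


def installed_versions(packages: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each package name to its installed version, or None if it is missing."""
    result: Dict[str, Optional[str]] = {}
    for name in packages:
        try:
            result[name] = version(name)
        except PackageNotFoundError:
            result[name] = None
    return result


versions = installed_versions(["transformers", "peft", "accelerate", "torch"])
for name, ver in versions.items():
    print(f"{name}: {ver or 'not installed'}")
```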

