baichuan-inc / Baichuan2

A series of large language models developed by Baichuan Intelligent Technology

Home Page: https://huggingface.co/baichuan-inc

License: Apache License 2.0

Languages: Python 100.00%

Topics: artificial-intelligence, benchmark, ceval, chatgpt, chinese, gpt, gpt-4, huggingface, large-language-models, llama2

baichuan2's Issues

Fine-tuning Baichuan2 reports no attribute named "future_mask"

I am fine-tuning with the transformers Trainer class, and every time it reaches the eval step it errors out with the following:
AttributeError: Caught AttributeError in replica 1 on device 1.
Original Traceback (most recent call last):
File "/home/uos/miniconda3/envs/llm/lib/python3.10/site-packages/torch/nn/parallel/parallel_apply.py", line 64, in _worker
output = module(*input, **kwargs)
File "/home/uos/miniconda3/envs/llm/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/uos/miniconda3/envs/llm/lib/python3.10/site-packages/peft/peft_model.py", line 931, in forward
return self.base_model(
File "/home/uos/miniconda3/envs/llm/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/uos/miniconda3/envs/llm/lib/python3.10/site-packages/peft/tuners/tuners_utils.py", line 94, in forward
return self.model.forward(*args, **kwargs)
File "/home/uos/miniconda3/envs/llm/lib/python3.10/site-packages/accelerate/hooks.py", line 165, in new_forward
output = old_forward(*args, **kwargs)
File "/home/uos/.cache/huggingface/modules/transformers_modules/Baichuan2-13B-Chat/modeling_baichuan.py", line 692, in forward
outputs = self.model(
File "/home/uos/miniconda3/envs/llm/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/uos/miniconda3/envs/llm/lib/python3.10/site-packages/accelerate/hooks.py", line 165, in new_forward
output = old_forward(*args, **kwargs)
File "/home/uos/.cache/huggingface/modules/transformers_modules/Baichuan2-13B-Chat/modeling_baichuan.py", line 404, in forward
alibi_mask = self.get_alibi_mask(inputs_embeds, seq_length_with_past)
File "/home/uos/.cache/huggingface/modules/transformers_modules/Baichuan2-13B-Chat/modeling_baichuan.py", line 354, in get_alibi_mask
mask = self.future_mask[
File "/home/uos/miniconda3/envs/llm/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1614, in getattr
raise AttributeError("'{}' object has no attribute '{}'".format(
AttributeError: 'BaichuanModel' object has no attribute 'future_mask'

Afterwards I switched to LLaMA-Efficient-Tuning and fine-tuned Baichuan2 the same way I had fine-tuned Baichuan 1, with DeepSpeed enabled, and it again fails at the eval step with:
AttributeError: 'Parameter' object has no attribute 'ds_status'
What could be the cause?
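A hedged workaround sketch rather than an official fix: the traceback says "Caught AttributeError in replica 1 on device 1", i.e. the Trainer wrapped the model in torch.nn.DataParallel across two GPUs. modeling_baichuan.py creates the future_mask buffer (and the flag that guards it) lazily inside get_alibi_mask, and state mutated on one DataParallel replica does not propagate to the base module or the other replicas, which is consistent with a replica failing to find future_mask only at the eval step. Restricting the run to a single visible GPU, or launching with torchrun so DistributedDataParallel is used instead of DataParallel, is a common way to sidestep this; the snippet below is an assumption about your setup, not code from the repository.

import os

# Expose only one GPU *before* torch / transformers are imported, so the Trainer
# does not fall back to nn.DataParallel during evaluation.
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

# Alternatively, launch the script with torchrun to get DistributedDataParallel:
#   torchrun --nproc_per_node=2 finetune.py ...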

Error with the quantized version

Exception in thread Thread-2:
Traceback (most recent call last):
File "/opt/conda/lib/python3.8/threading.py", line 932, in _bootstrap_inner
self.run()
File "/opt/conda/lib/python3.8/threading.py", line 870, in run
self._target(*self._args, **self._kwargs)
File "/opt/conda/lib/python3.8/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/opt/conda/lib/python3.8/site-packages/transformers/generation/utils.py", line 1648, in generate
return self.sample(
File "/opt/conda/lib/python3.8/site-packages/transformers/generation/utils.py", line 2766, in sample
next_tokens = torch.multinomial(probs, num_samples=1).squeeze(1)
RuntimeError: probability tensor contains either inf, nan or element < 0

Pre-training

Hi, quick question: is there pre-training code for Baichuan2 on plain text? Will it be released? I would like to do some continued pre-training before IFT.

Hardware requirements for training

If I want to fine-tune this model myself, roughly what hardware do I need, and roughly how long would it take?
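A rough back-of-the-envelope sketch (my own rule of thumb, not an official figure): full fine-tuning with AdamW in mixed precision needs on the order of 16 bytes per parameter for weights, gradients and optimizer states, before counting activations, while parameter-efficient methods such as LoRA/QLoRA fit the 7B model on a single 24-48 GB card.

def full_finetune_gib(n_params_billion, bytes_per_param=16):
    # ~2 (bf16 weights) + 2 (bf16 grads) + 8 (fp32 Adam moments) + 4 (fp32 master copy)
    # is a commonly quoted ~16 bytes/parameter; activation memory comes on top.
    return n_params_billion * 1e9 * bytes_per_param / 2**30

print(f"7B  full fine-tune: ~{full_finetune_gib(7):.0f} GiB")   # ~104 GiB
print(f"13B full fine-tune: ~{full_finetune_gib(13):.0f} GiB")  # ~194 GiB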

OOM issue

Same inference code, but after swapping the llama2-7b-chat checkpoint for baichuan2-7b-chat I get out-of-memory errors. Is there some trick to this?
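A hedged guess at the usual culprit rather than a confirmed diagnosis: if from_pretrained is called without torch_dtype, the weights can be materialized in fp32, roughly doubling memory compared with an fp16/bf16 LLaMA-2 setup. A minimal loading sketch:

import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "baichuan-inc/Baichuan2-7B-Chat",
    torch_dtype=torch.bfloat16,   # avoid an implicit fp32 load
    device_map="auto",
    trust_remote_code=True,
)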

Error when using the Baichuan2-13B-Chat-4bits model

1. Python 3.10
2. Installed requirements.txt as instructed
3. git clone https://huggingface.co/baichuan-inc/Baichuan2-13B-Chat-4bits
4. Modified the model-loading code in cli_demo.py:
model = AutoModelForCausalLM.from_pretrained(
"D:\2-huggingface\Baichuan2-13B-Chat-4bits",
device_map="auto",
trust_remote_code=True
)
model.generation_config = GenerationConfig.from_pretrained(
"D:\2-huggingface\Baichuan2-13B-Chat-4bits"
)
tokenizer = AutoTokenizer.from_pretrained(
"D:\2-huggingface\Baichuan2-13B-Chat-4bits",
use_fast=False,
trust_remote_code=True
)
5. Running python cli_demo.py throws:
Exception has occurred: ImportError
Needs import model weight init func to run quantize.
File "C:\Users\WX.cache\huggingface\modules\transformers_modules\Baichuan2-13B-Chat-4bits\modeling_baichuan.py", line 606, in from_pretrained
from .quantizer import init_model_weight_int4
File "C:\Users\WX.cache\huggingface\modules\transformers_modules\Baichuan2-13B-Chat-4bits\quantizer.py", line 1, in
import bitsandbytes as bnb
ModuleNotFoundError: No module named 'scipy'

During handling of the above exception, another exception occurred:

File "C:\Users\WX.cache\huggingface\modules\transformers_modules\Baichuan2-13B-Chat-4bits\modeling_baichuan.py", line 611, in from_pretrained
raise ImportError(f"Needs import model weight init func to run quantize.")
File "D:\1-github\Baichuan2\cli_demo.py", line 13, in init_model
model = AutoModelForCausalLM.from_pretrained(
File "D:\1-github\Baichuan2\cli_demo.py", line 47, in main
model, tokenizer = init_model()
File "D:\1-github\Baichuan2\cli_demo.py", line 86, in
main()
ImportError: Needs import model weight init func to run quantize.
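For what it is worth, the ImportError at the bottom hides the real cause further up the trace: bitsandbytes fails to import because scipy is missing. A minimal check, assuming the standard package names (a sketch of the likely fix, not an official instruction):

# pip install scipy
import scipy                 # should import cleanly once installed
import bitsandbytes as bnb   # quantizer.py imports this when loading the 4-bit weights
print(scipy.__version__, bnb.__version__)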

GPU memory usage is higher than the first generation

Right now it seems hard to run Baichuan2 on a 32 GB V100; I keep hitting OOM errors.
I see the vocabulary roughly doubled. Is that what increased the memory footprint?
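A quick estimate, assuming the published sizes (vocabulary 125,696 in Baichuan 2 vs. 64,000 in Baichuan 1, hidden size 5,120 for the 13B models): the larger vocabulary adds a bit over 1 GiB in fp16 across the embedding table and NormHead, so it is only part of the story.

def vocab_tables_gib(vocab, hidden=5_120, bytes_per_elem=2):   # fp16/bf16
    # the embedding table and NormHead each hold a (vocab, hidden) matrix
    return 2 * vocab * hidden * bytes_per_elem / 2**30

print(f"Baichuan 1 13B: ~{vocab_tables_gib(64_000):.2f} GiB")    # ~1.22 GiB
print(f"Baichuan 2 13B: ~{vocab_tables_gib(125_696):.2f} GiB")   # ~2.40 GiB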

[Feature Support] XTuner now supports Baichuan 2 QLoRA fine-tuning, trainable on a single GPU

XTuner now supports single-GPU QLoRA fine-tuning of Baichuan 2; feel free to join the WeChat group to discuss.

git clone https://github.com/internLM/xtuner
cd xtuner
pip install -e .
 
xtuner train configs/baichuan/baichuan2_7b_base/baichuan2_7b_base_qlora_alpaca_e3.py
# equivalent to
# python xtuner/tools/train.py  configs/baichuan/baichuan2_7b_base/baichuan2_7b_base_qlora_alpaca_e3.py


Datasets such as alpaca, arxiv_gentitle, and codealpaca work out of the box and are downloaded automatically.

https://github.com/internLM/xtuner

C-Eval results

Could you explain how the C-Eval results are computed? I cannot match the numbers on the leaderboard.

Question about commercial licensing

We are a financial company. The commercial-license application asks for a copy of the legal representative's ID card plus the business license. Is there any alternative? Our legal representative is the chairman, which makes this difficult right now. I emailed earlier but got no reply, so I am asking here.

IndentationError: unindent does not match any outer indentation level

I downloaded baichuan2-13b-chat from Hugging Face onto a server. When loading the model, I found that a block of code is automatically appended to the end of ~/.cache/huggingface/modules/transformers_modules/Baichuan2-13B-chat/modeling_baichuan.py (around line 823), which triggers the error: IndentationError: unindent does not match any outer indentation level


Alignment framework and data

From the paper, the Baichuan2 chat models went through an RLHF pipeline and collected data similar to hh_rlhf. Are there plans to open-source the RLHF data and training framework? Or could part of the reward-model training data be released first?

Can not add new tokens.

Because lm_head in modeling_baichuan.py is not an instance of <class 'torch.nn.modules.linear.Linear'>, running model.resize_token_embeddings() raises an error, so new tokens cannot be added via the tokenizer.

TypeError: Old language model head is of type <class 'transformers_modules.13B.modeling_baichuan.NormHead'>, which is not an instance of <class 'torch.nn.modules.linear.Linear'>.
You should either use a different resize function or make sure that `old_lm_head` are an instance of <class 'torch.nn.modules.linear.Linear'>.
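A hedged workaround sketch (a hypothetical helper, not part of the repository): since the generic resize only knows how to grow nn.Linear heads, one can grow the input embedding and the NormHead weight by hand, keeping the trained rows and initializing the new ones.

import torch
import torch.nn as nn

def manual_resize(model, new_num_tokens):
    # Hypothetical helper: resize the input embedding and NormHead together.
    def grow(weight):
        old_vocab, hidden = weight.shape
        new_w = torch.zeros(new_num_tokens, hidden,
                            dtype=weight.dtype, device=weight.device)
        new_w[:old_vocab] = weight.data                 # keep the trained rows
        new_w[old_vocab:].normal_(mean=0.0, std=0.02)   # init rows for new tokens
        return nn.Parameter(new_w)

    embed = model.get_input_embeddings()                # nn.Embedding(vocab, hidden)
    embed.weight = grow(embed.weight)
    embed.num_embeddings = new_num_tokens
    model.lm_head.weight = grow(model.lm_head.weight)   # NormHead stores a plain Parameter
    model.config.vocab_size = new_num_tokens
    return model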

Error: AttributeError: 'str' object has no attribute 'to'

CUDA SETUP: Loading binary /root/anaconda3/envs/llama2/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cuda118.so...
Traceback (most recent call last):
File "/usr/local/Baichuan2/cli_demo.py", line 88, in
main()
File "/usr/local/Baichuan2/cli_demo.py", line 49, in main
model, tokenizer = init_model()
File "/usr/local/Baichuan2/cli_demo.py", line 14, in init_model
model = AutoModelForCausalLM.from_pretrained(
File "/root/anaconda3/envs/llama2/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 488, in from_pretrained
return model_class.from_pretrained(
File "/root/.cache/huggingface/modules/transformers_modules/Baichuan2-13B-Chat-4bits/modeling_baichuan.py", line 664, in from_pretrained
dispatch_model(model, device_map=device_map)
File "/root/anaconda3/envs/llama2/lib/python3.10/site-packages/accelerate/big_modeling.py", line 371, in dispatch_model
attach_align_device_hook_on_blocks(
File "/root/anaconda3/envs/llama2/lib/python3.10/site-packages/accelerate/hooks.py", line 507, in attach_align_device_hook_on_blocks
attach_execution_device_hook(module, execution_device[module_name])
File "/root/anaconda3/envs/llama2/lib/python3.10/site-packages/accelerate/hooks.py", line 347, in attach_execution_device_hook
attach_execution_device_hook(child, execution_device)
File "/root/anaconda3/envs/llama2/lib/python3.10/site-packages/accelerate/hooks.py", line 347, in attach_execution_device_hook
attach_execution_device_hook(child, execution_device)
File "/root/anaconda3/envs/llama2/lib/python3.10/site-packages/accelerate/hooks.py", line 347, in attach_execution_device_hook
attach_execution_device_hook(child, execution_device)
[Previous line repeated 2 more times]
File "/root/anaconda3/envs/llama2/lib/python3.10/site-packages/accelerate/hooks.py", line 340, in attach_execution_device_hook
add_hook_to_module(module, AlignDevicesHook(execution_device, skip_keys=skip_keys))
File "/root/anaconda3/envs/llama2/lib/python3.10/site-packages/accelerate/hooks.py", line 155, in add_hook_to_module
module = hook.init_hook(module)
File "/root/anaconda3/envs/llama2/lib/python3.10/site-packages/accelerate/hooks.py", line 253, in init_hook
set_module_tensor_to_device(module, name, self.execution_device)
File "/root/anaconda3/envs/llama2/lib/python3.10/site-packages/accelerate/utils/modeling.py", line 292, in set_module_tensor_to_device
new_value = old_value.to(device)
File "/root/anaconda3/envs/llama2/lib/python3.10/site-packages/bitsandbytes/nn/modules.py", line 191, in to
s[-2][0] = s[-2][0].to(device) # offset
AttributeError: 'str' object has no attribute 'to'

Thanks.

Account ban issue


You opened up a public demo page for the chat model. Besides its basic capabilities, I naturally also want to evaluate the model's safety. Yesterday I was testing its safety (possibly touching on politically sensitive questions), and you banned my account? What is that supposed to mean? I test the safety of other models too, including ERNIE Bot (Wenxin Yiyan), Zhipu Qingyan, 360 Zhinao, and iFLYTEK Spark, and none of them has ever treated a user like this!!!

Questions about NormHead

class NormHead(nn.Module):
    def __init__(self, hidden_size, vocab_size, bias=False):
        super().__init__()
        self.weight = nn.Parameter(torch.empty((vocab_size, hidden_size)))
        nn.init.kaiming_uniform_(self.weight, a=math.sqrt(5))
        self.first_flag = True

    def forward(self, hidden_states):
        if self.training:
            norm_weight = nn.functional.normalize(self.weight)
        elif self.first_flag:
            self.first_flag = False
            self.weight = nn.Parameter(nn.functional.normalize(self.weight))
            norm_weight = self.weight
        else:
            norm_weight = self.weight
        return nn.functional.linear(hidden_states, norm_weight)

1. During eval, the weight is reassigned. If I continue training after an eval, will that cause problems?
2. Is creating the new weight during eval just a speed optimization? Could both train and eval simply use norm_weight = nn.functional.normalize(self.weight) without any ill effects?
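On question 1: yes, once eval has run, self.weight is overwritten with its normalized copy, so resuming training continues from the normalized tensor; and because self.weight is replaced by a brand-new nn.Parameter, an optimizer created before the eval may keep pointing at the old object. On question 2: using nn.functional.normalize in both branches is functionally fine; the eval branch only exists to cache the result and skip re-normalizing on every forward pass. A hedged alternative sketch (my own variant, not code from the repository) keeps that speedup without mutating the parameter:

import math
import torch
import torch.nn as nn

class CachedNormHead(nn.Module):
    def __init__(self, hidden_size, vocab_size, bias=False):
        super().__init__()
        self.weight = nn.Parameter(torch.empty(vocab_size, hidden_size))
        nn.init.kaiming_uniform_(self.weight, a=math.sqrt(5))
        # eval-time cache of the normalized weight; self.weight is never overwritten
        self.register_buffer("norm_cache", None, persistent=False)

    def forward(self, hidden_states):
        if self.training:
            self.norm_cache = None                       # invalidate the cache
            norm_weight = nn.functional.normalize(self.weight)
        else:
            if self.norm_cache is None:
                self.norm_cache = nn.functional.normalize(self.weight).detach()
            norm_weight = self.norm_cache
        return nn.functional.linear(hidden_states, norm_weight)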

Sudden jump in Baichuan2-13B benchmark performance

In the technical report, looking at Baichuan2-13B evaluation results at different steps, benchmark performance jumps abruptly a second time at around 1,000B pre-training tokens (distinct from the initial phase, where scores oscillate around 25% before rising). Is this related to the model parameters?

Is the baichuan config in FastChat still usable?

It is currently this (https://github.com/lm-sys/FastChat/blob/56744d1d947ad7cc94763e911529756b17139505/fastchat/conversation.py#L782):

register_conv_template(
    Conversation(
        name="baichuan-chat",
        roles=("<reserved_102>", "<reserved_103>"),
        sep_style=SeparatorStyle.NO_COLON_SINGLE,
        sep="",
        stop_token_ids=[],
    )
)

But from what I see in Baichuan2, the roles should probably be changed to the following?

        roles=("<reserved_106>", "<reserved_107>")
>>> model.generation_config.user_token_id
195
>>> model.generation_config.assistant_token_id
196
>>> tokenizer.decode([195])
'<reserved_106>'
>>> tokenizer.decode([196])
'<reserved_107>'
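For reference, a minimal sketch of how a Baichuan2-Chat prompt can be assembled from those reserved tokens (a simplified illustration of what the model's chat helper appears to do, not the exact implementation):

def build_chat_input_ids(tokenizer, messages,
                         user_token_id=195, assistant_token_id=196):
    # messages: [{"role": "user" | "assistant", "content": str}, ...]
    input_ids = []
    for msg in messages:
        role_id = user_token_id if msg["role"] == "user" else assistant_token_id
        input_ids += [role_id] + tokenizer.encode(msg["content"])
    # generation continues after a trailing assistant role token
    return input_ids + [assistant_token_id]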

Abnormal output from the 13b-chat model

First time trying the baichuan2-13b-chat model, running it with web_demo.py.
Right at the start I just said "hello", and the reply was already very poor (see the attached screenshot). What could be going on?

Loading the 8-bit / offline-quantized model raises: RuntimeError: probability tensor contains either `inf`, `nan` or element < 0

Background: the unquantized model works fine, and saving the 8-bit quantized model finishes without errors. Loading the 8-bit / offline-quantized model as described in the official docs then raises: RuntimeError: probability tensor contains either inf, nan or element < 0

Code:
model = AutoModelForCausalLM.from_pretrained(r".\Baichuan2-13B-Chat",
load_in_8bit=True, device_map="auto", trust_remote_code=True)
model.save_pretrained(r'.\8bit')
model = AutoModelForCausalLM.from_pretrained(r'.\8bit', device_map="auto", trust_remote_code=True)
model.generation_config = GenerationConfig.from_pretrained(r".\Baichuan2-13B-Chat")
tokenizer = AutoTokenizer.from_pretrained(r".\Baichuan2-13B-Chat",
use_fast=False, trust_remote_code=True)

messages = []
messages.append({"role": "user", "content": "解释一下“温故而知新”"})
response = model.chat(tokenizer, messages)
print(response)

Error:
in GenerationMixin.sample(self, input_ids, logits_processor, stopping_criteria, logits_warper, max_length, pad_token_id, eos_token_id, output_attentions, output_hidden_states, output_scores, return_dict_in_generate, synced_gpus, streamer, **model_kwargs)
2676 # sample
2677 probs = nn.functional.softmax(next_token_scores, dim=-1)
-> 2678 next_tokens = torch.multinomial(probs, num_samples=1).squeeze(1)
2680 # finished sentences should have their next token be a padding token
2681 if eos_token_id is not None:

RuntimeError: probability tensor contains either inf, nan or element < 0
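A hedged debugging sketch (an assumption about where to look, not a confirmed fix): the crash in torch.multinomial means some logits came back NaN/Inf from the quantized weights. Disabling sampling bypasses the multinomial call, and inspecting the raw logits of one forward pass (reusing model and tokenizer from the snippet above) shows whether the reloaded 8-bit weights themselves produce invalid values.

import torch

# 1) greedy decoding avoids torch.multinomial entirely
model.generation_config.do_sample = False
print(model.chat(tokenizer, messages))

# 2) check a single forward pass for NaN/Inf logits
inputs = tokenizer("解释一下“温故而知新”", return_tensors="pt").to(model.device)
with torch.no_grad():
    logits = model(**inputs).logits
print("NaN:", torch.isnan(logits).any().item(), "Inf:", torch.isinf(logits).any().item())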

Requesting an API

Thank you very much for your contribution!

Using transformers 8-bit on-the-fly quantization raises a `'BitsAndBytesConfig' object is not subscriptable` error

Reproduction

>>> from transformers import AutoModelForCausalLM
>>> model = AutoModelForCausalLM.from_pretrained("pretrained/baichuan2-7b/base", load_in_8bit=True, device_map="auto", trust_remote_code=True)
[2023-09-06 16:03:27,691] [INFO] [real_accelerator.py:110:get_accelerator] Setting ds_accelerator to cuda (auto detect)
2023-09-06 16:03:28.999241: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-09-06 16:03:30.314803: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/liutianwei/.conda/envs/starwhale/lib/python3.9/site-packages/transformers/models/auto/auto_factory.py", line 488, in from_pretrained
    return model_class.from_pretrained(
  File "/home/liutianwei/.cache/huggingface/modules/transformers_modules/base/modeling_baichuan.py", line 779, in from_pretrained
    return super(BaichuanForCausalLM, cls).from_pretrained(
  File "/home/liutianwei/.conda/envs/starwhale/lib/python3.9/site-packages/transformers/modeling_utils.py", line 2700, in from_pretrained
    model = cls(config, *model_args, **model_kwargs)
  File "/home/liutianwei/.cache/huggingface/modules/transformers_modules/base/modeling_baichuan.py", line 638, in __init__
    and config.quantization_config["load_in_4bit"]
TypeError: 'BitsAndBytesConfig' object is not subscriptable
>>> 

Baichuan2 7b-base, 7b-chat, 13b-base, and 13b-chat all hit this error.

Suggested fix

In https://huggingface.co/baichuan-inc/Baichuan2-13B-Chat/blob/main/modeling_baichuan.py#L537 the type of config.quantization_config needs to be checked.
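A sketch of what such a type check might look like (my phrasing of the suggested fix, not the maintainers'): when load_in_8bit/load_in_4bit is passed to from_pretrained, transformers stores a BitsAndBytesConfig object rather than a plain dict in config.quantization_config, so indexing it with ["load_in_4bit"] raises the TypeError above; handling both forms avoids it.

# sketch of the check inside BaichuanForCausalLM.__init__
quant_cfg = getattr(config, "quantization_config", None)
if isinstance(quant_cfg, dict):
    load_in_4bit = quant_cfg.get("load_in_4bit", False)
else:  # BitsAndBytesConfig, or None when no quantization is requested
    load_in_4bit = getattr(quant_cfg, "load_in_4bit", False)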

Question about the SFT code

I would like to know why labels gets eos_token_id here rather than ignore_index:
if from_ == "human":
    input_ids += self.user_tokens + value_ids
    labels += [self.tokenizer.eos_token_id] + [self.ignore_index] * len(
        value_ids
    )
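One reading of this (my interpretation, not an official answer): labels are aligned position-for-position with input_ids and shifted by one inside the model's causal-LM loss, so the label stored at a user-role-token position is the target for the prediction made right after the previous assistant reply. Putting eos_token_id there teaches the model to emit EOS once its reply is complete, whereas ignore_index would leave that transition unsupervised. A tiny worked illustration with made-up token ids:

# hypothetical ids: user role = 195, assistant role = 196, eos = 2, ignore_index = -100
#                 human "hi"      assistant "hello"     next human turn
input_ids = [195, 11]       + [196, 42, 43]        + [195, 12]
labels    = [2,   -100]     + [-100, 42, 43]       + [2,   -100]
# With the standard one-token shift, the label 2 at the user-token position is
# compared against the prediction made after token 43, i.e. the model learns to
# produce EOS when the assistant turn has finished.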


bugreport: need to add "tokenizer_class": "BaichuanTokenizer" to model config

In the first place, thanks for creating and open sourcing the model!

Symptom

While @bddppq and I were looking into running the model with a standard huggingface pipeline, we found a minor bug that prevents it from loading the model. Specifically, if you directly create a pipeline, for example using code like

pipeline = pipeline(task=task, model=model, revision=revision, **kwargs)

it will produce a pipeline, but when you run it, it tells you that it encountered a None object - if you look deeper, the tokenizer is missing.

Note that the tutorial code (using AutoModel and AutoTokenizer) is correct, but ideally, one would want to just use a single-line pipeline to load the model.

Diagnosis

Basically, when loading a pipeline and the underlying model, huggingface determines the tokenizer either from the TOKENIZER_MAPPING registered in the transformers code base or from the tokenizer_class field in the model's config.json.

Right now, Baichuan does not have a TOKENIZER_MAPPING manually committed to the transformers code repo, and the model config.json does not have the tokenizer_class defined. As a result, when loading the model via the pipeline interface, the tokenizer is simply not loaded.

What to do

Of course, one way is to send a pull request to huggingface/transformers and update the TOKENIZER_MAPPING. This comes with two shortcomings, though:

  • existing transformers users have to wait for a new transformers release, upgrade to it, and only then can they use the model.
  • every new model release from Baichuan that changes the model type has to repeat the above process.

Thus, it is actually easier to simply add one line to the model config.json (https://huggingface.co/baichuan-inc/Baichuan2-7B-Chat/blob/main/config.json) file as follows:

"tokenizer_class": "BaichuanTokenizer",

and everything will work.
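For concreteness, a sketch of the one-liner this enables (the task name and prompt here are illustrative assumptions, not taken from the original report):

from transformers import pipeline

# With "tokenizer_class": "BaichuanTokenizer" in config.json, the pipeline can
# resolve the tokenizer on its own instead of handing back None.
pipe = pipeline(
    "text-generation",
    model="baichuan-inc/Baichuan2-7B-Chat",
    trust_remote_code=True,
)
print(pipe("你好", max_new_tokens=32)[0]["generated_text"])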

Why do we care about this: as part of our work we are automating the process of launching LLM models, and we would love to see Baichuan usable in a much smoother way via the hf pipeline, without having to write the Python code explicitly. Thanks so much for looking into it!
