lianjiatech / belle
BELLE: Be Everyone's Large Language model Engine (an open-source Chinese conversational large language model)
License: Apache License 2.0
WARNING:root:Reducing target length to 0, Retrying...
WARNING:root:OpenAIError: This model's maximum context length is 2049 tokens, however you requested 3643 tokens (3643 in your prompt; 0 for the completion). Please reduce your prompt; or completion length..
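The retry loop above can never succeed: the assembled prompt alone already exceeds the model's 2049-token window, so reducing the completion length to 0 does not help. A minimal pre-check sketch, assuming a rough ~2 characters per token for Chinese text (the ratio is a guess; a real tokenizer would give exact counts), drops seed examples until the estimate fits:

```python
def trim_prompt(seed_examples, budget_tokens=1800, chars_per_token=2):
    """Keep seed examples until the rough token estimate exceeds the budget."""
    kept = []
    used = 0
    for ex in seed_examples:
        # Crude estimate: length in characters divided by chars-per-token.
        cost = len(ex) // chars_per_token + 1
        if used + cost > budget_tokens:
            break  # adding this example would blow the context window
        kept.append(ex)
        used += cost
    return kept
```

The budget is deliberately below 2049 to leave room for the completion.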
In #26 it is said that the finetuning script comes from stanford_alpaca. I want to ask a simple question: what is the correct fsdp_transformer_layer_cls_to_wrap for BLOOM?
When I tried to fine-tune bloomz-7b1, training got stuck at 0%, most likely because I didn't set the right fsdp_transformer_layer_cls_to_wrap, but I can't find it in the BLOOM config.
I would kindly appreciate some help on this.
Thank you
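A minimal sketch of the likely answer: in the transformers library, BLOOM's decoder layer class is BloomBlock, which is the usual value for fsdp_transformer_layer_cls_to_wrap. The script name train.py and the GPU count below are placeholders, not taken from this repo:

```python
# Assumed FSDP flags for BLOOM fine-tuning; BloomBlock is the transformer
# layer class name in the transformers BLOOM implementation.
fsdp_flags = [
    "--fsdp", "full_shard auto_wrap",
    "--fsdp_transformer_layer_cls_to_wrap", "BloomBlock",
]
# train.py is a hypothetical entry point standing in for the finetune script.
cmd = ["torchrun", "--nproc_per_node=8", "train.py", *fsdp_flags]
print(" ".join(cmd))
```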
Hello developers, we are very interested in this work and would like to reproduce it, but due to compute and GPU memory constraints we probably cannot train bloom-7b. Have you tried fine-tuning a smaller BLOOM model? How were the results?
We are not sure whether a smaller model would lack the capacity to fit the data and cause the reproduction to fail.
I got https://github.com/cocktailpeanut/dalai (based on the Stanford work) running on Windows via npx dalai serve. Author, how is this project meant to be run? I have already executed:
pip install -r requirements.txt
export OPENAI_API_KEY=YOUR_API_KEY
python generate_instruction.py generate_instruction_following_data
I don't know what the next step is.
Where are the 175 Chinese seed tasks? Could you let us see that data?
Hi @mabaochang
Could you share any data and code related to the RM and PPO stages?
Quantization is probably needed; lowering the barrier to entry would let more people try it out.
File "/mnt1/wcp/BEELE/BELLE-main/utils.py", line 41, in
prompts: Union[str, Sequence[str], Sequence[dict[str, str]], dict[str, str]],
TypeError: 'type' object is not subscriptable
WARNING:root:OpenAIError: Invalid URL (POST /v1/chat/completions). | 0/1 [00:00<?, ?it/s]
WARNING:root:Hit request rate limit; retrying...
Hi, thanks for your open-source work! We ran a safety evaluation on BELLE 7B-2M; the results are at http://115.182.62.166:18000/public
WARNING:urllib3.connectionpool:Retrying (Retry(total=1, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ProxyError('Cannot connect to proxy.', NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f0f65b2a250>: Failed to establish a new connection: [Errno 111] Connection refused'))': /v1/completions
WARNING:urllib3.connectionpool:Retrying (Retry(total=0, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ProxyError('Cannot connect to proxy.', NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f0f65b2a400>: Failed to establish a new connection: [Errno 111] Connection refused'))': /v1/completions
WARNING:root:OpenAIError: Error communicating with OpenAI: HTTPSConnectionPool(host='api.openai.com', port=443): Max retries exceeded with url: /v1/completions (Caused by ProxyError('Cannot connect to proxy.', NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f0f65b2a040>: Failed to establish a new connection: [Errno 111] Connection refused'))).
WARNING:root:Hit request rate limit; retrying...
There seems to be no download entry on https://huggingface.co/BelleGroup for the model files corresponding to bigscience/bloomz-7b1?
If the model has already been downloaded locally, is an openai_api_key still required?
For continued training of BLOOM, should we use BLOOM's official training code, or modify the stanford_alpaca training code ourselves?
The quantized Alpaca 7B from https://github.com/cocktailpeanut/dalai runs on my macOS machine with an M1 chip.
(chatgpt) [root@iZ2zecged3txs683zzjfnpZ BELLE]# python3 generate_instruction.py generate_instruction_following_data --api=chat --model_name=gpt-3.5-turbo
Traceback (most recent call last):
File "generate_instruction.py", line 24, in
import utils
File "/mnt/amj/chatgpt/BELLE/utils.py", line 40, in
prompts: Union[str, Sequence[str], Sequence[dict[str, str]], dict[str, str]],
TypeError: 'type' object is not subscriptable
Could you please take a look at how to resolve this?
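This TypeError comes from using built-in generics such as dict[str, str] in annotations, which only work at runtime on Python >= 3.9. A backward-compatible sketch of the annotation from utils.py uses typing.Dict and typing.Sequence instead (the function name and body below are placeholders, not the repo's actual code):

```python
from typing import Dict, Sequence, Union

# Same annotation as utils.py line 40, rewritten with typing generics so it
# also runs on Python 3.7/3.8.
Prompts = Union[str, Sequence[str], Sequence[Dict[str, str]], Dict[str, str]]

def openai_completion(prompts: Prompts) -> str:
    # Placeholder body for illustration only.
    return prompts if isinstance(prompts, str) else str(prompts)
```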
The format of the regen.json file produced by generate_instruction.py is completely different from Belle.train.json: regen.json has more fields, including instruction, input, and output, while Belle.train.json has only input and target. The fine-tuning data format used by Stanford Alpaca resembles regen.json. Does Belle.train.json need to be reformatted before it can be used for fine-tuning?
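A hedged conversion sketch from Alpaca-style records (instruction/input/output) to the Belle.train.json fields (input/target). Joining instruction and input with a newline is my assumption; check it against the released data before training:

```python
def alpaca_to_belle(rec):
    """Map one Alpaca-style record to the Belle input/target layout."""
    prompt = rec["instruction"]
    if rec.get("input"):
        # Assumed convention: append the optional input on a new line.
        prompt += "\n" + rec["input"]
    return {"input": prompt, "target": rec["output"]}
```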
Does anyone know how to fine-tune this model?
After setting load_in_8bit=True, inference subjectively feels slower.
OSError: Unable to load weights from pytorch checkpoint file for './bigscience/bloomz-7b1/pytorch_model.bin' at './bigscience/bloomz-7b1/pytorch_model.bin'. If you tried to load a PyTorch model from a TF 2.0 checkpoint, please set from_tf=True.
If from_tf is set to True, the following error occurs instead:
Loading model: ./bigscience/bloomz-7b1
Traceback (most recent call last):
File "/home/ubuntu/bloomz.cpp/convert-hf-to-ggml.py", line 84, in
model = AutoModelForCausalLM.from_pretrained(model_name, config=config, torch_dtype=torch.float16 if ftype == 1 else torch.float32, low_cpu_mem_usage=True, from_tf=True)
File "/usr/local/lib/python3.10/dist-packages/transformers/models/auto/auto_factory.py", line 471, in from_pretrained
return model_class.from_pretrained(
File "/usr/local/lib/python3.10/dist-packages/transformers/modeling_utils.py", line 2613, in from_pretrained
model, loading_info = load_tf2_checkpoint_in_pytorch_model(
File "/usr/local/lib/python3.10/dist-packages/transformers/modeling_tf_pytorch_utils.py", line 407, in load_tf2_checkpoint_in_pytorch_model
tf_model_class = getattr(transformers, tf_model_class_name)
File "/usr/local/lib/python3.10/dist-packages/transformers/utils/import_utils.py", line 1119, in __getattr__
raise AttributeError(f"module {self.__name__} has no attribute {name}")
AttributeError: module transformers has no attribute TFBloomForCausalLM
File "/mnt1/wcp/BEELE/BELLE-main/generate_instruction.py", line 28, in
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
File "/home/appuser/miniconda3/envs/wcppy39/lib/python3.9/site-packages/transformers/models/auto/tokenization_auto.py", line 679, in from_pretrained
return tokenizer_class.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
File "/home/appuser/miniconda3/envs/wcppy39/lib/python3.9/site-packages/transformers/tokenization_utils_base.py", line 1804, in from_pretrained
return cls._from_pretrained(
File "/home/appuser/miniconda3/envs/wcppy39/lib/python3.9/site-packages/transformers/tokenization_utils_base.py", line 1958, in _from_pretrained
tokenizer = cls(*init_inputs, **init_kwargs)
File "/home/appuser/miniconda3/envs/wcppy39/lib/python3.9/site-packages/transformers/models/bloom/tokenization_bloom_fast.py", line 118, in __init__
super().__init__(
File "/home/appuser/miniconda3/envs/wcppy39/lib/python3.9/site-packages/transformers/tokenization_utils_fast.py", line 111, in __init__
fast_tokenizer = TokenizerFast.from_file(fast_tokenizer_file)
Exception: expected value at line 1 column 1
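The "expected value at line 1 column 1" failure above means tokenizer.json was not valid JSON. A common cause is that Git downloaded a Git LFS pointer stub instead of the real file. A small check sketch (the helper name is mine):

```python
def looks_like_lfs_pointer(path):
    """Return True if the file starts with the fixed Git LFS pointer header."""
    with open(path, "rb") as f:
        # Every LFS pointer file begins with "version https://git-lfs...".
        return f.read(24).startswith(b"version https://git-lfs")
```

If this returns True for tokenizer.json, re-fetch the file with git lfs pull or download it directly from the model page.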
After adding multi-turn dialogue, the belle-7b-2m model generates self-question-and-answer content.
Input: Which direction of AI do you belong to? --------------- my input
response: Yes, I belong to AI in the natural language processing field ----------------- generated
Human: Wow, that field is impressive ------------------- generated
Assistant: Yes, it can help people better understand and use language ----------------- generated
A beginner question: ChatGLM and BELLE both have 6-7B parameters, so why are ChatGLM's weights under 14 GB while BELLE's take as much as 28 GB?
For example, the platform, GPU model and count, and other hardware environment parameters?
Hi, many thanks to the authors for the dataset and models.
Here are full-parameter and LoRA fine-tuning scripts for reference: https://github.com/feizc/MLE-LLaMA
Why does an OpenAI key need to be filled in? Doesn't that mean it is not a fully local deployment?
No GPU T_T
What is the model's maximum input length during training?
Hello, pytorch_model.bin for bigscience/bloomz-7b1-mt is 14.1 GB; why is pytorch_model.bin for BelleGroup/BELLE-7B-2M 28.3 GB?
Are there plans to open-source the complete dataset?
The original bigscience/bloomz-7b1-mt model was released in half precision (torch.HalfStorage), so its weight file is only 14.1 GB. I noticed that the current BELLE weights are released in torch.FloatStorage, so the file is twice the size of the foundation model's.
Is it possible to publish a half-precision variant of BELLE? It would make it easier for everyone to try it out.
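A back-of-the-envelope check explains the sizes above: fp16 stores 2 bytes per weight and fp32 stores 4, so a ~7.1B-parameter model (the count is an approximation for bloomz-7b1) lands near the reported 14.1 GB and 28.3 GB files:

```python
params = 7.1e9  # approximate parameter count for bloomz-7b1

fp16_gb = params * 2 / 2**30  # half precision: 2 bytes per parameter
fp32_gb = params * 4 / 2**30  # single precision: 4 bytes per parameter

print(round(fp16_gb, 1), round(fp32_gb, 1))  # ~13.2 GB vs ~26.4 GB
```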
On my machine, 60 GB of RAM overflows during AutoModelForCausalLM.from_pretrained.
There are two points I don't understand; could someone please explain:
1. Why are the seed tasks in zh_seed_tasks.json needed? What role do the seed tasks play?
2. When generating data with
pip install -r requirements.txt
export OPENAI_API_KEY=YOUR_API_KEY
python generate_instruction.py generate_instruction_following_data
what is the final argument generate_instruction_following_data? Does it name the file where the generated data is stored?
Many thanks.
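In stanford_alpaca, which this script follows, generate_instruction.py dispatches through python-fire, so the positional argument names the function to call, generate_instruction_following_data, not an output file. A minimal sketch of that dispatch pattern, with no fire dependency and placeholder bodies:

```python
def generate_instruction_following_data(output_dir="./"):
    # Placeholder standing in for the real generation function.
    return f"writing generated instructions to {output_dir}"

def dispatch(argv):
    # fire.Fire() does roughly this: resolve the first CLI token to a
    # function in the module and call it with the remaining arguments.
    return globals()[argv[0]]()

print(dispatch(["generate_instruction_following_data"]))
```

The output file is controlled by flags such as --output_dir, not by the positional argument.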
Are 8x A100 80G GPUs enough?
Hello, after downloading the dataset I see the JSON fields are described as "input" and "target", while Stanford's are "instruction"/"input"/"output". I'd like to ask: at training time, is "input" split into "instruction" and "input" on \n, or do you simply treat "input" as the "instruction"?
Will the model weights be released later? There are no weight files visible on Hugging Face at the moment.
KeyError Traceback (most recent call last)
Cell In[4], line 73
71 instruction_data = []
72 for result in results:
---> 73 new_instructions = post_process_gpt3_response(num_prompt_instructions, result)
74 instruction_data += new_instructions
76 total = len(instruction_data)
Running generate_instruction.py directly fails with a missing key; what could be the cause?
Command:
python -m generate_instruction generate_instruction_following_data
--output_dir ./
--num_instructions_to_generate 10
--model_name="text-davinci-003" \
Python version: 3.9
Cell In[1], line 52, in post_process_gpt3_response(num_prompt_instructions, response)
50 if response is None:
51 return []
---> 52 raw_instructions = response["message"]["content"]
53 if '指令:' not in raw_instructions[0: 10] and '指令:' not in raw_instructions[0: 10]:
54 raw_instructions = f"{num_prompt_instructions+1}. 指令:" + raw_instructions
KeyError: 'message'
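The KeyError above arises because completion models such as text-davinci-003 return the generated text under choice["text"], while chat models (e.g. gpt-3.5-turbo with --api=chat) return it under choice["message"]["content"]. A small helper sketch (the name is mine) that handles both shapes avoids the crash:

```python
def extract_text(choice):
    """Return generated text from either a chat or a completion API choice."""
    if "message" in choice:
        # Chat API shape: {"message": {"content": ...}}
        return choice["message"]["content"]
    # Completion API shape: {"text": ...}
    return choice.get("text", "")
```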
After following
pip install -r requirements.txt
export OPENAI_API_KEY=xxxx
python generate_instruction.py generate_instruction_following_data
execution fails with the following error:
Traceback (most recent call last):
File "generate_instruction.py", line 22, in
import utils
File "/Users/caizhongxiang/Research/llm/BELLE/utils.py", line 48, in
return_text=False,
TypeError: 'type' object is not subscriptable
The operating system is macOS Catalina 10.15.7.
The installed Python version is 3.7. (Built-in generic annotations like dict[str, str] require Python 3.9+ at runtime.)
Everything in requirements.txt installed successfully, and PyCharm itself reports no version problems.
Help, please~
The llama-7b pretrained model has little Chinese support; are you doing SFT directly with the 2M Chinese data here?