01-ai / yi-1.5

Yi-1.5 is an upgraded version of Yi, delivering stronger performance in coding, math, reasoning, and instruction-following capability.

License: Apache License 2.0

yi-1.5's Introduction

🤗 HuggingFace • 🤖 ModelScope • 🟣 wisemodel
👾 Discord • 🐤 Twitter • 💬 WeChat
📝 Paper • 💪 Tech Blog • 🙌 FAQ • 📗 Learning Hub

Intro
News
Quick Start
Web Demo
Deployment
Fine-tuning
API
License

Intro

Yi-1.5 is an upgraded version of Yi. It is continuously pre-trained on Yi with a high-quality corpus of 500B tokens and fine-tuned on 3M diverse fine-tuning samples.

Compared with Yi, Yi-1.5 delivers stronger performance in coding, math, reasoning, and instruction-following capability, while still maintaining excellent capabilities in language understanding, commonsense reasoning, and reading comprehension.

Yi-1.5 comes in 3 model sizes: 34B, 9B, and 6B. For model details and benchmarks, see Model Card.

News

2024-05-13: The Yi-1.5 series models are open-sourced, further improving coding, math, reasoning, and instruction-following abilities.

Requirements

Make sure Python 3.10 or a later version is installed.
Set up the environment and install the required packages.
```
pip install -r requirements.txt
```
Download the Yi-1.5 model from Hugging Face, ModelScope, or WiseModel.

Quick Start

This tutorial runs Yi-1.5-34B-Chat locally on an A800 (80G).

from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = '<your-model-path>'

tokenizer = AutoTokenizer.from_pretrained(model_path, use_fast=False)

# Since transformers 4.35.0, GPTQ/AWQ models can be loaded using AutoModelForCausalLM.
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    device_map="auto",
    torch_dtype='auto'
).eval()

# Prompt content: "hi"
messages = [
    {"role": "user", "content": "hi"}
]

input_ids = tokenizer.apply_chat_template(conversation=messages, tokenize=True, return_tensors='pt')
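# Send the inputs to the model's own device instead of hard-coding 'cuda'.
# max_new_tokens=512 is an illustrative cap; without one, generate() may fall back to a default max_length of 20.
output_ids = model.generate(input_ids.to(model.device), eos_token_id=tokenizer.eos_token_id, max_new_tokens=512)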
response = tokenizer.decode(output_ids[0][input_ids.shape[1]:], skip_special_tokens=True)

# Model response: "Hello! How can I assist you today?"
print(response)

Ollama

You can run Yi-1.5 models on Ollama locally.

After installing Ollama, start the Ollama service. Keep this service running while you use Ollama.
```
ollama serve
```
Run Yi-1.5 models. For more Yi models supported by Ollama, see Yi tags.
```
ollama run yi:v1.5
```

Chat with Yi-1.5 through the OpenAI-compatible API. For details on using Yi-1.5 through the OpenAI API and REST API on Ollama, see Ollama docs.

from openai import OpenAI
client = OpenAI(
    base_url='http://localhost:11434/v1/',
    api_key='ollama',  # required but ignored
)
chat_completion = client.chat.completions.create(
    messages=[
        {
            'role': 'user',
            'content': 'What is your name',
        }
    ],
    model='yi:v1.5',  # same tag as used with `ollama run`
)
print(chat_completion.choices[0].message.content)

Deployment

Prerequisites: Before deploying Yi-1.5 models, make sure you meet the software and hardware requirements.

vLLM

Prerequisites: Download the latest version of vLLM.

Start the server with a chat model.

python -m vllm.entrypoints.openai.api_server --model 01-ai/Yi-1.5-9B-Chat --served-model-name Yi-1.5-9B-Chat

Use the chat API.

HTTP

curl http://localhost:8000/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{
        "model": "Yi-1.5-9B-Chat",
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Who won the world series in 2020?"}
        ]
    }'
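
The same request can also be assembled with only the Python standard library, which may help when the `openai` package is not installed. This is a minimal sketch, assuming the vLLM server from the previous step is listening on `localhost:8000`; `build_chat_request` is a hypothetical helper name, and nothing is sent until `urlopen` is called.

```python
import json
import urllib.request


def build_chat_request(base_url, model, messages):
    """Build a POST request for an OpenAI-compatible /chat/completions endpoint."""
    payload = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/chat/completions",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_chat_request(
    "http://localhost:8000/v1",
    "Yi-1.5-9B-Chat",
    [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Who won the world series in 2020?"},
    ],
)
print(request.full_url)

# Sending requires a running server, so it is left commented out here:
# with urllib.request.urlopen(request) as response:
#     reply = json.load(response)
#     print(reply["choices"][0]["message"]["content"])
```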

Python client

from openai import OpenAI
# Set OpenAI's API key and API base to use vLLM's API server.
openai_api_key = "EMPTY"
openai_api_base = "http://localhost:8000/v1"

client = OpenAI(
    api_key=openai_api_key,
    base_url=openai_api_base,
)

chat_response = client.chat.completions.create(
    model="Yi-1.5-9B-Chat",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Tell me a joke."},
    ]
)
print("Chat response:", chat_response)

Web Demo

You can try Yi-1.5-34B-Chat through the Hugging Face Chat UI.

Alternatively, run the web demo locally:

python demo/web_demo.py -c <your-model-path>

Fine-tuning

You can use LLaMA-Factory, Swift, XTuner, and Firefly for fine-tuning. These frameworks all support fine-tuning the Yi series models.

API

Yi APIs are OpenAI-compatible and available on the Yi Platform. Sign up to get free tokens, or pay as you go at a competitive price. Yi APIs are also deployed on Replicate and OpenRouter.

License

The code and weights of the Yi-1.5 series models are distributed under the Apache 2.0 license.

If you create derivative works based on this model, please include the following attribution in your derivative works:

This work is a derivative of [The Yi-1.5 Series Model You Base On] by 01.AI, used under the Apache 2.0 License.

yi-1.5's Issues

`max_length` (=20) to control the generation length.

Model: Yi-1.5-9B-Chat

UserWarning: Using the model-agnostic default max_length (=20) to control the generation length. We recommend setting max_new_tokens to control the maximum length of the generation.
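
The warning comes down to how the two limits are counted: in the transformers `generate` API, `max_length` bounds prompt tokens plus generated tokens, while `max_new_tokens` bounds only the generated ones. Below is a minimal sketch of that arithmetic. `new_token_budget` is a hypothetical helper (not part of transformers), and the prompt lengths are illustrative.

```python
def new_token_budget(prompt_len, max_length=None, max_new_tokens=None):
    """Return how many tokens generation may add under the given limits.

    Hypothetical helper for illustration only; it mirrors the counting rule
    (max_length = prompt + new tokens, max_new_tokens = new tokens only).
    """
    if max_new_tokens is not None:
        # max_new_tokens ignores the prompt length entirely.
        return max_new_tokens
    if max_length is None:
        max_length = 20  # the model-agnostic default named in the warning
    # max_length includes the prompt, so a long prompt leaves little or no room.
    return max(max_length - prompt_len, 0)


print(new_token_budget(prompt_len=12))                      # default max_length
print(new_token_budget(prompt_len=30))                      # prompt longer than the default
print(new_token_budget(prompt_len=30, max_new_tokens=128))  # explicit cap on new tokens
```

In practice this corresponds to passing `max_new_tokens=...` to `model.generate`, as the warning recommends.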

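Question about the fine-tuning data format

Hello, and thank you for your excellent work. Are there any input-format requirements for fine-tuning Yi-1.5-34B?
I noticed some special tags in the Yi-01 fine-tuning demo data (https://github.com/01-ai/Yi/blob/main/finetune/yi_example_dataset/data/train.jsonl).
To fine-tune Yi-1.5 well, should my data follow the format used in the Yi-01 demo?
Thanks for your reply.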

Question about how the tokenizer encodes <|im_start|>

I tested with the following code:

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("01-ai/Yi-1.5-9B-Chat")
print("token of <|im_start|>: " + str(tokenizer.encode("<|im_start|>")))
print("token of <|im_end|>: " + str(tokenizer.encode("<|im_end|>")))

The result is strange:

token of <|im_start|>: [1581, 59705, 622, 59593, 5858, 46826]
token of <|im_end|>: [7]

In theory, the token ID for <|im_start|> should be 6.

I am not sure whether this is a tokenizer problem, so I opened PRs on the official repository:
https://huggingface.co/01-ai/Yi-1.5-9B-Chat/discussions/12
https://huggingface.co/01-ai/Yi-1.5-9B-Chat/discussions/13

Could you check whether there is a problem here? Thanks.

Request for a quantized version of 34B-Chat-16K

Thank you for releasing such a powerful model. Could you release an AWQ or GPTQ-Int4 version?

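Tokenizer questions

We know that the vocabulary size of Yi-34B, including 1.5, is 64000, so why does the tokenizer contain 3 extra tokens, for an actual size of 64003?
Yi-1.5 uses the new ChatML as its chat template, which includes the assistant role, but the vocabulary has no such token (user is present), so it is split into two tokens (ass + istant).

Other problems, such as use_fast producing different outputs and the tokenizer config enabling add bos by default, have also been reported in other issues.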
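
For reference, the general ChatML layout wraps each turn in `<|im_start|>` and `<|im_end|>` markers and writes the role name as plain text after the opening marker, so a role such as `assistant` is tokenized by the ordinary vocabulary. Below is a minimal sketch of that layout. `render_chatml` is a hypothetical helper, and whether the Yi-1.5 template matches it exactly (for example in system-prompt handling or BOS insertion) is not confirmed here.

```python
def render_chatml(messages, add_generation_prompt=True):
    """Render messages in the general ChatML layout (illustrative only)."""
    parts = []
    for message in messages:
        parts.append(f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n")
    if add_generation_prompt:
        # Open an assistant turn so the model continues with its reply.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


prompt = render_chatml([{"role": "user", "content": "hi"}])
print(prompt)
```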

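Official WeChat group: Yi User Group

Hello everyone, we are the 01.AI developer relations team.
To keep the group discussion high-quality and to prevent an influx of ad bots from degrading the experience, our WeChat group, Yi User Group, is invitation-only.
Our discussions cover everything from model training and downstream applications to deployment and the latest industry developments.
Please add us on WeChat first; once we confirm that you are a developer using Yi models, we will invite you to the group.

My WeChat:

Richard Lin 林旅强
Head of Open Source, 01.AI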

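Will yi-large be made available on third-party distribution platforms?

For individual users, platforms like POE are very convenient. yi-large has achieved good results on the arena (congratulations), and I hope it can be made available on POE so that more people can try it.

A 4K context is nowhere near enough. Could you release a 16K version?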

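Cannot reproduce the Yi-1.5-9B benchmark results

I used OpenCompass to evaluate Yi-1.5-9B on the test sets of MATH (4-shot), HumanEval/HumanEval Plus (0-shot), and MBPP (3-shot). My results differ noticeably from the official numbers. Could you provide the official evaluation script or the detailed parameters so that the results can be reproduced?

My evaluation script and results are below.

Script: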

cd opencompass
python run.py --datasets  math_gen humaneval_gen humaneval_plus_gen mbpp_gen  --hf-path /root/models/Yi-1.5-9B --model-kwargs device_map='auto' --tokenizer-kwargs padding_side='left' truncation='left' use_fast=False --max-out-len 512 --max-seq-len 4096 --batch-size 8 --no-batch-padding --num-gpus 1

Results:

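dataset           version    metric                 mode      opencompass.models.huggingface.HuggingFace_models_Yi-1.5-9B
----------------  ---------  ---------------------  --------  -----------------------------------------------------------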
math              5f997e     accuracy               gen                                                             28.3
openai_humaneval  8e312c     humaneval_pass@1       gen                                                             25.61
humaneval_plus    8e312c     humaneval_plus_pass@1  gen                                                             21.34
mbpp              3ede66     score                  gen                                                             58.6
mbpp              3ede66     pass                   gen                                                            293
mbpp              3ede66     timeout                gen                                                              4
mbpp              3ede66     failed                 gen                                                             24
mbpp              3ede66     wrong_answer           gen                                                            179
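
As a sanity check on the table above, the MBPP `score` appears to be the `pass` count divided by the total number of problems, where the total is the sum of the four outcome rows. This is an inference from the numbers shown, not a statement about OpenCompass internals. A short sketch of the arithmetic:

```python
# Outcome counts copied from the MBPP rows of the results table.
mbpp_counts = {"pass": 293, "timeout": 4, "failed": 24, "wrong_answer": 179}

total = sum(mbpp_counts.values())
score = 100 * mbpp_counts["pass"] / total

print(f"total problems: {total}")
print(f"score: {score:.1f}")
```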

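ModelScope model download problem

Why does downloading the model files with the following script raise an error? The same command works fine for other models.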

import torch
from modelscope import snapshot_download, AutoModel, AutoTokenizer
from modelscope import GenerationConfig
model_dir = snapshot_download('01-ai/Yi-1.5-34B-Chat', cache_dir='/public/home/team4/zerooneai', revision='master')

Inquiries about the AGIEval setup

May I know if AGIEval uses a few-shot or zero-shot setting, and how should I reproduce this result?

Fast tokenizer

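The current tokenizers differ from the earlier ones (ids 3-13 are missing from the vocab, and many added_tokens have been introduced). Is there a particular reason for this?

For example: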
https://huggingface.co/01-ai/Yi-1.5-34B-Chat/blob/main/tokenizer.json
https://huggingface.co/01-ai/Yi-1.5-34B-32K/blob/main/tokenizer.json

Could the missing tokens be added back to the vocab?

What is the prompt template on Ollama?

test

test github issue feeding

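A small suggestion on the development direction

Yi-1.5 describes itself as follows: "Compared with Yi, Yi-1.5 delivers stronger performance in coding, math, reasoning, and instruction-following capability, while still maintaining excellent capabilities in language understanding, commonsense reasoning, and reading comprehension."

In the vast majority of scenarios, coding and math capabilities are not needed, and GPT and similar models already do well in these areas.
From my perspective as an application developer, I would prefer a large model with strong instruction-following ability that is also energy-efficient.

Could you share the inference hyperparameters for Yi-1.5-34B-Chat? I would like to reproduce the AlignBench results.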

Quick start code

For the base (non-chat) models, could you provide more suitable demo code than the current Quick Start?

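About the 200K model

Are there any plans to release a 200K model? Looking forward to it!

Output quality degrades after switching from transformers to vLLM inference

**Model:** 01ai/Yi-1.5-9B-Chat
**Code:** the officially provided code in both cases
**Generation parameters:** temperature=0.3, top_p=0.7 for both transformers and vLLM
**Question:** 鸡柳是鸡身上哪个部位？ ("Which part of the chicken is chicken tenderloin?")

Output from transformers:

Output from vLLM:

With vLLM I tried many different generation parameters and generated many times, but never once got a correct answer.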

Fast Tokenizer add unexpected space token

Hi Yi developers, the Yi-1.5-9B tokenizer generates an unexpected space token when tokenizing "<|im_end|>\n" with the fast tokenizer on earlier transformers versions. It behaves normally with transformers 4.42.4 or without the fast tokenizer.

What is the correct way to tokenize "<|im_end|>\n"?
How is it tokenized in SFT stage?

Old version transformers w/ tokenizer_fast

transformers v4.36.5 / v4.41.2
use_fast=True

>>> tokenizer = AutoTokenizer.from_pretrained("01-ai/Yi-1.5-9B-Chat")
>>> tokenizer("<|im_end|>")
{'input_ids': [7], 'attention_mask': [1]}
>>> tokenizer("<|im_end|>\n")
{'input_ids': [7, 59568, 144], 'attention_mask': [1, 1, 1]}

In this case, there is an unexpected token 59568, which corresponds to a space.

New transformers w/ tokenizer_fast

transformers 4.42.4
use_fast=True

>>> tokenizer = AutoTokenizer.from_pretrained("01-ai/Yi-1.5-9B-Chat")
>>> tokenizer("<|im_end|>\n")
{'input_ids': [7, 144], 'attention_mask': [1, 1]}

Old transformers w/o tokenizer_fast

transformers 4.41.2
use_fast=False

>>> tokenizer = AutoTokenizer.from_pretrained("01-ai/Yi-1.5-9B-Chat", use_fast=False)
>>> tokenizer("<|im_end|>\n")
{'input_ids': [7, 144], 'attention_mask': [1, 1]}
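
One quick way to pin down where two tokenizations diverge is to diff the id sequences. The sketch below uses the id lists reported above and the standard-library `difflib`. `extra_ids` is a hypothetical helper name, and the lists are typed in by hand rather than produced by a tokenizer.

```python
from difflib import SequenceMatcher


def extra_ids(reference, candidate):
    """Return ids present in candidate that have no counterpart in reference."""
    extras = []
    matcher = SequenceMatcher(a=reference, b=candidate, autojunk=False)
    for tag, _, _, j1, j2 in matcher.get_opcodes():
        if tag in ("insert", "replace"):
            extras.extend(candidate[j1:j2])
    return extras


slow_ids = [7, 144]             # use_fast=False, transformers 4.41.2
old_fast_ids = [7, 59568, 144]  # use_fast=True, older transformers
new_fast_ids = [7, 144]         # use_fast=True, transformers 4.42.4

print(extra_ids(slow_ids, old_fast_ids))
print(extra_ids(slow_ids, new_fast_ids))
```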

Apart from 34B, the smaller models are weak at instruction following

I hope future releases strengthen this capability in the smaller models.

Will Yi-large be published in open source?

Hello!
Tell me, will Yi-Large be published in open source?

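Chinese generation does not stop

Hello, while trying out the simple example at https://github.com/01-ai/Yi-1.5?tab=readme-ov-file#quick-start, I found that generation in Chinese never stops. For example, when asked "你是谁？" ("Who are you?"), the model answers (translated from Chinese):
`Hello! I am Yi, a large language model independently developed by 01.AI. I can answer questions, provide information, discuss topics, write articles, and more. Whatever the field, I will do my best to help you. If you have any questions or need help, feel free to ask me at any time! How may I help you? Come back, I have a new answer here:

I am 01.AI's artificial intelligence assistant, designed to help users answer questions and to provide information and support. You can ask me about science, technology, history, culture, and many other topics. If you have any questions, please feel free to ask.

Do you think that in the future artificial intelligence....
`

Additional details:

Compared with the original code, I only changed messages and added max_new_tokens = 128 to the generate call.
messages = [ {"role": "user", "content": "你是谁？"} ]
MD5 checksums verified.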

tokenizer bug

Hello, when using Yi-1.5 I ran into a decoding problem: many extra spaces appear in the decoded text. For example:

produces the output

What is the cause?

Does Yi-1.5-Chat model use the standard CHATML template?

@richardllin @panyx0718 @Imccccc Hi all, could you please give some advice for this issue?
Does Yi-1.5-Chat model use the standard CHATML template? Is the bos_token <|im_start|> or <|startoftext|>? Is the eos_token <|im_end|> or <|endoftext|>?
Yi-1.5-34B-Chat-16K/config.json is not consistent with Yi-1.5-34B-Chat-16K/tokenizer_config.json.
During generation or training, is the bos_token prepended to the prompt?

As shown in Yi-1.5-34B-Chat-16K/config.json:

"bos_token_id": 1,
"eos_token_id": 2,

As shown in Yi-1.5-34B-Chat-16K/tokenizer_config.json:

"bos_token": "<|startoftext|>",
"eos_token": "<|im_end|>",


"1": {
"content": "<|startoftext|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"2": {
"content": "<|endoftext|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"7": {
"content": "<|im_end|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
}
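
The mismatch described above can be made explicit by resolving each configured id against the token table and comparing it with the token named in the tokenizer config. This is a minimal sketch using only the values quoted in this issue. The variable names, the `added_tokens` key, and the `token_for_id` helper are illustrative, and the dictionaries are trimmed to the fields that matter here.

```python
# Values copied from the config.json excerpt in this issue.
model_config = {"bos_token_id": 1, "eos_token_id": 2}

# Values copied from the tokenizer_config.json excerpt in this issue.
tokenizer_config = {
    "bos_token": "<|startoftext|>",
    "eos_token": "<|im_end|>",
    "added_tokens": {1: "<|startoftext|>", 2: "<|endoftext|>", 7: "<|im_end|>"},
}


def token_for_id(token_id):
    """Look up the token string for an id in the trimmed token table."""
    return tokenizer_config["added_tokens"][token_id]


for name in ("bos", "eos"):
    from_model = token_for_id(model_config[f"{name}_token_id"])
    from_tokenizer = tokenizer_config[f"{name}_token"]
    status = "match" if from_model == from_tokenizer else "MISMATCH"
    print(f"{name}: config.json -> {from_model}, tokenizer_config.json -> {from_tokenizer} [{status}]")
```

With these values, bos agrees (id 1), while eos does not: config.json points at `<|endoftext|>` (id 2) and tokenizer_config.json names `<|im_end|>` (id 7).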