
Comments (3)

khai-meetkai commented on July 17, 2024

I think the issue is that you didn't add an empty assistant message ({"role": "assistant"}) before inference. We only add {"role": "assistant"} when we run inference and we don't keep it appended to the conversation afterwards. You can try this block of code:

from llama_cpp import Llama
from functionary.prompt_template import get_prompt_template_from_tokenizer
from transformers import AutoTokenizer
import json

functions = [
    {
        "name": "get_current_weather",
        "description": "Get the current weather in a given location",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "The city and state, e.g. San Francisco, CA",
                },
                "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
            },
            "required": ["location"],
        },
    }
]


def get_current_weather(arguments):
    return {"temperature": "28 C", "rain": "0%"}


# You can download gguf files from https://huggingface.co/meetkai/functionary-7b-v1.4-GGUF/tree/main
llm = Llama(model_path="models/functionary-7b-v1.4.q4_0.gguf", n_ctx=4096, n_gpu_layers=-1)

tokenizer = AutoTokenizer.from_pretrained("meetkai/functionary-7b-v1.4", legacy=True)
# prompt_template will be used for creating the prompt
prompt_template = get_prompt_template_from_tokenizer(tokenizer)

def generate_message(messages, functions):
    # Before inference, we will add an empty message with role=assistant
    all_messages = list(messages) + [{"role": "assistant"}]
    
    # Create the prompt to use for inference
    prompt_str = prompt_template.get_prompt_from_messages(all_messages, functions)
    token_ids = tokenizer.encode(prompt_str)

    gen_tokens = []
    # Get list of stop_tokens 
    stop_token_ids = [tokenizer.encode(token)[-1] for token in prompt_template.get_stop_tokens_for_generation()]
    print("stop_token_ids: ", stop_token_ids)

    # We use function generate (instead of __call__) so we can pass in list of token_ids
    for token_id in llm.generate(token_ids, temp=0):
        if token_id in stop_token_ids:
            break
        gen_tokens.append(token_id)

    llm_output = tokenizer.decode(gen_tokens)
    result = prompt_template.parse_assistant_response(llm_output)
    return result


messages = [
    {'content': 'Current date is 2023-11-28.', 'role': 'system'},
    {'content': 'Currently connected user is [email protected]', 'role': 'system'},
    {"role": "user", "content": "I wonder if it is going to rain today here in Paris."}
]

while True:
    new_message = generate_message(messages, functions)
    messages.append(new_message)
    if "function_call" in new_message:  # Need to call a function
        func_name = new_message["function_call"]["name"]
        func_args = new_message["function_call"]["arguments"]
        if func_name == "get_current_weather":
            func_res = get_current_weather(json.loads(func_args))
        messages.append({
            "role": "function", "name": func_name,
            "content": json.dumps(func_res, ensure_ascii=False)
        })
        print(f"CALL_FUNCTION: {func_name}, arguments: {func_args}, \nresult: \n{func_res}")
    else:
        print("ASSISTANT_RESPONSE: ", new_message["content"])
        break

input()

Here is the output from my test:

CALL_FUNCTION: get_current_weather, arguments: {
  "location": "Paris, France"
},
result:
{'temperature': '28 C', 'rain': '0%'}
stop_token_ids:  [32002, 32004]
Llama.generate: prefix-match hit
ASSISTANT_RESPONSE:  The current weather in Paris is 28 degrees Celsius and there's no rain expected today.


sashokbg commented on July 17, 2024

Hello @khai-meetkai, I tested with injecting {"role": "assistant"} only when doing the inference and removing it afterwards, and it works!
Can you give us some details on what this empty message does? Is the model just trained to expect it as a way of telling it that it is its turn to "talk"?


khai-meetkai commented on July 17, 2024

Hi @sashokbg. Adding {"role": "assistant"} acts as a prefix indicating that it's the assistant's turn to generate. The model was trained with a specific prefix for the assistant role.
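
To see what this prefix looks like in practice, you can render the prompt with and without the trailing empty assistant message and compare the two strings. This is only a small illustrative sketch that reuses the get_prompt_from_messages call from the example above; the exact prefix text depends on which prompt template version the tokenizer resolves to.

from functionary.prompt_template import get_prompt_template_from_tokenizer
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("meetkai/functionary-7b-v1.4", legacy=True)
prompt_template = get_prompt_template_from_tokenizer(tokenizer)

# Reuse the same function schema as in the example above
functions = [
    {
        "name": "get_current_weather",
        "description": "Get the current weather in a given location",
        "parameters": {
            "type": "object",
            "properties": {"location": {"type": "string"}},
            "required": ["location"],
        },
    }
]

messages = [{"role": "user", "content": "I wonder if it is going to rain today here in Paris."}]

# Prompt as it would be rendered from the conversation history alone
without_prefix = prompt_template.get_prompt_from_messages(messages, functions)

# Prompt used only at inference time: the trailing empty assistant message makes
# the template append the assistant prefix, so generation starts inside an
# assistant turn instead of the model having to pick whose turn it is
with_prefix = prompt_template.get_prompt_from_messages(
    messages + [{"role": "assistant"}], functions
)

# The difference between the two strings is exactly the assistant prefix
print(with_prefix[len(without_prefix):])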
