
Comments (3)

khai-meetkai commented on July 17, 2024

I think the issue is that you didn't add an empty assistant message ({"role": "assistant"}) before inference. We only add {"role": "assistant"} when we run inference and we don't keep it appended to the conversation afterwards. You can try this block of code:

from llama_cpp import Llama
from functionary.prompt_template import get_prompt_template_from_tokenizer
from transformers import AutoTokenizer
import json

functions = [
    {
        "name": "get_current_weather",
        "description": "Get the current weather in a given location",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "The city and state, e.g. San Francisco, CA",
                },
                "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
            },
            "required": ["location"],
        },
    }
]


def get_current_weather(arguments):
    return {"temperature": "28 C", "rain": "0%"}


# You can download gguf files from https://huggingface.co/meetkai/functionary-7b-v1.4-GGUF/tree/main
llm = Llama(model_path="models/functionary-7b-v1.4.q4_0.gguf", n_ctx=4096, n_gpu_layers=-1)

tokenizer = AutoTokenizer.from_pretrained("meetkai/functionary-7b-v1.4", legacy=True)
# prompt_template will be used for creating the prompt
prompt_template = get_prompt_template_from_tokenizer(tokenizer)

def generate_message(messages, functions):
    # Before inference, we will add an empty message with role=assistant
    all_messages = list(messages) + [{"role": "assistant"}]
    
    # Create the prompt to use for inference
    prompt_str = prompt_template.get_prompt_from_messages(all_messages, functions)
    token_ids = tokenizer.encode(prompt_str)

    gen_tokens = []
    # Get list of stop_tokens 
    stop_token_ids = [tokenizer.encode(token)[-1] for token in prompt_template.get_stop_tokens_for_generation()]
    print("stop_token_ids: ", stop_token_ids)

    # We use function generate (instead of __call__) so we can pass in list of token_ids
    for token_id in llm.generate(token_ids, temp=0):
        if token_id in stop_token_ids:
            break
        gen_tokens.append(token_id)

    llm_output = tokenizer.decode(gen_tokens)
    result = prompt_template.parse_assistant_response(llm_output)
    return result


messages = [
    {'content': 'Current date is 2023-11-28.', 'role': 'system'},
    {'content': 'Currently connected user is [email protected]', 'role': 'system'},
    {"role": "user", "content": "I wonder if it is going to rain today here in Paris."}
]

while True:
    new_message = generate_message(messages, functions)
    messages.append(new_message)
    if "function_call" in new_message:  # Need to call a function
        func_name = new_message["function_call"]["name"]
        func_args = new_message["function_call"]["arguments"]
        if func_name == "get_current_weather":
            func_res = get_current_weather(json.loads(func_args))
        messages.append({
            "role": "function", "name": func_name,
            "content": json.dumps(func_res, ensure_ascii=False)
        })
        print(f"CALL_FUNCTION: {func_name}, arguments: {func_args}, \nresult: \n{func_res}")
    else:
        print("ASSISTANT_RESPONSE: ", new_message["content"])
        break

input()

Here is the output from my test:

CALL_FUNCTION: get_current_weather, arguments: {
  "location": "Paris, France"
},
result:
{'temperature': '28 C', 'rain': '0%'}
stop_token_ids:  [32002, 32004]
Llama.generate: prefix-match hit
ASSISTANT_RESPONSE:  The current weather in Paris is 28 degrees Celsius and there's no rain expected today.


sashokbg commented on July 17, 2024

Hello @khai-meetkai, I tested with injecting {"role": "assistant"} only when doing the inference and removing it afterwards, and it works!
Can you give us some details on what this empty message does? Is the model just trained to expect it as a way of telling it that it is its turn to "talk"?


khai-meetkai commented on July 17, 2024

Hi @sashokbg. Adding {"role": "assistant"} acts as a prefix indicating that it's the assistant's turn to generate. The model was trained with a specific prefix for the assistant role.
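
To see what this prefix looks like in practice, you can render the prompt with and without the trailing empty assistant message and compare the two strings. This is only a small illustrative sketch that reuses the get_prompt_from_messages call from the example above; the exact prefix text depends on which prompt template version the tokenizer resolves to.

from functionary.prompt_template import get_prompt_template_from_tokenizer
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("meetkai/functionary-7b-v1.4", legacy=True)
prompt_template = get_prompt_template_from_tokenizer(tokenizer)

# Reuse the same function schema as in the example above
functions = [
    {
        "name": "get_current_weather",
        "description": "Get the current weather in a given location",
        "parameters": {
            "type": "object",
            "properties": {"location": {"type": "string"}},
            "required": ["location"],
        },
    }
]

messages = [{"role": "user", "content": "I wonder if it is going to rain today here in Paris."}]

# Prompt as it would be rendered from the conversation history alone
without_prefix = prompt_template.get_prompt_from_messages(messages, functions)

# Prompt used only at inference time: the trailing empty assistant message makes
# the template append the assistant prefix, so generation starts inside an
# assistant turn instead of the model having to pick whose turn it is
with_prefix = prompt_template.get_prompt_from_messages(
    messages + [{"role": "assistant"}], functions
)

# The difference between the two strings is exactly the assistant prefix
print(with_prefix[len(without_prefix):])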
