Comments (3)
I think the issue is that you didn't add an empty assistant message ({"role": "assistant"}) before inference. We only add {"role": "assistant"} when running inference and we don't keep it appended afterwards. You can try this block of code:
from llama_cpp import Llama
from functionary.prompt_template import get_prompt_template_from_tokenizer
from transformers import AutoTokenizer
import json

functions = [
    {
        "name": "get_current_weather",
        "description": "Get the current weather in a given location",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "The city and state, e.g. San Francisco, CA",
                },
                "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
            },
            "required": ["location"],
        },
    }
]


def get_current_weather(arguments):
    return {"temperature": "28 C", "rain": "0%"}


# You can download gguf files from https://huggingface.co/meetkai/functionary-7b-v1.4-GGUF/tree/main
llm = Llama(model_path="models/functionary-7b-v1.4.q4_0.gguf", n_ctx=4096, n_gpu_layers=-1)
tokenizer = AutoTokenizer.from_pretrained("meetkai/functionary-7b-v1.4", legacy=True)
# prompt_template will be used for creating the prompt
prompt_template = get_prompt_template_from_tokenizer(tokenizer)


def generate_message(messages, functions):
    # Before inference, we add an empty message with role=assistant
    all_messages = list(messages) + [{"role": "assistant"}]
    # Create the prompt to use for inference
    prompt_str = prompt_template.get_prompt_from_messages(all_messages, functions)
    token_ids = tokenizer.encode(prompt_str)
    gen_tokens = []
    # Get the list of stop tokens
    stop_token_ids = [
        tokenizer.encode(token)[-1]
        for token in prompt_template.get_stop_tokens_for_generation()
    ]
    print("stop_token_ids: ", stop_token_ids)
    # We use generate (instead of __call__) so we can pass in a list of token ids
    for token_id in llm.generate(token_ids, temp=0):
        if token_id in stop_token_ids:
            break
        gen_tokens.append(token_id)
    llm_output = tokenizer.decode(gen_tokens)
    result = prompt_template.parse_assistant_response(llm_output)
    return result


messages = [
    {"role": "system", "content": "Current date is 2023-11-28."},
    {"role": "system", "content": "Currently connected user is [email protected]"},
    {"role": "user", "content": "I wonder if it is going to rain today here in Paris."},
]

while True:
    new_message = generate_message(messages, functions)
    messages.append(new_message)
    if "function_call" in new_message:  # Need to call a function
        func_name = new_message["function_call"]["name"]
        func_args = new_message["function_call"]["arguments"]
        if func_name == "get_current_weather":
            func_res = get_current_weather(json.loads(func_args))
            messages.append({
                "role": "function", "name": func_name,
                "content": json.dumps(func_res, ensure_ascii=False),
            })
            print(f"CALL_FUNCTION: {func_name}, arguments: {func_args}, \nresult: \n{func_res}")
    else:
        print("ASSISTANT_RESPONSE: ", new_message["content"])
        break
    input()
Here is the output from my test:
CALL_FUNCTION: get_current_weather, arguments: {
"location": "Paris, France"
},
result:
{'temperature': '28 C', 'rain': '0%'}
stop_token_ids: [32002, 32004]
Llama.generate: prefix-match hit
ASSISTANT_RESPONSE: The current weather in Paris is 28 degrees Celsius and there's no rain expected today.
Hello @khai-meetkai, I tested re-injecting {"role": "assistant"} only when doing inference and removing it afterwards, and it works!
Can you give us some details on what this empty message does? Is the model just trained to expect it as a way of telling it that it is its turn to "talk"?
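The "inject only for inference, remove afterwards" pattern can be sketched like this (a minimal, self-contained sketch: `run_inference` is a stand-in for the real llama_cpp call, not part of functionary's API):

```python
def run_inference(messages):
    # Stand-in for the real model call; real code would render the prompt
    # from `messages` and sample from the LLM.
    return {"role": "assistant", "content": "stub reply"}


def generate_with_temporary_assistant(messages):
    messages.append({"role": "assistant"})  # inject only for inference
    try:
        return run_inference(messages)
    finally:
        messages.pop()  # always remove it afterwards


history = [{"role": "user", "content": "hi"}]
reply = generate_with_temporary_assistant(history)
# `history` is unchanged: the empty assistant message never persists
```

Using `try`/`finally` guarantees the temporary message is removed even if inference raises, so the stored conversation history stays clean.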
Hi @sashokbg. Adding {"role": "assistant"} works like adding a prefix indicating that it is the assistant's turn to generate. The model was trained with a specific prefix for assistant turns.
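To make the prefix idea concrete, here is a toy sketch of prompt rendering. The `<|role|>` tags are hypothetical and chosen only for illustration; functionary's real templates use their own special tokens.

```python
def render_prompt(messages):
    # Toy chat template: each message becomes "<|role|>" + content.
    # An assistant message with no content contributes just its prefix tag.
    return "".join(f"<|{m['role']}|>" + m.get("content", "") for m in messages)


chat = [{"role": "user", "content": "Will it rain in Paris?"}]

# Without the empty assistant message, the prompt ends at the user turn:
render_prompt(chat)
# With it, the prompt ends with the assistant prefix, cueing the model
# that the next tokens it generates are the assistant's reply:
render_prompt(chat + [{"role": "assistant"}])
```

Because the model saw this prefix at the start of every assistant turn during training, ending the prompt with it steers generation into producing an assistant response rather than continuing the user's text.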