Comments (7)
I've submitted a PR with a guide on self-hosting the OpenFunctions model. It includes instructions and example code for setting up a local server using FastAPI and uvicorn, and for configuring a client to interact with this server using the OpenAI package. I would appreciate a review and any feedback!
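For reference, here is a stripped-down sketch of the server side. This is not the exact code from the PR; it assumes the gorilla-llm/gorilla-openfunctions-v0 weights load with Hugging Face transformers, and it mirrors just enough of the OpenAI chat-completions response shape for the client to work:

import json, time, uuid

import torch
from fastapi import FastAPI
from pydantic import BaseModel
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "gorilla-llm/gorilla-openfunctions-v0"  # assumed model id
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID, torch_dtype=torch.float16, device_map="auto"
)

app = FastAPI()

class ChatRequest(BaseModel):
    model: str
    messages: list
    functions: list = []
    temperature: float = 0.0

@app.post("/v1/chat/completions")
def chat_completions(req: ChatRequest):
    user_query = req.messages[-1]["content"]
    # Build the training-time prompt (same format as the get_prompt shared below).
    if req.functions:
        prompt = f"USER: <<question>> {user_query} <<function>> {json.dumps(req.functions)}\nASSISTANT: "
    else:
        prompt = f"USER: <<question>> {user_query}\nASSISTANT: "
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=128, do_sample=False)
    # Decode only the newly generated tokens, not the prompt.
    text = tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
    return {
        "id": f"chatcmpl-{uuid.uuid4().hex}",
        "object": "chat.completion",
        "created": int(time.time()),
        "model": req.model,
        "choices": [{"index": 0,
                     "message": {"role": "assistant", "content": text},
                     "finish_reason": "stop"}],
    }

# Run with: uvicorn server:app --host 0.0.0.0 --port 8000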

Additionally, I created this Colab notebook to quickly test the example server in action. To run the server, I used an A100 GPU, which provided good response latency. To remotely access the server running on the Colab instance from my local machine, I used ngrok to tunnel the server ports from the Colab instance to public URLs. If you want to try out the server, you can get an ngrok auth-token here.
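In case it is useful, the tunneling step boils down to something like this (a sketch using pyngrok; the token is a placeholder and the port assumes uvicorn's default 8000):

from pyngrok import ngrok

ngrok.set_auth_token("YOUR_NGROK_AUTH_TOKEN")  # placeholder; use your own token
tunnel = ngrok.connect(8000)                   # forward the uvicorn port
print(tunnel.public_url)  # use this as openai.api_base, e.g. https://xxxx.ngrok.io/v1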
Thanks @kcolemangt for raising this. Just updated the README, but basically this is the prompt we used to train the model, so using it should work. Let me know if you run into any issues.
import json

def get_prompt(user_query, functions=[]):
    # No functions supplied: plain question prompt
    if len(functions) == 0:
        return f"USER: <<question>> {user_query}\nASSISTANT: "
    # Serialize the function schemas into the training-time format
    functions_string = json.dumps(functions)
    return f"USER: <<question>> {user_query} <<function>> {functions_string}\nASSISTANT: "
I'll keep this open if you run into any issues.
Thank you, @ShishirPatil. This information helps, but could you share the full process for running the API server locally?
With that information we could update the example to use localhost instead of this line:
openai.api_base = "http://luigi.millennium.berkeley.edu:8000/v1"
@kcolemangt with the OpenAI Python package you just have to point it at the OpenAI-compatible API of your local model. For instance, if you are trying this on your local computer, run Ollama with LiteLLM. Then edit the line you mentioned:
openai.api_base = "http://localhost"
replacing "localhost" with whatever base URL LiteLLM tells you to use.
I think what @kcolemangt is looking for is how to run a server-like endpoint, not just one for local development. If not, then I am curious how they are running the endpoint at http://luigi.millennium.berkeley.edu:8000/v1.
Ollama and LiteLLM are fine for development, but they are not really suited to server deployment. I tried a model compiled with MLC (https://llm.mlc.ai/docs/deploy/rest.html#install-mlc-chat-package), running it through MLC's completions-compatible REST API. It does work, but it does not appear to support functions passed over the API; it just constructs a function call from the user input alone.
For example, when I run this code:
import openai

def get_gorilla_response(prompt="Call me an Uber ride type \"Plus\" in Berkeley at zipcode 94704 in 10 minutes",
                         model="gorilla-openfunctions-v0", functions=[]):
    openai.api_key = "EMPTY"
    openai.api_base = "http://192.168.30.27:8000/v1"
    try:
        completion = openai.ChatCompletion.create(
            model="gorilla-openfunctions-v1-q4f32_0",
            temperature=0.0,
            messages=[{"role": "user", "content": prompt}],
            functions=functions,
        )
        return completion.choices[0].message.content
    except Exception as e:
        print(e, model, prompt)

query = "Call me an Uber ride type \"Plus\" in Berkeley at zipcode 94704 in 10 minutes"
functions = [
    {
        "name": "Uber Carpool",
        "api_name": "uber.ride",
        "description": "Find suitable ride for customers given the location, type of ride, and the amount of time the customer is willing to wait as parameters",
        "parameters": [
            {"name": "loc", "description": "location of the starting place of the uber ride"},
            {"name": "type", "enum": ["plus", "comfort", "black"],
             "description": "types of uber ride user is ordering"},
            {"name": "time", "description": "the amount of time in minutes the customer is willing to wait"},
        ],
    }
]

resp = get_gorilla_response(query, functions=functions)
print(resp)
print("Done!")
The output is always: call_me_uber_ride_type("Plus", Berkeley, zipcode=94704, duration=10)
So if MLC is not going to work, I was wondering if there is another recommended way to spin up an API server on a dedicated GPU machine that I can self-host, like the one at Berkeley?
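One route I have been considering is vLLM's OpenAI-compatible server (untested sketch; the Hugging Face model id is an assumption, and the functions would need to be folded into the prompt with the get_prompt shared above, since a generic server does not know Gorilla's function format):

# Launch (shell):
#   python -m vllm.entrypoints.openai.api_server \
#       --model gorilla-llm/gorilla-openfunctions-v1 --port 8000
# Then point the existing client code at it:
import openai

openai.api_key = "EMPTY"
openai.api_base = "http://localhost:8000/v1"  # the vLLM server launched above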
I am also looking for this: having it run locally as a drop-in replacement for OpenAI function calling in existing applications. @ShishirPatil