
Comments (7)

ramanv0 commented on June 2, 2024

I've submitted a PR with a guide on self-hosting the OpenFunctions model. It includes instructions and example code for setting up a local server using FastAPI and uvicorn, and for configuring a client to interact with this server using the OpenAI package. I would appreciate a review and any feedback!

from gorilla.

ramanv0 commented on June 2, 2024

Additionally, I created this Colab notebook to quickly test the example server in action. To run the server, I used an A100 GPU, which provided good response latency. To access the server running on the Colab instance from my local machine, I used ngrok to tunnel the server port from the Colab instance to a public URL. If you want to try out the server, you can get an ngrok auth token here.
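For anyone reproducing the tunnel, the setup amounts to something like the following (commands assumed from the ngrok v3 CLI; check the ngrok docs for your version, and substitute the port your server listens on):

```shell
# Register the auth token from the ngrok dashboard (one-time setup)
ngrok config add-authtoken YOUR_AUTH_TOKEN

# Tunnel the local server port (here assumed to be 8000) to a public URL
ngrok http 8000
```

ngrok then prints a public `https://….ngrok-free.app` URL that can be used as the client's `api_base` from outside the Colab instance.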


ShishirPatil commented on June 2, 2024

Thanks @kcolemangt for raising this. I've just updated the README; this is the prompt format we used to train the model, so using it should work. Let me know if you run into any issues.

import json

def get_prompt(user_query, functions=[]):
  if len(functions) == 0:
    return f"USER: <<question>> {user_query}\nASSISTANT: "
  functions_string = json.dumps(functions)
  return f"USER: <<question>> {user_query} <<function>> {functions_string}\nASSISTANT: "

I'll keep this open if you run into any issues.
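As a quick sanity check, the helper above produces prompts like the following (the `uber.ride` function name is just an illustrative stub):

```python
import json

def get_prompt(user_query, functions=[]):
    # Same prompt builder as in the comment above.
    if len(functions) == 0:
        return f"USER: <<question>> {user_query}\nASSISTANT: "
    functions_string = json.dumps(functions)
    return f"USER: <<question>> {user_query} <<function>> {functions_string}\nASSISTANT: "

print(get_prompt("Order me an Uber", [{"name": "uber.ride"}]))
# USER: <<question>> Order me an Uber <<function>> [{"name": "uber.ride"}]
# ASSISTANT:
```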


kcolemangt commented on June 2, 2024

Thank you, @ShishirPatil. This information helps, but could you share the full process for running the API server locally?

With that, we could update the example to use localhost instead of this line:

openai.api_base = "http://luigi.millennium.berkeley.edu:8000/v1"


coolrazor007 commented on June 2, 2024

@kcolemangt With the OpenAI Python package, you just have to point it at the OpenAI-compatible API of your local model. For instance, if you are trying this on your local machine, run Ollama behind LiteLLM, then edit the line you mentioned:
openai.api_base = "http://localhost"

replacing "localhost" with whatever base URL LiteLLM tells you to use.


t-dettling commented on June 2, 2024

@kcolemangt With the OpenAI Python package, you just have to point it at the OpenAI-compatible API of your local model. For instance, if you are trying this on your local machine, run Ollama behind LiteLLM, then edit the line you mentioned: openai.api_base = "http://localhost"

replacing "localhost" with whatever base URL LiteLLM tells you to use.

I think what @kcolemangt is looking for is how to run a server-like endpoint, not just one for local development. If not, then I am curious how they are running the endpoint at http://luigi.millennium.berkeley.edu:8000/v1.

Ollama and LiteLLM are fine for development, but they are not really suited to server deployment. I tried a model compiled for MLC (https://llm.mlc.ai/docs/deploy/rest.html#install-mlc-chat-package) with its completions-compatible REST API. It does work, but it does not appear to support passing functions over the API; it just generates a function call from the user input alone.

For example when I run this code:

import openai


def get_gorilla_response(prompt="Call me an Uber ride type \"Plus\" in Berkeley at zipcode 94704 in 10 minutes",
                         model="gorilla-openfunctions-v1-q4f32_0", functions=[]):
    openai.api_key = "EMPTY"
    openai.api_base = "http://192.168.30.27:8000/v1"
    try:
        completion = openai.ChatCompletion.create(
            model=model,  # previously hardcoded, which ignored the parameter
            temperature=0.0,
            messages=[{"role": "user", "content": prompt}],
            functions=functions,
        )
        return completion.choices[0].message.content
    except Exception as e:
        print(e, model, prompt)

query = "Call me an Uber ride type \"Plus\" in Berkeley at zipcode 94704 in 10 minutes"
functions = [
    {
        "name": "Uber Carpool",
        "api_name": "uber.ride",
        "description": "Find suitable ride for customers given the location, type of ride, and the amount of time the customer is willing to wait as parameters",
        "parameters": [{"name": "loc", "description": "location of the starting place of the uber ride"},
                       {"name": "type", "enum": ["plus", "comfort", "black"],
                        "description": "types of uber ride user is ordering"},
                       {"name": "time", "description": "the amount of time in minutes the customer is willing to wait"}]
    }
]

resp = get_gorilla_response(query, functions=functions)
print(resp)
print("Done!")

The output is always: call_me_uber_ride_type("Plus", Berkeley, zipcode=94704, duration=10)
So if MLC is not going to work, is there another recommended way to spin up an API server on a dedicated GPU machine that I can self-host, like the one at Berkeley?
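If the endpoint simply ignores the `functions` field, one possible workaround (a sketch, assuming the self-hosted model expects the same training-time prompt format Shishir posted above) is to fold the function specs into the user message yourself before sending it:

```python
import json


def build_openfunctions_prompt(user_query, functions=[]):
    # Reproduce the training-time prompt format from this thread: the
    # function specs are serialized into the message text itself, so a
    # server that drops the `functions` field still sees them.
    if not functions:
        return f"USER: <<question>> {user_query}\nASSISTANT: "
    return (f"USER: <<question>> {user_query} "
            f"<<function>> {json.dumps(functions)}\nASSISTANT: ")


functions = [{"name": "Uber Carpool", "api_name": "uber.ride"}]
content = build_openfunctions_prompt(
    'Call me an Uber ride type "Plus" in Berkeley at zipcode 94704 in 10 minutes',
    functions,
)
# `content` would then be sent as the single user message, e.g.
# messages=[{"role": "user", "content": content}], with no `functions` kwarg.
```

Whether this matches what the self-hosted v1 model expects depends on its training prompt; treat it as an experiment, not a confirmed fix.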


ChristianWeyer commented on June 2, 2024

Thank you, @ShishirPatil. This information helps, but could you share the full process for running the API server locally?

With that, we could update the example to use localhost instead of this line:

openai.api_base = "http://luigi.millennium.berkeley.edu:8000/v1"

I am also looking for this: having it run locally as a drop-in replacement for OpenAI Function Calling in existing applications. @ShishirPatil

