
Comments (5)

ppipada commented on July 20, 2024

@CalabiYau14 Thanks for your comment.

Just to clarify:
Are you looking for the URL to be specified in the Hugging Face inference API itself?

If yes, can you please give an example of how the request would look? A curl command or general API info should be good enough.

or
Are you looking for support for open models that can be run offline as GGML or GPTQ (like Llama 2 via llama.cpp, etc.)?

I am going to do this very soon.

from vscode-flexigpt.

CalabiYau14 commented on July 20, 2024

Hi @ppipada

Thanks for your fast reply! Much appreciated.

I would be looking for option 1), i.e. working with a model that is locally deployed on a server and respects Hugging Face's API. Instead of making a call to

https://api-inference.huggingface.co/models/MODEL_ID

I would like to be able to make a call to e.g.

https://MY-OWN-URL/starcoder

where the endpoint provides the same methods as the original Hugging Face API.
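As a concrete sketch of such a request (hedged: MY-OWN-URL, the model path, and the request body are placeholders; only the request shape follows the public Hugging Face Inference API):

```shell
# Standard hosted call, for comparison:
#   POST https://api-inference.huggingface.co/models/MODEL_ID
# Desired: the same request shape against a self-hosted origin.
ORIGIN="https://MY-OWN-URL"          # placeholder self-hosted base URL
ENDPOINT="$ORIGIN/starcoder"         # endpoint exposing the HF API methods
echo "$ENDPOINT"                     # → https://MY-OWN-URL/starcoder

# Sketch of the actual request (the placeholder host will not resolve):
curl -s -X POST "$ENDPOINT" \
  -H "Authorization: Bearer $HF_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"inputs": "def fibonacci(n):"}' || true
```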

Hope this is understandable :-)

ppipada commented on July 20, 2024

Hi @CalabiYau14,

I have extended my work on running local GGML models to Hugging Face as well and pushed those changes; doc changes are pending.
Still not sure whether this will satisfy your need.

Basically, you can now configure a "defaultOrigin" setting per provider so that it is used instead of the standard API endpoint. Docs are at: https://pankajpipada.com/flexigpt/#/aiproviders?id=huggingface

E.g., you should now be able to call the API at: MY-OWN-URL/models/MODEL_ID

Note that this still means the /models/ prefix is added by the extension before the model ID.
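A minimal sketch of the resulting URL composition (the variable names are illustrative, not the extension's actual settings keys; see the linked docs for those):

```shell
# With a per-provider "defaultOrigin" configured, the request URL is composed
# from the origin, the fixed /models/ prefix, and the model ID.
DEFAULT_ORIGIN="https://MY-OWN-URL"   # configured value, replacing the standard endpoint
MODEL_ID="bigcode/starcoder"          # illustrative model ID
URL="$DEFAULT_ORIGIN/models/$MODEL_ID"
echo "$URL"   # → https://MY-OWN-URL/models/bigcode/starcoder
```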

There are still some gaps I am going to address in the immediate term:

  • model info in the request
  • full path override for a provider, rather than just the model or origin (not sure if this is needed; you could help me understand whether it would be needed for HF models, as for other providers I am sure it is not)
  • a way to switch models mid-conversation, or to continue with the model that was last used/switched to via a custom prompt (currently, non-prompt chats use the "defaultProvider"'s default model)

Let me know if the above helps and whether the currently checked-in method solves your use case.

CalabiYau14 commented on July 20, 2024

Hi @ppipada

That's exactly what I needed. Thank you very much. Perfect!

Cheers!

ppipada commented on July 20, 2024

Cool.

Noting the below points on the gaps I had mentioned:

  • You can now query llama.cpp directly (in addition to the previous route via an OpenAI-style custom endpoint, like the Hugging Face one).
  • Discovering whether a model is a chat model is handled in a strategy-specific way: the Hugging Face provider uses the getModel API and the associated "conversation" tag, OpenAI treats GPT-3.5/4 as chat by default, Anthropic is chat-only by default, and Google's chat models have "chat" as a suffix in their names. Not planning to take any further model info for now.
  • A provider can be specified in requestparams, and that will be honored for that prompt.

Pending:
a way to switch models mid-conversation, or to continue with the model that was last used/switched to via a custom prompt (currently, non-prompt chats use the "defaultProvider"'s default model)
