
Comments (5)

ppipada commented on July 20, 2024

@CalabiYau14 Thanks for your comment.

Just to clarify:
Are you looking for the URL to be specified in the Hugging Face inference API itself?

If yes, can you please give an example of how the request would look? A curl command or general API info should be good enough.

or
Are you looking for support for open models that can be run offline as GGML or GPTQ (like Llama 2 via llama.cpp, etc.)?

I am going to do this very soon.

from vscode-flexigpt.

CalabiYau14 commented on July 20, 2024

Hi @ppipada

Thanks for your fast reply! Much appreciated.

I would be looking for option 1), i.e. working with a model that is locally deployed on a server and respects Hugging Face's API. Instead of making a call to

https://api-inference.huggingface.co/models/MODEL_ID

I would like to be able to make a call to e.g.

https://MY-OWN-URL/starcoder

where the endpoint provides the same methods as the original Hugging Face API.
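As a concrete sketch of such a request (hedged: MY-OWN-URL, the model path, and the request body are placeholders; only the request shape follows the public Hugging Face Inference API):

```shell
# Standard hosted call, for comparison:
#   POST https://api-inference.huggingface.co/models/MODEL_ID
# Desired: the same request shape against a self-hosted origin.
ORIGIN="https://MY-OWN-URL"          # placeholder self-hosted base URL
ENDPOINT="$ORIGIN/starcoder"         # endpoint exposing the HF API methods
echo "$ENDPOINT"                     # → https://MY-OWN-URL/starcoder

# Sketch of the actual request (the placeholder host will not resolve):
curl -s -X POST "$ENDPOINT" \
  -H "Authorization: Bearer $HF_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"inputs": "def fibonacci(n):"}' || true
```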

Hope this is understandable :-)

ppipada commented on July 20, 2024

Hi @CalabiYau14,

I have extended my work on running local GGML models to Hugging Face as well and pushed those changes; doc changes are pending.
Still not sure whether this will satisfy your need.

Basically, you can now configure a "defaultOrigin" setting per provider so that it is used instead of the standard API endpoint. Docs are at: https://pankajpipada.com/flexigpt/#/aiproviders?id=huggingface

E.g., you should now be able to call the API at: MY-OWN-URL/models/MODEL_ID

Note that this still means the /models/ prefix is added by the extension before the model ID.
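A minimal sketch of the resulting URL composition (the variable names are illustrative, not the extension's actual settings keys; see the linked docs for those):

```shell
# With a per-provider "defaultOrigin" configured, the request URL is composed
# from the origin, the fixed /models/ prefix, and the model ID.
DEFAULT_ORIGIN="https://MY-OWN-URL"   # configured value, replacing the standard endpoint
MODEL_ID="bigcode/starcoder"          # illustrative model ID
URL="$DEFAULT_ORIGIN/models/$MODEL_ID"
echo "$URL"   # → https://MY-OWN-URL/models/bigcode/starcoder
```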

There are still some gaps I am going to address in the immediate term:

  • model info in the request
  • full path override for a provider, rather than just the model or origin (not sure if this is needed; you could help me understand whether it would be needed for HF models, as for other providers I am sure it is not)
  • a way to switch models mid-conversation, or to continue with the model that was last used/switched to via a custom prompt (currently, non-prompt chats use the "defaultProvider"'s default model)

Let me know if the above helps and whether the currently checked-in method solves your use case.

CalabiYau14 commented on July 20, 2024

Hi @ppipada

That's exactly what I needed. Thank you very much. Perfect!

Cheers!

ppipada commented on July 20, 2024

Cool.

Noting the below points on the gaps I had mentioned:

  • You can now query llama.cpp directly (in addition to the previous route via an OpenAI-style custom endpoint, like the Hugging Face one).
  • Discovering whether a model is a chat model is handled in a strategy-specific way: the Hugging Face provider uses the getModel API and the associated "conversation" tag, OpenAI treats GPT-3.5/4 as chat by default, Anthropic is chat-only by default, and Google's chat models have "chat" as a suffix in their names. Not planning to take any further model info for now.
  • A provider can be specified in requestparams, and that will be honored for that prompt.

Pending:
a way to switch models mid-conversation, or to continue with the model that was last used/switched to via a custom prompt (currently, non-prompt chats use the "defaultProvider"'s default model)
