Observation 1: Before I get too far down this rabbit hole, it would be nice if there w

Oobabooga vs. Twinny about twinny HOT 11 CLOSED

zaqhack commented on June 16, 2024

Oobabooga vs. Twinny

from twinny.

Comments (11)

zaqhack commented on June 16, 2024

Seems to do the trick with Aphrodite-engine ...

Oobabooga is still freaking out. It seems like a default template problem, but I'm honestly not sure. It doesn't give any particularly useful feedback when it pukes. On the plus side, it isn't crashing, now ... but it also isn't giving me much to work with. :-( Here's what I'm seeing, now: https://youtu.be/ZLUoX4YEjqk

I wish I had time to give you more of a hand with this. I have been in love with Twinny from Day One, and it has allowed me to start comparing responses from various coding models. The response time from Aphrodite, as you can see in the video above, is nearly instantaneous.

After a few days of troubleshooting, there is one thing I wish it had: A "reset to default settings" button, somewhere. ha ha ha

zaqhack commented on June 16, 2024

It indeed was. Super-weird. I wonder how that even got into my cut-and-paste buffer ...

Still not sure what needs to be done, here, but this is at least the right video. 😅

https://youtu.be/G1C9bdKt5oA

rjmacarthy commented on June 16, 2024

Hi, thanks for the report. I don't personally use the oobabooga API, so I'm having trouble testing it. Please submit a pull request with a fix for it if you can.

Many thanks,

zaqhack commented on June 16, 2024

I'm not sure I have time to learn where this lives, but I think I found (one of) the main issues. Last night I was able to get it working on the first message of a new chat. After that, it fails (usually with a 400, bad request). A little more Wireshark later, I was able to find this in one of the payloads:

{
    "role": "assistant",
    "content": "<|im_start|>assistant\nHello! How can I assist you with your coding needs today?\n",
    "type": "",
    "language": {}
},

I think "language" needs to be a string, not an object. (i.e. "" not {} )

The logs say it isn't getting a "string" for that parameter. If I make this change in Postman, it works. Not sure where it would be outputting an object instead of a string, here, but maybe that helps you narrow it down?
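For anyone hitting the same 400, here is a minimal sketch of one way to avoid it: strip each message down to `role` and `content` before it goes over the wire. This is not necessarily how twinny handles it. `InternalMessage`, `WireMessage`, and `toWireMessages` are hypothetical names made up for illustration; only the field names come from the captured payload above.

```typescript
// Hypothetical internal message shape. The extra fields mirror the captured
// payload above; this is an assumption, not a documented twinny type.
interface InternalMessage {
  role: string;
  content: string;
  type?: unknown;
  language?: unknown;
}

// The two fields an OpenAI-style chat message needs.
interface WireMessage {
  role: string;
  content: string;
}

// Keep only role and content, so a strict server never sees an object
// (like "language": {}) where it expects a string.
function toWireMessages(messages: InternalMessage[]): WireMessage[] {
  return messages.map(({ role, content }) => ({ role, content }));
}

const history: InternalMessage[] = [
  { role: "user", content: "Hi", type: "", language: {} },
  { role: "assistant", content: "Hello!", type: "", language: {} },
];

// The serialized body now carries no "type" or "language" keys.
console.log(JSON.stringify({ messages: toWireMessages(history) }));
```

Dropping the extra keys entirely, rather than coercing `language` to `""`, means the request does not depend on how each backend treats unknown fields.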

zaqhack commented on June 16, 2024

Side note: It's technically a mimic of the OpenAI API, at this point. If you get it working, it should also work with vLLM, Aphrodite-engine, ChatGPT, and others. You could maybe get away with "OpenAI Compatible" for that dropdown ...

(In truth, I'm getting the log message from Aphrodite because Ooba just implodes when it gets that payload ...)

rjmacarthy commented on June 16, 2024

Hmm ok, thanks for that. I think we already have a PR to fix a similar issue: #159, which has requested changes. If it stays stale for much longer, I will take care of it.

Many thanks for the report and detailed response/likely fix.

rjmacarthy commented on June 16, 2024

@zaqhack I just released version 3.8.9, which should address the issue of fields in the payload that don't comply with the OpenAI API spec. Please could you let me know if it helps?

Many thanks

rjmacarthy commented on June 16, 2024

Will add it just for you. By the way, the video link seems to point to the wrong video...

rjmacarthy commented on June 16, 2024

Ok, so I just got this working on my local instance. I've pushed a new version so that ooba now works with /v1/completions by default, and updated the code to stream the data from the correct property path. Also, in ooba's CMD_FLAGS.txt I had to add the --api and --listen flags for the API to work. After these things were done, it started to work. FYI, ooba seems to be streaming junk completions to me, but I'm not sure if it's the model/template I'm using. I don't really use ooba at all, so I am not sure; Ollama is just way better in my opinion.
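To illustrate what "the correct property path" means here, a rough sketch (not the actual twinny code): in the OpenAI-style streaming format, each server-sent-event line from /v1/completions carries the text at `choices[0].text`, while /v1/chat/completions carries it at `choices[0].delta.content`. `extractText` is a hypothetical helper name, and the sketch assumes each line has already been split out of the response stream.

```typescript
// Pull the generated text out of one server-sent-event line.
// Returns null for non-data lines, blank payloads, the [DONE] sentinel,
// and anything that does not parse as JSON.
function extractText(line: string): string | null {
  const prefix = "data:";
  if (!line.startsWith(prefix)) return null;

  const body = line.slice(prefix.length).trim();
  if (body === "" || body === "[DONE]") return null;

  let parsed: any;
  try {
    parsed = JSON.parse(body);
  } catch {
    return null; // partial or malformed chunk
  }

  const choice = parsed?.choices?.[0];
  if (!choice) return null;

  // /v1/completions: text lives directly on the choice.
  if (typeof choice.text === "string") return choice.text;

  // /v1/chat/completions: text lives under delta.content.
  if (typeof choice.delta?.content === "string") return choice.delta.content;

  return null;
}

console.log(extractText('data: {"choices":[{"text":"def add("}]}'));
console.log(extractText('data: {"choices":[{"delta":{"content":"Hello"}}]}'));
console.log(extractText("data: [DONE]"));
```

Reading from the wrong path usually fails silently: the request succeeds, but every chunk yields nothing, so checking both shapes makes the failure easier to spot.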

zaqhack commented on June 16, 2024

Thanks!

If I can use a model with Aphrodite, I don't look elsewhere. Unfortunately, the acceleration it uses limits my choices a bit, because it doesn't span video cards to give a larger total VRAM pool (it uses the extra cards for acceleration, not more model space). For Twinny, that's a mixed bag. It works well for Deepseek 7b, but I can't fit the bigger models onto one card with it ... for that, I need Kobold or Ooba or whatever. Is what it is. :-)

I should check out Ollama, I guess.

rjmacarthy commented on June 16, 2024

It should work now, closing.
