Comments (7)
I've also noticed (though it could be coincidental) that having more VS Code windows open affects the speed and reliability of the plugin, although I couldn't measure this reliably.
from twinny.
Thanks for the report. There's not much you can really do wrong. I use an Nvidia 3090 currently with Ollama and codellama:13b-code for FIM completions.
What happens if you run
ollama run codellama:7b-code '<PRE> def compute_gcd(x, y): <SUF>return result <MID>'
in your terminal?
Could you let me know which model you are using? The model can be changed in the extension's settings menu.
In the settings menu, also try lowering the context length, which might be confusing the model and making completions take too long to generate. The number is how many lines above and below the FIM position are included. The default was 300; I just changed it to 50 in a new release, which provides less context in large files.
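For illustration, here is a rough sketch of how a FIM prompt might be assembled from a limited number of context lines around the cursor. The function and parameter names are assumptions for this example, not twinny's actual code; only the <PRE>/<SUF>/<MID> template comes from the codellama command above.

```typescript
// Illustrative sketch: build a codellama fill-in-the-middle prompt from at
// most `contextLines` lines above and below the cursor. Names are
// hypothetical, not twinny's real API.
function buildFimPrompt(
  lines: string[],
  cursorLine: number,
  cursorCol: number,
  contextLines: number // the "context length" setting, e.g. 50 instead of 300
): string {
  const start = Math.max(0, cursorLine - contextLines);
  const end = Math.min(lines.length, cursorLine + 1 + contextLines);
  const current = lines[cursorLine] ?? "";
  // Everything before the cursor becomes the prefix, everything after the suffix.
  const prefix = [...lines.slice(start, cursorLine), current.slice(0, cursorCol)].join("\n");
  const suffix = [current.slice(cursorCol), ...lines.slice(cursorLine + 1, end)].join("\n");
  // codellama's fill-in-the-middle template, as used in the terminal command above.
  return `<PRE> ${prefix} <SUF>${suffix} <MID>`;
}
```

With a smaller context-line count, far less of a large file is sent to the model, which shortens generation time and reduces the chance of confusing it.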
Please let me know how you get on.
Hi! Thanks for the hints. I've set it down to 50.
The command outputs the expected completion. I think the plugin is not sending requests all the time, though I'm not sure.
I noticed the same as @LeonardoGentile as described above.
    # Use Euclid's algorithm to compute the greatest common divisor of x and y.
    assert isinstance(x, int) and isinstance(y, int), "x and y must be integers"
    if x == 0 or y == 0: return max(abs(x), abs(y))
    # Euclid's algorithm
    if x < y: x, y = y, x
    while True:
        x_mod_y = x % y
        if x_mod_y == 0: return y
        x = y
        y = x_mod_y
    # Euclid's algorithm ends here
# end function compute_gcd

def lcm(x, y):
    # Compute the least common multiple of x and y. Use Euclid's algorithm
    # to compute the greatest common divisor first, then compute the
    # product of x and y divided by that value.
    assert isinstance(x, int) and isinstance(y, int), "x and y must be integers"
    return (x * y) / compute_gcd(x, y)
# end function lcm

if __name__ == '__main__':
    import sys
    args = sys.argv[1:]  # Ignore script file name
    if len(args) > 0:
        for arg in args:
            try:
                nums = [int(arg)]
                result = lcm(*nums)
<EOT>
Hey @max-wittig, thanks for your help so far. I am still not sure what the root cause could be.
According to your last message it appears that Ollama is working correctly. Are you still having issues generating completions in the editor, even in the simplest form, i.e. opening a new file and doing something very simple with no context?
There is an option in the settings menu named debounce wait, which controls how often the API is called automatically; perhaps increasing this value could prevent a bottleneck on your system. Also, please make sure that use file context is turned off; it scans neighbouring documents and adds additional prompt context, which could lead to slower completions.
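To illustrate what a debounce wait does: a completion request is only fired once the user has paused typing for the configured interval, so rapid keystrokes collapse into a single API call. This is a generic sketch of the pattern, not twinny's actual implementation.

```typescript
// Generic debounce helper: calling the returned function repeatedly resets a
// timer, and `fn` only runs after `waitMs` milliseconds of quiet.
function debounce<T extends unknown[]>(
  fn: (...args: T) => void,
  waitMs: number
): (...args: T) => void {
  let timer: ReturnType<typeof setTimeout> | undefined;
  return (...args: T) => {
    if (timer !== undefined) clearTimeout(timer);
    timer = setTimeout(() => fn(...args), waitMs);
  };
}

// Hypothetical usage: with a larger waitMs, fewer requests reach the Ollama
// API while the user is typing quickly.
const requestCompletion = debounce((_prompt: string) => {
  // ...call the completion API here...
}, 300);
```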
The reported issue with having multiple VS Code windows open shouldn't, in theory, cause any slowdown: the plugin only calls the LLM via the Ollama API when a completion is requested, and otherwise it should just sit idle. @LeonardoGentile, are you seeing things differently? If so, do you have any inkling as to why?
One last question: are you comfortable debugging in TypeScript and willing to help? If so, perhaps you could debug what is happening on your local instance. Unfortunately I do not use a MacBook and therefore cannot test the issue reliably. Personally I am using an Nvidia 3090, and completions in my editor are lightning fast.
Here are the files where the requests to the API are made, it should call and stream successfully when a completion is requested. You can read the contributing guide on how to get started if you decide to attempt this.
twinny/src/providers/completion.ts (line 158 and line 35 at commit 7198618)
https://github.com/rjmacarthy/twinny/blob/master/CONTRIBUTING.md
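For orientation, the request those files make boils down to something like the following sketch against Ollama's streaming /api/generate endpoint. The endpoint and the newline-delimited JSON response shape are Ollama's documented API; the function names here are hypothetical and error handling is omitted.

```typescript
// Ollama streams newline-delimited JSON objects; each carries a `response`
// text chunk. This helper extracts the text from one raw chunk of the stream.
function parseChunk(chunk: string): string {
  return chunk
    .split("\n")
    .filter(Boolean)
    .map((line) => JSON.parse(line).response ?? "")
    .join("");
}

// Illustrative sketch of streaming a completion from a local Ollama instance.
async function streamCompletion(
  prompt: string,
  model = "codellama:7b-code"
): Promise<string> {
  const res = await fetch("http://localhost:11434/api/generate", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ model, prompt, stream: true }),
  });
  const reader = res.body!.getReader();
  const decoder = new TextDecoder();
  let completion = "";
  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    completion += parseChunk(decoder.decode(value, { stream: true }));
  }
  return completion;
}
```

If a completion is requested in the editor but nothing arrives, watching whether this call starts and streams successfully is a good place to begin debugging.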
Many thanks,
Please try updating to the newest version and report back if still an issue.
Sorry for the late response.
It seems to have improved a bit, but it causes high CPU usage, and it's really hard to justify using it on the Mac. This might not be the fault of twinny but rather of the hardware. I will play around further.
Thanks, closing for now. Please report back or open a new issue, a lot of improvements have been made recently.