
rjmacarthy / twinny

The most no-nonsense, locally or API-hosted AI code completion plugin for Visual Studio Code - like GitHub Copilot but completely free and 100% private.

Home Page: https://rjmacarthy.github.io/twinny-docs/

License: MIT License

JavaScript 1.44% TypeScript 83.09% CSS 15.47%
artificial-intelligence code-generation vscode-extension code-completion copilot codellama llama2 ollama ollama-api code-chat

twinny's Introduction

twinny

Tired of the so-called "free" Copilot alternatives that are filled with paywalls and signups? Look no further, developer friend!

Twinny is your definitive, no-nonsense AI code completion plugin for Visual Studio Code and compatible editors like VSCodium. It's designed to integrate seamlessly with various tools and frameworks.

Like GitHub Copilot, but 100% free!

Install Twinny on the Visual Studio Code extension marketplace.

Main Features

Fill in the Middle Code Completion

Get AI-based suggestions in real time. Let Twinny autocomplete your code as you type.

Fill in the Middle Example

Chat with AI About Your Code

Discuss your code via the sidebar: get function explanations, generate tests, request refactoring, and more.

Additional Features

  • Operates online or offline
  • Highly customizable API endpoints for FIM and chat
  • Chat conversations are preserved
  • Conforms to the OpenAI API standard
  • Supports single or multiline fill-in-middle completions
  • Customizable prompt templates
  • Generate git commit messages from staged changes
  • Easy installation via the Visual Studio Code extensions marketplace
  • Customizable settings for API provider, model name, port number, and path
  • Compatible with Ollama, llama.cpp, oobabooga, and LM Studio APIs
  • Accepts code solutions directly in the editor
  • Creates new documents from code blocks
  • Copies generated code solution blocks

🚀 Getting Started

Setup with Ollama (Recommended)

  1. Install the VS Code extension here or VSCodium here.
  2. Set up Ollama as the default backend: Install Ollama
  3. Select your model from the Ollama library (e.g., codellama:7b-instruct for chats and codellama:7b-code for auto complete).
ollama run codellama:7b-instruct
ollama run codellama:7b-code
  4. Open VS Code (if it is already open, a restart might be needed) and press CTRL + SHIFT + T to open the side panel.

You should see the 🤖 icon indicating that twinny is ready to use.

  5. See Keyboard Shortcuts to start using Twinny while coding 🎉

Setup with Other Providers (llama.cpp / LM Studio / Oobabooga / LiteLLM, or any other provider)

For setups with llama.cpp, LM Studio, Oobabooga, LiteLLM, or any other provider, you can find more details on provider configurations and functionalities here in providers.md.

  1. Install the VS Code extension here.
  2. Obtain and run your chosen model locally using the provider's setup instructions.
  3. Restart VS Code if necessary and press CTRL + SHIFT + T to open the side panel.
  4. At the top of the extension, click the 🔌 (plug) icon to configure your FIM and chat endpoints in the providers tab.
  5. It is recommended to use separate models for FIM and chat as they are optimized for different tasks.
  6. Update the provider settings for chat, including provider, port, and hostname to correctly connect to your chat model.
  7. After setup, the 🤖 icon should appear in the sidebar, indicating that Twinny is ready for use.
  8. Results may vary from provider to provider, especially if using the same model for chat and FIM interchangeably.

With Non-Local API Providers (e.g., OpenAI GPT-4 and Anthropic Claude)

Twinny supports OpenAI API-compliant providers.

  1. Use LiteLLM as your local proxy for the best compatibility.
  2. If there are any issues, please open an issue on GitHub with details.
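
For reference, "OpenAI API-compliant" here means the provider accepts requests shaped like the sketch below. The base URL, port, and model name are placeholder assumptions for a local proxy such as LiteLLM, not Twinny defaults.

// Minimal sketch of an OpenAI-compatible chat completion request against a local proxy.
// The URL, port, and model name are illustrative placeholders.
async function chatOnce(prompt: string): Promise<string> {
  const response = await fetch("http://localhost:4000/v1/chat/completions", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      model: "gpt-4", // whatever model the proxy routes to
      messages: [{ role: "user", content: prompt }],
      stream: false,
    }),
  })
  const data = await response.json()
  // OpenAI-style responses put the text under choices[0].message.content.
  return data.choices?.[0]?.message?.content ?? ""
}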

Model Support

Models for Chat:

  • For powerful machines: deepseek-coder:6.7b-base-q5_K_M or codellama:7b-instruct.
  • For less powerful setups, choose a smaller instruct model for quicker responses, albeit with less accuracy.

Models for FIM Completions:

  • High performance: deepseek-coder:base or codellama:7b-code.
  • Lower performance: deepseek-coder:1.3b-base-q4_1 for CPU-only setups.

Keyboard Shortcuts

Shortcut                      Description
ALT+\                         Trigger inline code completion
CTRL+SHIFT+/                  Stop the inline code generation
Tab                           Accept the inline code generated
CTRL+SHIFT+Z CTRL+SHIFT+T     Open Twinny sidebar
CTRL+SHIFT+Z CTRL+SHIFT+G     Generate commit messages from staged changes

Workspace Context

Enable fileContextEnabled in settings to improve completion quality by tracking sessions and file access patterns. This is off by default to ensure performance.

Known Issues

Visit the GitHub issues page for known problems and troubleshooting.

Contributing

Interested in contributing? Reach out on Twitter, describe your changes in an issue, and submit a PR when ready. Twinny is open-source under the MIT license. See the LICENSE for more details.

Disclaimer

Twinny is actively developed and provided "as is". Functionality may vary between updates.

Star History

Star History Chart

twinny's People

Contributors

allen-li1231, antonkrug, badetitou, bnorick, jeffistyping, kha84, nav9, onel, oxaronick, pacman100, pgbtc, rjmacarthy, sbeardsley, sebastianelsner, winniep


twinny's Issues

No sponsorship button on your profile page

Is your feature request related to a problem? Please describe.
I would like to sponsor your account with $1 a month (like some other GH users have)

Describe the solution you'd like
If you would enable sponsorship with a $1 tier, I could join it.

Describe alternatives you've considered
A one-off PayPal or Revolut payment could work too.

using oobabooga text generation webui instead of ollama

Hi,
thanks for your awesome efforts on this!
As you might know, oobabooga text generation webui also offers an OpenAI-like API through one of its extensions, and being able to utilize that with twinny would be a game changer.

Has anybody had any success doing so?

Add support for Bearer tokens

Problem

I've deployed Ollama and Ollama WebUI on a VM to provide chat-style AI to a development team. It's working very well and Ollama WebUI is under active development, adding features weekly.

I'd like to also provide code completion to the dev team. Since I'm running Ollama on a server, this means making requests from every developer's machine to the server. This means I need to have authentication in place.

Possible solution

Ollama WebUI has a feature where it will proxy requests to the Ollama server as long as you have a valid Bearer token in an HTTP header.

What do you think of adding an extension setting that would:

  1. let the user set an HTTP Bearer token
  2. include it in the "Authorization" header on requests to the backend

?

If Twinny could do this, it would mean admins don't have to put some other form of authentication in place in front of Ollama in order to offer code completion via their Ollama server.
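
A minimal sketch of what the request side could look like with such a setting; the setting/environment variable name, hostname, and path below are illustrative placeholders, not real Twinny settings.

// Sketch only: attach a user-supplied bearer token to backend requests.
import * as https from "https"

function buildHeaders(bearerToken?: string): Record<string, string> {
  const headers: Record<string, string> = { "Content-Type": "application/json" }
  if (bearerToken) {
    headers["Authorization"] = `Bearer ${bearerToken}`
  }
  return headers
}

// Hypothetical usage against an Ollama instance behind a proxy that checks the token.
const req = https.request({
  hostname: "ollama-proxy.example.com",
  port: 443,
  path: "/ollama/api/generate",
  method: "POST",
  headers: buildHeaders(process.env.TWINNY_BEARER_TOKEN),
})
req.end(JSON.stringify({ model: "codellama:7b-code", prompt: "..." }))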

Support for LM Studio

Is your feature request related to a problem? Please describe.
No, it's a feature request.

Describe the solution you'd like
I would like LM Studio to be supported as an LLM backend. It is very easy to use and provides a local server that can be used for completion and/or chat.

Describe alternatives you've considered
I could obviously use Ollama as well, but then I would need to keep duplicates of the same models around, which tend to be pretty large.

Additional context
https://lmstudio.ai/

Feature-Request: Support remote ollama API

Hi :) I have a server with a GPU running remotely. It would be nice if I could replace the API base URL with the remote URL of my server, so I don't need to have Ollama set up on localhost :)
Or is it already possible?
Thanks :)

JSON parse errors handling streamed responses

Describe the bug

When I run a local Ollama server on my M1 MacBook, Twinny seems to always receive a single HTTP chunk with a complete JSON object containing the query response.

However, when running Ollama on a server with a decent GPU, sometimes the responses are large enough that they get split up into multiple chunks, and Twinny attempts to parse incomplete JSON objects.

I believe there is a bug in Twinny's handling of Ollama's "chunked" HTTP transfer encoding, because if I inspect the object being passed to JSON.parse in the extension I see incomplete objects, and a stack trace like this from Twinny:

2024-01-23 12:42:54.158 [error] SyntaxError: Expected ',' or ']' after array element in JSON at position 11330
	at JSON.parse (<anonymous>)
	at onData (/Users/nick/.vscode/extensions/rjmacarthy.twinny-2.6.17/out/extension.js:685:46)
	at IncomingMessage.<anonymous> (/Users/nick/.vscode/extensions/rjmacarthy.twinny-2.6.17/out/extension.js:988:17)
	at IncomingMessage.emit (node:events:513:28)
	at Readable.read (node:internal/streams/readable:539:10)
	at flow (node:internal/streams/readable:1023:34)
	at resume_ (node:internal/streams/readable:1004:3)
	at process.processTicksAndRejections (node:internal/process/task_queues:82:21)

It looks like Twinny is receiving a chunk and trying to parse it, but the chunk is not guaranteed to be valid JSON data:

onData: (chunk, onDestroy) => {
              const json = JSON.parse(chunk)
              completion = completion + json.response

I don't notice this locally, just when running against a server.
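
For reference, Ollama streams newline-delimited JSON, so one common fix (a sketch, not Twinny's actual code) is to buffer incoming chunks and only parse complete lines:

// Sketch: buffer chunked data and JSON.parse only complete NDJSON lines.
let buffer = ""

function onData(chunk: Buffer, onLine: (json: { response: string }) => void) {
  buffer += chunk.toString("utf8")
  const lines = buffer.split("\n")
  // The last element may be a partial line; keep it in the buffer for the next chunk.
  buffer = lines.pop() ?? ""
  for (const line of lines) {
    if (line.trim().length === 0) continue
    onLine(JSON.parse(line))
  }
}

// Usage: accumulate the streamed completion text.
// response.on("data", (chunk) => onData(chunk, (json) => { completion += json.response }))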

To Reproduce
Steps to reproduce the behavior:

  1. set up Twinny to use an Ollama server that will return large responses (e.g. 15kB)
  2. use Twinny in code completion mode

Expected behavior

Twinny should accept the completion suggestion from the server and display it in the editor.

Actual Behaviour

Twinny attempts to parse an incomplete response, resulting in a JSON parsing stack trace, and no suggestion is shown.


Desktop (please complete the following information):

  • OS: MacOS


Not working with localforward to remote ollama server

In my setup I have an underpowered laptop where I do my coding and a beefy server to handle Ollama. The server is not on the internet and needs to be accessed via a jump host over SSH. I use LocalForward 11434 10.10.0.233:11434 to emulate having it locally.

For curl and other VS Code plugins like Continue, my Ollama server appears to be available on localhost:

curl -X POST http://localhost:11434/api/generate -d '{
  "model": "codellama:13b",
  "prompt": "Write me a function that outputs the fibonacci sequence"
}'

{"model":"codellama:13b","created_at":"2024-02-06T07:24:29.612339609Z","response":"\t","done":false}
{"model":"codellama:13b","created_at":"2024-02-06T07:24:29.635149631Z","response":"if","done":false}
{"model":"codellama:13b","created_at":"2024-02-06T07:24:29.657876752Z","response":" err","done":false}
{"model":"codellama:13b","created_at":"2024-02-06T07:24:29.680659605Z","response":" :=","done":false}
{"model":"codellama:13b","created_at":"2024-02-06T07:24:29.703439665Z","response":" validate","done":false}
...

Twinny, however, does not appear to be compatible with this setup, and keeps insisting I should install Ollama locally.

Expected behavior
Since the configuration is all about ports and URLs, I expect Twinny to use the host:port combination I set up, rather than making other system calls to Ollama.


I couldn't create a .vsix file

npm run package

[email protected] package
webpack --mode production --devtool hidden-source-map

[webpack-cli] Compiler starting... 
[webpack-cli] Compiler is using config: '/twinny/webpack.config.js'
[webpack-cli] Compiler finished

assets by status 226 KiB [cached] 1 asset
modules by path ./node_modules/ 415 KiB
javascript modules 271 KiB 61 modules
json modules 143 KiB
./node_modules/openai/package.json 470 bytes [built] [code generated]
./node_modules/mime-db/db.json 143 KiB [built] [code generated]
modules by path ./src/*.ts 4.87 KiB
./src/extension.ts 919 bytes [built] [code generated]
./src/completion.ts 3.97 KiB [built] [code generated] [3 errors]

  • 12 modules

ERROR in /twinny/src/completion.ts
./src/completion.ts 135:12-21
[tsl] ERROR in /twinny/src/completion.ts(135,13)
TS7053: Element implicitly has an 'any' type because expression of type '0' can't be used to index type 'CreateCompletionResponseChoicesInner'.
Property '0' does not exist on type 'CreateCompletionResponseChoicesInner'.
@ ./src/extension.ts 5:21-44

ERROR in /twinny/src/completion.ts
./src/completion.ts 136:26-31
[tsl] ERROR in /twinny/src/completion.ts(136,27)
TS2339: Property 'slice' does not exist on type 'CreateCompletionResponseChoicesInner'.
@ ./src/extension.ts 5:21-44

ERROR in /twinny/src/completion.ts
./src/completion.ts 136:42-48
[tsl] ERROR in /twinny/src/completion.ts(136,43)
TS2339: Property 'length' does not exist on type 'CreateCompletionResponseChoicesInner'.
@ ./src/extension.ts 5:21-44

3 errors have detailed information that is not shown.
Use 'stats.errorDetails: true' resp. '--stats-error-details' to show it.

webpack 5.88.2 compiled with 3 errors in 4669 ms
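
For what it's worth, the errors suggest the code indexes a single choice object as if it were the choices array and then calls string methods on it. A hedged guess at the shape of a fix, with simplified stand-in types rather than the real openai package types:

// Hypothetical illustration only: read the first element of choices, then use its text field,
// so that .slice and .length are called on a string rather than on the choice object.
interface CompletionChoice { text?: string }
interface CompletionResponse { choices: CompletionChoice[] }

function extractCompletionText(res: CompletionResponse, maxLength: number): string {
  const firstChoice = res.choices[0]
  const text = firstChoice?.text ?? ""
  return text.slice(0, Math.min(text.length, maxLength))
}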

"Use Tls" setting is no longer present in Settings

Describe the bug

In version 2.6.9 and several versions prior, there was a "Use Tls" setting that was added in response to #28. That setting is no longer present.

To Reproduce

  1. upgrade Twinny to 2.6.13
  2. try to use it with a server that is listening for https instead of plain http

Expected behavior

Twinny should send the request over https.

Actual behaviour

Twinny sends the request over plain http.

Here is a capture of the request sent by Twinny:

POST /api/generate HTTP/1.1
Content-Type: application/json
Authorization: Bearer xxxxxx-xxxxxxxx-xxxxxx-xxx-xxx
Host: my.domain.com:444
Connection: close
Transfer-Encoding: chunked

3f3d
{"model":"codellama:13b-code","prompt":"<PRE> \n\n// Language: Javascript (javascript) \n// File uri: file:///Users/someguy/.vscode/extensions/rjmacarthy.twinny-2.6.13/out/extension.js (javascript) \n          getFileHeader(e, t) {\n            const n = a.languages[e];\n            return n\n              ? `\\n${n.comment?.start || \"\"} Language: ${n.name} (${e}) ${\n                  n.comment?.end || \"\"\n                }\\n${n.comment?.start || \"\"} File uri: ${t.toString()} (${e}) ${\n                  n.comment?.end || \"\"\n                }\\n`\n              : \"\";\n          }\n          calculateSimilarity(e, t) {\n            const n = e.split(\"/\"),\n              i = t.split(\"/\"),\n              o = n[n.length - 1],\n              s = i[i.length - 1];\n            return (\n              n.slice(0, -1).join(\"/\").score(i.slice(0, -1).join(\"/\"), 0.5) +\n              o.score(s, 0.5)\n            );\n          }\n          getFileContext(e) {\n            const t = [],\n              n = e.toString();\n            for (const e of i.workspace.textDocuments) {\n              if (\n

and here is the TLS-terminating proxy's response:

HTTP/1.1 400 Bad Request
Server: nginx
Date: Mon, 22 Jan 2024 17:03:06 GMT
Content-Type: text/html
Content-Length: 248
Connection: close
Strict-Transport-Security: max-age=63072000

<html>
<head><title>400 The plain HTTP request was sent to HTTPS port</title></head>
<body>
<center><h1>400 Bad Request</h1></center>
<center>The plain HTTP request was sent to HTTPS port</center>
<hr><center>nginx</center>
</body>
</html>

Looking at the extension code, I could see the extension examining a URL to see if it started with "http" or "https", but setting the URL to "https://my.domain.com" resulted in "invalid URL" errors from the extension:

2024-01-22 11:56:58.476 [error] ProxyResolver#addCertificatesV1 TypeError: Invalid URL
	at new NodeError (node:internal/errors:399:5)
	at Url.parse (node:url:425:17)
	at Object.urlParse (node:url:147:13)
	at useProxySettings (/Applications/Visual Studio Code.app/Contents/Resources/app/node_modules.asar/@vscode/proxy-agent/out/index.js:121:35)
	at /Applications/Visual Studio Code.app/Contents/Resources/app/node_modules.asar/@vscode/proxy-agent/out/index.js:113:13
	at /Applications/Visual Studio Code.app/Contents/Resources/app/node_modules.asar/@vscode/proxy-agent/out/index.js:459:13

Editing the extension code to always require('https') makes Twinny send the request over https.
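
For illustration, a protocol-aware transport selection that avoids hard-coding either module might look like the sketch below (not the extension's actual code; the example URL mirrors the capture above):

import * as http from "http"
import * as https from "https"

// Pick the transport from the configured URL so HTTPS endpoints actually get TLS.
function postJson(urlString: string, body: string) {
  const url = new URL(urlString)
  const transport = url.protocol === "https:" ? https : http
  const req = transport.request(
    {
      hostname: url.hostname,
      port: url.port || (url.protocol === "https:" ? 443 : 80),
      path: url.pathname,
      method: "POST",
      headers: { "Content-Type": "application/json" },
    },
    (res) => res.on("data", () => { /* handle streamed chunks */ })
  )
  req.end(body)
  return req
}

// e.g. postJson("https://my.domain.com:444/api/generate", JSON.stringify({ model: "codellama:13b-code", prompt: "..." }))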

Add contextual `when` to keyboard shortcuts

Is your feature request related to a problem? Please describe.
I'd like to reuse keyboard shortcuts that are already used for something else for twinny.
In my case,

  1. CTRL+SHIFT+Enter to cancel bot generating text.
  2. CTRL+SHIFT+Space to prompt the bot to try to guess what code comes next.

However, those keys are already assigned to other shortcuts that I also want to keep, and those shortcuts don't apply in the same contexts.

Describe the solution you'd like
I'd like the provided shortcuts to already contain context evaluation from the get-go so the shortcuts only apply when the context makes sense.
For example, the key that stops the bot can be limited to the context where the bot is generating a message (plus a 0.5-1 s grace period). See:
https://code.visualstudio.com/api/references/when-clause-contexts#add-a-custom-when-clause-context

The key that prompts the bot to trigger an inline suggestion can be active only when the user is focused on the editor (or in other similar contexts).
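
For reference, an extension can expose exactly this kind of custom when-clause context via the built-in setContext command; the context key below (twinnyGeneratingText) is hypothetical, not an existing Twinny context.

import * as vscode from "vscode"

// Flip a custom context key while text is being generated, so a keybinding
// can use "when": "twinnyGeneratingText" to apply only in that situation.
async function setGenerating(active: boolean) {
  await vscode.commands.executeCommand("setContext", "twinnyGeneratingText", active)
}

// Call setGenerating(true) when streaming starts and setGenerating(false) when it ends.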

Describe alternatives you've considered
Make the shortcuts more complex, such as a two-stage shortcut, to try to avoid the clash. That takes a long time if I keep wanting suggestions, just not on every key press.

Additional context

Thank you for this tool!

Completions not working with ollama via ollama webui

Problem

Unable to use Twinny (no completions) when setting up Ollama via Ollama WebUI, despite specifying a bearer token.
I wonder if, as mentioned here, the bearer token is not yet fully supported and a workaround is still needed?

(A screenshot of the configuration was attached for reference.)

vLLM Integration

Integrating twinny with the popular vLLM backend would allow a wide range of new use cases, especially in multi-user environments.

No tutorial available

Running the Ollama server and installing the plugin is not sufficient to start code completion. I am still not able to get it to work in VS Code. A tutorial would be helpful for getting started.

Use remote docker ollama server

Is your feature request related to a problem? Please describe.
I'm new to the Ollama scene. My ideal workflow involves running Ollama in a Docker container on a remote server and tunneling the Ollama port back to my laptop running VS Code. This doesn't seem to be supported by Twinny because it appears to want Ollama installed locally.

Describe the solution you'd like
Anything to make the above happen

Describe alternatives you've considered
I would like to learn more about how ollama and twinny work because I don't understand the reason for the current limitation.

Additional context
I'm comfortable writing a shim if I knew how and why Twinny is trying to call Ollama, so that I could convert those calls to the appropriate ssh+docker commands.

command enter doesn't work to get out of text box

Describe the bug / feature
When in the text box in the chat side pane, I have to click the button to have the model respond. Everywhere else, Cmd+Enter (or the corresponding shortcut on Windows) will automatically leave the box and accept the input.

It's kind of a bug... but maybe it's a new feature. Not sure.

Feature: Persist session and new session

When a chat session is active, moving away from twinny and then revisiting the tab should not clear the session. Instead, there should be a button to create a new session, and the previous chat session should be reinstated.

Settings: api url is confusing

Describe the bug
In the extension settings, there is a box for the API URL. I assume this used to be the full URL before the path and port settings were added. It looks like it is actually just the hostname.


Desktop (please complete the following information):

  • MacOS

<newbie-disclosure> I've installed twinny with VS code and have the chat interface working; however, the code autosuggest (fill in the middle) is not activating.

Describe the bug
Reiterating: This may be my misunderstanding rather than a bug.

I can get twinny to respond to my questions in the chat interface within VS Code; however, nothing triggers via autosuggest as I write code. I verified that I have the two prerequisite models that are supported, but I don't know how to ensure both are running, without collision, and responding to Twinny at its preconfigured URLs.

Questions:
a) Is it expected for me to have to explicitly run the ollama model(s) before Twinny initializes?

b) How do I spawn Ollama to serve both models at URLs without colliding so that Twinny can reach them both? I've checked the configuration, and it appears there is a root URL (http://localhost), a path (/api/generate), and a port (11434).

Perhaps expected: after terminating the terminal session that was running ollama run codellama, the Twinny chat stopped responding. I then ran both ollama run codellama:7b-code and ollama run codellama:7b-instruct concurrently, but that did not activate chat or autosuggest within VS Code. Shutting down VS Code didn't help either.

It appears the only feature I have is chat, and it only works if I first spawn ollama run codellama. Do I have it correct that doing so spawns the default codellama (latest) model rather than the fine-tuned models for chat and autosuggest?

To Reproduce
Steps to reproduce the behavior:

  1. Spawn VS Code with my sample project (or any project, for that matter).
  2. Spawn the Ollama model(s); e.g. ollama run codellama
  3. Click on the Twinny extension
  4. Type in a question: 'write an email regex matcher function for validation'
  5. Open a TypeScript file and type in the following: const adder = (op1: number, op2: number): number => op1 + op2;

Expected behavior
Along with the Twinny chat interface, I also expect autosuggest to activate and display code suggestions as I type in code.


Desktop (please complete the following information):

  • OS: OSX Sonoma 14.2.1 (23C71)
  • Browser [e.g. chrome, safari]: chrome but not relevant for this filing.
  • Version: Twinny version: v3.3.0
  • VSCode: version 1.85.2


Copy paste doesn't always work

Describe the bug

When I want to copy something from the message, not just the highlighted code block, it allows me to select it and click Copy, but when I paste it into the editor, the previous clipboard contents appear, or empty lines.

To Reproduce
Steps to reproduce the behavior:

  1. Go to the prompt and type "Say hello"
  2. Click on 'Send message'
  3. Select the response: "Hello!"
  4. Right click to show context menu
  5. Left click Copy
  6. Then paste it into your editor

Expected behavior
Would like to see the "Hello!" pasted in my editor


Desktop (please complete the following information):

  • OS: Windows 10
  • Browser Firefox
  • Version 122

Custom buttons with prompts

Is your feature request related to a problem? Please describe.
I started using the plugin for non-coding sections as well: rephrasing comments, cleaning up grammar in READMEs, and so on.
I have to manually type prompts like "Rephrase the following text" while having a comment or paragraph of text selected.

Describe the solution you'd like
Ability to add custom buttons alongside the Refactor, Explain ... buttons with my own custom prompt

Describe alternatives you've considered
Typing the prompt manually works so far, but having custom buttons for whatever use cases I find the plugin useful for would be amazing.

Open VSX Support

Hello and thanks for this beautiful repo.

Can you publish the extension on Open VSX too? I'm using VSCodium and it would be nice to get updates automatically.

Best,
Orkut

Really want to be able to specify the fim and chat templates.

Today I am stuck with using only CodeLlama, even though I find DeepSeek Coder to be much better, and infilling with the 1.6b parameter model is amazing. Could you add the ability to add custom templates, or do like the Llama Coder extension, which defines templates for codellama, deepseek coder, and one other?

Other than that one issue, Twinny is my favorite of the VS Code AI assistants. I posted this yesterday about Llama Coder and plan to do an update for Twinny. https://www.youtube.com/watch?v=fT-sUUq48Xk

Add support for multi-language files

Is your feature request related to a problem? Please describe.
Yes.

I'm trying to edit a PHP file that contains embedded HTML, which in turn contains some embedded JavaScript. Twinny's recommendations appear to assume I want PHP, and even that the code is PHP, when I'm editing the JavaScript portion.

Describe the solution you'd like
There are 2 separate and different difficulty level possibilities for this.

  1. Identify that it's a multi-language file type and allow the user to clarify (in the chat box, it could be the first word, for example). It could also be an easily accessible dropdown box.
  2. Identify the language at the cursor position and use that when prompting the bot.
    2.1. Doing this halfway might give so-so results... maybe.
    2.2. Doing this for real (but doing that well is a PITA and I don't recommend it).

Describe alternatives you've considered
I'm unable to find viable alternatives.

llama.cpp api

Is your feature request related to a problem? Please describe.
llama.cpp provides nice options to run local models in GGUF format using both the GPU and CPU at the same time.

Describe the solution you'd like
I want to use llama.cpp as a backend for twinny.

Describe alternatives you've considered
Ollama needs Docker, which is rather uncomfortable.

MacBook: No reliable generation, autocompletion only engages sometimes

Hi! I just set up twinny with my local Ollama on macOS, and it seems like autocompletion does not engage
often enough, and when it does, it suggests only unrelated edits. Maybe I did something wrong.

I've also tried another model, but it barely got better.

I noticed before, when I had 10 VS Code windows open, that it did not engage at all, so I had to close all of them. Maybe there is more to it.

macOS 14.2.1
M1 Pro
ollama version is 0.1.22

Allow setting base path on the server

I'm trying to use Twinny to access Ollama via Ollama WebUI. Ollama WebUI forwards any requests to /ollama to Ollama. So for this to work, a client needs to send requests to /ollama/api/generate.

However, there is no way to set the path in Twinny's settings.

Also, adding another setting that is used to build the URL might be too many settings. The "Use TLS", "hostname", "port", and "path" settings could all be specified by taking a base URL from the user:

https://some.server.io:8443/ollama

Then the extension could append /api/generate.

This can also be done in a reverse proxy by rewriting the URL but it would be nice not to have to.
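
A sketch of how a single base-URL setting could be expanded into the generate endpoint (illustrative only):

// Derive the full endpoint from one base URL setting, e.g. "https://some.server.io:8443/ollama".
function buildGenerateUrl(baseUrl: string): URL {
  const url = new URL(baseUrl)
  // Append the Ollama generate path to whatever prefix the user configured.
  url.pathname = `${url.pathname.replace(/\/$/, "")}/api/generate`
  return url
}

// buildGenerateUrl("https://some.server.io:8443/ollama").toString()
//   -> "https://some.server.io:8443/ollama/api/generate"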

Bug: Settings modified do not apply immediately

Expected Behavior

Any change in configurations in settings (either in JSON file or UI) is supposed to be applied immediately.

Actual Behavior

User will have to reload window or reactivate extension to make an apply.

My Suspicion

Probably because the configuration is stored in CompletionProvider when the extension is activated and is never updated afterwards.
To fix it (a sketch follows this list):

  • Make use of the workspace.onDidChangeConfiguration method.
  • Always get a fresh twinny configuration instance from the workspace when it is to be used.
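
A sketch of the second part of that fix, re-reading the configuration whenever it changes (illustrative, and it assumes the settings live under a "twinny" section):

import * as vscode from "vscode"

// Re-read the configuration whenever the user changes it, instead of caching it once at activation.
let config = vscode.workspace.getConfiguration("twinny")

export function activate(context: vscode.ExtensionContext) {
  context.subscriptions.push(
    vscode.workspace.onDidChangeConfiguration((event) => {
      if (event.affectsConfiguration("twinny")) {
        config = vscode.workspace.getConfiguration("twinny")
      }
    })
  )
}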

twinny in Windows

Ollama is not officially supported on Windows, but it works fine in WSL.
Even though Ollama is launched under WSL, Twinny does not recognize this and wants to install it on the Windows host.

Is it possible to check whether localhost:11434 is active and work with that,
or
to contact the WSL instance directly, if that is possible?
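
A sketch of what such a check could look like, probing the configured endpoint over HTTP instead of looking for a local install; the endpoint and timeout are assumptions.

// Probe the configured Ollama endpoint (works whether it runs natively, in WSL, or remotely).
async function ollamaReachable(baseUrl = "http://localhost:11434"): Promise<boolean> {
  try {
    // /api/tags is a cheap Ollama endpoint that lists installed models.
    const res = await fetch(`${baseUrl}/api/tags`, { signal: AbortSignal.timeout(2000) })
    return res.ok
  } catch {
    return false
  }
}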


Location of accept/copy buttons

I like to read the suggested code and then accept/copy it, so I find myself constantly scrolling back to find those buttons.

Can we move them below the suggested code, or alternatively have them in both locations? A feature flag in settings for the location(s) is also an option, but in that case an advanced settings area might start making sense.

C# (csharp) and Javascript JSX (javascriptreact) supported languages

Describe the bug
Unable to use with React and C#. Context menu items "Twinny - Explain, Refactor, Write Tests etc"

When using it with C#, I can select the context and chat about it, but the buttons and context menu items that use the templates do not work for these languages. I receive the following error in the Extension Host:

2024-02-05 08:43:57.151 [error] TypeError: Cannot read properties of undefined (reading 'langName')
    at e.ChatService.buildTemplatePrompt (/Users/sbeardsley/.vscode/extensions/rjmacarthy.twinny-3.3.5/out/extension.js:2:80876)
    at e.ChatService.streamTemplateCompletion (/Users/sbeardsley/.vscode/extensions/rjmacarthy.twinny-3.3.5/out/extension.js:2:82336)
    at /Users/sbeardsley/.vscode/extensions/rjmacarthy.twinny-3.3.5/out/extension.js:2:86378
    at Timeout._onTimeout (/Users/sbeardsley/.vscode/extensions/rjmacarthy.twinny-3.3.5/out/extension.js:2:111932)
    at listOnTimeout (node:internal/timers:569:17)
    at processTimers (node:internal/timers:512:7)

To Reproduce
Steps to reproduce the behavior:

  1. Open a project with C# files or Javascript JSX files
  2. Open the Extension Host output window
  3. Highlight a code block and right click
  4. Choose a Twinny - * menu option
  5. See error.

Expected behavior
I believe it should work exactly the same as a vanilla JS file.

Desktop (please complete the following information):

  • OS: macOS Sonoma 14.3

I submitted a PR, but I am not sure how to link the issue to the PR. #88

Add the option to configure the Chat with custom instructions.

Feature Ask

Currently, I find the LLM in the chat quite verbose. I would like to add custom pre-configured messages to change its writing style, e.g. 'be concise'. This could be either an extra button with a pop-up or a field on the settings page.

I pulled the repo locally, so I can make the change myself once approved :)

Support TLS

I have this extension working with plain HTTP, but I'm hosting Ollama in the cloud and I'd like to use TLS to communicate with it. Is there any way to do this?

Setting the hostname and port to my Ollama's TLS endpoint results in a 400 response, so I assume it's sending an HTTP request to an HTTPS port. LMK if I'm wrong.

Suggestion: Automatically publish new versions (tagged commits)

FYI: if you are OK with automation, you can set up a GitHub Action to publish to both Open VSX and the VS Code Marketplace.

Example to publish on each new tag:
https://github.com/badetitou/vscode-pharo/blob/main/.github/workflows/publish.yml
A more detailed example:
https://github.com/HaaLeo/vscode-timing/blob/master/.github/workflows/cicd.yml#L1
The instructions of the action are here:
https://github.com/HaaLeo/publish-vscode-extension

How to make a secret:
https://docs.github.com/en/actions/security-guides/using-secrets-in-github-actions#creating-secrets-for-a-repository

(credit to @badetitou for the idea #71 (comment))

Newlines sometimes absent from suggestions

Describe the bug

Sometimes Ollama returns a suggestion with newlines in it, but Twinny displays the suggestion with the newline stripped out.


The screenshot showed a YAML file with a pretty predictable pattern. Ollama correctly returned a response with a newline, then the text alias: dl, but in my editor the suggestion had lost the newline. As you can see from the existing entries in the YAML file, there should be a newline before the alias key.

Twinny's HTTP request, from Wireshark:

{"model":"codellama:13b-code","prompt":"<PRE> \n\n# Language: YAML (yaml) \n# File uri: file:///Users/someguy/src/provisioning/roles/fauxpilot/files/ops.yml (yaml) \nactions:\n  start: docker compose up -d --build\n  stop: docker compose down\n  restart: docker compose restart\n  bounce: ops stop && ops start\n  logs:\n    command: docker compose logs --tail 200 --follow\n    alias: l\n  status:\n    command: docker compose ps\n    alias: ps\n  start-with-logs:\n    command: ops start && ops logs\n    alias: sl\n  bounce-with-logs:\n    command: ops bounce && ops logs\n    alias: bl\n  down-with-logs:\n    command: ops stop && ops start && ops logs <SUF>\n  restart-with-logs:\n    command: ops restart 88 ops logs\n    alias: rl\n <MID>","stream":true,"n_predict":-1,"temperature":0.1,"options":{"temperature":0.1,"num_predict":-1}}

Ollama's response, from Wireshark:

6a
{"model":"codellama:13b-code","created_at":"2024-01-30T14:06:04.294959315Z","response":"\n","done":false}

6b
{"model":"codellama:13b-code","created_at":"2024-01-30T14:06:04.313395311Z","response":"   ","done":false}

6e
{"model":"codellama:13b-code","created_at":"2024-01-30T14:06:04.327937513Z","response":" alias","done":false}

69
{"model":"codellama:13b-code","created_at":"2024-01-30T14:06:04.343851477Z","response":":","done":false}

6a
{"model":"codellama:13b-code","created_at":"2024-01-30T14:06:04.359633046Z","response":" d","done":false}

69
{"model":"codellama:13b-code","created_at":"2024-01-30T14:06:04.377242264Z","response":"l","done":false}

78
{"model":"codellama:13b-code","created_at":"2024-01-30T14:06:04.391557372Z","response":" \u003cEOT\u003e","done":false}

Personally, I like just getting one line at a time back from the LLM. However, if a multiline response is returned but the newlines are stripped, the suggested completions won't be usable without manual tweaking.

To Reproduce

Steps to reproduce the behavior:

  1. Use Twinny until you get a completion that should have a newline
  2. Note the lack of a newline

Expected behavior

Suggested completion is shown with newlines.

Desktop (please complete the following information):

  • Version 3.0.6

Keys to focus on the chat prompt and send messages

Probably these keys (and commands) already exist and I just couldn't find them.
If not, I guess they should be introduced:

  • a key to focus the chat (the "show twinny" command just opens the sidebar but doesn't focus the chat input)
  • a key to send a message (I thought it would be the "standard" Cmd+Enter, but it's not)
  • a key to stop message generation

Ollama app not starting automatically

Hello,
thank you very much for this plugin! It works correctly in my case but I have a minor problem as mentioned in this other issue: ollama/ollama#2031 (comment)

Basically, I don't want Ollama to be running on my system all the time, so when I installed it I might have disabled "run in background" or "run at login", I can't recall which one. I've now re-enabled "run in background", but it didn't solve the problem I'm about to explain.

I've realized that when Ollama.app is not running and I launch VS Code, the Ollama.app icon bounces a couple of times in the Dock, meaning it's being launched (probably by this plugin), but I can't see anything in the menu bar.

The only way is to manually click on the Ollama.app icon, then everything works again (but I have to re-launch VS Code if it was already running).

My question: is this plugin trying to launch Ollama.app when launching/restarting VS Code? If so, why can't it start the server? Could it be an access rights problem?

If the Ollama server responds with status code 404, the response reads as if the AI is responding when the AI wasn't the writer

Describe the bug

I opened the chat window and sent a message. As expected (because I hadn't installed the model yet), the request didn't lead to any useful result.
However, twinny displays text in the UI that could plausibly be an AI response:

Sorry, I don’t understand. Please try again.

To Reproduce
Steps to reproduce the behavior:

  1. Install ollama anew and start serving.
  2. Install twinny in the editor.
  3. Open the sidebar
  4. Type something to the bot.
  5. Twinny answers Sorry, I don’t understand. Please try again.

Expected behavior
I'd expect a different icon that resembles a server, with the name "system" or something similar, for example:

🖥️ ollama
Server returned error: "model 'codellama:A7b-instruct' not found, try pulling it first"



Call the plugin on request only

Hello,
Right now it seems that, once activated, the plugin runs all the time, trying to complete the code as we press keys in VS Code.

This is what I'd want most of the time, but not always; sometimes the plugin just gets in the way by proposing its completions and I have to press Esc every time to cancel them.

Ideally I'd like to call the plugin when I need it by pressing a shortcut or calling a command.
I mean, I'd like to keep it disabled until I decide when it should be enabled, on my request.

Is this possible right now? If it isn't, it would be nice to have this feature, both to have more control and avoid annoying completions, and to limit CPU usage when the plugin isn't needed.
