
How to run it in VS Code? · llama-coder · OPEN · 32 comments

ex3ndr avatar ex3ndr commented on June 27, 2024 6
How to run it in VS Code?


Comments (32)

taoeffect avatar taoeffect commented on June 27, 2024 2

I'm also having problems. I can't figure out how to get it to complete anything (on macOS, M2).

I ran:

$ ollama pull codellama:7b-code-q4_K_M 

Restarted VSCodium (after installing the extension manually using the codium command).

The Output of the plugin looks like this:

2023-12-02 22:30:33.297 [info] Running AI completion...
2023-12-02 22:30:55.896 [info] Receive line: {"model":"codellama:7b-code-q4_K_M","created_at":"2023-12-03T06:30:55.8943Z","response":"\n","done":false}
2023-12-02 22:30:55.898 [info] AI completion completed: 
2023-12-02 22:30:55.898 [info] Canceled after AI completion.
2023-12-02 22:30:55.899 [info] Canceled before AI completion.
2023-12-02 22:30:55.901 [info] Canceled before AI completion.
2023-12-02 22:30:55.903 [info] Canceled before AI completion.
2023-12-02 22:30:55.904 [info] Canceled before AI completion.
2023-12-02 22:30:55.905 [info] Canceled before AI completion.
2023-12-02 22:30:55.906 [info] Canceled before AI completion.
2023-12-02 22:30:55.910 [info] Canceled before AI completion.
2023-12-02 22:30:55.911 [info] Canceled before AI completion.
2023-12-02 22:30:55.912 [info] Canceled before AI completion.
2023-12-02 22:30:55.913 [info] Canceled before AI completion.
2023-12-02 22:30:55.914 [info] Canceled before AI completion.
2023-12-02 22:30:55.923 [info] Running AI completion...
2023-12-02 22:30:55.960 [info] AI completion completed: 
2023-12-02 22:30:56.673 [info] Running AI completion...
2023-12-02 22:31:05.452 [info] Receive line: {"model":"codellama:7b-code-q4_K_M","created_at":"2023-12-03T06:31:05.451724Z","response":"\n","done":false}
2023-12-02 22:31:05.455 [info] AI completion completed: 
2023-12-02 22:31:05.455 [info] Canceled after AI completion.
2023-12-02 22:31:05.456 [info] Canceled before AI completion.
2023-12-02 22:31:05.458 [info] Canceled before AI completion.
2023-12-02 22:31:11.775 [info] Running AI completion...
2023-12-02 22:31:20.224 [info] Receive line: {"model":"codellama:7b-code-q4_K_M","created_at":"2023-12-03T06:31:20.223826Z","response":"\n","done":false}
2023-12-02 22:31:20.227 [info] AI completion completed: 
2023-12-02 22:31:20.227 [info] Canceled after AI completion.
2023-12-02 22:31:20.228 [info] Canceled before AI completion.
2023-12-02 22:31:20.228 [info] Canceled before AI completion.
2023-12-02 22:31:20.230 [info] Canceled before AI completion.
2023-12-02 22:31:20.231 [info] Canceled before AI completion.
2023-12-02 22:31:20.233 [info] Canceled before AI completion.

Halp?
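
One way to rule out the model itself is to call the Ollama API directly with the same tag and see whether it returns more than a newline. This is only a rough sketch; the extension builds a different, fill-in-the-middle style prompt, so the output will not match what the plugin would show:

# ask the local Ollama server for a non-streamed completion with the tag from the log above
curl -s http://127.0.0.1:11434/api/generate \
  -d '{"model": "codellama:7b-code-q4_K_M", "prompt": "def fibonacci(n):", "stream": false}'
# the reply is JSON; a non-empty "response" field means the model generates fine and the
# problem is on the extension side, while an empty one points at the model or prompt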


taoeffect avatar taoeffect commented on June 27, 2024 1

> It looks good; codellama just returns a line break.

The question is, how do I get it to do anything else? 😅


seriouscoderone avatar seriouscoderone commented on June 27, 2024

I think you need to add a comment.

Hi.

I installed it locally on my M1 and it works in the CLI. When I click on Llama Coder in the top right corner (status bar) of VS Code, it does nothing. Sorry for the question; maybe it's too obvious.


ex3ndr avatar ex3ndr commented on June 27, 2024

In the current version I have disabled autocomplete for empty lines, but this was a mistake. Also keep in mind that sometimes the neural network will recommend nothing.


batuco9 avatar batuco9 commented on June 27, 2024

Could it somehow pull the base models from https://huggingface.co/deepseek-ai? Would that be useful?


stratus-ss avatar stratus-ss commented on June 27, 2024

I'm also not clear on how this plugin is supposed to work.

For me, ollama is running on a remote host with 48 GB of RAM on the video card. The ollama systemd unit is running and I can see that the port is open:

stratus@stratus-desktop ~  $ > /dev/tcp/192.168.99.37/11434
stratus@stratus-desktop ~  $ > /dev/tcp/192.168.99.37/11435
bash: connect: Connection refused
bash: /dev/tcp/192.168.99.37/11435: Connection refused

The above is just to demonstrate that a closed port refuses the connection.

Am I supposed to do

ollama run codellama:34b-code-q6_K

When I try this, the model downloads and I get an interactive prompt. I set both the user and workspace settings in the plugin to the following:

[screenshot: Llama Coder user and workspace settings]

However, it's not quite clear how I should know whether Llama Coder is working.
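
A quick way to confirm the workstation can reach the remote Ollama and get a completion end to end, using the host from the /dev/tcp test above (only a sketch; the plugin sends its own prompt format, so this just proves connectivity and generation):

# list the models the remote Ollama has
curl -s http://192.168.99.37:11434/api/tags
# request a short, non-streamed completion from the model set in the plugin settings
curl -s http://192.168.99.37:11434/api/generate \
  -d '{"model": "codellama:34b-code-q6_K", "prompt": "# print hello world in python\n", "stream": false}'

If both calls work from the workstation but the extension stays silent, the extension's Output channel is the next place to look.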


ex3ndr avatar ex3ndr commented on June 27, 2024

You don't need to download it manually; the plugin will download it and show a download indicator if it is missing. The DeepSeek model is discussed in #2.


stratus-ss avatar stratus-ss commented on June 27, 2024

Thanks for responding. It's still not clear what to expect. It didn't seem to do anything when I traced the network calls on the port. Is there any way to debug the VS Code plugin?


ex3ndr avatar ex3ndr commented on June 27, 2024

You can try opening the plugin's Output window; it is named "Llama Coder".


stratus-ss avatar stratus-ss commented on June 27, 2024

So I am not exactly sure how to line up my expectations with what this plugin provides.

For example, I tried typing comments, docstrings, and a simple for-loop in Python.

I tried this with and without ollama running on the remote host, and all I see in the output window for Llama Coder is:

2023-11-24 19:41:28.460 [info] Llama Coder is activated.
2023-11-24 19:43:10.484 [info] No inline completion required
2023-11-24 19:43:17.462 [info] No inline completion required
2023-11-24 19:43:27.681 [info] No inline completion required
2023-11-24 19:43:54.531 [info] No inline completion required

I would have expected some messages about ollama being offline. Could someone provide a known working example of what I should expect from the plugin?

Thanks for your efforts! Really appreciated


ex3ndr avatar ex3ndr commented on June 27, 2024

I recommend trying the fresh version with a smallish model; you would probably see a difference.


stratus-ss avatar stratus-ss commented on June 27, 2024

I wanted to just touch base again about this issue.

I couldn't get the Llama Coder plugin to work remotely, so I used the VS Code SSH extension to connect to and code on the box that I am running ollama on. When I did this, I do see activity in the output tab:

2023-11-29 19:09:33.330 [info] [DOWNLOAD] {"status":"verifying sha256 digest"}
2023-11-29 19:09:50.121 [info] [DOWNLOAD] {"status":"writing manifest"}
2023-11-29 19:09:50.121 [info] [DOWNLOAD] {"status":"removing any unused layers"}
2023-11-29 19:09:50.121 [info] [DOWNLOAD] {"status":"success"}
2023-11-29 19:09:50.123 [info] Canceled after AI completion.
2023-11-29 19:09:50.124 [info] Canceled before AI completion.
2023-11-29 19:10:20.019 [info] Running AI completion...

Ollama isn't using my GPUs for some reason, but that isn't llama-coder's issue.


erf avatar erf commented on June 27, 2024

I installed codellama:7b-code-q4_K_M manually just now, but no autocomplete seems to occur when typing (on macOS).


thawkins avatar thawkins commented on June 27, 2024

I'm having the same problem with remote installs. I have a remote server with 64 GB and want to use that; ollama is installed and active via systemd, I have set the bindings properly, and the firewall and ports are all correct.

My ollama install is on the machine "server01.local" on the default port.

Using the API to pull the list of installed models from the machine with VS Code on it gives me:

$ curl http://server01.local:11434/api/tags | jq
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  1399  100  1399    0     0   2418      0 --:--:-- --:--:-- --:--:--  2416
{
"models": [
{
"name": "codellama:13b",
"modified_at": "2023-12-03T01:08:36.168957228+07:00",
"size": 7365960935,
"digest": "9f438cb9cd581fc025612d27f7c1a6669ff83a8bb0ed86c94fcf4c5440555697"
},
{
"name": "codellama:latest",
"modified_at": "2023-12-03T01:02:39.406375348+07:00",
"size": 3825910662,
"digest": "8fdf8f752f6e80de33e82f381aba784c025982752cd1ae9377add66449d2225f"
},
{
"name": "deepseek-coder:33b",
"modified_at": "2023-12-03T00:07:28.625096533+07:00",
"size": 18819455804,
"digest": "2941d6ab92f3165c82487c1022dc07a86f52cfb21258c693e511c5bedf0fb2b1"
},
{
"name": "deepseek-coder:6.7b",
"modified_at": "2023-12-03T00:17:31.082375219+07:00",
"size": 3827833882,
"digest": "72be2442d736e1f3c33b23c1c633d638ff07b445e941d2e343580fb839da72c0"
},
{
"name": "deepseek-coder:latest",
"modified_at": "2023-12-02T23:27:10.73776562+07:00",
"size": 776080218,
"digest": "140a485970a6bbe497984a305bb2c30d25da1d8bf56b688f0aeafd1fbebd11ab"
},
{
"name": "falcon:40b",
"modified_at": "2023-12-02T21:46:51.407166762+07:00",
"size": 23808463019,
"digest": "bc9368437a24284c4dc3b9e3813d21162639ced55fc81a2830e39c17070f803a"
},
{
"name": "falcon:latest",
"modified_at": "2023-12-02T21:33:00.764327582+07:00",
"size": 4210994570,
"digest": "4280f7257e73108cddb43de89eb9fa28350a21aaaf997b5935719f9de0281563"
},
{
"name": "llama2:latest",
"modified_at": "2023-12-02T22:35:54.683311517+07:00",
"size": 3825819519,
"digest": "fe938a131f40e6f6d40083c9f0f430a515233eb2edaa6d72eb85c50d64f2300e"
}
]
}

I have set that host URL in the settings, but nothing seems to appear.

The only thing I saw in the Llama Coder output window was:

2023-12-02 17:19:15.879 [info] Llama Coder is activated.
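
One thing worth double-checking (a sketch based on assumptions, not on the extension's documentation, so verify the exact setting key in the extension's settings UI) is that the endpoint value includes the scheme and port, and that generation works from the VS Code machine against one of the tags listed above:

# from the machine running VS Code; codellama:13b is taken from the /api/tags output above
curl -s http://server01.local:11434/api/generate \
  -d '{"model": "codellama:13b", "prompt": "# fizzbuzz in python\n", "stream": false}'
# a JSON reply with a filled "response" field proves the endpoint and model work; note the
# plugin normally wants a *-code / base variant for infill, so this only checks the plumbing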


wrapss avatar wrapss commented on June 27, 2024

@thawkins what does the llama coder output return when you write to an existing file?


iLoveBug avatar iLoveBug commented on June 27, 2024

I also have a problem with how to run it in VS Code. Ollama is installed and works locally, but when I click "Llama Coder" in the bottom right corner (status bar) of VS Code, it does nothing.


wrapss avatar wrapss commented on June 27, 2024

It looks good; codellama just returns a line break.


ex3ndr avatar ex3ndr commented on June 27, 2024

I would recommend trying the deepseek models; they are superior!


thawkins avatar thawkins commented on June 27, 2024

DeepSeek has just released a 67B-parameter version; it will be a little while before it's available on ollama.


ex3ndr avatar ex3ndr commented on June 27, 2024

I have changed the default model to the deepseek 1.3B one with q4 quantization, which takes only ~700 MB of VRAM/RAM. It works faster and provides auto-completion more often.
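
If you would rather pre-pull it than wait for the download indicator, the tag that shows up in the logs later in this thread can be fetched directly (a sketch; confirm the exact default tag in the extension settings):

# pull the small deepseek base model seen in the logs below and confirm it is present
ollama pull deepseek-coder:1.3b-base-q4_1
ollama list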


taoeffect avatar taoeffect commented on June 27, 2024

I tried with deepseek, but it still doesn't show any suggestions in the UI. I tried pressing the Tab key and Ctrl+Space; neither worked. This is on macOS M2 using VSCodium.


ex3ndr avatar ex3ndr commented on June 27, 2024

It shows up automatically as grey text; it is not Tab or Ctrl+Space, you just type. Can you show a snippet where you are trying to do so?


taoeffect avatar taoeffect commented on June 27, 2024

OK, it seems to be sometimes working... maybe I'm just not understanding how and when it works.

For example, here it won't complete (in a .vue component):

  mounted () {
    this.pushNotificationGranted = 
  },

I feel like the tool could use a "manual autocomplete" mode where the user forces it to generate a completion. Maybe I'm just using it wrong.


ex3ndr avatar ex3ndr commented on June 27, 2024

It didn't work after a simple equals sign, but after any other symbol I got a suggestion:
[screenshot: grey-text suggestion appearing after the extra character]


PtradeLLC avatar PtradeLLC commented on June 27, 2024

Same here. Been staring at VS Code for the past half hr wondering how to use this. What good is this if you don't know/can't use it?


wrapss avatar wrapss commented on June 27, 2024

> Same here. Been staring at VS Code for the past half hr wondering how to use this. What good is this if you don't know/can't use it?

It works perfectly here. So before you say anything: yes, we know how it works, but you don't seem to know how to make a bug report. No errors, no screenshots, no walkthrough.


stratus-ss avatar stratus-ss commented on June 27, 2024

I assume the language doesn't matter. I have been having some problems with Python (specifically Gradio).

Let me refine this statement: I will try changing to deepseek; it's possible that it's an issue with the LLM and not the plugin.


stratus-ss avatar stratus-ss commented on June 27, 2024

So here is a code snippet that didn't return anything for me

if is_not_member(user, members):
    logging.info(f"Adding {user} as an owner of --> {org['name']} <--")
    response = yaml.load(quay_server_api.create_org_member(org_name=org['name'], new_member=user, team_name="owners").content, Loader=yaml.FullLoader)
    # check the response and if it has the key of 'error_message' use logging.debug to show the user
    if response['error_message']:

The output window shows this

2023-12-08 19:53:31.146 [info] AI completion completed: 
2023-12-08 19:53:31.146 [info] Canceled after AI completion.
2023-12-08 19:53:31.148 [info] Canceled before AI completion.
2023-12-08 19:53:31.268 [info] Running AI completion...
2023-12-08 19:53:31.516 [info] Receive line: {"model":"deepseek-coder:1.3b-base-q4_1","created_at":"2023-12-09T01:53:31.508277645Z","response":"","done":true,"total_duration":246028369,"load_duration":654242,"prompt_eval_count":179,"prompt_eval_duration":196113000,"eval_count":1}
2023-12-08 19:53:31.516 [info] AI completion completed: 
2023-12-08 19:53:31.516 [info] Canceled after AI completion.
2023-12-08 19:53:31.521 [info] Running AI completion...
2023-12-08 19:53:31.768 [info] Receive line: {"model":"deepseek-coder:1.3b-base-q4_1","created_at":"2023-12-09T01:53:31.760166256Z","response":"","done":true,"total_duration":244460612,"load_duration":706569,"prompt_eval_count":177,"prompt_eval_duration":193939000,"eval_count":1}
2023-12-08 19:53:31.768 [info] AI completion completed: 
2023-12-08 19:53:35.825 [info] Running AI completion...
2023-12-08 19:53:36.074 [info] Receive line: {"model":"deepseek-coder:1.3b-base-q4_1","created_at":"2023-12-09T01:53:36.066184699Z","response":"","done":true,"total_duration":246527613,"load_duration":865777,"prompt_eval_count":177,"prompt_eval_duration":196048000,"eval_count":1}
2023-12-08 19:53:36.074 [info] AI completion completed: 

The ollama service on the remote host generated the following log as I typed if response['error_message']:

Dec 08 19:56:13 gpt-gpu.stratus.lab ollama[1066]: [GIN] 2023/12/08 - 19:56:13 | 200 |     453.235µs |   192.168.99.20 | GET      "/api/tags"
Dec 08 19:56:13 gpt-gpu.stratus.lab ollama[1808]: {"timestamp":1702086973,"level":"INFO","function":"log_server_request","line":1233,"message":"request","remote_addr":"127.0.0.1","remote_port":58738,"status":200,"method":"HEAD","path":"/","params":{}}
Dec 08 19:56:14 gpt-gpu.stratus.lab ollama[1066]: llama_print_timings:        load time =    1379.13 ms
Dec 08 19:56:14 gpt-gpu.stratus.lab ollama[1066]: llama_print_timings:      sample time =       0.54 ms /     1 runs   (    0.54 ms per token,  1848.43 tokens per second)
Dec 08 19:56:14 gpt-gpu.stratus.lab ollama[1066]: llama_print_timings: prompt eval time =     252.75 ms /   178 tokens (    1.42 ms per token,   704.24 tokens per second)
Dec 08 19:56:14 gpt-gpu.stratus.lab ollama[1066]: llama_print_timings:        eval time =       0.00 ms /     1 runs   (    0.00 ms per token,      inf tokens per second)
Dec 08 19:56:14 gpt-gpu.stratus.lab ollama[1808]: {"timestamp":1702086974,"level":"INFO","function":"log_server_request","line":1233,"message":"request","remote_addr":"127.0.0.1","remote_port":58738,"status":200,"method":"POST","path":"/completion","params":{}}
Dec 08 19:56:14 gpt-gpu.stratus.lab ollama[1066]: llama_print_timings:       total time =     280.78 ms
Dec 08 19:56:14 gpt-gpu.stratus.lab ollama[1808]: {"timestamp":1702086974,"level":"INFO","function":"log_server_request","line":1233,"message":"request","remote_addr":"127.0.0.1","remote_port":58748,"status":200,"method":"POST","path":"/tokenize","params":{}}
Dec 08 19:56:14 gpt-gpu.stratus.lab ollama[1066]: [GIN] 2023/12/08 - 19:56:14 | 200 |  303.636975ms |   192.168.99.20 | POST     "/api/generate"
Dec 08 19:56:14 gpt-gpu.stratus.lab ollama[1066]: [GIN] 2023/12/08 - 19:56:14 | 200 |     920.277µs |   192.168.99.20 | GET      "/api/tags"
Dec 08 19:56:14 gpt-gpu.stratus.lab ollama[1808]: {"timestamp":1702086974,"level":"INFO","function":"log_server_request","line":1233,"message":"request","remote_addr":"127.0.0.1","remote_port":58748,"status":200,"method":"HEAD","path":"/","params":{}}
Dec 08 19:56:14 gpt-gpu.stratus.lab ollama[1066]: llama_print_timings:        load time =    1379.13 ms
Dec 08 19:56:14 gpt-gpu.stratus.lab ollama[1066]: llama_print_timings:      sample time =       0.54 ms /     1 runs   (    0.54 ms per token,  1848.43 tokens per second)
Dec 08 19:56:14 gpt-gpu.stratus.lab ollama[1066]: llama_print_timings: prompt eval time =     192.16 ms /   178 tokens (    1.08 ms per token,   926.34 tokens per second)
Dec 08 19:56:14 gpt-gpu.stratus.lab ollama[1066]: llama_print_timings:        eval time =       0.00 ms /     1 runs   (    0.00 ms per token,      inf tokens per second)
Dec 08 19:56:14 gpt-gpu.stratus.lab ollama[1066]: llama_print_timings:       total time =     219.87 ms
Dec 08 19:56:14 gpt-gpu.stratus.lab ollama[1808]: {"timestamp":1702086974,"level":"INFO","function":"log_server_request","line":1233,"message":"request","remote_addr":"127.0.0.1","remote_port":58748,"status":200,"method":"POST","path":"/completion","params":{}}
Dec 08 19:56:14 gpt-gpu.stratus.lab ollama[1808]: {"timestamp":1702086974,"level":"INFO","function":"log_server_request","line":1233,"message":"request","remote_addr":"127.0.0.1","remote_port":58758,"status":200,"method":"POST","path":"/tokenize","params":{}}
Dec 08 19:56:14 gpt-gpu.stratus.lab ollama[1066]: [GIN] 2023/12/08 - 19:56:14 | 200 |  242.418587ms |   192.168.99.20 | POST     "/api/generate"
Dec 08 19:56:16 gpt-gpu.stratus.lab ollama[1066]: [GIN] 2023/12/08 - 19:56:16 | 200 |     811.308µs |   192.168.99.20 | GET      "/api/tags"
Dec 08 19:56:16 gpt-gpu.stratus.lab ollama[1808]: {"timestamp":1702086976,"level":"INFO","function":"log_server_request","line":1233,"message":"request","remote_addr":"127.0.0.1","remote_port":58758,"status":200,"method":"HEAD","path":"/","params":{}}
Dec 08 19:56:16 gpt-gpu.stratus.lab ollama[1066]: llama_print_timings:        load time =    1379.13 ms
Dec 08 19:56:16 gpt-gpu.stratus.lab ollama[1066]: llama_print_timings:      sample time =       0.54 ms /     1 runs   (    0.54 ms per token,  1848.43 tokens per second)
Dec 08 19:56:16 gpt-gpu.stratus.lab ollama[1808]: {"timestamp":1702086976,"level":"INFO","function":"log_server_request","line":1233,"message":"request","remote_addr":"127.0.0.1","remote_port":58758,"status":200,"method":"POST","path":"/completion","params":{}}
Dec 08 19:56:16 gpt-gpu.stratus.lab ollama[1066]: llama_print_timings: prompt eval time =     196.29 ms /   182 tokens (    1.08 ms per token,   927.20 tokens per second)
Dec 08 19:56:16 gpt-gpu.stratus.lab ollama[1066]: llama_print_timings:        eval time =       0.00 ms /     1 runs   (    0.00 ms per token,      inf tokens per second)
Dec 08 19:56:16 gpt-gpu.stratus.lab ollama[1066]: llama_print_timings:       total time =     223.89 ms
Dec 08 19:56:17 gpt-gpu.stratus.lab ollama[1808]: {"timestamp":1702086977,"level":"INFO","function":"log_server_request","line":1233,"message":"request","remote_addr":"127.0.0.1","remote_port":58762,"status":200,"method":"POST","path":"/tokenize","params":{}}
Dec 08 19:56:17 gpt-gpu.stratus.lab ollama[1066]: [GIN] 2023/12/08 - 19:56:17 | 200 |  246.343976ms |   192.168.99.20 | POST     "/api/generate"
Dec 08 19:56:18 gpt-gpu.stratus.lab ollama[1066]: [GIN] 2023/12/08 - 19:56:18 | 200 |     296.526µs |   192.168.99.20 | GET      "/api/tags"
Dec 08 19:56:18 gpt-gpu.stratus.lab ollama[1808]: {"timestamp":1702086978,"level":"INFO","function":"log_server_request","line":1233,"message":"request","remote_addr":"127.0.0.1","remote_port":58762,"status":200,"method":"HEAD","path":"/","params":{}}
Dec 08 19:56:18 gpt-gpu.stratus.lab ollama[1066]: [GIN] 2023/12/08 - 19:56:18 | 200 |  232.884092ms |   192.168.99.20 | POST     "/api/generate"
Dec 08 19:56:18 gpt-gpu.stratus.lab ollama[1066]: llama_print_timings:        load time =    1379.13 ms
Dec 08 19:56:18 gpt-gpu.stratus.lab ollama[1066]: llama_print_timings:      sample time =       1.10 ms /     2 runs   (    0.55 ms per token,  1814.88 tokens per second)
Dec 08 19:56:18 gpt-gpu.stratus.lab ollama[1066]: llama_print_timings: prompt eval time =     200.44 ms /   182 tokens (    1.10 ms per token,   907.98 tokens per second)
Dec 08 19:56:18 gpt-gpu.stratus.lab ollama[1066]: llama_print_timings:        eval time =      31.04 ms /     1 runs   (   31.04 ms per token,    32.22 tokens per second)
Dec 08 19:56:18 gpt-gpu.stratus.lab ollama[1066]: llama_print_timings:       total time =     259.93 ms
Dec 08 19:56:18 gpt-gpu.stratus.lab ollama[1808]: {"timestamp":1702086978,"level":"INFO","function":"log_server_request","line":1233,"message":"request","remote_addr":"127.0.0.1","remote_port":58762,"status":200,"method":"POST","path":"/completion","params":{}}

I have verified that the model downloaded. I am not including the output window of this because I think that's largely pointless.

NVTOP shows the model is loaded into video RAM:

Device 0 [Quadro P6000]  PCIe GEN 1@16x  RX: 0.000 KiB/s  TX: 0.000 KiB/s  GPU 139MHz  MEM 405MHz  TEMP 36°C  FAN 26%  POW 17 / 250 W  GPU[ 0%]  MEM[|||||  4.519Gi/24.000Gi]
Device 1 [Quadro P6000]  PCIe GEN 1@16x  RX: 0.000 KiB/s  TX: 0.000 KiB/s  GPU 139MHz  MEM 405MHz  TEMP 31°C  FAN 26%  POW  9 / 250 W  GPU[ 0%]  MEM[|      0.929Gi/24.000Gi]

I have installed this on the remote host using the VS Code Remote SSH capability. The remote host is running RHEL 9.

These are the plugin settings for the RHEL 9 remote host:

[screenshot: Llama Coder settings on the RHEL 9 remote host]

If I attempt the same process (i.e. installing the plugin on the remote host), but the remote host is a Raspberry Pi 4 running Raspberry Pi OS, I do see things happen from simply typing if.

Output window:

2023-12-09 02:01:34.985 [info] Llama Coder is activated.
2023-12-09 02:02:30.326 [info] Canceled after AI completion.
2023-12-09 02:02:30.367 [info] Running AI completion...
2023-12-09 02:02:33.869 [info] Receive line: {"model":"deepseek-coder:1.3b-base-q4_1","created_at":"2023-12-09T02:02:33.901714836Z","response":" Utility","done":false}
2023-12-09 02:02:33.902 [info] Receive line: {"model":"deepseek-coder:1.3b-base-q4_1","created_at":"2023-12-09T02:02:33.936255334Z","response":".","done":false}
2023-12-09 02:02:33.921 [info] AI completion completed:  Utility

Ollama logs:

Dec 08 20:02:30 gpt-gpu.stratus.lab ollama[1066]: [GIN] 2023/12/08 - 20:02:30 | 200 |     342.691µs |  192.168.99.244 | GET      "/api/tags"
Dec 08 20:02:30 gpt-gpu.stratus.lab ollama[1066]: [GIN] 2023/12/08 - 20:02:30 | 200 |     281.494µs |  192.168.99.244 | GET      "/api/tags"
Dec 08 20:02:30 gpt-gpu.stratus.lab ollama[1808]: {"timestamp":1702087350,"level":"INFO","function":"log_server_request","line":1233,"message":"request","remote_addr":"127.0.0.1","remote_port":48490,"status":200,"method":"HEAD","path":"/","params":{}}
Dec 08 20:02:33 gpt-gpu.stratus.lab ollama[1066]: [GIN] 2023/12/08 - 20:02:33 | 200 |  3.536628869s |  192.168.99.244 | POST     "/api/generate"
Dec 08 20:02:33 gpt-gpu.stratus.lab ollama[1066]: llama_print_timings:        load time =    1379.13 ms
Dec 08 20:02:33 gpt-gpu.stratus.lab ollama[1066]: llama_print_timings:      sample time =       2.14 ms /     4 runs   (    0.54 ms per token,  1866.54 tokens per second)
Dec 08 20:02:33 gpt-gpu.stratus.lab ollama[1066]: llama_print_timings: prompt eval time =    3449.16 ms /  5385 tokens (    0.64 ms per token,  1561.25 tokens per second)
Dec 08 20:02:33 gpt-gpu.stratus.lab ollama[1066]: llama_print_timings:        eval time =      69.55 ms /     3 runs   (   23.18 ms per token,    43.13 tokens per second)
Dec 08 20:02:33 gpt-gpu.stratus.lab ollama[1066]: llama_print_timings:       total time =    3549.24 ms
Dec 08 20:02:33 gpt-gpu.stratus.lab ollama[1808]: {"timestamp":1702087353,"level":"INFO","function":"log_server_request","line":1233,"message":"request","remote_addr":"127.0.0.1","remote_port":48490,"status":200,"method":"POST","path":"/completion","params":{}}

These are the plugin settings for the Raspberry Pi remote host:

[screenshot: Llama Coder settings on the Raspberry Pi remote host]

What other information can I provide that would be useful?
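
To separate the model's behaviour from the plugin's prompting, it may help to replay a raw generate call against the same host and see whether eval_count still comes back as 1 (only a sketch: the extension sends a fill-in-the-middle prompt with stop tokens, so this plain prompt is not a faithful reproduction of its request):

# the address is the remote ollama host from earlier in this thread
curl -s http://192.168.99.37:11434/api/generate \
  -d '{"model": "deepseek-coder:1.3b-base-q4_1", "prompt": "if response[\"error_message\"]:\n    ", "stream": false}'
# compare "eval_count" in the reply with the eval_count of 1 seen in the extension log above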


gerroon avatar gerroon commented on June 27, 2024

I do not get how I am supposed to invoke this :( I have ollama with codellama:13b-code-q4_K_M running on my local machine. I can use ollama locally and ask questions. The VS Code extension does not seem to be doing or invoking anything. Is there a shortcut to invoke it?

It creates activity, so it is connected, but there are no results:

[GIN] 2023/12/12 - 21:28:49 | 200 |  1.717110153s |       127.0.0.1 | POST     "/api/generate"
[GIN] 2023/12/12 - 21:28:49 | 200 |     410.641µs |       127.0.0.1 | GET      "/api/tags"
{"timestamp":1702438129,"level":"INFO","function":"log_server_request","line":2599,"message":"request","remote_addr":"127.0.0.1","remote_port":44842,"status":200,"method":"HEAD","path":"/","params":{}}
{"timestamp":1702438129,"level":"INFO","function":"log_server_request","line":2599,"message":"request","remote_addr":"127.0.0.1","remote_port":44832,"status":200,"method":"POST","path":"/completion","params":{}}
[GIN] 2023/12/12 - 21:28:50 | 200 |     405.054µs |       127.0.0.1 | GET      "/api/tags"
{"timestamp":1702438169,"level":"INFO","function":"log_server_request","line":2599,"message":"request","remote_addr":"127.0.0.1","remote_port":44842,"status":200,"method":"POST","path":"/completion","params":{}}
[GIN] 2023/12/12 - 21:29:29 | 200 | 40.303350313s |       127.0.0.1 | POST     "/api/generate"
{"timestamp":1702438169,"level":"INFO","function":"log_server_request","line":2599,"message":"request","remote_addr":"127.0.0.1","remote_port":54244,"status":200,"method":"HEAD","path":"/","params":{}}
[GIN] 2023/12/12 - 21:29:31 | 200 | 40.278685691s |       127.0.0.1 | POST     "/api/generate"
{"timestamp":1702438171,"level":"INFO","function":"log_server_request","line":2599,"message":"request","remote_addr":"127.0.0.1","remote_port":54244,"status":200,"method":"POST","path":"/completion","params":{}}


MikeyBeez avatar MikeyBeez commented on June 27, 2024

VSCode downloaded the model but the extension does nothing.


yekm avatar yekm commented on June 27, 2024

Similar issue here. Just installed, configured the remote host; nothing changed, nothing happens.
The plugin output contains one line: [info] Llama Coder is activated.

Update: after enabling editor.inlineSuggest.enabled everything works fine.
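
For anyone hitting the same silent behaviour, a quick way to check that setting from a shell (a sketch assuming the default user-settings path on Linux; on macOS the file lives under ~/Library/Application Support/Code/User/):

grep -n "editor.inlineSuggest.enabled" ~/.config/Code/User/settings.json
# no match means the default (enabled) applies; an explicit "false" here would suppress
# the grey ghost-text completions that Llama Coder relies on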


itamark-targa avatar itamark-targa commented on June 27, 2024

I'm trying to use it in VS Code connected to WSL; ollama itself is running on a remote machine and I port-forward 11434 to local, but nothing works. In the extension output I see:

2024-01-30 22:20:16.977 [info] Llama Coder is activated.
2024-01-30 22:20:22.789 [warning] Error during inference: fetch failed
2024-01-30 22:20:25.271 [warning] Error during inference: fetch failed
2024-01-30 22:20:26.168 [warning] Error during inference: fetch failed
2024-01-30 22:20:26.663 [warning] Error during inference: fetch failed

Note that if I open a browser at localhost:11434 I do get an "Ollama is running" response.
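
One thing to note: with WSL2's default (NAT) networking, localhost inside the WSL distro is the WSL VM itself, so a port forwarded only on the Windows side may not be reachable from the extension host running in WSL, which would explain the fetch failed errors. A quick check from a WSL shell (a sketch; the forwarding setup is assumed from the comment above):

# run inside the WSL distro that VS Code is connected to
curl -s http://localhost:11434/api/tags
# if this fails while the Windows browser succeeds, forward the port inside WSL as well,
# or point the extension's endpoint at an address that is reachable from WSL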

