
lmstudio-bug-tracker's Introduction

lmstudio-bug-tracker

Bug tracking for the LM Studio desktop application


lmstudio-bug-tracker's Issues

Please help: unable to run a model even with sufficient memory.

{
  "cause": "(Exit code: 0). Some model operation failed. Try a different model and/or config.",
  "suggestion": "",
  "data": {
    "memory": {
      "ram_capacity": "63.71 GB",
      "ram_unused": "54.15 GB"
    },
    "gpu": {
      "gpu_names": [
        "NVIDIA GeForce RTX 4060 Laptop GPU"
      ],
      "vram_recommended_capacity": "8.00 GB",
      "vram_unused": "6.93 GB"
    },
    "os": {
      "platform": "win32",
      "version": "10.0.22631"
    },
    "app": {
      "version": "0.2.26",
      "downloadsDir": "C:\\Users\\genco\\.cache\\lm-studio\\models"
    },
    "model": {}
  },
  "title": "Error loading model."
}

LMStudio App "Cors=true" button broken

Looks like you pushed an update today! Is there a way to turn off autoupdates?

[2024-05-07 10:32:53.913] [INFO] [LM STUDIO SERVER] Stopping server..
[2024-05-07 10:32:53.914] [INFO] [LM STUDIO SERVER] Server stopped
[2024-05-07 10:33:03.858] [INFO] [LM STUDIO SERVER] Verbose server logs are ENABLED
[2024-05-07 10:33:03.858] [INFO] [LM STUDIO SERVER] Heads up: you've enabled CORS. Make sure you understand the implications
[2024-05-07 10:33:03.913] [INFO] [LM STUDIO SERVER] Stopping server..

The server logs the CORS warning and then shuts down; the warning appears to be treated as an error when it shouldn't be.

npx lmstudio install-cli

I fixed this by installing the CLI tool, which I assume overwrote whatever the update script broke.

Not able to select a second drive to hold the models

The app keeps hanging and asking me to move the models back to the main drive, a small 128 GB M.2 that fills up quickly. I had this working on the old version, but after I downloaded the update it stopped working. It put the files in E:\Model and I got this error:
(screenshot of the error attached)

LM Studio 0.2.22 running out of memory with context sizes larger than 56k (model supports 1024k)

When trying to utilize the full context size for this model https://huggingface.co/vsevolodl/Llama-3-70B-Instruct-Gradient-1048k-GGUF, I get an out-of-RAM(?) error like this:

{
  "title": "Failed to load model",
  "cause": "",
  "errorData": {
    "n_ctx": 1048576,
    "n_batch": 512,
    "n_gpu_layers": 81
  },
  "data": {
    "memory": {
      "ram_capacity": "314.65 GB",
      "ram_unused": "316.65 KB"
    },
    "gpu": {
      "type": "NvidiaCuda",
      "vram_recommended_capacity": "141.90 GB",
      "vram_unused": "130.46 GB"
    },
    "os": {
      "platform": "linux",
      "version": "5.15.0-106-generic",
      "supports_avx2": true
    },
    "app": {
      "version": "0.2.22",
      "downloadsDir": "/home/loading/.cache/lm-studio/models"
    },
    "model": {}
  }
}

So it claims that RAM is nearly exhausted, when in fact htop only reports about 10 GB of RAM usage and LM Studio itself (at the top right) reports 48 GB of RAM in use (although I believe this might include the VRAM being used).

I try to fully offload to the GPU.

I also noticed a bit of a slowdown during the loading process: it loads slower and slower until the above error pops up, though I don't know whether that is expected. Maybe the progress bar is a little optimistic and only towards the end does it realize there is still a long way to go to load the rest of the model.

The model works with context sizes of up to 56k; anything larger ends with the above error.

I can use larger models than this with no issues (although they only have an 8k context size). I just tested https://huggingface.co/lmstudio-community/Meta-Llama-3-120B-Instruct-GGUF/ fully offloaded and it works like a charm (more or less; it could run faster, but it's doing OK).

Server will not start

When I click Start Server, nothing happens. The server will not start and the logs are blank.

Missing Application Settings


  • Missing application settings menu
  • Missing light mode for the visually challenged and/or weirdos who like light mode
  • Missing options/control for customized install location of desktop application
  • Everything else [...]

Bug in the handling of connections identified, possible source of requests hanging.

I think I found where LM Studio is failing to render a response in some front ends.

HTTP/1.1 defines a Connection header that can be either "keep-alive" or "close", and a server that receives "Connection: close" is expected to close the connection after responding.

Case in point: when using SillyTavern with LM Studio on the back end, the connection sometimes appears to hang because the connection isn't being closed on the server side after sending the text completion, even though SillyTavern specified "Connection: close" in the request header.

This might be the cause behind a number of other issues on this board too.

You can try this for yourself by using https://github.com/oobabooga/text-generation-webui.git as a comparison and Wireshark to capture the traffic. Set both to use the same API port and the same settings, then try each one in turn from SillyTavern to see when it hangs.

It will be because LM Studio isn't respecting the Connection header. This is a problem with how LM Studio handles HTTP traffic at the protocol level, specifically with when to terminate the TCP connection.
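
A minimal way to check this yourself (a sketch, assuming an OpenAI-compatible LM Studio server on localhost:1234; host, port, and path are placeholders to adjust for your setup) is to send a completion request over a raw socket with "Connection: close" and see whether the server ever closes the connection:

import json
import socket

HOST, PORT = "127.0.0.1", 1234   # assumed LM Studio server address

body = json.dumps({
    "messages": [{"role": "user", "content": "Hi"}],
    "max_tokens": 8,
    "stream": False,
}).encode()

request = (
    "POST /v1/chat/completions HTTP/1.1\r\n"
    f"Host: {HOST}:{PORT}\r\n"
    "Content-Type: application/json\r\n"
    f"Content-Length: {len(body)}\r\n"
    "Connection: close\r\n"
    "\r\n"
).encode() + body

with socket.create_connection((HOST, PORT), timeout=60) as sock:
    sock.sendall(request)
    received = b""
    try:
        while True:
            chunk = sock.recv(4096)
            if not chunk:                   # server closed the socket as requested
                print("Connection: close was honored.")
                break
            received += chunk
    except socket.timeout:
        print("Timed out waiting for the server to close the connection (hang reproduced).")

print(received.decode(errors="replace")[:300])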

Build 0.2.23 of LM Studio seems to still have some 22.04 binaries

Hi,
I'm trying to use LM Studio 0.2.23 on RHEL 8 (which usually has the same requirements as Ubuntu 20.04).
Thank you for agreeing to downgrade your build chain to 20.04, by the way.
The AppImage now starts, but there is still a GLIBC error and a popup comes up:

$ ./LM_Studio-0.2.23.AppImage 
15:49:17.173 › App starting...
(node:2575489) UnhandledPromiseRejectionWarning: ReferenceError: Cannot access 'q' before initialization
    at /tmp/.mount_LM_StuqJLxgm/resources/app/.webpack/main/index.js:11:38584
(Use `lm-studio --trace-warnings ...` to show where the warning was created)
(node:2575489) UnhandledPromiseRejectionWarning: Unhandled promise rejection. This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). To terminate the node process on unhandled promise rejection, use the CLI flag `--unhandled-rejections=strict` (see https://nodejs.org/api/cli.html#cli_unhandled_rejections_mode). (rejection id: 1)
15:49:17.617 › Downloads folder from settings.json: /export/home/raistlin/.cache/lm-studio/models
15:49:17.621 › Extensions backends directory already exists at /export/home/raistlin/.cache/lm-studio/extensions/backends
15:49:17.624 › Available backend descriptors:
{
  "extension" : [],
  "bundle" : [
    {
      "path": "/tmp/.mount_LM_StuqJLxgm/resources/app/.webpack/main/build/Release/CUDA",
      "manifest": {
        "target_libraries": [
          {
            "name": "llm_engine_cuda.node",
            "type": "llm_engine",
            "version": "0.1.0"
          },
          {
            "name": "liblmstudio_bindings_cuda.node",
            "type": "liblmstudio",
            "version": "0.2.23"
          }
        ],
        "type": "llama_cuda",
        "platform": "linux",
        "supported_model_formats": [
          "gguf"
        ]
      }
    },
    {
      "path": "/tmp/.mount_LM_StuqJLxgm/resources/app/.webpack/main/build/Release/NoGPU",
      "manifest": {
        "target_libraries": [
          {
            "name": "llm_engine.node",
            "type": "llm_engine",
            "version": "0.1.0"
          },
          {
            "name": "liblmstudio_bindings.node",
            "type": "liblmstudio",
            "version": "0.2.23"
          }
        ],
        "type": "llama_cpu",
        "platform": "linux",
        "supported_model_formats": [
          "gguf"
        ]
      }
    },
    {
      "path": "/tmp/.mount_LM_StuqJLxgm/resources/app/.webpack/main/build/Release/OpenCL",
      "manifest": {
        "target_libraries": [
          {
            "name": "llm_engine_clblast.node",
            "type": "llm_engine",
            "version": "0.1.0"
          },
          {
            "name": "liblmstudio_bindings_clblast.node",
            "type": "liblmstudio",
            "version": "0.2.23"
          }
        ],
        "type": "llama_opencl",
        "platform": "linux",
        "supported_model_formats": [
          "gguf"
        ]
      }
    }
  ]
}
15:49:17.624 › Backend keys and libpaths for use:
{
  "llama_cuda" : {
    "libLmStudioPath": "/tmp/.mount_LM_StuqJLxgm/resources/app/.webpack/main/build/Release/CUDA/liblmstudio_bindings_cuda.node",
    "llmEngineLibPath": "/tmp/.mount_LM_StuqJLxgm/resources/app/.webpack/main/build/Release/CUDA/llm_engine_cuda.node"
  },
  "llama_cpu" : {
    "libLmStudioPath": "/tmp/.mount_LM_StuqJLxgm/resources/app/.webpack/main/build/Release/NoGPU/liblmstudio_bindings.node",
    "llmEngineLibPath": "/tmp/.mount_LM_StuqJLxgm/resources/app/.webpack/main/build/Release/NoGPU/llm_engine.node"
  },
  "llama_opencl" : {
    "libLmStudioPath": "/tmp/.mount_LM_StuqJLxgm/resources/app/.webpack/main/build/Release/OpenCL/liblmstudio_bindings_clblast.node",
    "llmEngineLibPath": "/tmp/.mount_LM_StuqJLxgm/resources/app/.webpack/main/build/Release/OpenCL/llm_engine_clblast.node"
  }
}
15:49:17.625 › Surveying backend-hardware compatibility...
15:49:17.625 › Loading LM Studio core from: '/tmp/.mount_LM_StuqJLxgm/resources/app/.webpack/main/build/Release/CUDA/liblmstudio_bindings_cuda.node'
15:49:17.834 › Error message recieved from LMSCore process: Failed to load libLmStudio: /lib64/libstdc++.so.6: version `GLIBCXX_3.4.29' not found (required by /tmp/.mount_LM_StuqJLxgm/resources/app/.webpack/main/build/Release/CUDA/liblmstudio_bindings_cuda.node)
15:49:17.836 › Error while surveying hardware with backend 'llama_cuda': LMSCore load lib failed - child process with PID 2575757 exited with code 1
1th kill failed
15:49:17.838 › Loading LM Studio core from: '/tmp/.mount_LM_StuqJLxgm/resources/app/.webpack/main/build/Release/NoGPU/liblmstudio_bindings.node'
15:49:17.989 › Error message recieved from LMSCore process: Failed to load libLmStudio: /lib64/libstdc++.so.6: version `GLIBCXX_3.4.29' not found (required by /tmp/.mount_LM_StuqJLxgm/resources/app/.webpack/main/build/Release/NoGPU/liblmstudio_bindings.node)
15:49:17.991 › Error while surveying hardware with backend 'llama_cpu': LMSCore load lib failed - child process with PID 2575921 exited with code 1
1th kill failed
15:49:17.991 › Loading LM Studio core from: '/tmp/.mount_LM_StuqJLxgm/resources/app/.webpack/main/build/Release/OpenCL/liblmstudio_bindings_clblast.node'
2th kill failed
2th kill failed
15:49:18.230 › Error message recieved from LMSCore process: Failed to load libLmStudio: /lib64/libstdc++.so.6: version `GLIBCXX_3.4.29' not found (required by /tmp/.mount_LM_StuqJLxgm/resources/app/.webpack/main/build/Release/OpenCL/liblmstudio_bindings_clblast.node)
15:49:18.231 › Error while surveying hardware with backend 'llama_opencl': LMSCore load lib failed - child process with PID 2576036 exited with code 1
15:49:18.231 › Backend-hardware compatibility survey complete:

}
D [fallbackBackendPref] Initializing FileData
3th kill failed
3th kill failed
4th kill failed
4th kill failed
5th kill failed
5th kill failed
6th kill failed
6th kill failed
7th kill failed
7th kill failed
8th kill failed
8th kill failed
9th kill failed
9th kill failed
10th kill failed
10th kill failed
11th kill failed
Too many fails, giving up.
11th kill failed
Too many fails, giving up.
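
For what it's worth, a quick diagnostic (a sketch, assuming libstdc++ is at /lib64/libstdc++.so.6 as in the log above and that the binutils strings tool is installed) is to list the GLIBCXX symbol versions the system library provides and check for GLIBCXX_3.4.29:

import re
import subprocess

LIBSTDCXX = "/lib64/libstdc++.so.6"   # path taken from the error message above

out = subprocess.run(["strings", LIBSTDCXX], capture_output=True, text=True, check=True)
versions = sorted(set(re.findall(r"GLIBCXX_\d+\.\d+(?:\.\d+)?", out.stdout)))
print("\n".join(versions))
print("GLIBCXX_3.4.29 present:", "GLIBCXX_3.4.29" in versions)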


[Ubuntu 24.04] Chrome sandbox issue

I'm having an issue on Ubuntu 24.04 where I cannot start the AppImage. The following error is thrown:

./LM_Studio-0.2.24.AppImage
[7882:0526/125358.338741:FATAL:setuid_sandbox_host.cc(158)] The SUID sandbox helper binary was found, but is not configured correctly. Rather than run without sandboxing I'm aborting now. You need to make sure that /tmp/.mount_LM_StuRBujGc/chrome-sandbox is owned by root and has mode 4755.
[1]    7882 trace trap (core dumped)  ./LM_Studio-0.2.24.AppImage

Changing the binary behind download link breaks Distribution (in Nixpkgs)

Hello LMStudio-Team,

I much appreciate your efforts in building a great tool for working with LLMs locally. I'm a fan and love using it almost every productive day.

I'm writing to you as a maintainer of the lmstudio package in Nixpkgs, one of the largest and most up-to-date repositories. We recently had an issue because you changed the binary available under the download link (https://releases.lmstudio.ai/mac/arm64/0.2.22/b/latest/LM-Studio-0.2.22-arm64.dmg at the time), which broke the lmstudio package for Nix(OS).

A stable download link is preferable because it gives us the assurance that the package we download is exactly the package we expect it to be. This helps avoid supply-chain attacks and also provides general reliability, since it keeps every operation reproducible.
If there were a critical (especially security) issue, I think everyone could understand quite well that you would want to stop distribution altogether ASAP; however, that's not the case as far as I'm informed.

I fear that, with existing binary download links breaking, we might lose acceptance for the package in Nixpkgs altogether.
I would be glad to keep the package available to NixOS users using the official Nixpkgs repository.

Is it possible to establish some pattern of stable download links, serving the same binary for as long of a period as possible?

Thanks a lot for your time.

Best Regards
Dean
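
For context, this is roughly what a pinned-hash package build does with the download link (an illustrative sketch, not an official Nixpkgs or LM Studio tool; the expected hash is a placeholder): the artifact is fetched and its SHA-256 is compared against the hash recorded in the package, so any change to the binary behind the URL makes the build fail.

import hashlib
import urllib.request

URL = "https://releases.lmstudio.ai/mac/arm64/0.2.22/b/latest/LM-Studio-0.2.22-arm64.dmg"
EXPECTED_SHA256 = "<hash pinned in the package expression>"   # placeholder

sha256 = hashlib.sha256()
with urllib.request.urlopen(URL) as resp:
    for chunk in iter(lambda: resp.read(1 << 20), b""):
        sha256.update(chunk)

print("downloaded:", sha256.hexdigest())
print("pinned    :", EXPECTED_SHA256)
print("match     :", sha256.hexdigest() == EXPECTED_SHA256)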

[REQUEST] - Suggestion for Enhancements in LM Studio - Draft History & Undo Functionality

Hello,

Firstly, I'd like to express my gratitude for the exceptional LLM platform, LM Studio. It has significantly enriched my experience with AI-generated content.

I kindly request two enhancements to further optimize the user interface:

  1. Multiple Draft History & Regeneration Cycling: Similar to the draft navigation found in ChatGPT, a multiple-draft history would allow seamless cycling between the various proposed generations for comparison and selection of the best fit. Currently, I need to copy each proposition into MS Word, which introduces unnecessary inconvenience.
  2. Undo Button for Accidental Deletion: The ease with which prompts and generated text can be deleted on your platform makes it susceptible to unintended deletions. Implementing an undo button would prevent such occurrences and ensure data integrity.

Additionally, I propose incorporating distinct text colors for User Input and AI-generated Text as a visual code, enhancing readability and clarity within the interface.

Thank you for considering these suggestions. Your continuous improvement efforts make LM Studio an increasingly valuable tool in my workflow.

Best regards,

Please add support for Intel Macs.

Very cool application. It would be great if there were a version for Intel Macs. I know llama.cpp has problems with Metal on Intel Macs, but what about a CPU or Vulkan version?

Concurrent embeddings requests cause requests to hang

When making concurrent requests with the request queue on, many of the requests never return responses at all; only a few do. This is most likely an issue with the request queue logic.

It works fine if the requests are made sequentially

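A minimal repro sketch (assumptions: an LM Studio server at http://localhost:1234/v1 with an embedding model loaded; the model name is a placeholder) that fires several embedding requests in parallel:

from concurrent.futures import ThreadPoolExecutor
from openai import OpenAI

client = OpenAI(base_url="http://localhost:1234/v1", api_key="not-needed")

def embed(i: int):
    resp = client.embeddings.create(
        model="local-embedding-model",        # placeholder model identifier
        input=f"test sentence number {i}",
    )
    return i, len(resp.data[0].embedding)

# With the request queue on, several of these never return; with
# max_workers=1 (sequential requests) they all complete.
with ThreadPoolExecutor(max_workers=8) as pool:
    for i, dim in pool.map(embed, range(16)):
        print(f"request {i} returned an embedding of length {dim}")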

Context Menu Issue: Only "Copy Entire Message" Option Available

Description:

The current context menu in LMstudio only has one option, "Copy Entire Message", which is limiting and counterintuitive. A context menu with only one option is essentially a longer way to implement the same behavior as the terminal's Copy-on-select or Copy-on-highlight feature.

To make matters worse, if you want to copy a specific portion of the text, and you forgetfully use right-click to try and do so, you have to:

  • Click off the message to deselect the text

  • Reselect the desired text

  • Remember to use the keyboard shortcuts (e.g. Ctrl+C) to copy the selection, since the context menu only offers "Copy Entire Message"

This behavior is inconsistent with most applications, which typically offer a "Copy" or "Copy Selection" option in the context menu. This inconsistency causes wasted time and frustration.

Severity: Minor (but affects usability)

Reproducibility: Always (when right-clicking on selected text)

Environment: LMstudio GUI application (version [insert version number])

Expected Behavior:

  • A context menu with multiple options, including "Copy Selection" or "Copy"

  • Ability to copy selected text using the context menu or keyboard shortcuts

Actual Behavior:

  • A context menu with only one option, "Copy Entire Message"

  • No ability to copy selected text using the context menu

Request:

Please update the context menu to include a "Copy Selection" or "Copy" option, allowing users to easily copy selected text without having to use keyboard shortcuts or work around the current limitation. This will improve the user experience and make the application more intuitive to use.

Local Server - Generation of a fixed token after long use

After using the Local Server for multiple requests without resetting it or creating a new one, all the LLMs used start to generate text in a fixed way. Essentially, they stop generating a proper response and start generating a fixed number endlessly. Here is an example of the output:

  • After a long time of being up and running, the Local Server starts to generate a fi7777777777777777777777777777777777777777777777777777777777777777777777777777777777...

This process never stops, even after 10 minutes. The LLM was loaded with 15 layers on the GPU and all the rest on the CPU, using LangChain to interact with the Local Server.

These are the LLM used for the Local Server:

  • dolphin-2.2.1-mistral-7B-GGUF/mistral-7b-instruct-v0.1.Q4_K_M.gguf
  • Meta-Llama-3-8B-Instruct-GGUF/Meta-Llama-3-8B-Instruct-Q5_K_M.gguf

As for the scenario: I had some documents on my laptop that I wanted to summarize, and for each one I asked the Local Server for a concise summary through LangChain. After the 14th or 15th request, the LLMs started to print something like the example above.

This occurs independently of the LLM used.
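
A rough repro sketch of the scenario (assumptions: an LM Studio server at http://localhost:1234/v1 with one of the models above loaded; it goes through the OpenAI-compatible endpoint directly rather than LangChain, and the document texts are placeholders) is to issue many summary requests in a row against the same server and watch for the repeated-digit output after roughly the 14th or 15th request:

from openai import OpenAI

client = OpenAI(base_url="http://localhost:1234/v1", api_key="not-needed")

documents = [f"Placeholder text of document number {i}." for i in range(30)]

for i, doc in enumerate(documents, start=1):
    resp = client.chat.completions.create(
        model="local-model",                  # field is currently unused by the server
        messages=[{"role": "user", "content": f"Give a concise summary of:\n{doc}"}],
        temperature=0.7,
    )
    text = resp.choices[0].message.content
    print(f"request {i}: {text[:80]!r}")
    if "7777777777" in text:                  # the fixed-token failure mode described above
        print(f"Degenerate output started at request {i}.")
        break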

LM Studio not running any model successfully

Windows Server 2022 Datacenter
64 GB RAM
16 cores
version 0.2.23
It ran successfully for me on three machines, but this machine was not happy after the upgrade and the Llama 3 download.
The model just does not seem to respond. I extracted a log file (attached). Not sure how to get this back to a working state except to reload Windows, which is obviously undesirable.
main.log

Gemma 2 significantly worse in 0.2.27

I just upgraded LM Studio to 0.2.27 and the latest Gemma 2 quants and it's complete trash now. For example, I used to get a good answer to this test question: "how much time in minutes is needed to heat a room 3 x 5 x 7 m from 0 C to 30 C with 2kw heater?". Not anymore. Right now it even refuses to answer - it says "I can't give you the answer to that question, here's why:"

What happened? How is it improved when it's objectively way worse, and I didn't touch any inference parameters?

Multi-model session cpu and gpu support?

Right now multi-model sessions are limited to full GPU offload only. Any plans to support CPU offload as well, so that we can run multiple models split across CPU and GPU memory?

Font size too small and not configurable

Given how many useful settings the app has, it boggles the mind there is no option to change at least the font size. Having the app open on a big monitor, it is barely readable from a healthy distance. I really don't feel like sitting glued to the screen to be able to read code blocks.

As the poor grandma demonstrates below, the goal is to use app comfortably without having to consult your nearest optician after a few days of use. Now, be a good lad and save the grandma from turning into Ray Charles.

(screenshots attached)

Exllamav2

Hi, are you planning on adding ExLlamaV2 in the near future?

X11 Forwarding Issue on Windows Client for lmstudio

I am encountering an issue where lmstudio fails to run via X11 forwarding from a Linux server to a Windows client. The application works correctly when both the server and the client are Linux machines, but it does not work with a Windows client using Xming and PuTTY for X11 forwarding. Simpler applications like gedit and nautilus run without issues.

Steps to Reproduce:
Setup:

Linux Server: Ubuntu 20.04
Windows Client: Windows 11
Xming: Version 6.9.0.31
PuTTY: Version 0.76
Xming Configuration:

Started Xming with the -ac option to disable access control.
PuTTY Configuration:

Enabled X11 forwarding with X display location set to localhost:0.
Linux Server Configuration:

Ensured the following lines are present and uncommented in /etc/ssh/sshd_config:

X11Forwarding yes
X11DisplayOffset 10
X11UseLocalhost yes

Restarted the SSH service.
Running lmstudio:
Connected to the Linux server via PuTTY.
Exported the DISPLAY variable:
export DISPLAY=localhost:10.0
Attempted to run lmstudio with indirect rendering:
LIBGL_ALWAYS_INDIRECT=1 ./lmstudio

Observed Behavior:
lmstudio fails to start, with errors indicating issues related to OpenGL and EGL initialization.
Relevant error messages include:

./lmstudio
14:55:02.357 › GPU info: '04:00.0 VGA compatible controller: ASPEED Technology, Inc. ASPEED Graphics Family (rev 41)'
14:55:02.377 › Got GPU Type: unknown
14:55:02.378 › LM Studio: gpu type = Unknown
14:55:02.401 › App starting...
[525067:0607/145507.633420:ERROR:(-1)] Check failed: false.
[525067:0607/145507.633459:ERROR:(-1)] Check failed: false.
14:55:07.760 › Downloads folder from settings.json: /home/jugs/.cache/lm-studio/models
[525124:0607/145520.105959:ERROR:angle_platform_impl.cc(43)] Display.cpp:1019 (initialize): ANGLE Display::initialize error 12289: Unsupported GLX version (requires at least 1.3).
ERR: Display.cpp:1019 (initialize): ANGLE Display::initialize error 12289: Unsupported GLX version (requires at least 1.3).
[525124:0607/145520.110539:ERROR:gl_display.cc(504)] EGL Driver message (Critical) eglInitialize: Unsupported GLX version (requires at least 1.3).
[525124:0607/145520.110744:ERROR:gl_display.cc(793)] eglInitialize OpenGL failed with error EGL_NOT_INITIALIZED, trying next display type
[525124:0607/145520.125369:ERROR:angle_platform_impl.cc(43)] Display.cpp:1019 (initialize): ANGLE Display::initialize error 12289: Unsupported GLX version (requires at least 1.3).
ERR: Display.cpp:1019 (initialize): ANGLE Display::initialize error 12289: Unsupported GLX version (requires at least 1.3).
[525124:0607/145520.125660:ERROR:gl_display.cc(504)] EGL Driver message (Critical) eglInitialize: Unsupported GLX version (requires at least 1.3).
[525124:0607/145520.125755:ERROR:gl_display.cc(793)] eglInitialize OpenGLES failed with error EGL_NOT_INITIALIZED
[525124:0607/145520.125843:ERROR:gl_display.cc(819)] Initialization of all EGL display types failed.
[525124:0607/145520.125948:ERROR:gl_ozone_egl.cc(26)] GLDisplayEGL::Initialize failed.
[525124:0607/145522.677446:ERROR:angle_platform_impl.cc(43)] Display.cpp:1019 (initialize): ANGLE Display::initialize error 12289: Unsupported GLX version (requires at least 1.3).
ERR: Display.cpp:1019 (initialize): ANGLE Display::initialize error 12289: Unsupported GLX version (requires at least 1.3).
[525124:0607/145522.677758:ERROR:gl_display.cc(504)] EGL Driver message (Critical) eglInitialize: Unsupported GLX version (requires at least 1.3).
[525124:0607/145522.677821:ERROR:gl_display.cc(793)] eglInitialize OpenGL failed with error EGL_NOT_INITIALIZED, trying next display type
[525124:0607/145522.692794:ERROR:angle_platform_impl.cc(43)] Display.cpp:1019 (initialize): ANGLE Display::initialize error 12289: Unsupported GLX version (requires at least 1.3).
ERR: Display.cpp:1019 (initialize): ANGLE Display::initialize error 12289: Unsupported GLX version (requires at least 1.3).
[525124:0607/145522.693003:ERROR:gl_display.cc(504)] EGL Driver message (Critical) eglInitialize: Unsupported GLX version (requires at least 1.3).
[525124:0607/145522.693077:ERROR:gl_display.cc(793)] eglInitialize OpenGLES failed with error EGL_NOT_INITIALIZED
[525124:0607/145522.693144:ERROR:gl_display.cc(819)] Initialization of all EGL display types failed.
[525124:0607/145522.693210:ERROR:gl_ozone_egl.cc(26)] GLDisplayEGL::Initialize failed.
[525124:0607/145522.709838:ERROR:(-1)] Check failed: false.
[525124:0607/145522.710101:ERROR:(-1)] Check failed: false.
[525124:0607/145522.710440:ERROR:viz_main_impl.cc(186)] Exiting GPU process due to errors during initialization

Expected Behavior:
lmstudio should start and function correctly via X11 forwarding on a Windows client, similar to its behavior on a Linux client.
Additional Information:
The application works perfectly when the client is a Linux machine.
Using gedit and other simpler applications via X11 forwarding on the Windows client works without any issues.

Built-in updater takes too long to download

On a good day, the built-in updater takes about 20 minutes to download the updated LM-Studio. Today, I quit after an hour and a half and downloaded it with a browser from lmstudio.ai in 45 seconds.

When I started the update, I was also downloading a large file and that was consuming most of my bandwidth. I am guessing that the updater is trying to be smart and capping itself at whatever bandwidth it calculates it has at the time you press the "download & install" button instead of just getting the file and letting the network do what it does best.

Before giving up, I took a look in Task Manager and the 5 LM-Studio threads were consuming 0% network at a time when it should be consuming ~100% since everything else I was doing had completed.

How do I increase the font size in LM Studio?

When I opened the app for the first time, there was a conversation, and it said to increase the font size 'under the View menu'.

I tried to ask where the View menu is, but I can't get a model installed because I can't read your screen at all. And you have a blue text button on a black background, which should be light-colored text if you care about accessibility.

I can't install a model if I can't read the screen. It looks like this app was written by someone in their 20s.

Vision models stop working after first attempt

I'm seeing a weird behavior with vision models.

I am using the Default LM Studio Windows config, which is the only one I have been able to get vision models to work with.

I have tried 2 different models: xtuner's llava llama 3 f16 and jartine's llava v 1.5

With both models, and in both the chat interface and the local API deployment (using the vision example), when I ask for an image description I get a perfect description on the first request and then a random response after that (usually mentioning collages). I'm not sure what's causing this, but it's fairly consistent.

Might be related to this issue: lmstudio-ai/.github#26
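
A minimal repro against the local API (a sketch, assuming an LM Studio server at http://localhost:1234/v1 with one of the llava models loaded; the image path and model name are placeholders, and the message format follows the OpenAI-style vision example): send the same image-description request twice and compare the replies.

import base64
from openai import OpenAI

client = OpenAI(base_url="http://localhost:1234/v1", api_key="not-needed")

with open("test.jpg", "rb") as f:             # placeholder image path
    image_b64 = base64.b64encode(f.read()).decode()

def describe() -> str:
    resp = client.chat.completions.create(
        model="local-model",                  # field is currently unused by the server
        messages=[{
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this image."},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"}},
            ],
        }],
        max_tokens=200,
    )
    return resp.choices[0].message.content

print("first request :", describe())
print("second request:", describe())         # reportedly degrades into a random answer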

LM Studio 0.2.22 (linux) on Ubuntu 24.04 LTS won't recognize ROCm environment

I am successfully running a ROCm stable diffusion setup using PyTorch-ROCm on a 6900 XT. I have the following ROCm and HIP libraries installed with no system error messages:
(screenshot of installed ROCm/HIP packages attached)
However, LM Studio will only report that I have an OpenCL GPU installed. It recognizes the 16GB of VRAM, but still throws an error when I try to load a model. I can run phi3 easily and with speed on just the CPU. But I'd like to run a larger model on the GPU.

Run error

System OS: Ubuntu 20.04

Error info: best backend info is null after logic attempting to get (no Mac) at t.backendmanager.getbestendnonmacgguforthrow


Same Seed, Different Result using LM Studio API

I'm working on a project where I need to track the seed used in each generation so that I can reproduce the output when needed, using the same config (and the same seed). However, I find that this is not always the case.

I tried using the seed 42, and it gave me exactly the same result each time with the same config.
When I tried a larger number, 1715852364 (which I usually get from the epoch time), I found that it gives different results.

  • Output with seed 42 (exactly the same twice)

  • Output with seed 1715852364 (completely different)

Here is the code I used to reproduce this bug (it is part of my project):
lm_studio.py:

import requests
try :
    from base import BaseModel
except :
    from models.ai.base import BaseModel
from typing import Any, Dict
from openai import OpenAI
import json

class LMStudioModel(BaseModel):
    def __init__(self, api_url: str, headers: Dict[str, str], config: Dict[str, Any]) -> None:
        super().__init__(api_url, headers, config)
        self.client = OpenAI(base_url=api_url, api_key="not-needed")

    def __str__(self) -> str:
        return "LMStudioModel"
    
    def __repr__(self) -> str:
        return f"{self.__class__.__name__}(api_url={self.api_url}, headers={self.headers}, config={self.config})"
    
    @staticmethod
    def load_config(config_path: str) -> Dict[str, Any]:
        return super(LMStudioModel, LMStudioModel).load_config(config_path)

    def generate_text(self, prompt: str, parameters: Dict[str, Any]) -> Any:
        # Adjust parameters based on the method signature and expected parameters
        data = {
            "messages": [
                {"role": "system", "content": parameters.get("instructions", "You are an intelligent assistant. You always provide well-reasoned answers that are both correct and helpful.")},
                {"role": "user", "content": prompt}
            ],
            "temperature": parameters.get("temperature", 0.7),
            "max_tokens": parameters.get("max_tokens", -1),
            "stream": parameters.get("stream", False)
        }
        response = requests.post(self.api_url + "/chat/completions", headers=self.headers, json=data)
        try:
            response.raise_for_status()
            return response.json()
        except Exception as e:
            return {"error": str(e)}

    def predict(self, prompt: str, params: Dict[str, Any] = None) -> Any:
        if params is None:
            params = self.config.get('default_parameters', {})
        response = self.generate_text(prompt, params)
        return response
    
    def inference(self, prompt, seed=None) -> str:
        chat_completion = self.client.chat.completions.create(
            messages=[
                {
                    "role": "user",
                    "content": prompt.strip(),
                }
            ],
            model="not-needed", # unused 
            seed=seed
        )
        return chat_completion.choices[0].message.content

    def sys_inference(self, sys_prompt: str, usr_prompt: str, seed=None) -> str:
        print("Using seed %s with type %s" % (seed, type(seed)))
        chat_completion = self.client.chat.completions.create(
            messages=[
                {"role": "system", "content": sys_prompt},
                {
                    "role": "user",
                    "content": usr_prompt.strip(),
                }
            ],
            model="not-needed", # unused 
            temperature=0.7,
            seed=seed
        )
        return chat_completion.choices[0].message.content
    
    def interactive_prompt(self):
        print("You are now chatting with the intelligent assistant. Type something to start the conversation.")
        history = [
            {"role": "system", "content": "You are an intelligent assistant. You always provide well-reasoned answers that are both correct and helpful."},
            {"role": "user", "content": "Hello, introduce yourself to someone opening this program for the first time. Be concise."},
        ]

        while True:
            messages = history[-2:]  # Consider only the last system message and the last user message for brevity
            completion = self.client.chat.completions.create(
                model="local-model",  # this field is currently unused
                messages=messages,
                temperature=0.7,
                max_tokens=150,
                stream=True
            )   
            new_message = {"role": "assistant", "content": ""}
            for chunk in completion:
                if chunk.choices[0].delta.content:
                    print(chunk.choices[0].delta.content, end="", flush=True)
                    new_message["content"] += chunk.choices[0].delta.content

            history.append(new_message)
            print()
            # Capture user input
            user_input = input("> ")
            if user_input.lower() == 'quit':
                print("Exiting interactive prompt...")
                break
            history.append({"role": "user", "content": user_input})

    def update_token(self, new_token: str) -> None:
        self.headers['Authorization'] = f"Bearer {new_token}"

    def calc_tokens(self, prompt: str) -> int:
        # Simplified token calculation; you might want to adjust this according to your actual tokenization logic
        return len(prompt.split())

    @classmethod
    def setup_from_config(cls, config_path: str):
        config = cls.load_config(config_path)
        api_url = config.get("api_url", "http://localhost:1234/v1")  # Default to example URL
        headers = {"Content-Type": "application/json"}  # Default header for JSON content
        headers.update(config.get("headers", {}))  # Update with any additional headers from config
        return cls(api_url=api_url, headers=headers, config=config)
    
    @classmethod
    def setup_from_dict(cls, config_json: Dict[str, Any] | str ):
        if isinstance(config_json, dict):
            api_url = config_json.get("api_url", "http://localhost:1234/v1")  # Default to example URL
            headers = {"Content-Type": "application/json"}  # Default header for JSON content
            headers.update(config_json.get("headers", {}))  # Update with any additional headers from config
            return cls(api_url=api_url, headers=headers, config=config_json)
        elif isinstance(config_json, str): # if it's a string, convert it to a dict
            config : dict = json.loads(config_json)
            return cls.setup_from_dict(config)
    
# Example usage
if __name__ == '__main__':
    config_path = "configs/lm_studio.config.json"
    lm_studio = LMStudioModel.setup_from_config(config_path)
    # print(lm_studio.sys_inference(sys_prompt="You are a helpful assistant", usr_prompt="Hello there", seed=42))
    print(lm_studio.sys_inference(sys_prompt="You are a helpful assistant", usr_prompt="Hello there", seed=1715852364))
    # lm_studio.interactive_prompt()

base.py :

from abc import ABC, abstractmethod
from typing import Any, Dict
import json 
import os 
import sys 

class BaseModel(ABC):
    """
    Abstract base class for models to interact with APIs and perform data processing.
    """
    
    def __init__(self, api_url: str, headers: Dict[str, str], config: Dict[str, Any]) -> None:
        self.api_url = api_url
        self.headers = headers
        self.config = config
    
    @staticmethod
    @abstractmethod
    def load_config(config_path: str) -> Dict[str, Any]:
        """
        Loads configuration from a specified path.
        """
        with open(config_path, 'r') as file:
            return json.load(file)

    @abstractmethod
    def generate_text(self, prompt: str, parameters: Dict[str, Any]) -> Any:
        """
        Generates text based on a prompt and parameters.
        This method needs to be implemented by the subclass.
        """
        pass

    @abstractmethod
    def predict(self, prompt: str, params: Dict[str, Any]) -> Any:
        """
        Processes a prompt and returns a prediction.
        This method needs to be implemented by the subclass.
        """
        pass
    
    @abstractmethod
    def inference(self) -> str:
        """
        Performs inference using the model.
        This method needs to be implemented by the subclass.
        """
        pass
    @abstractmethod
    def sys_inference(self, sys_prompt, user_prompt, seed:int | None =None) -> str:
        """
        Performs inference using the model with system prompt .
        This method needs to be implemented by the subclass.
        """
        pass
    
    @abstractmethod
    def update_token(self, new_token: str) -> None:
        """
        Updates the API token used for authentication.
        This method needs to be implemented by the subclass.
        """
        pass

    
    @abstractmethod
    def calc_tokens(self, prompt: str) -> int:
        """
        Calculates the number of tokens in a prompt.
        This method needs to be implemented by the subclass.
        """
        pass

    def interactive_prompt(self) -> None:
        """
        Optional: Implement an interactive prompt for testing purposes.
        This method can be overridden by subclasses for specific interactive functionality.
        """
        print("This method can be overridden by subclasses.")

    @classmethod
    def setup_from_config(cls, config_path: str):
        """
        Sets up the model based on the specified configuration.
        This method must be implemented by subclasses.
        """
        pass
    
    def setup_from_dict(cls, config_json: Dict[str, Any] | str ):
        """
        Sets up the model based on the specified configuration.
        This method must be implemented by subclasses.
        """
        pass

configs/lm_studio.config.json :

{
    "api_url": "http://localhost:1234/v1",
    "instructions": "You are a helpful AI Assistant.",
    "default_parameters":{
        "temperature": 0.7,
        "max_tokens": -1,
        "stream": false
    }
}

I only know that LM Studio uses llama.cpp, but I'm not sure whether this has to do with the size of the seed. If so, what is the maximum integer for which the same seed will always give the same results?
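
A minimal determinism check, stripped down from the code above (assuming the same LM Studio server at http://localhost:1234/v1 with the same model and config loaded), runs the same prompt twice per seed and reports whether the outputs match:

from openai import OpenAI

client = OpenAI(base_url="http://localhost:1234/v1", api_key="not-needed")

def run_once(seed: int) -> str:
    resp = client.chat.completions.create(
        model="not-needed",                   # field is currently unused by the server
        messages=[{"role": "user", "content": "Hello there"}],
        temperature=0.7,
        seed=seed,
    )
    return resp.choices[0].message.content

for seed in (42, 1715852364):
    a, b = run_once(seed), run_once(seed)
    print(f"seed={seed}: identical={a == b}")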

Unable to load vision model: openbmb/MiniCPM-Llama3-V-2_5

This model has a vision adapter: mmproj-model-f16.gguf.
I have never used any vision model in LM Studio, so I don't know whether this is a bug or something specific to this particular model.
Because this model has strong OCR capabilities I wanted to test it, but LM Studio is unable to load any of the GGUF versions
while the mmproj-model-f16.gguf file exists. If I delete that file, the model loads.

Again, I don't know whether this is an LM Studio bug or not, but it would be nice to test this model ;)

LMStudio unexpectedly terminates if drive where models are hosted is full

Windows, 0.2.26

I discovered this bug accidentally, and it was a big headache because I could not understand why LM Studio suddenly closed and why I could not open it again; the window would flicker and vanish.

I host all AI models on a separate partition, and while LM Studio was downloading a new model it filled the partition, which caused those symptoms.

I think there should be some kind of error-handling procedure so that the program itself keeps working.

Thank you!

Local Interface Server cannot work when the request content is empty

The Local Interface Server works well in this example:

curl http://localhost:1234/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{ 
    "model": "lmstudio-community/Meta-Llama-3-8B-Instruct-GGUF",
    "messages": [ 
      { "role": "system", "content": "Always answer in rhymes." },
      { "role": "user", "content": "Introduce yourself." }
    ], 
    "temperature": 0.7, 
    "max_tokens": -1,
    "stream": true
}'

However, when the content is empty, it does not work.
If I set the content to an empty string, it prints an error:

[2024-05-04 01:10:59.564] [INFO] [LM STUDIO SERVER] Processing queued request...
[2024-05-04 01:10:59.564] [INFO] Received POST request to /v1/chat/completions with body: {
  "model": "lmstudio-community/Meta-Llama-3-8B-Instruct-GGUF",
  "messages": [
    {
      "role": "system",
      "content": ""
    },
    {
      "role": "user",
      "content": "Introduce yourself."
    }
  ],
  "temperature": 0.7,
  "max_tokens": -1,
  "stream": true
}
[2024-05-04 01:10:59.565] [ERROR] [Server Error] {"title":"'messages' array must only contain objects with a 'content' field that is not empty"}

I just hope that when the system role's content is empty, it can still work.
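
Until that changes, a possible client-side workaround (a sketch, not an official fix) is to drop messages whose content is empty before posting the request:

import requests

def strip_empty_messages(payload: dict) -> dict:
    # Remove messages whose "content" is missing or blank; the server rejects them.
    cleaned = dict(payload)
    cleaned["messages"] = [m for m in payload["messages"] if (m.get("content") or "").strip()]
    return cleaned

payload = {
    "model": "lmstudio-community/Meta-Llama-3-8B-Instruct-GGUF",
    "messages": [
        {"role": "system", "content": ""},            # would trigger the server error
        {"role": "user", "content": "Introduce yourself."},
    ],
    "temperature": 0.7,
    "max_tokens": -1,
    "stream": False,
}

resp = requests.post("http://localhost:1234/v1/chat/completions",
                     json=strip_empty_messages(payload), timeout=600)
print(resp.json()["choices"][0]["message"]["content"])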

Chat Window Becomes Unresponsive After Loading Large JSON Data

I encountered an issue where the chat window became unresponsive after pasting a very large JSON file (approximately 60,000 tokens or more). Initially, the application was functioning correctly. However, after introducing the large JSON data, the chat window froze and became unusable.

I was able to restore functionality by manually clearing all files in the .cache folder of the application.

Attempts to Resolve:

  • Force quitting and reopening the application.
  • Multiple reinstallations of the app.
  • Restarting the device.

None of the above methods cleared the cached data or resolved the issue until I manually deleted the cache files.

Environment:

  • Operating System: MacOS
  • Application Version: 0.2.23

Unexpected RAM/VRAM Consumption

Version

0.2.22

What went wrong

Unexpected RAM consumption when/after model load when using full offload to GPU

Unexpected Behavior In Detail

I just upgraded LM Studio to 0.2.22 and got it running with a Tesla M40, which has 12 GB of VRAM, so the previously downloaded Llama-3-8B Q4_K_M (~4 GB) should load entirely into VRAM without any issue. Even though I set GPU offload to max, VRAM usage is normal (~5 GB), and token generation is much faster (which means it should be running properly on the GPU), I'm curious why it is still consuming a huge amount of RAM (over 4 GB of physical RAM plus 10-15 GB of swap/virtual memory). It got worse and crashed when enabling Flash Attention: it simply ran out of RAM (8 GB physical and 17 GB swap) while there was still plenty of free VRAM on the GPU side.
Logs and screenshots attached.

Expected Behavior

Normal RAM and VRAM usage (RAM: 2-3 GB, the same as when idle; VRAM: depends on model size).

Attachments

Log-1 Diagnostic Info

{
  "cause": "(Exit code: 0). Some model operation failed. Try a different model and/or config.",
  "suggestion": "",
  "data": {
    "memory": {
      "ram_capacity": "7.88 GB",
      "ram_unused": "832.32 MB"
    },
    "gpu": {
      "type": "NvidiaCuda",
      "vram_recommended_capacity": "12.00 GB",
      "vram_unused": "11.10 GB"
    },
    "os": {
      "platform": "win32",
      "version": "10.0.22631",
      "supports_avx2": true
    },
    "app": {
      "version": "0.2.22",
      "downloadsDir": "E:\\LM_STUDIO_M_ARC"
    },
    "model": {}
  },
  "title": "Error loading model."
}

Log-2

main.log

Screenshots Combination

(screenshots attached)

Screenshot-related info/chat log
- As you can see, I attached four screenshots: two taken before loading the model and two after offloading the entire model to the GPU.
- P.S. The preset was the same as the built-in Llama 3 preset, except that I switched the CPU threads value from 4 to 128, because a few weeks ago I found it faster when I had to run on CPU only; the result is the same when switching back to the default value/preset, in case that raises questions.
- The last screenshots show what happens when reloading the model after enabling Flash Attention.

May LM-Studio continue to Flourish and Prosper.
Best Regards
PD-Kerman

ibm-granite models cannot be loaded, although they report StarCoder architecture

Copy-pasted from the "Failed to load model" error modal:

{
  "title": "Failed to load model",
  "cause": "llama.cpp error: 'check_tensor_dims: tensor 'output.weight' not found'",
  "errorData": {
    "n_ctx": 4096,
    "n_batch": 512,
    "n_gpu_layers": 89
  },
  "data": {
    "memory": {
      "ram_capacity": "62.71 GB",
      "ram_unused": "78.71 KB"
    },
    "gpu": {
      "type": "Nvidia CUDA",
      "vram_recommended_capacity": "23.68 GB",
      "vram_unused": "21.81 GB"
    },
    "os": {
      "platform": "linux",
      "version": "6.5.0-35-generic",
      "supports_avx2": true
    },
    "app": {
      "version": "0.2.23",
      "downloadsDir": "path/to/cache/lmstudio_cache"
    },
    "model": {}
  }
}

Issue with Downloading and Searching Models in LM Studio

Hi,

I'm experiencing issues with LM Studio on Ubuntu 22.04.

When I try to download a model listed on the landing screen, I get the error: "Download failed: unexpected status code 429".

Additionally, when I try to search for models, I receive the error: "Error searching for models: HTTP error! Status 429".

Thanks!

Generates gibberish after switching from Gemma 2 with GPU accel to DeepSeek-Coder-V2-Lite-Instruct Q5_K_M

LM Studio 0.2.27
GPU acceleration: On, with CUDA.
From: bartowski/Gemma-2-9B-It-SPPO-Iter3-GGUF
To: qwp4w3hyb/DeepSeek-Coder-V2-Lite-Instruct-iMat-GGUF.gguf

Example output.

38"4EC$!31=.4<H0+':#"4::H/2H$

If you then switch GPU Acceleration off, it works fine. Switching GPU Accel back on is then fine too.

I suppose whatever made Gemma 2 faster on the GPU in 0.2.27 is still switched on in the GPU kernel after switching away from Gemma 2.

Error fetching model files 401 Unauthorized

I'm not able to use Google's new model "google/codegemma-7b-it-GGUF".

It seems that the reason is that I have to accept the terms and conditions on Hugging Face.
Is there any way to log in to Hugging Face in LM Studio or to use a Hugging Face token?
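
In the meantime, a possible workaround (a sketch, not an LM Studio feature): after accepting the model's terms on Hugging Face, download the GGUF with your token via huggingface_hub and place it in your LM Studio models directory. The file name and local path below are placeholders.

from huggingface_hub import hf_hub_download

path = hf_hub_download(
    repo_id="google/codegemma-7b-it-GGUF",
    filename="codegemma-7b-it-Q4_K_M.gguf",   # placeholder: pick the actual quant file from the repo
    token="hf_...",                           # your Hugging Face access token
    local_dir="/path/to/lm-studio/models/google/codegemma-7b-it-GGUF",  # placeholder models dir
)
print("Downloaded to:", path)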

Potential CPU Usage Issue/CPU Usage Display Issue

Version

0.2.22

What went wrong

The CPU usage display shows suspiciously low usage in the user interface while output is being generated.
The CPU is almost idling while the model is (pre)generating output without any GPU offload, and this is also confirmed in the Details section of Task Manager.

Expected Behavior

Full CPU Usage

Attachments

(screenshots attached)
