lmstudio-ai / model-catalog
A collection of standardized JSON descriptors for Large Language Model (LLM) files.
License: Apache License 2.0
Please add the ability for users to add models to the catalog right in the LM Studio UI.
Uncaught Exception:
Error: dlopen(/Applications/LM Studio.app/Contents/Resources/app/.webpack/main/build/Release/gguf/llm_engine.node, 0x0001): Symbol not found: _cblas_sgemm$NEWLAPACK$ILP64
Referenced from: /Applications/LM Studio.app/Contents/Resources/app/.webpack/main/build/Release/gguf/libllama.dylib (built for macOS 14.0 which is newer than running OS)
Expected in: /System/Library/Frameworks/Accelerate.framework/Versions/A/Accelerate
at process.func [as dlopen] (node:electron/js2c/asar_bundle:2:1822)
at Module._extensions..node (node:internal/modules/cjs/loader:1326:18)
at Object.func [as .node] (node:electron/js2c/asar_bundle:2:1822)
at Module.load (node:internal/modules/cjs/loader:1096:32)
at Module._load (node:internal/modules/cjs/loader:937:12)
at f._load (node:electron/js2c/asar_bundle:2:13330)
at Module.require (node:internal/modules/cjs/loader:1120:19)
at require (node:internal/modules/cjs/helpers:103:18)
at 4001 (/Applications/LM Studio.app/Contents/Resources/app/.webpack/main/index.js:2:943)
at r (/Applications/LM Studio.app/Contents/Resources/app/.webpack/main/index.js:8:3957)
My device configuration:
Why is there no "copy all" button for each response? Maybe I am just not noticing it.
As in the title
I'm attempting to download my first model with LM Studio. Unfortunately, when I select a model to download I get a "Failed" message immediately. Is there anywhere in the application where I can view the details of what is causing the download failure, such as a log of some sort?
would it be possible to enable support for Mixtral-8x7B base and instruct models?
Suggestion to incorporate TTS/STT with Whisper AI
request to support dbrx-16x12b-instruct-q4_0.gguf
"failed to load model"
{
"cause": "unknown model architecture: 'phi2'",
"title": "Failed to load model",
"errorData": {
"n_ctx": 32768,
"n_batch": 512,
"n_gpu_layers": 0
}
}
On my MacBook Air M1 with 16GB of RAM, I am unable to search or download in LM Studio, but on my other Mac Mini M2 with 16GB of RAM, everything works fine.
"I have a suggestion. I hope you can program a virtual NPU (Neural Processing Unit) that allows large models to run on it, with speeds still faster than pure CPU. The virtual NPU can call upon both the GPU and CPU to work together for acceleration. For example, it could make use of common operation instructions within the CPU and GPU as much as possible. This way, it would be more universal, allowing more older PCs and mobile phones to load more open-source large models. When more people use open-source software, more people will participate in building open-source."
This is an interesting idea! The concept of using a virtual NPU to leverage both CPU and GPU resources for running large models could indeed make it more accessible for devices with limited hardware capabilities. However, implementing such a system would be quite complex and require deep knowledge of hardware architecture, software design, and machine learning frameworks. It would also involve optimizing the distribution of tasks between the CPU and GPU to achieve the best performance. Nevertheless, if successful, it could significantly contribute to the open-source community and make advanced machine learning models more accessible to a broader range of users.
I am using Mistral 7B, and it was configured with a context length of 1500. So when I asked a question it was not able to finish; after the first answer it errored out due to the 1500-token context length.
So I tried setting it higher, to 7500. But then the computer first froze, and then the answer was empty.
I am using a MacBook Pro with M1.
A JavaScript error occurred in the main process
Uncaught Exception:
Error: ENOENT: no such file or directory, watch '/Users/andre/.cache/lm-studio/config-presets/'
at FSWatcher.<anonymous> (node:internal/fs/watchers:247:19)
at Object.watch (node:fs:2315:34)
at new t.ConfigsManager (/Applications/LM Studio.app/Contents/Resources/app/.webpack/main/index.js:8:183951)
at new W (/Applications/LM Studio.app/Contents/Resources/app/.webpack/main/index.js:8:249928)
at 1287 (/Applications/LM Studio.app/Contents/Resources/app/.webpack/main/index.js:8:313105)
at r (/Applications/LM Studio.app/Contents/Resources/app/.webpack/main/index.js:8:401969)
at /Applications/LM Studio.app/Contents/Resources/app/.webpack/main/index.js:8:402521
at Object.<anonymous> (/Applications/LM Studio.app/Contents/Resources/app/.webpack/main/index.js:8:402547)
at Module._compile (node:internal/modules/cjs/loader:1241:14)
at Module._extensions..js (node:internal/modules/cjs/loader:1296:10)
I want to use the app on a server through the web. Is this possible?
Dear folks,
First, I want to thank you for this amazing UI. I am enjoying using it on my M3 Max.
I am a no-coder but very enthusiastic about running LLMs locally, since I have sensitive data.
I was wondering if it is possible to do RAG using LM Studio, perhaps via plugins such as ChromaDB?
If it is supported, could you please make a small tutorial on how to upload own data to LM Studio?
Thank you so much.
Best,
Aymen
I just installed Gemma from LM Studio and it gives this error:
{
"cause": "llama.cpp error: 'unknown model architecture: 'gemma''",
"errorData": {
"n_ctx": 2048,
"n_batch": 512,
"n_gpu_layers": 10
},
"data": {
"memory": {
"ram_capacity": "39.24 GB",
"ram_unused": "14.18 GB"
},
"gpu": {
"type": "NvidiaCuda",
"vram_recommended_capacity": "6.00 GB",
"vram_unused": "5.00 GB"
},
"os": {
"platform": "win32",
"version": "10.0.22631",
"supports_avx2": true
},
"app": {
"version": "0.2.14",
"downloadsDir": "C:\\Users\\fabio\\.cache\\lm-studio\\models"
},
"model": {}
},
"title": "Failed to load model",
"systemDiagnostics": {
"memory": {
"ram_capacity": 42131316736,
"ram_unused": 16063143936
},
"gpu": {
"type": "NvidiaCuda",
"vram_recommended_capacity": 6441926656,
"vram_unused": 5370806272
},
"os": {
"platform": "win32",
"version": "10.0.22631",
"supports_avx2": true
},
"app": {
"version": "0.2.14",
"downloadsDir": "C:\\Users\\fabio\\.cache\\lm-studio\\models"
},
"model": {
"gguf_preview": {
"name": "gemma-2b-it",
"arch": "gemma",
"context_length": 8192,
"embedding_length": 2048,
"num_layers": 18,
"rope": {},
"head_count": 8,
"head_count_kv": 1,
"parameters": ""
},
"filesize": 2669351840,
"config": {
"path": "C:\\Users\\fabio\\.cache\\lm-studio\\models\\lmstudio-ai\\gemma-2b-it-GGUF\\gemma-2b-it-q8_0.gguf",
"loadConfig": {
"n_ctx": 2048,
"n_batch": 512,
"rope_freq_base": 0,
"rope_freq_scale": 0,
"n_gpu_layers": 10,
"use_mlock": true,
"main_gpu": 0,
"tensor_split": [
0
],
"seed": -1,
"f16_kv": true,
"use_mmap": true,
"no_kv_offload": false,
"num_experts_used": 0
}
}
}
}
}
Do we need to wait for a new update?
Thank you very much for your work, which allows me to conveniently use LLMs locally. However, I have a small issue: my network environment requires a proxy to access GitHub. It would be very helpful if you could add a proxy configuration option to the software. Thank you.
Sir, I have a model I want to use in LM Studio: chatGLM. Can you help me convert it to GGUF? The link is: https://huggingface.co/ayoolaolafenwa/ChatLM/tree/main
It seems to be a PyTorch model. Or can you tell me how I can convert it myself? I am a new learner. Thank you very much.
Some models feature Q8_0 files, but that quant isn't supported in Metal.
This issue is for replacing those quants with Q6_K versions so that the models work with Metal on.
I have this basic code for RAG using the Auto-Merging technique within LlamaIndex:

from llama_index.core import StorageContext, VectorStoreIndex
from llama_index.core.node_parser import HierarchicalNodeParser
from llama_index.core.postprocessor import SentenceTransformerRerank
from llama_index.core.query_engine import RetrieverQueryEngine
from llama_index.core.response_synthesizers import get_response_synthesizer
from llama_index.core.retrievers import AutoMergingRetriever

# Parse the document into a hierarchy of nodes and index the leaves
node_parser = HierarchicalNodeParser.from_defaults(chunk_sizes=chunk_size)
nodes = node_parser.get_nodes_from_documents([doc])
storage_context = StorageContext.from_defaults()
storage_context.docstore.add_documents(nodes)
index = VectorStoreIndex(nodes, storage_context=storage_context, embed_model=embed_model)

# Retrieve leaf nodes, auto-merge them into parents, then rerank
postproc = None
reranker = SentenceTransformerRerank(model="cross-encoder/ms-marco-MiniLM-L-2-v2", top_n=3)
retriever = index.as_retriever(similarity_top_k=retrieval_metadata_similarity)
retriever = AutoMergingRetriever(retriever, index.storage_context, verbose=True)
response_synthesizer = get_response_synthesizer(response_mode=response_mode)
node_postprocessors = [p for p in [postproc, reranker] if p is not None]
query_engine = RetrieverQueryEngine(retriever, node_postprocessors=node_postprocessors)
Now, I want to use nomic embeddings via LM Studio, whose basic code is this:

pip install openai

from openai import OpenAI

client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

def get_embedding(text, model="nomic-ai/nomic-embed-text-v1.5-GGUF"):
    # LM Studio's local server exposes an OpenAI-compatible embeddings API
    text = text.replace("\n", " ")
    return client.embeddings.create(input=[text], model=model).data[0].embedding

print(get_embedding("Once upon a time, there was a cat."))
However, this gives me embeddings directly, whereas I want to use them in the code above (specifically as the embed_model passed to VectorStoreIndex).
How can I do that?
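One way to bridge the two snippets is to wrap the endpoint call in a small embedding class and hand an instance of it to the index. This is only a sketch under assumptions: the class name `LMStudioEmbedding` is mine, and for real LlamaIndex use it would subclass `llama_index.core.embeddings.BaseEmbedding` and implement its `_get_text_embedding`/`_get_query_embedding` hooks; it is kept dependency-free here so the shape of the call is clear.

```python
from typing import List

class LMStudioEmbedding:
    """Minimal wrapper around an OpenAI-compatible /v1/embeddings endpoint
    such as LM Studio's local server (hypothetical helper, not an official
    LM Studio or LlamaIndex class)."""

    def __init__(self, client, model: str):
        # client: e.g. OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")
        self._client = client
        self._model = model

    def get_text_embedding(self, text: str) -> List[float]:
        # Same normalization as the snippet above: strip newlines first
        text = text.replace("\n", " ")
        resp = self._client.embeddings.create(input=[text], model=self._model)
        return resp.data[0].embedding
```

A `BaseEmbedding` subclass exposing this method can then be passed as `embed_model=` in the `VectorStoreIndex(...)` call from the RAG snippet.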
I was just wondering if there are any plans to integrate Apple's Ferret support.
Thanks!
Hey, your models aren't as fast as https://labs.perplexity.ai/. Have you used https://github.com/NVIDIA/TensorRT-LLM
for your optimization?
Is there a plan for MPI support?
If you have time, I hope you can fix it. Thank you very much.
Please include AWQ models. This will allow us to choose better models with a lighter compute configuration.
Since lmstudio already supports vision adapters, how hard would it be to add https://huggingface.co/blog/idefics2 support?
On the AI Chat view, when there are a lot of lines it is very slow and inconvenient to scroll (using the mouse wheel or the keyboard up/down keys), and there is no vertical scroll bar.
A good feature would be to add a vertical scroll bar.
Damn, this is such garbage. What a waste of time.
Announcement: https://mistral.ai/news/announcing-mistral-7b/
Original model: https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.1
GGUF model: https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.1-GGUF
Hi,
I would like to use the LLaVA model, but can't because the "add attachments" input is missing
:)
Win10.
If I start LM Studio v0.2.10, one core instantly goes to 100%. I can kill that process while the other LM Studio process still works without CPU usage. It's fine that the CPU or GPU does work when I actually talk to the model ;)
The same happens if I keep the process that always uses 100%: if I close LM Studio, the process is still there.
How come, if I copy a model that already works (like Wizard-Vicuna), rename the copy to gpt-3.5-turbo-16k-0613 so it can integrate with ChatDev, and change the corresponding .json file to the same name, it will no longer load? How am I meant to rename a file to work with another API call if renaming the file and its corresponding config breaks loading of that model?
I was searching for this HF repo in LM Studio but somehow couldn't find it in the results (I later realised the Compatibility Filter was hiding some results), even though the same model uploaded by another user shows up.
I download models and they never show up, though they are visible via My Models -> Show in File Explorer.
I'm not sure if I have missed an options button or if my current screen settings of 1280x760 are so low that an options tab doesn't appear, but it sure would be nice to set the GUI to match my display settings, including text size. I have the setting low mostly for gaming purposes, which allows me to play more GPU-intensive games. This allows me to play at better visual quality overall by lowering the pixel count and applying my limited laptop resources to the other settings so the game doesn't look "potato". The catch is that when exiting, I have to remember to switch it to the default... Most often I don't, and it's not an issue until I run into a program that doesn't scale its UI to match the many, MANY different possible settings customer computers may be set to, including large TV screens. Again, I apologize if this functionality is in your GUI somewhere, but at this screen setting, I cannot find it. I should also mention, I'm an end user, not a database engineer or programmer, so forgive my lack of technogizmo terminology. :)
Hi, thanks for the great LM Studio tool.
I am just wondering if there is any way to interact with LangChain, for example via an API endpoint, after the model is loaded?
Thank you!
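Since LM Studio's local server speaks the OpenAI wire protocol (the same base_url used in the embedding snippet earlier), LangChain's OpenAI wrappers can usually be pointed at it by overriding the base URL. Below is only a dependency-free sketch of the underlying request; the model name "local-model" and the port are assumptions, and the actual LangChain route would be something like `ChatOpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")`.

```python
import json
import urllib.request

def build_chat_request(prompt: str,
                       base_url: str = "http://localhost:1234/v1") -> urllib.request.Request:
    """Build a POST request for an OpenAI-compatible /chat/completions
    endpoint, which is the protocol LM Studio's local server exposes."""
    payload = {
        "model": "local-model",  # placeholder name; assumed to be ignored by the local server
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.7,
    }
    return urllib.request.Request(
        base_url + "/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Sending it (requires the local server to be running):
# with urllib.request.urlopen(build_chat_request("Hello")) as r:
#     print(json.loads(r.read())["choices"][0]["message"]["content"])
```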
Support for Phi-2: https://huggingface.co/microsoft/phi-2
If possible; it's a small but very capable LLM, with all the advantages.
I really need to be able to test the JSON output of a given model for a given input and system prompt.
It would be much easier if I could do that in LM Studio by loading the grammar file that allows these models to produce JSON properly. Would it be possible to add this functionality?
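For reference, the grammar files in question are llama.cpp GBNF files. A simplified JSON grammar might look like the sketch below; this is a reduced illustration, not the full json.gbnf shipped with llama.cpp (real JSON strings and numbers also need escape-sequence and exponent rules that this version omits):

```
root   ::= object
value  ::= object | array | string | number | "true" | "false" | "null"
object ::= "{" ws (member ("," ws member)*)? ws "}"
member ::= ws string ws ":" ws value
array  ::= "[" ws (value ("," ws value)*)? ws "]"
string ::= "\"" [^"\\]* "\""
number ::= "-"? [0-9]+ ("." [0-9]+)?
ws     ::= [ \t\n]*
```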