rustformers / llmcord
A Discord bot, written in Rust, that generates responses using the LLaMA language model.
License: GNU General Public License v3.0
#20 brought to my attention that embeds could be used as a way of presenting the response; they have full Markdown support, as well as longer message lengths.
I'd like to add this as an option (making sure to increase the message length) so that users can specify the use of embeds if that works better than raw messages for their use case.
Dear llmcord developer,
Greetings! I am vansinhu, a community developer and volunteer at InternLM. Your work has been immensely beneficial to me, and I believe it can be effectively utilized in InternLM as well. You are welcome to join our Discord: https://discord.gg/gF9ezcmtM3 . I hope to get in touch with you.
Best regards,
vansinhu
The cancel button doesn't appear until it starts generating. It should be present (possibly a different colour) to stop the generation from happening altogether.
Previously, the cancel button would wipe out the content of your generation, but this was removed to handle multi-message generation. This should be restored in the form of a second delete button that will remove all messages from a generation strand.
You should be able to reply to generated text to insert new text into a generation. This should have an optional prefix for chat mode.
It should keep a few sessions in memory in a LRU cache. Sessions not in memory are replayed with a progress indicator.
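A minimal sketch of the proposed session store: a fixed-capacity LRU keyed by channel ID. The names (`SessionCache`, `Session`) are illustrative, not llmcord's actual types, and a real implementation would replay evicted sessions with a progress indicator as described above.

```rust
use std::collections::HashMap;

struct Session {
    transcript: String,
}

struct SessionCache {
    capacity: usize,
    map: HashMap<u64, Session>,
    order: Vec<u64>, // most recently used at the back
}

impl SessionCache {
    fn new(capacity: usize) -> Self {
        Self { capacity, map: HashMap::new(), order: Vec::new() }
    }

    fn get(&mut self, id: u64) -> Option<&Session> {
        if self.map.contains_key(&id) {
            // Touch the entry: move it to the most-recently-used position.
            self.order.retain(|&k| k != id);
            self.order.push(id);
        }
        self.map.get(&id)
    }

    fn insert(&mut self, id: u64, session: Session) {
        if self.map.len() >= self.capacity && !self.map.contains_key(&id) {
            // Evict the least recently used session; it would be
            // rebuilt by replaying message history on next access.
            if let Some(oldest) = self.order.first().copied() {
                self.order.remove(0);
                self.map.remove(&oldest);
            }
        }
        self.order.retain(|&k| k != id);
        self.order.push(id);
        self.map.insert(id, session);
    }
}

fn main() {
    let mut cache = SessionCache::new(2);
    cache.insert(1, Session { transcript: String::from("a") });
    cache.insert(2, Session { transcript: String::from("b") });
    cache.get(1); // touch 1, so 2 becomes least recently used
    cache.insert(3, Session { transcript: String::from("c") });
    assert!(cache.get(2).is_none()); // 2 was evicted
    assert!(cache.get(1).is_some());
}
```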
Hi,
Love the repo and your work.
I added embed messages in a fork that I made here, increasing MESSAGE_CHUNK_SIZE to 4096.
I've already tested it and it works amazingly :)
Let me know if you'd accept the PR, or you can take the idea of increasing the message size. (I'm currently learning Rust, so any feedback is also well received.)
Not sure if this is scope creep or not, but it would be nice for users to define their own prompt presets (potentially with parameters) and to be able to use them:
/alpaca preset set name:image preset:"What would be a good image description for "$1"?"
/alpaca prompt:"a glorious kingdom" preset:"image"
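The preset mechanism above could be as simple as positional-parameter substitution. A hypothetical sketch (only `$1` is handled; a real implementation would support multiple parameters and escaping):

```rust
// Substitute the user's prompt into a stored preset template.
// `apply_preset` is an illustrative name, not an existing llmcord API.
fn apply_preset(preset: &str, arg: &str) -> String {
    preset.replace("$1", arg)
}

fn main() {
    let preset = r#"What would be a good image description for "$1"?"#;
    let prompt = apply_preset(preset, "a glorious kingdom");
    assert_eq!(
        prompt,
        r#"What would be a good image description for "a glorious kingdom"?"#
    );
    println!("{prompt}");
}
```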
Finished release [optimized] target(s) in 0.41s
Running target\release\llmcord.exe
Loaded hyperparameters
Error: invariant broken: 1003168914 <= 2 in Some("models/7B/ggml-alpaca-q4_0.bin")
error: process didn't exit successfully: target\release\llmcord.exe
(exit code: 1)
What happened?
The prompt text is not clearly distinguishable to other users unless it's obvious from context. There could be an option to bold or underline the prompt text for clarity.
Instead of loading a single model upfront, each command should have its own model associated with it (or an option to use the "root" model). Not sure about the memory management strategy yet - always force mmap for multi-model support?
Don't think there's an easy way to detect this, so it'd be a config option.
If a response goes past 2000 characters (or some specified limit), the bot should reply to itself to continue the output.
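A sketch of the splitting logic, preferring to break at line boundaries. The 2000-character limit is Discord's cap for plain messages (embed descriptions allow 4096); the function name is hypothetical.

```rust
// Split a long response into chunks of at most `limit` characters,
// breaking at newlines where possible.
fn chunk_message(text: &str, limit: usize) -> Vec<String> {
    let mut chunks = Vec::new();
    let mut current = String::new();
    for line in text.split_inclusive('\n') {
        if current.len() + line.len() > limit && !current.is_empty() {
            chunks.push(std::mem::take(&mut current));
        }
        // A single line longer than the limit is split byte-wise here;
        // real code should split on char boundaries for non-ASCII text.
        let mut rest = line;
        while rest.len() > limit {
            let (head, tail) = rest.split_at(limit);
            chunks.push(head.to_string());
            rest = tail;
        }
        current.push_str(rest);
    }
    if !current.is_empty() {
        chunks.push(current);
    }
    chunks
}

fn main() {
    let text = "a".repeat(4500);
    let chunks = chunk_message(&text, 2000);
    assert_eq!(chunks.len(), 3); // 2000 + 2000 + 500
    assert!(chunks.iter().all(|c| c.len() <= 2000));
}
```

Each chunk after the first would then be sent as a reply to the previous one, forming a visible continuation chain.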
For each slash command, you should be able to specify default values for various settings (like temperature), and also set whether those values can be changed.
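One possible shape for such lockable per-command defaults, sketched with hypothetical types (not llmcord's actual config structures): each setting carries a default and a flag saying whether users may override it.

```rust
// A setting with a command-defined default that may or may not be
// user-overridable.
struct Setting<T> {
    default: T,
    user_overridable: bool,
}

impl<T: Copy> Setting<T> {
    // Resolve the effective value given an optional user-supplied one.
    fn resolve(&self, user: Option<T>) -> T {
        match user {
            Some(v) if self.user_overridable => v,
            _ => self.default,
        }
    }
}

fn main() {
    let temperature = Setting { default: 0.8_f32, user_overridable: true };
    let top_k = Setting { default: 40_u32, user_overridable: false };
    assert_eq!(temperature.resolve(Some(0.2)), 0.2);
    assert_eq!(top_k.resolve(Some(100)), 40); // locked: user value ignored
}
```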
Hello,
I've installed everything successfully (note, in case it's helpful to others: on WSL2 I needed to update ssl-dev and pkg-config) and then updated the config file with the path to the LLaMA GGML model I downloaded, as well as my token (which I would recommend adding as a commented-out field in the config).
When I run it, I see the model load everything correctly...
Loading of model complete
Model size = 3616.07 MB / num tensors = 291
<bot> is connected; registering commands...
<bot> is good to go!
I see the bot live in the discord it should be showing up in, however, the commands don't show up and just entering them doesn't work. They're set to 'true' on the enabled line of config as well.
At present, there is one hard-coded prompt input. This should be per-command, so that you can supply more than one input to the template. Possibly use a templating language?
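A minimal sketch of per-command named inputs, assuming a `{name}`-placeholder syntax (the template text and function name are illustrative; a real templating language would add escaping and default values):

```rust
use std::collections::HashMap;

// Replace `{name}` placeholders in a template from a map of inputs.
fn render_template(template: &str, inputs: &HashMap<&str, &str>) -> String {
    let mut out = template.to_string();
    for (name, value) in inputs {
        out = out.replace(&format!("{{{name}}}"), value);
    }
    out
}

fn main() {
    let template =
        "### Instruction:\n{instruction}\n\n### Input:\n{input}\n\n### Response:\n";
    let mut inputs = HashMap::new();
    inputs.insert("instruction", "Summarize the text.");
    inputs.insert("input", "llmcord is a Discord bot.");
    let prompt = render_template(template, &inputs);
    assert!(prompt.contains("Summarize the text."));
    assert!(prompt.contains("llmcord is a Discord bot."));
    println!("{prompt}");
}
```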
If a message that came in during an existing generation starts generating, a new message should be sent pointing readers to where the request was made, so they know generation has started.
I have been trying to use GPT-2 models with the following settings in config.toml:
[model]
path = "/usr/src/llmcord/weights/wizardcoder-guanaco-15b-v1.1.ggmlv1.q8_0.bin"
context_token_length = 2048
architecture = "GPT-2"
prefer_mmap = true
I'm getting zero response the moment I try it.
I think llmcord doesn't support GPT-2 right now?
Any help is gladly received.
If the bot can't send a message to the server (due to insufficient permissions or the like), it will panic, and take down the bot with it.
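The fix is to treat a failed send as a recoverable error rather than unwrapping. In this sketch, `send_message` is a stand-in for the real Discord call (not serenity's actual API); the point is matching on the `Result` instead of panicking:

```rust
// Stand-in for the real send call: fails for an empty message here,
// the way the real call fails on insufficient permissions.
fn send_message(content: &str) -> Result<(), String> {
    if content.is_empty() {
        Err("Missing Permissions".to_string())
    } else {
        Ok(())
    }
}

// Log and continue instead of unwrapping and taking the bot down.
fn send_or_log(content: &str) -> bool {
    match send_message(content) {
        Ok(()) => true,
        Err(e) => {
            eprintln!("failed to send message, skipping: {e}");
            false
        }
    }
}

fn main() {
    assert!(send_or_log("hello"));
    assert!(!send_or_log("")); // logs instead of panicking
}
```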
C:\Users\micro\Downloads\llamacord>cargo run --release
Finished release [optimized] target(s) in 0.16s
Running `target\release\llamacord.exe`
thread '<unnamed>' panicked at 'called `Result::unwrap()` on an `Err` value: InvalidMagic { path: "models/vicuna-13b-free-q4_0.bin" }', src\main.rs:116:14
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
Sometimes the model needs to be coaxed to generate something. Having command-defined buttons (e.g. pressing a button to feed "### Comments:\n\n" to the LLM) and a modal button to feed a specific prompt should help. Maybe they should only apply after the next newline?
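The "only apply after the next newline" idea could work by holding the injected text until the stream produces a newline, then splicing it in. A sketch with illustrative names:

```rust
// Holds a pending injection and applies it at the next newline in the
// streamed output.
struct Injector {
    pending: Option<String>,
    output: String,
}

impl Injector {
    fn new() -> Self {
        Self { pending: None, output: String::new() }
    }

    // Queue text from a command-defined button or modal.
    fn queue(&mut self, text: &str) {
        self.pending = Some(text.to_string());
    }

    // Append a streamed token; splice the injection after a newline.
    fn push_token(&mut self, token: &str) {
        self.output.push_str(token);
        if token.contains('\n') {
            if let Some(injection) = self.pending.take() {
                self.output.push_str(&injection);
            }
        }
    }
}

fn main() {
    let mut inj = Injector::new();
    inj.queue("### Comments:\n\n");
    inj.push_token("The story ends.");
    assert!(!inj.output.contains("### Comments:")); // not yet applied
    inj.push_token("\n");
    assert!(inj.output.ends_with("### Comments:\n\n"));
}
```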
Not sure how bots do this these days since I remember reactions having some problems with the API and rate-limiting, but a reaction to cancel prompt generation would be helpful if a message immediately goes in a direction you don't want.
It does not take autocomplete-like prompts.
The current logic is designed to show the entire prompt so that you can see how far the model has processed the prompt. Unfortunately, this gets quite noisy for Alpaca.
Instead, it should immediately show the user's prompt in bold strikethrough. As processing reaches each part of the prompt, the strikethrough should be removed from that portion. It should then wait until it actually starts generating new tokens before showing anything else.
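A sketch of that display logic, using plain strikethrough for simplicity (the real version would combine it with bold) and treating progress as a character count:

```rust
// Render the prompt with the unprocessed tail struck through; the
// struck region shrinks as tokens are consumed.
fn format_prompt_progress(prompt: &str, n_processed: usize) -> String {
    // Find the byte offset of the n_processed-th character.
    let split = prompt
        .char_indices()
        .nth(n_processed)
        .map(|(i, _)| i)
        .unwrap_or(prompt.len());
    let (done, remaining) = prompt.split_at(split);
    if remaining.is_empty() {
        done.to_string()
    } else {
        format!("{done}~~{remaining}~~")
    }
}

fn main() {
    assert_eq!(format_prompt_progress("Hello world", 0), "~~Hello world~~");
    assert_eq!(format_prompt_progress("Hello world", 5), "Hello~~ world~~");
    assert_eq!(format_prompt_progress("Hello world", 11), "Hello world");
}
```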