Comments (11)
I've tried mapping the concept to fit Puterbot's case, but I'm having trouble understanding what the paper is essentially proposing. Are we compressing prompts into "gist tokens" after initially feeding the LM the entire uncompressed prompt, so that it saves on read and compute time later on? Or are we compressing from the beginning? The attention masks part is especially confusing, so I was wondering if I could get someone else's perspective on Gisting :D
from openadapt.
Alright, so I think I spent an unreasonable amount of time over multiple reads of the paper trying to understand what exactly the gist tokens themselves look like, but I think the repo you linked above blackboxes it for us.
What remains now is to take advantage of the above repo so that any prompting we do on the LLM we use to automate processes can be gisted, saving compute time. Unfortunately, I haven't tinkered with the Puterbot codebase enough yet to see where we can inject this improvement :/ Will comment/commit soon :D
Edit: Also, I was hoping you could explain what "diff" means? I saw it used on the other repo too, but I haven't been able to infer what it implies :D
@FFFiend if you are interested in being considered for an internship, please submit a PR with your work, and reference this issue in the description.
Thank you for engaging on this issue @FFFiend !
> Are we compressing prompts into "gist tokens" after initially feeding the LM the entire uncompressed prompt so that it saves on read and compute time later on?
I believe that is correct. From https://arxiv.org/pdf/2304.08467.pdf:
> we add a single special gist token to the model vocabulary and embedding matrix, much like the start/end-of-sentence tokens often present in such models. Then, given a (task, input) pair (t, x), we concatenate t and x with a set of k successive gist tokens in between: (t, g1, . . . , gk, x), e.g. Translate French: The cat. This sequence is fed into the model, with the restriction that input tokens after the gist tokens cannot attend to any of the prompt tokens before the gist tokens (but they can attend to the gist tokens). This forces the model to compress the information in the prompt into the gist tokens, since the input x (and output y) cannot attend to the prompt t.
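To make the masking concrete, here is a toy sketch of the restriction described above (my own illustration, not code from the paper or repo; the `make_gist_mask` helper and the token counts are made up):

```python
# Toy illustration of the gist attention mask described above.
# Sequence layout: [prompt t][gist g1..gk][input x]; tokens after the
# gist block may attend to the gist tokens but not to the prompt tokens.

def make_gist_mask(n_prompt, n_gist, n_input):
    """Return a boolean mask where mask[i][j] == True means position i
    may attend to position j (causal attention assumed)."""
    n = n_prompt + n_gist + n_input
    mask = [[j <= i for j in range(n)] for i in range(n)]  # causal mask
    for i in range(n_prompt + n_gist, n):   # input tokens x ...
        for j in range(n_prompt):           # ... are cut off from prompt t
            mask[i][j] = False
    return mask

mask = make_gist_mask(n_prompt=3, n_gist=1, n_input=2)
# The first input token (position 4) can see the gist token (position 3)
# but none of the prompt tokens (positions 0..2):
print(mask[4])  # -> [False, False, False, True, True, False]
```

Since x can only reach t through the gist positions, gradient descent has to pack whatever the prompt contributes into those k gist activations.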
According to the repo (https://github.com/jayelm/gisting) it appears that the trained gist model parameter diffs are now available on HuggingFace for LLaMA-7B and FLAN-T5-XXL 😄
@FFFiend please join us on Slack if you haven't already: https://join.slack.com/t/mldsai/shared_invite/zt-1uf94nn7r-qcQnS~hinLPKftUapNzbuw (link at top of README)
> Alright so I think I spent an unreasonable amount of time trying to understand how exactly the gist tokens themselves looked like from multiple reads on the paper,
Thank you!
> but I think the repo you linked above blackboxes it for us.
Unfortunate, but not a deal breaker.
> I haven't tinkered around with the Puterbot codebase enough yet to try and see where we can inject this improvement
I think the MVP looks like creating a GistingReplayStrategyMixin, analogous to e.g. LLMReplayStrategyMixin at https://github.com/MLDSAI/puterbot/blob/main/puterbot/strategies/llm_mixin.py (along the way, perhaps it's worthwhile thinking about what a general framework for any model might look like).
Integrate whatever they make available with as few lines as possible 👍
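As a rough sketch of what I mean (everything besides the GistingReplayStrategyMixin name is a placeholder; BaseReplayStrategy and gist_compress are stand-ins, not the actual puterbot or gisting APIs):

```python
# Hypothetical sketch of a GistingReplayStrategyMixin, modeled on the
# existing LLMReplayStrategyMixin; the base class, method names, and the
# gist_compress helper are assumptions, not real puterbot/gisting code.

class BaseReplayStrategy:  # stand-in for puterbot's real base class
    pass

def gist_compress(prompt: str) -> str:
    """Placeholder for whatever the gisting repo exposes; here it just
    truncates the prompt so the sketch is runnable."""
    return prompt[:32]

class GistingReplayStrategyMixin(BaseReplayStrategy):
    def get_gisted_completion(self, prompt: str) -> str:
        compressed = gist_compress(prompt)
        # hand the compressed prompt to the underlying LLM completion
        return self.get_completion(compressed)

    def get_completion(self, prompt: str) -> str:
        # stand-in for LLMReplayStrategyMixin.get_completion
        return f"completion for: {prompt}"
```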
> Edit: Also, I was hoping you could explain what "diff" meant? I saw it used on the other repo too but I haven't been able to infer what it implies :D
In general a diff is just a difference between two states (e.g. the previous one and the current one). In our case it refers to Screenshots/WindowStates. I'm not sure about what "the other repo" is referencing, can you please clarify? 🙏
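As a toy illustration of the general sense (the WindowState fields here are made up):

```python
# A "diff" is just whatever changed between two states. Here, two
# hypothetical WindowState snapshots represented as plain dicts.

before = {"title": "Inbox", "left": 0, "top": 0, "width": 800}
after = {"title": "Compose", "left": 0, "top": 0, "width": 800}

# Keep only the keys whose values changed, as (old, new) pairs.
diff = {k: (before[k], after[k]) for k in before if before[k] != after[k]}
print(diff)  # -> {'title': ('Inbox', 'Compose')}
```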
> but I think the repo you linked above blackboxes it for us.
Can you please clarify?
From https://github.com/jayelm/gisting#demo--checkpoints:
> To use the model and try out gist caching, use the src/compress.py script, e.g.

```
python -m src.compress --model_name_or_path jayelm/llama-7b-gist-1 --base_llama_path llama-7b \
    --instruction "Name the top cities in France that should not be missed. Include the best aspects of each place as well."
```
Seems to me like we want this in the GistingReplayStrategyMixin, with a compress method.
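One possible shape for that method, sketched with subprocess (the flags mirror the README example; the submodule path and the output handling are guesses on my part):

```python
# Sketch of wrapping the src.compress CLI in a compress() helper.
# The flags come from the gisting README; the checkout directory and
# the idea of returning raw stdout are assumptions.
import subprocess

def build_compress_command(instruction: str) -> list[str]:
    """Argv for the README's src.compress invocation."""
    return [
        "python", "-m", "src.compress",
        "--model_name_or_path", "jayelm/llama-7b-gist-1",
        "--base_llama_path", "llama-7b",
        "--instruction", instruction,
    ]

def compress(instruction: str, gisting_dir: str = "vendor/gisting") -> str:
    """Run the script from the (hypothetical) submodule checkout and
    return whatever it prints."""
    result = subprocess.run(
        build_compress_command(instruction),
        cwd=gisting_dir, capture_output=True, text=True, check=True,
    )
    return result.stdout
```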
> > Edit: Also, I was hoping you could explain what "diff" meant? I saw it used on the other repo too but I haven't been able to infer what it implies :D
>
> In general a diff is just a difference between two states (e.g. the previous one and the current one). In our case it refers to Screenshots/WindowStates. I'm not sure about what "the other repo" is referencing, can you please clarify? 🙏
Oh I meant the Gisting repo used that word too and I was confused what its technical implications were :D
> > but I think the repo you linked above blackboxes it for us.
>
> Can you please clarify? From https://github.com/jayelm/gisting#demo--checkpoints:
>
> > To use the model and try out gist caching, use the src/compress.py script, e.g.
>
> ```
> python -m src.compress --model_name_or_path jayelm/llama-7b-gist-1 --base_llama_path llama-7b \
>     --instruction "Name the top cities in France that should not be missed. Include the best aspects of each place as well."
> ```
>
> Seems to me like we want this in the GistingReplayStrategyMixin, with a compress method.
Yep, noticed this too; apologies for the confusion. The paper didn't give an example of what the gist tokens look like, and on a cursory read of the README it didn't seem like showing the actual compression was something they highlighted, but I was wrong.
Continuing our conversation: I just found out you can make Python modules using an empty __init__.py file, LOL, so this task seems a tad easier now. I was going crazy looking for ways to integrate that ENTIRE codebase to complete the task.
Edit: it appears I can't make use of Gisting as a module, either via imports or a pip install. I've created an issue over at its repository for now.
Update: I spoke to the author of the repo, who proposed two possible solutions: write a setup.py script, or simply include the repo as a submodule on my fork.
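As a rough sketch of the first option (the package name, version, and layout below are placeholders, not taken from the actual repo):

```python
# Minimal setup.py of the kind suggested, so the gisting repo's `src`
# package becomes importable after `pip install -e .`. The name and
# version are placeholders; the repo's real layout may differ.
from setuptools import find_packages, setup

setup(
    name="gisting",
    version="0.0.1",
    packages=find_packages(include=["src", "src.*"]),
)
```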
So my plan is to use the Gisting repo as a submodule and modify the compress.py main file to return both a compressed version of the input (passed in as the "instruction" variable, since I believe all we will be feeding any LLM we work with is instructions anyway, such as "open gmail and send an email to my Mom") and the gisted input tokens, then pass these into the model to generate the output. That is, instead of using the Tokenizer as in the llm_mixin file on line 53, we generate the input_tokens variable from a call to the modified compress.py file.
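Roughly, the swap I have in mind looks like this (everything here is hypothetical glue, not the real puterbot or gisting code; gist_compress stands in for the modified compress.py entry point):

```python
# Sketch of the plan above: replace the tokenizer call in llm_mixin
# with a call to a modified compress.py that returns the gist tokens
# directly. All names here are hypothetical.

def gist_compress(instruction: str) -> tuple[str, list[int]]:
    """Stand-in for the modified compress.py: returns the compressed
    instruction plus the gisted input token ids (faked here)."""
    fake_token_ids = [hash(w) % 1000 for w in instruction.split()]
    return instruction, fake_token_ids

def get_completion(instruction: str) -> str:
    # Instead of: input_tokens = tokenizer(instruction) ...
    compressed, input_tokens = gist_compress(instruction)
    # ... feed input_tokens to the model; stubbed out here:
    return f"<completion from {len(input_tokens)} gist-prefixed tokens>"
```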
Do I have the right idea? If so I can whip up a quick PR :)