glaciohound / lm-infinite


Implementation of paper "LM-Infinite: Simple On-the-Fly Length Generalization for Large Language Models"

Home Page: https://arxiv.org/abs/2308.16137

License: MIT License

Languages: Python 96.92%, Shell 3.08%
Topics: language-model, long-context, model-diagnostics

lm-infinite's People

Contributors

glaciohound


lm-infinite's Issues

Should the llama model be fine-tuned?

Hello! I am new to LLMs and I want to reproduce your nice work with the LLaMA model (not LLaMA-2).
Should I fine-tune LLaMA on ArXiv or OpenWebText2 before evaluating it?
As far as I understand, both of these datasets are part of LLaMA's pre-training data, so perhaps the raw LLaMA weights will just work?
Thank you so much for your reply!

Some errors.

Hi,

When I run the code, I encounter two errors:

1. The first error occurred when running evaluation on the passkey retrieval task:
Traceback (most recent call last):
File "scripts/eval_downstream_tasks.py", line 121, in
main(args)
File "scripts/eval_downstream_tasks.py", line 71, in main
output, output_ids = model.generate(
TypeError: generate() missing 1 required positional argument: 'do_sample'
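
A likely cause of the first error: the repo's generate() wrapper declares do_sample as a required argument that the eval script doesn't pass. A minimal call-site sketch, assuming the call otherwise keeps its existing inputs (every name other than do_sample is a guess, not the script's actual code):

    output, output_ids = model.generate(
        input_ids,        # whatever inputs the script already passes
        do_sample=False,  # added: greedy decoding for evaluation
    )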

2. The second error occurred when running generation:
Traceback (most recent call last):
File "scripts/eval_generation.py", line 107, in
main(args)
File "scripts/eval_generation.py", line 94, in main
scores = generation_overall_metric(
File "LM-Infinite/data/generation_metrics.py", line 6, in generation_overall_metric
rouge = evaluate.load("rouge")
File "python3.8/dist-packages/evaluate/loading.py", line 731, in load
evaluation_module = evaluation_module_factory(
File "python3.8/dist-packages/evaluate/loading.py", line 681, in evaluation_module_factory
raise FileNotFoundError(
FileNotFoundError: Couldn't find a module script at LM-Infinite/rouge/rouge.py. Module 'rouge' doesn't exist on the Hugging Face Hub either.
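
A likely cause of the second error: evaluate.load("rouge") has to fetch the metric script from the Hugging Face Hub and also needs the rouge_score package installed, and the FileNotFoundError suggests it found neither a local rouge.py nor a reachable Hub copy. A hedged workaround sketch; the local path is hypothetical:

    # pip install rouge_score     # dependency of the rouge metric
    import evaluate

    # With network access, the plain name resolves from the Hub:
    rouge = evaluate.load("rouge")

    # Offline, point evaluate at a locally downloaded copy of the metric
    # script instead (hypothetical path to a folder containing rouge.py):
    rouge = evaluate.load("path/to/local/rouge")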

Looking forward to your reply!

Implementation with RoPE

Hi, thanks for sharing this nice work!
I am a little confused about why all k vectors are kept unrotated while all q vectors are rotated on the global branch. Any explanation would be appreciated!
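
A toy sketch of one reading of this design (mine, not necessarily the authors'): RoPE scores depend only on the relative rotation between q and k, so leaving k unrotated and rotating q by the (capped) distance is equivalent to rotating both, and it lets the global branch clamp every faraway pair to the limit distance. Single-frequency 2-D example; the variable names are illustrative only:

    import math
    import torch

    def rope_rotate(x, pos, theta=0.1):
        # rotate a 2-D vector by angle pos * theta (toy single-frequency RoPE)
        a = pos * theta
        rot = torch.tensor([[math.cos(a), -math.sin(a)],
                            [math.sin(a),  math.cos(a)]])
        return rot @ x

    q, k = torch.randn(2), torch.randn(2)
    m, n = 5000, 100            # absolute positions of query and key
    limit = 4096                # e.g. the --limit_distance value

    # Scores depend only on the relative distance m - n:
    s_both = rope_rotate(q, m) @ rope_rotate(k, n)
    s_q_only = rope_rotate(q, m - n) @ k          # key left unrotated
    print(torch.allclose(s_both, s_q_only, atol=1e-4))   # True

    # Global-branch idea: keep k unrotated and rotate q at the capped
    # distance, so every distant pair behaves as if exactly `limit` apart.
    s_capped = rope_rotate(q, limit) @ k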

How to run inference?

The documentation does not make it clear how to perform inference using lambda attention.
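
Until the docs cover this, here is a hedged sketch of the Λ-shaped mask the paper describes (a global branch over the first tokens plus a causal local window). The parameter names mirror the --global_branch/--local_branch flags used by the eval scripts, but the function itself is illustrative, not the repo's API:

    import torch

    def lambda_mask(seq_len, n_global=100, n_local=4096):
        # Boolean [seq_len, seq_len] mask, True = query may attend to key.
        # Each query sees the first n_global tokens (global branch) plus
        # the most recent n_local tokens (local branch), causally.
        q = torch.arange(seq_len).unsqueeze(1)   # query positions
        k = torch.arange(seq_len).unsqueeze(0)   # key positions
        causal = k <= q
        global_branch = k < n_global
        local_branch = (q - k) < n_local
        return causal & (global_branch | local_branch)

    print(lambda_mask(8, n_global=2, n_local=3).int())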

limited_distance_forward() got an unexpected keyword argument 'padding_mask'

I'm trying to run the eval script.

PYTHONPATH=. deepspeed --include localhost:$CUDA_VISIBLE_DEVICES --master_port $MASTER_PORT \
    scripts/eval_downstream_tasks.py \
    --deepspeed_config configs/zero3_efficient_config.json \
    --model meta-llama/Llama-2-7b-hf --tokenizer_path meta-llama/Llama-2-7b-hf \
    --use_lambda_attention --local_branch 4096 --global_branch 100 --limit_distance 4096 \
    --dataset passkey_retrieval --dataset_dir ${PASSKEY_DATA} --dataset_group ${MAX_LENGTH} \
    --max_generation_length 10 --evaluate_metrics \
    --log_dir $LOG_DIR/$TRIAL
[screenshot of the error]
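
This looks like a transformers version mismatch: newer releases (around v4.34) started passing an extra padding_mask keyword into the Llama attention forward, which a patched forward that doesn't declare it will reject. Two hedged options: pin the transformers version the repo was developed against, or make the patched forward tolerant of unknown kwargs, for example with a small wrapper like the sketch below (names are assumptions, not the repo's code):

    import functools

    def drop_padding_mask(forward_fn):
        # Strip the `padding_mask` kwarg that newer transformers versions
        # pass into attention forwards before calling the patched forward.
        @functools.wraps(forward_fn)
        def wrapped(*args, **kwargs):
            kwargs.pop("padding_mask", None)
            return forward_fn(*args, **kwargs)
        return wrapped

    # usage sketch (layer lookup is hypothetical):
    # attn.forward = drop_padding_mask(attn.forward)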

GPTNeoX or Transformers support?

I'm trying to integrate LM-Infinite into GPTNeoX (pythia-deduped). I managed to get lambda_attn working, but GPTNeoX's rotary implementation is a bit different, and its attention uses a single 3 * hidden_size projection to form QKV, whereas the other model has separate 1 * hidden_size layers for independent Q/K/V. It trains fine, but during inference or evaluation (single batch) I get stuck on a shape mismatch.
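
On the shape mismatch: GPT-NeoX's fused query_key_value projection packs Q/K/V per head as [..., num_heads, 3 * head_size], so a Llama-style split of the full hidden dimension into three chunks gives the wrong layout. A standalone sketch of unpacking it the way the HF GPT-NeoX attention does; not a drop-in for the repo:

    import torch

    def split_fused_qkv(qkv, num_heads):
        # qkv: [batch, seq, 3 * hidden_size] from the fused projection.
        # Returns q, k, v, each [batch, seq, num_heads, head_size].
        b, s, three_h = qkv.shape
        head_size = three_h // (3 * num_heads)
        qkv = qkv.view(b, s, num_heads, 3 * head_size)
        q, k, v = qkv.split(head_size, dim=-1)
        return q, k, v

    q, k, v = split_fused_qkv(torch.randn(1, 16, 3 * 1024), num_heads=16)
    print(q.shape, k.shape, v.shape)   # each: [1, 16, 16, 64]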

I did manage to see the training benefit of lambda_attn, with higher it/s. The GPU metrics are smoother and steadier at high throughput. The CPU also shows higher compute demand than with traditional training, but it doesn't appear to cause any contention. As a test, I managed to train with a larger context on the same hardware at higher performance, so that clearly works.

I was wondering whether a folder, or a separate repo, with these modeling_$model.py files that can fit into transformers would help simplify setup and adoption?
