
neurips_llm_efficiency_challenge's People

Contributors

aniketmaurya, carmocca, drisspg, mreso, msaroufim, perlitz, pietrolesci, rasbt, riaz, shushengyuan, weiweiy, xindi-dumbledore


neurips_llm_efficiency_challenge's Issues

Toy submission issues. Incorrect file path?

Hey, I am trying to run the tutorial at https://github.com/llm-efficiency-challenge/neurips_llm_efficiency_challenge/tree/master/sample-submissions/lit-gpt

The docs don't say so, but if I want to make a Lit-GPT submission, I have to cd into /neurips_llm_efficiency_challenge/sample-submissions/lit-gpt because that's where the Dockerfile is, right?

Now, when running the docker build step, I am getting the following error:

 master ~/neurips_llm_efficiency_challenge/sample-submissions/lit-gpt docker build -t toy_submission .
[+] Building 2.5s (12/16)                                                                     docker:default
 => [internal] load .dockerignore                                                                       0.0s
 => => transferring context: 2B                                                                         0.0s
 => [internal] load build definition from Dockerfile                                                    0.0s
 => => transferring dockerfile: 1.52kB                                                                  0.0s
 => [internal] load metadata for ghcr.io/pytorch/pytorch-nightly:c69b6e5-cu11.8.0                       0.1s
 => [ 1/12] FROM ghcr.io/pytorch/pytorch-nightly:c69b6e5-cu11.8.0@sha256:748628fda7661f7e0612299b2012c  0.0s
 => [internal] load build context                                                                       0.0s
 => => transferring context: 19.21kB                                                                    0.0s
 => CACHED [ 2/12] WORKDIR /submission                                                                  0.0s
 => CACHED [ 3/12] COPY /lit-gpt/ /submission/                                                          0.0s
 => CACHED [ 4/12] COPY ./fast_api_requirements.txt fast_api_requirements.txt                           0.0s
 => CACHED [ 5/12] RUN pip install --no-cache-dir --upgrade -r fast_api_requirements.txt                0.0s
 => CACHED [ 6/12] RUN apt-get update && apt-get install -y git                                         0.0s
 => CACHED [ 7/12] RUN pip install -r requirements.txt huggingface_hub sentencepiece                    0.0s
 => ERROR [ 8/12] RUN python scripts/download.py --repo_id openlm-research/open_llama_3b                2.3s

Trying the last step manually, it gives me an error that the scripts/download.py file doesn't exist. Shouldn't it be lit-gpt/scripts/download.py instead?
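Since the Dockerfile's COPY /lit-gpt/ /submission/ step copies the contents of the lit-gpt directory into /submission, scripts/download.py should end up at /submission/scripts/download.py — unless the lit-gpt sources were never fetched. A sketch of a pre-build sanity check, assuming lit-gpt is vendored as a git submodule (which may not match your checkout):

```shell
# Illustrative pre-build check; assumes lit-gpt is a git submodule.
cd neurips_llm_efficiency_challenge/sample-submissions/lit-gpt
git submodule update --init            # fetch lit-gpt sources if missing
ls lit-gpt/scripts/download.py         # should exist before the COPY step runs
docker build -t toy_submission .
```

An uninitialized submodule shows up as an empty directory, so the COPY step succeeds but the scripts go missing inside the image — which would match the symptom above.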

RuntimeError: [enforce fail at inline_container.cc:471] . PytorchStreamWriter failed writing file data/151: file write failed

Following the README, I tried building the Docker image, but I get the following error in the weights-conversion step.

 => ERROR [10/13] RUN python scripts/convert_hf_checkpoint.py --checkpoint_dir checkpoints/open-llama/7B --model_size 7B                                                       66.8s
------
 > [10/13] RUN python scripts/convert_hf_checkpoint.py --checkpoint_dir checkpoints/open-llama/7B --model_size 7B:
65.78 Initializing lit-llama
65.78 Saving to disk at checkpoints/lit-llama/7B
65.78 Processing checkpoints/open-llama/7B/pytorch_model-00001-of-00002.bin
65.78 Traceback (most recent call last):
65.78   File "/submission/scripts/convert_hf_checkpoint.py", line 136, in convert_hf_checkpoint
65.78     sd[sd_key] = saver.store_early(sd[sd_key])
65.78   File "/submission/lit_llama/utils.py", line 469, in store_early
65.78     return SavingProxyForTensor(tensor, self)
65.78   File "/submission/lit_llama/utils.py", line 387, in __init__
65.78     storage_proxy = SavingProxyForStorage(
65.78   File "/submission/lit_llama/utils.py", line 363, in __init__
65.78     storage_key = saver._write_storage_and_return_key(storage)
65.78   File "/submission/lit_llama/utils.py", line 492, in _write_storage_and_return_key
65.79     self.zipfile.write_record(name, storage.data_ptr(), num_bytes)
65.79 RuntimeError: [enforce fail at inline_container.cc:471] . PytorchStreamWriter failed writing file data/151: file write failed
65.79
65.79 During handling of the above exception, another exception occurred:
65.79
65.79 Traceback (most recent call last):
65.79   File "/submission/scripts/convert_hf_checkpoint.py", line 166, in <module>
65.79     CLI(convert_hf_checkpoint)
65.79   File "/opt/conda/lib/python3.10/site-packages/jsonargparse/_cli.py", line 85, in CLI
65.79     return _run_component(component, cfg_init)
65.79   File "/opt/conda/lib/python3.10/site-packages/jsonargparse/_cli.py", line 147, in _run_component
65.79     return component(**cfg)
65.79   File "/opt/conda/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
65.79     return func(*args, **kwargs)
65.79   File "/submission/scripts/convert_hf_checkpoint.py", line 88, in convert_hf_checkpoint
65.79     with incremental_save(output_dir / "lit-llama.pth") as saver:
65.79   File "/submission/lit_llama/utils.py", line 496, in __exit__
65.79     self.zipfile.write_end_of_file()
65.79 RuntimeError: [enforce fail at inline_container.cc:337] . unexpected pos 17973300160 vs 17973300056
------
Dockerfile:22
--------------------
  20 |     # get open-llama weights
  21 |     RUN python scripts/download.py --repo_id openlm-research/open_llama_7b --local_dir checkpoints/open-llama/7B
  22 | >>> RUN python scripts/convert_hf_checkpoint.py --checkpoint_dir checkpoints/open-llama/7B --model_size 7B
  23 |
  24 |     # Copy over single file server
--------------------
ERROR: failed to solve: process "/bin/sh -c python scripts/convert_hf_checkpoint.py --checkpoint_dir checkpoints/open-llama/7B --model_size 7B" did not complete successfully: exit code: 1
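Not part of the original report, but a "file write failed" from PytorchStreamWriter during checkpoint conversion commonly means the writer ran out of disk space; the second traceback shows the failure around the 18 GB mark (pos 17973300056), which is consistent with a full disk inside the Docker build. A sketch of a quick check (paths assume the default Linux Docker data root):

```shell
# Sketch: check free space where Docker stores its build data.
df -h /var/lib/docker      # default Docker data root on Linux
docker system df           # space used by images, containers, build cache
docker builder prune       # reclaim build cache if space is the culprit
```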

ERROR: failed to solve: failed to read dockerfile: open /var/lib/docker/tmp/buildkit-mount3987352626/Dockerfile: no such file or directory

Ref: https://github.com/llm-efficiency-challenge/neurips_llm_efficiency_challenge/tree/master/toy-submission#build-and-run

$ docker build -t toy_submission .

[+] Building 0.1s (2/2) FINISHED                                                                                                                                      docker:default
 => [internal] load build definition from Dockerfile                                                                                                                            0.0s
 => => transferring dockerfile: 2B                                                                                                                                              0.0s
 => [internal] load .dockerignore                                                                                                                                               0.0s
 => => transferring context: 2B                                                                                                                                                 0.0s
ERROR: failed to solve: failed to read dockerfile: open /var/lib/docker/tmp/buildkit-mount3987352626/Dockerfile: no such file or directory
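The "transferring dockerfile: 2B" line suggests Docker found no Dockerfile in the build context at all. A sketch of two ways to point docker at the right file — paths are assumed from the current repo layout, since the linked toy-submission directory appears to have moved to sample-submissions/lit-gpt:

```shell
# Option 1: build from the directory that contains the Dockerfile.
cd neurips_llm_efficiency_challenge/sample-submissions/lit-gpt
docker build -t toy_submission .

# Option 2: build from the repo root, naming the Dockerfile explicitly.
docker build -t toy_submission \
  -f sample-submissions/lit-gpt/Dockerfile sample-submissions/lit-gpt
```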

cc: @carmocca

Remove multilingual tasks

I wonder if you might consider getting rid of the conlang_translation task. Doing multilingual NLP is a whole different thing, and it's useful, but if everything else is English, then having one translation task seems like a distraction.

...And maybe language_identification. That's already so easy and effective with fastText-style models that it seems like a low-priority thing for people to be working on with LLMs.

AFAICT removing these two would mean people can focus on English tasks. There's already a lot to do, so this would make life easier!
cc @weiweiy, @artidoro and @perlitz as requested by @msaroufim

lit-gpt error with quantize lib

I get this error with the following code block:

from quantize.bnb import Linear4bit

class QuantizedLinear(Linear4bit):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, quant_type="nf4", compress_statistics=True, **kwargs)

[error screenshot]

How can I fix this?

Unable to install helm

Hi, I am not able to run the command below:

pip install git+https://github.com/drisspg/helm.git@neruips_client

Below is a snapshot of the error:

[error screenshot]

Is there any workaround for this?
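Since the actual pip error is only visible in the screenshot, the following is only a guess at a common failure mode: VCS installs can fail on an outdated pip or a polluted environment. A sketch using a fresh virtual environment:

```shell
# Sketch: retry the VCS install from a clean virtual environment.
python -m venv helm-env
. helm-env/bin/activate
pip install --upgrade pip      # old pip versions mishandle git+ installs
pip install git+https://github.com/drisspg/helm.git@neruips_client
```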

Missing dependencies in Sample Submission

In the toy submission docs here, I think you'd also need to first install the dependencies, e.g., via pip install -r requirements.txt from the lit-gpt subfolder before running docker build -t toy_submission .
Or perhaps this should be added as a step to the Dockerfile (but I don't know enough about Docker).
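Installing the requirements inside the image, rather than on the host, is the usual Docker pattern; a sketch of what such a step might look like, assuming requirements.txt sits inside the copied lit-gpt sources (the sample Dockerfile may already do exactly this):

```dockerfile
# Illustrative lines only; paths are assumptions about the repo layout.
COPY ./lit-gpt/ /submission/
RUN pip install --no-cache-dir -r requirements.txt
```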

Possible bug in /process endpoint

In the line here, the sizes of the iterables passed to zip do not match. More specifically, tokens includes all of the tokens (i.e., input prompt + model response), whereas logprobs corresponds only to the newly generated tokens, i.e., the model response. As a result, I believe the generated_tokens variable has incorrect values stored in it.
Can someone look into this or confirm it?

cc @drisspg @rasbt
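The mismatch described above can be reproduced in isolation. The names below are hypothetical stand-ins for the server's variables, not the actual endpoint code, and the tail-slicing repair is only one possible fix:

```python
# Illustrative reproduction of the suspected bug: `tokens` spans
# prompt + response, while `logprobs` covers only the response.
tokens = ["<s>", "What", "is", "2+2", "?", "4", "</s>"]   # prompt + generated
logprobs = [-0.1, -0.05]                                   # generated only

# zip() silently truncates to the shorter iterable, so the response
# logprobs get paired with the *prompt* tokens:
wrong = list(zip(tokens, logprobs))
assert wrong == [("<s>", -0.1), ("What", -0.05)]

# One possible fix: align the logprobs with the tail of `tokens`.
generated_tokens = list(zip(tokens[-len(logprobs):], logprobs))
assert generated_tokens == [("4", -0.1), ("</s>", -0.05)]
```

Because zip never raises on length mismatch, this kind of bug produces plausible-looking but misaligned output rather than an error, which matches the "incorrect values" symptom.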

Test HELM locally on gsm8k Request error

I deployed HELM according to helm.md. The following conf works as expected:
entries: [{description: "mmlu:model=neurips/local,subject=college_computer_science", priority: 4}]
But when I use
entries: [{description: "gsm:model=neurips/local", priority: 4}]
I get a request error.
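For reference, a sketch of how such a conf file is typically exercised with HELM's runner — treat the conf filename, suite name, and instance count as placeholders, and double-check the flag names against your installed helm version:

```shell
# run_specs.conf (illustrative contents):
#   entries: [{description: "gsm:model=neurips/local", priority: 4}]
helm-run --conf-paths run_specs.conf --suite v1 --max-eval-instances 10
```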

"/lit-gpt": not found when build docker

ubuntu@146-235-201-180:~/neurips_llm_efficiency_challenge/sample-submissions/lit-gpt$ sudo docker build -t toy_submission .
[+] Building 0.5s (7/16)                                                                                                                                                   
 => [internal] load build definition from Dockerfile                                                                                                                  0.0s
 => => transferring dockerfile: 1.38kB                                                                                                                                0.0s
 => [internal] load .dockerignore                                                                                                                                     0.0s
 => => transferring context: 2B                                                                                                                                       0.0s
 => [internal] load metadata for ghcr.io/pytorch/pytorch-nightly:c69b6e5-cu11.8.0                                                                                     0.5s
 => [internal] load build context                                                                                                                                     0.0s
 => => transferring context: 128B                                                                                                                                     0.0s
 => [ 1/12] FROM ghcr.io/pytorch/pytorch-nightly:c69b6e5-cu11.8.0@sha256:748628fda7661f7e0612299b2012ca3a9407ac920ea791398f9d553de8a43380                             0.0s
 => CACHED [ 2/12] WORKDIR /submission                                                                                                                                0.0s
 => ERROR [ 3/12] COPY /lit-gpt/ /submission/                                                                                                                         0.0s
------
 > [ 3/12] COPY /lit-gpt/ /submission/:
------
Dockerfile:10
--------------------
   8 |     
   9 |     # Copy the specific file into the container at /submission
  10 | >>> COPY /lit-gpt/ /submission/
  11 |     
  12 |     # Setup server requriements
--------------------
ERROR: failed to solve: failed to compute cache key: failed to calculate checksum of ref moby::de7206o7kdfxfqkatsgfjhmw2: "/lit-gpt": not found

I followed all instructions in README.md, but I got the above error.

Does the Dockerfile need to be modified?
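The COPY source /lit-gpt/ is resolved against the build context, so "not found" means no lit-gpt directory exists next to the Dockerfile. Assuming the repo vendors lit-gpt as a git submodule (an assumption about your checkout, not a fact from the log), fetching it first may fix the build:

```shell
# Sketch: make sure the lit-gpt sources exist in the build context.
git submodule update --init          # no-op if lit-gpt is not a submodule
ls lit-gpt/                          # must list the lit-gpt sources
sudo docker build -t toy_submission .
```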
