
Comments (11)

Bellk17 commented on June 8, 2024

Found a fix; for me the issue was running the benchmark test from the source directory. During installation, the _C module is compiled into the site-packages directory of the pip installation. When running from the source directory, the script picks up the code from source rather than the installed package containing the compiled module.

Command:

  • From the vllm source directory:
  • python3 benchmarks/benchmark_throughput.py --input-len=50 --output-len=100 --enforce-eager --tensor-parallel-size=6

Error:
...
File "~/workspace/vllm/vllm/_custom_ops.py", line 176, in reshape_and_cache
    vllm_cache_ops.reshape_and_cache(key, value, key_cache, value_cache,
NameError: name 'vllm_cache_ops' is not defined
(Caught error: No module named 'vllm._C')

The script is picking up the local module at ~/workspace/vllm/vllm instead of the installed module. Running the command from a different directory, such as the benchmarks directory, fixes this.
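A quick way to confirm the shadowing (a minimal diagnostic sketch, not part of vLLM):

```python
# Run this from the same directory you launch the benchmark from.
import importlib.util
import vllm

print(vllm.__file__)  # a path inside your source checkout means the source tree is shadowing
print(importlib.util.find_spec("vllm._C"))  # None => the compiled extension is not found here
```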

@yananchen1989 I notice your stack trace is also coming from source (/home/chenyanan/vllm/vllm/_custom_ops.py); try running from a separate directory after installing/compiling from source. Let me know if this fixes the issue.

That being said, the try/except imports are causing unhelpful stack traces; I will look into auditing the compiled modules and adding useful warnings when they are not detected.
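For reference, a minimal sketch of the kind of guarded import that would surface a clearer warning (the names and message below are illustrative, not vLLM's actual code):

```python
import logging

logger = logging.getLogger(__name__)

try:
    from vllm._C import cache_ops as vllm_cache_ops  # built at install time
except ImportError as e:
    vllm_cache_ops = None
    # Report the root cause up front instead of deferring to a NameError later.
    logger.warning(
        "Could not import the compiled vllm._C extension (%s). If you are "
        "running from the source checkout, cd out of it so the installed "
        "package is used.", e)
```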


peterauyeung commented on June 8, 2024

> Found a fix; for me the issue was running the benchmark test from the source directory. [...]

Confirmed this is correct. I just needed to cd out of the source directory, and I was able to run without the error.


cybrtooth commented on June 8, 2024

I just received this error as well. It seems to only happen with non-quantized Mistral-7B models.


yananchen1989 commented on June 8, 2024

Using LangChain works as an alternative:

https://python.langchain.com/docs/integrations/llms/vllm/

```python
from langchain_community.llms import VLLM

llm_vllm = VLLM(
    model='mistralai/Mistral-7B-Instruct-v0.2',
    trust_remote_code=True,  # mandatory for hf models
    max_new_tokens=2048,
    temperature=1,
    # tensor_parallel_size=...  # for distributed inference
)
```
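Generation then goes through the standard LangChain runnable interface (the prompt here is just an example):

```python
print(llm_vllm.invoke("Explain KV caching in one sentence."))
```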


Bellk17 commented on June 8, 2024

I'm seeing the same issue.

Catching the import error gives:
No module named 'vllm._C'

Also seeing warnings during install:

...
CMake Warning at /home/tensorwave/install_vllm/venv/lib/python3.10/site-packages/torch/share/cmake/Torch/TorchConfig.cmake:22 (message):
  static library kineto_LIBRARY-NOTFOUND not found.
Call Stack (most recent call first):
  /home/tensorwave/install_vllm/venv/lib/python3.10/site-packages/torch/share/cmake/Torch/TorchConfig.cmake:127 (append_torchlib_if_found)
  CMakeLists.txt:67 (find_package)


CMake Warning at CMakeLists.txt:124 (message):
  Pytorch version 2.1.1 expected for ROCMm 6.x build, saw 2.4.0 instead.


-- HIP supported arches: gfx906;gfx908;gfx90a;gfx940;gfx941;gfx942;gfx1030;gfx1100
-- HIP target arches: gfx942;gfx942;gfx942;gfx942;gfx942;gfx942;gfx942;gfx942
CMake Warning at CMakeLists.txt:266 (message):
  Unable to create _punica_C target because none of the requested
  architectures (gfx942;gfx942;gfx942;gfx942;gfx942;gfx942;gfx942;gfx942) are
  supported, i.e.  >= 8.0
...

It works when TP is not set.

Currently trying to get this working on MI300X with ROCm 6.1.


leiwen83 commented on June 8, 2024

The tests/ folder also suffers from this vllm_ops-not-defined issue.

I created PR #4231 for pytest, which forces pytest to search for the module in the installed location.
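As a hedged sketch (not necessarily the approach taken in #4231), a conftest.py can strip the repo root from sys.path so the installed package, with the compiled _C extension, wins:

```python
# conftest.py (illustrative sketch only)
import os
import sys

REPO_ROOT = os.path.dirname(os.path.abspath(__file__))
# Remove the repo root so "import vllm" resolves to the site-packages copy.
sys.path[:] = [p for p in sys.path
               if os.path.abspath(p or os.getcwd()) != REPO_ROOT]
```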


dagelf commented on June 8, 2024

I get this error after doing a clean install with pip install -e . at commit 26f2fb5. There were no errors during the installation... but could this runtime error be because I used dependencies that are too new (Python 3.10.12, PyTorch 2.3, CUDA 12.4)?


chrisociepa commented on June 8, 2024

It looks like PyTorch 2.3.0 causes the problem.
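One way to see the underlying failure that the try/except hides (a small diagnostic sketch; the exact message will vary):

```python
try:
    import vllm._C  # the extension is compiled against a specific PyTorch ABI
except ImportError as e:
    # After a torch upgrade this often reports an undefined-symbol error
    # rather than a plain "No module named 'vllm._C'".
    print(f"vllm._C failed to import: {e}")
```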


Semihal commented on June 8, 2024

I have this error:

INFO 05-22 14:25:08 utils.py:660] Found nccl from library /lib64/libnccl.so.2
INFO 05-22 14:25:09 selector.py:81] Cannot use FlashAttention-2 backend because the flash_attn package is not found. Please install it for better performance.
INFO 05-22 14:25:09 selector.py:32] Using XFormers backend.
INFO 05-22 14:25:34 model_runner.py:175] Loading model weights took 13.5516 GB
[rank0]: Traceback (most recent call last):
[rank0]:   File "<frozen runpy>", line 198, in _run_module_as_main
[rank0]:   File "<frozen runpy>", line 88, in _run_code
[rank0]:   File "/usr/local/lib64/python3.11/site-packages/vllm/entrypoints/openai/api_server.py", line 168, in <module>
[rank0]:     engine = AsyncLLMEngine.from_engine_args(
[rank0]:              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/usr/local/lib64/python3.11/site-packages/vllm/engine/async_llm_engine.py", line 366, in from_engine_args
[rank0]:     engine = cls(
[rank0]:              ^^^^
[rank0]:   File "/usr/local/lib64/python3.11/site-packages/vllm/engine/async_llm_engine.py", line 324, in __init__
[rank0]:     self.engine = self._init_engine(*args, **kwargs)
[rank0]:                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/usr/local/lib64/python3.11/site-packages/vllm/engine/async_llm_engine.py", line 442, in _init_engine
[rank0]:     return engine_class(*args, **kwargs)
[rank0]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/usr/local/lib64/python3.11/site-packages/vllm/engine/llm_engine.py", line 172, in __init__
[rank0]:     self._initialize_kv_caches()
[rank0]:   File "/usr/local/lib64/python3.11/site-packages/vllm/engine/llm_engine.py", line 249, in _initialize_kv_caches
[rank0]:     self.model_executor.determine_num_available_blocks())
[rank0]:     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/usr/local/lib64/python3.11/site-packages/vllm/executor/gpu_executor.py", line 106, in determine_num_available_blocks
[rank0]:     return self.driver_worker.determine_num_available_blocks()
[rank0]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/usr/local/lib64/python3.11/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
[rank0]:     return func(*args, **kwargs)
[rank0]:            ^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/usr/local/lib64/python3.11/site-packages/vllm/worker/worker.py", line 139, in determine_num_available_blocks
[rank0]:     self.model_runner.profile_run()
[rank0]:   File "/usr/local/lib64/python3.11/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
[rank0]:     return func(*args, **kwargs)
[rank0]:            ^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/usr/local/lib64/python3.11/site-packages/vllm/worker/model_runner.py", line 888, in profile_run
[rank0]:     self.execute_model(seqs, kv_caches)
[rank0]:   File "/usr/local/lib64/python3.11/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
[rank0]:     return func(*args, **kwargs)
[rank0]:            ^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/usr/local/lib64/python3.11/site-packages/vllm/worker/model_runner.py", line 808, in execute_model
[rank0]:     hidden_states = model_executable(**execute_model_kwargs)
[rank0]:                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/usr/local/lib64/python3.11/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
[rank0]:     return self._call_impl(*args, **kwargs)
[rank0]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/usr/local/lib64/python3.11/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
[rank0]:     return forward_call(*args, **kwargs)
[rank0]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/usr/local/lib64/python3.11/site-packages/vllm/model_executor/models/qwen2.py", line 316, in forward
[rank0]:     hidden_states = self.model(input_ids, positions, kv_caches,
[rank0]:                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/usr/local/lib64/python3.11/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
[rank0]:     return self._call_impl(*args, **kwargs)
[rank0]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/usr/local/lib64/python3.11/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
[rank0]:     return forward_call(*args, **kwargs)
[rank0]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/usr/local/lib64/python3.11/site-packages/vllm/model_executor/models/qwen2.py", line 253, in forward
[rank0]:     hidden_states, residual = layer(
[rank0]:                               ^^^^^^
[rank0]:   File "/usr/local/lib64/python3.11/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
[rank0]:     return self._call_impl(*args, **kwargs)
[rank0]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/usr/local/lib64/python3.11/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
[rank0]:     return forward_call(*args, **kwargs)
[rank0]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/usr/local/lib64/python3.11/site-packages/vllm/model_executor/models/qwen2.py", line 202, in forward
[rank0]:     hidden_states = self.input_layernorm(hidden_states)
[rank0]:                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/usr/local/lib64/python3.11/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
[rank0]:     return self._call_impl(*args, **kwargs)
[rank0]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/usr/local/lib64/python3.11/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
[rank0]:     return forward_call(*args, **kwargs)
[rank0]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/usr/local/lib64/python3.11/site-packages/vllm/model_executor/layers/layernorm.py", line 60, in forward
[rank0]:     ops.rms_norm(
[rank0]:   File "/usr/local/lib64/python3.11/site-packages/vllm/_custom_ops.py", line 106, in rms_norm
[rank0]:     vllm_ops.rms_norm(out, input, weight, epsilon)
[rank0]:     ^^^^^^^^
[rank0]: NameError: name 'vllm_ops' is not defined

This works for me:

pip install https://github.com/vllm-project/vllm/releases/download/v0.4.2/vllm-0.4.2-cp311-cp311-manylinux1_x86_64.whl --extra-index-url https://download.pytorch.org/whl/cu121


dong-liuliu commented on June 8, 2024

> Found a fix; for me the issue was running the benchmark test from the source directory. [...]

I also hit this error, and it was fixed after changing my working directory out of the vllm source code directory. If your error stack trace shows your own source path, try to cd out.

Probably many of us share the habit of running the tests or getting-started examples directly from the source code directory :)


DarkLight1337 commented on June 8, 2024

Fixed by #5009

