make error about forward

tencent commented on April 29, 2024
make error

from forward.

Comments (7)

yuanzexi commented on April 29, 2024

Could you specify the environment you used? It would be best to follow the issue template hints below. Thanks.

Describe the bug
A clear and concise description of what the bug is.

Environment

TensorRT Version:
NVIDIA GPU:
NVIDIA Driver Version:
CUDA Version:
CUDNN Version:
Operating System:
Python Version (if applicable):
Tensorflow Version (if applicable):
PyTorch Version (if applicable):

Relevant Files

To Reproduce
Steps to reproduce the behavior:

  1. Go to '...'
  2. Click on '....'
  3. Scroll down to '....'
  4. See error

CB-Jack-S commented on April 29, 2024

OK, let's see.
I am using a Jetson AGX Xavier with JetPack 4.4.1; the operating system is "Ubuntu 18.04.5 LTS".
So CUDA version = 10.2, cuDNN version = 8.0, TensorRT version = 7.1.3, Python version = 3.6.9, and PyTorch version = 1.7.0.
First I cd to the root directory of Forward.
I changed CMakeLists.txt like this:

...
# Enable TensorRT
option(ENABLE_TENSORRT "Enable TensorRT" ON)

# Enable Torch
option(ENABLE_TORCH "Enable Torch" ON)

# Enable TensorFlow
option(ENABLE_TENSORFLOW "Enable TensorFlow" OFF)

# Enable Keras
option(ENABLE_KERAS "Enable Keras" OFF)

# Enable profiling
option(ENABLE_PROFILING "Enable profiling" ON)

# Enable logging
option(ENABLE_LOGGING "Enable logging" ON)

# Enable dynamic batch size
option(ENABLE_DYNAMIC_BATCH "Enable dynamic batch size" OFF)

# Build Python Lib
option(BUILD_PYTHON_LIB "Build Python Lib" ON)

# Enable Inference Tests (need OpenCV)
option(ENABLE_INFER_TESTS "Enable Inference Tests" OFF)

# Enable RNN models forward
option(ENABLE_RNN "Enable RNN models forward" ON)

# Enable unit tests
option(ENABLE_UNIT_TESTS "Enable unit tests" ON)
...

# python
set(PYTHON_EXECUTABLE /usr/bin/python3.6)
...

Then I followed the build instructions:
'mkdir build' and 'cd build'
'cmake .. -DTensorRT_ROOT=/usr/src/tensorrt', which finished with 'Generating done'

So I ran 'make' and, bang, this error happened:

agx@agx-desktop:~/SCW/Forward-master/build$ make
Consolidate compiler generated dependencies of target simple-utils
[ 4%] Built target simple-utils
[ 5%] Building NVCC (Device) object source/trt_engine/CMakeFiles/trt_engine.dir/trt_network_crt/plugins/emb_layer_norm_plugin/trt_engine_generated_emb_layer_norm_kernel.cu.o
In file included from /home/agx/SCW/Forward-master/source/third_party/cub-1.8.0/cub/block/specializations/../../block/block_discontinuity.cuh:37:0,
from /home/agx/SCW/Forward-master/source/third_party/cub-1.8.0/cub/block/specializations/block_histogram_sort.cuh:37,
from /home/agx/SCW/Forward-master/source/third_party/cub-1.8.0/cub/block/block_histogram.cuh:36,
from /home/agx/SCW/Forward-master/source/third_party/cub-1.8.0/cub/cub.cuh:38,
from /home/agx/SCW/Forward-master/source/trt_engine/trt_network_crt/plugins/common/bert_plugin_util.h:33,
from /home/agx/SCW/Forward-master/source/trt_engine/trt_network_crt/plugins/emb_layer_norm_plugin/emb_layer_norm_kernel.cu:36:
/home/agx/SCW/Forward-master/source/third_party/cub-1.8.0/cub/block/specializations/../../block/../util_ptx.cuh:238:61: warning: missing terminating " character
asprmt"bfi.b32 %0, %1, %2, %3;" : "=r"(ret) : ar"(x), "r"(x), b, in) - 1;
^
/home/agx/SCW/Forward-master/source/third_party/cub-1.8.0/cub/block/specializations/../../block/../util_ptx.cuh:282:2: error: #else without #if
#else
^~~~
/home/agx/SCW/Forward-master/source/third_party/cub-1.8.0/cub/block/specializations/../../block/../util_ptx.cuh:284:2: error: #endif without #if
#endif
^~~~~
/home/agx/SCW/Forward-master/source/third_party/cub-1.8.0/cub/block/specializations/../../block/../util_ptx.cuh:294:2: error: #else without #if
#else
^~~~
/home/agx/SCW/Forward-master/source/third_party/cub-1.8.0/cub/block/specializations/../../block/../util_ptx.cuh:296:2: error: #endif without #if
#endif
^~~~~
/home/agx/SCW/Forward-master/source/third_party/cub-1.8.0/cub/block/specializations/../../block/../util_ptx.cuh:306:2: error: #else without #if
#else
^~~~
/home/agx/SCW/Forward-master/source/third_party/cub-1.8.0/cub/block/specializations/../../block/../util_ptx.cuh:308:2: error: #endif without #if
#endif}
^~~~~
/home/agx/SCW/Forward-master/source/third_party/cub-1.8.0/cub/block/specializations/../../block/../util_ptx.cuh:319:2: error: #else without #if
#else
^~~~
/home/agx/SCW/Forward-master/source/third_party/cub-1.8.0/cub/block/specializations/../../block/../util_ptx.cuh:321:46: warning: missing terminating " character
3;" : "word(ret) : word("(x), "rc_bit-of("(x), flags() + z;
^
/home/agx/SCW/Forward-master/source/third_party/cub-1.8.0/cub/block/specializations/../../block/../util_ptx.cuh:322:2: error: #endif without #if
#endif
^~~~~
/home/agx/SCW/Forward-master/source/third_party/cub-1.8.0/cub/block/specializations/../../block/../util_ptx.cuh:334:2: error: #else without #if
#else
^~~~
/home/agx/SCW/Forward-master/source/third_party/cub-1.8.0/cub/block/specializations/../../block/../util_ptx.cuh:336:46: warning: missing terminating " character
3;" : "word(ret) : word("(x), "rc_bit-of("(x), flags() + z;
^
/home/agx/SCW/Forward-master/source/third_party/cub-1.8.0/cub/block/specializations/../../block/../util_ptx.cuh:337:2: error: #endif without #if
#endif
^~~~~
/home/agx/SCW/Forward-master/source/third_party/cub-1.8.0/cub/block/specializations/../../block/../util_ptx.cuh:349:2: error: #else without #if
#else
^~~~
/home/agx/SCW/Forward-master/source/third_party/cub-1.8.0/cub/block/specializations/../../block/../util_ptx.cuh:351:44: warning: missing terminating " character
3;" : "word(ret) : word("(x), "rc_lane("(x), flags() + z;
^
/home/agx/SCW/Forward-master/source/third_party/cub-1.8.0/cub/block/specializations/../../block/../util_ptx.cuh:352:2: error: #endif without #if
#endif
^~~~~
/home/agx/SCW/Forward-master/source/third_party/cub-1.8.0/cub/block/specializations/../../block/../util_ptx.cuh:371:61: warning: missing terminating " character
asfma.rz.ffi.b32 %0, %1, %2, %3;" f"(d(ret)f: ar"(xf, "r"(xf, cr) - 1;
^
/home/agx/SCW/Forward-master/source/third_party/cub-1.8.0/cub/block/specializations/../../block/../util_ptx.cuh:375:2: error: #endif without #if
#endif // DOXYGEN_SHOULD_SKIP_THIS
^~~~~
/home/agx/SCW/Forward-master/source/third_party/cub-1.8.0/cub/block/specializations/../../block/../util_ptx.cuh:383:20: warning: missing terminating " character
vo orileasexit;")>())
^
/home/agx/SCW/Forward-master/source/third_party/cub-1.8.0/cub/block/specializations/../../block/../util_ptx.cuh:395:20: warning: missing terminating " character
vo orileasllup;")>()x;
^
/home/agx/SCW/Forward-master/source/third_party/cub-1.8.0/cub/block/specializations/../../block/../util_ptx.cuh:406:18: warning: missing terminating " character
tef adIdx"urn x;
^
/home/agx/SCW/Forward-master/source/third_party/cub-1.8.0/cub/block/specializations/../../block/../util_ptx.cuh:475:1: error: unterminated comment
/**
^
In file included from /home/agx/SCW/Forward-master/source/trt_engine/trt_network_crt/plugins/common/bert_plugin_util.h:33:0,
from /home/agx/SCW/Forward-master/source/trt_engine/trt_network_crt/plugins/emb_layer_norm_plugin/emb_layer_norm_kernel.cu:36:
/home/agx/SCW/Forward-master/source/third_party/cub-1.8.0/cub/cub.cuh:54:10: fatal error: device/device_run_length_encode.cuh: No such file or directory
#include "device/device_run_length_encode.cuh"
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
compilation terminated.
CMake Error at trt_engine_generated_emb_layer_norm_kernel.cu.o.cmake:220 (message):
Error generating
/home/agx/SCW/Forward-master/build/source/trt_engine/CMakeFiles/trt_engine.dir/trt_network_crt/plugins/emb_layer_norm_plugin/./trt_engine_generated_emb_layer_norm_kernel.cu.o

source/trt_engine/CMakeFiles/trt_engine.dir/build.make:1591: recipe for target 'source/trt_engine/CMakeFiles/trt_engine.dir/trt_network_crt/plugins/emb_layer_norm_plugin/trt_engine_generated_emb_layer_norm_kernel.cu.o' failed
make[2]: *** [source/trt_engine/CMakeFiles/trt_engine.dir/trt_network_crt/plugins/emb_layer_norm_plugin/trt_engine_generated_emb_layer_norm_kernel.cu.o] Error 1
CMakeFiles/Makefile2:556: recipe for target 'source/trt_engine/CMakeFiles/trt_engine.dir/all' failed
make[1]: *** [source/trt_engine/CMakeFiles/trt_engine.dir/all] Error 2
Makefile:90: recipe for target 'all' failed
make: *** [all] Error 2

Thanks for taking a look!

yuanzexi commented on April 29, 2024

Your environment is almost the same as the 'CUDA 10.2' configuration that passes on Travis CI (https://travis-ci.com/github/Tencent/Forward), so the environment is not the problem and the build should work. The error occurs in the third-party 'cub-1.8.0', so we may need to check the files you downloaded.

Your error message shows that characters are missing:
/home/agx/SCW/Forward-master/source/third_party/cub-1.8.0/cub/block/specializations/../../block/../util_ptx.cuh:238:61: warning: missing terminating " character
asprmt"bfi.b32 %0, %1, %2, %3;" : "=r"(ret) : ar"(x), "r"(x), b, in) - 1;
However, line 238 of cub-1.8.0/cub/util_ptx.cuh actually reads asm ("prmt.b32 %0, %1, %2, %3;" : "=r"(ret) : "r"(a), "r"(b), "r"(index));, so the file on your machine must have been corrupted somehow, which leads to the missing terminating " character warning.

From my point of view, there are several possible causes:

  1. The code files you downloaded were damaged in transmission.
  2. The code files are not encoded as UTF-8 on your platform, so they are parsed incorrectly.
  3. The code files were modified for some reason.
  4. The NVCC on your platform cannot handle the asm code correctly.

I suggest that you download a fresh copy of the source code and try again.
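
To rule out the first cause (files damaged in transmission), one option is to compare checksums between a known-good copy of the source tree and the copy that was transferred. The following is only a minimal sketch, not part of Forward: the function names `sha256_of` and `find_mismatches` are hypothetical, and it assumes both trees are reachable from the same machine.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 16), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_mismatches(reference_root: Path, copied_root: Path) -> list:
    """List relative paths that are missing or differ in the copied tree.

    Both arguments are hypothetical directory roots: the known-good
    checkout and the transferred copy.
    """
    mismatches = []
    for ref_file in sorted(reference_root.rglob("*")):
        if not ref_file.is_file():
            continue
        relative = ref_file.relative_to(reference_root)
        copy_file = copied_root / relative
        # A file counts as a mismatch if it is absent or its digest differs.
        if not copy_file.is_file() or sha256_of(ref_file) != sha256_of(copy_file):
            mismatches.append(relative.as_posix())
    return mismatches
```

An empty result would suggest the transfer was clean; any listed path (for example a header under the cub directory) would be worth downloading again.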

CB-Jack-S commented on April 29, 2024

Thanks a lot! That was the problem!
I was using Xftp to transfer those files, and they were damaged in transmission, as you said!
But I ran into a new 'make' error:

[ 4%] Built target simple-utils
[ 87%] Built target trt_engine
[ 97%] Built target fwd_torch
[ 98%] Building CXX object source/py_fwd/CMakeFiles/forward.dir/py_forward.cpp.o
:0:0: warning: "_GLIBCXX_USE_CXX11_ABI" redefined
:0:0: note: this is the location of the previous definition
In file included from /home/agx/SCW/Forward-master/source/py_fwd/py_forward.cpp:32:0:
/home/agx/SCW/Forward-master/source/py_fwd/py_forward_torch.h:32:10: fatal error: torch/csrc/jit/pybind_utils.h: No such file or directory
#include <torch/csrc/jit/pybind_utils.h>
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
compilation terminated.
source/py_fwd/CMakeFiles/forward.dir/build.make:75: recipe for target 'source/py_fwd/CMakeFiles/forward.dir/py_forward.cpp.o' failed
make[2]: *** [source/py_fwd/CMakeFiles/forward.dir/py_forward.cpp.o] Error 1
CMakeFiles/Makefile2:573: recipe for target 'source/py_fwd/CMakeFiles/forward.dir/all' failed
make[1]: *** [source/py_fwd/CMakeFiles/forward.dir/all] Error 2
Makefile:90: recipe for target 'all' failed
make: *** [all] Error 2

I'm not very good at C++, so I don't know whether I am missing some essential library file, such as this 'pybind_utils.h' header. I did install pybind11, though. Maybe you can give me some advice on this issue too. Thank you so much!

yuanzexi commented on April 29, 2024

Sorry, some code was missing that I had not submitted. I have pushed a new commit (9b2139e) to fix this problem, which was introduced by NEW_TORCH_API. You can look at the commit, update your code accordingly, and run make again.
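
When a header such as pybind_utils.h cannot be found at the expected include path, it can help to search the installed PyTorch include tree for it by name, since headers sometimes move between releases. The following is a minimal sketch under assumptions: `locate_header` is a hypothetical helper, and the include root has to be supplied by the caller (for a pip install it is typically the torch/include directory inside site-packages).

```python
from pathlib import Path


def locate_header(include_root: Path, header_name: str) -> list:
    """Return paths, relative to include_root, of every file named header_name."""
    return sorted(
        match.relative_to(include_root).as_posix()
        for match in include_root.rglob(header_name)
        if match.is_file()
    )
```

Calling it with the include root and "pybind_utils.h" lists every matching relative path, which can then be compared against the #include line that failed.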

CB-Jack-S commented on April 29, 2024

Yes, I found that file at torch/csrc/jit/python/pybind_utils.h.
Sorry to ask again, but another error happened:

[ 97%] Built target fwd_torch
[ 98%] Building CXX object source/py_fwd/CMakeFiles/forward.dir/py_forward.cpp.o
:0:0: warning: "_GLIBCXX_USE_CXX11_ABI" redefined
:0:0: note: this is the location of the previous definition
In file included from /home/agx/SCW/Forward-master/source/py_fwd/py_forward.cpp:32:0:
/home/agx/SCW/Forward-master/source/py_fwd/py_forward_torch.h: In member function ‘bool pybind11::detail::type_caster<c10::IValue>::load(pybind11::handle, bool)’:
/home/agx/SCW/Forward-master/source/py_fwd/py_forward_torch.h:62:39: error: no matching function for call to ‘toIValue(pybind11::handle&)’
value = torch::jit::toIValue(src);
^
In file included from /home/agx/.local/lib/python3.6/site-packages/torch/include/torch/csrc/jit/ir/named_value.h:5:0,
from /home/agx/.local/lib/python3.6/site-packages/torch/include/torch/csrc/jit/ir/ir.h:5,
from /home/agx/.local/lib/python3.6/site-packages/torch/include/torch/csrc/jit/api/function_impl.h:4,
from /home/agx/.local/lib/python3.6/site-packages/torch/include/torch/csrc/jit/api/method.h:5,
from /home/agx/.local/lib/python3.6/site-packages/torch/include/torch/csrc/jit/api/object.h:5,
from /home/agx/.local/lib/python3.6/site-packages/torch/include/torch/csrc/jit/frontend/tracer.h:9,
from /home/agx/.local/lib/python3.6/site-packages/torch/include/torch/csrc/autograd/generated/variable_factories.h:12,
from /home/agx/.local/lib/python3.6/site-packages/torch/include/torch/csrc/api/include/torch/types.h:7,
from /home/agx/.local/lib/python3.6/site-packages/torch/include/torch/csrc/api/include/torch/data/dataloader_options.h:4,
from /home/agx/.local/lib/python3.6/site-packages/torch/include/torch/csrc/api/include/torch/data/dataloader/base.h:3,
from /home/agx/.local/lib/python3.6/site-packages/torch/include/torch/csrc/api/include/torch/data/dataloader/stateful.h:3,
from /home/agx/.local/lib/python3.6/site-packages/torch/include/torch/csrc/api/include/torch/data/dataloader.h:3,
from /home/agx/.local/lib/python3.6/site-packages/torch/include/torch/csrc/api/include/torch/data.h:3,
from /home/agx/.local/lib/python3.6/site-packages/torch/include/torch/csrc/api/include/torch/all.h:8,
from /home/agx/.local/lib/python3.6/site-packages/torch/include/torch/extension.h:4,
from /home/agx/SCW/Forward-master/source/py_fwd/py_forward_torch.h:32,
from /home/agx/SCW/Forward-master/source/py_fwd/py_forward.cpp:32:
/home/agx/.local/lib/python3.6/site-packages/torch/include/torch/csrc/jit/ir/constants.h:48:33: note: candidate: c10::optional<c10::IValue> torch::jit::toIValue(const torch::jit::Value*)
TORCH_API c10::optional<IValue> toIValue(const Value* v);
^~~~~~~~
/home/agx/.local/lib/python3.6/site-packages/torch/include/torch/csrc/jit/ir/constants.h:48:33: note: no known conversion for argument 1 from ‘pybind11::handle’ to ‘const torch::jit::Value*’
In file included from /home/agx/SCW/Forward-master/source/py_fwd/py_forward_torch.h:34:0,
from /home/agx/SCW/Forward-master/source/py_fwd/py_forward.cpp:32:
/home/agx/.local/lib/python3.6/site-packages/torch/include/torch/csrc/jit/python/pybind_utils.h:509:15: note: candidate: c10::IValue torch::jit::toIValue(pybind11::handle, const TypePtr&, c10::optional)
inline IValue toIValue(
^~~~~~~~
/home/agx/.local/lib/python3.6/site-packages/torch/include/torch/csrc/jit/python/pybind_utils.h:509:15: note: candidate expects 3 arguments, 1 provided
source/py_fwd/CMakeFiles/forward.dir/build.make:75: recipe for target 'source/py_fwd/CMakeFiles/forward.dir/py_forward.cpp.o' failed
make[2]: *** [source/py_fwd/CMakeFiles/forward.dir/py_forward.cpp.o] Error 1
CMakeFiles/Makefile2:573: recipe for target 'source/py_fwd/CMakeFiles/forward.dir/all' failed
make[1]: *** [source/py_fwd/CMakeFiles/forward.dir/all] Error 2
Makefile:90: recipe for target 'all' failed
make: *** [all] Error

-_-//

yuanzexi commented on April 29, 2024

Sorry, there is a compilation problem here caused by torch 1.7.0. I have submitted a commit (640a788) to fix it. In this Travis CI build (https://travis-ci.com/github/Tencent/Forward/builds/221009275), compilation with CUDA 10.2 + TRT 7.2 + Torch 1.7.0 passes.
