Comments (12)

Syzygianinfern0 commented on September 22, 2024
  1. We are not sure how the torch/torchvision dependency conflict crept into the requirements file. Possibly the pinned versions were compatible during 1.7.1's early releases (when we installed them), and an incompatibility introduced in later builds has persisted. In any case, thank you for pointing out the compatibility issue.
  2. Although the cudatoolkit shipped with torch==1.7.1 is not version 10.0, the CUDA modules in our code do not depend on it. They compile against the system CUDA version (the one linked to nvcc).
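Since the extensions compile against whatever toolkit `nvcc` resolves to, the output of `nvcc -V` is the relevant thing to check, not the version bundled with the torch wheel. A small helper (hypothetical, for illustration only) that pulls the release number out of that output:

```python
import re

def parse_nvcc_release(nvcc_output: str) -> str:
    """Extract the CUDA release (e.g. '10.0') from `nvcc -V` output."""
    match = re.search(r"release (\d+\.\d+)", nvcc_output)
    if match is None:
        raise ValueError("no CUDA release string found in nvcc output")
    return match.group(1)

# Sample line as printed by `nvcc -V` on a CUDA 10.0 install
sample = "Cuda compilation tools, release 10.0, V10.0.130"
print(parse_nvcc_release(sample))  # -> 10.0
```

Feeding it the output of `subprocess.run(["nvcc", "-V"], capture_output=True, text=True).stdout` would report the toolkit the JIT-compiled extensions will actually build against.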

from palmira.

seekingdeep commented on September 22, 2024
  1. I have CUDA 10.1 installed on my system itself, and I am also using torch with CUDA. I have tested both the conda and pip installations and still get the attached error.

Basically, I can't run inference, so I can't assess the quality of the output results or decide whether to build on this code.

Syzygianinfern0 commented on September 22, 2024

CUDA is unfortunately not backward compatible across minor releases (I have faced the same issue). Code that depends on CUDA 10.0 cannot run against CUDA 10.1.
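The failure mode here is a strict version match: extension binaries link against a specific runtime soname (`libcudart.so.10.0` vs `libcudart.so.10.1`), so the major.minor versions must agree exactly. A toy sketch of that rule (my own illustration, not code from the repo):

```python
def runtime_matches(required: str, installed: str) -> bool:
    """True only if major.minor agree exactly. CUDA runtime sonames are
    versioned per minor release (libcudart.so.10.0 vs libcudart.so.10.1),
    so a binary built against one release will not load against another."""
    return tuple(map(int, required.split("."))) == tuple(map(int, installed.split(".")))

print(runtime_matches("10.0", "10.1"))  # -> False
print(runtime_matches("10.0", "10.0"))  # -> True
```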

seekingdeep commented on September 22, 2024

What about the required PyTorch and torchvision versions? Which ones should I install?

Syzygianinfern0 commented on September 22, 2024

Please go ahead with torch 1.7.1.

seekingdeep commented on September 22, 2024

Should I install the cpuonly build, since there is no CUDA 10.0 variant of PyTorch 1.7.1?

Syzygianinfern0 commented on September 22, 2024

No. As I mentioned earlier, the CUDA modules in the code are compiled against the system CUDA; the CUDA version bundled with the Python packages does not matter here.

Let me lay out the complete configuration again.

  • Python related
    • pytorch==1.7.1 torchvision==0.8.2
    • cudatoolkit (of your choice)
  • System related
    • CUDA 10.0

seekingdeep commented on September 22, 2024

I have CUDA 10.0 and cuDNN 7.3 installed on my system.
I installed PyTorch 1.7.1 and torchvision 0.8.2 via conda (the CUDA 10.1 build) in the conda environment.
I built detectron2 0.4 from source.
But I still get an error when trying to run inference:

(d2) home@home-lnx:~/programs/Palmira$ python demo.py --input ./input/1.jpg --output ./output --config configs/palmira/Palmira.yaml --opts MODEL.WEIGHTS ./Palmira_indiscapes.pth
Using /home/home/.cache/torch_extensions as PyTorch extensions root...
Detected CUDA files, patching ldflags
Emitting ninja build file /home/home/.cache/torch_extensions/check_condition_bbox/build.ninja...
Building extension module check_condition_bbox...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
ninja: no work to do.
Loading extension module check_condition_bbox...
Traceback (most recent call last):
  File "demo.py", line 14, in <module>
    from defgrid.config import add_defgrid_maskhead_config
  File "/home/home/programs/Palmira/defgrid/__init__.py", line 2, in <module>
    from .mask_head import DefGridHead
  File "/home/home/programs/Palmira/defgrid/mask_head.py", line 17, in <module>
    from defgrid.layers.DefGrid.diff_variance import LatticeVariance
  File "/home/home/programs/Palmira/defgrid/layers/DefGrid/diff_variance.py", line 7, in <module>
    from defgrid.layers.DefGrid.check_condition_lattice_bbox.utils import check_condition_f_bbox
  File "/home/home/programs/Palmira/defgrid/layers/DefGrid/check_condition_lattice_bbox/utils.py", line 22, in <module>
    verbose=True,
  File "/home/home/anaconda3/envs/d2/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 997, in load
    keep_intermediates=keep_intermediates)
  File "/home/home/anaconda3/envs/d2/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1213, in _jit_compile
    return _import_module_from_library(name, build_directory, is_python_module)
  File "/home/home/anaconda3/envs/d2/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1564, in _import_module_from_library
    return imp.load_module(module_name, file, path, description)
  File "/home/home/anaconda3/envs/d2/lib/python3.7/imp.py", line 242, in load_module
    return load_dynamic(name, filename, file)
  File "/home/home/anaconda3/envs/d2/lib/python3.7/imp.py", line 342, in load_dynamic
    return _load(spec)
ImportError: libcudart.so.10.0: cannot open shared object file: No such file or directory
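The ImportError at the end means the dynamic loader cannot resolve `libcudart.so.10.0` at import time, typically because the CUDA 10.0 library directory is not on `LD_LIBRARY_PATH`. A minimal probe for this (using `ctypes`, my own sketch rather than anything from the repo):

```python
import ctypes

def can_load(soname: str) -> bool:
    """Return True if the dynamic loader can resolve the given shared library."""
    try:
        ctypes.CDLL(soname)
        return True
    except OSError:
        return False

# False here reproduces the ImportError above: the loader cannot find the
# CUDA 10.0 runtime. Adding its directory (commonly /usr/local/cuda-10.0/lib64)
# to LD_LIBRARY_PATH is the usual fix.
print(can_load("libcudart.so.10.0"))
```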

seekingdeep commented on September 22, 2024

My main system's CUDA version:

home@home-lnx:~$ nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2018 NVIDIA Corporation
Built on Sat_Aug_25_21:08:01_CDT_2018
Cuda compilation tools, release 10.0, V10.0.130

My main system's cuDNN version:

home@home-lnx:~$ cat /usr/local/cuda/include/cudnn.h | grep CUDNN_MAJOR -A 2
#define CUDNN_MAJOR 7
#define CUDNN_MINOR 3
#define CUDNN_PATCHLEVEL 0
--
#define CUDNN_VERSION (CUDNN_MAJOR * 1000 + CUDNN_MINOR * 100 + CUDNN_PATCHLEVEL)

#include "driver_types.h"
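The grep above just pulls the version defines out of cudnn.h; the same information can be extracted programmatically. A small hypothetical helper for that:

```python
import re

def cudnn_version(header_text: str) -> str:
    """Parse the MAJOR/MINOR/PATCHLEVEL defines out of cudnn.h text."""
    parts = []
    for key in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        match = re.search(rf"#define {key} (\d+)", header_text)
        if match is None:
            raise ValueError(f"{key} not found in header")
        parts.append(match.group(1))
    return ".".join(parts)

header = "#define CUDNN_MAJOR 7\n#define CUDNN_MINOR 3\n#define CUDNN_PATCHLEVEL 0\n"
print(cudnn_version(header))  # -> 7.3.0
```

Pointing it at the contents of `/usr/local/cuda/include/cudnn.h` reports the same 7.3.0 shown above (note that on newer cuDNN releases the defines moved to `cudnn_version.h`).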

seekingdeep commented on September 22, 2024

Hmm, this seems to be caused by having CUDA 10.0 installed on the system while the PyTorch 1.7.1 binaries ship with CUDA 10.1, as explained in facebookresearch/detectron2#588 (comment).
It seems that PyTorch needs to be compiled from source.
Wouldn't it be better to make this repository compatible with the PyTorch prebuilt binaries?

seekingdeep commented on September 22, 2024

Solved by building PyTorch 1.7.1 and torchvision 0.8.2 from source within the environment, building detectron2 0.4 from source within the environment, and keeping CUDA 10.0 and cuDNN 7.3 installed on the main system.

Syzygianinfern0 commented on September 22, 2024

it seems that pytorch needs to be compiled from source.

We did not have to do that. Our torch and detectron were installed from pre-built binaries. Also, some other people from our group have reproduced the setup necessary to run this code.

Wouldn't it be better to make this repository code compatible with pytorch prebuilts

We believed it was. However, after your case, we may need to investigate further. Sorry for the inconvenience, and glad to hear it is fixed now.
