GPUassert: invalid device symbol (cule issue, closed)

nvlabs avatar nvlabs commented on August 18, 2024
GPUassert: invalid device symbol

from cule.

Comments (5)

KyunghyunLee avatar KyunghyunLee commented on August 18, 2024

I figured out the issue.
When I built torchcule, I got an error at line 11 of setup.py (https://github.com/NVlabs/cule/blob/master/setup.py#L11).
I changed it to "codes = ['70']", similar to line 14.
torchcule then built successfully, but I got the error message above when running it.

I dug into tables.hpp and found that the code specifies the GPU architecture.
'70' actually means 'sm_70', which is the architecture of the Tesla V100.
Other codes are listed at the link below. For the 1080 Ti, the code is '61'.
http://arnon.dk/matching-sm-architectures-arch-and-gencode-for-various-nvidia-cards/

I cleaned the 'build' and 'dist' folders, then rebuilt torchcule.
It works great now.
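
The mapping described above, from a GPU's compute capability to the code string used in setup.py, can be sketched in Python. This is only a minimal sketch and not cule's actual build logic: `capability_to_code` and `detect_code` are hypothetical helper names, and the lookup assumes PyTorch is installed with CUDA support (it returns None otherwise).

```python
def capability_to_code(major, minor):
    """Turn a compute capability such as (6, 1) into the code string '61'."""
    return f"{major}{minor}"


def detect_code(device_index=0):
    """Return the code for a local GPU, or None if it cannot be queried.

    Hypothetical helper: assumes PyTorch is the way the capability is read.
    """
    try:
        import torch  # optional dependency, only needed for detection
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    major, minor = torch.cuda.get_device_capability(device_index)
    return capability_to_code(major, minor)
```

For example, `capability_to_code(7, 0)` gives '70' (sm_70, Tesla V100) and `capability_to_code(6, 1)` gives '61' (sm_61, 1080 Ti), matching the codes mentioned in this comment.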

ifrosio avatar ifrosio commented on August 18, 2024

How many GPUs do you have on your machine and which one is GPU-0 (the one you are using since you are passing ('cuda', 0) as device)?
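
One way to answer this is to list the GPUs with `nvidia-smi --query-gpu=index,name --format=csv,noheader` and read off the entry for index 0. The snippet below is a rough sketch that parses that kind of output; `parse_gpu_list` is a hypothetical helper and the sample text is made up for illustration, not taken from this thread. Note that CUDA's default device ordering can differ from the PCI bus order that nvidia-smi uses, unless `CUDA_DEVICE_ORDER=PCI_BUS_ID` is set.

```python
import csv
import io


def parse_gpu_list(csv_text):
    """Parse 'index, name' rows into a {index: name} dict."""
    gpus = {}
    reader = csv.reader(io.StringIO(csv_text), skipinitialspace=True)
    for row in reader:
        if len(row) < 2:
            continue  # skip blank or malformed lines
        gpus[int(row[0])] = row[1].strip()
    return gpus


# Hypothetical sample output, used only to illustrate the format.
sample = "0, Example GPU A\n1, Example GPU B\n"
print(parse_gpu_list(sample))
```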

KyunghyunLee avatar KyunghyunLee commented on August 18, 2024

I got the exact same error:

python ./examples/ppo/ppo_main.py --use-cuda-env --use-openai-test-env  --gpu 0
{'ale_start_steps': 400,
 'alpha': 0.99,
 'batch_size': 256,
 'clip_epsilon': 0.1,
 'conf_file': None,
 'entropy_coef': 0.01,
 'env_name': 'PongNoFrameskip-v4',
 'episodic_life': False,
 'eps': 1e-05,
 'evaluation_episodes': 10,
 'evaluation_interval': 1000000,
 'gamma': 0.99,
 'gpu': 0,
 'local_rank': 0,
 'log_dir': 'runs',
 'loss_scale': None,
 'lr': 0.00065,
 'lr_scale': False,
 'max_episode_length': 18000,
 'max_grad_norm': 0.5,
 'multiprocessing_distributed': False,
 'no_cuda_train': False,
 'normalize': False,
 'num_ales': 16,
 'num_gpus_per_node': -1,
 'num_stack': 4,
 'num_steps': 5,
 'opt_level': 'O0',
 'output_filename': None,
 'plot': False,
 'ppo_epoch': 3,
 'profile': False,
 'save_interval': 0,
 'seed': 1565661279,
 't_max': 50000000,
 'tau': 1.0,
 'use_adam': False,
 'use_cuda_env': True,
 'use_gae': False,
 'use_openai': False,
 'use_openai_test_env': True,
 'value_loss_coef': 0.5,
 'verbose': False}

PyTorch  : 1.1.0
CUDA     : 10.0.130
CUDNN    : 7501
APEX     : 0.1.0

GPUassert: invalid device symbol /home/lkh/Codes/cule/cule/atari/cuda/tables.hpp 43

Here is my nvidia-smi output:

Tue Aug 13 11:06:38 2019       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 410.48                 Driver Version: 410.48                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 108...  Off  | 00000000:03:00.0 Off |                  N/A |
| 33%   57C    P0    65W / 250W |     12MiB / 11178MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce GTX 108...  Off  | 00000000:04:00.0  On |                  N/A |
|  0%   48C    P8    16W / 250W |    501MiB / 11177MiB |      8%      Default |
+-------------------------------+----------------------+----------------------+

Shmuma avatar Shmuma commented on August 18, 2024

ifrosio avatar ifrosio commented on August 18, 2024

Thanks - we are modifying the code to support multiple architectures, although this may increase compilation time. Will close when done.
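
As an illustration of what supporting multiple architectures could look like in a build script, the sketch below turns a list of codes into one nvcc `-gencode` flag per architecture. It is an assumption about the general approach, not the change that was made to cule; `gencode_flags` is a hypothetical helper name.

```python
def gencode_flags(codes):
    """Build one nvcc -gencode flag per architecture code, e.g. '61' -> sm_61."""
    flags = []
    # Deduplicate and sort numerically so the output is stable.
    for code in sorted(set(codes), key=int):
        flags.append(f"-gencode=arch=compute_{code},code=sm_{code}")
    return flags


# Codes mentioned in this thread: '70' (Tesla V100) and '61' (1080 Ti).
print(gencode_flags(["70", "61"]))
```

Each extra architecture adds another compilation pass for the device code, which is why build time grows with the length of the list.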
