Comments (9)
Hi Hicham,
Once again, thanks for the detailed report!
This time, it looks like a problem with CMake and your library path: the standard FindCUDA script could not find the libcudart_static library, as discussed e.g. in this thread on the NVIDIA devtalk forum. Could you please report the result of the
locate libcudart_static
command? If locate is not installed on your machine,
sudo apt-get install mlocate
sudo updatedb
should do the trick. For reference, the output on my laptop:
jean@jean-XPS-15-9550:~$ locate libcudart_static
/usr/lib/x86_64-linux-gnu/libcudart_static.a
And on a new Google Colab session:
!sudo apt-get install mlocate
!sudo updatedb
!locate libcudart_static
/usr/local/cuda-10.0/targets/x86_64-linux/lib/libcudart_static.a
If locate cannot find the libcudart_static file in your CUDA install, updating the linker path with ldconfig may allow you to fix this problem.
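If installing mlocate is not an option, a quick Python sketch can scan the usual prefixes by hand; the glob patterns below are assumptions based on the two outputs above, so extend them to match your install:

```python
import glob

def find_libcudart_static():
    """Return every libcudart_static.a found under common install prefixes."""
    patterns = [
        # Debian/Ubuntu system package location (as on my laptop above):
        "/usr/lib/x86_64-linux-gnu/libcudart_static*",
        # Typical toolkit installs (as on the Colab session above):
        "/usr/local/cuda*/lib64/libcudart_static*",
        "/usr/local/cuda*/targets/*/lib/libcudart_static*",
    ]
    hits = []
    for pattern in patterns:
        hits.extend(glob.glob(pattern))
    return sorted(set(hits))

if __name__ == "__main__":
    # An empty list suggests FindCUDA will fail for the same reason.
    print(find_libcudart_static())
```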
N.B. for @bcharlier : FindCUDA has recently been deprecated by Kitware/CMake. Long-term, we may have to switch to their new "first-class" support for CUDA.
from keops.
Hi Hicham,
In your issue #2 (comment), cmake seems to have found the NVIDIA stuff, as this line tells us:
-- Autodetected CUDA architecture(s): 6.0 6.0 6.0 3.7 3.7 3.7 3.7 3.7 3.7 3.7 3.7 3.7 3.7
This should correspond to the compute capabilities of the 13 GPUs (?!) available on your system. Is that correct? If not, please provide the output of nvidia-smi.
In any case, the FindCuda module of cmake doesn't do much more than look in /usr/local/cuda... So, the best practice is to ask the admin to create a /usr/local/cuda symbolic link pointing to the latest CUDA toolkit installed on your system. As documented here, https://cmake.org/cmake/help/latest/module/FindCUDA.html :
To use a different installed version of the toolkit set the environment variable CUDA_BIN_PATH before running cmake (e.g. CUDA_BIN_PATH=/usr/local/cuda1.0 instead of the default /usr/local/cuda)
That is not very handy when using keops from python... even if it could be done in your conda env.
Hope this helps.
b.
PS: @jeanfeydy, in fact, we use a mixture of the "first-class" support of the CUDA language and the old FindCuda module. As a matter of fact, the "first-class" support of CUDA by cmake was still young as of 2018/19. The good old FindCuda module has some features (mainly for detection purposes, like the internal cmake variables CUDA_NVCC_EXECUTABLE, CUDA_FOUND, etc.) that are not available when one simply uses enable_language(CUDA). As cmake is moving quickly, this could change in the near future...
Hi,
I am using the same server as Hicham hence encountering the same issues. Could you elaborate on how to set CUDA_BIN_PATH locally in a conda env?
Many thanks!
Gwendoline
For conda, you should find a how-to in the docs here:
https://conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html#macos-and-linux
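Alternatively, since pykeops only invokes cmake when it compiles a formula, setting the variable from Python before that first compilation may also do the trick. A minimal sketch (the toolkit path below is a placeholder, adjust it to your install):

```python
import os

# Point CMake's FindCUDA at a specific toolkit *before* pykeops triggers
# any compilation. "/usr/local/cuda-10.2" is a placeholder path.
os.environ["CUDA_BIN_PATH"] = "/usr/local/cuda-10.2"

# Only import (and hence compile) afterwards:
# import pykeops
# pykeops.test_torch_bindings()
```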
b.
Hi Benjamin,
Just a quick note on these two issues: according to a mail that Hicham sent me last week, #2 is on "Marco's machine" hosted by the MokaPlan Inria team at Gare de Lyon, while this issue #3 is on the machine of the Parietal Inria team at Neurospin.
As far as I can tell, the problems are not 100% the same for both configurations, so it may be best to keep both tracks separate, waiting for Hicham to come back to these issues ;-)
This is most likely due to a system configuration problem. I'm closing the issue for now; do not hesitate to re-open it if needed.
Hi @bcharlier ,
I'm having similar issues where cuda exists and torch finds it but pykeops doesn't.
I ran:
import pykeops
pykeops.verbose = True
pykeops.build_type = 'Debug'
pykeops.clean_pykeops()
pykeops.test_torch_bindings()
And get:
Environment variable CUDA_ROOT is set to:
/usr/local/cuda
For compatibility, CMake is ignoring the variable.
Call Stack (most recent call first):
CMakeLists.txt:15 (include)
This warning is for project developers. Use -Wno-dev to suppress it.
-- The CXX compiler identification is GNU 9.2.1
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Looking for C++ include pthread.h
-- Looking for C++ include pthread.h - found
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Failed
-- Looking for pthread_create in pthreads
-- Looking for pthread_create in pthreads - not found
-- Looking for pthread_create in pthread
-- Looking for pthread_create in pthread - found
-- Found Threads: TRUE
-- No GPU detected. USE_CUDA set to FALSE.
.....
RuntimeError: [KeOps] This KeOps shared object has been compiled without cuda support:
- to perform computations on CPU, simply set tagHostDevice to 0
- to perform computations on GPU, please recompile the formula with a working version of cuda.
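For what it's worth, when pykeops prints "No GPU detected" while torch works, it can help to first check whether the CUDA runtime is visible to the dynamic linker at all. A minimal stdlib-only sketch (not the actual KeOps detection code):

```python
import ctypes.util

def cudart_visible():
    """Return the libcudart name the linker resolves, or None if it cannot."""
    return ctypes.util.find_library("cudart")

name = cudart_visible()
if name is None:
    # The linker cannot see the CUDA runtime: run ldconfig or extend
    # LD_LIBRARY_PATH, then clean and recompile the pykeops formulas.
    print("libcudart is not on the linker path.")
else:
    print("CUDA runtime found as", name)
```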
Cheers
Hi,
I tried to dig into the problem on my own, but I don't know anything about cmake.
When you say :
In any cases, the FindCuda module of cmake don't do much than looking in /usr/local/cuda ... So, the best practice is to contact the admin to create a symbolic /usr/local/cuda link to the last cuda lib installed on your system. As documented here, https://cmake.org/cmake/help/latest/module/FindCUDA.html : To use a different installed version of the toolkit set the environment variable CUDA_BIN_PATH before running cmake (e.g. CUDA_BIN_PATH=/usr/local/cuda1.0 instead of the default /usr/local/cuda)
You were thinking about this line:
Line 12 in ae0b921
Right?
But I tested an "empty" cmake that tries to find cuda:
https://gist.github.com/daidedou/dc0e43070195d3b4f8899eed3fc3062f
It appears that cmake says that cuda is found:
-- Found CUDA: /usr/local/cuda (found version "10.2")
-- Configuring done
-- Generating done
-- Build files have been written to: /home/machin/Documents/these/build
So the problem might well come from your handmade script that counts the GPUs, rather than from finding CUDA itself. If you want to share any information about what you wanted to do (it seems simple, but still), I could try to continue digging on my own.
It might also be interesting to reopen the issue.
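For context, the GPU-counting step essentially boils down to a call to cudaGetDeviceCount from the CUDA runtime. The sketch below reproduces that call through ctypes, as a rough stand-in for the detection script (the library lookup is an assumption, and None just means libcudart could not be loaded):

```python
import ctypes
import ctypes.util

def cuda_device_count():
    """Return the number of CUDA devices, or None if libcudart is unusable."""
    libname = ctypes.util.find_library("cudart")
    if libname is None:
        return None
    try:
        cudart = ctypes.CDLL(libname)
    except OSError:
        return None
    count = ctypes.c_int(0)
    # cudaGetDeviceCount returns cudaSuccess (0) on success.
    if cudart.cudaGetDeviceCount(ctypes.byref(count)) != 0:
        return None
    return count.value

print(cuda_device_count())
```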
Ok, so you were right: it was a configuration problem. I'm on Fedora, and there are a lot of things to do in order to get CUDA working: there is a specific version of g++ to install, and I need to pass -ccbin=cuda-g++ each time.
My problem was that this part :
Lines 18 to 45 in ae0b921
ends up calling nvcc without that flag. I also had to set CMAKE_CUDA_HOST_COMPILER to cuda-g++.
But still, this is too tied to my particular configuration.
I guess that if you want to check whether you are able to find cuda or not, you should do the following (described in detail for those unfamiliar with cmake, like me):
- create an empty folder, and put a file named CMakeLists.txt inside it, filled with this code: https://gist.github.com/daidedou/dc0e43070195d3b4f8899eed3fc3062f
- create a folder named "build"; from inside it, launch cmake .. in the command line
- if it fails to find cuda, it means that FindCuda is failing
- if not, replace CMakeLists.txt with this one: https://gist.github.com/daidedou/700cd87d6e24fbff1e2190fad1224c08
- relaunch cmake .. inside build; it will fail and create a file named detect_cuda_props.cu
- then run nvcc --run detect_cuda_props.cu, and you will get a proper error that you can dig into (if pytorch does in fact detect your GPUs!)
Hope it will help someone!
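To script the last step, a small stdlib-only helper can first check that nvcc is on the PATH before attempting the --run compilation (cuda-g++ and the -ccbin flag below are the Fedora-specific assumptions from this comment):

```python
import shutil
import subprocess

def nvcc_version():
    """Return the output of `nvcc --version`, or None if nvcc is not on PATH."""
    nvcc = shutil.which("nvcc")
    if nvcc is None:
        return None
    return subprocess.run([nvcc, "--version"],
                          capture_output=True, text=True).stdout

out = nvcc_version()
if out is None:
    print("nvcc not found: fix your PATH before trying detect_cuda_props.cu")
else:
    print(out)
    # On Fedora, the actual compile step then needs the host-compiler flag:
    # subprocess.run(["nvcc", "-ccbin=cuda-g++", "--run",
    #                 "detect_cuda_props.cu"])
```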