cuda's People

Contributors

1div0, scaronni

Forkers

1div0

cuda's Issues

CMake can't find the CUDA libraries/headers

CMake seems to have made some changes to how they search for the CUDA Toolkit (see CMakeCUDAFindToolkit.cmake for details).

This doesn't seem to work with the way CUDA libs are laid out when installed from this repo. Trying to enable_language(CUDA) in any CMakeLists.txt gives:

CMake Error at /home/va_erie/.local/lib/python3.12/site-packages/cmake/data/share/cmake-3.28/Modules/Internal/CMakeCUDAFindToolkit.cmake:148 (message):
  Couldn't find CUDA library root.
Call Stack (most recent call first):
  /home/va_erie/.local/lib/python3.12/site-packages/cmake/data/share/cmake-3.28/Modules/CMakeDetermineCUDACompiler.cmake:89 (cmake_cuda_find_toolkit)
  CMakeLists.txt:296 (enable_language)

I ran into this with llama.cpp, and at least one other person ran into it on another project that uses cuBLAS. Their only solution was to uninstall everything from this repo and use the official NVIDIA one instead.
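
A possible way to narrow this down: as far as I can tell from the module, the "Couldn't find CUDA library root" error is raised when CMake cannot find an nvvm/libdevice directory under the toolkit root it derived from nvcc (or under a few fallback locations). That reading is an assumption, and the function name below is made up for illustration. The sketch only reports which candidate directory, if any, has that layout.

```shell
#!/bin/sh
# Print the first candidate directory that contains nvvm/libdevice.
# Assumption: CMake accepts such a directory as the "CUDA library root".
find_cuda_library_root() {
    for root in "$@"; do
        if [ -d "$root/nvvm/libdevice" ]; then
            printf '%s\n' "$root"
            return 0
        fi
    done
    # None of the candidates has the expected layout.
    return 1
}
```

For example, `find_cuda_library_root /usr /usr/lib/cuda /usr/local/cuda` prints nothing and returns non-zero when none of those directories has the layout, which would be consistent with the error above.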

Unable to build the CUDA 11.4 samples on Fedora 34

I have installed my NVIDIA display driver and CUDA, with samples per: https://negativo17.org/nvidia-driver/.

The highlights:

sudo dnf config-manager --add-repo=https://negativo17.org/repos/fedora-nvidia.repo
sudo dnf install nvidia-driver nvidia-driver-cuda cuda-devel cuda-samples nvidia-settings

# Dependencies of some samples.
sudo dnf install cuda-cudart-static freeglut-devel libX11-devel libXi-devel libXmu-devel make mesa-libGLU-devel glfw-devel

# CUDA host compiler, per the RPMfusion instructions
sudo dnf copr enable kwizart/cuda-gcc-10.1 -y
sudo dnf install cuda-gcc cuda-gcc-c++ -y

I previously had the RPMfusion NVIDIA driver with CUDA installed per NVIDIA's instructions, but packages wouldn't update without manual intervention. I had heard good things about Negativo17's packaging being immune to problems when updating, so I went ahead and trashed my previous NVIDIA and CUDA setup and installed the Negativo17 version. The thing to note about the old setup was that I was able to build and run most of the CUDA samples.

So the Negativo17 NVIDIA driver is working:

$ lsmod | grep nvidia
nvidia_drm             69632  2
nvidia_modeset       1200128  2 nvidia_drm
nvidia              35319808  87 nvidia_modeset
drm_kms_helper        290816  2 nvidia_drm,i915
drm                   630784  11 drm_kms_helper,nvidia,nvidia_drm,i915
$ env | grep -i cuda
CUDA_INCLUDE_DIRS=/usr/include/cuda
CUDA_INC_PATH=/usr/include/cuda
HOST_COMPILER=/usr/bin/cuda-g++

But I get various kinds of failures trying to build the CUDA samples:

$ mkdir ~/cuda-samples
$ cp -r /usr/share/cuda/samples/ cuda-samples/
$ cd ~/cuda-samples/samples
$ cd 0_Simple/vectorAdd
$ make
/usr/local/cuda/bin/nvcc --include-path /usr/include/cuda -ccbin /usr/bin/cuda-g++ -I../../common/inc -m64 --threads 0 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -gencode arch=compute_86,code=compute_86 -o vectorAdd.o -c vectorAdd.cu
make: /usr/local/cuda/bin/nvcc: No such file or directory

Many CUDA samples expect CUDA to be found under /usr/local/cuda. So in response, I can set:

export CUDA_PATH=/usr
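
Rather than hard-coding the prefix, it can be derived from wherever nvcc actually lives. This is a minimal sketch that assumes the usual `<prefix>/bin/nvcc` layout; the function name is hypothetical.

```shell
#!/bin/sh
# Derive a CUDA_PATH value from the location of an nvcc binary:
# <prefix>/bin/nvcc -> <prefix>
cuda_path_from_nvcc() {
    nvcc_path=$1
    dirname "$(dirname "$nvcc_path")"
}
```

With nvcc on the PATH, `export CUDA_PATH="$(cuda_path_from_nvcc "$(command -v nvcc)")"` gives the same `/usr` as above when nvcc is installed at /usr/bin/nvcc.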

And then make succeeds, but ./vectorAdd fails:

./vectorAdd 
[Vector addition of 50000 elements]
Failed to allocate device vector A (error code unknown error)!

So I'll try a different version of that:

cd ../vectorAddMMAP
make
>>> WARNING - libcuda.so not found, CUDA Driver is not installed.  Please re-install the driver. <<<
[@] /usr/bin/nvcc --include-path /usr/include/cuda -ccbin /usr/bin/cuda-g++ -I../../common/inc -m64 --threads 0 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -gencode arch=compute_86,code=compute_86 -o multidevicealloc_memmap.o -c multidevicealloc_memmap.cpp
[@] /usr/bin/nvcc --include-path /usr/include/cuda -ccbin /usr/bin/cuda-g++ -I../../common/inc -m64 --threads 0 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -gencode arch=compute_86,code=compute_86 -o vectorAddMMAP.o -c vectorAddMMAP.cpp
[@] /usr/bin/nvcc --include-path /usr/include/cuda -ccbin /usr/bin/cuda-g++ -m64 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -gencode arch=compute_86,code=compute_86 -o vectorAddMMAP multidevicealloc_memmap.o vectorAddMMAP.o
[@] mkdir -p ../../bin/x86_64/linux/release
[@] cp vectorAddMMAP ../../bin/x86_64/linux/release
[@] /usr/bin/nvcc --include-path /usr/include/cuda -ccbin /usr/bin/cuda-g++ -I../../common/inc -m64 --threads 0 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -gencode arch=compute_86,code=compute_86 -o vectorAdd_kernel64.fatbin -fatbin vectorAdd_kernel.cu
[@] mkdir -p data
[@] cp -f vectorAdd_kernel64.fatbin ./data
[@] mkdir -p ../../bin/x86_64/linux/release
[@] cp -f vectorAdd_kernel64.fatbin ../../bin/x86_64/linux/release

Fantastic! Except that none of those commands actually do anything. Because of:

>>> WARNING - libcuda.so not found, CUDA Driver is not installed.  Please re-install the driver. <<<

somehow the Makefile is doing a dry run (make -n) and not building anything.
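
The `[@]` prefix suggests the samples' Makefile replaced its command prefix with an echo after failing to locate libcuda.so. As far as I know, it searches a fixed set of directories derived from CUDA_PATH instead of asking the dynamic linker, so a library in /usr/lib64 can still be "not found". Which directories are searched depends on the samples version and is an assumption here. The sketch below, with a hypothetical function name, runs the same kind of search over directories you choose.

```shell
#!/bin/sh
# Search the given directories (following symlinks) for a file named
# libcuda.so and print every match, one per line.
find_libcuda() {
    # Refuse to run without arguments so find does not scan the current directory.
    [ "$#" -gt 0 ] || return 1
    find -L "$@" -name libcuda.so -print 2>/dev/null
}
```

Running `find_libcuda "$CUDA_PATH/lib64/stubs" /usr/lib64` shows whether the library is visible under each directory; an empty result for the directories the Makefile uses would explain the warning.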

And I absolutely do have libcuda.so:

ls -la /usr/lib64/libcuda.so*
lrwxrwxrwx. 1 root root       20 Aug 21 16:18 /usr/lib64/libcuda.so -> libcuda.so.470.63.01*
lrwxrwxrwx. 1 root root       20 Aug 21 16:18 /usr/lib64/libcuda.so.1 -> libcuda.so.470.63.01*
-rwxr-xr-x. 1 root root 24036712 Aug  4 06:21 /usr/lib64/libcuda.so.470.63.01*

Samples like 1_Utilities/deviceQuery:

./deviceQuery 
./deviceQuery Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

cudaGetDeviceCount returned 999
-> unknown error
Result = FAIL

fail with an error too.

Please help.

Issue building OpenCV on Fedora 30

I am building OpenCV 4.1.0-openvino on Fedora 30 using CUDA 10.0.
Using cmake, I am able to set all CUDA variables, and the configure output shows:

NVIDIA CUDA:                   YES (ver 10.0, CUFFT CUBLAS FAST_MATH)
    NVIDIA GPU arch:             60 61 70 75
    NVIDIA PTX archs:

make builds all the necessary object files.

When I run sudo make install, the output from cmake includes:

CMake Warning at cmake/OpenCVFindLibsPerf.cmake:35 (message):
  OpenCV is not able to find/configure CUDA SDK (required by WITH_CUDA).

  CUDA support will be disabled in OpenCV build.

  To eliminate this warning remove WITH_CUDA=ON CMake configuration option.

Call Stack (most recent call first):
  CMakeLists.txt:763 (include)

and

--     Protobuf:                    build (3.5.1)
-- 
--   NVIDIA CUDA:                   NO
-- 
--   OpenCL:                        YES (no extra features)

and starts building libraries without CUDA.

The setup worked when I used the proprietary CUDA libs a month ago. In that case the libraries are installed to the dedicated folder /usr/local/cuda-10.0

Am I doing something wrong?

Samples fail to compile

Two days ago I installed the NVIDIA driver and CUDA libraries on Fedora 24:

dnf -y install nvidia-driver nvidia-settings kernel-devel
dnf -y install cuda nvidia-driver-cuda cuda-devel cuda-samples

Navigating to the samples directory /usr/share/cuda/samples/1_Utilities/deviceQuery and attempting to make this to verify the installation gives:

deviceQuery.cpp:19:26: fatal error: cuda_runtime.h: No such file or directory
 #include <cuda_runtime.h>
                          ^
compilation terminated.
Makefile:250: recipe for target 'deviceQuery.o' failed
make: *** [deviceQuery.o] Error 1

Adding the cuda include folder to INCLUDES in Makefile:

INCLUDES  := -I../../common/inc,/usr/include/cuda

and running make, gives:

/usr/bin/crt/link.stub:2:26: fatal error: host_defines.h: No such file or directory
 #include "host_defines.h"
                          ^
compilation terminated.
Makefile:253: recipe for target 'deviceQuery' failed
make: *** [deviceQuery] Error 1

Adding the folder /usr/include/cuda to LD_LIBRARY_PATH doesn't fix the problem.

Can't find opencl libraries

I've installed the drivers and the CUDA packages from the repo, but I can't find the OpenCL libraries. My system doesn't have libOpenCL.so. On a different system with CUDA installed from some other repo (possibly NVIDIA's), I can see:
/usr/local/cuda-6.5/lib64/libOpenCL.so
/usr/local/cuda-6.5/lib64/libOpenCL.so.1

Edit: This is on RHEL7 btw
Edit2: I see this in your spec file:
# Libraries in the driver package
rm -f %{buildroot}%{_libdir}/libOpenCL.so*

cuda-docs-9.2.148.1-2.fc29.noarch.rpm integrity check failing

https://negativo17.org/repos/multimedia/fedora-29/x86_64/cuda-docs-9.2.148.1-2.fc29.noarch.rpm
SHA-256 ae09aa05a01d8cfa7650dcc605cf7df1f5b50bc88b5443e82d00c61af9b45c91

Transaction check error:
package cuda-docs-1:9.2.148.1-2.fc29.noarch does not verify: Payload SHA256 digest: BAD (Expected e628823c91c9981785d5450e52143b46f025374021f0802edda107154e58097d != d25f89145b27666e8b40534debc9e582d79dba30274edc914081143325386012)

I have retried the download several times.
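
To tell a corrupted download from a bad package in the repo, the whole-file checksum can be compared across downloads. Note that the "Payload SHA256 digest" rpm complains about is computed over the package payload, not the whole file, so it will not equal the file's SHA-256. A minimal sketch with a hypothetical function name:

```shell
#!/bin/sh
# Succeed only if the SHA-256 of FILE equals EXPECTED (lowercase hex).
verify_sha256() {
    file=$1
    expected=$2
    # sha256sum prints "<hash>  <file>"; keep only the hash.
    actual=$(sha256sum "$file" | cut -d ' ' -f 1)
    [ "$actual" = "$expected" ]
}
```

If `verify_sha256 cuda-docs-9.2.148.1-2.fc29.noarch.rpm <hash>` succeeds with the same hash on every download, the file arrives intact and the mismatch is more likely in the package itself.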

No library found under: /usr/lib64/stubs/libcuda.so

Hi @scaronni thanks so much for your awesome work.
I switched to using your packages after the manual way didn't work for me any more with kernel 5.1, and the guys on AskFedora highly recommended negativo17 :-)

Everything worked fine for me, but I get an error compiling TensorFlow (with the manual install, it was pretty easy to specify the correct paths because everything was beneath /usr/local/cuda).

During configure, it says

Found CUDA 10.1 in:
    /usr/lib64
    /usr/include/cuda
Found cuDNN 7 in:
    /usr/lib64
    /usr/include/cuda

but during the build it looks for a stub library and can't find it:

ERROR: An error occurred during the fetch of repository 'local_config_cuda':
   Traceback (most recent call last):
	File "/home/key/code/tensorflow/third_party/gpus/cuda_configure.bzl", line 1266
		_create_local_cuda_repository(repository_ctx)
	File "/home/key/code/tensorflow/third_party/gpus/cuda_configure.bzl", line 1033, in _create_local_cuda_repository
		_find_libs(repository_ctx, cuda_config)
	File "/home/key/code/tensorflow/third_party/gpus/cuda_configure.bzl", line 608, in _find_libs
		_find_cuda_lib("cuda", repository_ctx, cpu_value, (cu...), ...)
	File "/home/key/code/tensorflow/third_party/gpus/cuda_configure.bzl", line 589, in _find_cuda_lib
		find_lib(repository_ctx, [("%s/%s" % (based...))], ...)))
	File "/home/key/code/tensorflow/third_party/gpus/cuda_configure.bzl", line 566, in find_lib
		auto_configure_fail(("No library found under: " + ",...)))
	File "/home/key/code/tensorflow/third_party/gpus/cuda_configure.bzl", line 325, in auto_configure_fail
		fail(("\n%sCuda Configuration Error:%...)))

Cuda Configuration Error: No library found under: /usr/lib64/stubs/libcuda.so

Would you have an idea what could be the reason?

Many thanks!!

cuBLAS compilation failure: nv/target: No such file or directory

Hi, I'm experimenting with CUDA, and I get a compilation error when trying to compile any program that uses cuBLAS. I first noticed the problem with whisper.cpp, but the simpleCUBLAS example program from NVIDIA's official cuda-samples also fails with the same error.

                 from /usr/include/cublas_v2.h:69,
                 from simpleCUBLAS.cpp:40:
/usr/include/cuda_fp16.h:4086:10: fatal error: nv/target: No such file or directory
 4086 | #include <nv/target>
      |          ^~~~~~~~~~~

I installed the cuda-devel package version 12.3.101 and everything related from the fedora-nvidia repo, together with the nvidia-driver. I'm on an up-to-date Fedora 38 system.

| NVIDIA-SMI 545.29.06              Driver Version: 545.29.06    CUDA Version: 12.3     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  Quadro M2000M                  Off | 00000000:01:00.0  On |                  N/A |
| N/A   43C    P8              N/A / 200W |     74MiB /  4096MiB |     19%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

I'm not sure whether this is a problem with your packages or with NVIDIA's upstream sources.

GPG signature issue

OS: CentOS 7.9

Not sure if this is the correct place to log this, but we recently tried to update our existing nvidia-driver-cuda-libs installation and started receiving signature failure messages:

Total size: 60 M
Installed size: 229 M
Is this ok [y/d/N]: y
Downloading packages:
error: skipping package /var/cache/yum/x86_64/7/epel-nvidia/packages/nvidia-driver-cuda-libs-525.89.02-1.el7.x86_64.rpm with unverifiable signature

If we choose to skip gpg-check, yum refuses to install, with no error summary provided.

Transaction check error:
Unknown error during transaction test in RPM

Upon checking https://negativo17.org/repos/RPM-GPG-KEY-slaanesh, it looks like a recent update resulted in a much smaller key. Perhaps the key is corrupt? To confirm, below is the repo we are configured to use:

[epel-nvidia]
name=negativo17 - Nvidia
baseurl=https://negativo17.org/repos/nvidia/epel-$releasever/$basearch/
enabled=1
skip_if_unavailable=1
gpgcheck=1
gpgkey=https://negativo17.org/repos/RPM-GPG-KEY-slaanesh
enabled_metadata=1
metadata_expire=6h
type=rpm-md
repo_gpgcheck=0
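
A quick sanity check for a truncated key file is to confirm that it still has both ASCII-armor markers. This is only a sketch with a hypothetical function name: it does not validate the key material, and it assumes the key is published in ASCII-armored form.

```shell
#!/bin/sh
# Succeed if FILE contains both the opening and the closing ASCII-armor
# marker of a PGP public key block.
has_complete_pgp_armor() {
    file=$1
    grep -qF -e '-----BEGIN PGP PUBLIC KEY BLOCK-----' "$file" &&
        grep -qF -e '-----END PGP PUBLIC KEY BLOCK-----' "$file"
}
```

After saving the key locally, `has_complete_pgp_armor RPM-GPG-KEY-slaanesh` failing would point to a cut-off file; if it succeeds, the key may simply have been replaced with a different one.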

Linking errors with CUDA 9.1 when building samples for FEDORA 27

CUDA 9.1 samples do not compile out of the box in my FEDORA 27 build. (I finally got a new workstation, so CUDA 9.X incompatibilities with Fermi cards are no longer an issue.)

make[1]: Entering directory '/home/mjg/src/NVIDIA-CUDA-9.1_Samples/samples/0_Simple/cdpSimpleQuicksort'
/usr/bin/nvcc --include-path /usr/include/cuda -ccbin g++ -I../../common/inc  -m64    -dc -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_70,code=compute_70 -o cdpSimpleQuicksort.o -c cdpSimpleQuicksort.cu
/usr/bin/nvcc --include-path /usr/include/cuda -ccbin g++   -m64      -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_70,code=compute_70 -o cdpSimpleQuicksort cdpSimpleQuicksort.o  -lcudadevrt
nvlink error   : Undefined reference to 'cudaStreamCreateWithFlags' in 'cdpSimpleQuicksort.o' (target: sm_35)
nvlink error   : Undefined reference to 'cudaGetParameterBufferV2' in 'cdpSimpleQuicksort.o' (target: sm_35)
nvlink error   : Undefined reference to 'cudaLaunchDeviceV2' in 'cdpSimpleQuicksort.o' (target: sm_35)
nvlink error   : Undefined reference to 'cudaStreamDestroy' in 'cdpSimpleQuicksort.o' (target: sm_35)
make[1]: *** [Makefile:280: cdpSimpleQuicksort] Error 255
make[1]: Leaving directory '/home/mjg/src/NVIDIA-CUDA-9.1_Samples/samples/0_Simple/cdpSimpleQuicksort'
make: *** [Makefile:52: 0_Simple/cdpSimpleQuicksort/Makefile.ph_build] Error 2

Linking fails if the samples are linked statically to the following static libraries: cudadevrt, cublas_device, cudart_static, cufft_static, cufftw_static, culibos, cusparse_static, curand_static, cusolver_static....

CUDA 8 builds for Fedora 27

Hi, I just upgraded my primary laptop to Fedora 27, but unfortunately my NVIDIA card is Fermi, which was deprecated with CUDA 9.

I tried rebuilding your src rpms for cuda8 in Fedora 27 but I get the following error:

/usr/bin/strip: '/home/mjg/rpmbuild/BUILDROOT/cuda-8.0.61-6.fc27.x86_64/usr/share/cuda/samples/7_CUDALibraries/common/FreeImage/lib/darwin/libfreeimage.a:': No such file
/usr/bin/strip: 'Mach-O': No such file
/usr/bin/strip: 'universal': No such file
/usr/bin/strip: 'binary': No such file
/usr/bin/strip: 'with': No such file
/usr/bin/strip: '2': No such file
/usr/bin/strip: 'architectures:': No such file
/usr/bin/strip: '[i386]': No such file
/usr/bin/strip: '[x86_64]': No such file
error: Bad exit status from /var/tmp/rpm-tmp.b6UUS6 (%install)

Would it be possible to have CUDA 8 rpms in Fedora 27 and newer repositories in the future so at least I can rollback to those versions if necessary?

wrong path for includedir in pkgconfig

It seems the pkgconfig includedir path was reverted from /usr/include to /usr/include/cuda in b7f9f77, which causes the CUDA headers not to be found. This happened to me with cuda package version 8.0.61-5 on Fedora 26.
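
To check what a given .pc file actually declares, the includedir line can be read directly and compared with where the headers are installed. A minimal sketch; the function name and the .pc file path in the usage note are hypothetical, since the issue does not name the file.

```shell
#!/bin/sh
# Print the raw value of the "includedir=" variable from a pkg-config .pc file.
# The value is printed as written, so it may still contain ${prefix}.
pc_includedir() {
    sed -n 's/^includedir=//p' "$1"
}
```

For example, `pc_includedir /usr/lib64/pkgconfig/<name>.pc` prints the declared directory; `pkg-config --variable=includedir <name>` should give the expanded value for an installed module.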

Cannot find nvcc when building NVIDIA CUDA 9.1 samples on Fedora 27

Hi, I am trying to install CUDA 9.1 on Fedora 27. I wasn't able to build the NVIDIA CUDA samples. I suspect that the Makefile wasn't able to find CUDA and its libraries.

I enabled the repository according to your website using:

dnf config-manager --add-repo=https://negativo17.org/repos/fedora-nvidia.repo

and did not forget to update after enabling the repo.

Here is the list of currently installed repos; not sure if any of the repos are conflicting:

[~/NVIDIA_CUDA-9.1_Samples]$ dnf repolist
Last metadata expiration check: 1:43:45 ago on Tue 27 Mar 2018 02:05:12 AM +07.
repo id                              repo name                                               status
WineHQ                               WineHQ packages (Fedora 26)                                288
code                                 Visual Studio Code                                          29
*fedora                              Fedora 27 - x86_64                                      54,801
fedora-nvidia                        negativo17 - Nvidia                                        201
heliocastro-hack-fonts               Copr repo for hack-fonts owned by heliocastro                2
mysql-connectors-community           MySQL Connectors Community                                  15
mysql-tools-community                MySQL Tools Community                                       17
mysql57-community                    MySQL 5.7 Community Server                                  19
*rpmfusion-free                      RPM Fusion for Fedora 27 - Free                            574
*rpmfusion-free-updates              RPM Fusion for Fedora 27 - Free - Updates                  199
*rpmfusion-nonfree                   RPM Fusion for Fedora 27 - Nonfree                         205
*rpmfusion-nonfree-updates           RPM Fusion for Fedora 27 - Nonfree - Updates                51
sublime-text                         Sublime Text - x86_64 - Stable                               1
*updates                             Fedora 27 - x86_64 - Updates                            12,099
vbatts-bazel                         Copr repo for bazel owned by vbatts                          2

I installed cuda using:

sudo dnf install cuda cuda-devel cuda-cudnn cuda-cudnn-devel

The installation seems to be successful as no error messages appeared. nvcc is also installed:

[~/NVIDIA_CUDA-9.1_Samples]$ nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2017 NVIDIA Corporation
Built on Fri_Nov__3_21:07:56_CDT_2017
Cuda compilation tools, release 9.1, V9.1.85

However, when I build the CUDA samples, nvcc cannot be found for some reason:

[~/NVIDIA_CUDA-9.1_Samples]$ make
make[1]: Entering directory '~/NVIDIA_CUDA-9.1_Samples/0_Simple/cudaOpenMP'
/usr/local/cuda-9.1/bin/nvcc -ccbin g++ -I../../common/inc  -m64    -Xcompiler -fopenmp -gencode arch=compute_30,code=sm_30 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_70,code=compute_70 -o cudaOpenMP.o -c cudaOpenMP.cu
make[1]: /usr/local/cuda-9.1/bin/nvcc: Command not found
make[1]: *** [Makefile:303: cudaOpenMP.o] Error 127
make[1]: Leaving directory '~/NVIDIA_CUDA-9.1_Samples/0_Simple/cudaOpenMP'
make: *** [Makefile:52: 0_Simple/cudaOpenMP/Makefile.ph_build] Error 2

Checking the directory in question reveals that nvcc is indeed not in the specified directory:

[~/NVIDIA_CUDA-9.1_Samples]$ which nvcc
/usr/bin/nvcc
[~/NVIDIA_CUDA-9.1_Samples]$ tree /usr/local/cuda-9.1
/usr/local/cuda-9.1
├── bin
│   └── gcc
│       └── gcc -> /usr/bin/cuda-gcc
├── src
└── targets
    └── x86_64-linux
        ├── include
        │   └── cudnn.h
        └── lib
            ├── libcudnn.so
            ├── libcudnn.so.7
            ├── libcudnn.so.7.0.5
            └── libcudnn_static.a

7 directories, 6 files

It also shows that the source and library files are not present (as they were before, when I was using the official install, which had some problems with Fedora 27). Soft-linking to /usr/bin/nvcc didn't solve this problem because the source files cannot be found.

Here is my linux information:

[~/NVIDIA_CUDA-9.1_Samples]$ cat /proc/version
Linux version 4.15.10-300.fc27.x86_64 ([email protected]) (gcc version 7.3.1 20180303 (Red Hat 7.3.1-5) (GCC)) #1 SMP Thu Mar 15 17:13:04 UTC 2018

I am glad to provide more information if any is required. Thanks!

[Request] Add TensorRT

I don't know if this issue belongs here, but I'd really like to see TensorRT included in your repository. It can greatly improve the performance of TensorFlow on GPU.

Could not load dynamic library

Hey there,

I'm trying to set up a fresh Fedora 30 install with CUDA, and I am running into a couple of issues.

Installing tensorflow-gpu using pip causes no issues:

(env) [keo7@pc3ke07 tmp]$ pip install tensorflow-gpu keras
Collecting tensorflow-gpu
  Downloading https://files.pythonhosted.org/packages/a1/eb/bc0784af18f612838f90419cf4805c37c20ddb957f5ffe0c42144562dcfa/tensorflow_gpu-2.0.0-cp37-cp37m-manylinux2010_x86_64.whl (380.8MB)
     |████████████████████████████████| 380.8MB 100.7MB/s 
Collecting keras
  Downloading https://files.pythonhosted.org/packages/ad/fd/6bfe87920d7f4fd475acd28500a42482b6b84479832bdc0fe9e589a60ceb/Keras-2.3.1-py2.py3-none-any.whl (377kB)
     |████████████████████████████████| 378kB 41.8MB/s 
Collecting absl-py>=0.7.0 (from tensorflow-gpu)
  Downloading https://files.pythonhosted.org/packages/3c/0d/7cbf64cac3f93617a2b6b079c0182e4a83a3e7a8964d3b0cc3d9758ba002/absl-py-0.8.0.tar.gz (102kB)
     |████████████████████████████████| 112kB 44.5MB/s 
Collecting astor>=0.6.0 (from tensorflow-gpu)
  Downloading https://files.pythonhosted.org/packages/d1/4f/950dfae467b384fc96bc6469de25d832534f6b4441033c39f914efd13418/astor-0.8.0-py2.py3-none-any.whl
Collecting keras-applications>=1.0.8 (from tensorflow-gpu)
  Downloading https://files.pythonhosted.org/packages/71/e3/19762fdfc62877ae9102edf6342d71b28fbfd9dea3d2f96a882ce099b03f/Keras_Applications-1.0.8-py3-none-any.whl (50kB)
     |████████████████████████████████| 51kB 29.8MB/s 
Collecting termcolor>=1.1.0 (from tensorflow-gpu)
  Downloading https://files.pythonhosted.org/packages/8a/48/a76be51647d0eb9f10e2a4511bf3ffb8cc1e6b14e9e4fab46173aa79f981/termcolor-1.1.0.tar.gz
Collecting wrapt>=1.11.1 (from tensorflow-gpu)
  Downloading https://files.pythonhosted.org/packages/23/84/323c2415280bc4fc880ac5050dddfb3c8062c2552b34c2e512eb4aa68f79/wrapt-1.11.2.tar.gz
Collecting keras-preprocessing>=1.0.5 (from tensorflow-gpu)
  Downloading https://files.pythonhosted.org/packages/28/6a/8c1f62c37212d9fc441a7e26736df51ce6f0e38455816445471f10da4f0a/Keras_Preprocessing-1.1.0-py2.py3-none-any.whl (41kB)
     |████████████████████████████████| 51kB 24.6MB/s 
Collecting google-pasta>=0.1.6 (from tensorflow-gpu)
  Downloading https://files.pythonhosted.org/packages/d0/33/376510eb8d6246f3c30545f416b2263eee461e40940c2a4413c711bdf62d/google_pasta-0.1.7-py3-none-any.whl (52kB)
     |████████████████████████████████| 61kB 21.1MB/s 
Collecting opt-einsum>=2.3.2 (from tensorflow-gpu)
  Downloading https://files.pythonhosted.org/packages/b8/83/755bd5324777875e9dff19c2e59daec837d0378c09196634524a3d7269ac/opt_einsum-3.1.0.tar.gz (69kB)
     |████████████████████████████████| 71kB 23.4MB/s 
Collecting numpy<2.0,>=1.16.0 (from tensorflow-gpu)
  Downloading https://files.pythonhosted.org/packages/ba/e0/46e2f0540370f2661b044647fa447fef2ecbcc8f7cdb4329ca2feb03fb23/numpy-1.17.2-cp37-cp37m-manylinux1_x86_64.whl (20.3MB)
     |████████████████████████████████| 20.3MB 44.4MB/s 
Requirement already satisfied: wheel>=0.26 in ./env/lib/python3.7/site-packages (from tensorflow-gpu) (0.33.6)
Collecting tensorboard<2.1.0,>=2.0.0 (from tensorflow-gpu)
  Downloading https://files.pythonhosted.org/packages/9b/a6/e8ffa4e2ddb216449d34cfcb825ebb38206bee5c4553d69e7bc8bc2c5d64/tensorboard-2.0.0-py3-none-any.whl (3.8MB)
     |████████████████████████████████| 3.8MB 40.6MB/s 
Collecting gast==0.2.2 (from tensorflow-gpu)
  Downloading https://files.pythonhosted.org/packages/4e/35/11749bf99b2d4e3cceb4d55ca22590b0d7c2c62b9de38ac4a4a7f4687421/gast-0.2.2.tar.gz
Collecting protobuf>=3.6.1 (from tensorflow-gpu)
  Downloading https://files.pythonhosted.org/packages/d7/34/02a2083afc14adff644a1e29783f276f12f1f914ca4cab157d73bb3d2fed/protobuf-3.10.0-cp37-cp37m-manylinux1_x86_64.whl (1.3MB)
     |████████████████████████████████| 1.3MB 46.3MB/s 
Collecting grpcio>=1.8.6 (from tensorflow-gpu)
  Downloading https://files.pythonhosted.org/packages/68/2d/2fe51d8382994cc0d4f9734367e8c159808ef2c367c6672722a509c9d5b2/grpcio-1.24.1-cp37-cp37m-manylinux1_x86_64.whl (2.3MB)
     |████████████████████████████████| 2.3MB 38.3MB/s 
Collecting tensorflow-estimator<2.1.0,>=2.0.0 (from tensorflow-gpu)
  Downloading https://files.pythonhosted.org/packages/95/00/5e6cdf86190a70d7382d320b2b04e4ff0f8191a37d90a422a2f8ff0705bb/tensorflow_estimator-2.0.0-py2.py3-none-any.whl (449kB)
     |████████████████████████████████| 450kB 37.4MB/s 
Collecting six>=1.10.0 (from tensorflow-gpu)
  Downloading https://files.pythonhosted.org/packages/73/fb/00a976f728d0d1fecfe898238ce23f502a721c0ac0ecfedb80e0d88c64e9/six-1.12.0-py2.py3-none-any.whl
Collecting h5py (from keras)
  Downloading https://files.pythonhosted.org/packages/3f/c0/abde58b837e066bca19a3f7332d9d0493521d7dd6b48248451a9e3fe2214/h5py-2.10.0-cp37-cp37m-manylinux1_x86_64.whl (2.9MB)
     |████████████████████████████████| 2.9MB 47.7MB/s 
Collecting pyyaml (from keras)
  Downloading https://files.pythonhosted.org/packages/e3/e8/b3212641ee2718d556df0f23f78de8303f068fe29cdaa7a91018849582fe/PyYAML-5.1.2.tar.gz (265kB)
     |████████████████████████████████| 266kB 39.3MB/s 
Collecting scipy>=0.14 (from keras)
  Downloading https://files.pythonhosted.org/packages/94/7f/b535ec711cbcc3246abea4385d17e1b325d4c3404dd86f15fc4f3dba1dbb/scipy-1.3.1-cp37-cp37m-manylinux1_x86_64.whl (25.2MB)
     |████████████████████████████████| 25.2MB 40.5MB/s 
Collecting markdown>=2.6.8 (from tensorboard<2.1.0,>=2.0.0->tensorflow-gpu)
  Downloading https://files.pythonhosted.org/packages/c0/4e/fd492e91abdc2d2fcb70ef453064d980688762079397f779758e055f6575/Markdown-3.1.1-py2.py3-none-any.whl (87kB)
     |████████████████████████████████| 92kB 21.2MB/s 
Requirement already satisfied: setuptools>=41.0.0 in ./env/lib/python3.7/site-packages (from tensorboard<2.1.0,>=2.0.0->tensorflow-gpu) (41.4.0)
Collecting werkzeug>=0.11.15 (from tensorboard<2.1.0,>=2.0.0->tensorflow-gpu)
  Downloading https://files.pythonhosted.org/packages/ce/42/3aeda98f96e85fd26180534d36570e4d18108d62ae36f87694b476b83d6f/Werkzeug-0.16.0-py2.py3-none-any.whl (327kB)
     |████████████████████████████████| 327kB 48.3MB/s 
Building wheels for collected packages: absl-py, termcolor, wrapt, opt-einsum, gast, pyyaml
  Building wheel for absl-py (setup.py) ... done
  Created wheel for absl-py: filename=absl_py-0.8.0-cp37-none-any.whl size=120987 sha256=66b415184fa6b8f869899db0a6e69f30999fd584e65558500217eb889581b72a
  Stored in directory: /home/keo7/.cache/pip/wheels/9a/1e/7a/456008eb5e47fd5de792c6139df6d5b3d5f71d51c6a0b94799
  Building wheel for termcolor (setup.py) ... done
  Created wheel for termcolor: filename=termcolor-1.1.0-cp37-none-any.whl size=4832 sha256=7eaed30986b18a6fcdd85dcc0199ee3b6df4d96107161f4d3d28e5d422031419
  Stored in directory: /home/keo7/.cache/pip/wheels/7c/06/54/bc84598ba1daf8f970247f550b175aaaee85f68b4b0c5ab2c6
  Building wheel for wrapt (setup.py) ... done
  Created wheel for wrapt: filename=wrapt-1.11.2-cp37-cp37m-linux_x86_64.whl size=74012 sha256=be4c3c805dddf93e19c7c5314596eeb7fa0ed6cc511a50d426db3d8a5a5efd85
  Stored in directory: /home/keo7/.cache/pip/wheels/d7/de/2e/efa132238792efb6459a96e85916ef8597fcb3d2ae51590dfd
  Building wheel for opt-einsum (setup.py) ... done
  Created wheel for opt-einsum: filename=opt_einsum-3.1.0-cp37-none-any.whl size=61682 sha256=4b13d497a93159ade80425d5642277acceaeb15c9da25a4a713010fdc12bbf3d
  Stored in directory: /home/keo7/.cache/pip/wheels/2c/b1/94/43d03e130b929aae7ba3f8d15cbd7bc0d1cb5bb38a5c721833
  Building wheel for gast (setup.py) ... done
  Created wheel for gast: filename=gast-0.2.2-cp37-none-any.whl size=7540 sha256=48513ee632ffcb4b2ced32e369f5e8cdbe2240f58bb88d9a8816a37c659d06a0
  Stored in directory: /home/keo7/.cache/pip/wheels/5c/2e/7e/a1d4d4fcebe6c381f378ce7743a3ced3699feb89bcfbdadadd
  Building wheel for pyyaml (setup.py) ... done
  Created wheel for pyyaml: filename=PyYAML-5.1.2-cp37-cp37m-linux_x86_64.whl size=44103 sha256=323bc3d05d24e121b0713b2d561c741cad6d241d976af0342c634d244169f230
  Stored in directory: /home/keo7/.cache/pip/wheels/d9/45/dd/65f0b38450c47cf7e5312883deb97d065e030c5cca0a365030
Successfully built absl-py termcolor wrapt opt-einsum gast pyyaml
Installing collected packages: six, absl-py, astor, numpy, h5py, keras-applications, termcolor, wrapt, keras-preprocessing, google-pasta, opt-einsum, protobuf, markdown, grpcio, werkzeug, tensorboard, gast, tensorflow-estimator, tensorflow-gpu, pyyaml, scipy, keras
Successfully installed absl-py-0.8.0 astor-0.8.0 gast-0.2.2 google-pasta-0.1.7 grpcio-1.24.1 h5py-2.10.0 keras-2.3.1 keras-applications-1.0.8 keras-preprocessing-1.1.0 markdown-3.1.1 numpy-1.17.2 opt-einsum-3.1.0 protobuf-3.10.0 pyyaml-5.1.2 scipy-1.3.1 six-1.12.0 tensorboard-2.0.0 tensorflow-estimator-2.0.0 tensorflow-gpu-2.0.0 termcolor-1.1.0 werkzeug-0.16.0 wrapt-1.11.2

But when I attempt to access my GPU via TensorFlow, I get the following errors:

>>> from keras import backend as K
Using TensorFlow backend.
>>> K.tensorflow_backend._get_available_gpus()
2019-10-07 22:19:09.611711: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1006] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-10-07 22:19:09.613041: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties: 
name: TITAN V major: 7 minor: 0 memoryClockRate(GHz): 1.455
pciBusID: 0000:26:00.0
2019-10-07 22:19:09.613223: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcudart.so.10.0'; dlerror: libcudart.so.10.0: cannot open shared object file: No such file or directory
2019-10-07 22:19:09.613339: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcublas.so.10.0'; dlerror: libcublas.so.10.0: cannot open shared object file: No such file or directory
2019-10-07 22:19:09.613449: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcufft.so.10.0'; dlerror: libcufft.so.10.0: cannot open shared object file: No such file or directory
2019-10-07 22:19:09.613563: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcurand.so.10.0'; dlerror: libcurand.so.10.0: cannot open shared object file: No such file or directory
2019-10-07 22:19:09.613671: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcusolver.so.10.0'; dlerror: libcusolver.so.10.0: cannot open shared object file: No such file or directory
2019-10-07 22:19:09.613775: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcusparse.so.10.0'; dlerror: libcusparse.so.10.0: cannot open shared object file: No such file or directory
2019-10-07 22:19:09.613879: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcudnn.so.7'; dlerror: libcudnn.so.7: cannot open shared object file: No such file or directory
2019-10-07 22:19:09.613917: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1641] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...
2019-10-07 22:19:09.614465: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1159] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-10-07 22:19:09.614500: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1165]      

Best wishes,

Keiron
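
The warnings above show TensorFlow 2.0.0 trying to load CUDA 10.0 sonames (plus libcudnn.so.7) and failing. A minimal sketch for checking which of those names the dynamic loader can actually resolve, using only the Python standard library. The soname list is copied from the log; `check_libs` is a hypothetical helper name, not part of TensorFlow or this repo.

```python
import ctypes

# Sonames copied from the TensorFlow warnings in this issue.
REQUIRED_LIBS = [
    "libcudart.so.10.0",
    "libcublas.so.10.0",
    "libcufft.so.10.0",
    "libcurand.so.10.0",
    "libcusolver.so.10.0",
    "libcusparse.so.10.0",
    "libcudnn.so.7",
]


def check_libs(names):
    """Return {soname: bool}; True when the loader can open the library."""
    results = {}
    for name in names:
        try:
            ctypes.CDLL(name)  # same lookup path as a plain dlopen(name)
            results[name] = True
        except OSError:
            results[name] = False
    return results


if __name__ == "__main__":
    for name, ok in check_libs(REQUIRED_LIBS).items():
        print(f"{name}: {'found' if ok else 'MISSING'}")
```

If every entry is reported missing while a different CUDA version is installed, the likely cause is a version mismatch between the installed packages and what this TensorFlow build was linked against, rather than a broken install.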

cuda packages do not provide Davinci Resolve Studio H264/H265 encode options (works with official NV repo)

In the paid version of DaVinci Resolve (Studio, a.k.a. DRS), if you use the drivers and CUDA packages from the official NVIDIA CUDA repository for Fedora:

http://developer.download.nvidia.com/compute/cuda/repos/fedora35/

DRS provides the advertised H264/H265 hardware encode options. However, when using the NVIDIA driver plus cuda (and cuda-devel) packages from the Negativo17 repo instead, those options do not show up.

It should also be noted that the options do not show up with the RPM Fusion packages either, and RPM Fusion recommends using the official NVIDIA repo for full CUDA support.

I figured this is a bug in this repo since, per the documentation, this repo should also provide fully functional CUDA packages:
https://negativo17.org/nvidia-driver/#CUDA_support

Complete packaged CUDA stack has been added for all supported distributions, all the packages provide/require/obsolete the relevant packages in the [Nvidia CUDA repository](http://developer.download.nvidia.com/compute/cuda/repos/); so you can enable this repository along with the official Nvidia CUDA one (x86_64 systems only).

cuda-samples cannot be installed due to a non-existent package

It looks like the cuda-samples package on Fedora 27 depends on the non-existent compat-gcc-64-c++ package.

$ sudo dnf install cuda-samples
Error:
Problem: conflicting requests

  • nothing provides compat-gcc-53-c++ needed by cuda-samples-1:9.0.176-1.fc27.x86_64
  • nothing provides compat-gcc-64-c++ needed by cuda-samples-1:9.0.176-2.fc27.x86_64
  • nothing provides compat-gcc-64-c++ needed by cuda-samples-1:9.1.85-1.fc27.x86_64

I assume the cuda.spec file needs to be updated to depend on the cuda-gcc-c++ package instead?

diff --git a/cuda.spec b/cuda.spec
index f91487f..9473e20 100644
--- a/cuda.spec
+++ b/cuda.spec
@@ -388,7 +388,7 @@ Obsoletes:      %{name}-samples < %{?epoch:%{epoch}:}%{version}
 Provides:       %{name}-samples = %{?epoch:%{epoch}:}%{version}
 Requires:       cuda-devel = %{?epoch:%{epoch}:}%{version}
 %if 0%{?fedora}
-Requires:       compat-gcc-64-c++
+Requires:       cuda-gcc-c++
 %else
 Requires:       gcc
 %endif

CUDA 10.1 Version Update

CUDA 10.1 has been released. Is there any timetable for when we can expect the version to be bumped to 10.1?

  • CUDA 10.1 appears to have official GCC 8 support, so there should be no more need to bundle GCC 7.

cmake find_package(CUDA) fails to set CUDA_INCLUDE_DIRS

Hi, I tried to build an app that uses CMake on Fedora 27, currently with CMake 3.10.1.
It failed with:

-- Could NOT find CUDA (missing: CUDA_INCLUDE_DIRS) (found suitable version "9.1", minimum required is "7.5")

I have already installed cuda-devel and cuda-extra-libs with dnf. Long story short, I had to hardcode CUDA_INCLUDE_DIRS to /usr/include/cuda/ inside /usr/share/cmake/Modules/FindCUDA.cmake, and then the app built properly.

I have no idea how this should be handled, though. I think this issue is related to #7, but here we are talking about existing apps using CMake that we have no control over. I couldn't find any way to work around the issue by setting environment variables or similar. The only way that worked was editing /usr/share/cmake/Modules/FindCUDA.cmake.
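
A possible alternative to editing FindCUDA.cmake is to pass the include directory on the cmake command line. The sketch below only locates the header directory and prints a candidate argument. It assumes that the FindCUDA module derives CUDA_INCLUDE_DIRS from the CUDA_TOOLKIT_INCLUDE cache variable, which should be verified against the CMake version in use. Only /usr/include/cuda comes from this issue; the other candidate paths and the helper names `find_cuda_include` and `cmake_hint` are hypothetical.

```python
import os

# /usr/include/cuda is the path reported in this issue; the remaining
# entries are common defaults and are assumptions.
CANDIDATES = ["/usr/include/cuda", "/usr/local/cuda/include", "/usr/include"]


def find_cuda_include(candidates=CANDIDATES, header="cuda_runtime.h"):
    """Return the first directory that contains the header, or None."""
    for directory in candidates:
        if os.path.isfile(os.path.join(directory, header)):
            return directory
    return None


def cmake_hint(include_dir):
    """Build a cache-variable argument for the cmake command line.

    Assumption: FindCUDA reads CUDA_TOOLKIT_INCLUDE from the cache.
    """
    return f"-DCUDA_TOOLKIT_INCLUDE={include_dir}"


if __name__ == "__main__":
    found = find_cuda_include()
    if found is None:
        print("cuda_runtime.h not found in any candidate directory")
    else:
        print("cmake", cmake_hint(found), "<path-to-source>")
```

Setting the variable per build keeps the system-wide module untouched, so a CMake package update does not silently undo the fix.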

failed to download metadata

Hi, thanks for the all-in-one CUDA repo! However, I recently encountered the following problem when updating the packages on Fedora 33.

$ sudo dnf update
negativo17 - Nvidia                                                                           15 kB/s |  37 kB     00:02    
Errors during downloading metadata for repository 'fedora-nvidia':
  - Status code: 404 for https://negativo17.org/repos/nvidia/fedora-33/x86_64/repodata/repomd.xml (IP: 217.79.184.49)
Error: Failed to download metadata for repo 'fedora-nvidia': Cannot download repomd.xml: Cannot download repodata/repomd.xml: All mirrors were tried
Ignoring repositories: fedora-nvidia
Last metadata expiration check: 0:03:31 ago on Tue 18 Jan 2022 09:11:50 AM EST.
Dependencies resolved.
Nothing to do.
Complete!

I am not sure why this is happening; it used to work before. Thank you very much!

/sbin/ldconfig: /lib64/libcudnn.so.5 is not a symbolic link

After installing the cuda packages (especially cuda-cudnn-*), I get the following error message during dnf updates (I guess every time ldconfig is called somewhere).

/sbin/ldconfig: /lib64/libcudnn.so.5 is not a symbolic link

Is any more information needed? :-)
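
ldconfig prints this message when a soname entry is a regular file rather than a symlink to the fully versioned library. A minimal sketch for inspecting the path from the message; `describe_link` is a hypothetical helper name.

```python
import os


def describe_link(path):
    """Report whether path is a symlink, a regular file, or missing."""
    if os.path.islink(path):
        return f"symlink -> {os.readlink(path)}"
    if os.path.isfile(path):
        return "regular file"
    return "missing"


if __name__ == "__main__":
    path = "/lib64/libcudnn.so.5"  # path taken from the ldconfig message
    print(f"{path}: {describe_link(path)}")
```

A result of "regular file" would match the warning and suggests the package ships the library under its soname directly instead of as a link.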

CUDA 12.5 Update

As there is a new update available, is there any chance of repackaging and releasing the toolkit?

How could I help?

Bad soname for cudart

Since the update to 10.1.168-1.fc30, the cuda-cudart includes libcublas.so.10 -> libcublas.so.10.2.0.168 (note the 2 at the end), while other libraries have sonames of the form libcudart.so.10.1 -> libcudart.so.10.1.168 or libcufft.so.10 -> libcufft.so.10.1.168.
This change breaks tensorflow, which expects all libraries to have sonames with the same number.

Was it intentional? Is it possible to revert it and ensure that all libraries have consistent sonames, or should TensorFlow be fixed?
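
To make the mismatch concrete, here is a small sketch that groups the link targets quoted above by their major.minor version. The file names are copied from this issue; `major_minor` and `group_by_version` are hypothetical helper names.

```python
import re

# Link targets quoted in this issue.
TARGETS = [
    "libcublas.so.10.2.0.168",
    "libcudart.so.10.1.168",
    "libcufft.so.10.1.168",
]


def major_minor(filename):
    """Extract 'major.minor' from a name like libfoo.so.10.1.168."""
    match = re.search(r"\.so\.(\d+)\.(\d+)", filename)
    if match is None:
        raise ValueError(f"no version suffix in {filename!r}")
    return f"{match.group(1)}.{match.group(2)}"


def group_by_version(filenames):
    """Map each 'major.minor' to the sorted list of libraries carrying it."""
    groups = {}
    for name in filenames:
        groups.setdefault(major_minor(name), []).append(name)
    return {version: sorted(names) for version, names in groups.items()}


if __name__ == "__main__":
    for version, names in sorted(group_by_version(TARGETS).items()):
        print(version, ", ".join(names))
```

More than one key in the result means the installed libraries do not share a single version number, which is the condition described in this report.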

install CUDA related libs to /usr/local/cuda

Hi,

I'm sorry to bother you with a question that is surely very silly...
After some experimentation, workarounds, and (finally) failures to make the current TensorFlow build work, I think they've made changes that necessitate a CUDA install in /usr/local/cuda. See:

tensorflow/tensorflow#29797
https://groups.google.com/a/tensorflow.org/forum/#!topic/build/AB_nEXhUF0E

I've tried removing the CUDA-related RPMs and installing CUDA via the runfile, after which the TensorFlow build itself worked fine. But then loading it doesn't work (even after symlinking some stuff), and I get an "unknown error" from cuInit, which maybe (just a guess) could be due to cuInit calling nvidia-smi (which no longer exists, as it was in the RPM).

Would you by chance have an idea what one could do? Would it, in principle, be possible to build an alternative rpm containing the CUDA related libs and install that to /usr/local/cuda?

Sorry if that's a dumb question...
