
sony / nnabla-ext-cuda


A CUDA Extension of Neural Network Libraries

Home Page: https://nnabla.org/

License: Apache License 2.0

CMake 1.64% Python 4.66% C++ 27.42% Cuda 63.86% Batchfile 0.47% Shell 0.29% Makefile 0.85% Jupyter Notebook 0.08% Dockerfile 0.08% Cython 0.65%

nnabla-ext-cuda's Introduction

A CUDA Extension of Neural Network Libraries

This repository provides an official CUDA/cuDNN-accelerated extension of the Neural Network Libraries deep learning framework.

In order to use it, the default context needs to be changed from 'cpu' to 'cudnn':

import nnabla as nn
from nnabla.ext_utils import get_extension_context

ctx = get_extension_context('cudnn', device_id='0')
nn.set_default_context(ctx)
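
For illustration, here is a minimal end-to-end sketch that builds and runs a small graph on the GPU; the little network below is an arbitrary example, not part of the original README:

import numpy as np
import nnabla as nn
import nnabla.functions as F
import nnabla.parametric_functions as PF
from nnabla.ext_utils import get_extension_context

# Switch the default context to the cuDNN backend.
ctx = get_extension_context('cudnn', device_id='0')
nn.set_default_context(ctx)

# Any graph built from here on is executed on the GPU.
x = nn.Variable((8, 1, 28, 28))             # dummy input batch
h = F.relu(PF.convolution(x, 16, (3, 3)))   # runs through the cuDNN extension
y = PF.affine(h, 10)
x.d = np.random.randn(*x.shape)
y.forward()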

Float 16-bit precision (fp16, half) can also be used by setting the type_config option as follows.

ctx = get_extension_context('cudnn', device_id='0', type_config='half')

See the Mixed precision training tutorial for a stable training technique with fp16.
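
As a rough illustration of the idea behind that tutorial, here is a minimal sketch of static loss scaling; the tiny regression graph and the scale value 8 are arbitrary placeholders of ours, not taken from the tutorial:

import numpy as np
import nnabla as nn
import nnabla.functions as F
import nnabla.parametric_functions as PF
import nnabla.solvers as S
from nnabla.ext_utils import get_extension_context

nn.set_default_context(get_extension_context('cudnn', device_id='0', type_config='half'))

x = nn.Variable((8, 16))
loss = F.mean(F.squared_error(PF.affine(x, 1), F.constant(0, (8, 1))))
solver = S.Sgd(lr=0.01)
solver.set_parameters(nn.get_parameters())

loss_scale = 8.0
x.d = np.random.randn(*x.shape)
loss.forward()
solver.zero_grad()
loss.backward(loss_scale)              # scale up so small fp16 gradients don't flush to zero
for p in nn.get_parameters().values():
    p.g /= loss_scale                  # unscale the gradients before the weight update
solver.update()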

Currently, the binary package installation manual and the usage documentation are integrated into NNabla's documentation. For build instructions, see below.

Performance notes

Automatic Convolution algorithm selection

If cuDNN is enabled, the extension library by default uses specific convolution algorithms pre-optimized by cuDNN's heuristics.

Optionally, this library can automatically select the fastest algorithms for your own network, given its configuration of parameters (filter size, stride, dilation, pad, etc.), by exhaustively executing and timing every available algorithm (cudnnFindConvolution*Algorithm). The best algorithm is cached and re-used whenever an identical configuration is passed to our Convolution interface. This is very effective for speed, even in non-static (dynamic) neural networks. This mode is enabled by setting the environment variable NNABLA_CUDNN_ALGORITHM_BY_HEURISTIC to 0.

However, this often consumes a lot of memory because the automatically found algorithms may require a large workspace, and it sometimes fails on a GPU with little memory. To avoid this, you can limit the workspace size by setting the environment variable NNABLA_CUDNN_WORKSPACE_LIMIT (in bytes), which is read at runtime (not at compilation time). For example, NNABLA_CUDNN_WORKSPACE_LIMIT=134217728 limits the workspace size to 128 MB. The default value is -1, which means the workspace size is unlimited.

In some cases it may be desirable to restrict the automatic search to cuDNN convolution algorithms that produce deterministic (reproducible) results. This can be achieved by setting the environment variable NNABLA_CUDNN_DETERMINISTIC to a value other than 0.
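
For example, all three variables above can be set from Python before the extension is loaded; the workspace limit is documented as read at runtime, and we assume the other two are likewise read when the extension initializes:

import os

os.environ['NNABLA_CUDNN_ALGORITHM_BY_HEURISTIC'] = '0'              # exhaustive algorithm search
os.environ['NNABLA_CUDNN_WORKSPACE_LIMIT'] = str(128 * 1024 * 1024)  # cap the workspace at 128 MB
os.environ['NNABLA_CUDNN_DETERMINISTIC'] = '1'                       # deterministic algorithms only

# Import and initialize the extension only after the variables are set.
import nnabla as nn
from nnabla.ext_utils import get_extension_context

ctx = get_extension_context('cudnn', device_id='0')
nn.set_default_context(ctx)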

TensorFloat-32 (TF32)

In NNabla, the environment variable NNABLA_CUDA_ALLOW_TF32 controls whether TF32 is allowed (about TF32, see the blog post from NVIDIA). If NNABLA_CUDA_ALLOW_TF32 is not set (the default) or is 0, TF32 is disabled; otherwise, it is enabled. NNABLA_CUDA_ALLOW_TF32 always takes priority over NVIDIA_TF32_OVERRIDE. NNABLA_CUDA_ALLOW_TF32 is evaluated only when the NNabla CUDA extension is initialized; if it is changed later within the user program, the behavior is undefined.
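
A small sketch of enabling TF32; per the note above, the variable must be set before the CUDA extension is initialized:

import os

os.environ['NNABLA_CUDA_ALLOW_TF32'] = '1'  # unset or '0' disables TF32; any other value enables it

# The variable is evaluated only when the CUDA extension initializes,
# so it must be set before the first context is created.
import nnabla as nn
from nnabla.ext_utils import get_extension_context

ctx = get_extension_context('cudnn', device_id='0')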

FAQ

No FAQ so far.

nnabla-ext-cuda's People

Contributors

akiohayakawa-sony, enzhu-xu, fixstars-tetsurosakamoto, hakuturu583, ishihara-y, jx-huading, kazukiyoshiyama-sony, krishnaw10, qiiajia, qizhen-xue, stefanuhlich-sony, takuma-sony, takuyanarihira, takuyayashima, te-andrewshin, te-basavarajmurali, te-caojianxun, te-fanxl, te-hiroaki-mikami, te-naokiide, te-poornimabiradar, te-stephentiedemann, te-wangshi, te-woodyli, te-yongweisun, te-yoshiyukikobayashi, tomonobutsujikawa, yasunarizhashimoto, yuichimotai, yukiooobuchi


nnabla-ext-cuda's Issues

FFT/iFFT pytest cases fail on Win32 + CUDA 11.4 CUDA backend

We found that the CUDA 11.4 environment fails our pytest as follows:

nnabla\python\test\function\test_ifft.py:113: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
nnabla\python\test\nbla_test_utils.py:1064: in backward_function_tester
    sum_ograd.forward(clear_no_need_grad=True)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
>   ???
E   RuntimeError: target_specific error in nbla::exec_cufft
E   C:/gl/builds/qXMMCsJN/1/nnabla/builders/all/nnabla-ext-cuda/include\nbla/cuda/function/utils/fft.cuh:294
E   `cufftXtMakePlanMany(plan, rank, n.data(), inembed.data(), istride, idist, input_type, onembed.data(), ostride, odist, output_type, batch, &work_size, execution_type)` failed with CUFFT_INVALID_PLAN.
_variable.pyx:582: RuntimeError

We got a CUFFT_INVALID_PLAN error, but we have not found any suspicious points in our code.
We are now investigating the cause, including external libraries.
One possibility is related to the cuFFT library:
https://docs.nvidia.com/cuda/cuda-toolkit-release-notes/index.html#title-cufft-library
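
For reference, a hypothetical minimal snippet of the code path the test exercises, assuming nnabla's F.ifft API (a signal_ndim argument and complex values stored in a trailing dimension of size 2); this is a guess at the trigger, not a confirmed reproduction:

import numpy as np
import nnabla as nn
import nnabla.functions as F
from nnabla.ext_utils import get_extension_context

nn.set_default_context(get_extension_context('cudnn', device_id='0'))

# Complex input: the trailing dimension of size 2 holds (real, imag) parts.
x = nn.Variable.from_numpy_array(np.random.randn(4, 8, 8, 2))
y = F.ifft(x, signal_ndim=2, normalized=True)  # goes through nbla::exec_cufft
y.forward()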

Compile error

I compiled with CMake 3.14.3 and faced some compile errors (missing headers, undefined references).
I have solved the problem, so I would like to send you a PR.

Exceeding the maximum number of blocks in transpose kernel

The issue is reported in sony/nnabla-examples#224 by @stalagmite7.

Error during forward propagation:
  TransposeCuda <-- ERROR
Traceback (most recent call last):
  File "generate.py", line 105, in <module>
    main()
  File "generate.py", line 84, in main
    pre_gen_warp.forward(clear_buffer=True)
  File "_variable.pyx", line 564, in nnabla._variable.Variable.forward
RuntimeError: target_specific error in forward_impl
/home/gitlab-runner/builds/zxvvzZDJ/0/nnabla/builders/all/nnabla-ext-cuda/src/nbla/cuda/function/./generic/transpose.cu:184
(cudaGetLastError()) failed with "invalid configuration argument" (cudaErrorInvalidConfiguration).

https://github.com/sony/nnabla-ext-cuda/blob/master/src/nbla/cuda/function/generic/transpose.cu#L183
I guess we must introduce grid-stride loops in all kernels, so that each thread iterates over elements with a stride of gridDim.x * blockDim.x and the grid size no longer has to cover the whole input.

Build failed with CUDA 10.0

I tried to build nnabla-ext-cuda with CUDA 10.0, but it failed with errors like the ones below.

cmake -DNNABLA_DIR=../../nnabla -DCPPLIB_LIBRARY=../../nnabla/build/lib/libnnabla.so -DBUILD_PYTHON_PACKAGE=OFF -DBUILD_CPP_UTILS=ON ..

-- The C compiler identification is GNU 7.3.0
-- The CXX compiler identification is GNU 7.3.0
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Looking for pthread.h
-- Looking for pthread.h - found
-- Looking for pthread_create
-- Looking for pthread_create - not found
-- Looking for pthread_create in pthreads
-- Looking for pthread_create in pthreads - not found
-- Looking for pthread_create in pthread
-- Looking for pthread_create in pthread - found
-- Found Threads: TRUE  
-- Found CUDA: /usr/local/cuda (found version "10.0") 
-- Found cuDNN: /usr/include  
Updating /home/masaya/vrx_ws/src/vrx_packages/external/nnabla-ext-cuda/src/nbla/cuda/version.cpp.

-- CUDA--
Version: 10.0
Runtime: /usr/local/cuda/lib64/libcudart.so
CUBLAS: /usr/local/cuda/lib64/libcublas.soCUDA_cublas_device_LIBRARY-NOTFOUND
CURAND: /usr/local/cuda/lib64/libcurand.so
CUFFT: /usr/local/cuda/lib64/libcufft.so
cuDNN-libs: /usr/lib/x86_64-linux-gnu/libcudnn.so
cuDNN-includes: /usr/include
cuDNN version: 7.5.1
CUDA libs: /usr/local/cuda/lib64/libcudart.so;/usr/local/cuda/lib64/libcublas.so;CUDA_cublas_device_LIBRARY-NOTFOUND;/usr/local/cuda/lib64/libcurand.so;/usr/local/cuda/lib64/libcufft.so;/usr/lib/x86_64-linux-gnu/libcudnn.so
CUDA includes: /usr/local/cuda/include;/usr/include
-- Autodetected CUDA architecture(s): 6.1 
Arch: -gencode;arch=compute_61,code=sm_61
-- Found ZLIB: /usr/lib/x86_64-linux-gnu/libz.so (found version "1.2.11") 
Python build_ext compiler is inferred as 'unix'.
You can specify a compiler manually setting a variable NBLA_PYTHON_BUILD_EXT_COMPILER. You can see a list of supported compiler by `python setup.py build_ext --help-compiler`.
CMake Error: The following variables are used in this project, but they are set to NOTFOUND.
Please set them or make sure they are set and tested correctly in the CMake files:
CUDA_cublas_device_LIBRARY (ADVANCED)
    linked by target "nnabla_cuda" in directory /home/masaya/vrx_ws/src/vrx_packages/external/nnabla-ext-cuda/src/nbla/cuda

-- Configuring incomplete, errors occurred!
See also "/home/masaya/vrx_ws/src/vrx_packages/external/nnabla-ext-cuda/build/CMakeFiles/CMakeOutput.log".
See also "/home/masaya/vrx_ws/src/vrx_packages/external/nnabla-ext-cuda/build/CMakeFiles/CMakeError.log".

ImportError: libstdc++.so.6: version `CXXABI_1.3.8' not found

Hi, nnabla seems very cool!

While nnabla itself installed successfully via pip without any trouble, I got the errors below when I tried to run cmake. My environment is Ubuntu 16.04 and Python 2.7.

Is some setting incomplete, or is there a bug? Thanks in advance.

Traceback (most recent call last):
  File "/home/user/test/nnabla-ext-cuda/build-tools/cmake/get_nnabla_dir.py", line 16, in <module>
    import nnabla
  File "/home/user/.anaconda/envs/py27/lib/python2.7/site-packages/nnabla/__init__.py", line 16, in <module>
    import _init  # Must be imported first
ImportError: /home/user/.anaconda/envs/py27/lib/libstdc++.so.6: version `CXXABI_1.3.8' not found (required by /home/user/.anaconda/envs/py27/lib/python2.7/site-packages/nnabla/libnnabla.so)
CMake Error at CMakeLists.txt:20 (string):
  string sub-command STRIP requires two arguments.


Traceback (most recent call last):
  File "/home/user/test/nnabla-ext-cuda/build-tools/cmake/get_nnabla_cpu_lib.py", line 17, in <module>
    import nnabla
  File "/home/user/.anaconda/envs/py27/lib/python2.7/site-packages/nnabla/__init__.py", line 16, in <module>
    import _init  # Must be imported first
ImportError: /home/user/.anaconda/envs/py27/lib/libstdc++.so.6: version `CXXABI_1.3.8' not found (required by /home/user/.anaconda/envs/py27/lib/python2.7/site-packages/nnabla/libnnabla.so)
CMake Error at CMakeLists.txt:26 (string):
  string sub-command STRIP requires two arguments.


Downloading googletest
Downloading eigen
python: can't open file '/build-tools/cmake/get_setup_build_dir.py': [Errno 2] No such file or directory
CMake Error at CMakeLists.txt:55 (string):
  string sub-command STRIP requires two arguments.


Traceback (most recent call last):
  File "/home/user/test/nnabla-ext-cuda/build-tools/code_generator/generate.py", line 19, in <module>
    from generator_common.init_cpp_common import generate_init_cpp
ImportError: No module named generator_common.init_cpp_common
CMake Error at CMakeLists.txt:83 (include):
  include could not find load file:

    /build-tools/cmake/Utils.cmake


CMake Error at CMakeLists.txt:103 (nbla_warnings_disable):
  Unknown CMake command "nbla_warnings_disable".


-- Configuring incomplete, errors occurred!

grad pytest cases sometimes fail on Win32 with cudnn backend

We have noticed that our test_grad sometimes fails with a KeyError in the Windows environment,
and it seems to fail most often when auto_forward is True with the cudnn context.

Here is one of the examples:

__________ test_shared_leaf_variable_basic_arithmetics[True-ctx1-311] __________
[gw1] linux -- Python 3.10.13 /usr/local/bin/python

seed = 311
ctx = Context(backend=['cuda:float', 'cpu:float'], array_class='CudaCachedArray', device_id='0')
auto_forward = True

    @pytest.mark.parametrize("seed", [311])
    @pytest.mark.parametrize("ctx", ctx_list)
    @pytest.mark.parametrize("auto_forward", [True, False])
    def test_shared_leaf_variable_basic_arithmetics(seed, ctx, auto_forward):
        def add(x, derivative=0):
            if derivative == 0:
                return x + x + x
            if derivative == 1:
                return 3 * np.ones_like(x)
            if derivative == 2:
                return np.zeros_like(x)
    
        def sub(x, derivative=0):
            if derivative == 0:
                return x - x - x
            if derivative == 1:
                return -1 * np.ones_like(x)
            if derivative == 2:
                return np.zeros_like(x)
    
        def mul(x, derivative=0):
            if derivative == 0:
                return x * x * x
            if derivative == 1:
                return 3 * x ** 2
            if derivative == 2:
                return 6 * x
    
        def div(x, derivative=0):
            if derivative == 0:
                return x / x / x
            if derivative == 1:
                return - x ** -2
            if derivative == 2:
                return 2 * x ** -3
    
        # Settings
        nn.set_default_context(ctx)
        nn.set_auto_forward(auto_forward)
    
        for math_type in [add, sub, mul, div]:
            xd = np.random.randn(2, 3) + 0.5
            x = nn.Variable.from_numpy_array(xd).apply(need_grad=True)
            x.grad.zero()
            y = math_type(x)
            # First-order gradient
>           dy_dx = nn.grad([y], [x])

../../nnabla/python/test/test_grad.py:282: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
.local/lib/python3.10/site-packages/nnabla/grad.py:416: in grad
    grad_outputs = Grad()(outputs, inputs, grad_outputs=grad_outputs,
.local/lib/python3.10/site-packages/nnabla/grad.py:292: in __call__
    grad_outputs = self._connect_on_gradient_graph(grad_vars, f)
.local/lib/python3.10/site-packages/nnabla/grad.py:75: in _connect_on_gradient_graph
    v = vf_vb_map.get(o, [F.constant(0.0, o.shape)])
<constant>:3: in constant
    ???
.local/lib/python3.10/site-packages/nnabla/function_bases.py:3318: in constant
    return F.Constant(ctx, val, shape)(n_outputs=n_outputs, auto_forward=get_auto_forward(), outputs=outputs)
function.pyx:337: in nnabla.function.Function.__call__
    ???
function.pyx:310: in nnabla.function.Function._cg_call
    ???
<frozen importlib._bootstrap>:1024: in _find_and_load
    ???
<frozen importlib._bootstrap>:171: in __enter__
    ???
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = _ModuleLock('nnabla.nnabla.variable') at 140136728415920

>   ???
E   KeyError: 140137342793536

<frozen importlib._bootstrap>:123: KeyError

Now we're checking to see what's causing the problem.

Does nnabla support CUDA 11.4?

cuda version: 11.4
image: nnabla/nnabla-ext-cuda-multi-gpu:py37-cuda110-mpi3.1.6-v1.24.0 from DockerHub

Error:
mpirun --allow-run-as-root -n 4 python main.py -c cudnn -d 0,1,2,3 --batch_size 4
2022-01-11 09:11:01,381 [nnabla][INFO]: Initializing CPU extension...
2022-01-11 09:11:01,381 [nnabla][INFO]: Initializing CPU extension...
2022-01-11 09:11:01,381 [nnabla][INFO]: Initializing CPU extension...
2022-01-11 09:11:01,383 [nnabla][INFO]: Initializing CPU extension...
2022-01-11 09:11:01,908 [nnabla][INFO]: Initializing CUDA extension...
2022-01-11 09:11:01,909 [nnabla][INFO]: Initializing CUDA extension...
2022-01-11 09:11:01,927 [nnabla][INFO]: Initializing CUDA extension...
2022-01-11 09:11:01,927 [nnabla][INFO]: Initializing CUDA extension...
2022-01-11 09:11:05,054 [matplotlib.font_manager][INFO]: generated new fontManager
2022-01-11 09:11:05,060 [matplotlib.font_manager][INFO]: generated new fontManager
2022-01-11 09:11:05,155 [matplotlib.font_manager][INFO]: generated new fontManager
2022-01-11 09:11:05,256 [matplotlib.font_manager][INFO]: generated new fontManager
2022-01-11 09:11:06,033 [nnabla][INFO]: Initializing cuDNN extension...
2022-01-11 09:11:06,033 [nnabla][INFO]: Initializing cuDNN extension...
2022-01-11 09:11:06,034 [nnabla][INFO]: Initializing cuDNN extension...
value error in query
/home/gitlab-runner/builds/-phDBBa6/5/nnabla/builders/all/nnabla/include/nbla/function_registry.hpp:69
Failed it != items_.end(): Any of [cudnn:float, cuda:float, cpu:float] could not be found in []

No communicator found. Running with a single process. If you run this with MPI processes, all processes will perform totally same.
value error in query
/home/gitlab-runner/builds/-phDBBa6/5/nnabla/builders/all/nnabla/include/nbla/function_registry.hpp:69
Failed it != items_.end(): Any of [cudnn:float, cuda:float, cpu:float] could not be found in []

No communicator found. Running with a single process. If you run this with MPI processes, all processes will perform totally same.
2022-01-11 09:11:06,036 [nnabla][INFO]: Initializing cuDNN extension...
value error in query
/home/gitlab-runner/builds/-phDBBa6/5/nnabla/builders/all/nnabla/include/nbla/function_registry.hpp:69
Failed it != items_.end(): Any of [cudnn:float, cuda:float, cpu:float] could not be found in []

No communicator found. Running with a single process. If you run this with MPI processes, all processes will perform totally same.
value error in query
/home/gitlab-runner/builds/-phDBBa6/5/nnabla/builders/all/nnabla/include/nbla/function_registry.hpp:69
Failed it != items_.end(): Any of [cudnn:float, cuda:float, cpu:float] could not be found in []

No communicator found. Running with a single process. If you run this with MPI processes, all processes will perform totally same.

Cannot compile with GCC 11 due to missing limits header

Thank you for the great work.
When I tried to build in an Ubuntu 22.04 environment with GCC 11, I could not compile because the <limits> header was not included, so std::numeric_limits could not be found.

Condition

  • nnabla v1.35.1 setup
cd ~/opt
wget https://github.com/sony/nnabla/archive/v1.35.1.tar.gz
tar xvfz v1.35.1.tar.gz
rm v1.35.1.tar.gz

cd nnabla-1.35.1
mkdir -p build && cd build
cmake .. -DBUILD_CPP_UTILS=ON -DBUILD_PYTHON_PACKAGE=OFF -DNNABLA_UTILS_WITH_HDF5=ON
make
sudo make install
  • nnabla-ext-cuda v1.35.0 setup
cd ~/opt
wget https://github.com/sony/nnabla-ext-cuda/archive/v1.35.0.tar.gz -O ext-v1.35.0.tar.gz
tar xvfz ext-v1.35.0.tar.gz
rm ext-v1.35.0.tar.gz

cd nnabla-ext-cuda-1.35.0
pip3 install -U -r python/requirements.txt
mkdir build && cd build
cmake -DNNABLA_DIR=../../nnabla-1.35.1 -DCPPLIB_LIBRARY=../../nnabla-1.35.1/build/lib/libnnabla.so ..
make  # ← encountered this problem

Error Message

ubuntu@DESKTOP-FU0HHQF:~/opt/nnabla-ext-cuda-1.35.0_NG/build$ make
[  0%] Building NVCC (Device) object src/nbla/cuda/CMakeFiles/cuda_compile_1.dir/utils/cuda_compile_1_generated_scan_setup.cu.o
/home/ubuntu/opt/nnabla-ext-cuda-1.35.0_NG/src/nbla/cuda/utils/scan_setup.cu(52): error: namespace "std" has no member "numeric_limits"
    if (size_input > std::numeric_limits<int32_t>::max()) {
                          ^

/home/ubuntu/opt/nnabla-ext-cuda-1.35.0_NG/src/nbla/cuda/utils/scan_setup.cu(52): error: type name is not allowed
    if (size_input > std::numeric_limits<int32_t>::max()) {
                                         ^

/home/ubuntu/opt/nnabla-ext-cuda-1.35.0_NG/src/nbla/cuda/utils/scan_setup.cu(52): error: no instance of overloaded function "max" matches the argument list
    if (size_input > std::numeric_limits<int32_t>::max()) {
                                                 ^

3 errors detected in the compilation of "/home/ubuntu/opt/nnabla-ext-cuda-1.35.0_NG/src/nbla/cuda/utils/scan_setup.cu".
CMake Error at cuda_compile_1_generated_scan_setup.cu.o.Release.cmake:280 (message):
  Error generating file
  /home/ubuntu/opt/nnabla-ext-cuda-1.35.0_NG/build/src/nbla/cuda/CMakeFiles/cuda_compile_1.dir/utils/./cuda_compile_1_generated_scan_setup.cu.o


make[2]: *** [src/nbla/cuda/CMakeFiles/nnabla_cuda.dir/build.make:1792: src/nbla/cuda/CMakeFiles/cuda_compile_1.dir/utils/cuda_compile_1_generated_scan_setup.cu.o] Error 1
make[1]: *** [CMakeFiles/Makefile2:143: src/nbla/cuda/CMakeFiles/nnabla_cuda.dir/all] Error 2
make: *** [Makefile:156: all] Error 2

Cause of the problem

In a GCC 11 environment, it seems that std::numeric_limits cannot be resolved in scan_setup.cu with the existing header includes, and <limits> needs to be added explicitly.
The following change has little impact, and I think it is also valid in environments other than GCC 11.
I confirmed that the build succeeds with only this fix.

  • src/nbla/cuda/utils/scan_setup.cu
+ #include <limits>
#include <nbla/cuda/utils/scan_setup.hpp>

Thanks.


OS: WSL2 Ubuntu 22.04
GCC: v11.3.0

CUDA: v12.1.1
cuDNN: v8.9.1

nnabla_cli command does not work in the Docker environment

We released nnabla v1.33.0 and its Docker images, but we found that the nnabla_cli command does not work properly in the Docker environment.

$ docker run -it --rm nnabla/nnabla-ext-cuda-multi-gpu:py38-cuda110-mpi3.1.6-v1.33.0
==============================
== Neural Network Libraries ==
==============================
nnabla@2faabaf6bcae:~$ nnabla_cli
2023-02-02 13:02:26,645 [nnabla][INFO]: Initializing CPU extension...
/usr/local/bin/python3.8: Relink `/usr/local/lib/python3.8/site-packages/cryptography/hazmat/bindings/_rust.abi3.so' with `/lib/x86_64-linux-gnu/librt.so.1' for IFUNC symbol `clock_gettime'
Segmentation fault (core dumped)

We will fix this issue soon.

When is the next release planned?

Thank you for your hard work.

If possible, we would like to know when the next release of nnabla-ext-cuda is planned.
We need a release that includes the fix for the GCC 11 problem.

If it is scheduled to be released soon, we would like to wait for it.
Regards,

Module 'init' could not be found (Windows)

I'm using Anaconda and following instructions for installing on Windows.

Note: on Windows, you must conda install -c anaconda pywin32 for either version to work. This should be noted in the docs; otherwise you get a "no module named win32com.shell" error when you try to import nnabla.

The CPU version loads and I have CUDA/cuDNN working (I use the GPU version of TensorFlow on the same machine but in a different environment), but when I try to import nnabla_ext.cuda.cudnn:

Screenshot of ImportError

from init import (...
# ImportError: DLL load failed: The specified module could not be found.

I think there's a dependency missing. I did notice that there is an init package out there, but I don't know if that's actually the one that the developers are using here. (Otherwise I would have made a PR for it.)
