sameli / manylinux-cuda
manylinux docker images with CUDA Toolkit
Home Page: https://hub.docker.com/r/sameli/manylinux_2_28_x86_64_cuda_12.3
License: BSD 3-Clause "New" or "Revised" License
I want to compile TransformerEngine. Please include cuDNN, NCCL, and NVTX. Thanks.
When I start a container with "docker run -it sameli/manylinux2014_x86_64_cuda_12.3" and run the nvidia-smi command inside it, it fails with the following message. Am I missing something here?
[root@24c96d985c30 /]# nvidia-smi
NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.
I have also tried "docker run --gpus all sameli/manylinux2014_x86_64_cuda_12.3 nvidia-smi"
output:
docker: Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #0: error running hook: exit status 1, stdout: , stderr: Auto-detected mode as 'legacy'
nvidia-container-cli: initialization error: load library failed: libnvidia-ml.so.1: cannot open shared object file: no such file or directory: unknown.
ERRO[0000] error waiting for container:
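The error above names libnvidia-ml.so.1 as the library that could not be opened. As a rough diagnostic, the following sketch asks the dynamic loader whether a given shared library can be opened on the machine where it runs. The helper name can_load is hypothetical and not part of this repo, and the check says nothing about whether the driver itself is working.

```python
import ctypes


def can_load(library_name: str) -> bool:
    """Return True if the dynamic loader can open the named shared library."""
    try:
        ctypes.CDLL(library_name)
    except OSError:
        # Raised when the library is missing or cannot be loaded.
        return False
    return True


if __name__ == "__main__":
    # libnvidia-ml.so.1 is the library named in the error message above.
    name = "libnvidia-ml.so.1"
    state = "loadable" if can_load(name) else "not loadable"
    print(f"{name}: {state}")
```

If this reports "not loadable" on the Docker host, the NVIDIA driver libraries are probably not installed or not on the loader path there, which would be consistent with the message from nvidia-container-cli.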
To start with, I am from EECS UC Berkeley as well. Go Bears.
I am using PyO3 to compile a Python module from Rust that uses CUDA. The Rust crate that invokes CUDA is cust, which specifically asks for libcuda.so. It seems that the current minimal version of the image does not include this file.
So, I forked this repo and made a one-line change:
RUN yum -y install cuda-${VER}.${ARCH}
This works for me. I eventually published a separate Docker image on Docker Hub because my GitHub Actions workflows need it. I have tested it: the change is necessary and it works.
Nevertheless, I wonder whether you would consider offering such a full version (the image would be larger, at 4.83 GB) as one of your images. Compared with PyTorch's image, which does not work for me, this repo behaves more predictably and is simpler, so it would be a go-to option.
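To see whether a given image already ships libcuda.so, a small search like the one below can be run inside the container. This is only a sketch: the function name find_libcuda and the default search roots are assumptions, not paths confirmed by this repo, and should be adjusted to the image being inspected.

```python
from pathlib import Path

# Assumed search roots (hypothetical defaults); adjust to the image.
DEFAULT_ROOTS = ("/usr/local/cuda", "/usr/lib64", "/usr/lib")


def find_libcuda(roots=DEFAULT_ROOTS):
    """Return sorted paths of files named libcuda.so* under the given roots."""
    hits = []
    for root in roots:
        base = Path(root)
        if base.is_dir():
            # rglob walks the directory tree below base recursively.
            hits.extend(str(p) for p in base.rglob("libcuda.so*"))
    return sorted(hits)


if __name__ == "__main__":
    found = find_libcuda()
    if found:
        print("\n".join(found))
    else:
        print("libcuda.so not found under the searched roots")
```

An empty result only means the file is not under the searched directories, so it is worth widening the roots before concluding that the library is missing.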
This is incredibly useful, but it would be great to have 11.8. I tried building it myself but couldn't find the 11.8 equivalent for https://developer.download.nvidia.com/compute/cuda/11.7.0/local_installers/cuda-repo-rhel7-11-7-local-11.7.0_515.43.04-1.x86_64.rpm
Hi there,
I've been using your container in GitHub Actions like so:
jobs:
  maketest:
    runs-on: ubuntu-latest
    container:
      image: sameli/manylinux2014_x86_64_cuda_11
but recently I get the error
failed to register layer: write /usr/local/cuda-11.7/targets/x86_64-linux/lib/libcusolver_static.a: no space left on device
Do you know if there is any way to get the container up and running in a GitHub Action?
Did GitHub shrink the available disk space?
Many thanks in advance 👍
Paul
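Since the "no space left on device" error appears while the image layers are being written, one way to investigate is to measure free disk space on the runner before the image is pulled. The sketch below uses only the Python standard library. The names free_gib and has_room and the 10 GiB threshold are hypothetical, and the check would need to run in a step on the runner itself rather than inside the container job.

```python
import shutil


def free_gib(path: str = "/") -> float:
    """Return the free space, in GiB, on the filesystem containing path."""
    return shutil.disk_usage(path).free / 2**30


def has_room(path: str, needed_gib: float) -> bool:
    """Return True if at least needed_gib GiB are free at path."""
    return free_gib(path) >= needed_gib


if __name__ == "__main__":
    needed = 10.0  # hypothetical threshold, not a measured image size
    print(f"free: {free_gib('/'):.1f} GiB, enough: {has_room('/', needed)}")
```

Note that Docker may store its layers on a different filesystem than "/", so the path passed in should be the one that holds Docker's data directory on the runner.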
First of all, thank you very much for putting this together - I had some wheels break many months ago because I couldn't get CUDA installed correctly and these enabled me to resume building those wheels.
But a question/feature request: would it be possible to move to manylinux_2_28 instead of manylinux2014? The reason is that manylinux2014 wheels set _GLIBCXX_USE_CXX11_ABI=0, and thus libs in those wheels cannot be linked against in projects that use the new ABI (if 13 years old can be called new...). In addition, CentOS 7 (and manylinux2014 with it) is EOL this year, so you're probably going to be forced to bump soon anyway...