gpu number cannot be used (devices, open issue)

volcano-sh commented on June 21, 2024
gpu number cannot be used

from devices.

Comments (8)

wangyang0616 commented on June 21, 2024

/cc @wangyang0616 Can you help take a look?

ok, let me take a look

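Trainbow commented on June 21, 2024

Hello, I am trying out volcano's gpu-number scheduling. After installing it by following the volcano tutorial, every GPU node correctly reports how many GPUs it has. However, when I create a pod, the container does not have the volcano-gpu-number environment variable, and running nvidia-smi inside it shows all of the node's GPUs. Do I need to change the yaml file?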

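Thor-wl commented on June 21, 2024

Hello, I am trying out volcano's gpu-number scheduling. After installing it by following the volcano tutorial, every GPU node correctly reports how many GPUs it has. However, when I create a pod, the container does not have the volcano-gpu-number environment variable, and running nvidia-smi inside it shows all of the node's GPUs. Do I need to change the yaml file?

Hey, which version are you using?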

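Trainbow commented on June 21, 2024

Hello, I am trying out volcano's gpu-number scheduling. After installing it by following the volcano tutorial, every GPU node correctly reports how many GPUs it has. However, when I create a pod, the container does not have the volcano-gpu-number environment variable, and running nvidia-smi inside it shows all of the node's GPUs. Do I need to change the yaml file?

Hey, which version are you using?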

volcano-1.6.0

Thor-wl commented on June 21, 2024

/cc @wangyang0616 Can you help take a look?

wangyang0616 commented on June 21, 2024

@Trainbow Could you post the yaml file you used to create the test task?
By the way, can the pod be scheduled successfully with the default k8s scheduler?

Trainbow commented on June 21, 2024

@Trainbow Could you post the yaml file you used to create the test task? By the way, can the pod be scheduled successfully with the default k8s scheduler?

I used the sample yaml from the volcano gpu-number readme.

apiVersion: v1
kind: Pod
metadata:
  name: gpu-pod1
  namespace: model
spec:
  containers:
    - name: cuda-container
      image: nvidia/cuda:9.0-devel
      command: ["sleep"]
      args: ["100000"]
      resources:
        limits:
          volcano.sh/gpu-number: 1 # requesting 1 GPU card
          # nvidia.com/gpu: 1

I also installed nvidia's k8s-device-plugin for testing. For example, when the limits field uses nvidia.com/gpu, the pod's container works well and has one GPU device. When I use volcano.sh/gpu-number, the container's environment does not have the variable VOLCANO_GPU_ALLOCATED, and NVIDIA_VISIBLE_DEVICES is all.
I also tried GPU sharing with volcano, following the official tutorial, and in that case I can find the corresponding environment variables in the pod.
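
The check described above can be scripted. The following is a minimal sketch, not part of volcano or its device plugin: the function name check_gpu_env is hypothetical, and it only inspects the two variables named in this comment (VOLCANO_GPU_ALLOCATED and NVIDIA_VISIBLE_DEVICES). It reports what it finds and does not try to decide which plugin mode is active.

```python
import os


def check_gpu_env(env=None):
    """Return findings about the GPU-related variables mentioned in this thread.

    `env` is any mapping of variable names to values; it defaults to the
    environment of the current process (hypothetical helper, for illustration).
    """
    env = os.environ if env is None else env
    findings = []

    # The reporter expected this variable when requesting volcano.sh/gpu-number.
    allocated = env.get("VOLCANO_GPU_ALLOCATED")
    if allocated is None:
        findings.append("VOLCANO_GPU_ALLOCATED is not set")
    else:
        findings.append(f"VOLCANO_GPU_ALLOCATED={allocated}")

    # The value "all" matches the symptom reported here: every GPU is visible.
    visible = env.get("NVIDIA_VISIBLE_DEVICES")
    if visible is None:
        findings.append("NVIDIA_VISIBLE_DEVICES is not set")
    elif visible == "all":
        findings.append("NVIDIA_VISIBLE_DEVICES=all (no per-device restriction)")
    else:
        findings.append(f"NVIDIA_VISIBLE_DEVICES={visible}")

    return findings


if __name__ == "__main__":
    for line in check_gpu_env():
        print(line)
```

If Python is available in the container image, running this inside the pod prints one line per variable. Passing a plain dict instead lets the same check run against values copied from the output of env in the container.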

wangyang0616 commented on June 21, 2024

The volcano device plugin's GPU strategy defaults to share mode, which means you can use volcano.sh/gpu-memory.
To use volcano.sh/gpu-number, you need to set the strategy to `number`; for details, see: config-the-volcano-device-plugin-binary

Hope the above information is helpful to you.
