Comments (3)

keisukefukuda commented on August 16, 2024

Hi @yoshihingis, thanks for your report.

First, I edited your comment directly so that it looks a bit nicer.

For the cases where execution failed, I guess the reason is that python is misspelled (as pyhton or pthon), as the error messages imply.

Could you fix them and try again?

Thanks!
Keisuke

from chainermn.

yoshihingis commented on August 16, 2024

Dear keisukefukuda,

Thank you for your reply.
Sorry, I made a mistake: I copied and pasted unnecessary information in case (2) (normal Ubuntu).
Let me restate my question.

On normal Ubuntu, I set n=2 (-n 2), and train_mnist.py printed information for both GPUs, GPU0 and GPU1.
On the Docker container, I also set n=2 (-n 2), but train_mnist.py printed information for only one GPU, GPU0.

However, the elapsed times are almost the same in case (1) and case (2).

My guess is that train_mnist.py printed only the GPU0 information on the Docker container (case (1)), but actually used two GPUs (GPU0 and GPU1) during training.

I would like to know why train_mnist.py printed information for only one GPU on the Docker container.

Could you give me any advice?

keisukefukuda commented on August 16, 2024

@yoshihingis,

First, if you would like a more casual discussion, you can reach me on Twitter (keisukefukuda), possibly in Japanese if you feel more comfortable :)

The output

GPU: 0
# unit: 1000
# Minibatch-size: 100
# epoch: 20

should be shown only once if MPI is working correctly, as the code indicates:
https://github.com/chainer/chainermn/blob/master/examples/mnist/train_mnist.py#L65

Thus, (1) looks like it is working fine and (2) does not.

I'm not sure what's going on in (2). Is that the raw, unedited output?
I don't see a "Using hierarchical communicator" message in (2).
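
To illustrate why the banner should appear only once, here is a rough sketch that simulates the idea without MPI. It is not the actual code of train_mnist.py: the function names banner_lines and simulate are hypothetical, the banner text is copied from the output quoted above, and the "broken" case assumes one possible failure mode in which every process believes it is rank 0.

```python
def banner_lines(rank, unit=1000, batchsize=100, epoch=20):
    """Hypothetical stand-in for a rank-0 guard around the banner."""
    if rank != 0:
        # Every process other than rank 0 stays silent.
        return []
    return [
        "GPU: 0",  # copied from the quoted output, not derived from a device
        "# unit: {}".format(unit),
        "# Minibatch-size: {}".format(batchsize),
        "# epoch: {}".format(epoch),
    ]


def simulate(ranks):
    """Collect the banner output of every simulated process, in order."""
    output = []
    for rank in ranks:
        output.extend(banner_lines(rank))
    return output


# MPI working: two processes see ranks 0 and 1, so the banner appears once.
healthy = simulate([0, 1])

# Assumed failure mode: the processes do not form one communicator,
# so each one believes it is rank 0 and prints its own banner.
broken = simulate([0, 0])

print("banners when MPI works:", healthy.count("# epoch: 20"))
print("banners in the assumed failure mode:", broken.count("# epoch: 20"))
```

If the banner shows up once per process in a real run, that would be consistent with the processes not sharing a communicator, although other causes are possible.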
