
Multi-GPU parallelism (vegasflow issue, 5 comments, closed)

scarrazza commented on May 28, 2024

Multi-GPU parallelism

Comments (5)

scarrazza commented on May 28, 2024

I can confirm that point 1 is true: adding to the lepage example the call

tf.debugging.set_log_device_placement(True)

shows that the log never places an operation on GPU:1 but always on GPU:0 (even though nvidia-smi reports that the program is using memory from GPU:1).

Concerning points 2 and 3, I think the best tf-like approach is to do something like this:

strategy = tf.distribute.MirroredStrategy()

@tf.function
def run():
    with strategy.scope():
        strategy.experimental_run_v2(vegas, args=(lepage, dim, n_iter, ncalls))

i.e. using the tf.distribute API.


scarlehoff commented on May 28, 2024

One possibility, using MirroredStrategy (https://www.tensorflow.org/api_docs/python/tf/distribute/MirroredStrategy?version=stable), is to simply break the integration into equal chunks. That is not very useful for distributing the load. If we want to do it correctly, we need to implement something not very far from one of the scheduling types described here: http://jakascorner.com/blog/2016/06/omp-for-scheduling.html#the-scheduling-types
Which means creating our own strategy.
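For reference, a "dynamic" schedule in the OpenMP sense (each device grabs the next chunk as soon as it finishes its current one) can be sketched with the standard library alone. This is a hypothetical illustration, not vegasflow code: `integrate_chunk`, the dummy integrand, and the device names are all made up for the sketch.

```python
import queue
import threading


def integrate_chunk(device, n_calls):
    # Placeholder for evaluating n_calls Monte Carlo samples on `device`;
    # a real implementation would launch the TensorFlow kernel there.
    return sum(0.5 for _ in range(n_calls)) / n_calls  # dummy integrand == 0.5


def dynamic_integrate(devices, total_calls, chunk=1000):
    """OpenMP-style dynamic scheduling: one worker thread per device,
    each pulling the next chunk from a shared queue when it is free."""
    work = queue.Queue()
    n_full, rem = divmod(total_calls, chunk)
    for _ in range(n_full):
        work.put(chunk)
    if rem:
        work.put(rem)

    results = []
    lock = threading.Lock()

    def worker(device):
        while True:
            try:
                n = work.get_nowait()
            except queue.Empty:
                return  # no chunks left, this device is done
            partial = integrate_chunk(device, n)
            with lock:
                results.append((partial, n))

    threads = [threading.Thread(target=worker, args=(d,)) for d in devices]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    # weighted average of the per-chunk estimates
    return sum(r * n for r, n in results) / total_calls
```

With this scheme a fast device naturally processes more chunks than a slow one, which is the load-balancing behaviour an equal split cannot give.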


scarrazza commented on May 28, 2024

There are some projects, like https://github.com/horovod/horovod, which may help.


scarlehoff commented on May 28, 2024

Let's have a look.
I've been reading more into the TensorFlow distribution strategies and it seems only the Keras distribution is implemented; in order to use it we would have to tie our hands far too much, imho.

I think it is better if we deal with it on our own terms for now (and actually don't take it into consideration for the rest of the code), because we can always fall back on the parallel/joblib strategy.


scarrazza commented on May 28, 2024

In view of the great shape of #17, I think we should consider the possibility of inheriting from VegasFlow some extra classes which implement specific distribution techniques, such as:

  1. VegasFlow, the default single-GPU implementation.
  2. TPEVegasFlow, using the ThreadPoolExecutor from concurrent.futures (or similar).
  3. SparkVegasFlow, using Apache Spark.
  4. MPIVegasFlow, using OpenMPI.
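A minimal sketch of what such a hierarchy could look like, where the class names follow the list above but the `run_integration` hook, constructor arguments, and dummy integrand are illustrative assumptions rather than the actual vegasflow interface:

```python
from concurrent.futures import ThreadPoolExecutor


class VegasFlow:
    """Base integrator: default single-device implementation (sketch)."""

    def __init__(self, integrand, n_calls):
        self.integrand = integrand
        self.n_calls = n_calls

    def run_integration(self):
        # one device: evaluate all calls in a single pass
        return sum(self.integrand(i) for i in range(self.n_calls)) / self.n_calls


class TPEVegasFlow(VegasFlow):
    """Subclass that only overrides the distribution technique,
    farming chunks of calls out to a ThreadPoolExecutor."""

    def __init__(self, integrand, n_calls, workers=4):
        super().__init__(integrand, n_calls)
        self.workers = workers

    def run_integration(self):
        chunk = max(1, self.n_calls // self.workers)
        starts = range(0, self.n_calls, chunk)

        def partial_sum(start):
            stop = min(start + chunk, self.n_calls)
            return sum(self.integrand(i) for i in range(start, stop))

        with ThreadPoolExecutor(max_workers=self.workers) as pool:
            partials = pool.map(partial_sum, starts)
        return sum(partials) / self.n_calls
```

The point of the design is that callers only ever see `run_integration`; a SparkVegasFlow or MPIVegasFlow would slot in the same way, overriding only how the calls are scheduled.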

