
Multi-GPU parallelism (vegasflow issue, 5 comments, closed)

scarrazza commented on May 28, 2024

Multi-GPU parallelism

Comments (5)

scarrazza commented on May 28, 2024

I can confirm that point 1 is true: adding to the lepage example the call

tf.debugging.set_log_device_placement(True)

shows that the log never places an operation on GPU:1 but always on GPU:0 (even though nvidia-smi reports that the program is using memory from GPU:1).

Concerning points 2 and 3, I think the best tf-like approach is to do something like this:

strategy = tf.distribute.MirroredStrategy()

@tf.function
def run():
    with strategy.scope():
        strategy.experimental_run_v2(vegas, args=(lepage, dim, n_iter, ncalls))

i.e. using the tf.distribute API.


scarlehoff commented on May 28, 2024

One possibility, using MirroredStrategy (https://www.tensorflow.org/api_docs/python/tf/distribute/MirroredStrategy?version=stable), is to simply break the integration into equal chunks. That is not very useful for distributing the load. If we want to do it correctly, we need to implement something not very far from one of the scheduling types described here: http://jakascorner.com/blog/2016/06/omp-for-scheduling.html#the-scheduling-types
Which means creating our own strategy.
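For reference, a "dynamic" schedule in the OpenMP sense (each device grabs the next chunk as soon as it finishes its current one) can be sketched with the standard library alone. This is a hypothetical illustration, not vegasflow code: `integrate_chunk`, the dummy integrand, and the device names are all made up for the sketch.

```python
import queue
import threading


def integrate_chunk(device, n_calls):
    # Placeholder for evaluating n_calls Monte Carlo samples on `device`;
    # a real implementation would launch the TensorFlow kernel there.
    return sum(0.5 for _ in range(n_calls)) / n_calls  # dummy integrand == 0.5


def dynamic_integrate(devices, total_calls, chunk=1000):
    """OpenMP-style dynamic scheduling: one worker thread per device,
    each pulling the next chunk from a shared queue when it is free."""
    work = queue.Queue()
    n_full, rem = divmod(total_calls, chunk)
    for _ in range(n_full):
        work.put(chunk)
    if rem:
        work.put(rem)

    results = []
    lock = threading.Lock()

    def worker(device):
        while True:
            try:
                n = work.get_nowait()
            except queue.Empty:
                return  # no chunks left, this device is done
            partial = integrate_chunk(device, n)
            with lock:
                results.append((partial, n))

    threads = [threading.Thread(target=worker, args=(d,)) for d in devices]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    # weighted average of the per-chunk estimates
    return sum(r * n for r, n in results) / total_calls
```

With this scheme a fast device naturally processes more chunks than a slow one, which is the load-balancing behaviour an equal split cannot give.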


scarrazza commented on May 28, 2024

There are some projects, like https://github.com/horovod/horovod, which may help.


scarlehoff commented on May 28, 2024

Let's have a look.
I've been reading more into the TensorFlow distribution strategies and it seems only the Keras distribution is implemented; in order to use it we would have to tie our hands far too much, imho.

I think it is better if we deal with it on our own terms for now (and actually don't take it into consideration for the rest of the code), because we can always fall back on the parallel/joblib strategy.


scarrazza commented on May 28, 2024

In view of the great shape of #17, I think we should consider the possibility of inheriting from VegasFlow some extra classes which implement specific distribution techniques, such as:

  1. VegasFlow, the default single-GPU implementation.
  2. TPEVegasFlow, using the ThreadPoolExecutor from concurrent.futures (or similar).
  3. SparkVegasFlow, using Apache Spark.
  4. MPIVegasFlow, using OpenMPI.
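A minimal sketch of what such a hierarchy could look like, where the class names follow the list above but the `run_integration` hook, constructor arguments, and dummy integrand are illustrative assumptions rather than the actual vegasflow interface:

```python
from concurrent.futures import ThreadPoolExecutor


class VegasFlow:
    """Base integrator: default single-device implementation (sketch)."""

    def __init__(self, integrand, n_calls):
        self.integrand = integrand
        self.n_calls = n_calls

    def run_integration(self):
        # one device: evaluate all calls in a single pass
        return sum(self.integrand(i) for i in range(self.n_calls)) / self.n_calls


class TPEVegasFlow(VegasFlow):
    """Subclass that only overrides the distribution technique,
    farming chunks of calls out to a ThreadPoolExecutor."""

    def __init__(self, integrand, n_calls, workers=4):
        super().__init__(integrand, n_calls)
        self.workers = workers

    def run_integration(self):
        chunk = max(1, self.n_calls // self.workers)
        starts = range(0, self.n_calls, chunk)

        def partial_sum(start):
            stop = min(start + chunk, self.n_calls)
            return sum(self.integrand(i) for i in range(start, stop))

        with ThreadPoolExecutor(max_workers=self.workers) as pool:
            partials = pool.map(partial_sum, starts)
        return sum(partials) / self.n_calls
```

The point of the design is that callers only ever see `run_integration`; a SparkVegasFlow or MPIVegasFlow would slot in the same way, overriding only how the calls are scheduled.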

