Comments (14)
While I really want to see this as well, there's no easy way to do this, AFAIK. If someone knows how to measure GPU memory consumed exactly (other than running nvidia-smi at the exact time and hoping to catch a glimpse), I'd love to know.
from convnet-benchmarks.
Theano has theano.sandbox.cuda.cuda_ndarray.cuda_ndarray.mem_info():
(https://github.com/Theano/Theano/blob/master/theano/sandbox/cuda/cuda_ndarray.cu#L3418)
I am not sure about the other frameworks. Also, you have to make that function call.
from convnet-benchmarks.
Right, the CUDA API has cudaMemGetInfo, but the problem is that you don't know the peak memory usage while the kernels are running; you only get the memory usage before and after (or by chance in between, if the call itself is asynchronous and the timing is just right).
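One partial workaround (my own suggestion, not something any of the frameworks provide) is to poll free memory from a background thread while the kernels run and keep the minimum free value seen. Here `query_free_bytes` is a placeholder for whatever query you have available, e.g. cudaMemGetInfo via ctypes or parsing nvidia-smi output:

```python
import threading
import time

def track_peak_usage(query_free_bytes, interval=0.001):
    """Poll `query_free_bytes` in a background thread, recording the
    minimum free memory seen (i.e. the peak usage) until stopped.
    Returns a `stop` callable that joins the thread and returns the
    minimum free-memory reading observed."""
    state = {"min_free": query_free_bytes(), "run": True}

    def poll():
        while state["run"]:
            state["min_free"] = min(state["min_free"], query_free_bytes())
            time.sleep(interval)

    t = threading.Thread(target=poll)
    t.start()

    def stop():
        state["run"] = False
        t.join()
        return state["min_free"]

    return stop
```

This is still sampling-based, so it can miss short-lived allocations that happen between polls, but it beats a single before/after measurement.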
from convnet-benchmarks.
The functions that launch the kernels might have mallocs and frees before and after launching the kernel; we would want to be first-party to the mallocs before the frees happen.
from convnet-benchmarks.
This can be run only between Theano function calls, so you won't know the peak.
I see 2 ways to do this:
- give the formula for the extra memory requested on the GPU.
- run bigger and bigger experiments until one crashes with an out-of-memory error, then compare against the biggest one that worked.
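For the formula approach, the per-layer footprint of a convolution can be computed analytically from its shape. A back-of-the-envelope helper (my own sketch, not from the benchmark scripts), assuming float32 (4 bytes), valid convolution, and stride 1:

```python
def conv_layer_bytes(batch, in_ch, in_h, in_w, out_ch, k_h, k_w, dtype_bytes=4):
    """Rough memory footprint of one conv layer: input activations,
    weights, and output activations. Any workspace used by a particular
    implementation (e.g. im2col buffers) comes on top of this."""
    out_h = in_h - k_h + 1
    out_w = in_w - k_w + 1
    input_b = batch * in_ch * in_h * in_w * dtype_bytes
    weight_b = out_ch * in_ch * k_h * k_w * dtype_bytes
    output_b = batch * out_ch * out_h * out_w * dtype_bytes
    return input_b + weight_b + output_b
```

For example, a 128x3x128x128 input through 96 11x11 filters comes out around 710 MB of activations and weights alone, before any implementation-specific workspace.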
from convnet-benchmarks.
I like @nouiz's formula approach; we can calculate it from the method used.
from convnet-benchmarks.
@nouiz I can add the formulae. Could you help out in adding time benchmarks for the backprop calls for Theano? I have finished backprop numbers for all the others.
from convnet-benchmarks.
I'm pretty busy, I don't know when I can do it. What do you need exactly, a benchmark with the full forward/backward or just the backward? Is there an example I can follow?
from convnet-benchmarks.
What do you need exactly, a benchmark with the full forward/backward or just the backward?
Adding backward timings to the current benchmark is what I'm looking for:
https://github.com/soumith/convnet-benchmarks/blob/master/theano/pylearn2_benchmark.py
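The timing pattern is along these lines (a generic sketch of the harness, not the actual pylearn2 script); `fn` stands in for a compiled forward or forward+backward function:

```python
import time

def benchmark(fn, n_warmup=2, n_iters=10):
    """Average wall-clock seconds per call of `fn`, after warm-up runs.
    For GPU code, `fn` must synchronize before returning (e.g. by
    transferring its output to the host), otherwise this only measures
    kernel launch time."""
    for _ in range(n_warmup):
        fn()
    start = time.time()
    for _ in range(n_iters):
        fn()
    return (time.time() - start) / n_iters
```

The warm-up runs matter because the first call typically pays one-off compilation and allocation costs that shouldn't be averaged in.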
from convnet-benchmarks.
Okay, so you want one timing for the entirety of the forward/backward.
from convnet-benchmarks.
Oh, and just so we are clear, does this include an in-place parameter update? (In Torch: backward or backwardUpdate vs. just updateGradInput.)
from convnet-benchmarks.
Okay so you want one timing for the entirety of the forward/backward.
Correct.
Oh, an just so we are clear, does this includes an inplace parameter update?
backward (so updateGradInput + accGradParameters)
Thanks, Nick!
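To spell out that distinction for a single linear layer (a NumPy sketch of the semantics, not Torch code): updateGradInput computes the gradient w.r.t. the layer's input, accGradParameters accumulates the gradient w.r.t. its weights, and backward does both:

```python
import numpy as np

class Linear:
    """y = x @ W; backward = updateGradInput + accGradParameters."""
    def __init__(self, w):
        self.w = w
        self.grad_w = np.zeros_like(w)

    def update_grad_input(self, x, grad_out):
        # Gradient w.r.t. the input, to be passed to the layer below.
        return grad_out @ self.w.T

    def acc_grad_parameters(self, x, grad_out):
        # Accumulate the gradient w.r.t. the weights.
        self.grad_w += x.T @ grad_out

    def backward(self, x, grad_out):
        self.acc_grad_parameters(x, grad_out)
        return self.update_grad_input(x, grad_out)
```

So a benchmark that only times updateGradInput would undercount the work that training actually does per step.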
from convnet-benchmarks.
Translated from French:
OK, so Fred, it will be a benchmark that includes the forward pass, as well as an update of the parameters using the gradients. And the update can be done in place.
from convnet-benchmarks.
Okay, so the Theano backward is complete: #11 @f0k.
from convnet-benchmarks.