Comments (14)
While I really want to see this as well, there's no easy way to do this, AFAIK. If someone knows how to measure GPU memory consumed exactly (other than running nvidia-smi at the exact time and hoping to catch a glimpse), I'd love to know.
from convnet-benchmarks.
Theano has theano.sandbox.cuda.cuda_ndarray.cuda_ndarray.mem_info():
(https://github.com/Theano/Theano/blob/master/theano/sandbox/cuda/cuda_ndarray.cu#L3418)
I am not sure about the other frameworks. Also, you have to make that function call.
from convnet-benchmarks.
Right, the CUDA API has cudaMemGetInfo, but the problem is that you don't know the peak memory usage while the kernels are running; you only get the memory usage before and after (or by chance in between, if the call itself is asynchronous and the timing is just right).
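One partial workaround (my own suggestion, not something any of the frameworks provide) is to poll free memory from a background thread while the kernels run and keep the minimum free value seen. Here `query_free_bytes` is a placeholder for whatever query you have available, e.g. cudaMemGetInfo via ctypes or parsing nvidia-smi output:

```python
import threading
import time

def track_peak_usage(query_free_bytes, interval=0.001):
    """Poll `query_free_bytes` in a background thread, recording the
    minimum free memory seen (i.e. the peak usage) until stopped.
    Returns a `stop` callable that joins the thread and returns the
    minimum free-memory reading observed."""
    state = {"min_free": query_free_bytes(), "run": True}

    def poll():
        while state["run"]:
            state["min_free"] = min(state["min_free"], query_free_bytes())
            time.sleep(interval)

    t = threading.Thread(target=poll)
    t.start()

    def stop():
        state["run"] = False
        t.join()
        return state["min_free"]

    return stop
```

This is still sampling-based, so it can miss short-lived allocations that happen between polls, but it beats a single before/after measurement.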
from convnet-benchmarks.
The functions that launch the kernels might have mallocs and frees before and after launching the kernel; we would want to be first-party to the mallocs before the frees happen.
from convnet-benchmarks.
This can be run only between Theano function calls, so you won't know the peak.
I see 2 ways to do this:
- give the formula for the extra memory requested on the GPU.
- run bigger and bigger experiments until one crashes with an out-of-memory error, then compare against the biggest one that worked.
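For the formula approach, the per-layer footprint of a convolution can be computed analytically from its shape. A back-of-the-envelope helper (my own sketch, not from the benchmark scripts), assuming float32 (4 bytes), valid convolution, and stride 1:

```python
def conv_layer_bytes(batch, in_ch, in_h, in_w, out_ch, k_h, k_w, dtype_bytes=4):
    """Rough memory footprint of one conv layer: input activations,
    weights, and output activations. Any workspace used by a particular
    implementation (e.g. im2col buffers) comes on top of this."""
    out_h = in_h - k_h + 1
    out_w = in_w - k_w + 1
    input_b = batch * in_ch * in_h * in_w * dtype_bytes
    weight_b = out_ch * in_ch * k_h * k_w * dtype_bytes
    output_b = batch * out_ch * out_h * out_w * dtype_bytes
    return input_b + weight_b + output_b
```

For example, a 128x3x128x128 input through 96 11x11 filters comes out around 710 MB of activations and weights alone, before any implementation-specific workspace.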
from convnet-benchmarks.
I like @nouiz's formula approach; we can calculate it from the method used.
from convnet-benchmarks.
@nouiz I can add the formulae. Could you help out in adding time benchmarks for the backprop calls for Theano? I have finished backprop numbers for all the others.
from convnet-benchmarks.
I'm pretty busy, I don't know when I can do it. What do you need exactly, a benchmark with the full forward/backward or just the backward? Is there an example I can follow?
from convnet-benchmarks.
What do you need exactly, a benchmark with the full forward/backward or just the backward?
Adding backward timings to the current benchmark is what I'm looking for:
https://github.com/soumith/convnet-benchmarks/blob/master/theano/pylearn2_benchmark.py
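The timing pattern is along these lines (a generic sketch of the harness, not the actual pylearn2 script); `fn` stands in for a compiled forward or forward+backward function:

```python
import time

def benchmark(fn, n_warmup=2, n_iters=10):
    """Average wall-clock seconds per call of `fn`, after warm-up runs.
    For GPU code, `fn` must synchronize before returning (e.g. by
    transferring its output to the host), otherwise this only measures
    kernel launch time."""
    for _ in range(n_warmup):
        fn()
    start = time.time()
    for _ in range(n_iters):
        fn()
    return (time.time() - start) / n_iters
```

The warm-up runs matter because the first call typically pays one-off compilation and allocation costs that shouldn't be averaged in.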
from convnet-benchmarks.
Okay, so you want one timing for the entirety of the forward/backward.
from convnet-benchmarks.
Oh, and just so we are clear, does this include an in-place parameter update? (In Torch: backward or backwardUpdate vs. just updateGradInput.)
from convnet-benchmarks.
Okay so you want one timing for the entirety of the forward/backward.
Correct.
Oh, an just so we are clear, does this includes an inplace parameter update?
backward (so updateGradInput + accGradParameters)
Thanks, Nick!
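To spell out that distinction for a single linear layer (a NumPy sketch of the semantics, not Torch code): updateGradInput computes the gradient w.r.t. the layer's input, accGradParameters accumulates the gradient w.r.t. its weights, and backward does both:

```python
import numpy as np

class Linear:
    """y = x @ W; backward = updateGradInput + accGradParameters."""
    def __init__(self, w):
        self.w = w
        self.grad_w = np.zeros_like(w)

    def update_grad_input(self, x, grad_out):
        # Gradient w.r.t. the input, to be passed to the layer below.
        return grad_out @ self.w.T

    def acc_grad_parameters(self, x, grad_out):
        # Accumulate the gradient w.r.t. the weights.
        self.grad_w += x.T @ grad_out

    def backward(self, x, grad_out):
        self.acc_grad_parameters(x, grad_out)
        return self.update_grad_input(x, grad_out)
```

So a benchmark that only times updateGradInput would undercount the work that training actually does per step.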
from convnet-benchmarks.
Translated from French:
OK, so Fred, it will be a benchmark that includes the forward pass, as well as an update of the parameters using the gradients. And the update can be done in place.
from convnet-benchmarks.
Okay, so the Theano backward is complete: #11 @f0k.
from convnet-benchmarks.