gpu_benchmark_tools's People

Contributors

fychao, twsc-aug

Forkers

twsc-aug

gpu_benchmark_tools's Issues

caffe over imagenet

I0321 01:19:19.885327  4390 sgd_solver.cpp:396] Snapshotting solver state to binary proto file snapshot/inception_resnet_v2_iter_110194.solverstate
I0321 01:19:20.014502  4390 solver.cpp:462] [0] Optimization stopped early.
I0321 01:19:20.016074  4311 parallel.cpp:67] Root Solver performance on device 0: 1.916 * 1 = 1.916 img/sec (110194 itr in 5.752e+04 sec)
I0321 01:19:20.016113  4311 parallel.cpp:72]      Solver performance on device 1: 1.916 * 1 = 1.916 img/sec (110194 itr in 5.752e+04 sec)
I0321 01:19:20.016136  4311 parallel.cpp:72]      Solver performance on device 2: 1.916 * 1 = 1.916 img/sec (110194 itr in 5.752e+04 sec)
I0321 01:19:20.016156  4311 parallel.cpp:72]      Solver performance on device 3: 1.916 * 1 = 1.916 img/sec (110194 itr in 5.752e+04 sec)
I0321 01:19:20.016176  4311 parallel.cpp:72]      Solver performance on device 4: 1.916 * 1 = 1.916 img/sec (110194 itr in 5.752e+04 sec)
I0321 01:19:20.016198  4311 parallel.cpp:72]      Solver performance on device 5: 1.916 * 1 = 1.916 img/sec (110194 itr in 5.752e+04 sec)
I0321 01:19:20.016218  4311 parallel.cpp:72]      Solver performance on device 6: 1.916 * 1 = 1.916 img/sec (110194 itr in 5.752e+04 sec)
I0321 01:19:20.016239  4311 parallel.cpp:72]      Solver performance on device 7: 1.916 * 1 = 1.916 img/sec (110194 itr in 5.752e+04 sec)
I0321 01:19:20.016254  4311 parallel.cpp:75] Overall multi-GPU performance: 15.3254 img/sec
I0321 01:19:30.776834  4311 caffe.cpp:231] Optimization Done in 16h 34m 53s
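
The throughput figures in the log above can be re-derived from the iteration count and elapsed time it reports. The following is a minimal sketch, assuming the `* 1` factor in each `parallel.cpp` line is the per-device batch size (the log does not say so explicitly); the variable names are illustrative only.

```python
# Figures copied from the parallel.cpp log lines above.
iterations = 110194
elapsed_sec = 5.752e4
images_per_iteration = 1  # the "* 1" factor in the log; assumed to be the per-device batch size
num_devices = 8           # devices 0 through 7

# Per-device rate: iterations per second times images per iteration.
per_device = iterations / elapsed_sec * images_per_iteration

# Overall rate: every device reports the same rate, so multiply by the device count.
overall = per_device * num_devices

print(f"per device: {per_device:.3f} img/sec")
print(f"overall:    {overall:.2f} img/sec")
```

The result differs from the logged 15.3254 img/sec only in the last digits, presumably because the log prints the elapsed time rounded to four significant figures.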

using torch

   git clone https://github.com/torch/distro.git ~/torch --recursive
   cd ~/torch; bash install-deps;
   bash install-deps
   vi install-deps
   bash install-deps
   ./install.sh
   export TORCH_NVCC_FLAGS="-D__CUDA_NO_HALF_OPERATORS__"
   ./install.sh
   cd ../
   ls
   cd OpenNMT/
   ls
   cd
   ls
   source .bashrc
   ls
   vi .bashrc
   th
   ls
   cd OpenNMT/
   ls
   vi README.md
   luarocks install tds
   vi README.md
   luarocks install bit32

IB+TF+DOCKER

root@7b8cb7323ac8:/# ib_write_bw -a -F 172.16.130.2 -d mlx5_1 --report_gbits
---------------------------------------------------------------------------------------
                    RDMA_Write BW Test
 Dual-port       : OFF          Device         : mlx5_1
 Number of qps   : 1            Transport type : IB
 Connection type : RC           Using SRQ      : OFF
 TX depth        : 128
 CQ Moderation   : 100
 Mtu             : 4096[B]
 Link type       : IB
 Max inline data : 0[B]
 rdma_cm QPs     : OFF
 Data ex. method : Ethernet
---------------------------------------------------------------------------------------
 local address: LID 0x11 QPN 0x0105 PSN 0xf34dc6 RKey 0x006b01 VAddr 0x007ff701e41000
 remote address: LID 0x12 QPN 0x0106 PSN 0x9fc3f6 RKey 0x004de5 VAddr 0x007f43ce6e9000
---------------------------------------------------------------------------------------
 #bytes     #iterations    BW peak[Gb/sec]    BW average[Gb/sec]   MsgRate[Mpps]
 2          5000           0.054671            0.054299            3.393695
 4          5000             0.16               0.16               4.848040
 8          5000             0.32               0.31               4.903680
 16         5000             0.64               0.62               4.861019
 32         5000             1.28               1.25               4.899429
 64         5000             2.56               2.52               4.925522
 128        5000             5.12               4.97               4.851018
 256        5000             10.28              9.97               4.867019
 512        5000             23.11              22.80              5.565797
 1024       5000             50.07              46.70              5.700940
 2048       5000             76.47              74.62              4.554210
 4096       5000             86.03              85.82              2.619163
 8192       5000             93.48              93.46              1.426129
 16384      5000             94.40              94.36              0.719899
 32768      5000             95.17              95.17              0.363046
 65536      5000             95.25              95.25              0.181671
 131072     5000             95.91              95.91              0.091469
 262144     5000             95.68              95.67              0.045619
 524288     5000             95.70              95.70              0.022818
 1048576    5000             95.70              95.02              0.011328
 2097152    5000             95.65              95.65              0.005701
 4194304    5000             95.67              95.66              0.002851
 8388608    5000             95.30              95.30              0.001420
---------------------------------------------------------------------------------------
root@7b8cb7323ac8:/#
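
As a sanity check on the table above, the average bandwidth column should follow from the message size and the message rate. This is a rough sketch, assuming `MsgRate[Mpps]` means millions of messages per second and that `--report_gbits` uses decimal gigabits (10^9 bits); the helper name `bandwidth_gbits` is made up for this example.

```python
def bandwidth_gbits(msg_bytes: int, msg_rate_mpps: float) -> float:
    """Convert a message size and a rate in millions of messages/sec to Gb/sec."""
    bits_per_sec = msg_bytes * 8 * msg_rate_mpps * 1e6
    return bits_per_sec / 1e9

# (message size in bytes, MsgRate[Mpps], reported BW average[Gb/sec]) copied from the table.
rows = [
    (4096, 2.619163, 85.82),
    (65536, 0.181671, 95.25),
    (8388608, 0.001420, 95.30),
]

for size, rate, reported in rows:
    computed = bandwidth_gbits(size, rate)
    print(f"{size:>8} bytes: computed {computed:.2f} Gb/sec, reported {reported:.2f} Gb/sec")
```

Small differences in the last digit are expected for the largest messages, where the printed message rate has only a few significant digits.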

OpenNMT test

on test data

[01/24/18 09:54:24 INFO] Epoch 13 ; Iteration 50/92 ; Optim SGD LR 0.240100 ; Source tokens/s 7603 ; Perplexity 262.84
[01/24/18 09:54:38 INFO] Epoch 13 ; Iteration 92/92 ; Optim SGD LR 0.240100 ; Source tokens/s 7492 ; Perplexity 270.82
[01/24/18 09:54:38 INFO] Evaluating on the validation dataset...
[01/24/18 09:54:42 INFO] Validation perplexity: 804.17
[01/24/18 09:54:42 INFO] Saving checkpoint to 'demo_epoch13_804.17.t7'...

real    8m27.912s
user    10m41.568s
sys     2m6.540s

tmp for response

The inRange function (prepare.py, line 91) compares the timestamp of each image file in ./cctv_imgs/$CCTV_CODE with the converted speed timestamp taken from the extracted xml.gz file in ./cctv_imgs/speed/. If the difference between the image and speed timestamps is within the range 0-60, the function returns a tuple (point, tm) and control goes back to line 136.

So if your server time is not synchronized with an up-to-date NTP server (for example via ntpdate), the time conversion will be inaccurate and the timestamps will not match, leaving line 139 with no image/speed pairs to process.
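
The effect of clock skew on this matching can be illustrated with a small sketch. It is not the code from prepare.py: the function `in_range`, its signature, the use of an absolute difference, and the assumption that the 0-60 window is measured in seconds are all hypothetical stand-ins for the behavior described above.

```python
from datetime import datetime, timedelta
from typing import Optional, Tuple

MAX_OFFSET = 60  # upper bound of the 0-60 window; the unit is assumed to be seconds


def in_range(image_tm: datetime, speed_tm: datetime, point: str) -> Optional[Tuple[str, datetime]]:
    """Return (point, tm) when the two timestamps are close enough, else None."""
    # Assumption: the comparison uses the absolute difference between the timestamps.
    diff = abs((image_tm - speed_tm).total_seconds())
    if 0 <= diff <= MAX_OFFSET:
        return (point, speed_tm)
    return None


speed_tm = datetime(2018, 1, 24, 9, 54, 0)
image_tm = datetime(2018, 1, 24, 9, 54, 30)

# With synchronized clocks the image is 30 units from the speed record, so it pairs up.
matched = in_range(image_tm, speed_tm, "P1")

# With a server clock that is five minutes off, the same image no longer pairs up.
skewed = in_range(image_tm + timedelta(minutes=5), speed_tm, "P1")

print(matched)
print(skewed)
```

If every image is shifted the same way, no pair survives the check, which matches the empty image/speed list described above.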

sling benchmark

INFO:tensorflow:Saving flow to /tmp/sempar-conll/sempar.flow
INFO:tensorflow:Best checkpoint written to /tmp/sempar-conll/checkpoints/best
2018-01-23 05:33:03.319294: I third_party/syntaxnet/dragnn/core/compute_session_pool.cc:53] Destroying pool: total number of sessions created = 1
Done.

real    155m1.453s
user    460m3.800s
sys     121m4.520s

resnet on cifar10

INFO:tensorflow:loss = 0.15291552, step = 97669 (5.215 sec)
2018-01-23 07:12:37.500129: W tensorflow/core/framework/op_kernel.cc:1192] Out of range: End of sequence
         [[Node: IteratorGetNext = IteratorGetNext[output_shapes=[[?,32,32,3], [?,10]], output_types=[DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](OneShotIterator)]]
2018-01-23 07:12:37.501307: W tensorflow/core/framework/op_kernel.cc:1192] Out of range: End of sequence
         [[Node: IteratorGetNext = IteratorGetNext[output_shapes=[[?,32,32,3], [?,10]], output_types=[DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](OneShotIterator)]]
2018-01-23 07:12:37.501393: W tensorflow/core/framework/op_kernel.cc:1192] Out of range: End of sequence
         [[Node: IteratorGetNext = IteratorGetNext[output_shapes=[[?,32,32,3], [?,10]], output_types=[DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](OneShotIterator)]]
INFO:tensorflow:Saving checkpoints for 97675 into /tmp/cifar10_model/model.ckpt.
INFO:tensorflow:Loss for final step: 0.1549716.
INFO:tensorflow:Starting evaluation at 2018-01-23-07:12:38
2018-01-23 07:12:39.422733: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1120] Creating TensorFlow device (/device:GPU:0) -> (device: 0, name: Tesla P100-PCIE-12GB, pci bus id: 0000:03:00.0, compute capability: 6.0)
2018-01-23 07:12:39.422844: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1120] Creating TensorFlow device (/device:GPU:1) -> (device: 1, name: Tesla P100-PCIE-12GB, pci bus id: 0000:82:00.0, compute capability: 6.0)
INFO:tensorflow:Restoring parameters from /tmp/cifar10_model/model.ckpt-97675
INFO:tensorflow:Finished evaluation at 2018-01-23-07:12:41
INFO:tensorflow:Saving dict for global step 97675: accuracy = 0.9253, global_step = 97675, loss = 0.4823422
{'loss': 0.4823422, 'global_step': 97675, 'accuracy': 0.9253}

real    89m49.148s
user    333m33.540s
sys     73m16.852s
