fychao / gpu_benchmark_tools
This is a collection of GPU benchmark tools.
License: GNU General Public License v3.0
I0321 01:19:19.885327 4390 sgd_solver.cpp:396] Snapshotting solver state to binary proto file snapshot/inception_resnet_v2_iter_110194.solverstate
I0321 01:19:20.014502 4390 solver.cpp:462] [0] Optimization stopped early.
I0321 01:19:20.016074 4311 parallel.cpp:67] Root Solver performance on device 0: 1.916 * 1 = 1.916 img/sec (110194 itr in 5.752e+04 sec)
I0321 01:19:20.016113 4311 parallel.cpp:72] Solver performance on device 1: 1.916 * 1 = 1.916 img/sec (110194 itr in 5.752e+04 sec)
I0321 01:19:20.016136 4311 parallel.cpp:72] Solver performance on device 2: 1.916 * 1 = 1.916 img/sec (110194 itr in 5.752e+04 sec)
I0321 01:19:20.016156 4311 parallel.cpp:72] Solver performance on device 3: 1.916 * 1 = 1.916 img/sec (110194 itr in 5.752e+04 sec)
I0321 01:19:20.016176 4311 parallel.cpp:72] Solver performance on device 4: 1.916 * 1 = 1.916 img/sec (110194 itr in 5.752e+04 sec)
I0321 01:19:20.016198 4311 parallel.cpp:72] Solver performance on device 5: 1.916 * 1 = 1.916 img/sec (110194 itr in 5.752e+04 sec)
I0321 01:19:20.016218 4311 parallel.cpp:72] Solver performance on device 6: 1.916 * 1 = 1.916 img/sec (110194 itr in 5.752e+04 sec)
I0321 01:19:20.016239 4311 parallel.cpp:72] Solver performance on device 7: 1.916 * 1 = 1.916 img/sec (110194 itr in 5.752e+04 sec)
I0321 01:19:20.016254 4311 parallel.cpp:75] Overall multi-GPU performance: 15.3254 img/sec
I0321 01:19:30.776834 4311 caffe.cpp:231] Optimization Done in 16h 34m 53s
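The per-device and overall throughput figures in the log above can be reproduced from the iteration count and the elapsed time. The following is a minimal sketch, assuming a per-device batch size of 1 (the "* 1" factor in the log) and that the overall figure is the sum over the 8 devices. The variable names are illustrative and do not come from Caffe.

```python
# Values copied from the log above.
iterations = 110194        # "110194 itr"
elapsed_sec = 5.752e4      # "5.752e+04 sec" (rounded in the log)
batch_per_device = 1       # assumption: the "* 1" factor is the per-device batch size
num_devices = 8            # devices 0..7

# Iterations per second on one device, times images per iteration.
per_device_img_sec = iterations / elapsed_sec * batch_per_device

# Assumption: the overall figure is the sum of the per-device rates.
overall_img_sec = per_device_img_sec * num_devices

print(f"per device: {per_device_img_sec:.3f} img/sec")
print(f"overall:    {overall_img_sec:.2f} img/sec")
```

The result differs slightly from the logged 15.3254 img/sec because the elapsed time is printed with only four significant digits.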
git clone https://github.com/torch/distro.git ~/torch --recursive
cd ~/torch; bash install-deps;
bash install-deps
vi install-deps
bash install-deps
./install.sh
export TORCH_NVCC_FLAGS="-D__CUDA_NO_HALF_OPERATORS__"
./install.sh
cd ../
ls
cd OpenNMT/
ls
cd
ls
source .bashrc
ls
vi .bashrc
th
ls
cd OpenNMT/
ls
vi README.md
luarocks install tds
vi README.md
luarocks install bit32
root@7b8cb7323ac8:/# ib_write_bw -a -F 172.16.130.2 -d mlx5_1 --report_gbits
---------------------------------------------------------------------------------------
RDMA_Write BW Test
Dual-port : OFF Device : mlx5_1
Number of qps : 1 Transport type : IB
Connection type : RC Using SRQ : OFF
TX depth : 128
CQ Moderation : 100
Mtu : 4096[B]
Link type : IB
Max inline data : 0[B]
rdma_cm QPs : OFF
Data ex. method : Ethernet
---------------------------------------------------------------------------------------
local address: LID 0x11 QPN 0x0105 PSN 0xf34dc6 RKey 0x006b01 VAddr 0x007ff701e41000
remote address: LID 0x12 QPN 0x0106 PSN 0x9fc3f6 RKey 0x004de5 VAddr 0x007f43ce6e9000
---------------------------------------------------------------------------------------
#bytes #iterations BW peak[Gb/sec] BW average[Gb/sec] MsgRate[Mpps]
2 5000 0.054671 0.054299 3.393695
4 5000 0.16 0.16 4.848040
8 5000 0.32 0.31 4.903680
16 5000 0.64 0.62 4.861019
32 5000 1.28 1.25 4.899429
64 5000 2.56 2.52 4.925522
128 5000 5.12 4.97 4.851018
256 5000 10.28 9.97 4.867019
512 5000 23.11 22.80 5.565797
1024 5000 50.07 46.70 5.700940
2048 5000 76.47 74.62 4.554210
4096 5000 86.03 85.82 2.619163
8192 5000 93.48 93.46 1.426129
16384 5000 94.40 94.36 0.719899
32768 5000 95.17 95.17 0.363046
65536 5000 95.25 95.25 0.181671
131072 5000 95.91 95.91 0.091469
262144 5000 95.68 95.67 0.045619
524288 5000 95.70 95.70 0.022818
1048576 5000 95.70 95.02 0.011328
2097152 5000 95.65 95.65 0.005701
4194304 5000 95.67 95.66 0.002851
8388608 5000 95.30 95.30 0.001420
---------------------------------------------------------------------------------------
root@7b8cb7323ac8:/#
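The bandwidth and message-rate columns in the table above are tied together by the message size: bandwidth in Gb/sec equals bytes per message, times 8 bits, times messages per second, divided by 1e9. The sketch below checks that relationship against three rows copied from the table. The function name is illustrative and is not part of the perftest tools.

```python
def bandwidth_gbits(msg_bytes, msg_rate_mpps):
    """Convert a message size and a message rate (millions of packets/sec) to Gb/sec."""
    bits_per_sec = msg_bytes * 8 * msg_rate_mpps * 1e6
    return bits_per_sec / 1e9

# (bytes, BW average [Gb/sec], MsgRate [Mpps]) rows copied from the output above.
rows = [
    (2, 0.054299, 3.393695),
    (4096, 85.82, 2.619163),
    (8388608, 95.30, 0.001420),
]

for msg_bytes, reported_bw, rate in rows:
    computed = bandwidth_gbits(msg_bytes, rate)
    print(f"{msg_bytes:>8} bytes: reported {reported_bw} Gb/sec, computed {computed:.4f} Gb/sec")
```

Read this way, the table shows small messages limited by message rate (roughly 5 Mpps) while large messages saturate the link at about 95 Gb/sec.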
on test data
[01/24/18 09:54:24 INFO] Epoch 13 ; Iteration 50/92 ; Optim SGD LR 0.240100 ; Source tokens/s 7603 ; Perplexity 262.84
[01/24/18 09:54:38 INFO] Epoch 13 ; Iteration 92/92 ; Optim SGD LR 0.240100 ; Source tokens/s 7492 ; Perplexity 270.82
[01/24/18 09:54:38 INFO] Evaluating on the validation dataset...
[01/24/18 09:54:42 INFO] Validation perplexity: 804.17
[01/24/18 09:54:42 INFO] Saving checkpoint to 'demo_epoch13_804.17.t7'...
real 8m27.912s
user 10m41.568s
sys 2m6.540s
In the inRange function of prepare.py (Line: 91), the timestamp of each image file located in ./cctv_imgs/$CCTV_CODE is compared with the converted speed timestamp, which comes from the extracted xml.gz file in ./cctv_imgs/speed/. If the difference between the image and speed timestamps is within the range 0-60, the function returns a tuple (point, tm) and control goes back to Line: 136.
So if your server time is not synchronized with an up-to-date NTP server (for example via ntpdate), the time conversion will not be accurate and the timestamps will not match, so Line: 139 will have no image/speed pairs to process.
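The timestamp matching described above can be sketched as follows. This is not the code from prepare.py, which is not shown here. It is a minimal illustration, assuming the 0-60 range is measured in seconds, that the absolute difference is used, and that the speed data is available as a list of (point, tm) pairs. The names in_range, MAX_DIFF and speed_records are hypothetical.

```python
from datetime import datetime, timedelta

MAX_DIFF = 60  # assumption: the "range 0-60" in the text is in seconds


def in_range(image_tm, speed_records, max_diff=MAX_DIFF):
    """Return the first (point, tm) whose timestamp is within max_diff
    seconds of image_tm, or None when nothing matches."""
    for point, tm in speed_records:
        diff = abs((tm - image_tm).total_seconds())
        if diff <= max_diff:
            return point, tm
    return None


# Hypothetical speed records: (speed value, timestamp).
speed_records = [
    (41.5, datetime(2018, 1, 24, 9, 50, 0)),
    (38.0, datetime(2018, 1, 24, 9, 54, 30)),
]

image_tm = datetime(2018, 1, 24, 9, 54, 0)
print(in_range(image_tm, speed_records))

# A clock that is off by one hour produces no match at all.
skewed_tm = image_tm + timedelta(hours=1)
print(in_range(skewed_tm, speed_records))
```

The second call illustrates the failure mode from the text: when the image timestamps are shifted by a badly synchronized clock, no speed record falls inside the window and there are no pairs left to process.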
INFO:tensorflow:Saving flow to /tmp/sempar-conll/sempar.flow
INFO:tensorflow:Best checkpoint written to /tmp/sempar-conll/checkpoints/best
2018-01-23 05:33:03.319294: I third_party/syntaxnet/dragnn/core/compute_session_pool.cc:53] Destroying pool: total number of sessions created = 1
Done.
real 155m1.453s
user 460m3.800s
sys 121m4.520s
haha
INFO:tensorflow:loss = 0.15291552, step = 97669 (5.215 sec)
2018-01-23 07:12:37.500129: W tensorflow/core/framework/op_kernel.cc:1192] Out of range: End of sequence
[[Node: IteratorGetNext = IteratorGetNext[output_shapes=[[?,32,32,3], [?,10]], output_types=[DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](OneShotIterator)]]
2018-01-23 07:12:37.501307: W tensorflow/core/framework/op_kernel.cc:1192] Out of range: End of sequence
[[Node: IteratorGetNext = IteratorGetNext[output_shapes=[[?,32,32,3], [?,10]], output_types=[DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](OneShotIterator)]]
2018-01-23 07:12:37.501393: W tensorflow/core/framework/op_kernel.cc:1192] Out of range: End of sequence
[[Node: IteratorGetNext = IteratorGetNext[output_shapes=[[?,32,32,3], [?,10]], output_types=[DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](OneShotIterator)]]
INFO:tensorflow:Saving checkpoints for 97675 into /tmp/cifar10_model/model.ckpt.
INFO:tensorflow:Loss for final step: 0.1549716.
INFO:tensorflow:Starting evaluation at 2018-01-23-07:12:38
2018-01-23 07:12:39.422733: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1120] Creating TensorFlow device (/device:GPU:0) -> (device: 0, name: Tesla P100-PCIE-12GB, pci bus id: 0000:03:00.0, compute capability: 6.0)
2018-01-23 07:12:39.422844: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1120] Creating TensorFlow device (/device:GPU:1) -> (device: 1, name: Tesla P100-PCIE-12GB, pci bus id: 0000:82:00.0, compute capability: 6.0)
INFO:tensorflow:Restoring parameters from /tmp/cifar10_model/model.ckpt-97675
INFO:tensorflow:Finished evaluation at 2018-01-23-07:12:41
INFO:tensorflow:Saving dict for global step 97675: accuracy = 0.9253, global_step = 97675, loss = 0.4823422
{'loss': 0.4823422, 'global_step': 97675, 'accuracy': 0.9253}
real 89m49.148s
user 333m33.540s
sys 73m16.852s