Comments (4)
Hi,
Sorry, I thought you were talking about the "W pthreadpool-cpp.cc:90] Warning: Leaking Caffe2 thread-pool after fork. (function pthreadpool)" line.
- Yes, evaluation is based on iterations, not epochs. This is very useful when dealing with datasets of different sizes. For example, RealEstate10K has a huge number of datapoints, so evaluating only once per epoch would not give you a good overview of the training.
- is_preprocessed speeds up training because you don't have to load the full-sized images, only the already resized ones. (These files can be obtained from the script in the datasets/kitti-360 directory.)
- I have not really used multi-GPU training during development. Therefore, this issue might not have appeared.
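As a rough illustration of the first point, here is a minimal plain-Python sketch of iteration-based evaluation. It does not use the repository's actual training loop; the function name `evaluation_schedule` and the parameters `iters_per_epoch` and `eval_every` are hypothetical and chosen only for this example.

```python
def evaluation_schedule(iters_per_epoch, epochs, eval_every):
    """Return (epoch, global_iteration) pairs at which evaluation would fire.

    Hypothetical helper: evaluation is triggered by the global iteration
    counter, so a long epoch contains several evaluation rounds.
    """
    fired = []
    global_iter = 0
    for epoch in range(1, epochs + 1):
        for _ in range(iters_per_epoch):
            global_iter += 1
            # Evaluate every `eval_every` iterations, regardless of epoch boundaries.
            if global_iter % eval_every == 0:
                fired.append((epoch, global_iter))
    return fired


# With 10 iterations per epoch and evaluation every 4 iterations,
# evaluation runs more than once per epoch.
schedule = evaluation_schedule(iters_per_epoch=10, epochs=2, eval_every=4)
print(schedule)
```

With a large dataset, `iters_per_epoch` is large, so a fixed `eval_every` yields many evaluation rounds inside a single training epoch.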
from behindthescenes.
Hi!
I don't observe the same behaviour on any of our machines. I think this is an issue with your system setup.
Best,
Felix
from behindthescenes.
I see the same kind of logs as described by @zsz-pro.
Is the reason simply because there are multiple validation/visualisation steps per epoch? These lines of the default configuration file suggest that is true.
Running the KITTI-360 experiment, the things I changed:
- Changed is_preprocess in this dataset config to false (avoiding the resize).
- Added a (hacky) fix at the script entrypoint due to this issue, where torch's linalg module is evaluated lazily, which breaks multi-GPU training with CUDA versions < 11.7. I am kind of surprised this is not an issue for the repo's given conda environment, as it uses pytorch-cuda=11.6.
Below is a chunk of the training logs that show Epochs 13, 14 and 15. Like all other epochs, some logs are printed multiple times.
Trained on single A100 GPU
[2023-07-12 14:11:15,398][ignite.engine.engine.Engine][INFO] - Engine run starting with max_epochs=1.
Evaluation (val): [1/1] 100%|████████████████████████████████████████████████████████████████████████████ [00:00<?]Visualizing
[2023-07-12 14:11:22,900][ignite.engine.engine.Engine][INFO] - Epoch[1] Complete. Time taken: 00:00:06.947
[2023-07-12 14:11:22,902][ignite.engine.engine.Engine][INFO] - Engine run complete. Time taken: 00:00:07.502
2023-07-12 14:11:23,019 kitti_360 INFO:
Epoch 13 - Evaluation time (seconds): 7.50 - Vis metrics:
abs_rel: 0.08554093948269809
sq_rel: 0.6244590130093236
rmse: 3.4999701248984034
rmse_log: 0.19446041207287681
a1: 0.898856520652771
a2: 0.94862300157547
a3: 0.9731572866439819
2023-07-12 14:17:47,428 kitti_360 INFO: Epoch[13] Complete. Time taken: 05:34:09.799
[2023-07-12 14:37:25,148][ignite.engine.engine.Engine][INFO] - Engine run starting with max_epochs=1.
[2023-07-12 14:40:19,743][ignite.engine.engine.Engine][INFO] - Epoch[1] Complete. Time taken: 00:02:54.265
[2023-07-12 14:40:20,375][ignite.engine.engine.Engine][INFO] - Engine run complete. Time taken: 00:02:55.225
2023-07-12 14:40:20,487 kitti_360 INFO:
Epoch 14 - Evaluation time (seconds): 175.22 - Test metrics:
abs_rel: 0.11480931034032174
sq_rel: 0.6637996911791815
rmse: 3.7153490281445607
rmse_log: 0.21579428924271787
a1: 0.8733996527735144
a2: 0.9506910068448633
a3: 0.9723349793348461
[2023-07-12 14:40:20,488][ignite.engine.engine.Engine][INFO] - Engine run starting with max_epochs=1.
Evaluation (val): [1/1] 100%|████████████████████████████████████████████████████████████████████████████ [00:00<?]Visualizing
[2023-07-12 14:40:25,339][ignite.engine.engine.Engine][INFO] - Epoch[1] Complete. Time taken: 00:00:04.576
[2023-07-12 14:40:25,340][ignite.engine.engine.Engine][INFO] - Engine run complete. Time taken: 00:00:04.852
2023-07-12 14:40:25,478 kitti_360 INFO:
Epoch 14 - Evaluation time (seconds): 4.85 - Vis metrics:
abs_rel: 0.08070596869247891
sq_rel: 0.5057030975542547
rmse: 3.585757716686993
rmse_log: 0.20001286568907975
a1: 0.8933269381523132
a2: 0.949320912361145
a3: 0.9724593758583069
[2023-07-12 15:06:29,778][ignite.engine.engine.Engine][INFO] - Engine run starting with max_epochs=1.
Evaluation (val): [1/1] 100%|████████████████████████████████████████████████████████████████████████████ [00:00<?]Visualizing
[2023-07-12 15:06:37,338][ignite.engine.engine.Engine][INFO] - Epoch[1] Complete. Time taken: 00:00:07.219
[2023-07-12 15:06:37,339][ignite.engine.engine.Engine][INFO] - Engine run complete. Time taken: 00:00:07.559
2023-07-12 15:06:37,456 kitti_360 INFO:
Epoch 14 - Evaluation time (seconds): 7.56 - Vis metrics:
abs_rel: 0.08622214936775972
sq_rel: 0.6045194685131523
rmse: 3.762468358054172
rmse_log: 0.19322613491407561
a1: 0.8915016055107117
a2: 0.952703058719635
a3: 0.9756804704666138
[2023-07-12 15:33:00,013][ignite.engine.engine.Engine][INFO] - Engine run starting with max_epochs=1.
Evaluation (val): [1/1] 100%|████████████████████████████████████████████████████████████████████████████ [00:00<?]Visualizing
[2023-07-12 15:33:06,200][ignite.engine.engine.Engine][INFO] - Epoch[1] Complete. Time taken: 00:00:05.874
[2023-07-12 15:33:06,201][ignite.engine.engine.Engine][INFO] - Engine run complete. Time taken: 00:00:06.187
2023-07-12 15:33:06,333 kitti_360 INFO:
Epoch 14 - Evaluation time (seconds): 6.19 - Vis metrics:
abs_rel: 0.08986934827280171
sq_rel: 0.6382087551104668
rmse: 3.652709459751095
rmse_log: 0.2037586631711583
a1: 0.8867235779762268
a2: 0.946153461933136
a3: 0.9722446203231812
[2023-07-12 15:59:45,414][ignite.engine.engine.Engine][INFO] - Engine run starting with max_epochs=1.
Evaluation (val): [1/1] 100%|████████████████████████████████████████████████████████████████████████████ [00:00<?]Visualizing
[2023-07-12 15:59:53,692][ignite.engine.engine.Engine][INFO] - Epoch[1] Complete. Time taken: 00:00:07.959
[2023-07-12 15:59:53,693][ignite.engine.engine.Engine][INFO] - Engine run complete. Time taken: 00:00:08.278
2023-07-12 15:59:53,835 kitti_360 INFO:
Epoch 14 - Evaluation time (seconds): 8.28 - Vis metrics:
abs_rel: 0.08863173995398226
sq_rel: 0.7067783246498143
rmse: 3.700866907608836
rmse_log: 0.2063800082754662
a1: 0.9044398069381714
a2: 0.9475492835044861
a3: 0.9684866070747375
[2023-07-12 16:26:36,550][ignite.engine.engine.Engine][INFO] - Engine run starting with max_epochs=1.
[2023-07-12 16:29:40,259][ignite.engine.engine.Engine][INFO] - Epoch[1] Complete. Time taken: 00:03:03.395
[2023-07-12 16:29:40,260][ignite.engine.engine.Engine][INFO] - Engine run complete. Time taken: 00:03:03.709
2023-07-12 16:29:40,373 kitti_360 INFO:
Epoch 14 - Evaluation time (seconds): 183.71 - Test metrics:
abs_rel: 0.1219142335086286
sq_rel: 0.8344372441241323
rmse: 3.9208930653213243
rmse_log: 0.2237279891813233
a1: 0.874464736552909
a2: 0.9490371746942401
a3: 0.970153178088367
[2023-07-12 16:29:40,374][ignite.engine.engine.Engine][INFO] - Engine run starting with max_epochs=1.
Evaluation (val): [1/1] 100%|████████████████████████████████████████████████████████████████████████████ [00:00<?]Visualizing
[2023-07-12 16:29:45,322][ignite.engine.engine.Engine][INFO] - Epoch[1] Complete. Time taken: 00:00:04.692
[2023-07-12 16:29:45,324][ignite.engine.engine.Engine][INFO] - Engine run complete. Time taken: 00:00:04.950
2023-07-12 16:29:45,456 kitti_360 INFO:
Epoch 14 - Evaluation time (seconds): 4.95 - Vis metrics:
abs_rel: 0.08858976934422431
sq_rel: 0.7473974579866556
rmse: 3.7746566867415696
rmse_log: 0.20740206238851125
a1: 0.8989638686180115
a2: 0.9448649883270264
a3: 0.9682719111442566
[2023-07-12 16:56:27,753][ignite.engine.engine.Engine][INFO] - Engine run starting with max_epochs=1.
Evaluation (val): [1/1] 100%|████████████████████████████████████████████████████████████████████████████ [00:00<?]Visualizing
[2023-07-12 16:56:35,994][ignite.engine.engine.Engine][INFO] - Epoch[1] Complete. Time taken: 00:00:07.910
[2023-07-12 16:56:35,995][ignite.engine.engine.Engine][INFO] - Engine run complete. Time taken: 00:00:08.241
2023-07-12 16:56:36,136 kitti_360 INFO:
Epoch 14 - Evaluation time (seconds): 8.24 - Vis metrics:
abs_rel: 0.08209793065801733
sq_rel: 0.5959427187126923
rmse: 3.6324440924085453
rmse_log: 0.21365165189088933
a1: 0.8960111737251282
a2: 0.944381833076477
a3: 0.9666076302528381
[2023-07-12 17:23:36,441][ignite.engine.engine.Engine][INFO] - Engine run starting with max_epochs=1.
Evaluation (val): [1/1] 100%|████████████████████████████████████████████████████████████████████████████ [00:00<?]Visualizing
[2023-07-12 17:23:45,494][ignite.engine.engine.Engine][INFO] - Epoch[1] Complete. Time taken: 00:00:08.728
[2023-07-12 17:23:45,495][ignite.engine.engine.Engine][INFO] - Engine run complete. Time taken: 00:00:09.054
2023-07-12 17:23:45,627 kitti_360 INFO:
Epoch 14 - Evaluation time (seconds): 9.05 - Vis metrics:
abs_rel: 0.0834830856404954
sq_rel: 0.5985272677240563
rmse: 3.5497722144365085
rmse_log: 0.19657217609012884
a1: 0.8934342861175537
a2: 0.9491061568260193
a3: 0.9759489297866821
[2023-07-12 17:50:43,280][ignite.engine.engine.Engine][INFO] - Engine run starting with max_epochs=1.
Evaluation (val): [1/1] 100%|████████████████████████████████████████████████████████████████████████████ [00:00<?]Visualizing
[2023-07-12 17:50:53,359][ignite.engine.engine.Engine][INFO] - Epoch[1] Complete. Time taken: 00:00:09.749
[2023-07-12 17:50:53,361][ignite.engine.engine.Engine][INFO] - Engine run complete. Time taken: 00:00:10.080
2023-07-12 17:50:53,477 kitti_360 INFO:
Epoch 14 - Evaluation time (seconds): 10.08 - Vis metrics:
abs_rel: 0.08506939417545267
sq_rel: 0.7149858369983374
rmse: 3.7219778391891336
rmse_log: 0.20664187296963754
a1: 0.8980512619018555
a2: 0.9488377571105957
a3: 0.9705266952514648
[2023-07-12 18:17:51,587][ignite.engine.engine.Engine][INFO] - Engine run starting with max_epochs=1.
[2023-07-12 18:21:12,396][ignite.engine.engine.Engine][INFO] - Epoch[1] Complete. Time taken: 00:03:20.469
[2023-07-12 18:21:13,021][ignite.engine.engine.Engine][INFO] - Engine run complete. Time taken: 00:03:21.433
2023-07-12 18:21:13,181 kitti_360 INFO:
Epoch 14 - Evaluation time (seconds): 201.43 - Test metrics:
abs_rel: 0.11314249655844683
sq_rel: 0.7012545761475909
rmse: 3.7195900263479533
rmse_log: 0.21453171459305867
a1: 0.8770193115342408
a2: 0.9501760615967214
a3: 0.9723244274500757
[2023-07-12 18:21:13,182][ignite.engine.engine.Engine][INFO] - Engine run starting with max_epochs=1.
Evaluation (val): [1/1] 100%|████████████████████████████████████████████████████████████████████████████ [00:00<?]Visualizing
[2023-07-12 18:21:18,254][ignite.engine.engine.Engine][INFO] - Epoch[1] Complete. Time taken: 00:00:04.809
[2023-07-12 18:21:18,256][ignite.engine.engine.Engine][INFO] - Engine run complete. Time taken: 00:00:05.074
2023-07-12 18:21:18,366 kitti_360 INFO:
Epoch 14 - Evaluation time (seconds): 5.07 - Vis metrics:
abs_rel: 0.07578395172581752
sq_rel: 0.5024875081138518
rmse: 3.4798857048593157
rmse_log: 0.194624591324213
a1: 0.8971922397613525
a2: 0.9471734762191772
a3: 0.9724593758583069
[2023-07-12 18:48:05,497][ignite.engine.engine.Engine][INFO] - Engine run starting with max_epochs=1.
Evaluation (val): [1/1] 100%|████████████████████████████████████████████████████████████████████████████ [00:00<?]Visualizing
[2023-07-12 18:48:13,010][ignite.engine.engine.Engine][INFO] - Epoch[1] Complete. Time taken: 00:00:07.221
[2023-07-12 18:48:13,011][ignite.engine.engine.Engine][INFO] - Engine run complete. Time taken: 00:00:07.512
2023-07-12 18:48:13,150 kitti_360 INFO:
Epoch 14 - Evaluation time (seconds): 7.51 - Vis metrics:
abs_rel: 0.08523062198386379
sq_rel: 0.6145007152112157
rmse: 3.817461628217154
rmse_log: 0.21513257298443078
a1: 0.8903741836547852
a2: 0.9435228705406189
a3: 0.966446578502655
[2023-07-12 19:14:46,552][ignite.engine.engine.Engine][INFO] - Engine run starting with max_epochs=1.
Evaluation (val): [1/1] 100%|████████████████████████████████████████████████████████████████████████████ [00:00<?]Visualizing
[2023-07-12 19:14:53,621][ignite.engine.engine.Engine][INFO] - Epoch[1] Complete. Time taken: 00:00:06.755
[2023-07-12 19:14:53,622][ignite.engine.engine.Engine][INFO] - Engine run complete. Time taken: 00:00:07.068
2023-07-12 19:14:53,732 kitti_360 INFO:
Epoch 14 - Evaluation time (seconds): 7.07 - Vis metrics:
abs_rel: 0.08337322146351114
sq_rel: 0.6457452155974626
rmse: 3.5514027400976587
rmse_log: 0.2043717522969885
a1: 0.9019165635108948
a2: 0.9489988088607788
a3: 0.9712782502174377
[2023-07-12 19:41:40,019][ignite.engine.engine.Engine][INFO] - Engine run starting with max_epochs=1.
Evaluation (val): [1/1] 100%|████████████████████████████████████████████████████████████████████████████ [00:00<?]Visualizing
[2023-07-12 19:41:47,555][ignite.engine.engine.Engine][INFO] - Epoch[1] Complete. Time taken: 00:00:07.096
[2023-07-12 19:41:47,557][ignite.engine.engine.Engine][INFO] - Engine run complete. Time taken: 00:00:07.536
2023-07-12 19:41:47,695 kitti_360 INFO:
Epoch 14 - Evaluation time (seconds): 7.54 - Vis metrics:
abs_rel: 0.08835147521652986
sq_rel: 0.6003459391090917
rmse: 3.6419516338590174
rmse_log: 0.21035350637508463
a1: 0.894454300403595
a2: 0.9486766457557678
a3: 0.9699361324310303
2023-07-12 19:55:05,461 kitti_360 INFO: Epoch[14] Complete. Time taken: 05:37:18.031
[2023-07-12 20:08:45,942][ignite.engine.engine.Engine][INFO] - Engine run starting with max_epochs=1.
[2023-07-12 20:11:52,930][ignite.engine.engine.Engine][INFO] - Epoch[1] Complete. Time taken: 00:03:06.664
[2023-07-12 20:11:52,931][ignite.engine.engine.Engine][INFO] - Engine run complete. Time taken: 00:03:06.987
2023-07-12 20:11:53,049 kitti_360 INFO:
Epoch 15 - Evaluation time (seconds): 186.99 - Test metrics:
abs_rel: 0.11556165418269916
sq_rel: 0.7672341478655641
rmse: 3.770929400000447
rmse_log: 0.21133477194124622
a1: 0.8862405351828784
a2: 0.9514126554131508
a3: 0.9731135093607008
[2023-07-12 20:11:53,050][ignite.engine.engine.Engine][INFO] - Engine run starting with max_epochs=1.
Evaluation (val): [1/1] 100%|████████████████████████████████████████████████████████████████████████████ [00:00<?]Visualizing
[2023-07-12 20:11:57,990][ignite.engine.engine.Engine][INFO] - Epoch[1] Complete. Time taken: 00:00:04.633
[2023-07-12 20:11:57,991][ignite.engine.engine.Engine][INFO] - Engine run complete. Time taken: 00:00:04.941
2023-07-12 20:11:58,109 kitti_360 INFO:
Epoch 15 - Evaluation time (seconds): 4.94 - Vis metrics:
abs_rel: 0.0882263961585384
sq_rel: 0.6482227879655088
rmse: 3.5954508476388582
rmse_log: 0.19819850572473746
a1: 0.8984270095825195
a2: 0.9485155940055847
a3: 0.9729425311088562
[2023-07-12 20:38:44,346][ignite.engine.engine.Engine][INFO] - Engine run starting with max_epochs=1.
Evaluation (val): [1/1] 100%|████████████████████████████████████████████████████████████████████████████ [00:00<?]Visualizing
[2023-07-12 20:38:50,976][ignite.engine.engine.Engine][INFO] - Epoch[1] Complete. Time taken: 00:00:06.318
[2023-07-12 20:38:50,977][ignite.engine.engine.Engine][INFO] - Engine run complete. Time taken: 00:00:06.630
2023-07-12 20:38:51,096 kitti_360 INFO:
Epoch 15 - Evaluation time (seconds): 6.63 - Vis metrics:
abs_rel: 0.08941568872297122
sq_rel: 0.7008410126920626
rmse: 3.5520300447032693
rmse_log: 0.20019262230258883
a1: 0.9054061770439148
a2: 0.9499651193618774
a3: 0.9685403108596802
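I think we see Epoch[1] for every round of evaluation, since each evaluation appears to be its own engine run (note the "Engine run starting with max_epochs=1." lines).
After epochs 13 and 14 finish (in my logs above), we do actually see a log line stating that the training epoch is finished: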
2023-07-12 19:55:05,461 kitti_360 INFO: Epoch[14] Complete. Time taken: 05:37:18.031
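To check this against a full log file, a small parser can tally how many evaluation rounds are reported for each training epoch. This is only a sketch: the regular expression is written against the summary lines shown in the logs above, and the names `EVAL_LINE` and `count_evaluations` are made up for this example.

```python
import re
from collections import Counter

# Matches summary lines such as:
#   "Epoch 14 - Evaluation time (seconds): 4.85 - Vis metrics:"
EVAL_LINE = re.compile(
    r"Epoch (\d+) - Evaluation time \(seconds\): ([\d.]+) - (\w+) metrics:"
)


def count_evaluations(log_text):
    """Count evaluation rounds per (training epoch, evaluation kind)."""
    counts = Counter()
    for match in EVAL_LINE.finditer(log_text):
        epoch, _seconds, kind = match.groups()
        counts[(int(epoch), kind)] += 1
    return counts


# A few lines copied from the logs above.
sample = """\
Epoch 14 - Evaluation time (seconds): 175.22 - Test metrics:
Epoch 14 - Evaluation time (seconds): 4.85 - Vis metrics:
Epoch 14 - Evaluation time (seconds): 7.56 - Vis metrics:
Epoch 15 - Evaluation time (seconds): 186.99 - Test metrics:
"""
counts = count_evaluations(sample)
print(counts)
```

Several entries for the same training epoch confirm that the repeated Epoch[1] lines come from evaluation rounds, not from training restarting.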
from behindthescenes.
I got it! Thanks for your reply!
from behindthescenes.