
Comments (4)

paynie commented on May 14, 2024

Please paste the complete logs and thread stack dumps.


QingdiMeng commented on May 14, 2024

2017-07-13 10:23:54,482 INFO [pool-8-thread-3] org.ehcache.sizeof.impl.AgentLoader: Agent successfully loaded and available!
2017-07-13 10:23:54,525 INFO [pool-8-thread-2] com.tencent.angel.ml.classification.lr.LRTrainTask: Task[129] preprocessed 162 samples, 145 for train, 17 for validation. feanum=1024
2017-07-13 10:23:54,525 INFO [pool-8-thread-1] com.tencent.angel.ml.classification.lr.LRTrainTask: Task[128] preprocessed 163 samples, 146 for train, 17 for validation. feanum=1024
2017-07-13 10:23:54,525 INFO [pool-8-thread-3] com.tencent.angel.ml.classification.lr.LRTrainTask: Task[130] preprocessed 163 samples, 146 for train, 17 for validation. feanum=1024
2017-07-13 10:23:54,525 INFO [pool-8-thread-4] com.tencent.angel.ml.classification.lr.LRTrainTask: Task[131] preprocessed 163 samples, 146 for train, 17 for validation. feanum=1024
2017-07-13 10:23:54,534 INFO [pool-8-thread-2] com.tencent.angel.ml.model.PSModel: After training matrix lr_weight will be saved to hdfs://c3prc-hadoop/user/h_miui_ad/develop/mengqingdi/dmp_example/sim/model
2017-07-13 10:23:54,534 INFO [pool-8-thread-1] com.tencent.angel.ml.model.PSModel: After training matrix lr_weight will be saved to hdfs://c3prc-hadoop/user/h_miui_ad/develop/mengqingdi/dmp_example/sim/model
2017-07-13 10:23:54,534 INFO [pool-8-thread-4] com.tencent.angel.ml.model.PSModel: After training matrix lr_weight will be saved to hdfs://c3prc-hadoop/user/h_miui_ad/develop/mengqingdi/dmp_example/sim/model
2017-07-13 10:23:54,534 INFO [pool-8-thread-3] com.tencent.angel.ml.model.PSModel: After training matrix lr_weight will be saved to hdfs://c3prc-hadoop/user/h_miui_ad/develop/mengqingdi/dmp_example/sim/model
2017-07-13 10:23:54,598 INFO [pool-8-thread-1] com.tencent.angel.ml.classification.lr.LRLearner: Task[128]: Starting to train a LR model...
2017-07-13 10:23:54,598 INFO [pool-8-thread-1] com.tencent.angel.ml.classification.lr.LRLearner: Task[128]: Sample Ratio per Batch=1.0, Sample Size Per 14
2017-07-13 10:23:54,598 INFO [pool-8-thread-1] com.tencent.angel.ml.classification.lr.LRLearner: Task[128]: epoch=10, initLearnRate=1.0, learnRateDecay=0.1, L2Reg=0.0
2017-07-13 10:23:54,598 INFO [pool-8-thread-1] com.tencent.angel.ml.classification.lr.LRLearner: Task[128]: epoch=0 start.
2017-07-13 10:23:54,620 INFO [pool-8-thread-2] com.tencent.angel.ml.classification.lr.LRLearner: Task[129]: Starting to train a LR model...
2017-07-13 10:23:54,620 INFO [pool-8-thread-3] com.tencent.angel.ml.classification.lr.LRLearner: Task[130]: Starting to train a LR model...
2017-07-13 10:23:54,621 INFO [pool-8-thread-2] com.tencent.angel.ml.classification.lr.LRLearner: Task[129]: Sample Ratio per Batch=1.0, Sample Size Per 14
2017-07-13 10:23:54,621 INFO [pool-8-thread-3] com.tencent.angel.ml.classification.lr.LRLearner: Task[130]: Sample Ratio per Batch=1.0, Sample Size Per 14
2017-07-13 10:23:54,621 INFO [pool-8-thread-4] com.tencent.angel.ml.classification.lr.LRLearner: Task[131]: Starting to train a LR model...
2017-07-13 10:23:54,621 INFO [pool-8-thread-3] com.tencent.angel.ml.classification.lr.LRLearner: Task[130]: epoch=10, initLearnRate=1.0, learnRateDecay=0.1, L2Reg=0.0
2017-07-13 10:23:54,621 INFO [pool-8-thread-4] com.tencent.angel.ml.classification.lr.LRLearner: Task[131]: Sample Ratio per Batch=1.0, Sample Size Per 14
2017-07-13 10:23:54,621 INFO [pool-8-thread-3] com.tencent.angel.ml.classification.lr.LRLearner: Task[130]: epoch=0 start.
2017-07-13 10:23:54,621 INFO [pool-8-thread-2] com.tencent.angel.ml.classification.lr.LRLearner: Task[129]: epoch=10, initLearnRate=1.0, learnRateDecay=0.1, L2Reg=0.0
2017-07-13 10:23:54,621 INFO [pool-8-thread-2] com.tencent.angel.ml.classification.lr.LRLearner: Task[129]: epoch=0 start.
2017-07-13 10:23:54,622 INFO [pool-8-thread-4] com.tencent.angel.ml.classification.lr.LRLearner: Task[131]: epoch=10, initLearnRate=1.0, learnRateDecay=0.1, L2Reg=0.0
2017-07-13 10:23:54,622 INFO [pool-8-thread-4] com.tencent.angel.ml.classification.lr.LRLearner: Task[131]: epoch=0 start.
2017-07-13 10:23:54,800 INFO [pool-8-thread-3] com.tencent.angel.ml.classification.lr.LRLearner: Task[130]: epoch=0 mini-batch update success. cost 179 ms. batch loss = 63.08811830293313
2017-07-13 10:23:54,802 INFO [pool-8-thread-1] com.tencent.angel.ml.classification.lr.LRLearner: Task[128]: epoch=0 mini-batch update success. cost 204 ms. batch loss = 53.709265469770756
2017-07-13 10:23:54,802 INFO [pool-8-thread-2] com.tencent.angel.ml.classification.lr.LRLearner: Task[129]: epoch=0 mini-batch update success. cost 181 ms. batch loss = 65.25486122405704
2017-07-13 10:23:54,803 INFO [pool-8-thread-4] com.tencent.angel.ml.classification.lr.LRLearner: Task[131]: epoch=0 mini-batch update success. cost 181 ms. batch loss = 66.32487218147045
2017-07-13 10:23:54,807 INFO [pool-8-thread-3] com.tencent.angel.ml.classification.lr.LRLearner: Task[130]: epoch=0 trainData loss=51.348154803886416 precision=0.8356164383561644 auc=0.913695652173913 trueRecall=0.6739130434782609 falseRecall=0.91
2017-07-13 10:23:54,807 INFO [pool-8-thread-1] com.tencent.angel.ml.classification.lr.LRLearner: Task[128]: epoch=0 trainData loss=44.2528355368481 precision=0.8904109589041096 auc=0.9208976157082749 trueRecall=0.6451612903225806 falseRecall=0.9565217391304348
2017-07-13 10:23:54,807 INFO [pool-8-thread-3] com.tencent.angel.ml.classification.lr.LRLearner: Task[130]: epoch=0 validationData loss=6.597905087865923 precision=0.8823529411764706 auc=0.8181818181818182 trueRecall=0.6666666666666666 falseRecall=1.0
2017-07-13 10:23:54,807 INFO [pool-8-thread-1] com.tencent.angel.ml.classification.lr.LRLearner: Task[128]: epoch=0 validationData loss=8.243051217861991 precision=0.8235294117647058 auc=0.6538461538461539 trueRecall=0.25 falseRecall=1.0
2017-07-13 10:23:54,810 INFO [pool-8-thread-2] com.tencent.angel.ml.classification.lr.LRLearner: Task[129]: epoch=0 trainData loss=55.26277598705981 precision=0.8344827586206897 auc=0.8944909001475653 trueRecall=0.7368421052631579 falseRecall=0.8691588785046729
2017-07-13 10:23:54,810 INFO [pool-8-thread-4] com.tencent.angel.ml.classification.lr.LRLearner: Task[131]: epoch=0 trainData loss=67.22076975645209 precision=0.8082191780821918 auc=0.8902310924369747 trueRecall=0.8529411764705882 falseRecall=0.7946428571428571
2017-07-13 10:23:54,810 INFO [pool-8-thread-2] com.tencent.angel.ml.classification.lr.LRLearner: Task[129]: epoch=0 validationData loss=6.967587101396562 precision=0.7647058823529411 auc=0.75 trueRecall=1.0 falseRecall=0.75
2017-07-13 10:23:54,810 INFO [pool-8-thread-4] com.tencent.angel.ml.classification.lr.LRLearner: Task[131]: epoch=0 validationData loss=11.990722408102025 precision=0.5294117647058824 auc=0.7833333333333333 trueRecall=0.8 falseRecall=0.4166666666666667
2017-07-13 10:23:54,827 INFO [pool-8-thread-4] com.tencent.angel.ml.classification.lr.LRLearner: Task[131]: epoch=0 success. epoch cost 205 ms. train cost 182 ms. validation cost 23 ms.
2017-07-13 10:23:54,827 INFO [pool-8-thread-2] com.tencent.angel.ml.classification.lr.LRLearner: Task[129]: epoch=0 success. epoch cost 206 ms. train cost 181 ms. validation cost 25 ms.
2017-07-13 10:23:54,827 INFO [pool-8-thread-3] com.tencent.angel.ml.classification.lr.LRLearner: Task[130]: epoch=0 success. epoch cost 206 ms. train cost 179 ms. validation cost 27 ms.
2017-07-13 10:23:54,827 INFO [pool-8-thread-1] com.tencent.angel.ml.classification.lr.LRLearner: Task[128]: epoch=0 success. epoch cost 229 ms. train cost 204 ms. validation cost 25 ms.
2017-07-13 10:23:54,831 INFO [pool-8-thread-4] com.tencent.angel.ml.classification.lr.LRLearner: Task[131]: epoch=1 start.
2017-07-13 10:23:54,831 INFO [pool-8-thread-2] com.tencent.angel.ml.classification.lr.LRLearner: Task[129]: epoch=1 start.
2017-07-13 10:23:54,832 INFO [pool-8-thread-1] com.tencent.angel.ml.classification.lr.LRLearner: Task[128]: epoch=1 start.
2017-07-13 10:23:54,832 INFO [pool-8-thread-3] com.tencent.angel.ml.classification.lr.LRLearner: Task[130]: epoch=1 start.
2017-07-13 10:23:54,911 INFO [pool-8-thread-1] com.tencent.angel.ml.classification.lr.LRLearner: Task[128]: epoch=1 mini-batch update success. cost 79 ms. batch loss = 52.764568125246896
2017-07-13 10:23:54,912 INFO [pool-8-thread-3] com.tencent.angel.ml.classification.lr.LRLearner: Task[130]: epoch=1 mini-batch update success. cost 80 ms. batch loss = 61.73724878686478
2017-07-13 10:23:54,912 INFO [pool-8-thread-1] com.tencent.angel.ml.classification.lr.LRLearner: Task[128]: epoch=1 trainData loss=44.141213001623036 precision=0.8835616438356164 auc=0.9197755960729314 trueRecall=0.6129032258064516 falseRecall=0.9565217391304348
2017-07-13 10:23:54,912 INFO [pool-8-thread-2] com.tencent.angel.ml.classification.lr.LRLearner: Task[129]: epoch=1 mini-batch update success. cost 80 ms. batch loss = 64.04465923557609
2017-07-13 10:23:54,913 INFO [pool-8-thread-1] com.tencent.angel.ml.classification.lr.LRLearner: Task[128]: epoch=1 validationData loss=8.155025414050046 precision=0.8235294117647058 auc=0.6538461538461539 trueRecall=0.25 falseRecall=1.0
2017-07-13 10:23:54,913 INFO [pool-8-thread-3] com.tencent.angel.ml.classification.lr.LRLearner: Task[130]: epoch=1 trainData loss=51.30385286312969 precision=0.8424657534246576 auc=0.9132608695652173 trueRecall=0.6521739130434783 falseRecall=0.93
2017-07-13 10:23:54,913 INFO [pool-8-thread-2] com.tencent.angel.ml.classification.lr.LRLearner: Task[129]: epoch=1 trainData loss=54.752887512425325 precision=0.8482758620689655 auc=0.8939990162321693 trueRecall=0.7105263157894737 falseRecall=0.897196261682243
2017-07-13 10:23:54,913 INFO [pool-8-thread-3] com.tencent.angel.ml.classification.lr.LRLearner: Task[130]: epoch=1 validationData loss=6.607728406422354 precision=0.8823529411764706 auc=0.8181818181818182 trueRecall=0.6666666666666666 falseRecall=1.0
2017-07-13 10:23:54,914 INFO [pool-8-thread-2] com.tencent.angel.ml.classification.lr.LRLearner: Task[129]: epoch=1 validationData loss=6.865887813894462 precision=0.7647058823529411 auc=0.75 trueRecall=1.0 falseRecall=0.75
2017-07-13 10:23:54,914 INFO [pool-8-thread-3] com.tencent.angel.ml.classification.lr.LRLearner: Task[130]: epoch=1 success. epoch cost 82 ms. train cost 80 ms. validation cost 2 ms.
2017-07-13 10:23:54,915 INFO [pool-8-thread-2] com.tencent.angel.ml.classification.lr.LRLearner: Task[129]: epoch=1 success. epoch cost 83 ms. train cost 80 ms. validation cost 3 ms.
2017-07-13 10:23:54,915 INFO [pool-8-thread-1] com.tencent.angel.ml.classification.lr.LRLearner: Task[128]: epoch=1 success. epoch cost 83 ms. train cost 79 ms. validation cost 4 ms.
2017-07-13 10:23:54,916 INFO [pool-8-thread-2] com.tencent.angel.ml.classification.lr.LRLearner: Task[129]: epoch=2 start.
2017-07-13 10:23:54,915 INFO [pool-8-thread-3] com.tencent.angel.ml.classification.lr.LRLearner: Task[130]: epoch=2 start.
2017-07-13 10:23:54,916 INFO [pool-8-thread-1] com.tencent.angel.ml.classification.lr.LRLearner: Task[128]: epoch=2 start.
2017-07-13 10:23:54,924 INFO [pool-8-thread-4] com.tencent.angel.ml.classification.lr.LRLearner: Task[131]: epoch=1 mini-batch update success. cost 92 ms. batch loss = 64.90443058674745
2017-07-13 10:23:54,924 INFO [pool-8-thread-4] com.tencent.angel.ml.classification.lr.LRLearner: Task[131]: epoch=1 trainData loss=65.89659534222488 precision=0.8082191780821918 auc=0.8894432773109244 trueRecall=0.8529411764705882 falseRecall=0.7946428571428571
2017-07-13 10:23:54,925 INFO [pool-8-thread-4] com.tencent.angel.ml.classification.lr.LRLearner: Task[131]: epoch=1 validationData loss=11.759569831458737 precision=0.5294117647058824 auc=0.7833333333333333 trueRecall=0.8 falseRecall=0.4166666666666667
2017-07-13 10:23:54,926 INFO [pool-8-thread-4] com.tencent.angel.ml.classification.lr.LRLearner: Task[131]: epoch=1 success. epoch cost 95 ms. train cost 93 ms. validation cost 2 ms.
2017-07-13 10:23:54,927 INFO [pool-8-thread-4] com.tencent.angel.ml.classification.lr.LRLearner: Task[131]: epoch=2 start.
2017-07-13 10:23:55,049 INFO [pool-8-thread-4] com.tencent.angel.ml.classification.lr.LRLearner: Task[131]: epoch=2 mini-batch update success. cost 122 ms. batch loss = 60.41682730446033
2017-07-13 10:23:55,049 INFO [pool-8-thread-2] com.tencent.angel.ml.classification.lr.LRLearner: Task[129]: epoch=2 mini-batch update success. cost 133 ms. batch loss = 60.76322261661788
2017-07-13 10:23:55,050 INFO [pool-8-thread-4] com.tencent.angel.ml.classification.lr.LRLearner: Task[131]: epoch=2 trainData loss=63.50737784399707 precision=0.7945205479452054 auc=0.8891806722689076 trueRecall=0.7941176470588235 falseRecall=0.7946428571428571
2017-07-13 10:23:55,050 INFO [pool-8-thread-4] com.tencent.angel.ml.classification.lr.LRLearner: Task[131]: epoch=2 validationData loss=12.073797759087194 precision=0.5294117647058824 auc=0.7999999999999999 trueRecall=0.8 falseRecall=0.4166666666666667
2017-07-13 10:23:55,050 INFO [pool-8-thread-1] com.tencent.angel.ml.classification.lr.LRLearner: Task[128]: epoch=2 mini-batch update success. cost 134 ms. batch loss = 49.299024014646534
2017-07-13 10:23:55,050 INFO [pool-8-thread-2] com.tencent.angel.ml.classification.lr.LRLearner: Task[129]: epoch=2 trainData loss=52.71327260152222 precision=0.8344827586206897 auc=0.8930152484013772 trueRecall=0.6842105263157895 falseRecall=0.8878504672897196
2017-07-13 10:23:55,051 INFO [pool-8-thread-3] com.tencent.angel.ml.classification.lr.LRLearner: Task[130]: epoch=2 mini-batch update success. cost 135 ms. batch loss = 57.38095397031147
2017-07-13 10:23:55,051 INFO [pool-8-thread-2] com.tencent.angel.ml.classification.lr.LRLearner: Task[129]: epoch=2 validationData loss=7.18289134537331 precision=0.7647058823529411 auc=0.6875 trueRecall=1.0 falseRecall=0.75
2017-07-13 10:23:55,051 INFO [pool-8-thread-1] com.tencent.angel.ml.classification.lr.LRLearner: Task[128]: epoch=2 trainData loss=42.5530008621651 precision=0.8904109589041096 auc=0.9231416549789621 trueRecall=0.6129032258064516 falseRecall=0.9652173913043478
2017-07-13 10:23:55,051 INFO [pool-8-thread-4] com.tencent.angel.ml.classification.lr.LRLearner: Task[131]: epoch=2 success. epoch cost 124 ms. train cost 122 ms. validation cost 2 ms.
2017-07-13 10:23:55,051 INFO [pool-8-thread-1] com.tencent.angel.ml.classification.lr.LRLearner: Task[128]: epoch=2 validationData loss=7.996619734762594 precision=0.8235294117647058 auc=0.7115384615384616 trueRecall=0.25 falseRecall=1.0
2017-07-13 10:23:55,051 INFO [pool-8-thread-3] com.tencent.angel.ml.classification.lr.LRLearner: Task[130]: epoch=2 trainData loss=50.04364645653406 precision=0.8493150684931506 auc=0.9147826086956522 trueRecall=0.6739130434782609 falseRecall=0.93
2017-07-13 10:23:55,052 INFO [pool-8-thread-2] com.tencent.angel.ml.classification.lr.LRLearner: Task[129]: epoch=2 success. epoch cost 136 ms. train cost 134 ms. validation cost 2 ms.
2017-07-13 10:23:55,052 INFO [pool-8-thread-3] com.tencent.angel.ml.classification.lr.LRLearner: Task[130]: epoch=2 validationData loss=6.309107306679452 precision=0.8823529411764706 auc=0.8181818181818182 trueRecall=0.6666666666666666 falseRecall=1.0
2017-07-13 10:23:55,052 INFO [pool-8-thread-1] com.tencent.angel.ml.classification.lr.LRLearner: Task[128]: epoch=2 success. epoch cost 136 ms. train cost 134 ms. validation cost 2 ms.
2017-07-13 10:23:55,052 INFO [pool-8-thread-3] com.tencent.angel.ml.classification.lr.LRLearner: Task[130]: epoch=2 success. epoch cost 136 ms. train cost 135 ms. validation cost 1 ms.
2017-07-13 10:23:55,053 INFO [pool-8-thread-4] com.tencent.angel.ml.classification.lr.LRLearner: Task[131]: epoch=3 start.
2017-07-13 10:23:55,054 INFO [pool-8-thread-1] com.tencent.angel.ml.classification.lr.LRLearner: Task[128]: epoch=3 start.
2017-07-13 10:23:55,054 INFO [pool-8-thread-2] com.tencent.angel.ml.classification.lr.LRLearner: Task[129]: epoch=3 start.
2017-07-13 10:23:55,054 INFO [pool-8-thread-3] com.tencent.angel.ml.classification.lr.LRLearner: Task[130]: epoch=3 start.
2017-07-13 10:23:55,261 INFO [pool-8-thread-1] com.tencent.angel.ml.classification.lr.LRLearner: Task[128]: epoch=3 mini-batch update success. cost 206 ms. batch loss = 47.86753246741145
2017-07-13 10:23:55,261 INFO [pool-8-thread-4] com.tencent.angel.ml.classification.lr.LRLearner: Task[131]: epoch=3 mini-batch update success. cost 207 ms. batch loss = 58.174276977658835
2017-07-13 10:23:55,262 INFO [pool-8-thread-1] com.tencent.angel.ml.classification.lr.LRLearner: Task[128]: epoch=3 trainData loss=41.85989561067765 precision=0.8835616438356164 auc=0.9223001402524544 trueRecall=0.5806451612903226 falseRecall=0.9652173913043478
2017-07-13 10:23:55,262 INFO [pool-8-thread-4] com.tencent.angel.ml.classification.lr.LRLearner: Task[131]: epoch=3 trainData loss=62.02810472926772 precision=0.7876712328767124 auc=0.8870798319327732 trueRecall=0.7647058823529411 falseRecall=0.7946428571428571
2017-07-13 10:23:55,262 INFO [pool-8-thread-3] com.tencent.angel.ml.classification.lr.LRLearner: Task[130]: epoch=3 mini-batch update success. cost 208 ms. batch loss = 55.34227459849863
2017-07-13 10:23:55,262 INFO [pool-8-thread-1] com.tencent.angel.ml.classification.lr.LRLearner: Task[128]: epoch=3 validationData loss=7.776794085556771 precision=0.8235294117647058 auc=0.75 trueRecall=0.25 falseRecall=1.0
2017-07-13 10:23:55,262 INFO [pool-8-thread-4] com.tencent.angel.ml.classification.lr.LRLearner: Task[131]: epoch=3 validationData loss=12.219627887539769 precision=0.5294117647058824 auc=0.7999999999999999 trueRecall=0.8 falseRecall=0.4166666666666667
2017-07-13 10:23:55,262 INFO [pool-8-thread-2] com.tencent.angel.ml.classification.lr.LRLearner: Task[129]: epoch=3 mini-batch update success. cost 208 ms. batch loss = 58.7643116482565
2017-07-13 10:23:55,262 INFO [pool-8-thread-3] com.tencent.angel.ml.classification.lr.LRLearner: Task[130]: epoch=3 trainData loss=49.25011131428253 precision=0.8356164383561644 auc=0.9147826086956522 trueRecall=0.6521739130434783 falseRecall=0.92
2017-07-13 10:23:55,263 INFO [pool-8-thread-3] com.tencent.angel.ml.classification.lr.LRLearner: Task[130]: epoch=3 validationData loss=6.128729399106418 precision=0.8823529411764706 auc=0.8333333333333333 trueRecall=0.6666666666666666 falseRecall=1.0
2017-07-13 10:23:55,263 INFO [pool-8-thread-2] com.tencent.angel.ml.classification.lr.LRLearner: Task[129]: epoch=3 trainData loss=51.449966632305305 precision=0.8275862068965517 auc=0.8949827840629611 trueRecall=0.6842105263157895 falseRecall=0.8785046728971962
2017-07-13 10:23:55,263 INFO [pool-8-thread-2] com.tencent.angel.ml.classification.lr.LRLearner: Task[129]: epoch=3 validationData loss=7.502164785017019 precision=0.7647058823529411 auc=0.6875 trueRecall=1.0 falseRecall=0.75
2017-07-13 10:23:55,264 INFO [pool-8-thread-3] com.tencent.angel.ml.classification.lr.LRLearner: Task[130]: epoch=3 success. epoch cost 210 ms. train cost 208 ms. validation cost 2 ms.
2017-07-13 10:23:55,265 INFO [pool-8-thread-3] com.tencent.angel.ml.classification.lr.LRLearner: Task[130]: epoch=4 start.
2017-07-13 10:23:55,265 INFO [pool-8-thread-1] com.tencent.angel.ml.classification.lr.LRLearner: Task[128]: epoch=3 success. epoch cost 211 ms. train cost 207 ms. validation cost 4 ms.
2017-07-13 10:23:55,266 INFO [pool-8-thread-4] com.tencent.angel.ml.classification.lr.LRLearner: Task[131]: epoch=3 success. epoch cost 212 ms. train cost 207 ms. validation cost 5 ms.
2017-07-13 10:23:55,266 INFO [pool-8-thread-2] com.tencent.angel.ml.classification.lr.LRLearner: Task[129]: epoch=3 success. epoch cost 212 ms. train cost 208 ms. validation cost 4 ms.
2017-07-13 10:23:55,266 INFO [pool-8-thread-1] com.tencent.angel.ml.classification.lr.LRLearner: Task[128]: epoch=4 start.
2017-07-13 10:23:55,266 INFO [pool-8-thread-4] com.tencent.angel.ml.classification.lr.LRLearner: Task[131]: epoch=4 start.
2017-07-13 10:23:55,267 INFO [pool-8-thread-2] com.tencent.angel.ml.classification.lr.LRLearner: Task[129]: epoch=4 start.
2017-07-13 10:24:05,009 INFO [pool-8-thread-2] com.tencent.angel.ml.classification.lr.LRLearner: Task[129]: epoch=4 mini-batch update success. cost 9741 ms. batch loss = 57.373005498154555
2017-07-13 10:24:05,009 INFO [pool-8-thread-3] com.tencent.angel.ml.classification.lr.LRLearner: Task[130]: epoch=4 mini-batch update success. cost 9744 ms. batch loss = 54.12770339372634
2017-07-13 10:24:05,009 INFO [pool-8-thread-2] com.tencent.angel.ml.classification.lr.LRLearner: Task[129]: epoch=4 trainData loss=50.61406686413628 precision=0.8344827586206897 auc=0.8991637973438269 trueRecall=0.7105263157894737 falseRecall=0.8785046728971962
2017-07-13 10:24:05,009 INFO [pool-8-thread-3] com.tencent.angel.ml.classification.lr.LRLearner: Task[130]: epoch=4 trainData loss=48.76878078875962 precision=0.8356164383561644 auc=0.9158695652173913 trueRecall=0.6521739130434783 falseRecall=0.92
2017-07-13 10:24:05,010 INFO [pool-8-thread-2] com.tencent.angel.ml.classification.lr.LRLearner: Task[129]: epoch=4 validationData loss=7.735325383614606 precision=0.7647058823529411 auc=0.6875 trueRecall=1.0 falseRecall=0.75
2017-07-13 10:24:05,010 INFO [pool-8-thread-3] com.tencent.angel.ml.classification.lr.LRLearner: Task[130]: epoch=4 validationData loss=6.017274367017166 precision=0.8823529411764706 auc=0.8333333333333333 trueRecall=0.6666666666666666 falseRecall=1.0
2017-07-13 10:24:05,010 INFO [pool-8-thread-2] com.tencent.angel.ml.classification.lr.LRLearner: Task[129]: epoch=4 success. epoch cost 9743 ms. train cost 9742 ms. validation cost 1 ms.
2017-07-13 10:24:05,010 INFO [pool-8-thread-1] com.tencent.angel.ml.classification.lr.LRLearner: Task[128]: epoch=4 mini-batch update success. cost 9744 ms. batch loss = 47.17412203170076
2017-07-13 10:24:05,011 INFO [pool-8-thread-3] com.tencent.angel.ml.classification.lr.LRLearner: Task[130]: epoch=4 success. epoch cost 9746 ms. train cost 9744 ms. validation cost 2 ms.
2017-07-13 10:24:05,011 INFO [pool-8-thread-1] com.tencent.angel.ml.classification.lr.LRLearner: Task[128]: epoch=4 trainData loss=41.557507760384716 precision=0.8835616438356164 auc=0.9223001402524544 trueRecall=0.5806451612903226 falseRecall=0.9652173913043478
2017-07-13 10:24:05,011 INFO [pool-8-thread-1] com.tencent.angel.ml.classification.lr.LRLearner: Task[128]: epoch=4 validationData loss=7.556852656800136 precision=0.8235294117647058 auc=0.7692307692307693 trueRecall=0.25 falseRecall=1.0
2017-07-13 10:24:05,012 INFO [pool-8-thread-1] com.tencent.angel.ml.classification.lr.LRLearner: Task[128]: epoch=4 success. epoch cost 9746 ms. train cost 9745 ms. validation cost 1 ms.
2017-07-13 10:24:05,015 INFO [pool-8-thread-4] com.tencent.angel.ml.classification.lr.LRLearner: Task[131]: epoch=4 mini-batch update success. cost 9749 ms. batch loss = 56.836056089726746
2017-07-13 10:24:05,015 INFO [pool-8-thread-3] com.tencent.angel.ml.classification.lr.LRLearner: Task[130]: epoch=5 start.
2017-07-13 10:24:05,015 INFO [pool-8-thread-2] com.tencent.angel.ml.classification.lr.LRLearner: Task[129]: epoch=5 start.
2017-07-13 10:24:05,016 INFO [pool-8-thread-4] com.tencent.angel.ml.classification.lr.LRLearner: Task[131]: epoch=4 trainData loss=60.99166263103096 precision=0.7876712328767124 auc=0.885766806722689 trueRecall=0.7647058823529411 falseRecall=0.7946428571428571
2017-07-13 10:24:05,015 INFO [pool-8-thread-1] com.tencent.angel.ml.classification.lr.LRLearner: Task[128]: epoch=5 start.
2017-07-13 10:24:05,021 INFO [pool-8-thread-4] com.tencent.angel.ml.classification.lr.LRLearner: Task[131]: epoch=4 validationData loss=12.2442753660371 precision=0.5882352941176471 auc=0.7999999999999999 trueRecall=0.8 falseRecall=0.5
2017-07-13 10:24:05,021 INFO [pool-8-thread-4] com.tencent.angel.ml.classification.lr.LRLearner: Task[131]: epoch=4 success. epoch cost 9755 ms. train cost 9749 ms. validation cost 6 ms.
2017-07-13 10:24:05,023 INFO [pool-8-thread-4] com.tencent.angel.ml.classification.lr.LRLearner: Task[131]: epoch=5 start.
2017-07-13 10:24:05,248 INFO [pool-8-thread-4] com.tencent.angel.ml.classification.lr.LRLearner: Task[131]: epoch=5 mini-batch update success. cost 225 ms. batch loss = 55.93693942191179
2017-07-13 10:24:05,249 INFO [pool-8-thread-2] com.tencent.angel.ml.classification.lr.LRLearner: Task[129]: epoch=5 mini-batch update success. cost 228 ms. batch loss = 56.303530508698415
2017-07-13 10:24:05,249 INFO [pool-8-thread-4] com.tencent.angel.ml.classification.lr.LRLearner: Task[131]: epoch=5 trainData loss=60.196608514819516 precision=0.773972602739726 auc=0.8828781512605042 trueRecall=0.7058823529411765 falseRecall=0.7946428571428571
2017-07-13 10:24:05,249 INFO [pool-8-thread-3] com.tencent.angel.ml.classification.lr.LRLearner: Task[130]: epoch=5 mini-batch update success. cost 234 ms. batch loss = 53.284264926464715
2017-07-13 10:24:05,249 INFO [pool-8-thread-2] com.tencent.angel.ml.classification.lr.LRLearner: Task[129]: epoch=5 trainData loss=50.00407011532065 precision=0.8413793103448276 auc=0.9006394490900147 trueRecall=0.7105263157894737 falseRecall=0.8878504672897196
2017-07-13 10:24:05,249 INFO [pool-8-thread-4] com.tencent.angel.ml.classification.lr.LRLearner: Task[131]: epoch=5 validationData loss=12.214002891828798 precision=0.5882352941176471 auc=0.7999999999999999 trueRecall=0.8 falseRecall=0.5
2017-07-13 10:24:05,249 INFO [pool-8-thread-1] com.tencent.angel.ml.classification.lr.LRLearner: Task[128]: epoch=5 mini-batch update success. cost 229 ms. batch loss = 46.76562609712182
2017-07-13 10:24:05,249 INFO [pool-8-thread-2] com.tencent.angel.ml.classification.lr.LRLearner: Task[129]: epoch=5 validationData loss=7.909134325653273 precision=0.7647058823529411 auc=0.6875 trueRecall=1.0 falseRecall=0.75
2017-07-13 10:24:05,249 INFO [pool-8-thread-3] com.tencent.angel.ml.classification.lr.LRLearner: Task[130]: epoch=5 trainData loss=48.450828260118435 precision=0.8424657534246576 auc=0.9160869565217391 trueRecall=0.6739130434782609 falseRecall=0.92
2017-07-13 10:24:05,250 INFO [pool-8-thread-2] com.tencent.angel.ml.classification.lr.LRLearner: Task[129]: epoch=5 success. epoch cost 230 ms. train cost 229 ms. validation cost 1 ms.
2017-07-13 10:24:05,250 INFO [pool-8-thread-1] com.tencent.angel.ml.classification.lr.LRLearner: Task[128]: epoch=5 trainData loss=41.420098600230396 precision=0.8835616438356164 auc=0.9197755960729314 trueRecall=0.5806451612903226 falseRecall=0.9652173913043478
2017-07-13 10:24:05,250 INFO [pool-8-thread-1] com.tencent.angel.ml.classification.lr.LRLearner: Task[128]: epoch=5 validationData loss=7.357013633041193 precision=0.8235294117647058 auc=0.7884615384615384 trueRecall=0.25 falseRecall=1.0
2017-07-13 10:24:05,250 INFO [pool-8-thread-2] com.tencent.angel.ml.classification.lr.LRLearner: Task[129]: epoch=6 start.
2017-07-13 10:24:05,251 INFO [pool-8-thread-1] com.tencent.angel.ml.classification.lr.LRLearner: Task[128]: epoch=5 success. epoch cost 231 ms. train cost 229 ms. validation cost 2 ms.
2017-07-13 10:24:05,251 INFO [pool-8-thread-1] com.tencent.angel.ml.classification.lr.LRLearner: Task[128]: epoch=6 start.
2017-07-13 10:24:05,252 INFO [pool-8-thread-3] com.tencent.angel.ml.classification.lr.LRLearner: Task[130]: epoch=5 validationData loss=5.94052527057723 precision=0.8823529411764706 auc=0.8333333333333333 trueRecall=0.6666666666666666 falseRecall=1.0
2017-07-13 10:24:05,252 INFO [pool-8-thread-4] com.tencent.angel.ml.classification.lr.LRLearner: Task[131]: epoch=5 success. epoch cost 229 ms. train cost 225 ms. validation cost 4 ms.
2017-07-13 10:24:05,252 INFO [pool-8-thread-3] com.tencent.angel.ml.classification.lr.LRLearner: Task[130]: epoch=5 success. epoch cost 237 ms. train cost 234 ms. validation cost 3 ms.
2017-07-13 10:24:05,253 INFO [pool-8-thread-4] com.tencent.angel.ml.classification.lr.LRLearner: Task[131]: epoch=6 start.
2017-07-13 10:24:05,253 INFO [pool-8-thread-3] com.tencent.angel.ml.classification.lr.LRLearner: Task[130]: epoch=6 start.
2017-07-13 10:24:05,362 INFO [pool-8-thread-1] com.tencent.angel.ml.classification.lr.LRLearner: Task[128]: epoch=6 mini-batch update success. cost 109 ms. batch loss = 46.50289157546116
2017-07-13 10:24:05,362 INFO [pool-8-thread-4] com.tencent.angel.ml.classification.lr.LRLearner: Task[131]: epoch=6 mini-batch update success. cost 109 ms. batch loss = 55.29026204757763
2017-07-13 10:24:05,362 INFO [pool-8-thread-1] com.tencent.angel.ml.classification.lr.LRLearner: Task[128]: epoch=6 trainData loss=41.36687241987158 precision=0.8835616438356164 auc=0.9186535764375876 trueRecall=0.5806451612903226 falseRecall=0.9652173913043478
2017-07-13 10:24:05,362 INFO [pool-8-thread-2] com.tencent.angel.ml.classification.lr.LRLearner: Task[129]: epoch=6 mini-batch update success. cost 112 ms. batch loss = 55.447993940196824
2017-07-13 10:24:05,362 INFO [pool-8-thread-1] com.tencent.angel.ml.classification.lr.LRLearner: Task[128]: epoch=6 validationData loss=7.181764238745375 precision=0.8235294117647058 auc=0.8269230769230769 trueRecall=0.25 falseRecall=1.0
2017-07-13 10:24:05,363 INFO [pool-8-thread-4] com.tencent.angel.ml.classification.lr.LRLearner: Task[131]: epoch=6 trainData loss=59.55879877068623 precision=0.7876712328767124 auc=0.8820903361344538 trueRecall=0.7058823529411765 falseRecall=0.8125
2017-07-13 10:24:05,363 INFO [pool-8-thread-3] com.tencent.angel.ml.classification.lr.LRLearner: Task[130]: epoch=6 mini-batch update success. cost 110 ms. batch loss = 52.66039087142734
2017-07-13 10:24:05,363 INFO [pool-8-thread-2] com.tencent.angel.ml.classification.lr.LRLearner: Task[129]: epoch=6 trainData loss=49.53819464501965 precision=0.8413793103448276 auc=0.9003935071323167 trueRecall=0.7105263157894737 falseRecall=0.8878504672897196
2017-07-13 10:24:05,363 INFO [pool-8-thread-4] com.tencent.angel.ml.classification.lr.LRLearner: Task[131]: epoch=6 validationData loss=12.155701005977466 precision=0.5882352941176471 auc=0.7833333333333333 trueRecall=0.8 falseRecall=0.5
2017-07-13 10:24:05,363 INFO [pool-8-thread-2] com.tencent.angel.ml.classification.lr.LRLearner: Task[129]: epoch=6 validationData loss=8.04230349182461 precision=0.7647058823529411 auc=0.6875 trueRecall=1.0 falseRecall=0.75
2017-07-13 10:24:05,363 INFO [pool-8-thread-3] com.tencent.angel.ml.classification.lr.LRLearner: Task[130]: epoch=6 trainData loss=48.232532471625404 precision=0.8356164383561644 auc=0.9167391304347826 trueRecall=0.6521739130434783 falseRecall=0.92
2017-07-13 10:24:05,363 INFO [pool-8-thread-3] com.tencent.angel.ml.classification.lr.LRLearner: Task[130]: epoch=6 validationData loss=5.883292928658214 precision=0.8823529411764706 auc=0.8333333333333333 trueRecall=0.6666666666666666 falseRecall=1.0
2017-07-13 10:24:05,363 INFO [pool-8-thread-1] com.tencent.angel.ml.classification.lr.LRLearner: Task[128]: epoch=6 success. epoch cost 111 ms. train cost 110 ms. validation cost 1 ms.
2017-07-13 10:24:05,363 INFO [pool-8-thread-4] com.tencent.angel.ml.classification.lr.LRLearner: Task[131]: epoch=6 success. epoch cost 110 ms. train cost 109 ms. validation cost 1 ms.
2017-07-13 10:24:05,364 INFO [pool-8-thread-2] com.tencent.angel.ml.classification.lr.LRLearner: Task[129]: epoch=6 success. epoch cost 114 ms. train cost 112 ms. validation cost 2 ms.
2017-07-13 10:24:05,364 INFO [pool-8-thread-1] com.tencent.angel.ml.classification.lr.LRLearner: Task[128]: epoch=7 start.
2017-07-13 10:24:05,364 INFO [pool-8-thread-2] com.tencent.angel.ml.classification.lr.LRLearner: Task[129]: epoch=7 start.
2017-07-13 10:24:05,364 INFO [pool-8-thread-4] com.tencent.angel.ml.classification.lr.LRLearner: Task[131]: epoch=7 start.
2017-07-13 10:24:05,365 INFO [pool-8-thread-3] com.tencent.angel.ml.classification.lr.LRLearner: Task[130]: epoch=6 success. epoch cost 112 ms. train cost 110 ms. validation cost 2 ms.
2017-07-13 10:24:05,365 INFO [pool-8-thread-3] com.tencent.angel.ml.classification.lr.LRLearner: Task[130]: epoch=7 start.
2017-07-13 10:24:05,505 INFO [pool-8-thread-2] com.tencent.angel.ml.classification.lr.LRLearner: Task[129]: epoch=7 mini-batch update success. cost 141 ms. batch loss = 54.74379227267895
2017-07-13 10:24:05,506 INFO [pool-8-thread-2] com.tencent.angel.ml.classification.lr.LRLearner: Task[129]: epoch=7 trainData loss=49.17231841985381 precision=0.8482758620689655 auc=0.8996556812592228 trueRecall=0.7368421052631579 falseRecall=0.8878504672897196
2017-07-13 10:24:05,506 INFO [pool-8-thread-4] com.tencent.angel.ml.classification.lr.LRLearner: Task[131]: epoch=7 mini-batch update success. cost 142 ms. batch loss = 54.805792297963066
2017-07-13 10:24:05,506 INFO [pool-8-thread-2] com.tencent.angel.ml.classification.lr.LRLearner: Task[129]: epoch=7 validationData loss=8.146259191722173 precision=0.7647058823529411 auc=0.6875 trueRecall=1.0 falseRecall=0.75
2017-07-13 10:24:05,506 INFO [pool-8-thread-4] com.tencent.angel.ml.classification.lr.LRLearner: Task[131]: epoch=7 trainData loss=59.032008519994605 precision=0.7945205479452054 auc=0.8810399159663865 trueRecall=0.7058823529411765 falseRecall=0.8214285714285714
2017-07-13 10:24:05,506 INFO [pool-8-thread-3] com.tencent.angel.ml.classification.lr.LRLearner: Task[130]: epoch=7 mini-batch update success. cost 141 ms. batch loss = 52.17387297643545
2017-07-13 10:24:05,506 INFO [pool-8-thread-4] com.tencent.angel.ml.classification.lr.LRLearner: Task[131]: epoch=7 validationData loss=12.08546546840281 precision=0.5882352941176471 auc=0.7999999999999999 trueRecall=0.8 falseRecall=0.5
2017-07-13 10:24:05,506 INFO [pool-8-thread-2] com.tencent.angel.ml.classification.lr.LRLearner: Task[129]: epoch=7 success. epoch cost 142 ms. train cost 141 ms. validation cost 1 ms.
2017-07-13 10:24:05,507 INFO [pool-8-thread-3] com.tencent.angel.ml.classification.lr.LRLearner: Task[130]: epoch=7 trainData loss=48.078054173047484 precision=0.8356164383561644 auc=0.9180434782608695 trueRecall=0.6521739130434783 falseRecall=0.92
2017-07-13 10:24:05,507 INFO [pool-8-thread-4] com.tencent.angel.ml.classification.lr.LRLearner: Task[131]: epoch=7 success. epoch cost 143 ms. train cost 142 ms. validation cost 1 ms.
2017-07-13 10:24:05,507 INFO [pool-8-thread-1] com.tencent.angel.ml.classification.lr.LRLearner: Task[128]: epoch=7 mini-batch update success. cost 143 ms. batch loss = 46.314305974739334
2017-07-13 10:24:05,507 INFO [pool-8-thread-3] com.tencent.angel.ml.classification.lr.LRLearner: Task[130]: epoch=7 validationData loss=5.839129142004707 precision=0.8823529411764706 auc=0.8333333333333333 trueRecall=0.6666666666666666 falseRecall=1.0
2017-07-13 10:24:05,507 INFO [pool-8-thread-2] com.tencent.angel.ml.classification.lr.LRLearner: Task[129]: epoch=8 start.
2017-07-13 10:24:05,507 INFO [pool-8-thread-3] com.tencent.angel.ml.classification.lr.LRLearner: Task[130]: epoch=7 success. epoch cost 142 ms. train cost 141 ms. validation cost 1 ms.
2017-07-13 10:24:05,507 INFO [pool-8-thread-1] com.tencent.angel.ml.classification.lr.LRLearner: Task[128]: epoch=7 trainData loss=41.35692130894118 precision=0.8835616438356164 auc=0.9189340813464235 trueRecall=0.5806451612903226 falseRecall=0.9652173913043478
2017-07-13 10:24:05,507 INFO [pool-8-thread-4] com.tencent.angel.ml.classification.lr.LRLearner: Task[131]: epoch=8 start.
2017-07-13 10:24:05,508 INFO [pool-8-thread-1] com.tencent.angel.ml.classification.lr.LRLearner: Task[128]: epoch=7 validationData loss=7.0308316647627525 precision=0.8235294117647058 auc=0.8269230769230769 trueRecall=0.25 falseRecall=1.0
2017-07-13 10:24:05,508 INFO [pool-8-thread-3] com.tencent.angel.ml.classification.lr.LRLearner: Task[130]: epoch=8 start.
2017-07-13 10:24:05,508 INFO [pool-8-thread-1] com.tencent.angel.ml.classification.lr.LRLearner: Task[128]: epoch=7 success. epoch cost 144 ms. train cost 143 ms. validation cost 1 ms.
2017-07-13 10:24:05,525 INFO [pool-8-thread-1] com.tencent.angel.ml.classification.lr.LRLearner: Task[128]: epoch=8 start.
2017-07-13 10:24:05,599 INFO [pool-8-thread-4] com.tencent.angel.ml.classification.lr.LRLearner: Task[131]: epoch=8 mini-batch update success. cost 92 ms. batch loss = 54.43970736447796
2017-07-13 10:24:05,599 INFO [pool-8-thread-3] com.tencent.angel.ml.classification.lr.LRLearner: Task[130]: epoch=8 mini-batch update success. cost 91 ms. batch loss = 51.77980350617134
2017-07-13 10:24:05,599 INFO [pool-8-thread-4] com.tencent.angel.ml.classification.lr.LRLearner: Task[131]: epoch=8 trainData loss=58.59161500077375 precision=0.7945205479452054 auc=0.8813025210084033 trueRecall=0.7058823529411765 falseRecall=0.8214285714285714
2017-07-13 10:24:05,599 INFO [pool-8-thread-4] com.tencent.angel.ml.classification.lr.LRLearner: Task[131]: epoch=8 validationData loss=12.010690527210821 precision=0.5882352941176471 auc=0.7999999999999999 trueRecall=0.8 falseRecall=0.5
2017-07-13 10:24:05,599 INFO [pool-8-thread-3] com.tencent.angel.ml.classification.lr.LRLearner: Task[130]: epoch=8 trainData loss=47.96944253025117 precision=0.8424657534246576 auc=0.9182608695652174 trueRecall=0.6739130434782609 falseRecall=0.92
2017-07-13 10:24:05,599 INFO [pool-8-thread-3] com.tencent.angel.ml.classification.lr.LRLearner: Task[130]: epoch=8 validationData loss=5.804020964728994 precision=0.8823529411764706 auc=0.8333333333333333 trueRecall=0.6666666666666666 falseRecall=1.0
2017-07-13 10:24:05,600 INFO [pool-8-thread-4] com.tencent.angel.ml.classification.lr.LRLearner: Task[131]: epoch=8 success. epoch cost 93 ms. train cost 92 ms. validation cost 1 ms.
2017-07-13 10:24:05,600 INFO [pool-8-thread-1] com.tencent.angel.ml.classification.lr.LRLearner: Task[128]: epoch=8 mini-batch update success. cost 75 ms. batch loss = 46.16955714116006
2017-07-13 10:24:05,600 INFO [pool-8-thread-3] com.tencent.angel.ml.classification.lr.LRLearner: Task[130]: epoch=8 success. epoch cost 92 ms. train cost 91 ms. validation cost 1 ms.
2017-07-13 10:24:05,600 INFO [pool-8-thread-1] com.tencent.angel.ml.classification.lr.LRLearner: Task[128]: epoch=8 trainData loss=41.372660650440466 precision=0.8767123287671232 auc=0.9180925666199158 trueRecall=0.5483870967741935 falseRecall=0.9652173913043478
2017-07-13 10:24:05,600 INFO [pool-8-thread-2] com.tencent.angel.ml.classification.lr.LRLearner: Task[129]: epoch=8 mini-batch update success. cost 93 ms. batch loss = 54.14588638067622
2017-07-13 10:24:05,600 INFO [pool-8-thread-4] com.tencent.angel.ml.classification.lr.LRLearner: Task[131]: epoch=9 start.
2017-07-13 10:24:05,600 INFO [pool-8-thread-1] com.tencent.angel.ml.classification.lr.LRLearner: Task[128]: epoch=8 validationData loss=6.8998509700424115 precision=0.8235294117647058 auc=0.8269230769230769 trueRecall=0.25 falseRecall=1.0
2017-07-13 10:24:05,600 INFO [pool-8-thread-3] com.tencent.angel.ml.classification.lr.LRLearner: Task[130]: epoch=9 start.
2017-07-13 10:24:05,601 INFO [pool-8-thread-2] com.tencent.angel.ml.classification.lr.LRLearner: Task[129]: epoch=8 trainData loss=48.87584780694351 precision=0.8413793103448276 auc=0.9008853910477127 trueRecall=0.6842105263157895 falseRecall=0.897196261682243
2017-07-13 10:24:05,601 INFO [pool-8-thread-1] com.tencent.angel.ml.classification.lr.LRLearner: Task[128]: epoch=8 success. epoch cost 76 ms. train cost 75 ms. validation cost 1 ms.
2017-07-13 10:24:05,601 INFO [pool-8-thread-2] com.tencent.angel.ml.classification.lr.LRLearner: Task[129]: epoch=8 validationData loss=8.228877856029492 precision=0.7647058823529411 auc=0.6875 trueRecall=1.0 falseRecall=0.75
2017-07-13 10:24:05,601 INFO [pool-8-thread-1] com.tencent.angel.ml.classification.lr.LRLearner: Task[128]: epoch=9 start.
2017-07-13 10:24:05,601 INFO [pool-8-thread-2] com.tencent.angel.ml.classification.lr.LRLearner: Task[129]: epoch=8 success. epoch cost 94 ms. train cost 93 ms. validation cost 1 ms.
2017-07-13 10:24:05,602 INFO [pool-8-thread-2] com.tencent.angel.ml.classification.lr.LRLearner: Task[129]: epoch=9 start.
2017-07-13 10:49:59,441 INFO [LeaseRenewer:work@c3prc-hadoop] org.apache.hadoop.io.retry.RetryInvocationHandler: Exception while invoking renewLease of class ClientNamenodeProtocolTranslatorPB over c3-hadoop-prc-ct05.bj/10.108.84.32:11200. Trying to fail over immediately.
java.net.ConnectException: Call From c3-hadoop-prc-st949.bj/10.118.33.8 to c3-hadoop-prc-ct05.bj:11200 failed on connection exception: java.net.ConnectException: Connection timed out; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:791)
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:731)
at org.apache.hadoop.ipc.Client.call(Client.java:1481)
at org.apache.hadoop.ipc.Client.call(Client.java:1408)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
at com.sun.proxy.$Proxy12.renewLease(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.renewLease(ClientNamenodeProtocolTranslatorPB.java:615)
at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy13.renewLease(Unknown Source)
at org.apache.hadoop.hdfs.DFSClient.renewLease(DFSClient.java:907)
at org.apache.hadoop.hdfs.LeaseRenewer.renew(LeaseRenewer.java:417)
at org.apache.hadoop.hdfs.LeaseRenewer.run(LeaseRenewer.java:442)
at org.apache.hadoop.hdfs.LeaseRenewer.access$700(LeaseRenewer.java:71)
at org.apache.hadoop.hdfs.LeaseRenewer$1.run(LeaseRenewer.java:298)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.ConnectException: Connection timed out
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:530)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:494)
at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:615)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:713)
at org.apache.hadoop.ipc.Client$Connection.access$2900(Client.java:370)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1530)
at org.apache.hadoop.ipc.Client.call(Client.java:1447)
... 16 more
2017-07-13 10:49:59,449 INFO [LeaseRenewer:work@c3prc-hadoop] org.apache.zookeeper.ZooKeeper: Initiating client connection, connectString=10.108.37.30:11000,10.108.38.30:11000,10.108.39.30:11000,10.108.84.25:11000,10.108.84.32:11000 sessionTimeout=5000 watcher=org.apache.hadoop.hdfs.server.namenode.ha.ZkConfiguredFailoverProxyProvider@8724a58
2017-07-13 10:49:59,450 INFO [LeaseRenewer:work@c3prc-hadoop-SendThread(10.108.37.30:11000)] org.apache.zookeeper.ClientCnxn: Opening socket connection to server 10.108.37.30/10.108.37.30:11000. Will not attempt to authenticate using SASL (unknown error)
2017-07-13 10:49:59,451 INFO [LeaseRenewer:work@c3prc-hadoop-SendThread(10.108.37.30:11000)] org.apache.zookeeper.ClientCnxn: Socket connection established to 10.108.37.30/10.108.37.30:11000, initiating session
2017-07-13 10:49:59,452 INFO [LeaseRenewer:work@c3prc-hadoop-SendThread(10.108.37.30:11000)] org.apache.zookeeper.ClientCnxn: Session establishment complete on server 10.108.37.30/10.108.37.30:11000, sessionid = 0x5cd5fc696f5df9, negotiated timeout = 5000
2017-07-13 10:49:59,456 INFO [LeaseRenewer:work@c3prc-hadoop] org.apache.zookeeper.ZooKeeper: Session: 0x5cd5fc696f5df9 closed
2017-07-13 10:49:59,456 INFO [LeaseRenewer:work@c3prc-hadoop-EventThread] org.apache.zookeeper.ClientCnxn: EventThread shut down
2017-07-13 10:50:01,456 INFO [LeaseRenewer:work@c3prc-hadoop] org.apache.hadoop.hdfs.server.namenode.ha.ZkConfiguredFailoverProxyProvider: Failover to namenode c3-hadoop-prc-ct05.bj/10.108.84.32:11200
2017-07-13 12:32:18,153 INFO [LeaseRenewer:work@c3prc-hadoop] org.apache.hadoop.io.retry.RetryInvocationHandler: Exception while invoking renewLease of class ClientNamenodeProtocolTranslatorPB over c3-hadoop-prc-ct05.bj/10.108.84.32:11200. Trying to fail over immediately.
java.net.ConnectException: Call From c3-hadoop-prc-st949.bj/10.118.33.8 to c3-hadoop-prc-ct05.bj:11200 failed on connection exception: java.net.ConnectException: Connection timed out; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:791)
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:731)
at org.apache.hadoop.ipc.Client.call(Client.java:1481)
at org.apache.hadoop.ipc.Client.call(Client.java:1408)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
at com.sun.proxy.$Proxy12.renewLease(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.renewLease(ClientNamenodeProtocolTranslatorPB.java:615)
at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy13.renewLease(Unknown Source)
at org.apache.hadoop.hdfs.DFSClient.renewLease(DFSClient.java:907)
at org.apache.hadoop.hdfs.LeaseRenewer.renew(LeaseRenewer.java:417)
at org.apache.hadoop.hdfs.LeaseRenewer.run(LeaseRenewer.java:442)
at org.apache.hadoop.hdfs.LeaseRenewer.access$700(LeaseRenewer.java:71)
at org.apache.hadoop.hdfs.LeaseRenewer$1.run(LeaseRenewer.java:298)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.ConnectException: Connection timed out
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:530)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:494)
at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:615)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:713)
at org.apache.hadoop.ipc.Client$Connection.access$2900(Client.java:370)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1530)
at org.apache.hadoop.ipc.Client.call(Client.java:1447)
... 16 more
2017-07-13 12:32:18,154 INFO [LeaseRenewer:work@c3prc-hadoop] org.apache.zookeeper.ZooKeeper: Initiating client connection, connectString=10.108.37.30:11000,10.108.38.30:11000,10.108.39.30:11000,10.108.84.25:11000,10.108.84.32:11000 sessionTimeout=5000 watcher=org.apache.hadoop.hdfs.server.namenode.ha.ZkConfiguredFailoverProxyProvider@8724a58
2017-07-13 12:32:18,157 INFO [LeaseRenewer:work@c3prc-hadoop-SendThread(10.108.39.30:11000)] org.apache.zookeeper.ClientCnxn: Opening socket connection to server 10.108.39.30/10.108.39.30:11000. Will not attempt to authenticate using SASL (unknown error)
2017-07-13 12:32:18,158 INFO [LeaseRenewer:work@c3prc-hadoop-SendThread(10.108.39.30:11000)] org.apache.zookeeper.ClientCnxn: Socket connection established to 10.108.39.30/10.108.39.30:11000, initiating session
2017-07-13 12:32:18,159 INFO [LeaseRenewer:work@c3prc-hadoop-SendThread(10.108.39.30:11000)] org.apache.zookeeper.ClientCnxn: Session establishment complete on server 10.108.39.30/10.108.39.30:11000, sessionid = 0x25ce3b42a98df19, negotiated timeout = 5000
2017-07-13 12:32:18,161 INFO [LeaseRenewer:work@c3prc-hadoop] org.apache.zookeeper.ZooKeeper: Session: 0x25ce3b42a98df19 closed
2017-07-13 12:32:18,161 INFO [LeaseRenewer:work@c3prc-hadoop-EventThread] org.apache.zookeeper.ClientCnxn: EventThread shut down
2017-07-13 12:32:20,161 INFO [LeaseRenewer:work@c3prc-hadoop] org.apache.hadoop.hdfs.server.namenode.ha.ZkConfiguredFailoverProxyProvider: Failover to namenode c3-hadoop-prc-ct05.bj/10.108.84.32:11200

hellodengfei avatar hellodengfei commented on May 14, 2024

It looks like your HDFS cluster is in an unhealthy state. The log shows that the active NameNode has failed over or is dead, and the other NameNode is unreachable.

2017-07-13 12:32:18,153 INFO [LeaseRenewer:work@c3prc-hadoop] org.apache.hadoop.io.retry.RetryInvocationHandler: Exception while invoking renewLease of class ClientNamenodeProtocolTranslatorPB over c3-hadoop-prc-ct05.bj/10.108.84.32:11200. Trying to fail over immediately.
java.net.ConnectException: Call From c3-hadoop-prc-st949.bj/10.118.33.8 to c3-hadoop-prc-ct05.bj:11200 failed on connection exception: java.net.ConnectException: Connection timed out; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused

So the job is not hanging while connecting to the standby RM. You have probably configured the NameNode and ResourceManager on the same node. Please check your cluster, including HDFS.
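One quick way to confirm the diagnosis is to test plain TCP reachability of the NameNode RPC address from the worker host. The following is only a minimal sketch: the helper name `can_connect` and the 5-second timeout are assumptions made for illustration, and a successful connection shows only that the port accepts connections, not that the NameNode is active or healthy.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds.

    Hypothetical helper for illustration; it is not part of Angel or Hadoop.
    """
    try:
        # create_connection resolves the host name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name-resolution failures.
        return False
```

For instance, calling `can_connect("c3-hadoop-prc-ct05.bj", 11200)` on the worker host (`c3-hadoop-prc-st949.bj` in the log above) would be expected to return `False` while the `Connection timed out` errors persist.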

QingdiMeng avatar QingdiMeng commented on May 14, 2024

Worker's thread stack:

threadid: 104375812 threadname: IPC Client (969972672) connection to c3-hadoop-prc-ct05.bj/10.108.84.32:11200 from work threadstate: TIMED_WAITING
java.lang.Object.wait(Native Method)
org.apache.hadoop.ipc.Client$Connection.waitForWork(Client.java:928)
org.apache.hadoop.ipc.Client$Connection.run(Client.java:973)

threadid: 104364428 threadname: pool-3-thread-4432 threadstate: TIMED_WAITING
sun.misc.Unsafe.park(Native Method)
java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460)
java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:362)
java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:941)
java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1066)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
java.lang.Thread.run(Thread.java:745)

threadid: 104350804 threadname: pool-3-thread-4431 threadstate: TIMED_WAITING
sun.misc.Unsafe.park(Native Method)
java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460)
java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:362)
java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:941)
java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1066)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
java.lang.Thread.run(Thread.java:745)

threadid: 104346404 threadname: pool-3-thread-4430 threadstate: TIMED_WAITING
sun.misc.Unsafe.park(Native Method)
java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460)
java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:362)
java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:941)
java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1066)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
java.lang.Thread.run(Thread.java:745)

threadid: 104320940 threadname: pool-3-thread-4429 threadstate: TIMED_WAITING
sun.misc.Unsafe.park(Native Method)
java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460)
java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:362)
java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:941)
java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1066)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
java.lang.Thread.run(Thread.java:745)

threadid: 104318659 threadname: pool-3-thread-4428 threadstate: TIMED_WAITING
sun.misc.Unsafe.park(Native Method)
java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460)
java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:362)
java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:941)
java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1066)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
java.lang.Thread.run(Thread.java:745)

threadid: 104258557 threadname: pool-3-thread-4427 threadstate: TIMED_WAITING
sun.misc.Unsafe.park(Native Method)
java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460)
java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:362)
java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:941)
java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1066)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
java.lang.Thread.run(Thread.java:745)

threadid: 104206850 threadname: pool-3-thread-4426 threadstate: TIMED_WAITING
sun.misc.Unsafe.park(Native Method)
java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460)
java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:362)
java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:941)
java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1066)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
java.lang.Thread.run(Thread.java:745)

threadid: 104197710 threadname: pool-3-thread-4425 threadstate: TIMED_WAITING
sun.misc.Unsafe.park(Native Method)
java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460)
java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:362)
java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:941)
java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1066)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
java.lang.Thread.run(Thread.java:745)

threadid: 104123041 threadname: pool-3-thread-4424 threadstate: TIMED_WAITING
sun.misc.Unsafe.park(Native Method)
java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460)
java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:362)
java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:941)
java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1066)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
java.lang.Thread.run(Thread.java:745)

threadid: 104116415 threadname: pool-3-thread-4423 threadstate: TIMED_WAITING
sun.misc.Unsafe.park(Native Method)
java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460)
java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:362)
java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:941)
java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1066)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
java.lang.Thread.run(Thread.java:745)

threadid: 104015130 threadname: pool-3-thread-4422 threadstate: TIMED_WAITING
sun.misc.Unsafe.park(Native Method)
java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460)
java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:362)
java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:941)
java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1066)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
java.lang.Thread.run(Thread.java:745)

threadid: 104004769 threadname: pool-3-thread-4421 threadstate: TIMED_WAITING
sun.misc.Unsafe.park(Native Method)
java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460)
java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:362)
java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:941)
java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1066)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
java.lang.Thread.run(Thread.java:745)

threadid: 103653083 threadname: pool-3-thread-4411 threadstate: TIMED_WAITING
sun.misc.Unsafe.park(Native Method)
java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460)
java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:362)
java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:941)
java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1066)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
java.lang.Thread.run(Thread.java:745)

threadid: 103470486 threadname: pool-3-thread-4406 threadstate: TIMED_WAITING
sun.misc.Unsafe.park(Native Method)
java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460)
java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:362)
java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:941)
java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1066)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
java.lang.Thread.run(Thread.java:745)

threadid: 103385858 threadname: pool-3-thread-4404 threadstate: TIMED_WAITING
sun.misc.Unsafe.park(Native Method)
java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460)
java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:362)
java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:941)
java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1066)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
java.lang.Thread.run(Thread.java:745)

threadid: 103381515 threadname: pool-3-thread-4403 threadstate: TIMED_WAITING
sun.misc.Unsafe.park(Native Method)
java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460)
java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:362)
java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:941)
java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1066)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
java.lang.Thread.run(Thread.java:745)

threadid: 103252517 threadname: pool-3-thread-4397 threadstate: TIMED_WAITING
sun.misc.Unsafe.park(Native Method)
java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460)
java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:362)
java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:941)
java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1066)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
java.lang.Thread.run(Thread.java:745)

threadid: 48 threadname: nioEventLoopGroup-2-21 threadstate: RUNNABLE
sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:93)
sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
io.netty.channel.nio.NioEventLoop.select(NioEventLoop.java:622)
io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:310)
io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137)
java.lang.Thread.run(Thread.java:745)

threadid: 49 threadname: nioEventLoopGroup-2-22 threadstate: RUNNABLE
sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:93)
sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
io.netty.channel.nio.NioEventLoop.select(NioEventLoop.java:622)
io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:310)
io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137)
java.lang.Thread.run(Thread.java:745)

threadid: 51 threadname: nioEventLoopGroup-2-24 threadstate: RUNNABLE
sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:93)
sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
io.netty.channel.nio.NioEventLoop.select(NioEventLoop.java:622)
io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:310)
io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137)
java.lang.Thread.run(Thread.java:745)

threadid: 50 threadname: nioEventLoopGroup-2-23 threadstate: RUNNABLE
sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:93)
sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
io.netty.channel.nio.NioEventLoop.select(NioEventLoop.java:622)
io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:310)
io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137)
java.lang.Thread.run(Thread.java:745)

threadid: 1959 threadname: Thread-1823 threadstate: TIMED_WAITING
java.lang.Object.wait(Native Method)
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:572)

threadid: 1958 threadname: Thread-1825 threadstate: TIMED_WAITING
java.lang.Object.wait(Native Method)
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:572)

threadid: 1957 threadname: Thread-1824 threadstate: TIMED_WAITING
java.lang.Object.wait(Native Method)
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:572)

threadid: 1956 threadname: LeaseRenewer:work@c3prc-hadoop threadstate: TIMED_WAITING
java.lang.Thread.sleep(Native Method)
org.apache.hadoop.hdfs.LeaseRenewer.run(LeaseRenewer.java:438)
org.apache.hadoop.hdfs.LeaseRenewer.access$700(LeaseRenewer.java:71)
org.apache.hadoop.hdfs.LeaseRenewer$1.run(LeaseRenewer.java:298)
java.lang.Thread.run(Thread.java:745)

threadid: 1955 threadname: Thread-1821 threadstate: TIMED_WAITING
java.lang.Object.wait(Native Method)
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:572)

threadid: 1773 threadname: Attach Listener threadstate: RUNNABLE

threadid: 1212 threadname: IPC Parameter Sending Thread #1 threadstate: TIMED_WAITING
sun.misc.Unsafe.park(Native Method)
java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460)
java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:362)
java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:941)
java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1066)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
java.lang.Thread.run(Thread.java:745)

threadid: 600 threadname: DestroyJavaVM threadstate: RUNNABLE

threadid: 599 threadname: pool-8-thread-4 threadstate: TIMED_WAITING
java.lang.Thread.sleep(Native Method)
com.tencent.angel.psagent.matrix.transport.adapter.MatrixClientAdapter.waitForClock(MatrixClientAdapter.java:406)
com.tencent.angel.psagent.matrix.transport.adapter.MatrixClientAdapter.getRow(MatrixClientAdapter.java:154)
com.tencent.angel.psagent.consistency.ConsistencyController.getRow(ConsistencyController.java:100)
com.tencent.angel.psagent.matrix.MatrixClientImpl.getRow(MatrixClientImpl.java:53)
com.tencent.angel.ml.model.PSModel.getRow(PSModel.scala:203)
com.tencent.angel.ml.optimizer.sgd.GradientDescent$.miniBatchGD(GradientDescent.scala:36)
com.tencent.angel.ml.classification.lr.LRLearner.trainOneEpoch(LRLearner.scala:72)
com.tencent.angel.ml.classification.lr.LRLearner.train(LRLearner.scala:103)
com.tencent.angel.ml.classification.lr.LRTrainTask.train(LRTrainTask.scala:55)
com.tencent.angel.worker.task.TrainTask.run(TrainTask.scala:28)
com.tencent.angel.worker.task.Task.runUser(Task.java:95)
com.tencent.angel.worker.task.Task.run(Task.java:70)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
java.lang.Thread.run(Thread.java:745)

threadid: 597 threadname: pool-8-thread-3 threadstate: TIMED_WAITING
java.lang.Thread.sleep(Native Method)
com.tencent.angel.psagent.matrix.transport.adapter.MatrixClientAdapter.waitForClock(MatrixClientAdapter.java:406)
com.tencent.angel.psagent.matrix.transport.adapter.MatrixClientAdapter.getRow(MatrixClientAdapter.java:154)
com.tencent.angel.psagent.consistency.ConsistencyController.getRow(ConsistencyController.java:100)
com.tencent.angel.psagent.matrix.MatrixClientImpl.getRow(MatrixClientImpl.java:53)
com.tencent.angel.ml.model.PSModel.getRow(PSModel.scala:203)
com.tencent.angel.ml.optimizer.sgd.GradientDescent$.miniBatchGD(GradientDescent.scala:36)
com.tencent.angel.ml.classification.lr.LRLearner.trainOneEpoch(LRLearner.scala:72)
com.tencent.angel.ml.classification.lr.LRLearner.train(LRLearner.scala:103)
com.tencent.angel.ml.classification.lr.LRTrainTask.train(LRTrainTask.scala:55)
com.tencent.angel.worker.task.TrainTask.run(TrainTask.scala:28)
com.tencent.angel.worker.task.Task.runUser(Task.java:95)
com.tencent.angel.worker.task.Task.run(Task.java:70)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
java.lang.Thread.run(Thread.java:745)

threadid: 580 threadname: pool-8-thread-2 threadstate: TIMED_WAITING
java.lang.Thread.sleep(Native Method)
com.tencent.angel.psagent.matrix.transport.adapter.MatrixClientAdapter.waitForClock(MatrixClientAdapter.java:406)
com.tencent.angel.psagent.matrix.transport.adapter.MatrixClientAdapter.getRow(MatrixClientAdapter.java:154)
com.tencent.angel.psagent.consistency.ConsistencyController.getRow(ConsistencyController.java:100)
com.tencent.angel.psagent.matrix.MatrixClientImpl.getRow(MatrixClientImpl.java:53)
com.tencent.angel.ml.model.PSModel.getRow(PSModel.scala:203)
com.tencent.angel.ml.optimizer.sgd.GradientDescent$.miniBatchGD(GradientDescent.scala:36)
com.tencent.angel.ml.classification.lr.LRLearner.trainOneEpoch(LRLearner.scala:72)
com.tencent.angel.ml.classification.lr.LRLearner.train(LRLearner.scala:103)
com.tencent.angel.ml.classification.lr.LRTrainTask.train(LRTrainTask.scala:55)
com.tencent.angel.worker.task.TrainTask.run(TrainTask.scala:28)
com.tencent.angel.worker.task.Task.runUser(Task.java:95)
com.tencent.angel.worker.task.Task.run(Task.java:70)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
java.lang.Thread.run(Thread.java:745)

threadid: 573 threadname: pool-8-thread-1 threadstate: TIMED_WAITING
java.lang.Thread.sleep(Native Method)
com.tencent.angel.psagent.matrix.transport.adapter.MatrixClientAdapter.waitForClock(MatrixClientAdapter.java:406)
com.tencent.angel.psagent.matrix.transport.adapter.MatrixClientAdapter.getRow(MatrixClientAdapter.java:154)
com.tencent.angel.psagent.consistency.ConsistencyController.getRow(ConsistencyController.java:100)
com.tencent.angel.psagent.matrix.MatrixClientImpl.getRow(MatrixClientImpl.java:53)
com.tencent.angel.ml.model.PSModel.getRow(PSModel.scala:203)
com.tencent.angel.ml.optimizer.sgd.GradientDescent$.miniBatchGD(GradientDescent.scala:36)
com.tencent.angel.ml.classification.lr.LRLearner.trainOneEpoch(LRLearner.scala:72)
com.tencent.angel.ml.classification.lr.LRLearner.train(LRLearner.scala:103)
com.tencent.angel.ml.classification.lr.LRTrainTask.train(LRTrainTask.scala:55)
com.tencent.angel.worker.task.TrainTask.run(TrainTask.scala:28)
com.tencent.angel.worker.task.Task.runUser(Task.java:95)
com.tencent.angel.worker.task.Task.run(Task.java:70)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
java.lang.Thread.run(Thread.java:745)

threadid: 151 threadname: Worker Heartbeat threadstate: TIMED_WAITING
java.lang.Thread.sleep(Native Method)
com.tencent.angel.worker.Worker$2.run(Worker.java:309)
java.lang.Thread.run(Thread.java:745)
