Comments (5)
One weird thing: when I resumed training from a model saved during the run shown in the figure above, the phenomenon did not appear again.
from models.
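Sorry, I'll write this in Chinese :(
Training on libri.train-clean-100, convergence becomes strange starting at pass 14: the cost suddenly jumps and then falls again: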
Pass: 13, Batch: 800, TrainCost: 33.875851
.................................................
Pass: 13, Batch: 850, TrainCost: 32.388756
.........................................
------- Time: 2996 sec, Pass: 13, ValidationCost: 270.575763434
Pass: 14, Batch: 0, TrainCost: 45.968803
.................................................
Pass: 14, Batch: 50, TrainCost: 492.662450
I removed the following lines from batch-shuffle, which drops the few short samples at the start and the long samples that cannot fill a batch:
res_len = len(manifest) - shift_len - len(batch_manifest)
batch_manifest.extend(manifest[-res_len:])
batch_manifest.extend(manifest[0:shift_len])
With that change, the cost no longer jumps suddenly and convergence looks normal:
.........
Pass: 19, Batch: 890, TrainCost: 25.492654 CurCost: 14.879614
------- Time: 2977 sec, Pass: 19, ValidationCost: 61.139245818
Pass: 20, Batch: 0, TrainCost: 27.037848 CurCost: 27.037848
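A minimal sketch of the batch shuffle discussed above, with and without the clipping step, assuming the manifest is a list of dicts with a `duration` key. The function name `batch_shuffle`, the `seed` parameter, and everything other than the three quoted lines are assumptions for illustration, not the repository's actual code. The `res_len > 0` guard is also an addition: in Python, `manifest[-0:]` is the whole list, so an empty remainder would otherwise re-append every sample.

```python
import random


def batch_shuffle(manifest, batch_size, clipped=False, seed=0):
    """Shuffle whole batches of duration-sorted samples (hypothetical helper)."""
    rng = random.Random(seed)
    # Sort by duration so each batch holds utterances of similar length.
    ordered = sorted(manifest, key=lambda x: x["duration"])
    # Random offset so that batch boundaries can differ between calls.
    shift_len = rng.randint(0, batch_size - 1)
    body = ordered[shift_len:]
    n_full = len(body) // batch_size
    batches = [body[i * batch_size:(i + 1) * batch_size] for i in range(n_full)]
    # Shuffle the order of the batches, not the samples inside them.
    rng.shuffle(batches)
    result = [item for batch in batches for item in batch]
    if not clipped:
        # Re-append the long tail that did not fill a batch, then the short head.
        res_len = len(ordered) - shift_len - len(result)
        if res_len > 0:
            result.extend(ordered[-res_len:])
        result.extend(ordered[:shift_len])
    return result


# Toy manifest: 10 utterances with durations 0..9 seconds.
toy_manifest = [{"duration": float(d)} for d in range(10)]
kept_all = batch_shuffle(toy_manifest, batch_size=4, clipped=False)
kept_full_batches = batch_shuffle(toy_manifest, batch_size=4, clipped=True)
print(len(kept_all), len(kept_full_batches))
```

With `clipped=True` the result contains only full batches, so the final batches of an epoch never mix the longest leftover utterances with the shortest ones.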
I've given up the attempt to reproduce the phenomenon from a pre-trained model.
Now I've started three from-scratch jobs with three different shuffle methods, i.e.
- instance shuffle
- batch shuffle
- batch shuffle with clipping
(For more details, please refer here)
with the full LibriSpeech data, in order to reproduce what @qingqing01 has observed on a small dataset.
Here are the results for batch size = 32: all three shuffle methods run into abnormal convergence. Moreover, the bumps no longer occur in the first batches of an epoch (this contradicts what we observed previously).
However, when we change the batch size from 32 to 256, convergence is much more stable and we have not seen the abnormal phenomenon so far.
Larger batches reduce the gradient variance, thus stabilizing the convergence.
Conclusion: a batch size of 32 is too small for stable training; use 256 or larger instead.
TODO:
- Try a smaller learning rate for batch size 32.
- Train for more epochs to see whether batch size 256 can really stabilize the training.
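The variance argument above can be checked with a toy simulation. This is only a sketch under a strong assumption: per-sample gradients are modeled as independent scalar draws from a unit Gaussian, which real speech-model gradients are not. The function name `minibatch_mean_variance` and its parameters are hypothetical. Under this model the variance of the mini-batch mean scales as 1/B, so going from 32 to 256 should cut it by roughly a factor of 8.

```python
import random
import statistics


def minibatch_mean_variance(batch_size, n_trials=2000, seed=0):
    """Empirical variance of the mean of `batch_size` i.i.d. N(0, 1) draws."""
    rng = random.Random(seed)
    means = []
    for _ in range(n_trials):
        # Each draw stands in for one per-sample gradient (toy assumption).
        batch = [rng.gauss(0.0, 1.0) for _ in range(batch_size)]
        means.append(sum(batch) / batch_size)
    return statistics.pvariance(means)


var_32 = minibatch_mean_variance(32)
var_256 = minibatch_mean_variance(256)
print("variance at batch size 32: ", var_32)
print("variance at batch size 256:", var_256)
print("ratio:", var_32 / var_256)
```

This illustrates only the noise-reduction effect; it says nothing about whether a smaller learning rate at batch size 32 would give the same stability, which is what the first TODO item would test.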
Hello, this issue has not been updated in the past month, so we will close it today for the sake of other users' experience. If you still need to follow up after it is closed, please feel free to reopen it, and we will get back to you within 24 hours. We apologize for any inconvenience caused by the closure, and thank you for your support of PaddlePaddle!