Comments (4)
Hi SparkJiao,
Could you also test with an SGD whose lr equals the final_lr of AdaBound?
The current information is too limited to make an educated guess. What I am most confident of is that AdaBound could outperform SGD (viz. faster, and not worse than SGD) with similar settings.
We did find that AdaBound is more robust on CV tasks than on NLP tasks. The reason might be that adaptive methods are more useful on unbalanced data (word embeddings are a typical example of this). On this kind of task SGD might be worse than Adam, so transitioning to SGD cannot help.
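To make the suggested comparison concrete, here is a minimal sketch of why an SGD baseline with lr equal to final_lr is the natural reference point. It assumes the bound schedule used in the reference PyTorch implementation, with the commonly used defaults final_lr=0.1 and gamma=1e-3; the function name `adabound_bounds` is hypothetical and only for illustration. Both clipping bounds tighten around final_lr as training proceeds, so late in training the per-parameter step size is close to a plain SGD step with that learning rate.

```python
def adabound_bounds(step, final_lr=0.1, gamma=1e-3):
    """Return the (lower, upper) bounds used to clip the adaptive step size.

    Assumption: this mirrors the schedule in the reference implementation;
    final_lr and gamma are the commonly used defaults, not tuned values.
    """
    lower = final_lr * (1 - 1 / (gamma * step + 1))
    upper = final_lr * (1 + 1 / (gamma * step))
    return lower, upper


if __name__ == "__main__":
    # Early on the bounds are loose (Adam-like); later they pinch to final_lr.
    for step in (1, 1_000, 100_000, 1_000_000):
        lo, hi = adabound_bounds(step)
        print(f"step={step:>9}: lower={lo:.6f}, upper={hi:.6f}")
```

Under these assumptions, the matching baseline would be plain SGD with lr=0.1 (for example `torch.optim.SGD(model.parameters(), lr=0.1)`), with momentum and weight decay chosen to match your AdaBound run.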
from adabound.
@SparkJiao besides, I am currently testing AdaBound on a reading comprehension task as well. Which dataset and model did you test? My preliminary results suggest that AdaBound is still the best.
OK, if I have a free GPU, I will test the performance with SGD.
By the way, I'm working on CoQA (the Conversational Question Answering Challenge), and the model being tested is modified from FlowQA, whose author has pushed the code to GitHub; the paper is also under open review for ICLR 2019.
Thank you!
@SparkJiao thanks. I have tested on CoQA, but not with FlowQA. I will take a look at their paper.