Comments (2)
Hi, thanks for your interest, and sorry for the late reply. I would suggest tuning the alpha_bkw parameter in https://github.com/amirgholami/powernorm/blob/2f23ae75c4f29904175bfd2c6b8248399ff99440/fairseq/modules/norms/mask_powernorm.py#L103. The larger it is, the less variance it introduces in the later training phase.
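As a rough illustration of why a larger value can reduce variance, here is a minimal sketch. It assumes alpha_bkw behaves like the decay coefficient of an exponential moving average, which is an assumption and not something confirmed by the linked file. The helper `ema_trace` and the synthetic `noisy_grads` data are hypothetical and are not part of the powernorm code.

```python
import random
import statistics


def ema_trace(samples, alpha):
    """Running exponential moving average: state = alpha * state + (1 - alpha) * x.

    `alpha` stands in for alpha_bkw here (assumption, see the text above).
    """
    state = samples[0]
    trace = []
    for x in samples:
        state = alpha * state + (1.0 - alpha) * x
        trace.append(state)
    return trace


# Synthetic zero-mean noise standing in for a noisy per-step statistic.
random.seed(0)
noisy_grads = [random.gauss(0.0, 1.0) for _ in range(5000)]

variances = {}
for alpha in (0.5, 0.9, 0.99):
    trace = ema_trace(noisy_grads, alpha)
    # Skip the first steps so the initial state does not dominate the estimate.
    variances[alpha] = statistics.pvariance(trace[500:])
    print(f"alpha={alpha}: variance of smoothed signal = {variances[alpha]:.4f}")
```

In this toy setting the smoothed signal fluctuates less as alpha grows, at the cost of reacting more slowly to changes, so it may be worth trying a few values rather than only the largest one.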
Hi, thank you for your reply. However, the link you sent does not work for me; I get a "Page not found" error.
...