As you all know, gradient accumulation is not directly compatible with batch normaliza

Better compatibility with batch normalization (gradientaccumulator, CLOSED)

andreped commented on May 30, 2024

Better compatibility with batch normalization

from gradientaccumulator.

Comments (4)

andreped commented on May 30, 2024

I implemented a custom batch norm layer recently, which was made available in v0.3.2.

Implementation can be seen here.

The plan is to add accumulation to the call step, similar to what is done in the Model wrapper.

The custom layer reaches results very similar to Keras' BN layer, but they should be identical before we can say that it is working as intended.
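
To make the idea of accumulating in the call step concrete, here is a minimal sketch in plain Python. It is not the gradientaccumulator implementation: `RunningBatchStats` and its methods are hypothetical names used only for illustration. It assumes the layer sums the first and second moments over `accum_steps` micro-batches and only then derives the mean and variance, which recovers the statistics of the full batch.

```python
from statistics import fmean, pvariance


class RunningBatchStats:
    """Hypothetical helper: accumulates moments across micro-batches."""

    def __init__(self):
        self.count = 0       # number of samples seen so far
        self.total = 0.0     # running sum of values
        self.total_sq = 0.0  # running sum of squared values

    def update(self, micro_batch):
        # Called once per micro-batch (one accumulation step).
        self.count += len(micro_batch)
        self.total += sum(micro_batch)
        self.total_sq += sum(x * x for x in micro_batch)

    def result(self):
        # Biased (population) variance, as batch normalization uses.
        mean = self.total / self.count
        var = self.total_sq / self.count - mean * mean
        return mean, var


# One feature, eight samples, split into accum_steps micro-batches.
full_batch = [1.0, 2.0, 4.0, 7.0, 3.0, 5.0, 8.0, 6.0]
accum_steps = 4
size = len(full_batch) // accum_steps
micro_batches = [full_batch[i:i + size] for i in range(0, len(full_batch), size)]

stats = RunningBatchStats()
for mb in micro_batches:
    stats.update(mb)
acc_mean, acc_var = stats.result()

print("accumulated:", acc_mean, acc_var)
print("full batch: ", fmean(full_batch), pvariance(full_batch))
```

Note that the sum-of-squares formula can lose precision for large values; a real layer would likely need a numerically safer update. The sketch also covers only the forward statistics, not how gradients flow through them.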

andreped commented on May 30, 2024

To add accumulation support to the custom BN layer, relevant resources are this, this, and this.

andreped commented on May 30, 2024

I have now added gradient accumulation support to the custom BN layer in 05fb499.

However, unit tests show that the results are not equivalent when increasing accum_steps. Hence, it is not yet working as intended.
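
One plausible reason for such a mismatch, offered as an assumption rather than a diagnosis of the actual tests, is that each micro-batch gets normalized with its own statistics instead of the statistics of the full batch. The plain-Python sketch below (all function names are hypothetical) compares the two for a single feature and prints the largest deviation as `accum_steps` grows.

```python
from statistics import fmean, pvariance

EPS = 1e-5  # small constant for numerical stability, as in batch normalization


def batch_norm(values):
    # Normalize with the mean and biased variance of the given values.
    mean = fmean(values)
    var = pvariance(values)
    return [(v - mean) / (var + EPS) ** 0.5 for v in values]


def micro_batch_norm(values, accum_steps):
    # Split into accum_steps micro-batches and normalize each one separately.
    size = len(values) // accum_steps
    out = []
    for i in range(0, len(values), size):
        out.extend(batch_norm(values[i:i + size]))
    return out


def max_abs_diff(a, b):
    return max(abs(x - y) for x, y in zip(a, b))


full_batch = [1.0, 2.0, 4.0, 7.0, 3.0, 5.0, 8.0, 6.0]
reference = batch_norm(full_batch)

for accum_steps in (1, 2, 4):
    result = micro_batch_norm(full_batch, accum_steps)
    print(accum_steps, max_abs_diff(reference, result))
```

With `accum_steps=1` the outputs match the reference exactly; for larger values they diverge, because smaller micro-batches give noisier estimates of the mean and variance.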

andreped commented on May 30, 2024

Support for gradient accumulation in batch normalization has been added. Even though it is not perfect, people can at least test it and improve it further.

I also updated the documentation on how to use it in f54e389.

We can open a separate issue to benchmark it properly. It might also be that opening a thread in the Discussions tab is the way to go. Hence, closing this issue.
