Comments (3)
Thanks for your comment! You’re right that the order in which the squaring and averaging operations are applied matters, exactly for the reason you point out. But the code in my repository does use the correct order, because the Fisher Information matrix is calculated with batches of size 1 (see here).
I admit this might not be the most efficient implementation (alternative suggestions are very welcome!), but the reason I chose it for now is that, as far as I’m aware, it is currently not possible in PyTorch to access the gradients of the individual elements of a sum.
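For concreteness, the correct order can be sketched as follows. This is a minimal illustration, not the repository’s actual code: the model, the data, and the use of the true labels (rather than labels sampled from the model’s output distribution) are all placeholder assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Placeholder model and data; in practice these come from the experiment.
model = nn.Linear(4, 3)
data = [(torch.randn(1, 4), torch.randint(0, 3, (1,))) for _ in range(8)]

# Diagonal Fisher estimate: one backward pass per sample, so each
# per-sample gradient is squared *before* the average is taken.
fisher = {n: torch.zeros_like(p) for n, p in model.named_parameters()}
for x, y in data:
    model.zero_grad()
    log_probs = F.log_softmax(model(x), dim=1)
    loss = F.nll_loss(log_probs, y)  # NLL of this single sample
    loss.backward()
    for n, p in model.named_parameters():
        fisher[n] += p.grad.detach() ** 2   # square first ...
fisher = {n: f / len(data) for n, f in fisher.items()}  # ... then average
```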
from continual-learning.
I see. Estimating the Fisher Information matrix with a single example seems to result in high variance (see this notebook for an example). Have you seen any difference when a larger batch size is used? As you said, I don't think it's possible to access individual per-sample gradients with just one backward call.
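The reason a single backward pass over a batch is not enough can be seen in a toy example (an assumed setup, not taken from the repository or the notebook): for per-sample losses `loss_i = w * x_i`, the gradient of each with respect to `w` is `x_i`, so squaring the batch-averaged gradient yields `(mean x)^2`, while the Fisher needs the average of the squared per-sample gradients, `mean(x^2)`; the two differ by exactly the variance of the per-sample gradients.

```python
import torch

# Toy per-sample gradients: for loss_i = w * x_i, d(loss_i)/dw = x_i.
x = torch.tensor([1.0, -1.0, 3.0, -3.0])
w = torch.tensor(1.0, requires_grad=True)

# One backward pass over the whole batch gives the gradient of the
# *mean* loss, i.e. the average of the per-sample gradients.
loss = (w * x).mean()
loss.backward()
sq_of_mean = w.grad ** 2          # (E[g])^2 -- what batching would give

mean_of_sq = (x ** 2).mean()      # E[g^2]   -- what the Fisher needs
print(sq_of_mean.item(), mean_of_sq.item())  # 0.0 vs 5.0
```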
from continual-learning.
Sorry, I should have been clearer; I realise my use of ‘batches’ is a bit confusing here. What I meant is that the backward passes(*) are done one by one for each sample used to calculate the Fisher Information matrix. The number of samples used to calculate the Fisher Information matrix is typically not 1 (it is set by the option --fisher-n; the default is to use the full training set).
(*) Actually the forward passes too; I now realise this could be made more efficient by at least performing the forward passes with larger ‘batches’. I’ll look into that when I get some time.
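A hedged sketch of that optimisation, with hypothetical model and data: the forward pass runs once over the whole batch, while the backward passes still run per sample (so gradients are still squared before averaging). `retain_graph=True` keeps the shared computation graph alive across the repeated backward calls.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Placeholder model and batch; in practice these come from the experiment.
model = nn.Linear(4, 3)
x = torch.randn(8, 4)
y = torch.randint(0, 3, (8,))

log_probs = F.log_softmax(model(x), dim=1)              # one forward pass
per_sample_nll = F.nll_loss(log_probs, y, reduction='none')

fisher = {n: torch.zeros_like(p) for n, p in model.named_parameters()}
for i in range(len(x)):
    model.zero_grad()
    # Backward pass per sample; retain the graph for the next iteration.
    per_sample_nll[i].backward(retain_graph=True)
    for n, p in model.named_parameters():
        fisher[n] += p.grad.detach() ** 2
fisher = {n: f / len(x) for n, f in fisher.items()}
```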
from continual-learning.