Comments (9)

GMvandeVen commented on May 28, 2024

Hi, thanks for your interest in the code! To reproduce the BI-R + SI results reported in the papers "Brain-inspired replay for continual learning with artificial neural networks" and "Class-incremental learning with generative classifiers", it is best not to use this repository, but rather the repository accompanying the brain-inspired replay paper (https://github.com/GMvandeVen/brain-inspired-replay). In that repository, the command python main_cl.py --experiment=CIFAR100 --scenario=class --brain-inspired --replay=generative --si --dg-prop=0.6 --c=100000000 can be used to run the same BI-R + SI experiment on the class-incremental version of Split-CIFAR100 as reported in the above two papers.

In this repository (https://github.com/GMvandeVen/continual-learning) a few things are implemented slightly differently than in the brain-inspired replay paper. For example, the default in this repository is to run class-incremental learning experiments with an output layer in which the output units of all classes are always set to active, while the brain-inspired replay paper used an "expanding head". (See for example the explanation under the header "BI-R" in the methods section (top of p.14) of this paper.) In the paper accompanying this repository I only tested BI-R by itself, not combined with SI. I expect the second experiment you describe fails because using SI with a very high regularization strength might be problematic in a class-incremental learning experiment in which all units are always set to active (while it is OK with an expanding head).
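To make the difference between the two output-layer conventions concrete, here is a minimal sketch (not either repository's actual code; the class counts, batch size, and variable names are made up for illustration):

```python
import torch
import torch.nn.functional as F

# Suppose the network has output units for all 100 classes, and we are
# training on task 3 of 10 (classes 0-29 seen so far, 10 classes per task).
logits = torch.randn(32, 100)             # batch of 32, one logit per class
targets = torch.randint(0, 30, (32,))     # labels from the classes seen so far

# Convention 1 (default in this repository): all output units active.
# The cross-entropy is computed over all 100 classes from the start,
# including units for classes the network has never seen.
loss_all_active = F.cross_entropy(logits, targets)

# Convention 2 ("expanding head", used in the BI-R paper): only the units
# of the classes seen so far take part in the loss, so the effective
# output layer grows as new tasks arrive.
active_classes = list(range(30))
loss_expanding = F.cross_entropy(logits[:, active_classes], targets)
```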

Hope this helps!

valeriya-khan commented on May 28, 2024

Thank you very much for such a detailed answer :)
Can I ask why you excluded BI-R + SI from the results table of the new paper? If I am not mistaken, it gave one of the best results, especially among the generative methods.
Also, I tried to reproduce BI-R + SI in the new repository as well, using the feature extractor given in the repository and the argument --active-classes="all-so-far". But the results differ a lot. What could be the reason? Was anything else changed that is not mentioned in your paper?
Thank you very much for your time :)

GMvandeVen commented on May 28, 2024

Regarding the first part of your comment, that's a good question. In this new paper (although its preprint is older than the brain-inspired replay paper) I didn't include BI-R + SI in the comparison because the goal of the experiments is not to verify or champion a method as achieving state-of-the-art performance, but rather to compare the performance of different computational strategies for continual learning, and to do so on each of the three continual learning scenarios. To this end I tried to select a few representative example methods for each strategy. As BI-R + SI combines two of those strategies, it wasn't suitable for this comparison. (But if you are interested in doing as well as possible on some continual learning problem, it might indeed often be best to combine multiple strategies.)

Regarding the second part of your comment, it seems you are right; thank you for pointing this out. It does appear that, even with the argument --active-classes="all-so-far", the performance of BI-R + SI with the code in this repository is somewhat lower than with the code in the repository of the brain-inspired replay paper. I will try to figure out what is causing this difference!

valeriya-khan commented on May 28, 2024

Thank you very much for the answer! Can I leave this issue open as a means of communication? If you find what causes the difference, I would be happy to hear from you :)

GMvandeVen commented on May 28, 2024

Yes, please leave the issue open. I intend to get back to you on this when I figure it out!

GMvandeVen commented on May 28, 2024

Hi, I found one difference in the implementation of BI-R + SI between this repository and the repository of the BI-R paper, which seems to explain at least most of the difference in the results you got. In the repository of the BI-R paper, the method SI is only applied to the layers of the classifier (so not to the layers of the decoder network), while in this repository SI is by default applied to all layers of the network (so also to the layers of the decoder network).
The lines in the repository of the BI-R paper where this is specified are here: https://github.com/GMvandeVen/brain-inspired-replay/blob/1a030f75666c656416e1ca02466758ca32cf2fe4/train.py#L296-L318
To mimic this behavior in this repository, you could replace the following line:

self.param_list = [self.named_parameters]  #-> lists the parameters to regularize with SI or diagonal Fisher
                                           #   (default is to apply it to all parameters of the network)

with:

self.param_list = [self.convE.named_parameters, self.fcE.named_parameters, self.classifier.named_parameters]
Note that BI-R + SI can also work quite well when SI is applied to all layers of the network, but that setting has different optimal hyperparameter values (in particular, --dg-prop should be lower).
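For intuition, here is a minimal sketch of how such a param_list of bound named_parameters methods could be consumed when computing the SI penalty. This is not the repository's actual code: the buffer names ending in _SI_omega and _SI_old_param are hypothetical stand-ins for the stored importance estimates and reference parameter values.

```python
import torch

def si_regularization_loss(model):
    # Sum the quadratic SI penalty over every parameter group in param_list.
    # Because param_list holds bound named_parameters methods, restricting it
    # to e.g. the classifier means the decoder's parameters are simply never
    # visited, and therefore never regularized.
    loss = torch.tensor(0.)
    for named_params in model.param_list:
        for name, p in named_params():
            if p.requires_grad:
                n = name.replace('.', '_')
                # omega: per-parameter importance estimate; old_param: value
                # after the previous task (both hypothetical buffer names).
                omega = getattr(model, f'{n}_SI_omega')
                old_p = getattr(model, f'{n}_SI_old_param')
                loss = loss + (omega * (p - old_p) ** 2).sum()
    return loss
```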
Hope this helps!

valeriya-khan commented on May 28, 2024

Hi! Thank you very much for your help. I was now able to obtain 30% accuracy, which is much better than before. If it is not too difficult, can you tell me what could cause the remaining 2-5% difference between the brain-inspired repository and this repository, even with the all-so-far option? Are there any other implementation differences? Thank you for your help :)

GMvandeVen commented on May 28, 2024

Hi, there are quite a few other differences between the code in this repository and the other repository, but I haven't been able to figure out which of them could cause a difference in performance when combining BI-R and SI. For example, one quite large difference is that, when a fixed feature extractor is used (i.e., the option --freeze-convE), in this repository all data are put through the feature extractor once at the beginning (which speeds things up considerably), while in the other repository the data are put through the feature extractor every time they are presented to the network. In principle I don't think this difference should lead to a difference in performance, but perhaps for some reason it does.
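As an illustration, here is a minimal sketch of the two ways of using a frozen feature extractor; the names convE, classifier, and loader are placeholders, not the repositories' actual code:

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def precompute_features(convE, loader):
    """Variant 1 (this repository): push all data through the frozen
    feature extractor once, then train on the cached features."""
    feats, labels = [], []
    for x, y in loader:
        feats.append(convE(x))
        labels.append(y)
    return torch.cat(feats), torch.cat(labels)

def train_step_on_the_fly(convE, classifier, x, y, optimizer):
    """Variant 2 (BI-R repository): recompute features every time a batch
    is presented; convE is frozen, so gradients stop at its output."""
    with torch.no_grad():
        h = convE(x)
    loss = F.cross_entropy(classifier(h), y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

In exact arithmetic both variants see identical features, which is why a performance difference is surprising; in practice they can differ in, for example, how data augmentation interacts with the cached features.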

If it is important to replicate the performance reported in the brain-inspired replay paper, my suggestion would be to use the original repository accompanying that paper (https://github.com/GMvandeVen/brain-inspired-replay). Otherwise it should be fine to use this repository.

valeriya-khan commented on May 28, 2024

Thank you very much for your explanations. I want to replicate the results here, as I like that this repository includes other methods in addition to the generative and regularization ones. Thank you very much for your help. I will close this issue; if you remember something else, please reopen it, or write to my email: [email protected]. Have a nice day!
