Comments (9)
Hi, thanks for your interest in the code! To reproduce the results for BI-R + SI as reported in the papers "Brain-inspired replay for continual learning with artificial neural networks" and "Class-incremental learning with generative classifiers", it is best to use not this repository but the repository accompanying the brain-inspired replay paper (https://github.com/GMvandeVen/brain-inspired-replay). In that repository, the command

`python main_cl.py --experiment=CIFAR100 --scenario=class --brain-inspired --replay=generative --si --dg-prop=0.6 --c=100000000`

can be used to run the same BI-R + SI experiment on the class-incremental version of Split-CIFAR100 as reported in the above two papers.
In this repository (https://github.com/GMvandeVen/continual-learning), a few things are implemented slightly differently compared to the brain-inspired replay paper. For example, the default in this repository is to run class-incremental learning experiments with an output layer in which the output units of all classes are always active, while the brain-inspired replay paper used an "expanding head". (See for example the explanation under the header "BI-R" in the methods section, top of p.14, of that paper.) In the paper accompanying this repository I only tested BI-R by itself, not combined with SI. I expect the second experiment you describe fails because using SI with a very high regularization strength can be problematic in a class-incremental learning experiment in which all units are always active (while it is fine with an expanding head).
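For intuition, the difference between keeping all output units active and using an "expanding head" can be sketched as follows. This is an illustrative toy example, not code from either repository: the function name and list-based "logits" are made up for the sketch.

```python
import math

def masked_logits(logits, classes_seen, expanding_head=True):
    """Return logits with the output units of unseen classes disabled.

    logits: list of per-class scores from the output layer
    classes_seen: number of classes encountered so far
    """
    if not expanding_head:
        return list(logits)  # all output units always active
    # expanding head: units of not-yet-seen classes can never be predicted
    return [score if i < classes_seen else -math.inf
            for i, score in enumerate(logits)]

scores = [0.1, 0.4, 0.9]     # unit 2 belongs to a not-yet-seen class
full = masked_logits(scores, 2, expanding_head=False)
head = masked_logits(scores, 2, expanding_head=True)
print(full.index(max(full)))  # all units active: predicts class 2
print(head.index(max(head)))  # expanding head: predicts class 1
```

With all units active, the not-yet-seen class can still attract probability mass and gradient, which is one way a strong regularizer can interact badly with the output layer.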
Hope this helps!
from continual-learning.
Thank you very much for such a detailed answer :)
Can I ask why you excluded BI-R + SI from the results table of the new paper? If I am not mistaken, it gave one of the best results, especially among the generative methods.
Also, I tried to reproduce BI-R + SI with the new repository too, using the feature extractor provided in the repository and the argument --active-classes="all-so-far", but the results differ a lot. What could be the reason? Was something else changed that is not mentioned in your paper?
Thank you very much for your time :)
Regarding the first part of your comment, that's a good question. In this new paper (although the preprint of this new paper is older than the paper on brain-inspired replay) I didn't include BI-R + SI in the comparison because the goal of the experiments is not to verify or champion a method as achieving state-of-the-art performance, but to compare the performance of different computational strategies for continual learning, and to do that on each of the three continual learning scenarios. To this end I tried to select a few representative example methods for each strategy. As the approach BI-R + SI combines two of those strategies, it wasn't suitable for this comparison. (But if you are interested in doing as well as possible on some continual learning problem, it might indeed often be best to combine multiple strategies.)
Regarding the second part of your comment, it seems you are right. Thank you for pointing this out. It indeed seems to be the case that, also when using the argument --active-classes="all-so-far", the performance of BI-R + SI with the code in this repository is somewhat lower than with the code in the repository of the brain-inspired replay paper. I will try to figure out what is causing this difference!
Thank you very much for the answer! Can I leave this issue open as a means of communication? If you find out what causes the difference, I would be happy to hear from you :)
Yes, please leave the issue open. I'm intending to get back on this when I figure it out!
Hi, I found one difference in the implementation of BI-R + SI between this repository and the repository of the BI-R paper, which seems to explain at least most of the difference in results you got. In the repository of the BI-R paper, the method SI is only applied to the layers of the classifier (so not to the layers of the decoder network), while in this repository SI is by default applied to all layers of the network (so also to the layers of the decoder network).
The lines in the repository of the BI-R paper where this is specified are here: https://github.com/GMvandeVen/brain-inspired-replay/blob/1a030f75666c656416e1ca02466758ca32cf2fe4/train.py#L296-L318
To mimic this behavior in this repository, you could replace the line defining `self.param_list` in `continual-learning/models/cl/continual_learner.py` (lines 19 to 20, at commit b4bd69a) by:

```python
self.param_list = [self.convE.named_parameters, self.fcE.named_parameters, self.classifier.named_parameters]
```
Note that the approach BI-R + SI can also work quite well when SI is applied to all layers of the network, but this setting has different optimal hyperparameter values (in particular, the hyperparameter --dg-prop should be lower). Hope this helps!
Hi! Thank you very much for your help. I was now able to obtain 30% accuracy, which is much better than before. If it is not too much trouble, can you tell me what could be the reason for the remaining 2-5% difference between the brain-inspired repository and this repository, even with the all-so-far option? Are there any other implementation differences? Thank you for your help :)
Hi, there are quite a few other differences between the code in this repository and the other repository, but I haven't been able to figure out which of them could cause a difference in performance when combining BI-R and SI. For example, one quite large difference is that, when a fixed feature extractor is used (i.e., with the option --freeze-convE), in this repository all data are put through the feature extractor once at the beginning (which speeds things up considerably), while in the other repository the data are put through the feature extractor every time they are presented to the network. In principle I don't think this difference should lead to a difference in performance, but perhaps for some reason it does.
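The precomputation idea can be sketched as follows. This is an illustrative toy example with made-up helper names, not the repository's actual code: because the extractor is frozen, computing features once up front yields exactly the same inputs to the rest of the network as recomputing them on every presentation.

```python
def frozen_extractor(x):
    # stands in for a fixed conv feature extractor that is never updated
    return [v * 2.0 for v in x]

dataset = [[1.0, 2.0], [3.0, 4.0]]

# Option A (this repository): extract features once, before training.
feature_cache = [frozen_extractor(x) for x in dataset]

# Option B (BI-R repository): extract features at every presentation.
def features_on_the_fly(i):
    return frozen_extractor(dataset[i])

# Because the extractor is frozen, both options feed the model the same
# features; option A just avoids repeating the forward pass each epoch.
assert all(feature_cache[i] == features_on_the_fly(i)
           for i in range(len(dataset)))
```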
If it is important to replicate the performance reported in the brain-inspired replay paper, my suggestion would be to use the original repository accompanying that paper (https://github.com/GMvandeVen/brain-inspired-replay). Otherwise it should be fine to use this repository.
Thank you very much for your explanations. I want to replicate the results here, as I like that this repository includes other methods in addition to the generative and regularization ones. Thank you very much for your help. I will close this issue; if you remember anything else, please reopen it or write to my email: [email protected]. Have a nice day!