Comments (4)
Sorry for the late reply, I hadn’t noticed that your comment had been edited. Yes, it seems I had misunderstood your initial question.
First, let me explain why, for the class-incremental learning scenario, we decided to always set the output-units of all classes seen so far to ‘active’. For each of the three scenarios, we always train a model on what it will be tested on later. In the class-incremental learning scenario, the model will need to learn to choose between all classes seen so far, and so that is what we train it on. Although I don’t think there is necessarily a ‘right’ way here, this seems to me the most logical approach.
That said, you are right that in practice, in certain circumstances, it is possible to somewhat boost the performance for the class-incremental learning scenario by only setting the output-units of classes in the current task to ‘active’. It is indeed interesting that this trick sometimes works, and I have recently played around with it as well. Essentially, I believe this trick depends on a training protocol with a (very) precise balance between the different tasks. In my opinion, this makes it questionable to what extent the difference between classes/tasks is really being ‘learned’ with this setup, for example because it is not robust against small variations in the training protocol (as exemplified by the large variance you mention).
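To make the idea of ‘active’ output-units concrete, here is a minimal sketch (not the repository’s actual code; the helper name `masked_cross_entropy` and all values are hypothetical) of computing the cross-entropy loss only over a chosen set of active classes, by masking out the logits of all other output-units before the softmax:

```python
import torch
import torch.nn.functional as F

def masked_cross_entropy(logits, targets, active_classes):
    """Cross-entropy computed only over the 'active' output units.

    `active_classes` lists the class indices included in the loss; the
    logits of all other classes are set to -inf, so they get zero
    probability in the softmax and are effectively excluded.
    """
    mask = torch.full_like(logits, float('-inf'))
    mask[:, active_classes] = 0.0
    return F.cross_entropy(logits + mask, targets)

# In Class-IL, after task t the active set is all classes seen so far.
logits = torch.randn(4, 10)            # e.g. splitMNIST: 10 output units
targets = torch.tensor([0, 1, 2, 3])   # labels from the classes seen so far
loss = masked_cross_entropy(logits, targets, active_classes=[0, 1, 2, 3])
```

Setting the excluded logits to `-inf` rather than deleting them keeps the output layer's shape fixed while still removing the excluded classes from the normalisation.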
from continual-learning.
Thanks for your question!
Firstly, I should note that for the Class-IL scenario, only the output-units corresponding to classes of future tasks (that is, classes which have not yet been seen) are excluded. This means that in the Class-IL scenario, all classes seen so far are always 'active' and included. I now realise that the comments within the code are not very clear about this; apologies for that.
But it's still a good question why the output-units corresponding to classes of future tasks are excluded, especially as it is indeed true that in some cases always including all classes would lead to (somewhat) better performance. The main reason I decided against always including all classes is that doing so is only possible when it is a priori known how many classes there are. Only including classes that have been seen so far is thus more general. (Although in the code the entire network--including all output-units--is generated at the start, this is just an implementation issue, as it is also possible to add new output-units 'on-the-fly' when a new class is encountered.) But if, for the problem you are interested in, the total number of classes is known beforehand, you could indeed consider training with the output-units of all classes always included.
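The point about adding output-units 'on-the-fly' can be sketched as follows. This is an illustrative toy (not the repository's implementation; the class name `GrowingClassifier` is made up): a classifier whose final linear layer is expanded when a new task introduces unseen classes, copying over the existing weights so the total number of classes never needs to be known in advance.

```python
import torch
import torch.nn as nn

class GrowingClassifier(nn.Module):
    """Classifier whose output layer grows when new classes appear."""

    def __init__(self, in_features, n_initial_classes):
        super().__init__()
        self.out = nn.Linear(in_features, n_initial_classes)

    def add_classes(self, n_new):
        """Replace the output layer with a wider one, keeping old weights."""
        old = self.out
        new = nn.Linear(old.in_features, old.out_features + n_new)
        with torch.no_grad():
            new.weight[:old.out_features] = old.weight
            new.bias[:old.out_features] = old.bias
        self.out = new

    def forward(self, x):
        return self.out(x)

clf = GrowingClassifier(in_features=32, n_initial_classes=2)
clf.add_classes(2)   # a new task with 2 unseen classes arrives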
Thanks for your reply.
However, I think the right way is to calculate the loss based only on the current classes, instead of on all the active classes.
In your code, for the classes of previous tasks, all the current samples act as negative examples; this kind of training severely harms the model and makes the accuracy drop to zero very fast.
If you train only on the current classes, the accuracy of the baseline becomes 15%~50%. The variance is large.
(splitMNIST, Class-IL)
I think regularization-based methods would not perform that badly under this kind of training.
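The two masking strategies under discussion can be contrasted in a short sketch (illustrative only; the helper names `active_units` and `task_loss` are hypothetical, assuming splitMNIST's usual layout of two classes per task, 0-indexed):

```python
import torch
import torch.nn.functional as F

def active_units(task, scheme):
    """Which output units are 'active' when training on a given task."""
    if scheme == 'current-task-only':   # the commenter's proposal
        return list(range(2 * task, 2 * task + 2))
    elif scheme == 'all-seen-so-far':   # the repository's choice
        return list(range(0, 2 * task + 2))

def task_loss(logits, targets, active):
    """Cross-entropy restricted to the active output units."""
    mask = torch.full_like(logits, float('-inf'))
    mask[:, active] = 0.0
    return F.cross_entropy(logits + mask, targets)

logits = torch.randn(4, 10)
targets = torch.tensor([4, 5, 4, 5])   # samples from task 2 (classes 4-5)
loss_current = task_loss(logits, targets, active_units(2, 'current-task-only'))
loss_all = task_loss(logits, targets, active_units(2, 'all-seen-so-far'))
```

Under 'all-seen-so-far', samples of the current task also push down the logits of previous tasks' classes, which is the effect the comment above attributes the accuracy collapse to; under 'current-task-only', previous classes are left untouched by the current task's loss.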
Any response?