
Comments (4)

GMvandeVen commented on May 28, 2024

Sorry for the late reply, I hadn’t noticed that your comment had been edited. Yes, it seems I had misunderstood your initial question.

First, let me explain why, for the class-incremental learning scenario, we decided to always set the output-units of all classes seen so far to ‘active’. For each of the three scenarios, we always train the model on what it will be tested on later. In the class-incremental learning scenario, the model will need to learn to choose between all classes seen so far, and so that is what we train it on. Although I don’t think there is necessarily a ‘right’ way here, this seems to me the most logical approach.

That said, you are right that in practice, in certain circumstances, it is possible to somewhat boost the performance for the class-incremental learning scenario by only ever setting the output-units of classes in the current task to ‘active’. It is indeed interesting that this trick sometimes works, and I have recently played around with it as well. Essentially, I believe this trick depends on a training protocol with a (very) precise balance between the different tasks. In my opinion, this makes it questionable to what extent the difference between classes/tasks is really being ‘learned’ with this setup, for example because it is not robust against small variations in the training protocol (as exemplified by the large variance you mention).
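To make the two options concrete, here is a minimal sketch (not the repository's actual code) of how such a masked loss could be computed in PyTorch; the helper name `masked_cross_entropy` and the toy class indices are my own illustration. The only difference between the two setups discussed above is which indices go into `active_classes`.

```python
import torch
import torch.nn.functional as F

def masked_cross_entropy(logits, targets, active_classes):
    """Cross-entropy restricted to the 'active' output-units.

    Which indices `active_classes` contains is the design choice
    discussed above:
      - Class-IL as implemented here: all classes seen so far
      - the 'trick': only the classes of the current task
    """
    logits = logits[:, active_classes]  # drop inactive output-units
    # remap original labels to positions within the active subset
    remap = {c: i for i, c in enumerate(active_classes)}
    targets = torch.tensor([remap[int(t)] for t in targets])
    return F.cross_entropy(logits, targets)

# toy example: 10-unit output head; suppose task 2 covers classes 2 and 3
logits = torch.randn(4, 10)
targets = torch.tensor([2, 3, 2, 3])
loss_all_seen = masked_cross_entropy(logits, targets, active_classes=[0, 1, 2, 3])
loss_current = masked_cross_entropy(logits, targets, active_classes=[2, 3])
```

With only the current task active, samples from earlier classes never act as negatives for the current-task units, which is exactly why the balance between tasks becomes so delicate.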

from continual-learning.

GMvandeVen commented on May 28, 2024

Thanks for your question!
Firstly, I should note that for the Class-IL scenario, only the output-units corresponding to classes of future tasks (which thus have not yet been seen) are excluded. That means that for the Class-IL scenario, all classes seen so far are always 'active' and included. I now realise that the comments within the code are not very clear about this, apologies for that.
But it's still a good question why the output-units corresponding to classes of future tasks are excluded, especially as it is indeed true that in some cases always including all classes would lead to (somewhat) better performance. The main reason I decided against always including all classes is that it is only possible to do so when it is known a priori how many classes there are. Only including classes that have been seen so far is thus more general. (Although in the code the entire network--including all output-units--is generated at the start, this is just an implementation detail, as it is also possible to add new output-units 'on-the-fly' when a new class is encountered.) But if, for the problem you are interested in, the total number of classes is known beforehand, you could indeed consider training with the output-units of all classes always included.
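Since the full head is allocated up front, excluding future-task units amounts to slicing the logits up to the number of classes seen so far before computing the loss. A minimal sketch of that idea (the helper `loss_seen_so_far` and the splitMNIST-style numbers are illustrative, not the repository's code):

```python
import torch
import torch.nn.functional as F

def loss_seen_so_far(logits, targets, n_classes_seen):
    """Exclude output-units of future (not-yet-seen) classes by slicing.

    Assumes classes are introduced in index order, as in splitMNIST,
    so the first `n_classes_seen` units cover all classes seen so far.
    """
    return F.cross_entropy(logits[:, :n_classes_seen], targets)

# toy setup: 10 output-units allocated at the start, but during
# task 1 only classes 0 and 1 have been seen
logits = torch.randn(8, 10)
targets = torch.randint(0, 2, (8,))
loss = loss_seen_so_far(logits, targets, n_classes_seen=2)
```

Because the loss only ever touches the first `n_classes_seen` columns, the same computation works whether the remaining units pre-exist (as in the code) or are added on-the-fly when a new class appears.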


suanrong commented on May 28, 2024

Thanks for your reply.
However, I think the right way is to calculate the loss based on the current classes, instead of all the active classes.

In your code, for the previous tasks, all the samples are negatives; this kind of training harms the model severely and makes the accuracy drop to zero very fast.

If you train only on the current classes, the accuracy of the baseline becomes 15%-50%. The variance is large.
(splitMNIST, Class-IL)

I think the regularization-based methods would not perform that badly under this kind of training.


suanrong commented on May 28, 2024

Any response?

