
Comments (8)

emilysilcock avatar emilysilcock commented on May 27, 2024 1

At the moment I've written something simple that just copies the current classification model, which I'm using for now, so there's no particular timeframe on my end! This ties you to using the same hyperparameters when training the DAL classifier and the standard classifier, but that wasn't a particular problem for me:

import numpy as np
import small_text

class DiscriminativeActiveLearning_amended(small_text.query_strategies.strategies.DiscriminativeActiveLearning):

    # Amended to use the most recent topic classifier as per the DAL paper

    def _train_and_get_most_confident(self, ds, indices_unlabeled, indices_labeled, q):

        ### Amended: instead of creating a fresh classifier from the factory,
        ### reuse the most recently trained task classifier.
        # if self.clf_ is not None:
        #     del self.clf_
        # clf = self.classifier_factory.new()

        clf = active_learner._clf  # `active_learner` is the surrounding active learner (module-level reference)
        ###

        num_unlabeled = min(indices_labeled.shape[0] * self.unlabeled_factor,
                            indices_unlabeled.shape[0])

        indices_unlabeled_sub = np.random.choice(indices_unlabeled,
                                                 num_unlabeled,
                                                 replace=False)

        # Build the binary dataset: labeled instances vs. a subsample of the unlabeled pool
        ds_discr = DiscriminativeActiveLearning_amended.get_relabeled_copy(ds,
                                                                           indices_unlabeled_sub,
                                                                           indices_labeled)

        self.clf_ = clf.fit(ds_discr)

        proba = clf.predict_proba(ds[indices_unlabeled])
        proba = proba[:, self.LABEL_UNLABELED_POOL]

        # return instances which most likely belong to the "unlabeled" class (higher is better)
        return np.argpartition(-proba, q)[:q]
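The selection step at the end of the method can be illustrated in isolation. Here is a standalone sketch (the function name and data are illustrative, not part of small-text): given the predicted probabilities for the "unlabeled" class, `np.argpartition` finds the q instances most likely to belong to the unlabeled pool without fully sorting the array:

```python
import numpy as np

def most_confident_unlabeled(proba_unlabeled_class, q):
    """Return indices of the q highest probabilities (unordered within the top q)."""
    # argpartition on the negated probabilities puts the q largest values first
    return np.argpartition(-proba_unlabeled_class, q)[:q]

proba = np.array([0.1, 0.9, 0.4, 0.8, 0.2])
selected = most_confident_unlabeled(proba, 2)  # indices 1 and 3, in some order
```

Using `argpartition` instead of a full `argsort` keeps the selection O(n) rather than O(n log n), which matters for large unlabeled pools.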

from small-text.

chschroeder avatar chschroeder commented on May 27, 2024

Hi Emily,
Thank you for the kind feedback! I'm happy to hear that small-text is useful to you.

I am open to any extensions of the discriminative active learning implementation. The only thing I am trying to ensure is that the default parameters give you the (original) method described in the accompanying paper. It has been a while since I implemented this, but after a brief glance at both implementation and paper I would say that the current state matches the paper. Do you agree on this?

Regarding the extension: Thank you for the link to this blog post; I was unaware of it. If I understand your proposal correctly, the idea is to use the current classification model as a starting point for the (discriminative) binary model. Is this accurate? If so, then this should be quite easy to add. Do you have a specific use case in mind for applying this strategy? This might be a good opportunity to test whether the extended implementation works as intended.
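The warm-start idea can be sketched in a few lines, using hypothetical toy classes rather than the small-text API: the discriminative binary model starts from a copy of the current task classifier instead of being created fresh each round:

```python
import copy
import numpy as np

class TinyModel:
    """Hypothetical stand-in for a trained classifier (not a small-text class)."""
    def __init__(self, dim):
        self.weights = np.zeros(dim)

    def fit(self, X, y, epochs=1, lr=0.1):
        # Plain least-squares gradient steps, just to have something trainable
        for _ in range(epochs):
            self.weights += lr * X.T @ (y - X @ self.weights)
        return self

def new_discriminative_model(current_clf):
    # Warm start: copy the trained classifier instead of initializing fresh weights
    return copy.deepcopy(current_clf)

clf = TinyModel(dim=3)
clf.weights[:] = [1.0, -2.0, 0.5]        # pretend these were learned on the task
discr = new_discriminative_model(clf)     # inherits the learned parameters
```

The deep copy matters: the discriminative model is then fine-tuned on the labeled-vs-unlabeled task without disturbing the task classifier itself.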

Best regards
Christopher


emilysilcock avatar emilysilcock commented on May 27, 2024

Hi Christopher,

Thanks for the super quick reply!

My understanding of the DAL paper is that their default implementation is with this extension - though they don't go into a huge amount of detail, and I might have misinterpreted! In this paragraph below, they say that using the current classification model as a starting point for the (discriminative) binary model is important for performance.

[image: excerpt of the relevant paragraph from the DAL paper]

Thanks!
Emily


chschroeder avatar chschroeder commented on May 27, 2024

Yes, I think this part, while lacking detail, explains that you can either use the original representation $\mathcal{X}$ or the learned representation $\hat{\mathcal{X}}$, where the latter is reported to be more effective. Luckily, there seems to be an implementation by the original authors to answer the remaining questions. It seems my implementation matches the "basic" discriminative active learning that operates on the original representations, but this does not change the fact that the learned representation is likely better (and also what you want in this case).

One thing I forgot to consider is that the proposed extension's models should take vector representations as input, as opposed to the current dataset abstractions. While it is easy to build this for one specific model, it will be more difficult to implement it in a way that works for different model classes. This could be the reason that I stopped at the current implementation; unfortunately I cannot remember as quite some time has passed since then.
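As a sketch of what a representation-based variant might look like (all names here are illustrative, not small-text APIs): the relabeling step operates directly on embedding vectors, marking labeled instances as one class and a subsample of the unlabeled pool as the other:

```python
import numpy as np

def build_discriminative_dataset(embeddings, indices_labeled, indices_unlabeled,
                                 unlabeled_factor=10, rng=None):
    """Build the binary labeled-vs-unlabeled training set on vector representations.

    Hypothetical helper: class 1 = "labeled", class 0 = "unlabeled"."""
    rng = rng if rng is not None else np.random.default_rng()
    # Subsample the unlabeled pool, as in the DAL formulation
    num_unlabeled = min(len(indices_labeled) * unlabeled_factor, len(indices_unlabeled))
    indices_unlabeled_sub = rng.choice(indices_unlabeled, num_unlabeled, replace=False)
    X = np.vstack([embeddings[indices_labeled], embeddings[indices_unlabeled_sub]])
    y = np.concatenate([np.ones(len(indices_labeled), dtype=int),
                        np.zeros(num_unlabeled, dtype=int)])
    return X, y
```

Because the discriminator only ever sees fixed-size vectors, the same code works regardless of which model produced the embeddings, which sidesteps the dataset-abstraction issue for any model that can expose its representations.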

I can try to come up with something but it will likely not be this week. What is your time frame for this? Which model are you planning to use?


chschroeder avatar chschroeder commented on May 27, 2024

Quick update: I have very little time at the moment, but I have a first version of discriminative active learning on representations, specifically for transformer models, which should be considerably more efficient.

At the same time I have added automatic mixed precision in the 2.0.0 branch, which might be interesting for this strategy as well, considering I found the runtime to be the main drawback of discriminative active learning.

I have to properly finish this, and then I will run some sanity checks and runtime comparisons. If you or anyone else is interested, this could be turned into a blog post similar to the one above ;). Unfortunately, I have to stop before that.

