Comments (8)
At the moment I've written something simple that just copies the current classification model, which I'm using for now, so there's no particular timeframe on my end! This does tie you to using the same hyperparameters when training the DAL classifier and the standard classifier, but that wasn't a particular problem for me:
import numpy as np
import small_text

class DiscriminativeActiveLearning_amended(small_text.query_strategies.strategies.DiscriminativeActiveLearning):
    # Amended to use the most recent topic classifier as per the DAL paper

    def _train_and_get_most_confident(self, ds, indices_unlabeled, indices_labeled, q):
        # Original behaviour trained a fresh binary model each iteration:
        #   if self.clf_ is not None:
        #       del self.clf_
        #   clf = self.classifier_factory.new()
        # Instead, reuse the current classifier (active_learner is the
        # active learner instance defined elsewhere in my script):
        clf = active_learner._clf

        num_unlabeled = min(indices_labeled.shape[0] * self.unlabeled_factor,
                            indices_unlabeled.shape[0])
        indices_unlabeled_sub = np.random.choice(indices_unlabeled,
                                                 num_unlabeled,
                                                 replace=False)
        ds_discr = DiscriminativeActiveLearning_amended.get_relabeled_copy(
            ds, indices_unlabeled_sub, indices_labeled)

        self.clf_ = clf.fit(ds_discr)

        proba = clf.predict_proba(ds[indices_unlabeled])
        proba = proba[:, self.LABEL_UNLABELED_POOL]

        # Return the instances most likely to belong to the "unlabeled" class
        # (higher probability is better).
        return np.argpartition(-proba, q)[:q]
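For anyone reading along, the final selection step is just a partial top-q sort over the discriminator's P(unlabeled) scores. A tiny self-contained illustration (toy numbers, not small-text code):

import numpy as np

# Toy illustration of the selection step: given P(unlabeled) for each
# candidate, pick the q instances the discriminative model is most
# confident are still "unlabeled" (higher probability is better).
proba = np.array([0.10, 0.95, 0.40, 0.80, 0.05, 0.60])
q = 2

# argpartition on the negated scores places the q largest values in the
# first q positions; their relative order is unspecified, which is fine
# for querying.
selected = np.argpartition(-proba, q)[:q]
print(sorted(selected.tolist()))  # → [1, 3]

argpartition runs in linear time, which matters when scoring the whole unlabeled pool each iteration.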
from small-text.
Hi Emily,
Thank you for the kind feedback! I'm happy to hear that small-text is useful to you.
I am open to any extensions of the discriminative active learning implementation. The only thing I am trying to ensure is that the default parameters give you the (original) method described in the accompanying paper. It has been a while since I implemented this, but after a brief glance at both the implementation and the paper, I would say that the current state matches the paper. Do you agree?
Regarding the extension: Thank you for the link to this blog post; I was unaware of this. If I understand your proposal correctly, the idea is to use the current classification model as a starting point for the (discriminative) binary model. Is this accurate? If so, this should be quite easy to add. Do you have a specific use case in mind for applying this strategy? This might be a good opportunity to test whether the extended implementation works as intended.
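To make the proposal concrete, here is a minimal sketch of the warm-start idea (names are illustrative, not the small-text API): rather than creating a fresh binary model from the factory on each DAL iteration, deep-copy the already-trained topic classifier and repurpose it as the labeled-vs-unlabeled discriminator.

import copy

def make_discriminative_model(current_classifier, fresh_factory, warm_start=True):
    # Illustrative only: choose between the original DAL behaviour
    # (a fresh binary model per iteration) and the proposed extension
    # (warm-starting from the current topic classifier).
    if warm_start:
        # Copy so that fine-tuning the discriminator does not mutate
        # the classifier used for the actual classification task.
        return copy.deepcopy(current_classifier)
    # Original behaviour: train the discriminator from scratch.
    return fresh_factory()

The deep copy is the important detail: without it, fitting the binary discriminator would overwrite the weights of the task classifier.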
Best regards
Christopher
Hi Christopher,
Thanks for the super quick reply!
My understanding of the DAL paper is that their default implementation includes this extension - though they don't go into a huge amount of detail, and I might have misinterpreted it! In the paragraph below, they say that using the current classification model as a starting point for the (discriminative) binary model is important for performance.
Thanks!
Emily
Yes, I think this part, while lacking detail, explains that you can either use the original representation
One thing I forgot to consider is that the proposed extension's models would need to take vector representations as input, as opposed to the current dataset abstractions. While it is easy to build this for one specific model, it will be more difficult to implement it in a way that works across different model classes. This could be the reason I stopped at the current implementation; unfortunately I cannot remember, as quite some time has passed since then.
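One way to sidestep the dataset-abstraction issue (a hypothetical sketch, not part of small-text): embed all instances once with the current model, then train any lightweight binary classifier directly on those fixed vectors. Here a tiny logistic regression in plain NumPy stands in for the discriminator; in practice the embeddings would come from the transformer's embed() step.

import numpy as np

def train_discriminator(X, y, lr=0.5, epochs=2000):
    # Tiny logistic regression on precomputed embeddings X of shape (n, d),
    # with y = 1 for the "unlabeled" pool and y = 0 for the "labeled" pool.
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))   # P(unlabeled)
        w -= lr * (X.T @ (p - y)) / len(y)       # gradient step on weights
        b -= lr * np.mean(p - y)                 # gradient step on bias
    return w, b

def predict_proba_unlabeled(X, w, b):
    # Probability that each instance still belongs to the unlabeled pool.
    return 1.0 / (1.0 + np.exp(-(X @ w + b)))

Because the embeddings are computed once per iteration, the per-iteration discriminator training becomes cheap and entirely model-agnostic.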
I can try to come up with something but it will likely not be this week. What is your time frame for this? Which model are you planning to use?
Quick update: I have very little time at the moment, but I have a first version of discriminative active learning on representations, specifically for transformer models, which should be considerably more efficient.
At the same time, I have added automatic mixed precision in the 2.0.0 branch, which might be interesting for this strategy as well, considering that I found the runtime to be the main drawback of discriminative active learning.
I still have to properly finish this, and then I will run some sanity checks and runtime comparisons. If you or anyone else is interested, this could be turned into a blog post similar to the one above ;). Unfortunately, I have to stop before that.