Comments (10)
Great, thanks! It is certainly not added. I'll take a look at the paper as soon as possible!
from modal.
Thanks for the PR! Fixed by #29.
Hmm... This issue and PR #29 actually are addressing different problems.
This issue is about the best cold start instance not being added to the first batch (in ranked_batch), while PR #29 addresses the incorrect computation of the instance index (in select_instance).
I haven't started a pull request for this issue since solving it very likely requires changing the API of select_cold_start_instance.
If needed, I can start a PR for this issue later today.
Sorry, remembered the issue wrong and didn't read the post again before commenting. Issue reopened! :)
In any case, no need to rush with the PR! I probably won't have time to work on it this week, so any help is appreciated!
Hi, I have opened PR #30 for this issue.
By the way, I think it would be great if we could compose the cold start handling mechanism that currently works for ranked batch sampling (and possibly other cold start handling strategies in the future) with the other active sampling strategies supported by modAL.
Alright, thanks for the PR! I finally had the time to review and merge it. Currently, some cold start handling is implemented for the utility measure functions, but it only checks whether the estimator has been fitted yet, and if not, it returns a zero array. Implementing the same density-based cold start criteria for a general query function is a good idea.
I see. I will take a look at how to integrate cold start handling mechanisms.
One thing that I have been thinking about is whether it is better to pass the cold start function to the query strategy functions or to the Learner when it is initialized. It is logically sounder to pass cold start criteria to a query strategy, since "cold start criteria" are part of the "query strategy", while in implementation it seems much easier to do it the other way. If we pass the cold start criteria to the Learner, it seems that we only need to change the Learner.query method to support cold start handling for all the query strategies. By comparison, if the cold start criteria are to be passed to the query strategy functions, all the query strategy functions may need to be revised.
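To make the two options concrete, here is a minimal sketch. Everything in it is hypothetical and for illustration only: `SketchLearner`, `density_cold_start`, and `farthest_from_origin` are toy stand-ins, not modAL's actual classes or functions, and the `is_fitted` flag is an assumed way of detecting a cold start.

```python
from functools import partial

import numpy as np


def density_cold_start(X_pool):
    """Hypothetical cold start criterion: the point with the smallest
    mean Euclidean distance to every point in the pool."""
    diffs = X_pool[:, None, :] - X_pool[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)
    return int(np.argmin(dists.mean(axis=1)))


def farthest_from_origin(learner, X_pool, cold_start=None):
    """Toy query strategy that can optionally handle the cold start itself."""
    if not learner.is_fitted and cold_start is not None:
        return cold_start(X_pool)
    return int(np.argmax(np.linalg.norm(X_pool, axis=1)))


class SketchLearner:
    """Toy learner (not modAL's ActiveLearner)."""

    def __init__(self, query_strategy, cold_start=None):
        self.query_strategy = query_strategy
        self.cold_start = cold_start
        self.is_fitted = False  # assumed flag; set to True after fitting

    def query(self, X_pool):
        # Option A: the learner intercepts the cold start once, for every strategy.
        if not self.is_fitted and self.cold_start is not None:
            return self.cold_start(X_pool)
        return self.query_strategy(self, X_pool)


# Option A: cold start criterion passed to the learner.
learner_a = SketchLearner(farthest_from_origin, cold_start=density_cold_start)

# Option B: cold start criterion bound to the query strategy itself.
learner_b = SketchLearner(
    partial(farthest_from_origin, cold_start=density_cold_start)
)
```

With option A only `query` changes; with option B each strategy function needs the extra `cold_start` branch, which is the revision cost mentioned above.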
Thanks.
I agree completely. I think it is better if the cold start strategy is passed to the query strategy, even if all query strategy functions need to be modified.
In connection with this, I also plan to refactor the query strategy functions. If you check the code, for instance here, the implementations of the uncertainty_sampling, margin_sampling, and entropy_sampling functions are almost identical, aside from the function they call for calculating the utility. This can be solved with a function factory or some other construct. The only reason I implemented it this way is that I wanted to avoid adding docstrings one by one later. Do you have any idea what might work well for this? We might kill two birds with one stone, because this would also solve the problem you outlined.
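A function factory along these lines could look like the sketch below. It is an illustration under assumptions, not modAL's code: the scorers operate directly on a class probability matrix, the classifier is only assumed to expose `predict_proba`, and `make_query_strategy` is a hypothetical name. The factory also fills in `__name__` and `__doc__`, which speaks to the docstring concern.

```python
import numpy as np


def uncertainty(proba):
    """1 minus the highest class probability per row."""
    return 1.0 - proba.max(axis=1)


def negative_margin(proba):
    """Negated gap between the two most likely classes, so higher means more useful."""
    ordered = np.sort(proba, axis=1)
    return -(ordered[:, -1] - ordered[:, -2])


def entropy(proba):
    """Shannon entropy of each row; probabilities are clipped to avoid log(0)."""
    p = np.clip(proba, 1e-12, 1.0)
    return -(p * np.log(p)).sum(axis=1)


def make_query_strategy(scorer, name):
    """Build a query strategy that differs from its siblings only in the scorer."""

    def strategy(classifier, X, n_instances=1):
        scores = scorer(classifier.predict_proba(X))
        # Indices of the highest-scoring rows, best first.
        return np.argsort(-scores, kind="stable")[:n_instances]

    strategy.__name__ = name
    strategy.__doc__ = (
        f"Return the indices of the n_instances rows of X with the "
        f"highest {scorer.__name__} score."
    )
    return strategy


uncertainty_sampling = make_query_strategy(uncertainty, "uncertainty_sampling")
margin_sampling = make_query_strategy(negative_margin, "margin_sampling")
entropy_sampling = make_query_strategy(entropy, "entropy_sampling")
```

A cold start handler could be added as one more argument to the factory, so that every generated strategy picks it up in a single place.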
Hmm... I don't have better ideas than using a function factory.
A possible alternative is to lift the query strategies from functions to instances of a QueryStrategy class. Different instances of this QueryStrategy class can have different scorers (e.g., classifier_entropy) and cold start handlers. This solution doesn't seem to have a clear advantage over the function factory solution.
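For comparison, a rough sketch of the class-based variant. The `QueryStrategy` class here is hypothetical and does not exist in modAL as far as this thread shows; the `fitted` flag, `least_confidence`, and `first_instance` are placeholders chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Optional

import numpy as np


def least_confidence(classifier, X):
    """Example scorer: 1 minus the highest predicted class probability."""
    return 1.0 - classifier.predict_proba(X).max(axis=1)


def first_instance(X):
    """Placeholder cold start handler: always pick the first row of the pool."""
    return 0


@dataclass
class QueryStrategy:
    """Hypothetical callable strategy that bundles a scorer with a cold start handler."""

    scorer: Callable
    cold_start: Optional[Callable] = None

    def __call__(self, classifier, X, fitted=True):
        # `fitted` is an assumed flag supplied by the caller.
        if not fitted and self.cold_start is not None:
            return np.array([self.cold_start(X)])
        scores = self.scorer(classifier, X)
        return np.array([int(np.argmax(scores))])


strategy = QueryStrategy(scorer=least_confidence, cold_start=first_instance)
```

Since instances are callable, they can be used wherever a plain strategy function is expected, which is why the two designs end up roughly equivalent.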