Comments (3)
@prasanna08 do we have to "implement" a classifier? I thought we would retrieve the classifier (instance or class) from the registry and use that to train a classifier.
This reminds me of another important issue: parameter tuning. If the user chooses a scikit-learn classifier, they have the additional option of tuning its parameters, for example by using GridSearchCV to select the best model parameters. So there has to be one more service, say "parameter_tuning_service", which will use GridSearchCV for scikit-learn models and a custom cross-validation function for other models.
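For the non-scikit case, the custom cross-validation function could look roughly like the sketch below. Everything here is an assumption for illustration: the names `custom_parameter_search` and `k_fold_indices`, and the `train_fn` / `score_fn` callables, are hypothetical and not part of oppia-ml.

```python
from itertools import product


def k_fold_indices(n_samples, n_folds):
    """Yield (train_indices, test_indices) for each fold (hypothetical helper)."""
    for fold in range(n_folds):
        test = [i for i in range(n_samples) if i % n_folds == fold]
        train = [i for i in range(n_samples) if i % n_folds != fold]
        yield train, test


def custom_parameter_search(train_fn, score_fn, param_grid, data, n_folds=3):
    """Exhaustive grid search with k-fold cross-validation.

    Assumed callables (not an existing API):
      train_fn(train_data, **params) -> model
      score_fn(model, test_data) -> float, higher is better
    Returns (best_params, best_mean_score).
    """
    names = sorted(param_grid)
    best_params, best_score = None, float('-inf')
    # Try every combination of parameter values in the grid.
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        scores = []
        for train_idx, test_idx in k_fold_indices(len(data), n_folds):
            model = train_fn([data[i] for i in train_idx], **params)
            scores.append(score_fn(model, [data[i] for i in test_idx]))
        mean_score = sum(scores) / len(scores)
        if mean_score > best_score:
            best_params, best_score = params, mean_score
    return best_params, best_score
```

The service would then only need to decide which path to take: GridSearchCV for scikit-learn estimators, and something like the function above for everything else.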
from oppia-ml.
We don't have to implement a classifier like MLPClassifier, but we still have to implement a class which uses MLPClassifier and returns the trained classifier dict. So when I say "implement a classifier", I mean that we implement our own class on top of a third-party library's class. We can't always feed the training data in as it is; sometimes it is necessary to do some pre-processing and convert the input to the desired format.
Parameter tuning: this can all be covered in the classifier that the developer implements. It is up to you to decide how you want to process the inputs and whether you will use fixed parameters or some sort of tuning method.
So basically algorithm_registry will return an instance of the appropriate class, which takes the raw training data obtained from the job request. It is then the developer's responsibility to use third-party libraries in whatever way they want.
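A rough sketch of that flow, under stated assumptions: the class names, the `ALGORITHM_REGISTRY` dict, the `run_job` function, and the shape of the job request are all hypothetical, and a toy majority-label model stands in for a real third-party class such as MLPClassifier so the example stays dependency-free.

```python
from abc import ABC, abstractmethod


class BaseClassifier(ABC):
    """Hypothetical base class; each algorithm wraps a third-party model."""

    @abstractmethod
    def train(self, raw_training_data):
        """Pre-process the raw job data and fit the underlying model."""

    @abstractmethod
    def to_dict(self):
        """Return the trained classifier as a plain dict."""


class MajorityLabelClassifier(BaseClassifier):
    """Toy stand-in for a wrapper around e.g. MLPClassifier."""

    def __init__(self):
        self.majority_label = None

    def _preprocess(self, raw_training_data):
        # Convert raw job data into the format the model expects.
        # The {'answer': ..., 'label': ...} item shape is an assumption.
        return [item['label'] for item in raw_training_data]

    def train(self, raw_training_data):
        labels = self._preprocess(raw_training_data)
        self.majority_label = max(set(labels), key=labels.count)

    def to_dict(self):
        return {'majority_label': self.majority_label}


# Hypothetical registry mapping algorithm ids to wrapper classes.
ALGORITHM_REGISTRY = {'MajorityLabel': MajorityLabelClassifier}


def get_classifier(algorithm_id):
    """Return a fresh instance of the wrapper class for this algorithm."""
    return ALGORITHM_REGISTRY[algorithm_id]()


def run_job(job_request):
    """Train the requested classifier on raw job data and return its dict."""
    classifier = get_classifier(job_request['algorithm_id'])
    classifier.train(job_request['training_data'])
    return classifier.to_dict()
```

Whether a wrapper uses fixed parameters or a tuning step would live inside its own `train` method in this design.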
Does that explain the flow? I thought we were on the same page on this?
from oppia-ml.
@prasanna08 earlier, we followed this approach (wrapping the classifier in a class with functions like train, predict, etc.). In the new scenario, I think it still makes sense to do the same by creating a BaseClassifier class and other derived classes (so people can have their own custom things).
For parameter tuning, at least for scikit-learn, I would suggest that we keep it in a separate layer (a class or a utility function) so that it can be used or extended by anyone testing new algorithms. It can look as simple as this, but instead of svc_parameter_selection we can just have parameter_selection, which also accepts the classifier instance. If we only say that the person implementing a classifier should think about it, we are not enforcing parameter selection, which in my opinion is a very important step for obtaining well-performing models. But if someone is not using scikit-learn for some reason, then it is difficult to enforce that.
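A minimal sketch of what such a generic utility might look like, assuming scikit-learn is installed. The function name `parameter_selection` comes from the suggestion above, but its signature, the `n_folds` argument, and the example grid are assumptions; `GridSearchCV`, `SVC`, and `load_iris` are real scikit-learn APIs.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC


def parameter_selection(classifier, param_grid, X, y, n_folds=5):
    """Cross-validated grid search for any scikit-learn estimator.

    Returns the best parameter dict and the estimator refit on all of X, y.
    """
    grid_search = GridSearchCV(classifier, param_grid, cv=n_folds)
    grid_search.fit(X, y)
    return grid_search.best_params_, grid_search.best_estimator_


# Usage: the same helper works for SVC or any other scikit-learn classifier.
X, y = load_iris(return_X_y=True)
best_params, best_model = parameter_selection(
    SVC(), {'C': [0.1, 1, 10], 'gamma': [0.01, 0.1]}, X, y)
```

Because the classifier instance is passed in, nothing here is specific to SVC; a derived classifier class could call this helper from its own training step.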
from oppia-ml.