Comments (3)
I'll have a look at fixing this. Basically you'd just need to add these to the model .pyx files:
def get_params(self): return self.__getstate__()
def set_params(self, params): self.__setstate__(params)
This could work. I fixed most of the incompatibilities with sklearn earlier, but it seems I missed this.
On model hyperparameter estimation, I've used a Gaussian random search algorithm described in Section 6.1.4 of my thesis: https://arxiv.org/pdf/1602.02332.pdf . You can use any of the Python hyperparameter packages, such as cmaes, hyperopt, skopt, Ray Tune, etc. There are a lot of them by now. I'll try to add hyperparameter optimization to the package when I get the chance, since it is useful with the supported models and makes good use of the supported distributed computing backends.
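To illustrate the general idea of a Gaussian random search, here is a minimal sketch. It is not necessarily the variant described in the thesis: the function name gaussian_random_search, the decay schedule, and the toy objective are all assumptions made for illustration. In practice the objective would train a model and return a validation loss.

```python
import random


def gaussian_random_search(objective, start, sigma=1.0, iters=200, decay=0.98, seed=0):
    """Minimize `objective` by sampling Gaussian perturbations of the best point so far."""
    rng = random.Random(seed)
    best = dict(start)
    best_score = objective(best)
    for _ in range(iters):
        # Propose a candidate centred on the current best parameters.
        candidate = {name: rng.gauss(value, sigma) for name, value in best.items()}
        score = objective(candidate)
        if score < best_score:
            best, best_score = candidate, score
        # Shrink the step size so the search narrows over time (assumed schedule).
        sigma *= decay
    return best, best_score


def toy_objective(params):
    # Hypothetical stand-in for a validation loss; minimum at (-2, 1).
    return (params["log_alpha"] + 2.0) ** 2 + (params["log_beta"] - 1.0) ** 2


start = {"log_alpha": 0.0, "log_beta": 0.0}
best_params, best_score = gaussian_random_search(toy_objective, start)
print(best_params, best_score)
```

Searching over log-scaled values such as log_alpha is a common choice for learning-rate-like parameters, but that too is an assumption here.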
from wordbatch.
This is a bit more complicated to fix. It seems sklearn would need a fix as well.
For ftrl.pyx, get_params can be defined:
def get_params(self, deep=False):
    param_names = ["alpha", "beta", "L1", "L2", "e_clip", "D", "init", "seed", "iters", "w", "z", "n", "inv_link",
                   "threads", "bias_term", "verbose"]
    params = {x: y for x, y in zip(param_names, self.__getstate__())}
    if params['inv_link'] == 1: params['inv_link'] = "sigmoid"
    else: params['inv_link'] = "identity"
    return params
Also, the estimator __init__ function needs to accept the model parameters w, z, n as optional arguments (np.ndarray w= None, np.ndarray z= None, np.ndarray n= None).
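The reason the constructor has to accept w, z and n is that sklearn-style cloning rebuilds an estimator by passing the dict from get_params back into __init__. A minimal pure-Python sketch of that round trip follows. ToyFTRL is a hypothetical stand-in with a reduced parameter list, not wordbatch's FTRL, and plain lists are used in place of numpy arrays.

```python
class ToyFTRL:
    """Hypothetical pure-Python stand-in; not wordbatch's FTRL."""

    def __init__(self, alpha=0.1, beta=1.0, inv_link="sigmoid", w=None, z=None, n=None):
        # Every key returned by get_params must be accepted here.
        self.alpha = alpha
        self.beta = beta
        self.inv_link = inv_link
        self.w, self.z, self.n = w, z, n

    def get_params(self, deep=False):
        return {"alpha": self.alpha, "beta": self.beta, "inv_link": self.inv_link,
                "w": self.w, "z": self.z, "n": self.n}

    def set_params(self, **params):
        # sklearn convention: keyword arguments in, self out.
        for name, value in params.items():
            setattr(self, name, value)
        return self


model = ToyFTRL(alpha=0.05, w=[0.0, 0.2])
# Rebuild from the parameter dict, as a clone would.
rebuilt = ToyFTRL(**model.get_params())
print(rebuilt.get_params())
```

Without the w, z, n keyword arguments, the ToyFTRL(**model.get_params()) call would raise a TypeError for the unexpected keys.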
After this you'll still get the following validation error from sklearn:
RuntimeError: Cannot clone object <wordbatch.models.ftrl.FTRL object at 0x56245d510ee0>, as the constructor either does not set or modifies parameter alpha
You can debug sklearn base.py clone() to print the variables:
print(name, param1, param2, type(param1), type(param2), param1 is param2, param1==param2)
Which prints out this:
alpha 0.1 0.1 <class 'float'> <class 'float'> False True
So the "param1 is param2" comparison fails, whereas the "param1 == param2" comparison succeeds. This comes from the difference between the Python "is" and "==" operators: "is" checks whether both names refer to the same object, while "==" compares values. Here any float or numpy array will fail the "is" check, so the validation raises an error. I'm not sure why they use "is" instead of "==" in the clone() validation, since it should be comparing the values of different objects, not their references.
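The identity versus equality behaviour above can be reproduced without sklearn or Cython. The sketch below is an illustration only: FakeEstimator and identity_check are hypothetical names, the check is a simplified imitation rather than sklearn's actual clone() code, and the struct round trip is assumed to stand in for a value held in a C double and converted to a new Python float on each read.

```python
import struct


class FakeEstimator:
    """Hypothetical stand-in for a compiled model that stores alpha as a C double."""

    def __init__(self, alpha=0.1):
        # Keep only the raw 8-byte double, not the original Python object.
        self._alpha_bytes = struct.pack("d", alpha)

    def get_params(self, deep=False):
        # Each call builds a brand-new Python float object from the raw bytes.
        return {"alpha": struct.unpack("d", self._alpha_bytes)[0]}


def identity_check(estimator):
    """Compare parameters of an estimator and its rebuilt copy by identity and by value."""
    params = estimator.get_params()
    rebuilt = type(estimator)(**params)
    new_params = rebuilt.get_params()
    return {name: (params[name] is new_params[name], params[name] == new_params[name])
            for name in params}


result = identity_check(FakeEstimator(alpha=0.1))
print(result)  # {'alpha': (False, True)}
```

The two floats hold the same value but are distinct objects, which matches the "False True" pair in the debug output.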
I'll open a ticket with the sklearn developers about the above. If that gets fixed, there aren't many changes left to make to fix this issue.
from wordbatch.
Thank you for looking into this!
from wordbatch.