Comments (12)
Thanks for reporting this @Visdoom. Could you please add a minimal working example that would produce such an error?
from python-glmnet.
I will try to work on it
Thank you!
Hey there, I found some examples that reliably reproduce that error in my code:
from numpy import array
from glmnet import LogitNet

m = LogitNet(alpha=0.8, max_iter=2000, tol=0.3, n_splits=3)
X_train = array([[8], [9], [8], [4], [8], [9], [10], [4], [5], [7], [6], [7], [9], [9], [6], [6], [4], [10], [5], [8], [8], [9], [8], [6], [7], [7]])
y_train = array(['DC', 'DC', 'DC', 'DC', 'DC', 'DC', 'DC', 'DC', 'DC', 'DC', 'DC', 'DC', 'DC', 'SC', 'SC', 'SC', 'SC', 'SC', 'SC', 'SC', 'SC', 'SC', 'SC', 'SC', 'SC', 'SC'], dtype=object)
m.fit(X_train, y_train)
Other data would be:
X_train = array([[ 7., 7., 15., 10., 13., 14., 9., 13., 11., 10., 10.,
10., 13., 14., 10., 8., 8., 10., 11., 12., 8., 10.,
18., 5., 15., 12., 12., 10., 10., 10., 12., 8., 11.,
11., 8., 15., 11., 13.]])
y_train = array(['CBC1', 'CBC1', 'CBC1', 'CBC1', 'CBC1', 'CBC1', 'CBC1', 'CBC1',
'CBC1', 'CBC1', 'CBC1', 'CBC1', 'CBC1', 'CBC1', 'CBC1', 'CBC1',
'CBC1', 'CBC1', 'CBC2', 'CBC2', 'CBC2', 'CBC2', 'CBC2', 'CBC2',
'CBC2', 'CBC2', 'CBC2', 'CBC2', 'CBC2', 'CBC2', 'CBC2', 'CBC2',
'CBC2', 'CBC2', 'CBC2', 'CBC2', 'CBC2', 'CBC1'], dtype=object)
or
X_train = array([[ 9., 17., 20., 11., 13., 14., 15., 17., 15., 16., 13.,
16., 14., 16., 11., 17., 12., 18., 11., 9., 16., 16.,
15., 18., 16., 13., 11., 14., 14., 15., 15., 18., 15.,
13., 15., 18., 15., 9.43743297]])
y_train = array(['CBC2', 'CBC2', 'CBC2', 'CBC2', 'CBC2', 'CBC2', 'CBC2', 'CBC2',
'CBC2', 'CBC2', 'CBC2', 'CBC2', 'CBC2', 'CBC2', 'CBC2', 'CBC2',
'CBC2', 'CBC2', 'CBC2', 'CBC5T', 'CBC5T', 'CBC5T', 'CBC5T', 'CBC5T',
'CBC5T', 'CBC5T', 'CBC5T', 'CBC5T', 'CBC5T', 'CBC5T', 'CBC5T',
'CBC5T', 'CBC5T', 'CBC5T', 'CBC5T', 'CBC5T', 'CBC5T', 'CBC5T'], dtype=object)
Note: X_train is shown transposed here for visualization reasons.
I hope that helps.
glmnet.LogitNet is expecting numbers for the dependent variable, not strings (or np.objects). You will want to cast your dependent variable to integers. For example,
y = (y_train == 'DC').astype(int)
will set 'DC' to 1 and everything else to 0.
I do classification on a large scale, and it works in most cases even though I use the dependent variable as it is. I don't think that this is the problem.
If you want, I can get an example with the same dependent variable that does the trick, so you can compare.
Best,
S.
Huh, that's surprising. When I run the example you posted, it does raise the same "Math domain error"; however, when I replace y_train with integers like I showed in my comment, the error goes away. Do you see the same thing?
Hmmmm, now I'm starting to think the issue is something else. I'm guessing a number <= 0 is being passed to one of the np.log calls on line 124 in that last block of the traceback you posted. I'll reopen and investigate...
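To illustrate the guess above, here is a minimal sketch using only the standard library (it is not taken from python-glmnet, and the helper name try_log is hypothetical). Python's math.log raises a ValueError for zero or negative input, which is the usual source of "math domain error" messages. Under default error settings, np.log instead returns -inf or nan with a RuntimeWarning, so the failing call may not be np.log itself.

```python
import math


def try_log(value):
    """Return math.log(value), or None if value is outside log's domain."""
    try:
        return math.log(value)
    except ValueError:
        # math.log rejects zero and negative inputs with a ValueError.
        return None


for v in (1.0, 0.0, -2.0):
    print(v, try_log(v))
```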
Yes, when I replace the strings in y_train with booleans or ints, it works for me as well.
It seems the third CV fold returns a lambda path mostly filled with zeros, and this is causing the error you're seeing. In this fold the covariates for the 'DC' class are effectively identical to those of the 'SC' class, so the best fit coefficient would be zero. Therefore it makes sense that the Fortran code would return a lambda path full of zeros, since no penalty is necessary to shrink the best fit coefficient of zero.
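A rough sketch of how a zero-filled lambda path like the one described above could lead to a math domain error. It assumes some post-processing step takes logarithms of path entries; this thread does not confirm which step that is. The function name and the log-linear extrapolation are illustrative assumptions, not the library's confirmed code.

```python
import math


def extrapolate_first_lambda(lambda_path):
    """Hypothetical helper: log-linear extrapolation of the first lambda
    from the second and third path values. Returns None when the path is
    too short or the needed entries are not strictly positive."""
    if len(lambda_path) < 3:
        return None
    second, third = lambda_path[1], lambda_path[2]
    if second <= 0 or third <= 0:
        # Without this guard, math.log(0.0) would raise ValueError.
        return None
    return math.exp(2 * math.log(second) - math.log(third))


print(extrapolate_first_lambda([9.9, 0.5, 0.25]))  # well-behaved path
print(extrapolate_first_lambda([9.9, 0.0, 0.0]))   # degenerate path
```

Returning None on a degenerate path is only one option; a caller could also skip the fold or raise a more descriptive error.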
I don't know your use case, but as @wlattner mentioned to me offline, it typically doesn't make much sense to use glmnet in a univariate problem. It may be worthwhile to add a warning against the univariate case, but personally I don't think this issue merits any changes to python-glmnet, since it is the result of fairly pathological data that doesn't make for a well-formed problem.
I could be persuaded otherwise, however, so I'm curious to hear what you think. Thank you for filing issues here!
Hey @kcrum,
Thanks for investigating! I encountered that error while searching a feature space automatically, so it is indeed a rather rare case. I agree that it does not make sense to use glmnet on univariate cases, but I personally am in favor of adding a warning, since warnings are easier to catch in an automated approach, e.g. feature selection with the goodness of fit as the selection criterion.
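A minimal sketch of what such a warning could look like on the caller's side, assuming X is a sequence of rows (a 2-D NumPy array also works). The helper warn_if_univariate and its message are hypothetical and not part of python-glmnet.

```python
import warnings


def warn_if_univariate(X):
    """Hypothetical pre-fit check: warn when X has a single feature column.

    Returns True if a warning was issued, False otherwise.
    """
    n_features = len(X[0]) if len(X) > 0 else 0
    if n_features == 1:
        warnings.warn(
            "X has a single feature; the regularization path may be degenerate.",
            UserWarning,
        )
        return True
    return False
```

In an automated feature search, the warning can be promoted to an exception with warnings.simplefilter("error", UserWarning), so degenerate candidates are skipped explicitly.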