Comments (9)
Please upload the data too; as little as needed for us to reproduce the error.
from mca.
Here is the data I used, along with the code I wrote to call MCA. What happens if I have only one column? I am interested in getting the first dimension for all individuals as a representation of my data. Is there a way to get one dimension with high explained variance?
code:
data="""X Y Z
5 6 2
12 0 4
12 8 13
1 5 13
4 10 7
9 1 5
7 5 8
8 6 9
2 5 3
9 2 6
3 1 0
7 0 5
10 11 8
6 7 8
1 11 8
3 11 0
11 8 4
8 7 6
7 13 1
0 12 10
6 10 1
13 12 6
3 12 3
8 5 0
6 4 9
13 2 2
12 9 1
2 11 10
4 11 11
3 9 9
11 3 12
9 4 0
4 3 7
11 5 2
6 13 12
1 8 5
2 10 13
6 3 0
2 4 13
1 1 7
5 12 0
2 7 5
10 7 12
1 10 8
3 3 2
2 2 4
4 7 10
0 9 4
8 0 8
5 1 1
7 10 5
9 7 3
10 13 7
3 6 4
13 6 5
3 4 3
2 5 6
7 7 8
0 11 9
4 6 7
1 10 3
10 8 3
2 0 12
12 13 3
12 8 8
13 10 7
5 1 11
12 1 11
1 3 10
8 5 3
5 4 7
7 5 13
13 1 6
13 2 1
11 13 12
9 6 4
6 0 4
12 2 12
7 13 10
8 6 12
5 11 3
6 8 7
11 0 11
13 10 10
5 12 0
13 3 11
9 10 4
5 3 7
4 12 1
1 6 10
7 4 1
6 9 7
10 10 11
12 0 9
2 0 13
10 2 7
7 3 2
9 10 9
10 13 9
9 2 6
11 12 7
2 12 5
9 11 9
3 0 1
0 12 0
6 0 9
3 6 5
0 7 2
8 6 0
6 1 13
7 11 1
10 10 12
4 11 11
11 7 10
0 2 9
3 5 3
10 3 9
13 12 0
10 8 8
10 8 3
0 1 9
4 8 5
8 4 9
8 5 6
7 9 13
10 2 7
13 3 11
9 12 6
5 12 0
5 6 5
11 4 7
0 6 3
13 3 1
6 4 5
12 8 8
4 2 3
2 9 0
1 4 10
9 8 6
3 3 2
5 0 5
2 8 12
0 7 11
6 11 10
3 2 8
10 13 1
3 0 0
5 9 11
9 11 6
9 12 3
10 2 13
7 4 5
12 13 12
12 7 12
11 1 4
12 11 13
8 9 2
10 9 8
12 10 1
7 7 10
0 3 12
1 6 11
4 1 2
0 2 4
7 13 2
9 0 2
4 5 11
8 0 6
1 3 8
4 12 4
2 5 1
8 1 0
11 8 10
1 9 4
0 13 2
4 10 8
0 7 5
0 0 2
1 9 12
12 4 6
2 9 1
6 7 6
3 13 12
3 2 0
0 13 6
5 1 4
8 6 10
13 1 10
11 5 13
6 8 13
11 9 11
7 11 13
4 3 11
5 13 13
11 7 2
13 5 10
8 4 3
13 9 4
11 4 9
1 10 7"""
import pandas, mca, io
X = mca.MCA(pandas.read_csv(io.StringIO(data), sep='\t'),
            benzecri=True, TOL=1e-4, cols=None, ncols=None)
print(X.fs_r())
from mca.
What's going on in your situation is that Benzecri correction is eliminating all your eigenvalues:
> print(mca.MCA(pandas.read_csv(io.StringIO(data), sep='\t'), benzecri=True).E)
array([0, 0, 0])
The problem goes away if you simply do not use Benzecri correction. Does this satisfy your concern, or do you think the package should be acting otherwise?
from mca.
Thank you for your quick responses. I still get the same error. Do you mean I should use benzecri=False? In either case it gives me the same error. Also, in your answer, why are all the elements in the array zero?
from mca.
If you set benzecri=False, fs_r() runs on the above data without error. If it doesn't, you might have to install the latest copy from GitHub rather than PyPI. The array in my previous answer is the Benzecri-corrected eigenvalue matrix. Recall that Benzecri correction involves thresholding the eigenvalues (cf. equation 7). In your case all the eigenvalues are less than the reciprocal of the number of dimensions (i.e., 1/3), hence they are mapped to zero:
> print(mca.MCA(pandas.read_csv(io.StringIO(data), sep='\s', header=None), benzecri=True).s**2)
array([ 1.44950626e-01, 1.32977051e-01, 3.88439951e-31])
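For reference, here is a minimal sketch of that thresholding step, written with plain NumPy rather than the package's own code. It assumes the usual form of the correction, in which an eigenvalue above 1/K is rescaled to ((K/(K-1)) * (eigenvalue - 1/K))**2 and anything else is set to zero, with K the number of columns (3 here). The function name benzecri_correct is hypothetical and not part of mca.

```python
import numpy as np

def benzecri_correct(eigenvalues, n_vars):
    """Hypothetical helper: threshold and rescale eigenvalues (assumed formula)."""
    lam = np.asarray(eigenvalues, dtype=float)
    threshold = 1.0 / n_vars
    # Rescale the part of each eigenvalue that exceeds the threshold.
    scaled = (n_vars / (n_vars - 1.0)) * (lam - threshold)
    # Eigenvalues at or below 1/K carry no corrected inertia and become 0.
    return np.where(lam > threshold, scaled ** 2, 0.0)

# Uncorrected values copied from the output above; K = 3 columns.
eigs = [1.44950626e-01, 1.32977051e-01, 3.88439951e-31]
corrected = benzecri_correct(eigs, n_vars=3)
print(corrected)
```

Every value is below 1/3, so nothing survives the threshold, which matches the all-zero array shown earlier.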
from mca.
In fact, I was using the latest version from GitHub. The PyPI version doesn't have mca.MCA(); it has to be called as mca.mca() instead. Now that I have re-installed from PyPI, it works as you described. Do we have "expl_var" in the PyPI version? What happens if we have just one or two columns? I often get this error with the PyPI version:
File "/Library/Python/2.7/site-packages/mca-1.0-py2.7.egg/mca.py", line 52, in __init__
self.P, self.s, self.Q = scipy.linalg.svd(_mul(self.D_r, Z_c, self.D_c))
File "/Library/Python/2.7/site-packages/scipy-0.15.1-py2.7-macosx-10.9-intel.egg/scipy/linalg/decomp_svd.py", line 88, in svd
a1 = asarray_chkfinite(a)
File "/Library/Python/2.7/site-packages/numpy/lib/function_base.py", line 613, in asarray_chkfinite
"array must not contain infs or NaNs")
ValueError: array must not contain infs or NaNs
Data:
0 1 2 3
0 0 0 0 0
1 4 4 2 3
2 2 1 1 1
3 0 3 2 1
4 3 1 4 2
5 1 0 0 0
6 2 2 2 2
7 4 2 4 3
8 4 3 4 5
9 6 4 6 4
10 6 1 5 4
11 0 0 0 0
12 5 6 6 5
13 4 2 1 1
14 6 5 6 6
15 3 4 3 4
16 6 4 3 4
17 2 1 1 0
18 1 5 2 2
19 0 0 0 0
20 6 5 6 5
21 5 6 6 6
22 3 6 6 5
23 3 0 2 1
24 0 2 0 1
25 5 1 4 1
26 1 1 1 2
27 0 0 0 0
28 2 6 4 5
29 2 0 1 0
30 5 6 5 6
31 4 3 4 3
32 1 3 4 3
33 4 6 6 6
34 6 6 5 6
35 5 4 3 3
36 3 2 2 2
37 6 4 5 5
38 5 5 5 5
39 5 3 5 3
40 2 5 3 4
41 3 3 3 4
42 0 3 1 2
43 3 5 3 4
44 1 4 3 6
45 1 2 2 3
46 2 1 1 2
47 1 2 0 1
48 4 5 5 6
49 0 0 0 0
Thank you.
from mca.
Since the last PyPI release, one contributor renamed mca to MCA, while another introduced the expl_var method.
It makes no sense to use MCA with one/two-dimensional data; it's a dimensionality reduction method, and you have nothing to reduce.
from mca.
With one column, I believe the method should return the original data instead of raising an error, and with two columns it should still work. The error I got, as the data shows, can also happen with more than two columns.
from mca.
In this case the problem is the all-zero rows, which cause division by zero during the calculation of the normalization factor D_r. My suggested remedy is to drop them:
data = ... # the 49-line string from your last post
newdf = pandas.read_csv(io.StringIO(data), sep='\s', index_col=0)
mca.MCA(newdf[newdf.sum(axis=1) != 0], benzecri=False).fs_r()
(Benzecri correction fails for the same reason as before.)
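To illustrate why the zero rows matter, here is a small sketch using four rows copied from the data above (rows 0, 1, 2 and 11). It assumes the row normalization takes the form 1/sqrt(row mass), which is the usual choice in correspondence analysis; it does not reproduce the package's exact code, and the variable names are only illustrative.

```python
import numpy as np
import pandas as pd

# Rows 0, 1, 2 and 11 from the posted data; two of them are all zeros.
df = pd.DataFrame([[0, 0, 0, 0],
                   [4, 4, 2, 3],
                   [2, 1, 1, 1],
                   [0, 0, 0, 0]])

# Row mass: each row's share of the grand total (assumed normalization).
row_mass = (df.sum(axis=1) / df.values.sum()).to_numpy(dtype=float)

# An all-zero row has mass 0, so 1/sqrt(mass) is infinite.
with np.errstate(divide="ignore"):
    inv_sqrt_mass = 1.0 / np.sqrt(row_mass)
print(np.isfinite(inv_sqrt_mass))

# Dropping the all-zero rows keeps the normalization finite.
kept = df[df.sum(axis=1) != 0]
kept_mass = (kept.sum(axis=1) / kept.values.sum()).to_numpy(dtype=float)
kept_inv_sqrt_mass = 1.0 / np.sqrt(kept_mass)
print(np.isfinite(kept_inv_sqrt_mass).all())
```

The infinite factors are what later surface as the "array must not contain infs or NaNs" error from the SVD call.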
from mca.