
IndexError in fs_r (mca, closed)

grahnavard avatar grahnavard commented on June 11, 2024
IndexError in fs_r

from mca.

Comments (9)

esafak avatar esafak commented on June 11, 2024

Please upload the data too; as little as needed for us to reproduce the error.


grahnavard avatar grahnavard commented on June 11, 2024

Here are the data I used, along with the code I wrote to call MCA. What happens if I have only one column? What I am interested in is getting the first dimension for all individuals as a representation of my data. Is there a way to get one dimension with high explained variance?
Code:

data="""X   Y   Z
5   6   2
12  0   4
12  8   13
1   5   13
4   10  7
9   1   5
7   5   8
8   6   9
2   5   3
9   2   6
3   1   0
7   0   5
10  11  8
6   7   8
1   11  8
3   11  0
11  8   4
8   7   6
7   13  1
0   12  10
6   10  1
13  12  6
3   12  3
8   5   0
6   4   9
13  2   2
12  9   1
2   11  10
4   11  11
3   9   9
11  3   12
9   4   0
4   3   7
11  5   2
6   13  12
1   8   5
2   10  13
6   3   0
2   4   13
1   1   7
5   12  0
2   7   5
10  7   12
1   10  8
3   3   2
2   2   4
4   7   10
0   9   4
8   0   8
5   1   1
7   10  5
9   7   3
10  13  7
3   6   4
13  6   5
3   4   3
2   5   6
7   7   8
0   11  9
4   6   7
1   10  3
10  8   3
2   0   12
12  13  3
12  8   8
13  10  7
5   1   11
12  1   11
1   3   10
8   5   3
5   4   7
7   5   13
13  1   6
13  2   1
11  13  12
9   6   4
6   0   4
12  2   12
7   13  10
8   6   12
5   11  3
6   8   7
11  0   11
13  10  10
5   12  0
13  3   11
9   10  4
5   3   7
4   12  1
1   6   10
7   4   1
6   9   7
10  10  11
12  0   9
2   0   13
10  2   7
7   3   2
9   10  9
10  13  9
9   2   6
11  12  7
2   12  5
9   11  9
3   0   1
0   12  0
6   0   9
3   6   5
0   7   2
8   6   0
6   1   13
7   11  1
10  10  12
4   11  11
11  7   10
0   2   9
3   5   3
10  3   9
13  12  0
10  8   8
10  8   3
0   1   9
4   8   5
8   4   9
8   5   6
7   9   13
10  2   7
13  3   11
9   12  6
5   12  0
5   6   5
11  4   7
0   6   3
13  3   1
6   4   5
12  8   8
4   2   3
2   9   0
1   4   10
9   8   6
3   3   2
5   0   5
2   8   12
0   7   11
6   11  10
3   2   8
10  13  1
3   0   0
5   9   11
9   11  6
9   12  3
10  2   13
7   4   5
12  13  12
12  7   12
11  1   4
12  11  13
8   9   2
10  9   8
12  10  1
7   7   10
0   3   12
1   6   11
4   1   2
0   2   4
7   13  2
9   0   2
4   5   11
8   0   6
1   3   8
4   12  4
2   5   1
8   1   0
11  8   10
1   9   4
0   13  2
4   10  8
0   7   5
0   0   2
1   9   12
12  4   6
2   9   1
6   7   6
3   13  12
3   2   0
0   13  6
5   1   4
8   6   10
13  1   10
11  5   13
6   8   13
11  9   11
7   11  13
4   3   11
5   13  13
11  7   2
13  5   10
8   4   3
13  9   4
11  4   9
1   10  7"""
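import io, pandas, mca
# sep=r'\s+' splits on any run of whitespace (tabs or spaces)
df = pandas.read_csv(io.StringIO(data), sep=r'\s+')
X = mca.MCA(df, benzecri=True, TOL=1e-4, cols=None, ncols=None)
print(X.fs_r())  # this call raises the IndexError reported here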


esafak avatar esafak commented on June 11, 2024

What's going on in your situation is that Benzecri correction is eliminating all your eigenvalues:

> print(mca.MCA(pandas.read_csv(io.StringIO(data), sep='\t'), benzecri=True).E)

array([0, 0, 0])

The problem goes away if you simply do not use Benzecri correction. Does this satisfy your concern, or do you think the package should be acting otherwise?


grahnavard avatar grahnavard commented on June 11, 2024

Thank you for your quick responses. I still get the same error. Do you mean I should use benzecri=False? In either case it gives me the same error. Also, in your answer, why are all the elements of the array zero?


esafak avatar esafak commented on June 11, 2024

If you set benzecri=False, fs_r() runs on the above data without error. If it doesn't, you might have to install the latest copy from GitHub rather than PyPI. The array in my previous answer holds the Benzecri-corrected eigenvalues. Recall that Benzecri correction involves thresholding the eigenvalues (cf. equation 7). In your case all the eigenvalues are less than the reciprocal of the number of dimensions (i.e., 1/3), hence they are mapped to zero:

> print(mca.MCA(pandas.read_csv(io.StringIO(data), sep='\s', header=None), benzecri=True).s**2)

array([  1.44950626e-01,   1.32977051e-01,   3.88439951e-31])
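The thresholding described above can be sketched in a few lines. This is only an illustration using the commonly cited form of the Benzecri correction, where an eigenvalue λ above 1/K becomes ((K/(K-1))(λ - 1/K))² and anything else becomes zero; the helper name `benzecri_correction` and its `n_vars` parameter are made up for this example and are not part of the mca package, whose implementation may differ in details such as tolerance handling.

```python
import numpy as np

def benzecri_correction(eigenvalues, n_vars):
    """Hypothetical helper: Benzecri-corrected eigenvalues for K = n_vars."""
    lam = np.asarray(eigenvalues, dtype=float)
    threshold = 1.0 / n_vars
    corrected = (n_vars / (n_vars - 1.0) * (lam - threshold)) ** 2
    # Eigenvalues at or below 1/K are discarded (mapped to zero).
    return np.where(lam > threshold, corrected, 0.0)

# Squared singular values printed above, with K = 3 (columns X, Y, Z).
eigs = [1.44950626e-01, 1.32977051e-01, 3.88439951e-31]
corrected = benzecri_correction(eigs, n_vars=3)
print(corrected)  # every value is below 1/3, so all three become 0
```

With every corrected eigenvalue at zero, no dimension survives the correction, which would be consistent with fs_r() having nothing to index into.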


grahnavard avatar grahnavard commented on June 11, 2024

In fact, I was using the latest version from GitHub; the PyPI version doesn't have mca.MCA() and has to be called as mca.mca() instead. I have now reinstalled it from PyPI and it works as you described. Does the PyPI version have "expl_var"? What happens if we have only one or two columns? I often get this error with the PyPI version:
  File "/Library/Python/2.7/site-packages/mca-1.0-py2.7.egg/mca.py", line 52, in __init__
    self.P, self.s, self.Q = scipy.linalg.svd(_mul(self.D_r, Z_c, self.D_c))
  File "/Library/Python/2.7/site-packages/scipy-0.15.1-py2.7-macosx-10.9-intel.egg/scipy/linalg/decomp_svd.py", line 88, in svd
    a1 = asarray_chkfinite(a)
  File "/Library/Python/2.7/site-packages/numpy/lib/function_base.py", line 613, in asarray_chkfinite
    "array must not contain infs or NaNs")
ValueError: array must not contain infs or NaNs
Data:
0 1 2 3
0 0 0 0 0
1 4 4 2 3
2 2 1 1 1
3 0 3 2 1
4 3 1 4 2
5 1 0 0 0
6 2 2 2 2
7 4 2 4 3
8 4 3 4 5
9 6 4 6 4
10 6 1 5 4
11 0 0 0 0
12 5 6 6 5
13 4 2 1 1
14 6 5 6 6
15 3 4 3 4
16 6 4 3 4
17 2 1 1 0
18 1 5 2 2
19 0 0 0 0
20 6 5 6 5
21 5 6 6 6
22 3 6 6 5
23 3 0 2 1
24 0 2 0 1
25 5 1 4 1
26 1 1 1 2
27 0 0 0 0
28 2 6 4 5
29 2 0 1 0
30 5 6 5 6
31 4 3 4 3
32 1 3 4 3
33 4 6 6 6
34 6 6 5 6
35 5 4 3 3
36 3 2 2 2
37 6 4 5 5
38 5 5 5 5
39 5 3 5 3
40 2 5 3 4
41 3 3 3 4
42 0 3 1 2
43 3 5 3 4
44 1 4 3 6
45 1 2 2 3
46 2 1 1 2
47 1 2 0 1
48 4 5 5 6
49 0 0 0 0

Thank you.


esafak avatar esafak commented on June 11, 2024

Since the last pypi release, one contributor renamed mca to MCA, while another introduced the expl_var method.

It makes no sense to use MCA with one/two-dimensional data; it's a dimensionality reduction method, and you have nothing to reduce.


grahnavard avatar grahnavard commented on June 11, 2024

With one column, I believe the method should return the original data instead of raising an error, and with two columns it should still work. The error I got, as the data show, can also happen with more than two columns.


esafak avatar esafak commented on June 11, 2024

In this case the problem is the all-zero rows, which cause division by zero during the calculation of the normalization factor D_r. My suggested remedy is to drop them:

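import io, pandas, mca
data = ...  # the data table from your last post, as a string
newdf = pandas.read_csv(io.StringIO(data), sep=r'\s+', index_col=0)
# keep only rows whose sum is non-zero, i.e. drop the all-zero rows
mca.MCA(newdf[newdf.sum(axis=1) != 0], benzecri=False).fs_r()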

(Benzecri correction fails for the same reason as before.)
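To see why all-zero rows are a problem, here is a small NumPy sketch of a correspondence-analysis-style row scaling, 1/sqrt(row mass). It is an assumption about the general shape of the computation, not the package's actual code; `row_scaling` is a hypothetical helper and the three rows are taken from the table posted above.

```python
import numpy as np

def row_scaling(X):
    """Hypothetical helper: 1/sqrt(row mass) for each row of a table X."""
    Z = X / X.sum()    # normalize so all entries sum to 1
    r = Z.sum(axis=1)  # row masses; an all-zero row has mass 0
    with np.errstate(divide="ignore"):
        return 1.0 / np.sqrt(r)

# Rows 0, 1 and 2 of the posted data; row 0 is all zeros.
X = np.array([[0, 0, 0, 0],
              [4, 4, 2, 3],
              [2, 1, 1, 1]], dtype=float)

scaling = row_scaling(X)
has_inf = bool(np.isinf(scaling).any())  # the zero row gives an infinite factor

# Dropping the all-zero rows first keeps every scaling factor finite.
X_clean = X[X.sum(axis=1) != 0]
clean_ok = bool(np.isfinite(row_scaling(X_clean)).all())
print(has_inf, clean_ok)
```

An infinite scaling factor multiplied by the zero entries of that row produces NaNs, which is the kind of input that scipy's SVD rejects with "array must not contain infs or NaNs".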
