pca-magic's People

Contributors

allentran

pca-magic's Issues

ValueError: Must specify axis=0 or 1

from ppca import PPCA
ppca = PPCA()
ppca.fit(X)
<ipython-input-36-931cb5915c29> in <module>()
----> 1 ppca.fit(X)

/usr/local/lib/python2.7/dist-packages/ppca/_ppca.pyc in fit(self, data, d, tol, min_obs, verbose)
     26 
     27         self.raw = data
---> 28         self.raw[np.isinf(self.raw)] = np.max(self.raw[np.isfinite(self.raw)])
     29 
     30         valid_series = np.sum(~np.isnan(self.raw), axis=0) >= min_obs

/usr/local/lib/python2.7/dist-packages/pandas/core/frame.pyc in __setitem__(self, key, value)
   2514             self._setitem_array(key, value)
   2515         elif isinstance(key, DataFrame):
-> 2516             self._setitem_frame(key, value)
   2517         else:
   2518             # set column

/usr/local/lib/python2.7/dist-packages/pandas/core/frame.pyc in _setitem_frame(self, key, value)
   2552         self._check_inplace_setting(value)
   2553         self._check_setitem_copy()
-> 2554         self._where(-key, value, inplace=True)
   2555 
   2556     def _ensure_valid_index(self, value):

/usr/local/lib/python2.7/dist-packages/pandas/core/generic.pyc in _where(self, cond, other, inplace, axis, level, errors, try_cast)
   5902 
   5903                 _, other = self.align(other, join='left', axis=axis,
-> 5904                                       level=level, fill_value=np.nan)
   5905 
   5906                 # if we are NOT aligned, raise as we cannot where index

/usr/local/lib/python2.7/dist-packages/pandas/core/frame.pyc in align(self, other, join, axis, level, copy, fill_value, method, limit, fill_axis, broadcast_axis)
   2917                                             method=method, limit=limit,
   2918                                             fill_axis=fill_axis,
-> 2919                                             broadcast_axis=broadcast_axis)
   2920 
   2921     @Appender(_shared_docs['reindex'] % _shared_doc_kwargs)

/usr/local/lib/python2.7/dist-packages/pandas/core/generic.pyc in align(self, other, join, axis, level, copy, fill_value, method, limit, fill_axis, broadcast_axis)
   5728                                       copy=copy, fill_value=fill_value,
   5729                                       method=method, limit=limit,
-> 5730                                       fill_axis=fill_axis)
   5731         else:  # pragma: no cover
   5732             raise TypeError('unsupported type: %s' % type(other))

/usr/local/lib/python2.7/dist-packages/pandas/core/generic.pyc in _align_series(self, other, join, axis, level, copy, fill_value, method, limit, fill_axis)
   5827                     fdata = fdata.reindex_indexer(join_index, lidx, axis=0)
   5828             else:
-> 5829                 raise ValueError('Must specify axis=0 or 1')
   5830 
   5831             if copy and fdata is self._data:

ValueError: Must specify axis=0 or 1
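
The traceback goes through pandas' `DataFrame.__setitem__`, so `X` here is a DataFrame, and the boolean-mask assignment on line 28 of `_ppca.py` is the step that fails. One possible workaround, sketched below on the assumption that `fit` only needs a plain float ndarray, is to convert the data before fitting. `as_float_array` is a hypothetical helper, not part of ppca. It returns a copy because `fit` stores the input as `self.raw` and then overwrites values in it.

```python
import numpy as np
import pandas as pd


def as_float_array(data):
    """Return a float ndarray copy of `data` (DataFrame or array-like).

    Hypothetical helper for illustration; not part of the ppca package.
    """
    return np.array(data, dtype=float)  # np.array copies by default


# Toy DataFrame with one inf and one NaN (made-up values)
df = pd.DataFrame({"a": [1.0, np.inf, 3.0], "b": [4.0, 5.0, np.nan]})
X = as_float_array(df)

# Same replacement as line 28 of the traceback, now on an ndarray:
# every inf becomes the largest finite value in the data.
X[np.isinf(X)] = np.max(X[np.isfinite(X)])

# With ppca installed, the call would then be (assumption, untested here):
# PPCA().fit(X)
```

The original DataFrame is left untouched, so the inf values can still be inspected afterwards.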

ModuleNotFoundError

Never mind, I realized this is for Python 2.7.
I get the following error:

import ppca
Traceback (most recent call last):
File "", line 1, in
File "/Users/BadWizard/anaconda3/envs/py36/lib/python3.6/site-packages/ppca/__init__.py", line 1, in <module>
from _ppca import PPCA
ModuleNotFoundError: No module named '_ppca'

Any suggestions? Thanks
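
Python 3 no longer supports implicit relative imports, so `from _ppca import PPCA` inside the package's `__init__.py` looks for a top-level module named `_ppca` and fails. The explicit form `from ._ppca import PPCA` resolves inside the package. The sketch below reproduces both cases with a throwaway package; `demo_ppca` and the stub `PPCA` class are hypothetical stand-ins, and whether the installed ppca needs other Python 3 changes is not known from this report.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def build_demo_package(root, import_line):
    """Create a throwaway package `demo_ppca` whose __init__.py is `import_line`."""
    pkg = Path(root) / "demo_ppca"
    pkg.mkdir()
    (pkg / "_ppca.py").write_text("class PPCA:\n    pass\n")
    (pkg / "__init__.py").write_text(import_line + "\n")


def try_import(import_line):
    """Return None if the package imports, else the exception class name."""
    with tempfile.TemporaryDirectory() as root:
        build_demo_package(root, import_line)
        sys.path.insert(0, root)
        try:
            importlib.invalidate_caches()
            importlib.import_module("demo_ppca")
            return None
        except ImportError as exc:
            return type(exc).__name__
        finally:
            # Clean up so the two attempts do not affect each other
            sys.path.remove(root)
            sys.modules.pop("demo_ppca", None)
            sys.modules.pop("demo_ppca._ppca", None)


implicit = try_import("from _ppca import PPCA")   # Python 2 style import
explicit = try_import("from ._ppca import PPCA")  # explicit relative import

print("implicit:", implicit)
print("explicit:", explicit)
```

Applying the same one-character change to the installed `ppca/__init__.py` is the likely local fix for the error above.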

The dimension of the components is always 2

Hi Allen, first of all, thanks for your great work. I have tried your code, but it seems that the resulting component_mat always has shape (#data, 2), no matter what value d has. Could you please take a look and find the reason? Thanks in advance!

can't import ppca

In [1]: from ppca import PPCA                              
-----------------------------------------------------------
ModuleNotFoundError       Traceback (most recent call last)
<ipython-input-1-e861ede3f408> in <module>
----> 1 from ppca import PPCA

~/anaconda3/envs/red/lib/python3.6/site-packages/ppca/__init__.py in <module>
----> 1 from _ppca import PPCA

ModuleNotFoundError: No module named '_ppca'

Please help me if you see this.

Reference for notation and equations in the EM algorithm

Hello Allen, thank you for providing this package.

Can you please provide a reference for the matrix equations in your implementation of the EM algorithm? Appendix B of Tipping and Bishop [1] provides an EM algorithm for probabilistic PCA, but your implementation handles missing values in the data matrix, which Tipping and Bishop do not address directly (nor does any other reference I can find that agrees with your algorithm).

Thank you.

[1] Tipping, Michael E., and Christopher M. Bishop. "Probabilistic principal component analysis." Journal of the Royal Statistical Society: Series B (Statistical Methodology) 61, no. 3 (1999): 611-622.
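
For comparison, here is a minimal sketch of the complete-data EM updates for probabilistic PCA as given in Tipping and Bishop [1]. It is not the package's implementation and does not handle missing values; the function name `ppca_em`, its parameters, and the toy data are assumptions made for illustration.

```python
import numpy as np


def ppca_em(T, q, n_iter=500, seed=0):
    """EM for probabilistic PCA on a complete (N, d) data matrix T.

    Model: t = W x + mu + eps, with x ~ N(0, I_q) and eps ~ N(0, sigma2 * I_d).
    Returns the (d, q) loading matrix W, the mean mu, and the noise variance.
    """
    rng = np.random.default_rng(seed)
    N, d = T.shape
    mu = T.mean(axis=0)
    Tc = T - mu
    W = rng.standard_normal((d, q))
    sigma2 = 1.0
    for _ in range(n_iter):
        # E-step: posterior moments of the latent variables
        M_inv = np.linalg.inv(W.T @ W + sigma2 * np.eye(q))
        Ex = Tc @ W @ M_inv                       # row n is <x_n>
        sum_Exx = N * sigma2 * M_inv + Ex.T @ Ex  # sum over n of <x_n x_n^T>
        # M-step: update W first, then sigma2 using the new W
        W = (Tc.T @ Ex) @ np.linalg.inv(sum_Exx)
        sigma2 = (
            np.sum(Tc ** 2)
            - 2.0 * np.sum(Ex * (Tc @ W))
            + np.trace(sum_Exx @ W.T @ W)
        ) / (N * d)
    return W, mu, sigma2


# Toy data: 2 latent factors in 5 dimensions plus small isotropic noise
rng = np.random.default_rng(1)
A = np.array([[2.0, 0.0, 1.0, 0.0, 1.0],
              [0.0, 2.0, 0.0, 1.0, -1.0]])
T = rng.standard_normal((500, 2)) @ A + 0.1 * rng.standard_normal((500, 5))

W, mu, sigma2 = ppca_em(T, q=2)
print(W.shape, sigma2)
```

At the maximum-likelihood solution, `sigma2` equals the average of the discarded eigenvalues of the sample covariance, which gives a simple check on the result.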

paper reference

Hi!

Is there any literature to support this work, such as a published, peer-reviewed paper that you can link to?

thank you!

ImportError

Traceback (most recent call last):
  File "/Users/jiangqy/Code/ppca/bin/test.py", line 5, in <module>
    from ppca import PPCA
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/ppca/__init__.py", line 1, in <module>
    from _ppca import PPCA
ImportError: No module named '_ppca'

I use Python3

Total variance explained > 1

Forgive me if this is a known counterintuitive point deemed irrelevant, but I noticed total variance explained by all components is greater than one. That's true in my dataset with missing values, but also in the complete example below.

import numpy as np
from ppca import PPCA

x = np.random.randn(50,20)
m = PPCA()
m.fit(data=x)

print(m.var_exp)
[0.11460246 0.21691676 0.30977113 0.40169889 0.4885789  0.56032857
 0.62697946 0.68458968 0.73693932 0.78439966 0.82519526 0.86416853
 0.89399395 0.9215888  0.94654215 0.9696493  0.98877802 1.00211534
 1.01186025 1.02040816]

It seems to be related to the fact that the sum of all eigenvalues is greater than the number of dimensions in the original dataset. Since sum of eigenvalues should be equal to trace of correlation matrix, I would not expect that to be the case.

print(np.cumsum(m.eig_vals)/20.)
[0.11460246 0.21691676 0.30977113 0.40169889 0.4885789  0.56032857
 0.62697946 0.68458968 0.73693932 0.78439966 0.82519526 0.86416853
 0.89399395 0.9215888  0.94654215 0.9696493  0.98877802 1.00211534
 1.01186025 1.02040816]

print(m.eig_vals.sum())
20.40816326530614
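
The reported total, 20.40816..., equals 20 × 50/49, and the final cumulative value, 1.02040816, equals 50/49. That factor of N/(N−1) is what appears when data is standardized with the population standard deviation (`ddof=0`) and the covariance is then computed with the sample normalization (`ddof=1`, the `np.cov` default). The sketch below reproduces the effect with NumPy only; that the package mixes the two normalizations in this way is an assumption, not something confirmed from its source here.

```python
import numpy as np

rng = np.random.default_rng(0)
N, d = 50, 20
x = rng.standard_normal((N, d))

# Standardize with the population standard deviation (np.std default, ddof=0)
z = (x - x.mean(axis=0)) / x.std(axis=0)

# Sample covariance (np.cov default, ddof=1): each diagonal entry is N / (N - 1)
cov = np.cov(z, rowvar=False)
eig_vals = np.linalg.eigvalsh(cov)[::-1]  # sorted largest first

total = eig_vals.sum()                       # d * N / (N - 1), not d
var_exp_raw = np.cumsum(eig_vals) / d        # last entry is N / (N - 1)
var_exp_fixed = np.cumsum(eig_vals) / total  # last entry is 1

print(total, var_exp_raw[-1], var_exp_fixed[-1])
```

Dividing by the sum of the eigenvalues rather than by the number of dimensions keeps the cumulative ratio at or below one regardless of which normalization is used.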
