Comments (8)
Just for completeness: There are two other packages that might also help?
The pyls package has implemented permutation tests:
https://pyls.readthedocs.io/en/latest/index.html
And there's also the resample package:
https://github.com/resample-project/resample
from cca_zoo.
I have code for the CCA part mentioned by @LegrandNico: a full implementation in Python using cca_zoo.
It's not packaged up, but if people want to work from here, this is the snippet:
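# Permutation testing adapted from
# https://github.com/andersonwinkler/PermCCA/blob/6098d35da79618588b8763c5b4a519438703dba4/permcca.m#L131-L164
import numpy as np
import pandas as pd
from cca_zoo.models import PMD

# These names come from the surrounding analysis and are not defined here:
# z_transitions, z_mriq (the two z-scored views), latent_dims, trained_c.
n_permutation = 2
n_samples = z_transitions.shape[0]  # 710 in the original data
rng = np.random.RandomState(42)
lW, cnt = np.zeros(latent_dims), np.zeros(latent_dims)
for i in range(n_permutation):
    print(f"Permutation {1 + i} / {n_permutation}")
    if i == 0:
        # The first "permutation" is the unpermuted data (the reference).
        X_perm = z_transitions
        Y_perm = z_mriq
    else:
        x_idx = rng.permutation(n_samples)
        y_idx = rng.permutation(n_samples)
        X_perm = z_transitions[x_idx]
        Y_perm = z_mriq[y_idx]
    for k in range(latent_dims):
        print(f"Mode {1 + k} of {latent_dims}")
        perm_model = PMD(c=trained_c,
                         latent_dims=(latent_dims - k),
                         max_iter=100)
        # Refit on the data with the first k columns dropped.
        perm_model.fit(X_perm[:, k:], Y_perm[:, k:])
        r_perm = perm_model.train_correlations[0][1]
        print(r_perm)
        # Reverse cumulative sum of -log(1 - r^2); the first entry sums over all remaining modes.
        lWtmp = -1 * np.cumsum(np.log(1 - r_perm ** 2)[::-1])[::-1]
        print(lWtmp)
        lW[k] = lWtmp[0]
    if i == 0:
        # Copy the reference statistics: lW is overwritten in place by later permutations.
        lw1 = lW.copy()
    cnt = cnt + (lW >= lw1)
punc = cnt / n_permutation  # uncorrected p-values
pfwer = pd.DataFrame(punc).cummax().values  # running maximum across modes
print(punc)
print(pfwer)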
This is an incredibly lazy attempt.
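For reference, the quantities the snippet computes can be written out as below. This is read off the code rather than quoted from PermCCA, and the notation is introduced here for illustration: $K$ stands for latent_dims, $P$ for n_permutation, and $r_{k,j}^{(p)}$ for the $j$-th correlation returned by the refit for mode $k$ in permutation $p$, with $p = 1$ being the unpermuted data.

```latex
\begin{align}
\lambda_k^{(p)} &= -\sum_{j=1}^{K-k+1} \log\!\left(1 - \bigl(r_{k,j}^{(p)}\bigr)^2\right),
  \qquad k = 1, \dots, K \\
p_k^{\mathrm{unc}} &= \frac{1}{P} \sum_{p=1}^{P}
  \mathbf{1}\!\left[\lambda_k^{(p)} \ge \lambda_k^{(1)}\right] \\
p_k^{\mathrm{FWER}} &= \max_{j \le k} \, p_j^{\mathrm{unc}}
\end{align}
```

Because the unpermuted data always counts towards the sum, the smallest possible uncorrected p-value is $1/P$. With n_permutation = 2 the values can only be 0.5 or 1, so a real run needs many more permutations.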
from cca_zoo.
I would guess you could adapt @htwangtw's code to look something like https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.permutation_test_score.html
It could be rudimentary initially, with the goal of getting close to the sklearn API, which can run permutation tests in parallel.
from cca_zoo.
Funny you mention this as it's something vaguely on my radar!
I actually made a start porting the quickperms from the PALM toolbox (with the permission of Winkler/Smith) here: https://github.com/jameschapman19/scikit-perm/blob/main/skperm/permutation_tests/cca_permutation_test.py
Which would be a nice bonus with a ported version of permCCA (i.e. where the user can supply their own permutations based on exchangeability blocks).
from cca_zoo.
Thank you @htwangtw, I think that will be really helpful.
I don't know how you want to integrate the permutation functionality with the rest of the package. I can try to make something, but I will probably start with an example tutorial notebook to see whether we have everything running.
from cca_zoo.
I've put a version of this in cca_zoo.model_selection._validation with an API that hooks into scikit-learn permutation_test_score.
It's not quite the same as what Winkler does for multiple latent dimensions (but it should be similar) but it works for 1 latent dimension. Should be able to build on this.
from cca_zoo.
I'll add a proper example but it should work like:
from cca_zoo.model_selection import permutation_test_score
from cca_zoo.models import rCCA
import numpy as np
X = np.random.rand(100, 10)
Y = np.random.rand(100, 8)
model = rCCA(c=[0.1, 0.3])
permutation_test_score(model, [X, Y], n_permutations=10)
which returns score (average correlation across dimensions), permutation scores, p-value like scikit-learn
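To make the three return values concrete, here is a hand-rolled sketch of the same idea in plain NumPy. It is not the cca_zoo implementation: it uses unregularised CCA via QR and SVD, skips cross-validation, and assumes that permuting the rows of one view is the intended scheme. The function names are hypothetical.

```python
import numpy as np


def canonical_correlations(X, Y):
    """Canonical correlations of two views via QR and SVD (unregularised CCA)."""
    Qx, _ = np.linalg.qr(X - X.mean(axis=0))
    Qy, _ = np.linalg.qr(Y - Y.mean(axis=0))
    return np.linalg.svd(Qx.T @ Qy, compute_uv=False)


def simple_permutation_test(X, Y, n_permutations=100, seed=0):
    """Hypothetical helper: shuffle the rows of Y and compare mean correlations."""
    rng = np.random.default_rng(seed)
    # Score on the original data: average correlation across dimensions.
    score = canonical_correlations(X, Y).mean()
    # Scores under the null: break the row pairing between the two views.
    perm_scores = np.array([
        canonical_correlations(X, Y[rng.permutation(len(Y))]).mean()
        for _ in range(n_permutations)
    ])
    # Same convention as scikit-learn: (C + 1) / (n_permutations + 1).
    pvalue = (np.sum(perm_scores >= score) + 1) / (n_permutations + 1)
    return score, perm_scores, pvalue


# Synthetic data (an assumption): three columns of Y depend on X, five are noise.
rng = np.random.default_rng(42)
X = rng.normal(size=(100, 10))
Y = np.hstack([X[:, :3] + rng.normal(size=(100, 3)), rng.normal(size=(100, 5))])

score, perm_scores, pvalue = simple_permutation_test(X, Y, n_permutations=200)
print(f"score={score:.3f}, p-value={pvalue:.3f}")
```

Only the pairing between rows of the two views is shuffled; the columns within each view stay together, so the within-view correlation structure is preserved.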
from cca_zoo.
Although I got a lot of insight from the discussion here, there is one thing I am still confused about:
If I permute a single variable in each view (1 variable), or about 1/5 of the variables in each view (around 10 variables), the resulting p-values are very different.
It's understandable that permuting more variables adds more randomness, so the canonical correlation coefficient drops further relative to the reference experiment.
But do you know how many variables it is common to permute? And should the permutation be applied to one view only, or to multiple views?
Many thanks in advance!
Cheers,
Wantong
from cca_zoo.