Comments (3)
Although in principle it is possible to break down the genotypes by chromosome to ease memory usage, it would then not be feasible to do permutation testing properly (which is a tremendous task anyway if many variants are involved). Currently we stick to per-chromosome analysis without relying on multiple-testing results.
from tensorqtl.
It looks like you're generating dense output; are you changing the default `pval_threshold`? If you want that you should use `--return_dense` instead. For permutations/FDR, you don't need the dense output though. Can you specify what you're trying to do?
Thank you @francois-a. We set `pval_threshold` as high as 1.0 because we would like to keep some association results under the null, which are an input to a downstream trans-QTL integration step. The core tensorQTL calls in our trans analysis pipeline look like this, where `pval_threshold = 1.0` by default in our pipeline:
## Trans analysis
trans_df = trans.map_trans(genotype_df,
                           phenotype_df,
                           covariates_df,
                           batch_size=$[batch_size],
                           return_sparse=True,
                           return_r2=True,
                           pval_threshold=$[pval_threshold],
                           maf_threshold=$[maf_threshold])
## Filter out cis signal, again if customized cis windows are used, the windows is [start-win,end + win] where win = 0, else it is [start - win, start + win]
trans_df = trans.filter_cis(trans_df, phenotype_pos_df, variant_df, window=window)
## Permutation
if $['True' if permutation else 'False']:
    perm_df = trans.map_permutations(genotype_df, covariates_df, batch_size=$[batch_size],
                                     maf_threshold=$[maf_threshold])
    perm_output = trans.apply_permutations(perm_df, trans_df)
    perm_output.to_csv("$[_output:nn].transqtl_permutation.gz", sep='\t', index=None, compression={'method': 'gzip', 'compresslevel': 9})
That is why we put `return_dense = True`.
I guess from your suggestion, we can do two rounds of analysis:
- Per-chromosome analysis (or breaking the genotypes into even smaller chunks than chromosomes), where we report all results regardless of p-value and skip permutation testing.
- A second round focused on permutation testing, where we set the p-value cutoff to a small number and use the sparse output to save the marginal association results. We could even skip the trans analysis and obtain the permutation results separately in a dedicated run.
Do you think this is the correct approach to take?
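The first round could be organized roughly as below. This is only a sketch of the chunking step, in plain Python so it runs without tensorQTL; the helper name `chunk_variants_by_chrom` and the toy variant IDs are hypothetical, and the commented-out `map_trans` call assumes `genotype_df` is indexed by variant ID.

```python
from collections import defaultdict
from typing import Dict, Iterator, List, Tuple


def chunk_variants_by_chrom(
    variant_chrom: Dict[str, str], max_chunk_size: int
) -> Iterator[Tuple[str, List[str]]]:
    """Yield (chrom, variant_ids) chunks holding at most max_chunk_size variants."""
    if max_chunk_size < 1:
        raise ValueError("max_chunk_size must be a positive integer")
    # Group variant IDs by chromosome, preserving input order.
    by_chrom = defaultdict(list)
    for variant_id, chrom in variant_chrom.items():
        by_chrom[chrom].append(variant_id)
    # Split each chromosome into consecutive chunks.
    for chrom, ids in by_chrom.items():
        for start in range(0, len(ids), max_chunk_size):
            yield chrom, ids[start:start + max_chunk_size]


# Toy variant -> chromosome mapping (hypothetical IDs).
variant_chrom = {"v1": "chr1", "v2": "chr1", "v3": "chr1", "v4": "chr2"}
chunks = list(chunk_variants_by_chrom(variant_chrom, max_chunk_size=2))

# Round 1 (not executed here): report everything per chunk, no permutations.
# for chrom, ids in chunks:
#     trans_df = trans.map_trans(genotype_df.loc[ids], phenotype_df, covariates_df,
#                                return_sparse=True, pval_threshold=1.0)
```

Round 2 would then be a separate run on the full genotype matrix with a small `pval_threshold`, followed by `trans.map_permutations` and `trans.apply_permutations` as in the pipeline snippet.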
On the other hand, let me look into the downstream trans-QTL association analysis methods to see if we can get away without the summary statistics corresponding to large p-values. If so, we could use a smaller p-value cutoff for the output here and do both marginal association and permutation testing in a single pass rather than in two rounds.
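As a rough back-of-the-envelope check on how much the cutoff matters: under the null, p-values are approximately uniform, so the expected number of retained variant-phenotype pairs is about the number of tests times the cutoff. The helper `expected_null_rows` and the variant and phenotype counts below are hypothetical and not taken from our data.

```python
def expected_null_rows(n_variants: int, n_phenotypes: int, pval_threshold: float) -> float:
    """Expected number of pairs with p <= pval_threshold if every test is null."""
    if not 0.0 <= pval_threshold <= 1.0:
        raise ValueError("pval_threshold must be in [0, 1]")
    # Uniform null p-values: a fraction pval_threshold of all tests passes the cutoff.
    return n_variants * n_phenotypes * pval_threshold


# Hypothetical study size, for illustration only.
n_variants, n_phenotypes = 5_000_000, 20_000

rows_all = expected_null_rows(n_variants, n_phenotypes, 1.0)      # every pair is kept
rows_sparse = expected_null_rows(n_variants, n_phenotypes, 1e-5)  # only small p-values

print(f"pval_threshold=1.0  -> about {rows_all:.3g} rows")
print(f"pval_threshold=1e-5 -> about {rows_sparse:.3g} rows")
```

True signals add rows on top of this null expectation, so the figure for a small cutoff is a lower bound rather than a prediction.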