ripkelab / ricopili Goto Github PK
View Code? Open in Web Editor NEWMain ricopili repo for public releases
Main ricopili repo for public releases
The list of logs created by ./rp_config
(line 1491) is missing the clumper_info
log used by clump_nav3
. Adding $conf{loloc}/clumper_info
to that line of rp_config
should fix that issue.
Relatedly, clump_navi
references $homedir/clumper_info
. Since this specifies the home directory, it may or may not end up being the same file clump_nav3
's log. Not clear whether it's desired to have these two clumping scripts use the same log? In either case, the log for clump_navi
should probably exist in the designated $loloc
rather than the home directory.
(Credit @ce-carey for encountering this issue.)
When dividing data into the ~950 genomic chunks for imputation, it is possible to end up with chunks that contain very few genotyped markers (i.e. <5), particularly near centromeres and at the ends of chromosomes. As long as at least 1 genotyped SNP is in the chunk, ricopili will try to phase and impute. During pre-phasing, if there are any individuals that are missing for all SNPs in a chunk, shapeit will exit with an error. When there are very few SNPs in the chunk it is likely that at least one individual will be completely missing and phasing will fail. Ricopili will eventually error out once it realizes it is stuck trying to phase those chunks.
Current workaround:
./pi_sub/preph_mu1.job_list.s
) and exclude them from imputation with --refiex
flag. The cause of the error can be verified in shapeit log file ./pi_sub/haps_*/plink.*.chrX_XXX_XXX.shape.log
for relevant chunk.Possible solutions:
Hi, I noticed a likely bug in the most recent ricopili (2019_Jun_25.001). The my.wget script calls wget with a flag that used to be -nv (for --no-verbose), but now is just -n. Since there is no -n option in wget, it gets joined with the first letter of the URL, which was 'h' for me, into -nh. Since wget knows no -nh option either, it then stops.
Line 18 in 757fb9d
I am working on generating polygenic risk scores from base data that is in daner format with a hg19 genome build. My target data genome build is hg38. My base data daner file is smaller so I was hoping to lift it to hg38 to match the builds. Is there an essay way to do this for the daner format? I've been trying to find something that will work with the daner format.
pcaer_20
reliably fails at step me300
when specifying --trio
.
The cause of the error is that the trio flag defines a trio_ex_file
to be used with as ie_file
(at line 978-981) which is passed to bcomb
at line 1088, but the code to create that file (line 1049-1054) has been commented out by the if(0)
on the block starting line 1022.
Currently unclear to me whether the fix is to re-activate the code creating the trio_ex_file
or to stop that argument from being passed to bcomb
.
Excessively long SNP names are not supported by eigensoft/smartpca, meaning smartpca - and thus pcaer_20
- will crash if long SNP names are present (e.g. >20 characters). This is entirely analogous to the limits on lengths of FID/IID, just much rarer since SNP names tends to be relatively compact and don't have extra length from the ricopili id tager.
This crash is currently being observed with unusually named indels (e.g. 1:6638843-CGCTGCTGCT-CGCTGCTGCTGCTGCTGCT
). This permits at least 2 plausible solutions:
Remove indels from consideration when running PCA. This has the advantage of being a relatively simple filter to add as part of the "strict qc" filters before running smartpca, e.g. by adding --snps-only no-DI
to the plink call at line 1178 of pcaer_20
.
Do a temporary substitution/shortening of SNP names prior to running smartpca. This is more work, but allows indels to stay in the data so that the overall PCA workflow remains the same. Shortening of the SNP names could be done at lines 400-415 of eigen_pca3
while cM distances are being removed. I'm not sure any output that would need translation back (e.g. SNP weights from PCA solution) is being kept currently.
(thanks to @chiayenchen for identifying this issue)
If clump_nav3
is run with the --noindel
flag set, SNPs with lower case alleles will be excluded. This occurs because indels are identified by checking whether the lengths of A1 and A2 are 1 and whether A1 and A2 are each one of A,C,G, or T (lines 719-724).
This likely doesn't make a difference for the usage of clump_nav3
within ricopili, but may impact users running clump_nav3
as a stand-alone.
Proposed fix:
Change the regexes in lines 722 and 723 to check =~ /[ACGTacgt]/
rather than =~ /[ACGT]/
.
Plots generated by pcaer
(e.g. MDS plots, summary plot of PC association to phenotype) use phenotypes from the individuals FID tag (cas
, con
, mis
) rather than from the fam file. This means that any updates to the phenotype after initial QC (e.g. for alternative outcomes, updated phenotyping, etc) that are applied to the fam file will not be reflected in the PCA output.
This is particularly problematic for the summary pdf, since the associations between the original phenotype (indicated in the FID) and the PCs will be reported rather than association with the updated/alternative phenotype, which may be misleading when using the pcaer
output to identify necessary PC covariates for GWAS with the updated phenotype.
Affected code is in pca_plot_5
with the creation of the $mds_ow_file
and subsequent use of the CC
column in R for plotting (lines 350-600).
Hello,
I'm trying to install Ricopili on a linux HPC cluster using the directions found here:
https://docs.google.com/document/d/14aa-oeT5hF541I8hHsDAL_42oyvlHRC5FWR7gir4xco
I've gotten to the point of running "rp_test_navi" and waiting for output. The pipeline completes but doesn't write expected files. The main error seems to be that it's trying to submit a job from a compute node and it also can't understand "batch_job_output_jid" in my custom config file (honestly, I can't understand what this should be either). See below for details on these errors.
Any help is appreciated. I think I'm close to having RP working, but I need some guidance. Thanks!
This is what I get after running the test module (I removed the boilerplate "RICOPILI" text for clarity):
mchiment@argon-login-1: test_2$ rp_test_navi
Config_file: /Users/mchiment/ricopili.conf
rp_test_navi - module of ricopili pipeline
version: 2019_Feb_18.001
https://sites.google.com/a/broadinstitute.org/ricopili/home
Stephan Ripke: [email protected]
This looks OK, I think...but the manual says I should see these files:
rp_test_forest_join-nup.pdf (a merge of all PDFs)
rp_text.xls (a mock excel file)
ldsc.PGC_meta.r4.gz.tar.gz (the results from the LDScore analyses
My output (working) directory looks like this after running the test:
drwxr-x--- 2 mchiment its-rs-user 8 Apr 4 13:51 errandout
-rw-r--r-- 1 mchiment its-rs-user 271 Apr 4 13:50 heavycomp.job_list
-rwxr--r-- 1 mchiment its-rs-user 252 Apr 4 13:50 heavycomp.job_list.sub1.sh
-rwxr--r-- 1 mchiment its-rs-user 69 Apr 4 13:50 heavycomp.job_list.sub2.sh
-rw-r--r-- 1 mchiment its-rs-user 123 Apr 4 13:50 j.heavycomp.test
-rw-r--r-- 1 mchiment its-rs-user 67 Apr 4 13:50 j.heavycomp.test.id
-rw-r--r-- 1 mchiment its-rs-user 192 Apr 4 13:50 j.heavycomp.test.log
-rw-r--r-- 1 mchiment its-rs-user 120 Apr 4 13:51 j.qqplot.test
-rw-r--r-- 1 mchiment its-rs-user 0 Apr 4 13:51 j.qqplot.test.id
-rw-r--r-- 1 mchiment its-rs-user 186 Apr 4 13:51 j.qqplot.test.log
-rw-r--r-- 1 mchiment its-rs-user 69 Apr 4 13:50 j._te_test
-rw-r--r-- 1 mchiment its-rs-user 49 Apr 4 13:50 j._te_test.id
-rw-r--r-- 1 mchiment its-rs-user 185 Apr 4 13:50 j._te_test.log
-rw-r--r-- 1 mchiment its-rs-user 33K Apr 4 13:50 PGC_cohort1.ch.fl.r4.gz
-rw-r--r-- 1 mchiment its-rs-user 34K Apr 4 13:50 PGC_cohort2.ch.fl.r4.gz
-rw-r--r-- 1 mchiment its-rs-user 34K Apr 4 13:50 PGC_cohort3.ch.fl.r4.gz
-rw-r--r-- 1 mchiment its-rs-user 34K Apr 4 13:50 PGC_cohort4.ch.fl.r4.gz
-rw-r--r-- 1 mchiment its-rs-user 54K Apr 4 13:50 PGC_meta.r4.gz
-rw-r--r-- 1 mchiment its-rs-user 717 Apr 4 13:51 qqplot.job_list
-rw-r--r-- 1 mchiment its-rs-user 8 Apr 4 13:51 touch.10.finished
-rw-r--r-- 1 mchiment its-rs-user 8 Apr 4 13:51 touch.11.finished
-rw-r--r-- 1 mchiment its-rs-user 8 Apr 4 13:51 touch.12.finished
-rw-r--r-- 1 mchiment its-rs-user 8 Apr 4 13:51 touch.13.finished
-rw-r--r-- 1 mchiment its-rs-user 8 Apr 4 13:51 touch.14.finished
-rw-r--r-- 1 mchiment its-rs-user 8 Apr 4 13:51 touch.15.finished
-rw-r--r-- 1 mchiment its-rs-user 8 Apr 4 13:51 touch.16.finished
-rw-r--r-- 1 mchiment its-rs-user 8 Apr 4 13:51 touch.17.finished
-rw-r--r-- 1 mchiment its-rs-user 8 Apr 4 13:51 touch.18.finished
-rw-r--r-- 1 mchiment its-rs-user 8 Apr 4 13:51 touch.19.finished
-rw-r--r-- 1 mchiment its-rs-user 8 Apr 4 13:51 touch.1.finished
-rw-r--r-- 1 mchiment its-rs-user 8 Apr 4 13:51 touch.20.finished
-rw-r--r-- 1 mchiment its-rs-user 8 Apr 4 13:51 touch.2.finished
-rw-r--r-- 1 mchiment its-rs-user 8 Apr 4 13:51 touch.3.finished
-rw-r--r-- 1 mchiment its-rs-user 8 Apr 4 13:51 touch.4.finished
-rw-r--r-- 1 mchiment its-rs-user 8 Apr 4 13:51 touch.5.finished
-rw-r--r-- 1 mchiment its-rs-user 8 Apr 4 13:51 touch.6.finished
-rw-r--r-- 1 mchiment its-rs-user 8 Apr 4 13:51 touch.7.finished
-rw-r--r-- 1 mchiment its-rs-user 8 Apr 4 13:51 touch.8.finished
-rw-r--r-- 1 mchiment its-rs-user 8 Apr 4 13:51 touch.9.finished
Looking in "errandout" we have from the logs:
mchiment@argon-login-1: errandout$ cat _te_test.e7447609
Unable to run job: denied: host "argon-lc-g11-16.hpc" is not a submit host
Exiting
AND
mchiment@argon-login-1: errandout$ cat _te_test.o7447609
Config_file: /Users/mchiment/ricopili.conf
succ: touch.1.finished
succ: touch.2.finished
succ: touch.3.finished
succ: touch.4.finished
succ: touch.5.finished
succ: touch.6.finished
succ: touch.7.finished
succ: touch.8.finished
succ: touch.9.finished
succ: touch.10.finished
succ: touch.11.finished
succ: touch.12.finished
succ: touch.13.finished
succ: touch.14.finished
succ: touch.15.finished
succ: touch.16.finished
succ: touch.17.finished
succ: touch.18.finished
succ: touch.19.finished
succ: touch.20.finished
PGC_cohort1.ch.fl.r4.gz.out-qq.pdf not found, need to start job
PGC_cohort2.ch.fl.r4.gz.out-qq.pdf not found, need to start job
PGC_cohort3.ch.fl.r4.gz.out-qq.pdf not found, need to start job
PGC_cohort4.ch.fl.r4.gz.out-qq.pdf not found, need to start job
Config_file: /Users/mchiment/ricopili.conf
starting job_array, j.qqplot.test
batch_taskid: $SGE_TASK_ID
this is the job_array sent: my.start_job --parn 16 -n $SGE_TASK_ID --jobfile qqplot.job_list
qsub -l h_vmem=2g -l h_rt=2:00:00 -tc 0 -t 1-1 -o /Users/mchiment/temp/test_ricopili/test_2/errandout -e /Users/mchiment/temp/test_ricopili/test_2/errandout -N qqplot.test j.qqplot.test > j.qqplot.test.id
->system call failed: 1
Config_file: /Users/mchiment/ricopili.conf
stdout from array submission:
dependent job ID:
Config custom file:
mchiment@argon-login-1: rp_bin$ cat rp_config.custom.txt
batch_jobcommand qsub
batch_memory_request -l_SPACE_h_vmem=XXXg
batch_walltime -l_SPACE_h_rt=HH:MM:SS
batch_array -t_SPACE_1-XXX
batch_max_parallel_jobs_per_one_array -tc_SPACE_YYY
batch_jobfile XXX
batch_name -N_SPACE_XXX
batch_stdout -o_SPACE_XXX
batch_stderr -e_SPACE_XXX
batch_job_dependency -hold_jid_SPACE_XXX
batch_array_task_id $SGE_TASK_ID
batch_other_job_flags NONE
batch_job_output_jid Your_SPACE_job-array_SPACE_XXX.1-YYY:1_SPACE("ZZZ")_SPACE_has_SPACE_been_SPACE_submitted
batch_ncores_per_node 56
batch_mem_per_node 32
Using latest rp_bins (2019_Jun_25.001).
When running refdir_navi for building a reference, (after fixing #63 and #62 manually) in the job bcftools_impute2 my.prepvcf_imp2 is called.
my.prepvcf_imp2 then tries to call
[root@06f061df1b71 1KGPv3_REFERENCE_PANEL]# /ricopili/rp_dep/bcftools/bcftools-1.9_bin/bcftools convert --vcf-ids -h ALL_v5a.20130502.chr22_1KG_0517.impute loc.ALL.chr22.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.vcf.gz.filtnorm.gz.reform.gz
That command fails with
Failed to open loc.ALL.chr22.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.vcf.gz.filtnorm.gz.reform.gz: unknown file type Error systemcode: 65280
In some cases metaber7
flips the sign of the meta-analysis result to keep the effect for the intended A1/A2, but does not also flip signs in the Direction field.
The workflow for metaber7
is roughly:
metal
script for the meta-analysismetal
with the clean GWASsmetal
results, and write daner output with the meta-analysis results plus the info that metal
isn't set to track (case/control freq, info, sample size, etc).For step 5, it is possible that the order of A1/A2 reported by metal
is not the order of A1/A2 indexed by metaber7
in step 2 (see Background below). In order to ensure that the allele frequency information added to the final output is correct, metaber7
checks for this swap. If the alleles have been swapped, metaber7
flips the sign of the effect in the meta-analysis results to match the A1/A2 order indexed in step 2. This maintains A1/A2 in the order indexed from the output GWAS, and reports the corresponding freq and effect size.
This is all reasonable, but the meta-analysis results also include a "Direction" field that reports the sign of the effect from each cohort in the meta-analysis. This field gets passed through directly from the metal
output, and thus always reports the sign for the A1/A2 ordering in metal
even when the the final output is reported a swapped order with the sign of the effect size flipped.
Adapted from observed data, with numbers rounded/obscured to protect actual results
Dataset 1:
CHR SNP BP A1 A2 FRQ_A_500 FRQ_U_10000 INFO OR SE P ngt
1 rs123456789 87654321 G T 0.01000000000000000 0.02000000000000000 0.90000 1.0100000000000 0.3000000 0.9000000 0
Dataset 2:
CHR SNP BP A1 A2 FRQ_A_1500 FRQ_U_10000 INFO OR SE P ngt
1 rs123456789 87654321 G T 0.01000000000000000 0.02000000000000000 0.90000 0.9500000000000 0.2000000 0.7000000 0
Metal output:
MarkerName Allele1 Allele2 Effect StdErr P-value Direction HetISq HetChiSq HetDf HetPVal
rs123456789 t g 0.0900 0.2000 0.500 -+ 0.0 0.100 1 0.7000
Note: effect of T allele is positive, with Direction "-+" in the two cohorts, matching the above input with A1/A2 swapped.
Daner output:
CHR SNP BP A1 A2 FRQ_A_5000 FRQ_U_10000 INFO OR SE P ngt Direction HetISqt HetDf HetPVa Nca Nco Neff
1 rs123456789 87654321 G T 0.01000 0.02000 0.900 0.9000 0.2000 0.5000 0 -+ 0.0 1 0.7000 2000 20000 8000.00
Note: A1/A2 swapped to match input GWAS, and reports negative effect (OR < 1) of G allele. Direction field still reports "-+", which is result for T allele rather than G allele.
The relevant code block in metaber7
for step 5 is lines 923-1056. Should be relatively straight-forward to fix in one of two ways:
Update the Direction field when flipping the effect size from metal
. This update could be done along the effect size flip at line 956. It would require parsing the Direction field string to make the relevant substitutions. Printing this updated string would occur at line 1019 (in place of $cells[6]
).
Take A1/A2 order as reported by metal
, and flip the indexed allele frequency to match. In place of the effect size flip at 956, frequencies fca{}/nca{}
and fco{}/nco{}
currently printed lins 996-1000 wound need to be updated. Also, the A1/A2 from metal
(e.g. $a1_name
) would need to be printed rather than the indexed alleles (e.g. $a1{$snp_name}
) at line 975-976.
Not sure which of these might be preferred based on any impacts on downstream analysis/processing.
This issue arises in part due to strange allele swapping behavior from metal
. While metal
documentation states that the reported Allele1
is the first allele from the first occurrence of the marker in the input files, it appears this isn't actually the case. This alternative instruction file instead says the "lowest numeric reference allele" with be used if the input files differ, which seems closer to accurate, but still not quite right.
The relevant code block from the metal
source code is:
bool FlipAlleles(String & al1, String & al2, double & effect, double & freq)
{
al1.ToLower();
al2.ToLower();
if (al1 > al2)
{
al1.Swap(al2);
effect *= -1.0;
freq = 1.0 - freq;
}
if (al1 == "a" || al1 == "c" && al2 == "g")
return false;
FlipAllele(al1);
FlipAllele(al2);
if (al1 > al2)
{
al1.Swap(al2);
effect *= -1.0;
freq = 1.0 - freq;
}
return true;
}
This gets run for the first occurrence of every marker.
If it returns true
, then FlipAllele(al1)
and FlipAllele(al2)
are run again prior to final output. Because this code actually alters al1
and al2
, any switching from al1.Swap(al2)
will propagate though to metal
results files.
By my read, this means markers where the first occurrence has alleles C/A, C/T, G/A, G/C, G/T, and T/A markers will have their allele order swapped, while A/C, A/G, A/T, C/G, T/C, and T/G markers will have their allele orders preserved. Looking through some results from metal
seems to confirm this. It's also consistent with the motivating example above that swaps G/T to T/G.
In haps2dos4
lines 237-248, the fam file is processed to force all phenotypes to 1 or 2. The reason for this is unclear, but regardless this altered phenotype is propagated through the remainder of imputation.
As a result any individuals with non-case/control phenotypes, importantly including missing phenotypes of 0/-9, are recoded to cases. Unless an alternative phenotype is specified, this changed phenotype will also be used for GWAS.
danscore_multiplot_2 expects the config file to be located at $HOME/ricopili.conf
ricopili/rp_bin/danscore_multiplot_2
Line 9 in 757fb9d
my $conf_file = $ENV{RPHOME}."/ricopili.conf";
A perl syntax error was recently introduced in filtnorm_filter here:
ricopili/rp_bin/filtnorm_filter
Line 147 in 757fb9d
Either of
print "$icc lines read\n" if ($icc % 10000 == 0 && $debug);
or
do {print "$icc lines read\n" if ($icc % 10000 == 0)} if ($debug);
should work better.
Hi,
I'm trying to add a fresh install of ricopili on LISA, but I am running into permission errors for accessing the dependencies in /home/pgcdac/rp_bin_1018/dependencies.
This is true on multiple accounts, including on accounts where ricopili is already installed
Please can you check whether previous permission settings on this account have been lost? Do users need to be members of a specific permission group to be able to use these dependencies on LISA?
Thanks
*EDIT: Existing installation does work correctly (error in the config had slipped in).
Hello,
I was able to run rp_test_navi successfully, but I get an error when I attempt the following command to analyze a single data set (PLINK files). I try to running the following command:
preimp_dir --dis nan --pop mix --out GO_Quad_5removed_ATGC_filtered --maf 0.01
The first time I run the command it produces nan.names, for which I then change the value in the first column to "pndc1" and save it. I then run the same command as above once again:
preimp_dir --dis nan --pop mix --out GO_Quad_5removed_ATGC_filtered --maf 0.01
And it causes an error, here is the command line output of the command
-----------------------------------------------------
switched on SERIAL mode because of configuration file
_ _ _ _
_ __(_) ___ ___ _ __ (_) (_)
| '__| |/ __/ _ \| '_ \| | | |
| | | | (_| (_) | |_) | | | |
|_| |_|\___\___/| .__/|_|_|_|
|_|
#######################################################################
#######################################################################
## ###
## preimp_dir - module of ricopili pipeline ###
## version: 2019_Oct_15.001 ###
## ###
## https://sites.google.com/a/broadinstitute.org/ricopili/home ###
## Stephan Ripke: [email protected] ###
## ###
#######################################################################
#######################################################################
Warning:
--------
GO_Quad_5removed_ATGC_filtered.fam: less than 10 cases and/or less than 10 controls, so will do only restricted QC (without association)
Running job: plague
-----------------------------------------------------
switched on SERIAL mode because of configuration file
_ _ _ _
_ __(_) ___ ___ _ __ (_) (_)
| '__| |/ __/ _ \| '_ \| | | |
| | | | (_| (_) | |_) | | | |
|_| |_|\___\___/| .__/|_|_|_|
|_|
#######################################################################
#######################################################################
## ###
## preimp_dir - module of ricopili pipeline ###
## version: 2019_Oct_15.001 ###
## ###
## https://sites.google.com/a/broadinstitute.org/ricopili/home ###
## Stephan Ripke: [email protected] ###
## ###
#######################################################################
#######################################################################
Warning:
--------
GO_Quad_5removed_ATGC_filtered.fam: less than 10 cases and/or less than 10 controls, so will do only restricted QC (without association)
nan_pndc1_mix_PR with limited QC (no association)
Running job: qc
restricted QC
-------------
Value "-0.4,0.4" invalid for option xlimright (number expected)
Illegal division by zero at /data/Peter/Ricopili/rp_bin/meta2flags line 174.
meta2flags --meta pofi_nan_pndc1_mix_PR.meta
->system call failed: 255 at /data/Peter/Ricopili/rp_bin/rep_qc2_14 line 482.
-----------------------------------------------------
switched on SERIAL mode because of configuration file
_ _ _ _
_ __(_) ___ ___ _ __ (_) (_)
| '__| |/ __/ _ \| '_ \| | | |
| | | | (_| (_) | |_) | | | |
|_| |_|\___\___/| .__/|_|_|_|
|_|
#######################################################################
#######################################################################
## ###
## preimp_dir - module of ricopili pipeline ###
## version: 2019_Oct_15.001 ###
## ###
## https://sites.google.com/a/broadinstitute.org/ricopili/home ###
## Stephan Ripke: [email protected] ###
## ###
#######################################################################
#######################################################################
Warning:
--------
GO_Quad_5removed_ATGC_filtered.fam: less than 10 cases and/or less than 10 controls, so will do only restricted QC (without association)
nan_pndc1_mix_PR with limited QC (no association)
Running job: qc
##################################################################
##### Error:
##### step qc has been done repeatedly without any progress
##### preimp pipeline stopped
##### /data/PNC/Ricopili_cleaning/phg000381.v2.NIMH_NeurodevelopmentalGenomics.genotype-calls-matrixfmt.Human610-Quadv1_B.c1.GRU-NPU/preimputation preimp_dir --dis nan --pop mix --out GO_Quad_5removed_ATGC_filtered --maf 0.01
##### if reason does not appear obvious
##### have a look at the wiki page
##### https://sites.google.com/a/broadinstitute.org/ricopili/
##### or contact the developers
##### version: 2019_Oct_15.001
##################################################################
The PLINK data set I use has "-9" for all values in the phenotype column. It also has a value of "0" for all family, mother, and father IDs. Also it has "0" for some alleles in the BIM file. However, I do not think those are the issues because I tried changing the family IDs to real values and removing the SNPs with "0" as alleles from the data and the same problem continues.
The current installation process seems to demand interactive input from the user, but this makes it impossible to automatically install it in a docker image.
Would it be possible to add a configuration option that predefines all paths and other requirements to make it possible to install ricopili into a docker image? It could be something like "./rp_config --use-config config-file.conf" where config-file.conf contains everything that would otherwise have been provided manually by the person installing it. What do you think?
The motherscripts detect failing/stuck tasks by comparing the current call in send_jobarray to that last task queued (as recorded in that master log file for that motherscript). If it's submitting the same job attempted previously, it errors and stops as intended.
This comparison relies on reading the last line of the master log file to match against. If 2+ jobs with the same master log file are running simultaneously (e.g. postimp with different phenotypes, pca with and without reference population samples, etc), the previous task submission may no longer be on the last line of the log file if one of the other jobs has submitted a task in the interim.
In the worst case scenario, 2+ jobs that are all failing can cycle on the same task infinitely, never stopping since they don't see their own previous task on the last line of the log file, and depending on the task potentially generating 100,000s of temp files in the process.
Same log below based on real-world example:
/path/to/datadir postimp_navi_17 --mds file.mds_cov --coco 1,2,3,4 --out gwas_w_cov --addcov cov.txt danerlong.000921 Thu_Dec_17_13:00:22_2015
/path/to/datadir postimp_navi_17 --mds file.mds_cov --coco 1,2,3,4 --out gwas_wo_cov danerlong.000921 Thu_Dec_17_13:02:02_2015
/path/to/datadir postimp_navi_17 --mds file.mds_cov --coco 1,2,3,4 --out gwas_w_cov --addcov cov.txt danerlong.000921 Thu_Dec_17_13:03:37_2015
/path/to/datadir postimp_navi_17 --mds file.mds_cov --coco 1,2,3,4 --out gwas_wo_cov danerlong.000921 Thu_Dec_17_13:05:05_2015
/path/to/datadir postimp_navi_17 --mds file.mds_cov --coco 1,2,3,4 --out gwas_w_cov --addcov cov.txt danerlong.000921 Thu_Dec_17_13:06:21_2015
/path/to/datadir postimp_navi_17 --mds file.mds_cov --coco 1,2,3,4 --out gwas_wo_cov danerlong.000921 Thu_Dec_17_13:07:39_2015
/path/to/datadir postimp_navi_17 --mds file.mds_cov --coco 1,2,3,4 --out gwas_w_cov --addcov cov.txt danerlong.000921 Thu_Dec_17_13:09:01_2015
/path/to/datadir postimp_navi_17 --mds file.mds_cov --coco 1,2,3,4 --out gwas_wo_cov danerlong.000921 Thu_Dec_17_13:10:18_2015
Current install is broken for me. It gets stuck while setting the sloc
variable.
The latest commit changed the use of the %clusters
variable on line 278 of rp_config
, and setting $cluster
directly.
This messed with the scratch directory creation logic. First, its not recognizing the path_to_scratchdir
variable. It looks like this relies on $clusters{custom} == 1
. For similar reasons, I get dumped into an infinite loop at line 1324 when bottoming out at the conditional statement there.
I think the fix should be just setting my %clusters = (..."custom", 1)
on line 283.
When executing the postimp_navi script with the --gwclump flag here
Lines 4579 to 4580 in 757fb9d
Lines 4635 to 4648 in 757fb9d
chdir "$reportdir";
like this:unless (-e "$distdir/$numbers_file") {
chdir "$reportdir";
my $sys = "$numbers_script basic.$outname $daner_all @daner_single\n";`
Line 573 dies: die "$! ($prefix.tdt)" unless open FILE, "< $prefix.tdt";
No tdt test is run because of lines 904-906
if ($noassoc) {
$assotdt = "";
}
Fix: Line 925 needs to be moved under the unless ($noassoc) { block
A declarative, efficient, and flexible JavaScript library for building user interfaces.
๐ Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. ๐๐๐
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google โค๏ธ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.