grexome-timc-secondary's People

Contributors

ntm, septiera

Forkers

manojmw jjjk123

grexome-timc-secondary's Issues

Final result folders are empty: E 8_extractTranscripts.pl: couldn't find one of HV/HET/OCHV/OCHET for OM

@ntm
I tried the VEP command from 3_runVEP.pl on a VCF file cut from the whole dataset, the same way the script runs it, and it produced result files. So I am really confused about the empty final result. How can I find the step that went wrong? Thanks again for your help!

test.zip
vepStats.zip

Here is the log file. Is the problem caused by 8_extractSamples.pl? I just used the example sample.xls and changed the sample names, keeping the original column names.
sampleCLN.zip

During the process, the tmpdir is not empty.

I 2022-10-10 05:12:13: 8_extractSamples.pl - starting to run
Use of uninitialized value $header in scalar chomp at /home/data/zlsz_01/CLN_dir/grexomePIP/grexome-TIMC-Secondary/8_extractSamples.pl line 161.
Use of uninitialized value $header in split at /home/data/zlsz_01/CLN_dir/grexomePIP/grexome-TIMC-Secondary/8_extractSamples.pl line 162.
E: 8_extractSamples.pl - couldn't find OMF_HV or OMF_OTHERCAUSE_HV in header of infile OMF.csv
I 2022-10-10 05:12:14: 8_extractSamples.pl - ALL DONE, completed successfully!
I 2022-10-10 05:12:15: 8_extractTranscripts.pl - starting to run
Use of uninitialized value $header in scalar chomp at /home/data/zlsz_01/CLN_dir/grexomePIP/grexome-TIMC-Secondary/8_extractTranscripts.pl line 213.
Use of uninitialized value $header in split at /home/data/zlsz_01/CLN_dir/grexomePIP/grexome-TIMC-Secondary/8_extractTranscripts.pl line 214.
E 8_extractTranscripts.pl: couldn't find one of HV/HET/OCHV/OCHET for OM

Thanks again for your help!
Best wishes,
Chris
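
A note on the `Use of uninitialized value $header` warnings in the log above: they most likely mean that the script read no header line at all, that is, its input file was empty, so the real failure probably happened in an earlier step. A minimal sketch for spotting such files follows. `list_empty` is a hypothetical helper name, and the directory to scan is an assumption (whatever was passed as the output directory).

```shell
# List zero-byte regular files under a directory (default: current directory).
# list_empty is a hypothetical helper, not part of the pipeline.
list_empty() {
    find "${1:-.}" -type f -size 0
}

# Usage (the path is a placeholder):
#   list_empty path/to/outdir
```

Compressed outputs can be larger than zero bytes even when they hold no data, so an empty result here does not rule out an empty intermediate file.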

use CADD v1.6 (AKA CADD-Splice) for upgrading splice variants

dbNSFP v4.2a contains CADD v1.6 (AKA CADD-Splice), which is supposedly good at predicting the impact of variants on splicing. We want to integrate the CADD score into the LOW->MODHIGH algorithm for splicing variants, in 4_vcf2tsv.pl. Unfortunately, the dbNSFP VEP plugin currently only annotates missense variants.
TODO: patch the dbNSFP VEP plugin. It may be sufficient to update %INCLUDE_SO at the top of:
https://github.com/Ensembl/VEP_plugins/blob/release/104/dbNSFP.pm

command line for 0_coverage.pl

@ntm
I am trying to follow your instructions in 0_coverage.pl to make coverage files.
I am very new to Perl, and perl 0_coverage.pl --help prints nothing. I don't know how to pass the samples xlsx, the candidatesFiles, the $transcriptsFile, the $gvcf (must be tabix-indexed), and an $outDir.

In your script, it says:
(@ARGV == 5) ||
die "E $0: needs 5 args: a samples file, a comma-separated list of candidatesFiles, a tsv.gz, a GVCF and an outDir\n";
my ($samplesFile, $candidatesFiles, $transcriptsFile, $gvcf, $outDir) = @ARGV;

What should I do? There is also something like lib "/home/nthierry/Software/VariantEffectPredictor/ensembl-vep/". Should I modify it?
Thanks again!!

Best regards,
Chris
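
Going by the `die` message quoted above, the script takes five positional arguments rather than named options, in the order samples file, candidatesFiles, transcripts file, GVCF, outDir. A call might therefore look like the sketch below. Every file name in it is a placeholder (an assumption, not taken from the repository), and the snippet only prints the command instead of running it.

```shell
# All file names below are placeholders (assumptions): replace them with your own.
samples="samples.xlsx"                            # samples file
candidates="candidates_1.xlsx,candidates_2.xlsx"  # comma-separated list, no spaces
transcripts="transcripts.tsv.gz"                  # the tsv.gz transcripts file
gvcf="merged.g.vcf.gz"                            # GVCF, must be tabix-indexed
outdir="coverage_out"                             # output directory

# Five positional arguments, in the order given by the die message.
cmd="perl 0_coverage.pl $samples $candidates $transcripts $gvcf $outdir"

# Dry run: print the command, then paste it into the shell once the paths are right.
echo "$cmd"
```

Paths containing spaces would need quoting when the printed command is pasted back into the shell.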

dbNSFP VEP plugin error "transcript_match parameter specified but..." -> due to corrupt dbNSFP4.3a.zip

@ntm
Thanks for your great work!

I am trying to use your pipeline on a local system. However, when I ran the secondary part, I got:
WARNING: Failed to instantiate plugin dbNSFP: ERROR: transcript_match parameter specified but transcript-specific field detection failed at /xxxxx/ensembl-vep-107.0-0/dbNSFP.pm line 299.

These were my steps.
wget ftp://dbnsfp:[email protected]/dbNSFP4.3a.zip
unzip dbNSFP4.3a.zip
zcat dbNSFP4.3a_variant.chr1.gz | head -n1 > h
zgrep -h -v ^#chr dbNSFP4.3a_variant.chr* | sort -T /path/to/tmp_folder -k1,1 -k2,2n - | cat h - | bgzip -c > dbNSFP4.3a_grch38.gz
tabix -s 1 -b 2 -e 2 dbNSFP4.3a_grch38.gz
mv dbNSFP4.3a_grch38.gz dbNSFP4.3a.gz
mv dbNSFP4.3a_grch38.gz.tbi dbNSFP4.3a.gz.tbi

And I used Homo sapiens cache 107_GRCh38.

I just can't figure out the problem.
Looking forward to your help.

Best wishes,
Chris
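
Going by the issue title, the cause turned out to be a corrupt `dbNSFP4.3a.zip`. A quick integrity check before concatenating the per-chromosome files could catch this earlier. The sketch below uses `gzip -t`, which tests a compressed file without extracting it; `check_gz` is a hypothetical helper name.

```shell
# Return 0 if every given .gz file passes gzip's integrity test, 1 otherwise.
# check_gz is a hypothetical helper, not part of dbNSFP or the pipeline.
check_gz() {
    status=0
    for f in "$@"; do
        if gzip -t "$f" 2>/dev/null; then
            echo "OK      $f"
        else
            echo "CORRUPT $f"
            status=1
        fi
    done
    return $status
}

# Usage once the download has finished:
#   unzip -t dbNSFP4.3a.zip               # tests the zip archive itself
#   check_gz dbNSFP4.3a_variant.chr*.gz   # tests each extracted chromosome file
```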

How to edit &subCohorts() in config.pm? W grexome-TIMC-secondary.pl: sub-cohort file defined in &subCohorts() but this file doesn't exist. Skipping this sub-cohort

@ntm
I have no sub-cohort file, so I modified config.pm as follows:

sub subCohorts {
my %subCohorts = ("" => "");
return(%subCohorts);
}

After I ran secondary.pl, the log file showed: W grexome-TIMC-secondary.pl: sub-cohort file defined in &subCohorts() but this file doesn't exist. Skipping this sub-cohort.

Is there a better way to modify config.pm?
Thanks again for your attention and great work!

Best regards,
Chris

generate updated list of canonical transcripts

@ntm
Nicolas,
Thanks again for your great work.
I was trying to generate an updated list of canonical transcripts following your scripts. However, I got ERROR 2003 (HY000): Can't connect to MySQL server on 'ensembldb.ensembl.org:3306' (110).
I am new to Linux. I have tried googling for a solution, but I just can't figure it out, and I don't have root access.
Could you give more details on 'Just run it from eg nicofree'? Or could you provide listCanonicalTranscripts_22017.tsv.gz?
Thanks again for your help!

Best regards,
Chris

debug mode on, step3 failed: 256 at sec_.pl line 283.

@ntm
Nicolas,
Thanks for your patient reply!
I fixed the dbNSFP and transcript problems with your help. Really exciting!
However, when I run secondary.pl, the four final result folders are empty.
I don't have a sub-cohort, so I just deleted the step 9 sub-cohort part in secondary.pl. My test file is the merged GATK GVCF produced by primary.pl for two patients.
The log file shows no problems for steps 1-6. However, something goes wrong starting from step 7.
My config only has ovary for 7_filterAndReorderAll.pl.
I also tried debug mode, and it shows:
E clnsec_.pl: debug mode on, step3 failed: 256 at clnsec_.pl line 283.

step 3
$com .= " | perl $RealBin/3_runVEP.pl --cacheFile=".&vepCacheFile()." --genome=".&refGenome()." --dataDir=".&vepPluginDataPath()." --tmpDir=$tmpdir/runVepTmpDir/ ";
($debugVep) && ($com .= "--debug ");
if ($debug) {
$com .= "2> $outDir/step3.err > $outDir/step3.out";
system($com) && die "E $0: debug mode on, step3 failed: $?";
$com = "cat $outDir/step3.out ";
}
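
On the `256` in the error message: Perl's `system()` leaves the raw wait status in `$?`, with the child's exit code in the high byte and the signal number in the low 7 bits, so 256 corresponds to exit code 1 and no signal. The arithmetic can be checked with the small sketch below (`decode_status` is a hypothetical helper name). According to the quoted code, the step's own messages should be in `$outDir/step3.err`.

```shell
# Split a Perl-style wait status (the value of $? after system())
# into the child's exit code and the signal number.
# decode_status is a hypothetical helper, not part of the pipeline.
decode_status() {
    echo "exit=$(( $1 >> 8 )) signal=$(( $1 & 127 ))"
}

decode_status 256   # prints: exit=1 signal=0
```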

decompress infile and step 1
my $com = "$bgzip $inFile | perl $RealBin/1_filterBadCalls.pl --samplesFile=$samples --tmpdir=$tmpdir/FilterTmp/ --jobs $numJobs1 ";
if ($debug) {
# specific logfile from step and save its output
$com .= "2> $outDir/step1.err > $outDir/step1.out";
system($com) && die "E $0: debug mode on, step1 failed: $?";
# next step will read this step's output
$com = "cat $outDir/step1.out ";
}

Here is my step 1 output file.
step1.zip

Here is my command and Log file:

perl clnsec_.pl --samples=sampleCLN.xlsx --infile=grexomes_gatk_merged_221005.g.vcf.gz --outdir=SecondaryAnalyses_TEST --config=sec_config.pm 2> grexomeTIMCsec_TEST.log

2022-10-09 09:44:17: clnsec_.pl - starting to run
I clnsec_.pl: variant-caller id GATK will be appended to all filenames
I 2022-10-09 09:44:18: 4_vcf2tsv.pl - starting to run
I 2022-10-09 09:44:18: 2_sampleData2genotypes.pl - starting to run
I 2022-10-09 09:44:18: 5_addGTEX.pl - starting to run
I 2022-10-09 09:44:18: 3_runVEP.pl - starting to run
I 2022-10-09 09:44:19: 6_extractCohorts.pl - starting to run
I 2022-10-09 09:44:19: 1_filterBadCalls.pl - starting to run
I 2022-10-09 10:08:48: 3_runVEP.pl - finished parsing/processing chrom chr1
I 2022-10-09 10:28:05: 3_runVEP.pl - finished parsing/processing chrom chr2
I 2022-10-09 10:30:08: 6_extractCohorts.pl - done processing batch 10
I 2022-10-09 10:34:37: 3_runVEP.pl - finished parsing/processing chrom chr3
I 2022-10-09 10:50:26: 3_runVEP.pl - finished parsing/processing chrom chr4
I 2022-10-09 10:54:30: 6_extractCohorts.pl - done processing batch 20
I 2022-10-09 11:02:14: 3_runVEP.pl - finished parsing/processing chrom chr5
I 2022-10-09 11:20:21: 3_runVEP.pl - finished parsing/processing chrom chr6
I 2022-10-09 11:25:54: 6_extractCohorts.pl - batchNum=30, adjusting batchSize down to 2403
I 2022-10-09 11:26:10: 6_extractCohorts.pl - done processing batch 30
I 2022-10-09 11:26:16: 6_extractCohorts.pl - batchNum=45, adjusting batchSize up to 15728
I 2022-10-09 11:26:21: 6_extractCohorts.pl - done processing batch 40
I 2022-10-09 11:30:53: 3_runVEP.pl - finished parsing/processing chrom chr7
I 2022-10-09 11:38:35: 3_runVEP.pl - finished parsing/processing chrom chr8
I 2022-10-09 11:42:04: 6_extractCohorts.pl - done processing batch 50
I 2022-10-09 11:49:35: 3_runVEP.pl - finished parsing/processing chrom chr9
I 2022-10-09 12:04:59: 3_runVEP.pl - finished parsing/processing chrom chr10
I 2022-10-09 12:11:53: 6_extractCohorts.pl - batchNum=60, adjusting batchSize down to 2873
I 2022-10-09 12:12:14: 6_extractCohorts.pl - done processing batch 60
I 2022-10-09 12:12:26: 6_extractCohorts.pl - done processing batch 70
I 2022-10-09 12:12:33: 6_extractCohorts.pl - batchNum=75, adjusting batchSize up to 10342
I 2022-10-09 12:22:39: 3_runVEP.pl - finished parsing/processing chrom chr11
I 2022-10-09 12:33:51: 6_extractCohorts.pl - done processing batch 80
I 2022-10-09 12:42:20: 3_runVEP.pl - finished parsing/processing chrom chr12
I 2022-10-09 12:52:59: 6_extractCohorts.pl - batchNum=90, adjusting batchSize down to 2131
I 2022-10-09 12:53:07: 6_extractCohorts.pl - done processing batch 90
I 2022-10-09 12:53:26: 6_extractCohorts.pl - batchNum=105, adjusting batchSize up to 11365
I 2022-10-09 12:53:28: 6_extractCohorts.pl - done processing batch 100
I 2022-10-09 12:53:54: 3_runVEP.pl - finished parsing/processing chrom chr13
I 2022-10-09 13:05:22: 3_runVEP.pl - finished parsing/processing chrom chr14
I 2022-10-09 13:08:34: 6_extractCohorts.pl - done processing batch 110
I 2022-10-09 13:18:19: 3_runVEP.pl - finished parsing/processing chrom chr15
I 2022-10-09 13:23:15: 6_extractCohorts.pl - batchNum=120, adjusting batchSize down to 3176
I 2022-10-09 13:23:32: 6_extractCohorts.pl - done processing batch 120
I 2022-10-09 13:31:43: 3_runVEP.pl - finished parsing/processing chrom chr16
I 2022-10-09 13:36:54: 6_extractCohorts.pl - done processing batch 130
I 2022-10-09 13:36:58: 6_extractCohorts.pl - batchNum=135, adjusting batchSize down to 1929
I 2022-10-09 13:37:15: 6_extractCohorts.pl - done processing batch 140
I 2022-10-09 13:37:21: 6_extractCohorts.pl - batchNum=150, adjusting batchSize up to 12077
I 2022-10-09 13:37:47: 6_extractCohorts.pl - done processing batch 150
I 2022-10-09 13:45:20: 3_runVEP.pl - finished parsing/processing chrom chr17
I 2022-10-09 13:58:15: 6_extractCohorts.pl - done processing batch 160
I 2022-10-09 13:58:23: 3_runVEP.pl - finished parsing/processing chrom chr18
I 2022-10-09 14:04:31: 6_extractCohorts.pl - batchNum=165, adjusting batchSize down to 3704
I 2022-10-09 14:17:30: 3_runVEP.pl - finished parsing/processing chrom chr19
I 2022-10-09 14:22:20: 6_extractCohorts.pl - done processing batch 170
I 2022-10-09 14:22:35: 6_extractCohorts.pl - batchNum=180, adjusting batchSize down to 1708
I 2022-10-09 14:22:42: 6_extractCohorts.pl - done processing batch 180
I 2022-10-09 14:22:51: 6_extractCohorts.pl - batchNum=195, adjusting batchSize up to 15372
I 2022-10-09 14:22:53: 6_extractCohorts.pl - done processing batch 190
I 2022-10-09 14:23:07: 3_runVEP.pl - finished parsing/processing chrom chr20
I 2022-10-09 14:30:01: 3_runVEP.pl - finished parsing/processing chrom chr21
I 2022-10-09 14:34:29: 6_extractCohorts.pl - done processing batch 200
I 2022-10-09 14:34:31: 3_runVEP.pl - finished parsing/processing chrom chr22
I 2022-10-09 14:43:06: 1_filterBadCalls.pl - ALL DONE, completed successfully!
I 2022-10-09 14:43:06: 3_runVEP.pl - finished parsing/processing chrom chrX
I 2022-10-09 14:43:06: 2_sampleData2genotypes.pl - ALL DONE, completed successfully!
I 2022-10-09 14:49:24: 3_runVEP.pl - finished parsing/processing chrom chrY
I 2022-10-09 14:49:43: 3_runVEP.pl - finished parsing/processing chrom chrM
I 2022-10-09 14:50:19: 3_runVEP.pl - ALL DONE, completed successfully!
I 2022-10-09 14:50:20: 4_vcf2tsv.pl - ALL DONE, completed successfully!
I 2022-10-09 14:50:20: 5_addGTEX.pl - ALL DONE, completed successfully!
I 2022-10-09 14:50:36: 6_extractCohorts.pl - ALL DONE, completed successfully!
I 2022-10-09 14:50:37: 7_filterAndReorderAll.pl - starting to run
E 7_reorderColumns.pl: some newOrder titles were not found: GTEX_testis_RATIO
I 2022-10-09 14:50:38: 7_filterAndReorderAll.pl - ALL DONE, completed successfully!
I 2022-10-09 14:50:39: 8_extractSamples.pl - starting to run
Use of uninitialized value $header in scalar chomp at /home/data/zlsz_01/CLN_dir/grexomePIP/grexome-TIMC-Secondary/8_extractSamples.pl line 161.
Use of uninitialized value $header in split at /home/data/zlsz_01/CLN_dir/grexomePIP/grexome-TIMC-Secondary/8_extractSamples.pl line 162.
E: 8_extractSamples.pl - couldn't find OMF_HV or OMF_OTHERCAUSE_HV in header of infile OMF.csv
I 2022-10-09 14:50:40: 8_extractSamples.pl - ALL DONE, completed successfully!
I 2022-10-09 14:50:42: 8_extractTranscripts.pl - starting to run
Use of uninitialized value $header in scalar chomp at /home/data/zlsz_01/CLN_dir/grexomePIP/grexome-TIMC-Secondary/8_extractTranscripts.pl line 213.
Use of uninitialized value $header in split at /home/data/zlsz_01/CLN_dir/grexomePIP/grexome-TIMC-Secondary/8_extractTranscripts.pl line 214.
E 8_extractTranscripts.pl: couldn't find one of HV/HET/OCHV/OCHET for OMF
I 2022-10-09 14:50:43: 8_addPatientIDs.pl - starting to run
I 2022-10-09 14:50:43: 8_addPatientIDs.pl - ALL DONE, completed successfully!
I 2022-10-09 14:50:43: 9_requireUndiagnosed.pl - starting to run
Use of uninitialized value $header in scalar chomp at /home/data/zlsz_01/CLN_dir/grexomePIP/grexome-TIMC-Secondary/9_requireUndiagnosed.pl line 66.
Use of uninitialized value $header in split at /home/data/zlsz_01/CLN_dir/grexomePIP/grexome-TIMC-Secondary/9_requireUndiagnosed.pl line 67.
E: 9_requireUndiagnosed.pl couldn't find one of HV/HET for OMF
I 2022-10-09 14:50:45: 8_addPatientIDs.pl - starting to run
I 2022-10-09 14:50:46: 8_addPatientIDs.pl - ALL DONE, completed successfully!
I 2022-10-09 14:50:46: 7_filterAndReorderAll.pl - starting to run
Use of uninitialized value $header in scalar chomp at /home/data/zlsz_01/CLN_dir/grexomePIP/grexome-TIMC-Secondary/7_filterVariants.pl line 69.
Use of uninitialized value $header in concatenation (.) or string at /home/data/zlsz_01/CLN_dir/grexomePIP/grexome-TIMC-Secondary/7_filterVariants.pl line 70.
Use of uninitialized value $header in split at /home/data/zlsz_01/CLN_dir/grexomePIP/grexome-TIMC-Secondary/7_filterVariants.pl line 73.
E 7_filterVariants.pl: title CANONICAL required by script but missing, some VEP columns changed?
I 2022-10-09 14:50:47: 7_filterAndReorderAll.pl - ALL DONE, completed successfully!
I 2022-10-09 14:50:49: 10_qc_checkCausal.pl - starting to run
I 2022-10-09 14:50:49: 10_qc_checkCausal.pl - ALL DONE, completed successfully!
I 2022-10-09 14:50:49: clnsec_.pl - ALL DONE, completed successfully!

Looking forward to your reply! Thanks a lot for your help with the VEP and transcript problems!

Best wishes,
Chris
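
Since each script in the log above still prints "ALL DONE, completed successfully!" after an `E` line, the first `E` line is probably the one to investigate, and the later ones may just be knock-on effects of empty intermediate files. In this log that is the `7_reorderColumns.pl` complaint about `GTEX_testis_RATIO`, which might be related to the config listing only ovary (a guess, not confirmed anywhere in this thread). A minimal sketch for pulling out that first line follows; `first_error` is a hypothetical helper name.

```shell
# Print the first error line of a log, prefixed with its line number.
# Error lines in these logs start with "E " or "E:".
# first_error is a hypothetical helper, not part of the pipeline.
first_error() {
    grep -n -E '^E[ :]' "$1" | head -n 1
}

# Usage with the log file from the command above:
#   first_error grexomeTIMCsec_TEST.log
```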
