Comments (21)
Hi @amizeranschi.
Thank you for reporting the issue. We will let you know as soon as possible
from rnasplice.
For the record, the referenced commits by @dkoppstein have solved my issue. Would be great to see this fix merged into the base pipeline.
Unfortunately, my test run still ended up failing, due to a different reason. This is the updated command that I ran:
nextflow run dkoppstein/rnasplice -r dev -latest -profile docker --outdir test-small-rnasplice --gencode --genome GRCh38 \
--input https://raw.githubusercontent.com/nf-core/test-datasets/rnasplice/samplesheet/samplesheet.csv \
--contrasts https://raw.githubusercontent.com/nf-core/test-datasets/rnasplice/samplesheet/contrastsheet.csv \
--gtf https://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_43/gencode.v43.annotation.gtf.gz --rmats --rmats_read_len 75 --rmats_paired_stats --aligner star_salmon --dexseq_exon
And below is the new issue. @valentinoruggieri any help with this new error would be much appreciated.
ERROR ~ Error executing process > 'NFCORE_RNASPLICE:RNASPLICE:ALIGN_STAR:STAR_ALIGN (ERR188428)'
Caused by:
Process `NFCORE_RNASPLICE:RNASPLICE:ALIGN_STAR:STAR_ALIGN (ERR188428)` terminated with an error exit status (105)
Command executed:
STAR \
--genomeDir STARIndex \
--readFilesIn input1/ERR188428_1_val_1.fq.gz input2/ERR188428_2_val_2.fq.gz \
--runThreadN 12 \
--outFileNamePrefix ERR188428. \
\
--sjdbGTFfile gencode.v43.annotation.gtf \
--outSAMattrRGline 'ID:ERR188428' 'SM:ERR188428' \
--quantMode TranscriptomeSAM --twopassMode Basic --outSAMtype BAM Unsorted --readFilesCommand gunzip -c --runRNGseed 0 --outFilterMultimapNmax 20 --alignSJDBoverhangMin 1 --outSAMattributes NH HI AS NM MD --quantTranscriptomeBan Singleend
if [ -f ERR188428.Unmapped.out.mate1 ]; then
mv ERR188428.Unmapped.out.mate1 ERR188428.unmapped_1.fastq
gzip ERR188428.unmapped_1.fastq
fi
if [ -f ERR188428.Unmapped.out.mate2 ]; then
mv ERR188428.Unmapped.out.mate2 ERR188428.unmapped_2.fastq
gzip ERR188428.unmapped_2.fastq
fi
cat <<-END_VERSIONS > versions.yml
"NFCORE_RNASPLICE:RNASPLICE:ALIGN_STAR:STAR_ALIGN":
star: $(STAR --version | sed -e "s/STAR_//g")
samtools: $(echo $(samtools --version 2>&1) | sed 's/^.*samtools //; s/Using.*$//')
gawk: $(echo $(gawk --version 2>&1) | sed 's/^.*GNU Awk //; s/, .*$//')
END_VERSIONS
Command exit status:
105
Command output:
STAR --genomeDir STARIndex --readFilesIn input1/ERR188428_1_val_1.fq.gz input2/ERR188428_2_val_2.fq.gz --runThreadN 12 --outFileNamePrefix ERR188428. --sjdbGTFfile gencode.v43.annotation.gtf --outSAMattrRGline ID:ERR188428 SM:ERR188428 --quantMode TranscriptomeSAM --twopassMode Basic --outSAMtype BAM Unsorted --readFilesCommand gunzip -c --runRNGseed 0 --outFilterMultimapNmax 20 --alignSJDBoverhangMin 1 --outSAMattributes NH HI AS NM MD --quantTranscriptomeBan Singleend
STAR version: 2.7.9a compiled: 2021-05-04T09:43:56-0400 vega:/home/dobin/data/STAR/STARcode/STAR.master/source
Jul 18 09:03:48 ..... started STAR run
Jul 18 09:03:48 ..... loading genome
Command error:
STAR --genomeDir STARIndex --readFilesIn input1/ERR188428_1_val_1.fq.gz input2/ERR188428_2_val_2.fq.gz --runThreadN 12 --outFileNamePrefix ERR188428. --sjdbGTFfile gencode.v43.annotation.gtf --outSAMattrRGline ID:ERR188428 SM:ERR188428 --quantMode TranscriptomeSAM --twopassMode Basic --outSAMtype BAM Unsorted --readFilesCommand gunzip -c --runRNGseed 0 --outFilterMultimapNmax 20 --alignSJDBoverhangMin 1 --outSAMattributes NH HI AS NM MD --quantTranscriptomeBan Singleend
STAR version: 2.7.9a compiled: 2021-05-04T09:43:56-0400 vega:/home/dobin/data/STAR/STARcode/STAR.master/source
Jul 18 09:03:48 ..... started STAR run
Jul 18 09:03:48 ..... loading genome
EXITING because of FATAL ERROR: Genome version: 20201 is INCOMPATIBLE with running STAR version: 2.7.9a
SOLUTION: please re-generate genome from scratch with running version of STAR, or with version: 2.7.4a
Jul 18 09:03:48 ...... FATAL ERROR, exiting
from rnasplice.
@amizeranschi thanks for the confirmation. I don't have access to a linux box with Docker to run the formal tests, so am waiting on opening a PR, if you or one of the devs can test run -profile test,docker
on my branch it would be great.
However, I am also encountering another issue downstream at the index_gff step with a Gencode GTF... will try tinkering around a bit, then may need to submit another bug report.
For your STAR issue in the short term, it seems like the AWS iGenomes are out of date -- I would suggest using the latest ENSEMBL or GENCODE fasta and gtf files, and specifying them on the command line.
from rnasplice.
OK, I will give this a try. Either way, I guess the STAR index will have to be rebuilt, correct? I will try using the latest FASTA and GTF from Gencode, I just saw that they recently made a new release, v. 44.
from rnasplice.
from rnasplice.
OK, then I'll add --save_reference to my command so that I can later reuse the new STAR index. Thanks a lot for your help.
from rnasplice.
@amizeranschi @dkoppstein Thank you both for reviewing and testing the code.
As you mentioned above, updating gffread from 0.12.1 to 0.12.7 solves the issue in our test too. In addition, we are evaluating to add the option "--keep-gene" in the gffread script (e.g. gffread $gtf -L --keep-genes ) since it seems that "genes" are not automatically added in the gff3 file, and this might cause problems in some downstream processes.
Regarding the STAR issue, as you pointed out it depends on the STAR version (2.5.1b 2016) used to generate the index in the AWS iGenome and the STAR version used for the mapping (version 2.7.9a). We are looking for a way to take into consideration this discrepancy.
However, passing the genome file with the option '--fasta' skips the issue since the index is rebuilt using the right STAR version.
from rnasplice.
OK, changing the command to the one below got things moving forward for me:
nextflow run dkoppstein/rnasplice -r dev -latest -profile docker --outdir test-small-rnasplice --gencode --igenomes_ignore --genome null --save_reference \
--input https://raw.githubusercontent.com/nf-core/test-datasets/rnasplice/samplesheet/samplesheet.csv \
--contrasts https://raw.githubusercontent.com/nf-core/test-datasets/rnasplice/samplesheet/contrastsheet.csv \
--fasta https://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_44/GRCh38.primary_assembly.genome.fa.gz \
--gtf https://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_44/gencode.v44.annotation.gtf.gz \
--rmats --rmats_read_len 75 --rmats_paired_stats --aligner star_salmon --dexseq_exon
@dkoppstein I am now getting some GFF-related warnings and errors, but I'm not sure if this is what you were referring to earlier. This is what I get:
ERROR ~ Error executing process > 'NFCORE_RNASPLICE:RNASPLICE:VISUALISE_MISO:MISO_RUN (1)'
Caused by:
Process `NFCORE_RNASPLICE:RNASPLICE:VISUALISE_MISO:MISO_RUN (1)` terminated with an error exit status (1)
Command executed:
miso --run index ERR188428_sorted.bam --output-dir miso_data/ERR188428 --read-len 75
cat <<-END_VERSIONS > versions.yml
"NFCORE_RNASPLICE:RNASPLICE:VISUALISE_MISO:MISO_RUN":
misopy: $(python -c "import pkg_resources; print(pkg_resources.get_distribution('misopy').version)")
END_VERSIONS
Command exit status:
1
Command output:
Using MISO settings file: /usr/local/lib/python2.7/site-packages/misopy/settings/miso_settings.txt
Computing Psi values...
- GFF index: index
- BAM: ERR188428_sorted.bam
- Read length: 75
- Output directory: miso_data/ERR188428
Checking your GFF annotation and BAM for mismatches...
Checking if BAM has mixed read lengths...
07/18/2023 11:25:14 AM - miso_main - WARNING - Found mixed length reads in your BAM file: ERR188428_sorted.bam
MISO does not support mixed read lengths. Please trim your reads to a uniform length and re-run or run MISO separately on each read length. Proceeding anyway, though this may result in errors or inability to match reads to your annotation.
Read lengths were: 20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75
07/18/2023 11:25:19 AM - miso_main - WARNING - It looks like your GFF annotation file and your BAM file might not have matching headers (chromosome names.) If this is the case, your run will fail as no reads from the BAM could be matched up with your annotation.
07/18/2023 11:25:19 AM - miso_main - WARNING - Please see:
http://genes.mit.edu/burgelab/miso/docs/
for more information.
07/18/2023 11:25:19 AM - miso_main - WARNING - It looks like your BAM file (ERR188428_sorted.bam) contains 'chr' chromosomes (UCSC-style) while your GFF annotation (index) does not.
07/18/2023 11:25:19 AM - miso_main - WARNING - The first few BAM chromosomes were: chr7,chr6,chr5,chr4,chr3,chr2,chr1,chr9,chr8,chr12,chr11,chr10
BAM references:
('chr1', 'chr2', 'chr3', 'chr4', 'chr5', 'chr6', 'chr7', 'chr8', 'chr9', 'chr10', 'chr11', 'chr12', 'chr13', 'chr14', 'chr15', 'chr16', 'chr17', 'chr18', 'chr19', 'chr20', 'chr21', 'chr22', 'chrX', 'chrY', 'chrM', 'GL000008.2', 'GL000009.2', 'GL000194.1', 'GL000195.1', 'GL000205.2', 'GL000208.1', 'GL000213.1', 'GL000214.1', 'GL000216.2', 'GL000218.1', 'GL000219.1', 'GL000220.1', 'GL000221.1', 'GL000224.1', 'GL000225.1', 'GL000226.1', 'KI270302.1', 'KI270303.1', 'KI270304.1', 'KI270305.1', 'KI270310.1', 'KI270311.1', 'KI270312.1', 'KI270315.1', 'KI270316.1', 'KI270317.1', 'KI270320.1', 'KI270322.1', 'KI270329.1', 'KI270330.1', 'KI270333.1', 'KI270334.1', 'KI270335.1', 'KI270336.1', 'KI270337.1', 'KI270338.1', 'KI270340.1', 'KI270362.1', 'KI270363.1', 'KI270364.1', 'KI270366.1', 'KI270371.1', 'KI270372.1', 'KI270373.1', 'KI270374.1', 'KI270375.1', 'KI270376.1', 'KI270378.1', 'KI270379.1', 'KI270381.1', 'KI270382.1', 'KI270383.1', 'KI270384.1', 'KI270385.1', 'KI270386.1', 'KI270387.1', 'KI270388.1', 'KI270389.1', 'KI270390.1', 'KI270391.1', 'KI270392.1', 'KI270393.1', 'KI270394.1', 'KI270395.1', 'KI270396.1', 'KI270411.1', 'KI270412.1', 'KI270414.1', 'KI270417.1', 'KI270418.1', 'KI270419.1', 'KI270420.1', 'KI270422.1', 'KI270423.1', 'KI270424.1', 'KI270425.1', 'KI270429.1', 'KI270435.1', 'KI270438.1', 'KI270442.1', 'KI270448.1', 'KI270465.1', 'KI270466.1', 'KI270467.1', 'KI270468.1', 'KI270507.1', 'KI270508.1', 'KI270509.1', 'KI270510.1', 'KI270511.1', 'KI270512.1', 'KI270515.1', 'KI270516.1', 'KI270517.1', 'KI270518.1', 'KI270519.1', 'KI270521.1', 'KI270522.1', 'KI270528.1', 'KI270529.1', 'KI270530.1', 'KI270538.1', 'KI270539.1', 'KI270544.1', 'KI270548.1', 'KI270579.1', 'KI270580.1', 'KI270581.1', 'KI270582.1', 'KI270583.1', 'KI270584.1', 'KI270587.1', 'KI270588.1', 'KI270589.1', 'KI270590.1', 'KI270591.1', 'KI270593.1', 'KI270706.1', 'KI270707.1', 'KI270708.1', 'KI270709.1', 'KI270710.1', 'KI270711.1', 'KI270712.1', 'KI270713.1', 'KI270714.1', 'KI270715.1', 'KI270716.1', 'KI270717.1', 'KI270718.1', 'KI270719.1', 'KI270720.1', 'KI270721.1', 'KI270722.1', 'KI270723.1', 'KI270724.1', 'KI270725.1', 'KI270726.1', 'KI270727.1', 'KI270728.1', 'KI270729.1', 'KI270730.1', 'KI270731.1', 'KI270732.1', 'KI270733.1', 'KI270734.1', 'KI270735.1', 'KI270736.1', 'KI270737.1', 'KI270738.1', 'KI270739.1', 'KI270740.1', 'KI270741.1', 'KI270742.1', 'KI270743.1', 'KI270744.1', 'KI270745.1', 'KI270746.1', 'KI270747.1', 'KI270748.1', 'KI270749.1', 'KI270750.1', 'KI270751.1', 'KI270752.1', 'KI270753.1', 'KI270754.1', 'KI270755.1', 'KI270756.1', 'KI270757.1')
07/18/2023 11:25:19 AM - miso_main - WARNING - The first few GFF chromosomes were:
07/18/2023 11:25:19 AM - miso_main - WARNING - Run is likely to fail or produce empty output. Proceeding anyway...
Mapping genes to their indexed GFF representation, using index
Searching for index/genes_to_filenames.shelve..
- File not found.
Skipping: /home/ubuntu/work/dc/eeb28063f47f2t support mixed read lengths. Please trim your reads to a uniform length and re-run or run MISO separately on each read length. Proceeding anyway, though this may result in errors or inability to match reads to your annotation.
Read lengths were: 20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75
07/18/2023 11:25:19 AM - miso_main - WARNING - It looks like your GFF annotation file and your BAM file might not have matching headers (chromosome names.) If this is the case, your run will fail as no reads from the BAM could be matched up with your annotation.
07/18/2023 11:25:19 AM - miso_main - WARNING - Please see:
http://genes.mit.edu/burgelab/miso/docs/
for more information.
07/18/2023 11:25:19 AM - miso_main - WARNING - It looks like your BAM file (ERR188428_sorted.bam) contains 'chr' chromosomes (UCSC-style) while your GFF annotation (index) does not.
07/18/2023 11:25:19 AM - miso_main - WARNING - The first few BAM chromosomes were: chr7,chr6,chr5,chr4,chr3,chr2,chr1,chr9,chr8,chr12,chr11,chr10
BAM references:
('chr1', 'chr2', 'chr3', 'chr4', 'chr5', 'chr6', 'chr7', 'chr8', 'chr9', 'chr10', 'chr11', 'chr12', 'chr13', 'chr14', 'chr15', 'chr16', 'chr17', 'chr18', 'chr19', 'chr20', 'chr21', 'chr22', 'chrX', 'chrY', 'chrM', 'GL000008.2', 'GL000009.2', 'GL000194.1', 'GL000195.1', 'GL000205.2', 'GL000208.1', 'GL000213.1', 'GL000214.1', 'GL000216.2', 'GL000218.1', 'GL000219.1', 'GL000220.1', 'GL000221.1', 'GL000224.1', 'GL000225.1', 'GL000226.1', 'KI270302.1', 'KI270303.1', 'KI270304.1', 'KI270305.1', 'KI270310.1', 'KI270311.1', 'KI270312.1', 'KI270315.1', 'KI270316.1', 'KI270317.1', 'KI270320.1', 'KI270322.1', 'KI270329.1', 'KI270330.1', 'KI270333.1', 'KI270334.1', 'KI270335.1', 'KI270336.1', 'KI270337.1', 'KI270338.1', 'KI270340.1', 'KI270362.1', 'KI270363.1', 'KI270364.1', 'KI270366.1', 'KI270371.1', 'KI270372.1', 'KI270373.1', 'KI270374.1', 'KI270375.1', 'KI270376.1', 'KI270378.1', 'KI270379.1', 'KI270381.1', 'KI270382.1', 'KI270383.1', 'KI270384.1', 'KI270385.1', 'KI270386.1', 'KI270387.1', 'KI270388.1', 'KI270389.1', 'KI270390.1', 'KI270391.1', 'KI270392.1', 'KI270393.1', 'KI270394.1', 'KI270395.1', 'KI270396.1', 'KI270411.1', 'KI270412.1', 'KI270414.1', 'KI270417.1', 'KI270418.1', 'KI270419.1', 'KI270420.1', 'KI270422.1', 'KI270423.1', 'KI270424.1', 'KI270425.1', 'KI270429.1', 'KI270435.1', 'KI270438.1', 'KI270442.1', 'KI270448.1', 'KI270465.1', 'KI270466.1', 'KI270467.1', 'KI270468.1', 'KI270507.1', 'KI270508.1', 'KI270509.1', 'KI270510.1', 'KI270511.1', 'KI270512.1', 'KI270515.1', 'KI270516.1', 'KI270517.1', 'KI270518.1', 'KI270519.1', 'KI270521.1', 'KI270522.1', 'KI270528.1', 'KI270529.1', 'KI270530.1', 'KI270538.1', 'KI270539.1', 'KI270544.1', 'KI270548.1', 'KI270579.1', 'KI270580.1', 'KI270581.1', 'KI270582.1', 'KI270583.1', 'KI270584.1', 'KI270587.1', 'KI270588.1', 'KI270589.1', 'KI270590.1', 'KI270591.1', 'KI270593.1', 'KI270706.1', 'KI270707.1', 'KI270708.1', 'KI270709.1', 'KI270710.1', 'KI270711.1', 'KI270712.1', 'KI270713.1', 'KI270714.1', 'KI270715.1', 'KI270716.1', 'KI270717.1', 'KI270718.1', 'KI270719.1', 'KI270720.1', 'KI270721.1', 'KI270722.1', 'KI270723.1', 'KI270724.1', 'KI270725.1', 'KI270726.1', 'KI270727.1', 'KI270728.1', 'KI270729.1', 'KI270730.1', 'KI270731.1', 'KI270732.1', 'KI270733.1', 'KI270734.1', 'KI270735.1', 'KI270736.1', 'KI270737.1', 'KI270738.1', 'KI270739.1', 'KI270740.1', 'KI270741.1', 'KI270742.1', 'KI270743.1', 'KI270744.1', 'KI270745.1', 'KI270746.1', 'KI270747.1', 'KI270748.1', 'KI270749.1', 'KI270750.1', 'KI270751.1', 'KI270752.1', 'KI270753.1', 'KI270754.1', 'KI270755.1', 'KI270756.1', 'KI270757.1')
07/18/2023 11:25:19 AM - miso_main - WARNING - The first few GFF chromosomes were:
07/18/2023 11:25:19 AM - miso_main - WARNING - Run is likely to fail or produce empty output. Proceeding anyway...
Mapping genes to their indexed GFF representation, using index
Searching for index/genes_to_filenames.shelve..
- File not found.
Skipping: index/genes.gff
Skipping: index/compressed_ids_to_genes.shelve.bak
Skipping: index/genes_to_filenames.shelve.bak
Skipping: index/genes_to_filenames.shelve.dat
Skipping: index/compressed_ids_to_genes.shelve.dir
Skipping: index/genes_to_filenames.shelve.dir
Skipping: index/compressed_ids_to_genes.shelve.dat
07/18/2023 11:25:34 AM - miso_main - ERROR - No genes to run on. Did you pass me the wrong path to your index GFF directory? Or perhaps your indexed GFF directory is empty?
Command error:
Using MISO settings file: /usr/local/lib/python2.7/site-packages/misopy/settings/miso_settings.txt
Computing Psi values...
- GFF index: index
- BAM: ERR188428_sorted.bam
- Read length: 75
- Output directory: miso_data/ERR188428
Checking your GFF annotation and BAM for mismatches...
Checking if BAM has mixed read lengths...
07/18/2023 11:25:14 AM - miso_main - WARNING - Found mixed length reads in your BAM file: ERR188428_sorted.bam
MISO does not support mixed read lengths. Please trim your reads to a uniform length and re-run or run MISO separately on each read length. Proceeding anyway, though this may result in errors or inability to match reads to your annotation.
Read lengths were: 20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75
07/18/2023 11:25:19 AM - miso_main - WARNING - It looks like your GFF annotation file and your BAM file might not have matching headers (chromosome names.) If this is the case, your run will fail as no reads from the BAM could be matched up with your annotation.
07/18/2023 11:25:19 AM - miso_main - WARNING - Please see:
http://genes.mit.edu/burgelab/miso/docs/
for more information.
07/18/2023 11:25:19 AM - miso_main - WARNING - It looks like your BAM file (ERR188428_sorted.bam) contains 'chr' chromosomes (UCSC-style) while your GFF annotation (index) does not.
07/18/2023 11:25:19 AM - miso_main - WARNING - The first few BAM chromosomes were: chr7,chr6,chr5,chr4,chr3,chr2,chr1,chr9,chr8,chr12,chr11,chr10
BAM references:
('chr1', 'chr2', 'chr3', 'chr4', 'chr5', 'chr6', 'chr7', 'chr8', 'chr9', 'chr10', 'chr11', 'chr12', 'chr13', 'chr14', 'chr15', 'chr16', 'chr17', 'chr18', 'chr19', 'chr20', 'chr21', 'chr22', 'chrX', 'chrY', 'chrM', 'GL000008.2', 'GL000009.2', 'GL000194.1', 'GL000195.1', 'GL000205.2', 'GL000208.1', 'GL000213.1', 'GL000214.1', 'GL000216.2', 'GL000218.1', 'GL000219.1', 'GL000220.1', 'GL000221.1', 'GL000224.1', 'GL000225.1', 'GL000226.1', 'KI270302.1', 'KI270303.1', 'KI270304.1', 'KI270305.1', 'KI270310.1', 'KI270311.1', 'KI270312.1', 'KI270315.1', 'KI270316.1', 'KI270317.1', 'KI270320.1', 'KI270322.1', 'KI270329.1', 'KI270330.1', 'KI270333.1', 'KI270334.1', 'KI270335.1', 'KI270336.1', 'KI270337.1', 'KI270338.1', 'KI270340.1', 'KI270362.1', 'KI270363.1', 'KI270364.1', 'KI270366.1', 'KI270371.1', 'KI270372.1', 'KI270373.1', 'KI270374.1', 'KI270375.1', 'KI270376.1', 'KI270378.1', 'KI270379.1', 'KI270381.1', 'KI270382.1', 'KI270383.1', 'KI270384.1', 'KI270385.1', 'KI270386.1', 'KI270387.1', 'KI270388.1', 'KI270389.1', 'KI270390.1', 'KI270391.1', 'KI270392.1', 'KI270393.1', 'KI270394.1', 'KI270395.1', 'KI270396.1', 'KI270411.1', 'KI270412.1', 'KI270414.1', 'KI270417.1', 'KI270418.1', 'KI270419.1', 'KI270420.1', 'KI270422.1', 'KI270423.1', 'KI270424.1', 'KI270425.1', 'KI270429.1', 'KI270435.1', 'KI270438.1', 'KI270442.1', 'KI270448.1', 'KI270465.1', 'KI270466.1', 'KI270467.1', 'KI270468.1', 'KI270507.1', 'KI270508.1', 'KI270509.1', 'KI270510.1', 'KI270511.1', 'KI270512.1', 'KI270515.1', 'KI270516.1', 'KI270517.1', 'KI270518.1', 'KI270519.1', 'KI270521.1', 'KI270522.1', 'KI270528.1', 'KI270529.1', 'KI270530.1', 'KI270538.1', 'KI270539.1', 'KI270544.1', 'KI270548.1', 'KI270579.1', 'KI270580.1', 'KI270581.1', 'KI270582.1', 'KI270583.1', 'KI270584.1', 'KI270587.1', 'KI270588.1', 'KI270589.1', 'KI270590.1', 'KI270591.1', 'KI270593.1', 'KI270706.1', 'KI270707.1', 'KI270708.1', 'KI270709.1', 'KI270710.1', 'KI270711.1', 'KI270712.1', 'KI270713.1', 'KI270714.1', 'KI270715.1', 'KI270716.1', 'KI270717.1', 'KI270718.1', 'KI270719.1', 'KI270720.1', 'KI270721.1', 'KI270722.1', 'KI270723.1', 'KI270724.1', 'KI270725.1', 'KI270726.1', 'KI270727.1', 'KI270728.1', 'KI270729.1', 'KI270730.1', 'KI270731.1', 'KI270732.1', 'KI270733.1', 'KI270734.1', 'KI270735.1', 'KI270736.1', 'KI270737.1', 'KI270738.1', 'KI270739.1', 'KI270740.1', 'KI270741.1', 'KI270742.1', 'KI270743.1', 'KI270744.1', 'KI270745.1', 'KI270746.1', 'KI270747.1', 'KI270748.1', 'KI270749.1', 'KI270750.1', 'KI270751.1', 'KI270752.1', 'KI270753.1', 'KI270754.1', 'KI270755.1', 'KI270756.1', 'KI270757.1')
07/18/2023 11:25:19 AM - miso_main - WARNING - The first few GFF chromosomes were:
07/18/2023 11:25:19 AM - miso_main - WARNING - Run is likely to fail or produce empty output. Proceeding anyway...
Mapping genes to their indexed GFF representation, using index
Searching for index/genes_to_filenames.shelve..
- File not found.
Skipping: /home/ubuntu/work/dc/eeb28063f47f2t support mixed read lengths. Please trim your reads to a uniform length and re-run or run MISO separately on each read length. Proceeding anyway, though this may result in errors or inability to match reads to your annotation.
Read lengths were: 20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75
07/18/2023 11:25:19 AM - miso_main - WARNING - It looks like your GFF annotation file and your BAM file might not have matching headers (chromosome names.) If this is the case, your run will fail as no reads from the BAM could be matched up with your annotation.
07/18/2023 11:25:19 AM - miso_main - WARNING - Please see:
http://genes.mit.edu/burgelab/miso/docs/
for more information.
07/18/2023 11:25:19 AM - miso_main - WARNING - It looks like your BAM file (ERR188428_sorted.bam) contains 'chr' chromosomes (UCSC-style) while your GFF annotation (index) does not.
07/18/2023 11:25:19 AM - miso_main - WARNING - The first few BAM chromosomes were: chr7,chr6,chr5,chr4,chr3,chr2,chr1,chr9,chr8,chr12,chr11,chr10
BAM references:
('chr1', 'chr2', 'chr3', 'chr4', 'chr5', 'chr6', 'chr7', 'chr8', 'chr9', 'chr10', 'chr11', 'chr12', 'chr13', 'chr14', 'chr15', 'chr16', 'chr17', 'chr18', 'chr19', 'chr20', 'chr21', 'chr22', 'chrX', 'chrY', 'chrM', 'GL000008.2', 'GL000009.2', 'GL000194.1', 'GL000195.1', 'GL000205.2', 'GL000208.1', 'GL000213.1', 'GL000214.1', 'GL000216.2', 'GL000218.1', 'GL000219.1', 'GL000220.1', 'GL000221.1', 'GL000224.1', 'GL000225.1', 'GL000226.1', 'KI270302.1', 'KI270303.1', 'KI270304.1', 'KI270305.1', 'KI270310.1', 'KI270311.1', 'KI270312.1', 'KI270315.1', 'KI270316.1', 'KI270317.1', 'KI270320.1', 'KI270322.1', 'KI270329.1', 'KI270330.1', 'KI270333.1', 'KI270334.1', 'KI270335.1', 'KI270336.1', 'KI270337.1', 'KI270338.1', 'KI270340.1', 'KI270362.1', 'KI270363.1', 'KI270364.1', 'KI270366.1', 'KI270371.1', 'KI270372.1', 'KI270373.1', 'KI270374.1', 'KI270375.1', 'KI270376.1', 'KI270378.1', 'KI270379.1', 'KI270381.1', 'KI270382.1', 'KI270383.1', 'KI270384.1', 'KI270385.1', 'KI270386.1', 'KI270387.1', 'KI270388.1', 'KI270389.1', 'KI270390.1', 'KI270391.1', 'KI270392.1', 'KI270393.1', 'KI270394.1', 'KI270395.1', 'KI270396.1', 'KI270411.1', 'KI270412.1', 'KI270414.1', 'KI270417.1', 'KI270418.1', 'KI270419.1', 'KI270420.1', 'KI270422.1', 'KI270423.1', 'KI270424.1', 'KI270425.1', 'KI270429.1', 'KI270435.1', 'KI270438.1', 'KI270442.1', 'KI270448.1', 'KI270465.1', 'KI270466.1', 'KI270467.1', 'KI270468.1', 'KI270507.1', 'KI270508.1', 'KI270509.1', 'KI270510.1', 'KI270511.1', 'KI270512.1', 'KI270515.1', 'KI270516.1', 'KI270517.1', 'KI270518.1', 'KI270519.1', 'KI270521.1', 'KI270522.1', 'KI270528.1', 'KI270529.1', 'KI270530.1', 'KI270538.1', 'KI270539.1', 'KI270544.1', 'KI270548.1', 'KI270579.1', 'KI270580.1', 'KI270581.1', 'KI270582.1', 'KI270583.1', 'KI270584.1', 'KI270587.1', 'KI270588.1', 'KI270589.1', 'KI270590.1', 'KI270591.1', 'KI270593.1', 'KI270706.1', 'KI270707.1', 'KI270708.1', 'KI270709.1', 'KI270710.1', 'KI270711.1', 'KI270712.1', 'KI270713.1', 'KI270714.1', 'KI270715.1', 'KI270716.1', 'KI270717.1', 'KI270718.1', 'KI270719.1', 'KI270720.1', 'KI270721.1', 'KI270722.1', 'KI270723.1', 'KI270724.1', 'KI270725.1', 'KI270726.1', 'KI270727.1', 'KI270728.1', 'KI270729.1', 'KI270730.1', 'KI270731.1', 'KI270732.1', 'KI270733.1', 'KI270734.1', 'KI270735.1', 'KI270736.1', 'KI270737.1', 'KI270738.1', 'KI270739.1', 'KI270740.1', 'KI270741.1', 'KI270742.1', 'KI270743.1', 'KI270744.1', 'KI270745.1', 'KI270746.1', 'KI270747.1', 'KI270748.1', 'KI270749.1', 'KI270750.1', 'KI270751.1', 'KI270752.1', 'KI270753.1', 'KI270754.1', 'KI270755.1', 'KI270756.1', 'KI270757.1')
07/18/2023 11:25:19 AM - miso_main - WARNING - The first few GFF chromosomes were:
07/18/2023 11:25:19 AM - miso_main - WARNING - Run is likely to fail or produce empty output. Proceeding anyway...
Mapping genes to their indexed GFF representation, using index
Searching for index/genes_to_filenames.shelve..
- File not found.
Skipping: index/genes.gff
Skipping: index/compressed_ids_to_genes.shelve.bak
Skipping: index/genes_to_filenames.shelve.bak
Skipping: index/genes_to_filenames.shelve.dat
Skipping: index/compressed_ids_to_genes.shelve.dir
Skipping: index/genes_to_filenames.shelve.dir
Skipping: index/compressed_ids_to_genes.shelve.dat
07/18/2023 11:25:34 AM - miso_main - ERROR - No genes to run on. Did you pass me the wrong path to your index GFF directory? Or perhaps your indexed GFF directory is empty?
from rnasplice.
adding the option "--keep-gene" in the gffread script of the GTF_2_GFF3 module ( gffread $gtf -L --keep-genes | awk -F'\t' -vOFS='\t' '{ gsub("transcript", "mRNA", \$3); print}' > ${gtf.baseName}_genes.gff3
) should solve the issue
from rnasplice.
@dkoppstein I also ran a built-in test earlier, as you suggested:
nextflow run dkoppstein/rnasplice -r dev -latest -profile test,docker
and the outcome was similar to my test above. Would you be able to implement the fix suggested by @valentinoruggieri?
ERROR ~ Error executing process > 'NFCORE_RNASPLICE:RNASPLICE:VISUALISE_MISO:MISO_RUN (1)'
Caused by:
Process `NFCORE_RNASPLICE:RNASPLICE:VISUALISE_MISO:MISO_RUN (1)` terminated with an error exit status (1)
Command executed:
miso --run index ERR188383_sorted.bam --output-dir miso_data/ERR188383 --read-len 75
cat <<-END_VERSIONS > versions.yml
"NFCORE_RNASPLICE:RNASPLICE:VISUALISE_MISO:MISO_RUN":
misopy: $(python -c "import pkg_resources; print(pkg_resources.get_distribution('misopy').version)")
END_VERSIONS
Command exit status:
1
Command output:
MISO (Mixture of Isoforms model)
Probabilistic analysis of RNA-Seq data for detecting differential isoforms
Use --help argument to view options.
Using MISO settings file: /usr/local/lib/python2.7/site-packages/misopy/settings/miso_settings.txt
Computing Psi values...
- GFF index: index
- BAM: ERR188383_sorted.bam
- Read length: 75
- Output directory: miso_data/ERR188383
Checking your GFF annotation and BAM for mismatches...
Checking if BAM has mixed read lengths...
07/18/2023 12:11:06 PM - miso_main - WARNING - Found mixed length reads in your BAM file: ERR188383_sorted.bam
MISO does not support mixed read lengths. Please trim your reads to a uniform length and re-run or run MISO separately on each read length. Proceeding anyway, though this may result in errors or inability to match reads to your annotation.
Read lengths were: 20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75
Mapping genes to their indexed GFF representation, using index
Searching for index/genes_to_filenames.shelve..
- File not found.
Skipping: index/genes.gff
Skipping: index/compressed_ids_to_genes.shelve.bak
Skipping: index/genes_to_filenames.shelve.bak
Skipping: index/genes_to_filenames.shelve.dat
Skipping: index/compressed_ids_to_genes.shelve.dir
Skipping: index/genes_to_filenames.shelve.dir
Skipping: index/compressed_ids_to_genes.shelve.dat
07/18/2023 12:11:11 PM - miso_main - ERROR - No genes to run on. Did you pass me the wrong path to your index GFF directory? Or perhaps your indexed GFF directory is empty?
Command error:
MISO (Mixture of Isoforms model)
Probabilistic analysis of RNA-Seq data for detecting differential isoforms
Use --help argument to view options.
Using MISO settings file: /usr/local/lib/python2.7/site-packages/misopy/settings/miso_settings.txt
Computing Psi values...
- GFF index: index
- BAM: ERR188383_sorted.bam
- Read length: 75
- Output directory: miso_data/ERR188383
Checking your GFF annotation and BAM for mismatches...
Checking if BAM has mixed read lengths...
07/18/2023 12:11:06 PM - miso_main - WARNING - Found mixed length reads in your BAM file: ERR188383_sorted.bam
MISO does not support mixed read lengths. Please trim your reads to a uniform length and re-run or run MISO separately on each read length. Proceeding anyway, though this may result in errors or inability to match reads to your annotation.
Read lengths were: 20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75
Mapping genes to their indexed GFF representation, using index
Searching for index/genes_to_filenames.shelve..
- File not found.
Skipping: index/genes.gff
Skipping: index/compressed_ids_to_genes.shelve.bak
Skipping: index/genes_to_filenames.shelve.bak
Skipping: index/genes_to_filenames.shelve.dat
Skipping: index/compressed_ids_to_genes.shelve.dir
Skipping: index/genes_to_filenames.shelve.dir
Skipping: index/compressed_ids_to_genes.shelve.dat
07/18/2023 12:11:11 PM - miso_main - ERROR - No genes to run on. Did you pass me the wrong path to your index GFF directory? Or perhaps your indexed GFF directory is empty?
from rnasplice.
@amizeranschi I just pushed the suggested fix to my fork.
from rnasplice.
Thanks, it seems like that did the trick. The built-in test ran fine now. I'll also run my own test above and see how that turns out.
from rnasplice.
Unfortunately, my test still ran into an error. Reminder, this is the command I'm running:
nextflow run dkoppstein/rnasplice -r dev -latest -profile docker --outdir test-small-rnasplice --gencode --igenomes_ignore --genome null --save_reference \
--input https://raw.githubusercontent.com/nf-core/test-datasets/rnasplice/samplesheet/samplesheet.csv \
--contrasts https://raw.githubusercontent.com/nf-core/test-datasets/rnasplice/samplesheet/contrastsheet.csv \
--fasta https://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_44/GRCh38.primary_assembly.genome.fa.gz \
--gtf https://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_44/gencode.v44.annotation.gtf.gz \
--rmats --rmats_read_len 75 --rmats_paired_stats --aligner star_salmon --dexseq_exon
And this was the outcome, with the latest version of @dkoppstein's fork. Any idea about this error, @valentinoruggieri?
Caused by:
Process `NFCORE_RNASPLICE:RNASPLICE:VISUALISE_MISO:MISO_SASHIMI (1)` terminated with an error exit status (1)
Command executed:
sashimi_plot --plot-event ENSG00000004961 index miso_settings.txt --output-dir sashimi
cat <<-END_VERSIONS > versions.yml
"NFCORE_RNASPLICE:RNASPLICE:VISUALISE_MISO:MISO_SASHIMI":
python: $(python --version | sed "s/Python //g")
misopy: $(python -c "import pkg_resources; print(pkg_resources.get_distribution('misopy').version)")
END_VERSIONS
Command exit status:
1
Command output:
(empty)
Command error:
/usr/local/lib/python2.7/site-packages/matplotlib/cbook/deprecation.py:107: MatplotlibDeprecationWarning: The mpl_toolkits.axes_grid module was deprecated in version 2.1. Use mpl_toolkits.axes_grid1 and mpl_toolkits.axisartist provies the same functionality instead.
warnings.warn(message, mplDeprecation, stacklevel=1)
Traceback (most recent call last):
File "/usr/local/bin/sashimi_plot", line 11, in <module>
sys.exit(main())
File "/usr/local/lib/python2.7/site-packages/misopy/sashimi_plot/sashimi_plot.py", line 276, in main
plot_label=plot_label)
File "/usr/local/lib/python2.7/site-packages/misopy/sashimi_plot/sashimi_plot.py", line 142, in plot_event
%(event_name, pickle_dir)
Exception: Event ENSG00000004961 not found in pickled directory index. Are you sure this is the right directory for the event?
from rnasplice.
@amizeranschi it seems the error arises because the "miso_genes" IDs in nextflow.config are slightly different from those of the genome version you used (e.g. ENSG00000004961 vs ENSG00000004961.15) . You should change them accordingly.
from rnasplice.
I'm afraid I don't fully follow you. Could you please be a bit more specific? Where and what should I change in regards to those "miso_genes" IDs?
from rnasplice.
rnasplice relies on "miso_genes" params to specify the gene IDs you want to plot via miso/sashimi_plot function. By default, it includes 3 genes ('ENSG00000004961, ENSG00000005302, ENSG00000147403') (check the nextflow.config file).
Users need to specify the gene IDs they want to plot accordingly to the genome annotation they are using.
For example, in the gencode version you are using (version 44), genes ID are coded with an extra suffix (.*) like "ENSG00000004961.15". So to correctly instruct miso, you need to use a list of gene IDs that corresponds to that version.
You can do that:
- by modifying the gene IDs in the nextflow.config file
(for example replacing the default values with 'ENSG00000004961.15, ENSG00000005302.19, ENSG00000147403.18'); - by adding an argument to your command line
--miso_genes 'ENSG00000004961.15'
;
[https://nf-co.re/rnasplice/dev/parameters] for more details.
from rnasplice.
I see, thanks a lot for the added details. I have added the following to my command line:
--miso_genes "ENSG00000004961.15, ENSG00000005302.19, ENSG00000147403.18"
and everything worked perfectly.
It might also be helpful to update the documentation to reflect the actual default value for --miso_genes
from nextflow.config
. The website currently lists only two genes: ENSG00000004961, ENSG00000005302
:
https://nf-co.re/rnasplice/dev/parameters#miso_genes
from rnasplice.
I would also suggest changing these values automatically if --gencode
is specified.
from rnasplice.
Good point @dkoppstein. I opened a separate issue report here: #78
from rnasplice.
@dkoppstein could you please create a PR with your fixes?
from rnasplice.
Thanks a lot @jma1991 for fixing this issue, can confirm that it works. Unfortunately, now I'm running into another issue, described here: #71
from rnasplice.
Related Issues (20)
- link to example contrastsheet.csv is not working HOT 1
- Empty output for BAM+rMats HOT 2
- make contrasts file names consistent with those of the differentialabundance pipeline HOT 1
- Missing rMATS arguments HOT 2
- DEXSeq-DTU Stager pScreenAdjusted HOT 1
- sashimi_plot error HOT 2
- contrast file problem? HOT 7
- EXITING because of FATAL ERROR: Genome version: 20201 is INCOMPATIBLE with running STAR version: 2.7.9a HOT 2
- MISO error HOT 3
- The processes for splitting files are running very slowly with large numbers of input samples HOT 5
- Error: suppa_split_file.R Input_file must contain samplesheet samples. HOT 2
- ERROR ~ Error executing process > 'NFCORE_RNASPLICE:RNASPLICE:DRIMSEQ_DEXSEQ_DTU_SALMON:DRIMSEQ_FILTER (1)' HOT 11
- Error in DRIMSeq::dmDSdata(counts = counts, samples = samps) HOT 3
- Error single-end execution HOT 1
- Error in SUPPA: Clustergroups are assigned incorrectly HOT 4
- Miso error HOT 10
- `NFCORE_RNASPLICE:RNASPLICE:DRIMSEQ_DEXSEQ_DTU_SALMON:DEXSEQ_DTU (1)` HOT 7
- RNA splice output
- Merged genes output HOT 3
- STAGER error: subscript out of bounds HOT 4
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from rnasplice.