iris's People

Contributors: erickutschera, ypnngaa-py

iris's Issues

Error in rule iris_makesubsh_extract_sjc

Hello,
thank you very much for your outstanding work.
I encountered an error when running the complete pipeline. The error message is as follows:

Error in rule iris_makesubsh_extract_sjc:
    jobid: 73
    output: results/wangxin_test_chordoma/extract_sjc_tasks/cmdlist.extract_sjc.chordoma_treatment, results/wangxin_test_chordoma/extract_sjc_tasks/bam_folder_list_chordoma_treatment.txt
    log: results/wangxin_test_chordoma/extract_sjc_tasks/iris_makesubsh_extract_sjc_chordoma_treatment_log.out, results/wangxin_test_chordoma/extract_sjc_tasks/iris_makesubsh_extract_sjc_chordoma_treatment_log.err (check log file(s) for error message)
    shell:
        echo results/wangxin_test_chordoma/process_rnaseq/chordoma_treatment.aln/Aligned.sortedByCoord.out.bam > results/wangxin_test_chordoma/extract_sjc_tasks/bam_folder_list_chordoma_treatment.txt && /data/xuxi/iris/IRIS/conda_wrapper /data/xuxi/iris/IRIS/conda_env_2 IRIS makesubsh_extract_sjc --bam-folder-list results/wangxin_test_chordoma/extract_sjc_tasks/bam_folder_list_chordoma_treatment.txt --task-name chordoma_treatment --gtf references/gencode.v26lift37.annotation.gtf --genome-fasta references/ucsc.hg19.fasta --BAM-prefix Aligned.sortedByCoord.out --task-dir results/wangxin_test_chordoma/extract_sjc_tasks 1> results/wangxin_test_chordoma/extract_sjc_tasks/iris_makesubsh_extract_sjc_chordoma_treatment_log.out 2> results/wangxin_test_chordoma/extract_sjc_tasks/iris_makesubsh_extract_sjc_chordoma_treatment_log.err
        (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)

The error information is as follows:

Traceback (most recent call last):
  File "/data/xuxi/iris/IRIS/conda_env_2/bin/IRIS", line 4, in <module>
    __import__('pkg_resources').run_script('IRIS==2.0.1', 'IRIS')
  File "/data/xuxi/iris/IRIS/conda_env_2/lib/python2.7/site-packages/pkg_resources/__init__.py", line 666, in run_script
    self.require(requires)[0].run_script(script_name, ns)
  File "/data/xuxi/iris/IRIS/conda_env_2/lib/python2.7/site-packages/pkg_resources/__init__.py", line 1469, in run_script
    exec(script_code, namespace, namespace)
  File "/data/xuxi/iris/IRIS/conda_env_2/lib/python2.7/site-packages/IRIS-2.0.1-py2.7.egg/EGG-INFO/scripts/IRIS", line 596, in <module>
    
  File "/data/xuxi/iris/IRIS/conda_env_2/lib/python2.7/site-packages/IRIS-2.0.1-py2.7.egg/EGG-INFO/scripts/IRIS", line 60, in main
    
  File "build/bdist.linux-x86_64/egg/IRIS/IRIS_makesubsh_extractsj.py", line 40, in main
  File "build/bdist.linux-x86_64/egg/IRIS/IRIS_makesubsh_extractsj.py", line 5, in parseMappingLog
IOError: [Errno 20] Not a directory: 'results/wangxin_test_chordoma/process_rnaseq/chordoma_control.aln/Aligned.sortedByCoord.out.bam/Log.final.out'

I would like to know how to resolve this error.
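For what it's worth, the IOError path ends in .../Aligned.sortedByCoord.out.bam/Log.final.out, which suggests IRIS joins each entry of the --bam-folder-list file with Log.final.out. Under that assumption (not confirmed against the IRIS source), each line of the list file should be the STAR output directory rather than the BAM file path. A minimal sketch, reusing the path from the log above:

```python
# Sketch of a possible fix, assuming IRIS joins each --bam-folder-list
# entry with "Log.final.out": each line of the list file should then be
# the STAR output directory (the ".aln" folder), not the BAM file path.
import os

aln_dir = "results/wangxin_test_chordoma/process_rnaseq/chordoma_treatment.aln"
with open("bam_folder_list_chordoma_treatment.txt", "w") as handle:
    handle.write(aln_dir + "\n")

# Under that assumption, IRIS would look for the mapping log here:
print(os.path.join(aln_dir, "Log.final.out"))
```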

Thank you very much,
Xin Wang

Run the complete pipeline

Hi,
thank you for your super-interesting tool.

I have some questions about snakemake_config.yaml (if I plan to run the whole pipeline together using Snakemake):

  1. Does this file have to stay in the IRIS folder? If so, does that mean I cannot run IRIS on multiple batches of patients in parallel?
  2. If I want to run all steps to determine the degree of association, do I only have to set run_all_modules to "true" and provide a list of normals in tissue_matched_normal_reference_group_names and tumors in tumor_reference_group_names?
  3. What are the differences between tissue_matched_normal_reference_group_names and normal_reference_group_names?
  4. Can the user add their own normal controls? If yes, how?
  5. What is the meaning of "blocklist"?
  6. What do the comparison_mode values "group" and "individual" mean?
  7. For which kinds of analyses do you suggest a parametric versus a non-parametric stat_test_type?
  8. Does sample_fastqs have a maximum?
  9. Are novel events considered automatically?
  10. For the splice_event_type parameter: does a user have to run each event type separately?

Sorry to bother you,
thank you,
Serena

Running IRIS with SLURM

Hi Yang and Eric,

I tried to run the pipeline (just the ./run_example script) on our server but got the following error at the prediction step.

FileNotFoundError: [Errno 2] No such file or directory: 'qsub': 'qsub'

I guess it's because our server doesn't have SGE but uses SLURM. Is there a way to run IRIS without using SGE?
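One commonly used workaround when a tool shells out to qsub directly is a small shim script named qsub, placed on PATH, that translates arguments and forwards the job to SLURM's sbatch. The flag mapping below is a hypothetical sketch covering only two common options; IRIS's actual qsub invocations may use flags not handled here:

```python
#!/usr/bin/env python
# Hypothetical "qsub" shim: translate a couple of common SGE flags into
# their SLURM equivalents and hand everything else through to sbatch.
# The mapping is illustrative only, not IRIS's actual requirements.
import sys

SGE_TO_SLURM = {
    "-o": "--output",  # stdout file
    "-e": "--error",   # stderr file
}

def translate_qsub_args(args):
    return ["sbatch"] + [SGE_TO_SLURM.get(a, a) for a in args]

if __name__ == "__main__":
    print(translate_qsub_args(sys.argv[1:]))
    # A real shim would instead replace itself with sbatch:
    # os.execvp("sbatch", translate_qsub_args(sys.argv[1:]))
```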

Thank you!
Pierre

./install all and ./run test fail

Hi Eric,

I am new to this software. Here are my log files. Please help.
Thanks

# .snakemake/log
Building DAG of jobs...
Using shell: /bin/bash
Provided cluster nodes: 100
Job stats:
job count min threads max threads


all 1 1 1
copy_splice_matrix_files 1 1 1
count_iris_predict_tasks 1 1 1
download_reference_file 2 1 1
iris_append_sjc 1 1 1
iris_epitope_post 1 1 1
iris_predict 1 1 1
iris_screen 1 1 1
iris_visual_summary 1 1 1
unzip_reference_file 2 1 1
write_param_file 1 1 1
total 13 1 1

Select jobs to execute...

[Mon Jun 19 16:52:29 2023]
localrule copy_splice_matrix_files:
input: /home/clin/IRIS/example/splicing_matrix/splicing_matrix.SE.cov10.NEPC_example.txt, /home/clin/IRIS/example/splicing_matrix/splicing_matrix.SE.cov10.NEPC_example.txt.idx
output: /home/clin/IRIS/IRIS_data/db/NEPC_test/splicing_matrix/splicing_matrix.SE.cov10.NEPC_test.txt, /home/clin/IRIS/IRIS_data/db/NEPC_test/splicing_matrix/splicing_matrix.SE.cov10.NEPC_test.txt.idx
jobid: 10
reason: Missing output files: /home/clin/IRIS/IRIS_data/db/NEPC_test/splicing_matrix/splicing_matrix.SE.cov10.NEPC_test.txt.idx, /home/clin/IRIS/IRIS_data/db/NEPC_test/splicing_matrix/splicing_matrix.SE.cov10.NEPC_test.txt
resources: mem_mb=1000, disk_mb=1000, tmpdir=/tmp

cp /home/clin/IRIS/example/splicing_matrix/splicing_matrix.SE.cov10.NEPC_example.txt /home/clin/IRIS/IRIS_data/db/NEPC_test/splicing_matrix/splicing_matrix.SE.cov10.NEPC_test.txt && cp /home/clin/IRIS/example/splicing_matrix/splicing_matrix.SE.cov10.NEPC_example.txt.idx /home/clin/IRIS/IRIS_data/db/NEPC_test/splicing_matrix/splicing_matrix.SE.cov10.NEPC_test.txt.idx

[Mon Jun 19 16:52:29 2023]
rule download_reference_file:
output: references/gencode.v26lift37.annotation.gtf.gz
log: references/download_reference_file_gencode.v26lift37.annotation.gtf.gz_log.out, references/download_reference_file_gencode.v26lift37.annotation.gtf.gz_log.err
jobid: 9
reason: Missing output files: references/gencode.v26lift37.annotation.gtf.gz
wildcards: file_name=gencode.v26lift37.annotation.gtf.gz
resources: mem_mb=4096, disk_mb=1000, tmpdir=, time_hours=12

curl -L 'ftp://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_26/GRCh37_mapping/gencode.v26lift37.annotation.gtf.gz' -o references/gencode.v26lift37.annotation.gtf.gz 1> references/download_reference_file_gencode.v26lift37.annotation.gtf.gz_log.out 2> references/download_reference_file_gencode.v26lift37.annotation.gtf.gz_log.err
Submitted job 9 with external jobid '22798875'.

[Mon Jun 19 16:52:29 2023]
rule download_reference_file:
output: references/ucsc.hg19.fasta.gz
log: references/download_reference_file_ucsc.hg19.fasta.gz_log.out, references/download_reference_file_ucsc.hg19.fasta.gz_log.err
jobid: 6
reason: Missing output files: references/ucsc.hg19.fasta.gz
wildcards: file_name=ucsc.hg19.fasta.gz
resources: mem_mb=4096, disk_mb=1000, tmpdir=, time_hours=12

curl -L 'http://hgdownload.soe.ucsc.edu/goldenPath/hg19/bigZips/hg19.fa.gz' -o references/ucsc.hg19.fasta.gz 1> references/download_reference_file_ucsc.hg19.fasta.gz_log.out 2> references/download_reference_file_ucsc.hg19.fasta.gz_log.err
Submitted job 6 with external jobid '22798876'.
[Mon Jun 19 16:52:30 2023]
Finished job 10.
1 of 13 steps (8%) done
Waiting at most 60 seconds for missing files.
[Mon Jun 19 16:52:59 2023]
Finished job 9.
2 of 13 steps (15%) done
Select jobs to execute...

[Mon Jun 19 16:52:59 2023]
rule unzip_reference_file:
input: references/gencode.v26lift37.annotation.gtf.gz
output: references/gencode.v26lift37.annotation.gtf
log: references/unzip_reference_file_gencode.v26lift37.annotation.gtf_log.out, references/unzip_reference_file_gencode.v26lift37.annotation.gtf_log.err
jobid: 8
reason: Missing output files: references/gencode.v26lift37.annotation.gtf; Input files updated by another job: references/gencode.v26lift37.annotation.gtf.gz
wildcards: file_name=gencode.v26lift37.annotation.gtf
resources: mem_mb=4096, disk_mb=1000, tmpdir=, time_hours=12

gunzip -c references/gencode.v26lift37.annotation.gtf.gz 1> references/gencode.v26lift37.annotation.gtf 2> references/unzip_reference_file_gencode.v26lift37.annotation.gtf_log.err
Submitted job 8 with external jobid '22798927'.
Waiting at most 60 seconds for missing files.
[Mon Jun 19 16:53:29 2023]
Finished job 8.
3 of 13 steps (23%) done
[Mon Jun 19 16:53:31 2023]
Finished job 6.
4 of 13 steps (31%) done
Select jobs to execute...

[Mon Jun 19 16:53:31 2023]
rule unzip_reference_file:
input: references/ucsc.hg19.fasta.gz
output: references/ucsc.hg19.fasta
log: references/unzip_reference_file_ucsc.hg19.fasta_log.out, references/unzip_reference_file_ucsc.hg19.fasta_log.err
jobid: 5
reason: Missing output files: references/ucsc.hg19.fasta; Input files updated by another job: references/ucsc.hg19.fasta.gz
wildcards: file_name=ucsc.hg19.fasta
resources: mem_mb=4096, disk_mb=1810, tmpdir=, time_hours=12

gunzip -c references/ucsc.hg19.fasta.gz 1> references/ucsc.hg19.fasta 2> references/unzip_reference_file_ucsc.hg19.fasta_log.err
Submitted job 5 with external jobid '22798968'.
[Mon Jun 19 16:54:01 2023]
Finished job 5.
5 of 13 steps (38%) done
Select jobs to execute...

[Mon Jun 19 16:54:01 2023]
rule write_param_file:
input: references/ucsc.hg19.fasta
output: results/NEPC_test/screen.para
log: results/NEPC_test/write_param_file_log.out, results/NEPC_test/write_param_file_log.err
jobid: 4
reason: Missing output files: results/NEPC_test/screen.para; Input files updated by another job: references/ucsc.hg19.fasta
resources: mem_mb=4096, disk_mb=6103, tmpdir=, time_hours=12

/home/clin/IRIS/conda_wrapper /home/clin/IRIS/conda_env_3 python scripts/write_param_file.py --out-path results/NEPC_test/screen.para --group-name NEPC_test --iris-db /home/clin/IRIS/IRIS_data/db --psi-p-value-cutoffs ,,0.01 --sjc-p-value-cutoffs ,,0.000001 --delta-psi-cutoffs ,,0.05 --fold-change-cutoffs ,,1 --group-count-cutoffs ,,8 --reference-names-tissue-matched-normal '' --reference-names-tumor '' --reference-names-normal GTEx_Heart,GTEx_Blood,GTEx_Lung,GTEx_Liver,GTEx_Brain,GTEx_Nerve,GTEx_Muscle,GTEx_Spleen,GTEx_Thyroid,GTEx_Skin,GTEx_Kidney --comparison-mode group --statistical-test-type parametric --mapability-bigwig /home/clin/IRIS/IRIS_data/resources/mappability/wgEncodeCrgMapabilityAlign24mer.bigWig --reference-genome references/ucsc.hg19.fasta 1> results/NEPC_test/write_param_file_log.out 2> results/NEPC_test/write_param_file_log.err
Submitted job 4 with external jobid '22799008'.
[Mon Jun 19 16:54:11 2023]
Error in rule write_param_file:
jobid: 4
output: results/NEPC_test/screen.para
log: results/NEPC_test/write_param_file_log.out, results/NEPC_test/write_param_file_log.err (check log file(s) for error message)
shell:
/home/clin/IRIS/conda_wrapper /home/clin/IRIS/conda_env_3 python scripts/write_param_file.py --out-path results/NEPC_test/screen.para --group-name NEPC_test --iris-db /home/clin/IRIS/IRIS_data/db --psi-p-value-cutoffs ,,0.01 --sjc-p-value-cutoffs ,,0.000001 --delta-psi-cutoffs ,,0.05 --fold-change-cutoffs ,,1 --group-count-cutoffs ,,8 --reference-names-tissue-matched-normal '' --reference-names-tumor '' --reference-names-normal GTEx_Heart,GTEx_Blood,GTEx_Lung,GTEx_Liver,GTEx_Brain,GTEx_Nerve,GTEx_Muscle,GTEx_Spleen,GTEx_Thyroid,GTEx_Skin,GTEx_Kidney --comparison-mode group --statistical-test-type parametric --mapability-bigwig /home/clin/IRIS/IRIS_data/resources/mappability/wgEncodeCrgMapabilityAlign24mer.bigWig --reference-genome references/ucsc.hg19.fasta 1> results/NEPC_test/write_param_file_log.out 2> results/NEPC_test/write_param_file_log.err
(one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)
cluster_jobid: 22799008

Error executing rule write_param_file on cluster (jobid: 4, external: 22799008, jobscript: /home/clin/IRIS/.snakemake/tmp.7v6mkgax/snakejob.write_param_file.4.sh). For error details see the cluster log and the log files of the involved rule(s).
Trying to restart job 4.
Select jobs to execute...

[Mon Jun 19 16:54:11 2023]
rule write_param_file:
input: references/ucsc.hg19.fasta
output: results/NEPC_test/screen.para
log: results/NEPC_test/write_param_file_log.out, results/NEPC_test/write_param_file_log.err
jobid: 4
reason: Missing output files: results/NEPC_test/screen.para; Input files updated by another job: references/ucsc.hg19.fasta
resources: mem_mb=4096, disk_mb=6103, tmpdir=, time_hours=12

/home/clin/IRIS/conda_wrapper /home/clin/IRIS/conda_env_3 python scripts/write_param_file.py --out-path results/NEPC_test/screen.para --group-name NEPC_test --iris-db /home/clin/IRIS/IRIS_data/db --psi-p-value-cutoffs ,,0.01 --sjc-p-value-cutoffs ,,0.000001 --delta-psi-cutoffs ,,0.05 --fold-change-cutoffs ,,1 --group-count-cutoffs ,,8 --reference-names-tissue-matched-normal '' --reference-names-tumor '' --reference-names-normal GTEx_Heart,GTEx_Blood,GTEx_Lung,GTEx_Liver,GTEx_Brain,GTEx_Nerve,GTEx_Muscle,GTEx_Spleen,GTEx_Thyroid,GTEx_Skin,GTEx_Kidney --comparison-mode group --statistical-test-type parametric --mapability-bigwig /home/clin/IRIS/IRIS_data/resources/mappability/wgEncodeCrgMapabilityAlign24mer.bigWig --reference-genome references/ucsc.hg19.fasta 1> results/NEPC_test/write_param_file_log.out 2> results/NEPC_test/write_param_file_log.err
Submitted job 4 with external jobid '22799023'.
[Mon Jun 19 16:54:22 2023]
Error in rule write_param_file:
jobid: 4
output: results/NEPC_test/screen.para
log: results/NEPC_test/write_param_file_log.out, results/NEPC_test/write_param_file_log.err (check log file(s) for error message)
shell:
/home/clin/IRIS/conda_wrapper /home/clin/IRIS/conda_env_3 python scripts/write_param_file.py --out-path results/NEPC_test/screen.para --group-name NEPC_test --iris-db /home/clin/IRIS/IRIS_data/db --psi-p-value-cutoffs ,,0.01 --sjc-p-value-cutoffs ,,0.000001 --delta-psi-cutoffs ,,0.05 --fold-change-cutoffs ,,1 --group-count-cutoffs ,,8 --reference-names-tissue-matched-normal '' --reference-names-tumor '' --reference-names-normal GTEx_Heart,GTEx_Blood,GTEx_Lung,GTEx_Liver,GTEx_Brain,GTEx_Nerve,GTEx_Muscle,GTEx_Spleen,GTEx_Thyroid,GTEx_Skin,GTEx_Kidney --comparison-mode group --statistical-test-type parametric --mapability-bigwig /home/clin/IRIS/IRIS_data/resources/mappability/wgEncodeCrgMapabilityAlign24mer.bigWig --reference-genome references/ucsc.hg19.fasta 1> results/NEPC_test/write_param_file_log.out 2> results/NEPC_test/write_param_file_log.err
(one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)
cluster_jobid: 22799023

Error executing rule write_param_file on cluster (jobid: 4, external: 22799023, jobscript: /home/clin/IRIS/.snakemake/tmp.7v6mkgax/snakejob.write_param_file.4.sh). For error details see the cluster log and the log files of the involved rule(s).
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: .snakemake/log/2023-06-19T165228.360928.snakemake.log

# install_log
Package libgcc conflicts for:
r-base==3.4.1 -> libgcc
seaborn==0.9.0 -> scipy[version='>=0.15.2'] -> libgcc
seq2hla==2.2 -> bowtie==1.1.2 -> libgcc
statsmodels==0.10.2 -> scipy[version='>=0.14'] -> libgcc

Package _openmp_mutex conflicts for:
r-base==3.4.1 -> libgcc-ng[version='>=4.9'] -> _openmp_mutex[version='>=4.5']
rmats==4.1.2 -> libgcc-ng[version='>=10.3.0'] -> _openmp_mutex
star==2.7.10b -> libgcc-ng[version='>=12'] -> _openmp_mutex[version='>=4.5']
python=2.7 -> libgcc-ng[version='>=7.3.0'] -> _openmp_mutex[version='>=4.5']
rmats==4.1.2 -> _openmp_mutex[version='>=4.5']
scipy==1.2.0 -> libgcc-ng[version='>=7.3.0'] -> _openmp_mutex[version='|>=4.5',build=_llvm]
pysam==0.15.4 -> libgcc-ng[version='>=7.3.0'] -> _openmp_mutex[version='>=4.5']
seq2hla==2.2 -> r-base -> _openmp_mutex
bedtools==2.29.0 -> libgcc-ng[version='>=7.3.0'] -> _openmp_mutex[version='>=4.5']
pybigwig==0.3.13 -> libgcc-ng[version='>=7.3.0'] -> _openmp_mutex[version='>=4.5']
numpy==1.16.5 -> libgcc-ng[version='>=9.3.0'] -> _openmp_mutex[version='|>=4.5',build=_llvm]
statsmodels==0.10.2 -> libgcc-ng[version='>=7.3.0'] -> _openmp_mutex[version='>=4.5']

Package xz conflicts for:
r-base==3.4.1 -> cairo[version='>=1.14.12,<2.0.0a0'] -> xz[version='5.0.*|>=5.2.10,<6.0a0|>=5.2.3,<6.0a0|>=5.2.4,<6.0a0|>=5.2.6,<6.0a0|>=5.2.6,<5.3.0a0|>=5.2.5,<5.3.0a0|>=5.2.4,<5.3.0a0|>=5.2.5,<6.0a0|>=5.2.8,<6.0a0']
bedtools==2.29.0 -> xz[version='>=5.2.4,<5.3.0a0']
r-base==3.4.1 -> xz[version='5.2.*|>=5.2.3,<5.3.0a0']
rmats==4.1.2 -> python[version='>=3.9,<3.10.0a0'] -> xz[version='5.2.*|>=5.2.10,<6.0a0|>=5.2.5,<5.3.0a0|>=5.2.6,<6.0a0|>=5.4.2,<6.0a0|>=5.2.8,<6.0a0|>=5.2.5,<6.0a0|>=5.2.4,<5.3.0a0|>=5.2.4,<6.0a0|>=5.2.6,<5.3.0a0|>=5.2.3,<5.3.0a0|>=5.2.3,<6.0a0']
pysam==0.15.4 -> python[version='>=3.6,<3.7.0a0'] -> xz[version='5.2.*|>=5.2.3,<5.3.0a0|>=5.2.5,<6.0a0|>=5.2.4,<6.0a0|>=5.2.3,<6.0a0|>=5.2.10,<6.0a0|>=5.2.6,<6.0a0']
statsmodels==0.10.2 -> python[version='>=3.6,<3.7.0a0'] -> xz[version='5.2.*|>=5.2.3,<5.3.0a0|>=5.2.4,<5.3.0a0|>=5.2.5,<5.3.0a0|>=5.2.5,<6.0a0|>=5.2.4,<6.0a0|>=5.2.3,<6.0a0|>=5.2.6,<6.0a0|>=5.4.2,<6.0a0|>=5.2.10,<6.0a0']
star==2.7.10b -> htslib[version='>=1.17,<1.18.0a0'] -> xz[version='>=5.2.6,<5.3.0a0|>=5.2.6,<6.0a0']
seaborn==0.9.0 -> python -> xz[version='5.0.*|5.2.*|>=5.2.3,<5.3.0a0|>=5.2.4,<5.3.0a0|>=5.2.5,<5.3.0a0|>=5.2.6,<5.3.0a0|>=5.2.6,<6.0a0|>=5.4.2,<6.0a0|>=5.2.10,<6.0a0|>=5.2.8,<6.0a0|>=5.2.5,<6.0a0|>=5.2.4,<6.0a0|>=5.2.3,<6.0a0']
pysam==0.15.4 -> xz[version='>=5.2.4,<5.3.0a0|>=5.2.5,<5.3.0a0']
scipy==1.2.0 -> python[version='>=3.7,<3.8.0a0'] -> xz[version='5.2.*|>=5.2.10,<6.0a0|>=5.2.3,<5.3.0a0|>=5.2.4,<5.3.0a0|>=5.2.5,<5.3.0a0|>=5.2.6,<6.0a0|>=5.2.5,<6.0a0|>=5.2.4,<6.0a0|>=5.2.3,<6.0a0']
numpy==1.16.5 -> python[version='>=3.7,<3.8.0a0'] -> xz[version='5.2.*|>=5.2.10,<6.0a0|>=5.2.3,<5.3.0a0|>=5.2.4,<5.3.0a0|>=5.2.5,<5.3.0a0|>=5.2.6,<6.0a0|>=5.2.5,<6.0a0|>=5.2.4,<6.0a0|>=5.4.2,<6.0a0|>=5.2.3,<6.0a0']
cufflinks==2.2.1 -> python[version='>=3.5,<3.6.0a0'] -> xz[version='5.0.*|5.2.*|>=5.2.3,<5.3.0a0|>=5.2.5,<6.0a0|>=5.2.4,<6.0a0|>=5.2.3,<6.0a0|>=5.2.5,<5.3.0a0|>=5.2.4,<5.3.0a0']
seq2hla==2.2 -> r-base -> xz[version='5.2.*|>=5.2.3,<5.3.0a0|>=5.2.4,<5.3.0a0|>=5.2.5,<5.3.0a0|>=5.2.6,<6.0a0|>=5.2.5,<6.0a0|>=5.2.4,<6.0a0|>=5.2.3,<6.0a0']
pybigwig==0.3.13 -> python[version='>=3.6,<3.7.0a0'] -> xz[version='5.0.*|5.2.*|>=5.2.3,<5.3.0a0|>=5.2.4,<5.3.0a0|>=5.2.5,<5.3.0a0|>=5.2.5,<6.0a0|>=5.2.4,<6.0a0|>=5.2.3,<6.0a0|>=5.2.10,<6.0a0|>=5.2.6,<6.0a0']

Package intel-openmp conflicts for:
seaborn==0.9.0 -> scipy[version='>=0.15.2'] -> intel-openmp[version='>=2021.4.0,<2022.0a0|>=2023.1.0,<2024.0a0']
statsmodels==0.10.2 -> scipy[version='>=0.14'] -> intel-openmp[version='>=2021.4.0,<2022.0a0|>=2023.1.0,<2024.0a0']
numpy==1.16.5 -> mkl[version='>=2019.4,<2021.0a0'] -> intel-openmp
scipy==1.2.0 -> mkl[version='>=2019.1,<2021.0a0'] -> intel-openmp

Package llvm-openmp conflicts for:
numpy==1.16.5 -> mkl[version='>=2019.4,<2021.0a0'] -> llvm-openmp[version='>=10.0.0|>=11.0.0|>=9.0.1|>=16.0.5|>=14.0.4|>=13.0.1|>=12.0.1|>=11.1.0|>=11.0.1|>=16.0.1|>=10.0.1']
rmats==4.1.2 -> _openmp_mutex[version='>=4.5'] -> llvm-openmp[version='>=9.0.1']
scipy==1.2.0 -> blas=[build=openblas] -> llvm-openmp[version='>=10.0.0|>=11.0.0|>=11.0.1|>=11.1.0|>=12.0.1|>=13.0.1|>=14.0.4|>=16.0.5|>=9.0.1|>=16.0.1|>=10.0.1']

Package readline conflicts for:
python=2.7 -> sqlite[version='>=3.30.1,<4.0a0'] -> readline[version='>=8.1,<9.0a0|>=8.1.2,<9.0a0|>=8.2,<9.0a0']
python=2.7 -> readline[version='6.2.*|7.0|7.0.*|>=7.0,<8.0a0|>=8.0,<9.0a0|7.*']

Package liblapacke conflicts for:
numpy==1.16.5 -> blas=[build=openblas] -> liblapacke[version='3.8.0|3.8.0|3.8.0|3.8.0|3.8.0|3.8.0|3.8.0|3.8.0|3.8.0|3.8.0|3.8.0|3.8.0|3.8.0|3.8.0|3.9.0|3.9.0|3.9.0|3.9.0',build='0_openblas|8_openblas|12_openblas|13_openblas|15_openblas|16_openblas|17_openblas|10_openblas|11_linux64_openblas|12_linux64_openblas|14_linux64_openblas|15_linux64_openblas|16_linux64_openblas|17_linux64_openblas|13_linux64_openblas|9_openblas|8_openblas|7_openblas|6_openblas|5_openblas|14_openblas|11_openblas|10_openblas|9_openblas|7_openblas|6_openblas|5_openblas|4_openblas|3_openblas|2_openblas']
scipy==1.2.0 -> blas=[build=openblas] -> liblapacke[version='3.8.0|3.8.0|3.8.0|3.8.0|3.8.0|3.8.0|3.8.0|3.8.0|3.8.0|3.8.0|3.8.0|3.8.0|3.8.0|3.8.0|3.9.0|3.9.0|3.9.0|3.9.0',build='0_openblas|8_openblas|12_openblas|13_openblas|15_openblas|16_openblas|17_openblas|10_openblas|11_linux64_openblas|12_linux64_openblas|14_linux64_openblas|15_linux64_openblas|16_linux64_openblas|17_linux64_openblas|13_linux64_openblas|9_openblas|8_openblas|7_openblas|6_openblas|5_openblas|14_openblas|11_openblas|10_openblas|9_openblas|7_openblas|6_openblas|5_openblas|4_openblas|3_openblas|2_openblas']

Package statsmodels conflicts for:
statsmodels==0.10.2
seaborn==0.9.0 -> statsmodels[version='>=0.5.0']

Package libgcc-ng conflicts for:
python=2.7 -> openssl[version='>=1.1.1a,<1.1.2a'] -> libgcc-ng[version='>=10.3.0|>=12|>=9.4.0|>=9.3.0|>=7.5.0']
python=2.7 -> libgcc-ng[version='>=11.2.0|>=4.9|>=7.3.0|>=7.2.0']

Package zlib conflicts for:
python=2.7 -> sqlite[version='>=3.30.1,<4.0a0'] -> zlib[version='>=1.2.12,<1.3.0a0|>=1.2.13,<2.0a0']
python=2.7 -> zlib[version='1.2.*|1.2.11|1.2.11.*|>=1.2.11,<1.3.0a0|1.2.8|>=1.2.13,<1.3.0a0']

Package libstdcxx-ng conflicts for:
python=2.7 -> ncurses[version='>=6.1,<7.0.0a0'] -> libstdcxx-ng[version='>=11.2.0|>=7.5.0|>=9.4.0']
python=2.7 -> libstdcxx-ng[version='>=4.9|>=7.3.0|>=7.2.0']

Package star conflicts for:
rmats==4.1.2 -> star[version='>=2.5']
star==2.7.10b

Package ncurses conflicts for:
python=2.7 -> readline[version='>=8.0,<9.0a0'] -> ncurses[version='5.9|>=6.2,<7.0.0a0|>=6.4,<7.0a0']
python=2.7 -> ncurses[version='5.9.*|>=6.1,<7.0.0a0|>=6.3,<7.0a0|>=6.2,<7.0a0|>=6.1,<7.0a0|>=6.0,<7.0a0|6.0.*']

The following specifications were found to be incompatible with your system:

  • feature:/linux-64::__glibc==2.27=0
  • feature:|@/linux-64::__glibc==2.27=0
  • bedtools==2.29.0 -> libgcc-ng[version='>=7.3.0'] -> __glibc[version='>=2.17']
  • numpy==1.16.5 -> libgcc-ng[version='>=9.3.0'] -> __glibc[version='>=2.17']
  • pybigwig==0.3.13 -> libgcc-ng[version='>=7.3.0'] -> __glibc[version='>=2.17']
  • pysam==0.15.4 -> libgcc-ng[version='>=7.3.0'] -> __glibc[version='>=2.17']
  • r-base==3.4.1 -> libgcc-ng[version='>=4.9'] -> __glibc[version='>=2.17']
  • rmats==4.1.2 -> libgfortran-ng -> __glibc[version='>=2.17']
  • scipy==1.2.0 -> libgcc-ng[version='>=7.3.0'] -> __glibc[version='>=2.17']
  • statsmodels==0.10.2 -> libgcc-ng[version='>=7.3.0'] -> __glibc[version='>=2.17']

Your installed version is: 2.27

Note that strict channel priority may have removed packages required for satisfiability.

Workflow Errors

Hi, Eric,

I'm running into a new workflow error, mostly with "Resource Usage" in the cluster_status.py file.

This is the error I am seeing:

WorkflowError:
Cluster status command python ./snakemake_profile/cluster_status.py --retry-status-interval-seconds 30,120,300 --resource-usage-dir ./snakemake_profile/job_resource_usage --resource-usage-min-interval 300 returned None but just a single line with one of running,success,failed is expected.

Is there any way for me to disable "resource usage" altogether? I don't really need to see resource utilization; I just need IRIS to run. Adapting from SLURM to LSF was an endeavor, so I might find value in taking some of the optional functionality out of the Snakemake pipeline.
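As a point of reference, Snakemake's --cluster-status contract is simple: the script receives an external job id and must print exactly one line containing running, success, or failed. The core of a stripped-down replacement is just a state-to-word mapping; the LSF state names below are assumptions, and the actual scheduler query (e.g. via bjobs) is omitted:

```python
# Minimal sketch of the mapping a --cluster-status script must perform:
# whatever the scheduler reports, print exactly one of "running",
# "success", or "failed". The LSF state names here are assumptions.
def snakemake_status(scheduler_state):
    if scheduler_state == "DONE":
        return "success"
    if scheduler_state in ("EXIT", "ZOMBI"):
        return "failed"
    return "running"  # PEND, RUN, unknown states, empty query output, ...

for state in ("DONE", "RUN", "EXIT"):
    print(snakemake_status(state))
```

A script this small (with the resource-usage flags dropped from the profile) may be enough if the usage reports are not needed.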

Please let me know,
Walid

Install IRIS error

Hello,

We are trying to install IRIS in our research cluster and we get this error.

CondaFileIOError: 'conda_requirements_py2.txt'. [Errno 2] No such file or directory: 'conda_requirements_py2.txt'

Also, is it possible to run IRIS on Python 3, given that there is a conda_requirements_py3.txt file?

This is the terminal output from the install:


[root@afrodita-login IRIS]# ./install core
 
checking conda
WARNING: A conda environment already exists at '/root/conda_env_2'
Remove existing environment (y/[n])? y
 
Collecting package metadata (current_repodata.json): done
Solving environment: done
 
## Package Plan ##
 
  environment location: /root/conda_env_2

Proceed ([y]/n)? y
 
Preparing transaction: done
Verifying transaction: done
Executing transaction: done
#
# To activate this environment, use
#
#     $ conda activate /root/conda_env_2
#
# To deactivate an active environment, use
#
#     $ conda deactivate
 
WARNING: A conda environment already exists at '/root/conda_env_3'
Remove existing environment (y/[n])? y
 
Collecting package metadata (current_repodata.json): done
Solving environment: done
 
## Package Plan ##
 
  environment location: /root/conda_env_3
 
Proceed ([y]/n)? y 
Preparing transaction: done
Verifying transaction: done
Executing transaction: done
#
# To activate this environment, use
#
#     $ conda activate /root/conda_env_3
#
# To deactivate an active environment, use
#
#     $ conda deactivate
 
WARNING: A conda environment already exists at '/root/conda_env_samtools'
Remove existing environment (y/[n])? y
 
Collecting package metadata (current_repodata.json): done
Solving environment: done
 
## Package Plan ##
 
  environment location: /root/conda_env_samtools
 
Proceed ([y]/n)? y

Preparing transaction: done
Verifying transaction: done
Executing transaction: done

#
# To activate this environment, use
#
#     $ conda activate /root/conda_env_samtools
#
# To deactivate an active environment, use
#
#     $ conda deactivate

checking python dependencies

CondaFileIOError: 'conda_requirements_py2.txt'. [Errno 2] No such file or directory: 'conda_requirements_py2.txt'

[root@afrodita-login IRIS]# nano conda_requirements_py2.txt

[root@afrodita-login IRIS]# nano conda_requirements_py2.txt

[root@afrodita-login IRIS]# ./install core

checking conda

WARNING: A conda environment already exists at '/root/conda_env_2'

Remove existing environment (y/[n])? y

Collecting package metadata (current_repodata.json): done

Solving environment: done

## Package Plan ##
  environment location: /root/conda_env_2

Proceed ([y]/n)? y
Preparing transaction: done
Verifying transaction: done
Executing transaction: done

#
# To activate this environment, use
#
#     $ conda activate /root/conda_env_2
#
# To deactivate an active environment, use
#
#     $ conda deactivate

WARNING: A conda environment already exists at '/root/conda_env_3'

Remove existing environment (y/[n])? y

Collecting package metadata (current_repodata.json): done

Solving environment: done

## Package Plan ##

  environment location: /root/conda_env_3

Proceed ([y]/n)? y

Preparing transaction: done
Verifying transaction: done
Executing transaction: done

#
# To activate this environment, use
#
#     $ conda activate /root/conda_env_3
#
# To deactivate an active environment, use
#
#     $ conda deactivate

WARNING: A conda environment already exists at '/root/conda_env_samtools'

Remove existing environment (y/[n])? y

Collecting package metadata (current_repodata.json): done
Solving environment: done

## Package Plan ##

  environment location: /root/conda_env_samtools
  
Proceed ([y]/n)? y

Preparing transaction: done
Verifying transaction: done
Executing transaction: done

#
# To activate this environment, use
#
#     $ conda activate /root/conda_env_samtools
#
# To deactivate an active environment, use
#
#     $ conda deactivate

checking python dependencies

CondaFileIOError: 'conda_requirements_py2.txt'. [Errno 2] No such file or directory:  'conda_requirements_py2.txt'

[root@afrodita-login IRIS]# cat conda_requirements_py2.txt
bedtools=2.29.0
numpy=1.16.5
pybigwig=0.3.13
python=2.7.*
scipy=1.2.0
seaborn=0.9.0
statsmodels=0.10.2


Retrieving the reference db using google_drive_download.py

Hi,

I was attempting to install IRIS on our HPC cluster, and I am running into trouble with the data download script google_drive_download.py.

I set up a Google service account, and I passed the key as the instructions note. Here is the command I am running:
python google_drive_download.py --iris-folder-id 1zhmXoajD5RyjxVTYbGZ-ebic1VPfEYKz --dest-dir IRIS_data --download-all --api-key-json-path iris-module-install-1a0df2bdaffd.json

I am running this command from inside the parent directory, ./IRIS/, and I want to use the --download-all flag. I keep getting the following error:

  File "/shortened_path/iris/vendor/IRIS/google_drive_download.py", line 187, in <module>
    main()
  File "/shortened_path/iris/vendor/IRIS/google_drive_download.py", line 63, in main
    download_all_files(args.iris_folder_id, args.dest_dir,
  File "/shortened_path/iris/vendor/IRIS/google_drive_download.py", line 132, in download_all_files
    write_file_tsv(all_files, TOP_DIR_NAME, temp_handle)
  File "/shortened_path/iris/vendor/IRIS/google_drive_download.py", line 121, in write_file_tsv
    write_file_tsv(folder_dict['files'], full_path, tsv_handle)
  File "/shortened_path/iris/vendor/IRIS/google_drive_download.py", line 121, in write_file_tsv
    write_file_tsv(folder_dict['files'], full_path, tsv_handle)
  File "/shortened_path/iris/vendor/IRIS/google_drive_download.py", line 121, in write_file_tsv
    write_file_tsv(folder_dict['files'], full_path, tsv_handle)
  File "/shortened_path/iris/vendor/IRIS/google_drive_download.py", line 117, in write_file_tsv
    write_tsv_line([full_path, file_dict['id']], tsv_handle)
  File "/shortened_path/iris/vendor/IRIS/google_drive_download.py", line 101, in write_tsv_line
    tsv_handle.write('{}\n'.format('\t'.join(columns)))
  File "/shortened_path/python/install/3.9.9/lib/python3.9/tempfile.py", line 478, in func_wrapper
    return func(*args, **kwargs)
TypeError: a bytes-like object is required, not 'str'

Any pointers on where to go from here? I know the script needs editing to write str instead of bytes, but I have not looked through the entire file yet.
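For reference, this is the classic Python 3 text-versus-bytes mismatch: tempfile.TemporaryFile defaults to binary mode, so writing a str raises TypeError. A minimal reproduction and the likely fix (opening the temp handle in text mode), assuming the script only ever writes str to it:

```python
import tempfile

# Binary mode (the default, "w+b") rejects str:
with tempfile.TemporaryFile() as handle:
    try:
        handle.write("{}\n".format("\t".join(["path", "file_id"])))
    except TypeError as err:
        print(err)  # a bytes-like object is required, not 'str'

# Text mode accepts it; switching the script's tempfile call to
# mode="w+t" is the likely one-line fix (names here mirror the
# traceback, not the actual google_drive_download.py code).
with tempfile.TemporaryFile(mode="w+t") as handle:
    handle.write("{}\n".format("\t".join(["path", "file_id"])))
    handle.seek(0)
    print(handle.read(), end="")
```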

Cheers,
Walid

hg38

Hello,

Is the downloadable data available with hg38 annotations? If so, can you point me to where it is?

screening error when using individual mode

Thank you very much for your patient and detailed explanations earlier. Recently, when I was using IRIS in individual mode with a single sample, the following error occurred:

/IRIS/IRIS/conda_wrapper /IRIS/IRIS/conda_env_2 IRIS screen --parameter-fin results/docker_test/screen.para --splicing-event-type SE --outdir results/docker_test/screen --translating --gtf references/gencode.v26lift37.annotation.gtf 1> results/docker_test/iris_screen_log.out 2> results/docker_test/iris_screen_log.err
[Mon Mar 11 03:37:12 2024]
Error in rule iris_screen:
    jobid: 7
    output: results/docker_test/screen/docker_test.SE.test.all_guided.txt, results/docker_test/screen/docker_test.SE.test.all_voted.txt, results/docker_test/screen/docker_test.SE.notest.txt, results/docker_test/screen/docker_test.SE.tier1.txt, results/docker_test/screen/docker_test.SE.tier2tier3.txt
    log: results/docker_test/iris_screen_log.out, results/docker_test/iris_screen_log.err (check log file(s) for error message)
    shell:
        /IRIS/IRIS/conda_wrapper /IRIS/IRIS/conda_env_2 IRIS screen --parameter-fin results/docker_test/screen.para --splicing-event-type SE --outdir results/docker_test/screen --translating --gtf references/gencode.v26lift37.annotation.gtf 1> results/docker_test/iris_screen_log.out 2> results/docker_test/iris_screen_log.err
        (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)

and here is the log:

[Ended] no test performed because no testable events. Check input or filtering parameteres.

additionally, here is my "snakemake_config.yaml" file:

# Resource allocation
create_star_index_threads: 200
create_star_index_mem_gb: 140
create_star_index_time_hr: 12
iris_append_sjc_mem_gb: 180
iris_append_sjc_time_hr: 24
# TODO 16 threads hardcoded in iris process_rnaseq
iris_cuff_task_threads: 200
iris_cuff_task_mem_gb: 180
iris_cuff_task_time_hr: 12
iris_epitope_post_mem_gb: 180
iris_epitope_post_time_hr: 12
iris_exp_matrix_mem_gb: 180
iris_exp_matrix_time_hr: 12
iris_extract_sjc_task_mem_gb: 180
iris_extract_sjc_task_time_hr: 12
iris_format_mem_gb: 180
iris_format_time_hr: 12
# TODO seq2HLA defaults to 6 threads since IRIS does not supply the -p argument
iris_hla_task_threads: 200
iris_hla_task_mem_gb: 180
iris_hla_task_time_hr: 12
iris_parse_hla_mem_gb: 180
iris_parse_hla_time_hr: 12
iris_predict_mem_gb: 180
iris_predict_time_hr: 12
iris_predict_task_mem_gb: 180
iris_predict_task_time_hr: 12
# TODO 8 hardcoded in makesubsh_rmats
iris_rmats_task_threads: 200
iris_rmats_task_mem_gb: 180
iris_rmats_task_time_hr: 12
# TODO 8 hardcoded in makesubsh_rmatspost
iris_rmatspost_task_threads: 200
iris_rmatspost_task_mem_gb: 180
iris_rmatspost_task_time_hr: 12
iris_screen_mem_gb: 180
iris_screen_time_hr: 12
iris_screen_sjc_mem_gb: 180
iris_screen_sjc_time_hr: 12
iris_sjc_matrix_mem_gb: 180
iris_sjc_matrix_time_hr: 12
# TODO 6 threads hardcoded in iris process_rnaseq
iris_star_task_threads: 200
iris_star_task_mem_gb: 200
iris_star_task_time_hr: 12
iris_visual_summary_mem_gb: 180
iris_visual_summary_time_hr: 12
# Command options
run_core_modules: false
# run_all_modules toggles which rules can be run by
# conditionally adding UNSATISFIABLE_INPUT to certain rules.
run_all_modules: true
should_run_sjc_steps: true
star_sjdb_overhang: 100
run_name: 'docker_test'  # used to name output files
splice_event_type: 'SE'  # one of [SE, RI, A3SS, A5SS]
comparison_mode: 'individual'  # group or individual
stat_test_type: 'parametric'  # parametric or nonparametric
use_ratio: false
tissue_matched_normal_psi_p_value_cutoff: ''
tissue_matched_normal_sjc_p_value_cutoff: ''
tissue_matched_normal_delta_psi_p_value_cutoff: ''
tissue_matched_normal_fold_change_cutoff: ''
tissue_matched_normal_group_count_cutoff: ''
tissue_matched_normal_reference_group_names: ''
tumor_psi_p_value_cutoff: ''
tumor_sjc_p_value_cutoff: ''
tumor_delta_psi_p_value_cutoff: ''
tumor_fold_change_cutoff: ''
tumor_group_count_cutoff: ''
tumor_reference_group_names: ''
normal_psi_p_value_cutoff: '0.01'
normal_sjc_p_value_cutoff: '0.000001'
normal_delta_psi_p_value_cutoff: '0.05'
normal_fold_change_cutoff: '1'
normal_group_count_cutoff: '8'
normal_reference_group_names: 'GTEx_Heart,GTEx_Blood,GTEx_Lung,GTEx_Liver,GTEx_Brain,GTEx_Nerve,GTEx_Muscle,GTEx_Spleen,GTEx_Thyroid,GTEx_Skin,GTEx_Kidney'
# Input files
# sample_fastqs are not needed when just running the core modules
sample_fastqs:
    DN2222153:
     - '/IRIS/inputs/T001332989/SD221201094FTT_01_R1.fq'
     - '/IRIS/inputs/T001332989/SD221201094FTT_01_R2.fq'
#   sample_name_2:
#     - '/path/to/sample_2_read_1.fq'
#     - '/path/to/sample_2_read_2.fq'


blocklist: ''
####---------------------------------- Do not need to change the following arguments ----------------------------------####
mapability_bigwig: '/IRIS/IRIS_data/resources/mappability/wgEncodeCrgMapabilityAlign24mer.bigWig'
# mhc_list: '/path/to/example/hla_types_test.list'
# mhc_by_sample: '/path/to/example/hla_patient_test.tsv'
gene_exp_matrix: ''
#splice_matrix_txt: '/path/to/example/splicing_matrix/splicing_matrix.SE.cov10.NEPC_example.txt'
#splice_matrix_idx: '/path/to/example/splicing_matrix/splicing_matrix.SE.cov10.NEPC_example.txt.idx'
#sjc_count_txt: '/path/to/example/sjc_matrix/SJ_count.NEPC_example.txt'
#sjc_count_idx: '/path/to/example/sjc_matrix/SJ_count.NEPC_example.txt.idx'
# Reference files
gtf_name: 'gencode.v26lift37.annotation.gtf'
fasta_name: 'ucsc.hg19.fasta'
reference_files:
  gencode.v26lift37.annotation.gtf.gz:
    url: 'ftp://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_26/GRCh37_mapping/gencode.v26lift37.annotation.gtf.gz'
  ucsc.hg19.fasta.gz:
    url: 'http://hgdownload.soe.ucsc.edu/goldenPath/hg19/bigZips/hg19.fa.gz'
# Additional configuration
rmats_path: '/IRIS/IRIS/conda_env_2/bin/rmats.py'  # should be written by ./install
conda_wrapper: '/IRIS/IRIS/conda_wrapper'  # should be written by ./install
conda_env_2: '/IRIS/IRIS/conda_env_2'  # should be written by ./install
conda_env_3: '/IRIS/IRIS/conda_env_3'  # should be written by ./install
iris_data: '/IRIS/IRIS_data'  # should be written by ./install
iedb_path: '/IRIS/IRIS/IEDB/mhc_i/src'  # should be written by ./install
rmats_path: '/IRIS/IRIS/conda_env_2/bin/rmats.py'

I would greatly appreciate it if you could provide any suggestions. If possible, I would also like a "snakemake_config.yaml" file template for the individual mode.

Looking forward to your reply. Thanks again.
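In case it helps with debugging the "no testable events" message: a quick sanity check is to count how many events in the splicing matrix have coverage in enough samples. This sketch assumes (I have not verified this against the IRIS source) that the matrix is a tab-separated file with a header row, one event per line, the event ID in the first column, and missing PSI values written as empty or `NaN`:

```python
def count_testable_events(matrix_path, min_samples=1):
    """Count events with at least `min_samples` non-missing PSI values."""
    missing = ('', 'NaN', 'nan', 'NA')
    testable = 0
    with open(matrix_path) as fh:
        fh.readline()  # skip the header row
        for line in fh:
            fields = line.rstrip('\n').split('\t')
            covered = sum(1 for v in fields[1:] if v not in missing)
            if covered >= min_samples:
                testable += 1
    return testable
```

If this returns 0 for the single-sample matrix, the filtering parameters (or the upstream matrix generation) are the place to look.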

Run IRIS on hg38

Hello,

As seen in the methods section of the paper, the IRIS RNA-seq data processing uses the hg19 reference human genome.
Is it possible to use IRIS with the hg38 reference human genome?

Also, can the IRIS functions be run individually one by one without configuring the snakemake files?

Thank you,

Solving environment for python dependencies fails

Hello,

I was just trying to install IRIS using the ./install all command. The three ## Package Plan ## environments get built, but when checking python dependencies, the solving environment buffers indefinitely until it fails. I am attaching the terminal output below for reference.

Do you know why this issue is happening? What is the recommended way to go about this?

Thanks,
Walid

...
#
# To deactivate an active environment, use
#
#     $ conda deactivate


checking python dependencies
Solving environment: failed

CondaError: KeyboardInterrupt

Terminated

(base) [wabuala@noderome102 IRIS]$
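One workaround worth trying (an assumption on my part, not something from the IRIS docs): the classic conda solver is known to hang on large environments, and switching to the libmamba solver often resolves this. Assuming a reasonably recent conda install:

```shell
# Install the faster libmamba solver into the base environment
conda install -n base conda-libmamba-solver
# Make it the default solver
conda config --set solver libmamba
# Then retry the IRIS installation
./install all
```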
