Our group also cannot run docker on our cluster and have tried using Singularity and c

Process ends in Permanent Fail with other than mg37, about ncbi/pgap

Comments (25)

TheBigFatTony commented on August 19, 2024

How did you get the test genome to run?

from pgap.

ChristyPeterson commented on August 19, 2024

I followed the instructions and created the YAML file using the MG37 genome that was packaged with the Docker container. Here is the example YAML I used to run it:

# pipeline data files
16s_blastdb_dir:
  class: Directory
  location: input/16S_rRNA
23s_blastdb_dir:
  class: Directory
  location: input/23S_rRNA
5s_model_path:
  class: File
  location: input/RF00001.cm
AntiFamLib:
  class: Directory
  location: input/AntiFamLib
asn2pas_xsl:
  class: File
  location: input/asn2pas.xsl
blast_rules_db_dir:
  class: Directory
  location: input/uniColl_path/blast_dir
CDDdata: # ${GP_HOME}/third-party/data/CDD/cdd - this is rpsblastdb
  class: Directory
  location: input/CDD
CDDdata2: # ${GP_HOME}/third-party/data/cdd_add
  class: Directory
  location: input/cdd_add
defline_cleanup_rules: # defline_cleanup_rules # ${GP_HOME}/etc/product_rules.prt
  class: File
  location: input/product_rules.prt
gene_master_ini:
  class: File
  location: input/gene_master.ini
genemark_path:
  class: Directory
  location: input/GeneMark
hmm_path:
  class: Directory
  location: input/uniColl_path/real_hmms
hmms_tab:
  class: File
  location: input/uniColl_path/real_hmms.tab
naming_blast_db: # NamingDatabase
  class: Directory
  location: input/uniColl_path/blast_dir # this one might have created problems for assign_cluster, let's try this:
naming_hmms_combined: # ${GP_HOME}/third-party/data/BacterialPipeline/uniColl/ver-3.2/naming_hmms_combined.mft
  class: Directory
  location: input/uniColl_path/naming_hmms
naming_hmms_tab:
  class: File
  location: input/uniColl_path/naming_hmms.tab
naming_sqlite: # /panfs/pan1.be-md.ncbi.nlm.nih.gov/gpipe/home/badrazat/local-install/2018-05-17/third-party/data/BacterialPipeline/uniColl/ver-3.2/naming.sqlite
  class: File
  location: input/uniColl_path/naming.sqlite
rfam_amendments:
  class: File
  location: input/rfam-amendments.xml
rfam_model_path:
  class: File
  location: input/Rfam.selected1.cm
rfam_stockholm:
  class: File
  location: input/Rfam.seed
selenoproteins: # /panfs/pan1.be-md.ncbi.nlm.nih.gov/gpipe/home/badrazat/local-install/2018-05-17/third-party/data/BacterialPipeline/Selenoproteins/selenoproteins, it's blastdb
  class: Directory
  location: input/selenoproteins
taxon_db:
  class: File
  location: input/uniColl_path/taxonomy.sqlite3
thresholds:
  class: File
  location: input/thresholds.xml
uniColl_cache:
  class: Directory
  location: input/uniColl_path/cache
#uniColl_path:
#  class: Directory
#  location: input/uniColl_path
univ_prot_xml:
  class: File
  location: input/uniColl_path/universal.xml
val_res_den_xml:
  class: File
  location: input/validation-results.xml
wp_hashes:
  class: File
  location: input/uniColl_path/wp-hashes.sqlite

#
#   Setup for template preparation
#

submit_block_template_static:
  class: File
  location: input_template/submit_block_static.template
molinfo_complete_asn:
  class: File
  location: input_template/molinfo_complete.asn
molinfo_wgs_asn:
  class: File
  location: input_template/molinfo_wgs.asn
submit_block_template:
  class: File
  location: MG37/ASM2732v1.1.template
fasta:
  class: File
  location: MG37/ASM2732v1.annotation.nucleotide.1.fa
taxid: 243273
gc_assm_name: MG37
completeness: complete

I used the wf_pgap_simple.cwl file to submit the run to our cluster.

When this completed, I then copied a draft genome and closed genome into the same dir for testing, using the same method as above, but both instances end in the permanentFail error.
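
An input file like the one above lists several dozen paths, and a single missing one may only surface well into a run. The following is a minimal pre-flight sketch, not part of PGAP: `missing_locations` is a hypothetical helper that scans the YAML for `location:` lines and reports the ones that do not exist. It assumes relative paths are resolved against the directory holding the YAML file, and it uses a plain regular expression rather than a YAML parser, so it only handles the simple one-entry-per-line layout shown here.

```python
import os
import re

# Matches lines such as "  location: input/16S_rRNA  # optional comment".
# Lines that start with "#" do not match, so commented-out entries are ignored.
LOCATION_RE = re.compile(r"^\s*location:\s*(\S+)")


def missing_locations(yaml_path):
    """Return the 'location:' paths in yaml_path that do not exist on disk.

    Assumption: relative paths are resolved against the YAML file's directory.
    """
    base = os.path.dirname(os.path.abspath(yaml_path))
    missing = []
    with open(yaml_path) as handle:
        for line in handle:
            match = LOCATION_RE.match(line)
            if match and not os.path.exists(os.path.join(base, match.group(1))):
                missing.append(match.group(1))
    return missing
```

Calling `missing_locations("genome_input.yaml")` before submitting the job should return an empty list if every referenced file and directory is in place.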

azat-badretdin commented on August 19, 2024

Thank you for additional information! Could you please also post the complete log, if you saved it?

Thanks.

ChristyPeterson commented on August 19, 2024

Okay, so I'm running the program on a cluster that is managed with Slurm. These logs are from running the closed genome through pgap_simple with both srun and sbatch (I wanted to rule out a problem with interactivity).

Unfortunately the files are larger than the 10 MB cutoff, so I've shared them in the following location:
pgap-logs

azat-badretdin commented on August 19, 2024

Thank you! I saved the logs. Feel free to remove them from the storage if you wish.

azat-badretdin commented on August 19, 2024

I have noticed that you are using a more recent package in your runs (pgap-2018-11-07.build3190).

How did you obtain this package?

ChristyPeterson commented on August 19, 2024

It was just downloaded as part of the instructions on the repo, as far as I know. From the installation instructions on the git repo:
(cwl) $ wget -qO- https://github.com/ncbi/pgap/archive/2018-11-07.build3190.tar.gz | tar xvz

azat-badretdin commented on August 19, 2024

Found it

Thanks!

azat-badretdin commented on August 19, 2024

I tried, unsuccessfully, to reproduce the problem with the taxid given in the logs and with other options.

Can you reproduce this problem with genomic input that you can publish here without violating the confidentiality of your project?

ChristyPeterson commented on August 19, 2024

Sorry for the delay. I am trying to get permission to share the FASTA files, but I'm not getting a reply.
In the meantime I've set up another run using a publicly available Listeria genome. When attempting to start this run I came upon a Python error that didn't happen before. Long story short, we had to upgrade Python in the env to 3.6 (from 2.7) and change some CWL stuff around. The Listeria genome is running now. I'll post the results once it finishes. Who knows, maybe this was the issue in the first place and it'll allow me to complete the original E. coli run I posted about.

azat-badretdin commented on August 19, 2024

Thanks. Hopefully you will be able to reproduce the same error with Listeria.

ChristyPeterson commented on August 19, 2024

Alright, I now have something to report. The Listeria (CP001602.2) run was successful, as was an E. coli I'd downloaded from NCBI (NZ_KK583188.1). Thinking it might have to do with the header, I tried replacing the NCBI E. coli (NZ_KK583188.1) header with my failing genome's header, and lo and behold, this caused the run to fail. Here is an example of what that header looks like:
>SAMPLE0001_contig1 [organism=Escherichia coli] [location=chromosome] [topology=circular] [completeness=complete]
I will try removing everything in the header after the first space, and see if the run completes. I believe the issue is related to the [] that are in the header example above.

azat-badretdin commented on August 19, 2024

Thanks for testing!

The result of your experimentation makes this puzzle even more intriguing. I am presuming you have replaced your original private organism with E. coli. Is that organism registered in NCBI Taxonomy?

"I will try removing everything in the header after the first space"

OK.

"I believe the issue is related to the [] that are in the header example above."

Does the organism in the header match the taxid in the input yaml file?

ChristyPeterson commented on August 19, 2024

The private organism was an in-house closed E. coli genome, so it has not been registered in NCBI Taxonomy. Therefore, when I was creating the YAML I used the general taxid 562 for E. coli (which failed).
I downloaded NZ_KK583188.1 because it has been registered (taxid 866789).
Runs I've tried:
In-house E. coli with taxid 562 (general E. coli) = Fail
In-house E. coli with taxid 866789 (NZ_KK strain specific) = Fail
NZ_KK583188.1 with taxid 866789 = Pass
NZ_KK583188.1 with taxid 562 = Pass

CP001602.2 (Listeria) with taxid 1639 (general lmo) = Pass
CP001602.2 with taxid 653938 (08-5578 specific) = Pass

NZ_KK583188.1 with in-house E. coli header replacing its original header, taxid 866789 = Fail

It's interesting, because the header that comes from NCBI does have some special characters in it (equals signs):
>NZ_KK583188.1 Escherichia coli DSM 30083 = JCM 1649 = ATCC 11775 strain DSM 30083 Scaffold1, whole genome shotgun sequence
The header I copied from my inhouse ecoli (see above comment for what this looks like) is formatted for when you submit your strain to NCBI genome (using the contig description tags).

azat-badretdin commented on August 19, 2024

It looks like you tried to run the public genome with your in-house header and it failed.

For the sake of getting results for your research, have you tried running your in-house genome with the public header? I understand that the output files will contain incorrect markup in terms of organism, etc., but at least you will get the correct features (proteins, etc.) in the annotations.

BTW, it is perfectly fine in terms of getting results to specify a species taxid in the input YAML.

Could it be that you have some special characters in your in-house headers that confuse the pipeline?

ChristyPeterson commented on August 19, 2024

Yes, I'm thinking of replacing the in-house header with the public header; it couldn't hurt to try.
I'm actually trying to get this system set up for a larger group of people in my organization, so the in-house E. coli was only used because it was convenient (though it clearly didn't end up being convenient, as this thread will prove). We are in the process of submitting this sequence to Genome, at which point it would be run through PGAP anyway, so it's not super critical for my research at this point in time.

It's very possible that there are some special characters in the in-house header. I suspect it's going to be the []'s that were in the tags. Once I get this straightened out, and figure out how to run more than one strain at a time, the researchers in my organization are going to be very pleased.

ChristyPeterson commented on August 19, 2024

Sorry, didn't mean to close the issue.

azat-badretdin commented on August 19, 2024

"I suspect it's going to be the []'s that were in the tags."

Single nested pairs of tags are OK. We expect them:

seqid My organism [topology=circular]

It could potentially be a difficult case if you had double nested tags (some organisms do include them in the name), but as far as I remember our FASTA parsers are smart enough to handle that.

"the researchers in my organization are going to be very pleased"

We will be very pleased as well. Happy landings!

ChristyPeterson commented on August 19, 2024

After some final runs, it looks like as long as I strip out the []'s in the header, then the file runs just fine. This shouldn't be an issue going forward for our research genomes.
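
The workaround described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `strip_bracket_tags` and `clean_fasta` are hypothetical helper names, the tags are assumed to be simple non-nested `[key=value]` pairs like the ones in the example header, and the thread does not establish why the tagged header caused the failure in the first place.

```python
import re

# A bracketed modifier such as "[organism=Escherichia coli]", plus the
# whitespace in front of it. Nested brackets are deliberately not handled.
TAG_RE = re.compile(r"\s*\[[^\[\]]*\]")


def strip_bracket_tags(line):
    """Remove [key=value] tags from a FASTA header; leave sequence lines alone."""
    if not line.startswith(">"):
        return line
    return TAG_RE.sub("", line.rstrip("\n")).rstrip() + "\n"


def clean_fasta(src_path, dst_path):
    """Copy src_path to dst_path with bracket tags stripped from every header."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        for line in src:
            dst.write(strip_bracket_tags(line))
```

Applied to the header quoted earlier in the thread, this leaves only `>SAMPLE0001_contig1`. Writing to a new file rather than editing in place keeps the tagged original available for the eventual NCBI submission.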

npavlovikj commented on August 19, 2024

@ChristyPeterson, I am sorry for the bother, but can you please share the Singularity command you used for PGAP, and the Dockerfile as well? I am trying to use Singularity with the available NCBI Dockerfile, but the directory /panfs/pan1.be-md.ncbi.nlm.nih.gov cannot be found...

ChristyPeterson commented on August 19, 2024

Hi @npavlovikj, we took the Docker file directly from the published PGAP Docker registry. The file was called ncbi-pgap-2018-11-07.build3190.img and we didn't modify it at all. We built a conda environment and followed the installation instructions for the prerequisites (getting the Docker file and all the database files, etc.) within that environment. I think the only deviation from the installation instructions was to also install Singularity version 3.0.2-1.el7 within the conda environment, although that might already be installed on our cluster, in which case it wouldn't have needed to be in the conda environment.

If that dir panfs couldn't be found within the docker, this is maybe pointing to an issue with the docker container you built?

So I don't know if there is a singularity configuration to use the docker or not, though I don't believe so. To start the run the command that I use is the following:
cwltool --singularity ./wf_pgap_simple.cwl genome_input.yaml
I hope this helps.
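
For anyone wanting to launch the same command for several genomes, here is a minimal sketch of a wrapper. It is an assumption-laden illustration, not something from the PGAP documentation: `pgap_command` and `run_all` are hypothetical names, the only cwltool option used is the `--singularity` flag quoted above, and whether back-to-back runs from one working directory are safe was not tested in this thread. By default it only prints the commands.

```python
import subprocess


def pgap_command(input_yaml, workflow="./wf_pgap_simple.cwl"):
    """Build the cwltool invocation quoted above for one input YAML."""
    return ["cwltool", "--singularity", workflow, input_yaml]


def run_all(input_yamls, dry_run=True):
    """Run (or, by default, just print) one cwltool command per input YAML.

    Returns a dict mapping each YAML path to its exit code, or to None
    when dry_run is True and nothing was executed.
    """
    results = {}
    for path in input_yamls:
        cmd = pgap_command(path)
        if dry_run:
            print(" ".join(cmd))
            results[path] = None
        else:
            results[path] = subprocess.run(cmd).returncode
    return results
```

Running `run_all(["genome_a.yaml", "genome_b.yaml"])` first, with the default dry run, shows exactly what would be executed before any cluster time is spent.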

azat-badretdin commented on August 19, 2024

"am trying to use Singularity with the available NCBI Dockerfile, but the directory /panfs/pan1.be-md.ncbi.nlm.nih.gov can not be found..."

@npavlovikj, could you please post the tag of your Docker image, similar to how @ChristyPeterson posted the tag of their Docker image?

npavlovikj commented on August 19, 2024

@ChristyPeterson, many thanks for the detailed and useful explanation! I am not familiar with cwltool and didn't know about the --singularity flag (I was trying to combine the two main Dockerfiles earlier)!

As you suggested, I created a conda environment with all dependencies for PGAP. Next, I downloaded the newest PGAP code and required data as described here: https://github.com/ncbi/pgap/wiki/Installation. To try the test example with the MG37 genome, I added the following lines to mg37_input.yaml:
hints:
  DockerRequirement:
    dockerPull: ncbi/pgap:2018-11-07.build3190
These specify the newest 2018-11-07.build3190 tag of the PGAP image.

The job ran for about an hour before I got the error Permission denied: 'seq_id_chunk' for job BLAST_against_nS_rRNA_db_gpx_qsubmit:
Exception while running job
Traceback (most recent call last):
  File "/work/npavlovikj/pgap_env/lib/python3.6/site-packages/cwltool/job.py", line 326, in _execute
    inplace_update=self.inplace_update)
  File "/work/npavlovikj/pgap_env/lib/python3.6/site-packages/cwltool/job.py", line 151, in relink_initialworkdir
    shutil.rmtree(host_outdir_tgt)
  File "/work/npavlovikj/pgap_env/lib/python3.6/shutil.py", line 480, in rmtree
    _rmtree_safe_fd(fd, path, onerror)
  File "/work/npavlovikj/pgap_env/lib/python3.6/shutil.py", line 418, in _rmtree_safe_fd
    _rmtree_safe_fd(dirfd, fullname, onerror)
  File "/work/npavlovikj/pgap_env/lib/python3.6/shutil.py", line 438, in _rmtree_safe_fd
    onerror(os.unlink, fullname, sys.exc_info())
  File "/work/npavlovikj/pgap_env/lib/python3.6/shutil.py", line 436, in _rmtree_safe_fd
    os.unlink(name, dir_fd=topfd)
PermissionError: [Errno 13] Permission denied: 'seq_id_chunk'
[job BLAST_against_nS_rRNA_db_gpx_qsubmit_2] completed permanentFail

Has anyone encountered this error before? I have the proper rights (read/write) to write in the PGAP directory (from where I run PGAP), as well as in the conda environment. Also, I used 100 GB of RAM for the job.
In my environment, I have installed python=3.6.7, cwltool=1.0.20181217162649, and cwl_runner=1.0.

Any help and directions would be highly appreciated!

ChristyPeterson commented on August 19, 2024

I did find that I needed write access not only in the dir from which I ran PGAP but also in the PGAP build dir. For example, when we initially installed this on our cluster, the conda environment was on a drive that I only have read access to. I kept getting permission issues until I copied the build file onto a drive that I had both read and write access to. Then, using the copied build file as my working directory, I copied in my genome, created my input files (YAML, template, etc.) and ran from there. This had no issue.
I'm not sure if you were trying to run the program keeping the working directory and the actual program directory separate? I believe this is because the program writes temporary files within the actual program build dir, though that is a guess.
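
One way to test the permissions guess above before re-running is to walk the relevant directories and list anything the current user cannot write to. The sketch below is a hypothetical helper (`unwritable_entries` is not part of PGAP or cwltool). It matters for directories in particular, because on POSIX systems deleting a file such as `seq_id_chunk` requires write permission on the directory that contains it. Note that `os.access` only reflects ordinary permission bits as seen on the host, so it may not capture restrictions that apply inside a container.

```python
import os


def unwritable_entries(root):
    """Return paths under root (including root itself) that the current
    user cannot write to, according to os.access."""
    blocked = []
    if not os.access(root, os.W_OK):
        blocked.append(root)
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            full = os.path.join(dirpath, name)
            if not os.access(full, os.W_OK):
                blocked.append(full)
    return blocked
```

An empty list from both the working directory and the PGAP build directory would suggest the problem lies elsewhere, for example in how paths are mounted into the container.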

npavlovikj commented on August 19, 2024

Thanks @ChristyPeterson. My directory for the conda environment, as well as the main PGAP directory (from where I run PGAP itself), are in a location where I have read/write permissions... I will expand these permissions to 777 just to test whether that is the case...

UPDATE: The same error persists...
