Comments (17)
1/ Salmonella is not "species", it's "genus"
2/ We lost the functionality of supporting "genus" option in this release and we are working on restoring it soon
3/ Case might be important (biologists usually capitalize the genus in binomials, so I am not familiar with this use case).
Please try
genus_species: 'Salmonella enterica'
or other legitimate Salmonella species.
from pgap.
Thank you for the information.
It works, but when I run it with location: 'plasmid' it generates the same error "Final process status is permanentFail".
Please let me know what change I should make in the submol.yaml file.
Here is the information of my current submol.yaml file that I am trying to run for plasmid genome.
topology: 'circular'
location: 'plasmids'
organism:
    genus_species: 'Salmonella enterica'
    strain: 'P1122481'
from pgap.
location: 'plasmids'
Should be strictly 'plasmid' or 'chromosome'
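A minimal sketch of checking the location value before launching a run. The two accepted values come from the comment above; the helper name check_location is hypothetical, and the line-based parsing is a simplification meant only for a flat submol.yaml like the one shown, not a full YAML parser.

```python
import re

# Accepted values, as stated in the comment above.
ALLOWED_LOCATIONS = {"plasmid", "chromosome"}


def check_location(yaml_text):
    """Return (value, is_accepted) for the first `location:` line in the text."""
    match = re.search(
        r"^\s*location:\s*['\"]?([^'\"\n#]*)['\"]?", yaml_text, re.MULTILINE
    )
    if match is None:
        return None, False
    value = match.group(1).strip()
    return value, value in ALLOWED_LOCATIONS


if __name__ == "__main__":
    submol = "topology: 'circular'\nlocation: 'plasmids'\n"
    print(check_location(submol))
```

The check is deliberately strict: 'plasmids' is reported as not accepted, which matches the failure described in this thread.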
from pgap.
You can also try our relatively new way of running pgap.py, described in the quick notes, where all the information comes from the FASTA file plus the species qualification:
./pgap.py .... -s 'My species' -g My.fasta
In this case you can mark plasmid molecules by appending [location=plasmid] to the FASTA definition lines of the corresponding sequences.
from pgap.
I tried it the following way: python3 /scripts/pgap.py -r -o P1122481_results -s 'Salmonella enterica' -g P1122481.fasta
I am using the fasta file with the header
1_length=4998493_depth=1.00x_circular=true_[location=chromosome]
But it still generates the error "Final process status is permanentFail".
from pgap.
Alternatively, I used the correct location: 'plasmid' in the submol.yaml, but it is still unable to run.
topology: 'circular'
location: 'plasmid'
organism:
    genus_species: 'Salmonella enterica'
    strain: 'P1122481'
from pgap.
1_length=4998493_depth=1.00x_circular=true_[location=chromosome]
Please review https://github.com/ncbi/pgap/wiki/Input-Files#Genome-assembly-sequence-file. There are several characters that are not allowed in this SeqID (the SeqID is everything before the first space). You can try SeqID of 1 and add modifiers:
1 [topology=circular] [location=chromosome]
Length and depth are not supported modifiers according to: https://www.ncbi.nlm.nih.gov/genbank/mods_fastadefline/
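A minimal sketch of checking a definition line before running PGAP. It treats everything before the first whitespace as the SeqID, as described above. The allowed character set and the length limit in the code are assumptions based on a reading of the linked wiki page, and the helper name check_defline is hypothetical; the wiki remains the authoritative source for the rules.

```python
import re

# Assumption: a SeqID may contain only letters, digits and - _ . : * #
# and must be shorter than 50 characters (see the wiki page linked above).
ALLOWED_CHARS = r"A-Za-z0-9_.:*#-"
SEQID_RE = re.compile(rf"^[{ALLOWED_CHARS}]{{1,49}}$")


def check_defline(defline):
    """Return a list of problems found in one FASTA definition line."""
    text = defline.lstrip(">").strip()
    if not text:
        return ["empty definition line"]
    # The SeqID is everything before the first whitespace.
    seqid = text.split(None, 1)[0]
    if SEQID_RE.match(seqid):
        return []
    bad = sorted(set(re.findall(rf"[^{ALLOWED_CHARS}]", seqid)))
    if bad:
        return [f"SeqID {seqid!r} contains disallowed characters: {''.join(bad)}"]
    return [f"SeqID {seqid!r} is too long"]


if __name__ == "__main__":
    for line in (
        ">1_length=4998493_depth=1.00x_circular=true_[location=chromosome]",
        ">1 [topology=circular] [location=chromosome]",
    ):
        print(line, "->", check_defline(line) or "OK")
```

Because the first header contains no space, the whole line is read as the SeqID, which is why the '=' and bracket characters are flagged.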
from pgap.
But it still generates the error "Final process status is permanentFail".
Could you please post the resulting cwltool.log file? Thanks!
from pgap.
The header line seems to be correct, but it still shows the error "WARNING Final process status is permanentFail" with the plasmid sequence. However, it works with 'chromosome' even though I did not change anything in the header ">1 length=4998493 depth=1.00x circular=true".
##used command
python3 /scripts/pgap.py -r -o P1122481_plasmid input_P1122481_plasmid.yaml
##plasmid fasta file header
contig001 [location = plasmid] [plasmid-name = pPSU1122481] [topology=circular]
##Here is the .yaml file
fasta:
    class: File
    location: P1122481_plasmid.fasta
submol:
    class: File
    location: P1122481_plasmid1_submol.yaml
cwltool.log
topology: 'circular'
location: 'plasmid'
organism:
genus_species: 'Salmonella enterica'
strain: 'pPSU1122481'
from pgap.
It seems it does not work with small genomes such as plasmids. I used pgap earlier and it worked perfectly without any concern about specific headers or special characters. Should I download an old version and try?
from pgap.
Try ./pgap.py --ignore-errors ....
from pgap.
It works when I use both chromosome and plasmid in one fasta file. I think the latest pgap version has an issue with small genomes such as plasmids.
##command
python3 /scripts/pgap.py -r -o P2226300_results input_P2226300.yaml
Thank you for your help!
from pgap.
It works when I use both chromosome and plasmid in one fasta file.
Because with the chromosome, the total size of the genome matches the expectation for this particular species.
It does not reject plasmids per se: you can try replacing the keyword plasmid with chromosome in that small plasmid FASTA file and see for yourself. The result will be the same, because it rejects by size, not by molecule type.
Have you tried inserting --ignore-errors into the list of command line switches?
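Since the rejection is by total size rather than by molecule type, it can help to see how large the input actually is. A minimal sketch that adds up the sequence lengths in a FASTA file; the function name fasta_lengths is hypothetical, and the size range PGAP expects for a given species is not stated in this thread, so the code only reports lengths.

```python
def fasta_lengths(path):
    """Return {SeqID: sequence length} for every record in a FASTA file."""
    lengths = {}
    seqid = None
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                # The SeqID is everything before the first whitespace.
                parts = line[1:].split(None, 1)
                seqid = parts[0] if parts else ""
                lengths[seqid] = 0
            elif seqid is not None:
                lengths[seqid] += len(line)
    return lengths
```

For example, sum(fasta_lengths('P1122481_plasmid.fasta').values()) would give the total number of bases in the plasmid-only file, which can then be compared with the combined chromosome and plasmid file that ran successfully.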
from pgap.
@azat-badretdin I have a similar issue. Please find attached my cwltool.log file
cwltool.log
from pgap.
User @vappiah I am not so sure. It says
'contig001[location=chromosome]' is not a valid local ID (m_Pos = 1)
which most likely means that you omitted the crucial space delimiter separating the seq-id from the rest of the FASTA definition line.
It's a different error from the same ballpark: "things that users do in FASTA definition line".
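A minimal sketch of repairing that particular mistake: it inserts a space before the first '[' when a modifier is glued directly to the seq-id. The helper name is hypothetical, and the code assumes that '[' never legitimately appears inside a seq-id.

```python
def split_seqid_from_modifiers(defline):
    """Insert the missing space between a seq-id and its first [modifier]."""
    prefix = ">" if defline.startswith(">") else ""
    body = defline[len(prefix):].strip()
    tokens = body.split(None, 1)
    if not tokens:
        return defline  # nothing after '>', leave the line alone
    # Look for a '[' inside the first whitespace-delimited token.
    bracket = tokens[0].find("[")
    if bracket <= 0:
        return defline  # the seq-id is already separated (or is missing)
    return prefix + body[:bracket] + " " + body[bracket:]


if __name__ == "__main__":
    print(split_seqid_from_modifiers(">contig001[location=chromosome]"))
```

Lines that already have the space are returned unchanged, so the function can be applied to every header in a file.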
from pgap.
Thanks @azat-badretdin . I made the necessary correction and it works now.
from pgap.
Glad to hear that, user @vappiah !
from pgap.