
Comments (17)

azat-badretdin commented on July 20, 2024

1/ Salmonella is a genus, not a species.
2/ We lost support for the "genus" option in this release and are working on restoring it soon.
3/ Case might matter (biologists usually capitalize the genus in binomials, so I am not familiar with this use case).

Please try

genus_species: 'Salmonella enterica'

or other legitimate Salmonella species.

from pgap.

sekhwal commented on July 20, 2024

Thank you for the information.
It works, but when I run it with location: 'plasmid' it generates the same error "Final process status is permanentFail".

Please let me know what I should change in the submol.yaml file.
Here is the current submol.yaml file that I am trying to run for the plasmid genome.

topology: 'circular'
location: 'plasmids'
organism:
    genus_species: 'Salmonella enterica'
    strain: 'P1122481'

azat-badretdin commented on July 20, 2024

location: 'plasmids'

Should be strictly 'plasmid' or 'chromosome'
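A minimal sketch of a pre-run check for this value, in case it helps catch the typo before starting a long run. The helper name `check_location` is hypothetical and not part of PGAP; the allowed set is taken only from the statement above.

```python
# Hypothetical helper, not part of PGAP: validates the submol "location" value.
# The allowed values come from the comment above ('plasmid' or 'chromosome').
ALLOWED_LOCATIONS = {"chromosome", "plasmid"}


def check_location(value: str) -> str:
    """Return the value unchanged if it is allowed, otherwise raise ValueError."""
    if value not in ALLOWED_LOCATIONS:
        raise ValueError(
            f"location must be one of {sorted(ALLOWED_LOCATIONS)}, got {value!r}"
        )
    return value
```

With this, `check_location('plasmids')` raises a `ValueError`, while `check_location('plasmid')` passes through.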

azat-badretdin commented on July 20, 2024

You can also try our relatively new way of running pgap.py, described in the quick notes, where all the information is given in the FASTA file and the species qualifier:

./pgap.py .... -s 'My species' -g My.fasta

In this case you can mark plasmid molecules by appending [location=plasmid] to the FASTA definition lines of the corresponding sequences.
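As a rough illustration of that step, here is a sketch that appends the modifier to selected definition lines. The function name `tag_plasmids` and the idea of passing a set of SeqIDs are assumptions made for this example; only the `[location=plasmid]` modifier itself comes from the comment above.

```python
def tag_plasmids(fasta_lines, plasmid_ids):
    """Append [location=plasmid] to definition lines whose SeqID is in plasmid_ids.

    The SeqID is taken to be everything after '>' up to the first whitespace.
    Lines that already carry a location modifier are left alone.
    """
    out = []
    for line in fasta_lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            tokens = line[1:].split(None, 1)
            seq_id = tokens[0] if tokens else ""
            if seq_id in plasmid_ids and "[location=" not in line:
                line = f"{line} [location=plasmid]"
        out.append(line)
    return out
```

Sequence lines are passed through untouched, so the output can be written back to a new FASTA file line by line.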

sekhwal commented on July 20, 2024

I tried it the following way:

python3 /scripts/pgap.py -r -o P1122481_results -s 'Salmonella enterica' -g P1122481.fasta

I am using a FASTA file with the header

1_length=4998493_depth=1.00x_circular=true_[location=chromosome]

But it is still generating the error "Final process status is permanentFail".

sekhwal commented on July 20, 2024

Alternatively, I used the correct location: 'plasmid' in the submol.yaml, but it is still unable to run.

topology: 'circular'
location: 'plasmid'
organism:
    genus_species: 'Salmonella enterica'
    strain: 'P1122481'

thibaudnis commented on July 20, 2024

1_length=4998493_depth=1.00x_circular=true_[location=chromosome]

Please review https://github.com/ncbi/pgap/wiki/Input-Files#Genome-assembly-sequence-file. There are several characters that are not allowed in this SeqID (the SeqID is everything before the first space). You can try a SeqID of 1 and add modifiers:
1 [topology=circular] [location=chromosome]
Length and depth are not supported modifiers according to https://www.ncbi.nlm.nih.gov/genbank/mods_fastadefline/
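A small sketch for checking a SeqID before running PGAP. Note the assumptions: the allowed character set (letters, digits, and `-_.:*#`) and the under-50-character limit are recalled from NCBI's input guidelines and should be verified against the wiki page linked above; `seqid_problems` is a hypothetical helper name.

```python
import re

# Assumption: allowed SeqID characters, to be verified against the PGAP wiki.
ALLOWED_CHARS = r"[A-Za-z0-9_.:*#-]"
# Assumption: SeqIDs must be shorter than this many characters.
MAX_LEN = 50


def seqid_problems(defline: str) -> list:
    """Return a list of human-readable problems found in the SeqID of a defline."""
    # The SeqID is everything before the first space (leading '>' ignored).
    seq_id = defline.lstrip(">").split(" ", 1)[0]
    problems = []
    if len(seq_id) >= MAX_LEN:
        problems.append(f"SeqID is {MAX_LEN} characters or longer")
    bad = sorted(set(re.sub(ALLOWED_CHARS, "", seq_id)))
    if bad:
        problems.append("disallowed characters: " + "".join(bad))
    return problems
```

Under these assumptions, the header quoted above is flagged both for its length and for the `=`, `[`, and `]` characters, while `1 [topology=circular] [location=chromosome]` yields an empty list.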

azat-badretdin commented on July 20, 2024

But it is still generating the error "Final process status is permanentFail".

Could you please post the resulting cwltool.log file? Thanks!

sekhwal commented on July 20, 2024

The header line seems to be correct, but it still shows the error "WARNING Final process status is permanentFail" with the plasmid sequence. However, it works with 'chromosome' even though I did not change anything in the header ">1 length=4998493 depth=1.00x circular=true".

##used command
python3 /scripts/pgap.py -r -o P1122481_plasmid input_P1122481_plasmid.yaml

##plasmid fasta file header

contig001 [location = plasmid] [plasmid-name = pPSU1122481] [topology=circular]

##Here is the .yaml file
fasta:
    class: File
    location: P1122481_plasmid.fasta
submol:
    class: File
    location: P1122481_plasmid1_submol.yaml

cwltool.log
topology: 'circular'
location: 'plasmid'
organism:
    genus_species: 'Salmonella enterica'
    strain: 'pPSU1122481'

sekhwal commented on July 20, 2024

It seems that it does not work with small genomes such as plasmids. I used pgap earlier and it worked perfectly without my having to worry about specific headers or special characters. Should I download an old version and try it?

azat-badretdin commented on July 20, 2024

Try ./pgap.py --ignore-errors ....

sekhwal commented on July 20, 2024

It works when I use both chromosome and plasmid in one fasta file. I think the latest pgap version has an issue with small genomes such as plasmids.
##command
python3 /scripts/pgap.py -r -o P2226300_results input_P2226300.yaml
Thank you for your help!

azat-badretdin commented on July 20, 2024

It works when I use both chromosome and plasmid in one fasta file.

Because with the chromosome included, the total size of the genome matches the expectation for this particular species.

It does not reject plasmids per se. You can try replacing the keyword plasmid with chromosome in that small plasmid FASTA file and see for yourself: the result will be the same, because it rejects by size, not by molecule type.

Have you tried inserting --ignore-errors into the list of command line switches?
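To see the size difference for yourself, a sketch like the following sums the sequence lengths in a FASTA file. The expected size range PGAP uses for a given species is not stated in this thread, so the helper (a hypothetical `total_bases`) only reports the total and makes no pass/fail decision.

```python
def total_bases(fasta_lines):
    """Sum the lengths of all sequence lines, skipping definition lines."""
    return sum(
        len(line.strip())
        for line in fasta_lines
        if not line.startswith(">")
    )


def total_bases_in_file(path):
    """Convenience wrapper that reads a FASTA file from disk."""
    with open(path) as handle:
        return total_bases(handle)
```

Running it on the plasmid-only file and on the combined chromosome plus plasmid file should make the gap between the two totals obvious.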

vappiah commented on July 20, 2024

@azat-badretdin I have a similar issue. Please find my cwltool.log file attached.
cwltool.log

azat-badretdin commented on July 20, 2024

User @vappiah I am not so sure. It says

'contig001[location=chromosome]' is not a valid local ID (m_Pos = 1)

which most likely means that you omitted the crucial space delimiter separating the seq-id from the rest of the FASTA definition line.

It's a different error from the same ballpark: "things that users do in the FASTA definition line".
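A quick way to spot this particular mistake is to look for a bracket inside the first whitespace-delimited token of the definition line. This is only a sketch with a hypothetical helper name, not how PGAP itself parses the line.

```python
def missing_space_before_modifiers(defline: str) -> bool:
    """Return True if a '[' modifier is glued onto the seq-id without a space."""
    tokens = defline.lstrip(">").split(None, 1)
    # The first token is the seq-id; it should not contain an opening bracket.
    return bool(tokens) and "[" in tokens[0]
```

For example, `contig001[location=chromosome]` is flagged, while `contig001 [location=chromosome]` is not.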

vappiah commented on July 20, 2024

Thanks @azat-badretdin. I made the necessary correction and it works now.

azat-badretdin commented on July 20, 2024

Glad to hear that, user @vappiah!
