illegal character 0 (uni-fold, closed)

dptech-corp commented on June 17, 2024
illegal character 0

from uni-fold.

Comments (16)

dominik-handler commented on June 17, 2024

I was able to fix my problem by manually removing all sequences containing invalid characters according to this:
google-deepmind/alphafold#569

tac pdb_seqres.txt | sed '/^CT05/,+1d' | tac > pdb_seqres_fixed.txt

Dominik
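
A more general variant of the one-liner above, as a minimal sketch rather than an official fix: it assumes pdb_seqres.txt is FASTA-like (header lines starting with `>`, each followed by its sequence lines) and treats any sequence containing a non-letter character as invalid. The names `read_fasta` and `filter_fasta` are hypothetical, and the letters-only rule is an assumption that may be stricter than what hmmsearch actually accepts.

```python
import io
from typing import Iterator, TextIO, Tuple


def read_fasta(handle: TextIO) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs; multi-line sequences are joined."""
    header = None
    chunks = []
    for line in handle:
        line = line.rstrip("\n")
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line, []
        elif header is not None and line:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def filter_fasta(src: TextIO, dst: TextIO) -> int:
    """Copy records whose sequence is ASCII letters only; return the number dropped.

    Assumption: "valid" means letters only. Records with digits (such as
    CT05ATCTTTG) or with an empty sequence are dropped.
    """
    dropped = 0
    for header, seq in read_fasta(src):
        if seq.isascii() and seq.isalpha():
            dst.write(f"{header}\n{seq}\n")
        else:
            dropped += 1
    return dropped
```

To use it on real files, open pdb_seqres.txt for reading and a new output file for writing, then pass both handles to `filter_fasta`. Unlike the sed pattern, this does not depend on the offending sequence starting with CT05.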

ZiyaoLi commented on June 17, 2024

Copy that. It looks like the script mishandled the DNA chains as FASTA sequences, so the character 0 is reported as invalid. We are looking into a solution.

ZiyaoLi commented on June 17, 2024

Glad to see this solved, and thank you for the update. I believe they do not include DNA and RNA in pdb_seqres.txt now, which is the cause of the problem.

ZiyaoLi commented on June 17, 2024

It seems that the parsing of file pdb_seqres.txt failed:

Parse failed (sequence file /home/data/pdb_seqres/pdb_seqres.txt):
Line 1360234: illegal character 0

Would you please upload the file you used? I understand that the file is big, so perhaps you can locate Line 1360234 and show the context nearby.
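
For a file this large, the context around a given line can be pulled out without loading the whole file. The following is a minimal sketch; the function name `context` and the default radius are hypothetical choices, not part of uni-fold or alphafold.

```python
from itertools import islice
from typing import Iterable, List, Tuple


def context(lines: Iterable[str], target: int, radius: int = 2) -> List[Tuple[int, str]]:
    """Return (line_number, text) pairs within `radius` lines of the 1-based `target`."""
    start = max(1, target - radius)
    stop = target + radius
    # islice uses 0-based indices, so 1-based lines start..stop map to start-1..stop.
    window = islice(lines, start - 1, stop)
    return [(n, text.rstrip("\n")) for n, text in enumerate(window, start=start)]


# Example usage (the path is whatever file hmmsearch reported):
# with open("pdb_seqres.txt") as handle:
#     for number, text in context(handle, 1360234):
#         print(number, text)
```

Because the file handle is iterated lazily, only the lines up to the end of the window are read.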

ZiyaoLi commented on June 17, 2024

Also @BaozCWJ may look into this.

superantichrist commented on June 17, 2024

7ooo_B mol:na length:11 DNA (5'-D(*CP*TP*(RWQ)P*TP*CP*TP*TP*TP*G)-3')
CT05ATCTTTG
7ooo_E mol:na length:11 DNA (5'-D(*CP*TP*(RWQ)P*TP*CP*TP*TP*TP*G)-3')
CT05ATCTTTG

The second line is Line 1360234, and as the error says, it contains a '0' character.

I used the databases from the alphafold2 reference site, and they work with their code.

BaozCWJ commented on June 17, 2024

It looks like the error comes from the hmmsearch command:
/usr/bin/hmmsearch --noali --cpu 8 --F1 0.1 --F2 0.1 --F3 0.1 --incE 100 -E 100 --domE 100 --incdomE 100 -A /tmp/tmpil8uqp24/output.sto /tmp/tmpil8uqp24/query.hmm /home/data/pdb_seqres/pdb_seqres.txt
Could you please try the same FASTA with alphafold's code and upload the corresponding hmmsearch command?

superantichrist commented on June 17, 2024

I checked the alphafold code, and it appears hmmsearch is only used in the multimer model. When I ran multimer, the same error occurred.

The command seems to be the same (see below):
I0809 13:25:16.533803 140494127470400 run_docker.py:255] I0809 04:25:16.531341 139681435416384 hmmsearch.py:103] Launching sub-process ['/usr/bin/hmmsearch', '--noali', '--cpu', '8', '--F1', '0.1', '--F2', '0.1', '--F3', '0.1', '--incE', '100', '-E', '100', '--domE', '100', '--incdomE', '100', '-A', '/tmp/tmp69wsql5x/output.sto', '/tmp/tmp69wsql5x/query.hmm', '/mnt/pdb_seqres_database_path/pdb_seqres.txt']
....(omitted)
I0809 13:25:28.802923 140494127470400 run_docker.py:255] stderr:
I0809 13:25:28.803056 140494127470400 run_docker.py:255] Parse failed (sequence file /mnt/pdb_seqres_database_path/pdb_seqres.txt):
I0809 13:25:28.803190 140494127470400 run_docker.py:255] Line 1360234: illegal character 0

Sorry for the inaccurate info that the alphafold2 reference works.

So, is my pdb_seqres.txt file the problem?
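
The two stderr lines in the log above have a regular shape, so the file path, line number, and offending character can be extracted automatically. This is a minimal sketch that assumes the message format is exactly as quoted in this thread; `ParseFailure` and `find_parse_failure` are hypothetical names, not part of alphafold or uni-fold.

```python
import re
from typing import NamedTuple, Optional


class ParseFailure(NamedTuple):
    path: str
    line: int
    char: str


# Patterns modelled on the two stderr lines quoted in this thread (an assumption
# about the message format, not a documented hmmsearch interface).
_FILE_RE = re.compile(r"Parse failed \(sequence file (?P<path>.+?)\):")
_LINE_RE = re.compile(r"Line (?P<line>\d+): illegal character (?P<char>\S+)")


def find_parse_failure(stderr: str) -> Optional[ParseFailure]:
    """Return the reported file, line number, and character, or None if absent."""
    file_match = _FILE_RE.search(stderr)
    line_match = _LINE_RE.search(stderr)
    if not (file_match and line_match):
        return None
    return ParseFailure(file_match["path"], int(line_match["line"]), line_match["char"])
```

The returned line number can then be used to inspect the reported file directly.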

superantichrist commented on June 17, 2024

ftp://ftp.wwpdb.org/pub/pdb/derived_data/pdb_seqres.txt seems to have been modified. I checked my old pdb_seqres.txt and found differences.
I re-ran with the old one and the error did not occur!
Thanks for your reply.
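
One rough way to compare an old and a new copy of the file is to count records per molecule type, using the `mol:` field visible in the headers quoted in this thread. This is a sketch under the assumption that header lines start with `>`; `count_molecule_types` is a hypothetical helper.

```python
import re
from collections import Counter
from typing import Iterable

# Matches the "mol:<type>" field seen in headers such as "7ooo_B mol:na length:11 ...".
_MOL_RE = re.compile(r"\bmol:(\w+)")


def count_molecule_types(lines: Iterable[str]) -> Counter:
    """Count header lines per molecule type; headers without a mol: field count as 'unknown'."""
    counts: Counter = Counter()
    for line in lines:
        if line.startswith(">"):  # assumption: FASTA-style header lines
            match = _MOL_RE.search(line)
            counts[match.group(1) if match else "unknown"] += 1
    return counts
```

Running it on both files and comparing the two counters would show whether the nucleic acid entries differ between versions.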

dominik-handler commented on June 17, 2024

Hi, I ran into the same problem. Is there a fix that works with the new PDB_seqres version? Or alternatively, do you know if the old file is still available somewhere?

Thank you,
Dominik

ZiyaoLi commented on June 17, 2024

@dominik-handler Thank you for the report, and the hot-fix for this issue. Since it's been reported twice, we are now looking into a solution to automatically fix this issue.

dominik-handler commented on June 17, 2024

Thank you!
By the way, the database file structure created by download_all does not match the file naming expected by run_unifold.sh.
I had to manually fix the structure of the uniprot database.

ZiyaoLi commented on June 17, 2024

Maybe @BaozCWJ can look into this?

BaozCWJ commented on June 17, 2024

Thanks for raising this issue. Could you please confirm that the correct path of the uniprot database is as follows?
--uniprot_database_path=$database_dir/uniprot/uniprot.fasta
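
A layout mismatch like this can be caught before a run by checking that each expected file exists. The sketch below is illustrative only: `missing_databases` is a hypothetical helper, and the only path taken from this thread is uniprot/uniprot.fasta; any other entries would have to come from run_unifold.sh itself.

```python
from pathlib import Path
from typing import Dict, List


def missing_databases(database_dir: str, expected: Dict[str, str]) -> List[str]:
    """Return the flag names whose expected file is absent under database_dir."""
    root = Path(database_dir)
    return [flag for flag, relative in expected.items() if not (root / relative).is_file()]


# Only this entry is confirmed in the thread; extend the mapping as needed.
EXPECTED = {"uniprot_database_path": "uniprot/uniprot.fasta"}
```

Calling `missing_databases(database_dir, EXPECTED)` returns an empty list when the layout matches.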

dominik-handler commented on June 17, 2024

That should be correct. Thank you!

BaozCWJ commented on June 17, 2024

Thx! It's already fixed in #42.
