
create fake pdb70 msa file (parafold_dev, open issue)

ken83715 commented on September 24, 2024
create fake pdb70 msa file


Comments (3)

Zuricho commented on September 24, 2024

Actually, I can't answer your question, because I did not skip the template selection step this way. My approach is to use a template whose sequence is identical to the input as a "man-made" template, which is not really a reasonable way to select templates. I agree that creating a fake pdb70 MSA file is a better solution, but I have not done it yet. Maybe I will look into this in the near future, and we can discuss it (I'm also curious how you did that 🤣). Thanks a lot for your attention.


ken83715 commented on September 24, 2024

Thanks for replying. I'm currently using the original AlphaFold repo with some modifications.

My goal is also to use a specific pdb file as the template,
so I have to stop AlphaFold from selecting templates from the database (which takes several hours).
With -use_precomputed_msas set to true, create_fakemsa generates files in output_folder/msas,
so that AlphaFold uses these files instead of spending a very long time searching for templates.
Then create_manual_template transforms the pdb file into features.pkl. (Am I understanding it right?)
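
To illustrate the idea described above, here is a minimal sketch that writes placeholder search results containing only the query sequence. It is not the create_fakemsa script itself: the function names are hypothetical, the file names are the monomer ones listed in this thread, and treating an empty pdb_hits.hhr as "no template hits" is an assumption that should be checked against your AlphaFold version.

```python
from pathlib import Path
from typing import List

# Monomer file names as listed in this thread.
MSA_FILES = ("bfd_uniref_hits.a3m", "mgnify_hits.sto", "uniref90_hits.sto")
TEMPLATE_HITS_FILE = "pdb_hits.hhr"


def query_only_a3m(sequence: str, name: str = "query") -> str:
    """Return an A3M alignment that contains only the query sequence."""
    return f">{name}\n{sequence}\n"


def query_only_stockholm(sequence: str, name: str = "query") -> str:
    """Return a Stockholm alignment that contains only the query sequence."""
    return f"# STOCKHOLM 1.0\n\n{name} {sequence}\n//\n"


def write_placeholder_msas(target_dir: str, sequence: str) -> List[Path]:
    """Write query-only alignments into <target_dir>/msas (hypothetical helper)."""
    msas_dir = Path(target_dir) / "msas"
    msas_dir.mkdir(parents=True, exist_ok=True)

    written = []
    for file_name in MSA_FILES:
        if file_name.endswith(".a3m"):
            text = query_only_a3m(sequence)
        else:
            text = query_only_stockholm(sequence)
        path = msas_dir / file_name
        path.write_text(text)
        written.append(path)

    # ASSUMPTION: an empty hits file is read as "no template hits".
    hits_path = msas_dir / TEMPLATE_HITS_FILE
    hits_path.write_text("")
    written.append(hits_path)
    return written
```

With files like these in place, a run with precomputed MSAs enabled should have nothing left to search for; the template itself would still have to come from somewhere else, for example a manually built features.pkl.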

Multimer mode generates two folders, A and B, in the msas folder (for two sequences, for example),
each containing the files generated by create_fakemsa. A chain_id_map.json file is also needed in the msas folder.
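
The multimer layout described above can be sketched as follows. This is only an illustration: the helper names are hypothetical, the per-chain file names are the multimer ones listed in this thread, and the JSON structure (a "description" and a "sequence" per chain ID) is an assumption that should be compared with a chain_id_map.json written by a real multimer run.

```python
import json
from pathlib import Path
from typing import Dict, Iterable, List

# Multimer per-chain file names as listed in this thread.
PER_CHAIN_FILES = (
    "bfd_uniref_hits.a3m",
    "mgnify_hits.sto",
    "pdb_hits.sto",
    "uniprot_hits.sto",
    "uniref90_hits.sto",
)


def make_multimer_layout(msas_dir: str, chains: Dict[str, str]) -> Dict[str, dict]:
    """Create one folder per chain ID plus chain_id_map.json (hypothetical helper)."""
    root = Path(msas_dir)
    root.mkdir(parents=True, exist_ok=True)

    chain_id_map = {}
    for chain_id, sequence in chains.items():
        (root / chain_id).mkdir(exist_ok=True)
        # ASSUMPTION: each entry stores a description and the chain sequence.
        chain_id_map[chain_id] = {
            "description": f"chain_{chain_id}",
            "sequence": sequence,
        }

    map_path = root / "chain_id_map.json"
    map_path.write_text(json.dumps(chain_id_map, indent=4, sort_keys=True))
    return chain_id_map


def missing_files(msas_dir: str, chain_ids: Iterable[str]) -> List[str]:
    """List the per-chain files that still have to be placed in each folder."""
    root = Path(msas_dir)
    return [
        f"{chain_id}/{name}"
        for chain_id in chain_ids
        for name in PER_CHAIN_FILES
        if not (root / chain_id / name).exists()
    ]
```

Calling missing_files after populating the folders is a quick way to confirm that every chain has the full set of files before starting a run.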

I'm not sure whether monomer mode and multimer mode search the same set of genetic databases or whether there are some differences:
multimer mode search results include bfd_uniref_hits.a3m, mgnify_hits.sto, pdb_hits.sto, uniprot_hits.sto, uniref90_hits.sto,
monomer mode search results include bfd_uniref_hits.a3m, mgnify_hits.sto, pdb_hits.hhr, uniref90_hits.sto

I struggled for several days until I found your solution, and I couldn't have made it work without these two Python files. Thanks again, and if you know another way to stop AlphaFold from searching for templates and just predict directly, I will be very thankful.


Zuricho commented on September 24, 2024

Actually, we are working toward the same objective. I previously worked on an adapted version called ParaFold, which splits the CPU and GPU parts. In ParaFold, I use the feature.pkl file to link the CPU part (actually the MSA part) and the GPU part (actually the AlphaFold model). So the AlphaFold input depends solely on the feature.pkl file.

My approach is to edit the feature.pkl, or create a fake feature.pkl, to hack the AlphaFold inputs, but I have only tried this in monomer mode (multimer models might be more complex, but similar).

In this repo, I added 3 different ways to play with feature.pkl:

  1. Create an empty feature.pkl: no MSA, no template, nothing except the input sequence information is in the feature.pkl file. You can find it here: https://github.com/Zuricho/ParaFold_dev/blob/main/parafold/create_empty_feature.py.
  2. Read a manual template: you can find my code in the function named make_manual_template_features in https://github.com/Zuricho/ParaFold_dev/blob/main/parafold/create_manual_template.py. I read the coordinates from a .pdb file and set the template coordinates to match those in the input .pdb file.
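
Before editing or replacing a feature.pkl, it can help to look at what an existing one contains. The sketch below is not part of ParaFold: summarize_features is a hypothetical helper that loads the pickle and reports each key with its shape and dtype where the stored value has them. It assumes the file holds a dictionary, and loading a file produced by AlphaFold requires numpy to be installed.

```python
import pickle
from typing import Any, Dict, Tuple


def summarize_features(path: str) -> Dict[str, Tuple[Any, str]]:
    """Map each feature name to (shape, dtype) for a pickled feature dict."""
    # Only unpickle files you trust: pickle can run code while loading.
    with open(path, "rb") as handle:
        features = pickle.load(handle)

    summary = {}
    for key in sorted(features):
        value = features[key]
        # Array-like values expose shape and dtype; fall back to the type name.
        shape = getattr(value, "shape", None)
        dtype = str(getattr(value, "dtype", type(value).__name__))
        summary[key] = (shape, dtype)
    return summary


def print_summary(summary: Dict[str, Tuple[Any, str]]) -> None:
    """Print one line per feature so two files can be compared by eye."""
    for key, (shape, dtype) in summary.items():
        print(f"{key:40s} shape={shape} dtype={dtype}")
```

Comparing the summary of a file from a normal run with the summary of a hand-made one shows quickly which keys are missing or have a different shape.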

I know what you are doing: making manual pdb_hits files to modify the alignment between the template pdb and the input sequence. I also tried this before but found it more complicated than I thought (maybe because I did not fully understand the alignment process 🤣). Maybe I can look into this sooner or later.

